    1. Reviewer #1 (Public Review):

      This study presents valuable observations of white matter organisation from diffusion MRI and two types of synchrotron imaging in both monkeys and mice. Cross-modality comparisons are interesting as the different methods are able to probe anatomical structures at different length scales, from single axons in high-resolution synchrotron (ESRF) imaging, to clusters of axons in lower-resolution synchrotron (DESY) data, to axon populations at the mm-scale in diffusion MRI. By acquiring all modalities in monkey and mouse ex vivo samples, the authors can observe principles of fibre organisation, and characterise how fibre characteristics, such as tortuosity and micro-dispersion, vary across select brain regions and in healthy tissue versus a demyelination model. The results are solid, though some statements (in the abstract/discussion) do not appear to be fully supported, and statistical tests would help confirm whether tissue characteristics are similar/different between different conditions.

      One very interesting result is the observation of apparent laminar organisation of fibres in ex vivo monkey white matter samples. DESY data from the corpus callosum shows fibres with two dominant orientations (one L-R, one slightly inclined), clustered in laminar structures within this major fibre bundle. Thanks to the authors providing open data, I was able to look through the raw DESY volume and observe regions with different "textures" (different orientations) in the described laminar arrangement. That this organisation can be observed by eye, as well as by structure tensor, is fairly convincing. As not all readers will download the data themselves, the manuscript could benefit from additional figures/videos to demonstrate (1) the quality of the DESY data and (2) a more 3D visualisation of the laminar structures (where the coronal plane shows convincing columnar structure or stripes). Similarly in Figure 5A, though this nicely depicts two populations with different orientations, it is somewhat difficult to see the laminar structure in the current image.

      ESRF data of the centrum semiovale (CS) contributes evidence for similar laminar structures in a crossing fibre region, where primarily AP fibres are shown to cluster in 3 laminar structures. As above, further visualisations of the ESRF volume in the CS (as shown in Figure 4E) would be of value (e.g. showing consistency across the 4 volumes, 2D images showing stripey/columnar patterns along different axes, etc).

      A key limitation of this result is that, though the DESY data from the CC seems convincing, the same structures were not observed in high-resolution synchrotron (ESRF) data of the same tissue sample in the corpus callosum. This seems surprising and the manuscript does not provide a convincing explanation for this inconsistency. The authors argue that this is due to the limited FOV of the ESRF data (~200x200x800 microns). However, the observed laminar structures in DESY are ~40 microns thick, and ESRF data from the CST suggests laminar thicknesses in the range of 5-40 microns with a similar FOV. This suggests the ESRF FOV would be sufficient to capture at least a partial description of the laminar organisation. Further, the DESY data from the CC shows columnar variations along the LR axis, which we might expect to be observed along the long axis of the ESRF volume of the same sample. Additional analyses or explanations to reconcile these apparently conflicting observations would be of value. For example, the authors could consider down-sampling the ESRF data in an appropriate manner to make it more similar to the DESY data, and running the same analysis, to see if the observed differences are related to resolution (i.e. the thinner laminar structures cluster in ways that they look like a thicker laminar structure at lower resolution), or cropping the DESY data to the size of the ESRF volume, to test whether the observed differences can be explained by differences in FOV.
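
      To make the suggested resolution and FOV checks concrete, the following is a minimal sketch, assuming each synchrotron volume is available as a 3D NumPy array. The function names, the down-sampling factor and the crop shape are illustrative assumptions, not values from the manuscript, and simple block averaging stands in for whatever resampling the authors consider appropriate.

```python
import numpy as np


def block_downsample(volume, factor):
    """Average non-overlapping factor x factor x factor blocks.

    Voxels that do not fill a complete block at the far edges are trimmed.
    """
    trimmed = [(s // factor) * factor for s in volume.shape]
    v = volume[:trimmed[0], :trimmed[1], :trimmed[2]]
    nz, ny, nx = (s // factor for s in trimmed)
    return v.reshape(nz, factor, ny, factor, nx, factor).mean(axis=(1, 3, 5))


def center_crop(volume, shape):
    """Return a centred sub-volume with the requested shape (in voxels)."""
    starts = [(s - c) // 2 for s, c in zip(volume.shape, shape)]
    slices = tuple(slice(st, st + c) for st, c in zip(starts, shape))
    return volume[slices]
```

      The same structure tensor analysis could then be run on `block_downsample(esrf_volume, factor)` and on `center_crop(desy_volume, esrf_shape)`, where `esrf_volume`, `desy_volume`, `factor` and `esrf_shape` are hypothetical inputs chosen from the actual voxel sizes and FOVs.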

      Laminar structures were not observed in mouse data, though it is unclear if this is due to anatomical differences or somewhat related to differences in data quality across species.

      The authors further quantify various other characteristics of the white matter, such as micro-dispersion, tortuosity, and maximum displacement. Notably, the microscopic FA calculated via structure tensor is fairly consistent across regions, though not modalities. When fibre orientations are combined across the sample, they are shown to produce similar FODs to dMRI acquired in the same tissue, which is reassuring. As noted in the text, the estimates of tortuosity and max displacement are dependent on the FOV over which they are calculated. Calculating these metrics over the same FOV, or making them otherwise invariant to FOV, could facilitate more meaningful comparisons across samples and/or modalities.
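
      One way to reduce the FOV dependence noted above is to truncate every streamline to a common arc length before computing the metrics. The sketch below assumes common definitions (tortuosity as path length over end-to-end distance, maximum deviation as the largest perpendicular distance from the chord); these may differ from the definitions used in the manuscript, and all function names are illustrative.

```python
import numpy as np


def truncate_to_length(points, max_length):
    """Keep the leading vertices whose cumulative arc length is <= max_length.

    The cut is made at an existing vertex; no interpolation is performed.
    """
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    return points[cum <= max_length]


def tortuosity_index(points):
    """Path length divided by the straight-line end-to-end distance."""
    path = np.linalg.norm(np.diff(points, axis=0), axis=1).sum()
    chord = np.linalg.norm(points[-1] - points[0])
    return path / chord


def max_deviation(points):
    """Largest perpendicular distance from the chord joining the end points."""
    start = points[0]
    chord = points[-1] - start
    unit = chord / np.linalg.norm(chord)
    rel = points - start
    # Remove the component along the chord, leaving the perpendicular offset.
    perp = rel - np.outer(rel @ unit, unit)
    return np.linalg.norm(perp, axis=1).max()
```

      Applying `truncate_to_length` with the same `max_length` (for example, the shortest FOV among the samples being compared) before calling the two metric functions would put all samples and modalities on an equal footing.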

      Though the results seem solid, some statements, particularly in the abstract and discussion, do not seem to be fully supported by the data. For example, the abstract states "Our findings revealed common principles of fibre organisation in the two species; small axonal fasciculi and major bundles formed laminar structures with varying angles, according to the characteristics of major pathways.", though the results show "no strong indication within the mouse CC of the axonal laminar organisation observed in the monkey". Similarly, the introduction states: "By these means, we demonstrated a new organisational principle of white matter that persists across anatomical length scales and species, which governs the arrangement of axons and axonal fasciculi into sheet-like laminar structures." Further comments on the text are provided below.

      One observation not notably discussed in the paper is that the spherical histograms of Figure 3E/H appear to have an anisotropic spread of the white points about 0,0. It would be interesting if the authors could comment on whether this could be interpreted as the FOD having asymmetric dispersion and if so, whether the axis of dispersion relates to the fibre orientations of the laminar structures.

      A limitation of the study is that it considers only small ex vivo tissue samples from two locations in a single postmortem monkey brain and slightly larger regions of mouse brain tissue. Consequently, further evidence from additional brain regions and subjects would be required to support more generalised statements about white matter organisation across the brain.

      Given the monkey results, the mouse study (section 2.5 onwards) lacks some motivation. In particular, it is unclear why a demyelination model was studied and if/how this would link to the laminar structure observed in the monkey data. Further, it is unclear how comparable tortuosity/max deviation values are across species, considering the differences in data quality and relative resolution, given that the presented results show these values are very modality-dependent.

      The paper introduces a new method of "scale-space" parameters for structure tensors. Since, to my understanding, this is the first description of the method, some simple validation of the method would be welcomed. Further, the same scale parameters are not used across monkeys and mice, with a larger kernel used in mice (Table 2) which is surprising given their smaller brain size. Some explanation would be helpful.

    2. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors combine diffusion MRI and high-resolution x-ray synchrotron phase-contrast imaging in monkey and mouse brains to investigate the 3D organization of brain white matter across different scales and species. The work is at the forefront of the anatomical investigation of the human connectome and aligns with several current efforts to bridge the resolution gap between what we can see in vivo at the millimeter scale and the complexity of the human brain at the sub-micron scale. The authors compare the 3D white matter organization across modalities within 2 small regions in one monkey brain (body of the corpus callosum, centrum semiovale) and within one region (splenium of the corpus callosum) in healthy mice and in one murine model of focal demyelination. The study compares measures of tissue anisotropy and fiber orientations across modalities, performs a qualitative comparison of fasciculi trajectories across brain regions and tissue conditions using streamline tractography based on the structure tensor, and attempts to quantify the shape of fasciculi trajectories by measuring the tortuosity index and the maximum deviation for each reconstructed streamline. Results show that measures of anisotropy and fiber orientations largely agree across modalities, especially for larger FOV data. The high-resolution data allows us to explore the fiber trajectories in relation to tissue complexity and pathology. The authors claim the study reveals new common organization principles of white matter fibers across species and scales, for which axonal fasciculi arrange into sheet-like laminar structures.


      The aim of the study is of central importance within present efforts to bridge the gap between macroscopic structures observable in vivo in humans using conventional diffusion MRI and the microscopic organization of white matter tissue. Results obtained from this type of study are important to interpret data obtained in vivo, inform the development of novel methodologies, and expand our knowledge of the structural and thus functional organization of brain circuits.

      Multi-scale data acquired across modalities within the same sample constitute extremely valuable data that is often hard to acquire and represent a precious resource for validation of both diffusion MRI tractography and microstructure methods.

      The inclusion of multi-species data adds value to the study, allowing the exploration of common organization principles across species.

      The addition of data from a murine cuprizone model of focal demyelination adds interesting opportunities to study the underlying biological changes that follow demyelination and how these impact tissue anisotropy and fiber trajectories. These data can inform the interpretation and development of diffusion MRI microstructure models.


      The main claim of a newly discovered laminar organization principle that is consistent across scales and species is not supported strongly enough by the data. The main evidence in support of the claim comes from the larger FOV data obtained from the body of the corpus callosum in the monkey brain. A laminar organization principle is partially shown in the centrum semiovale in the monkey brain and it is not shown in the mouse data. Additionally, the methods lack details needed for the correct interpretation of these findings (e.g., how were these fasciculi defined? how well do they represent different axonal populations? what is the effect of blood vessels on the structure tensor reconstruction? how was laminar separation quantified?) and the discussion does not provide a biological background for this organization. The corpus callosum sample suggests axons within a bundle of fibers are organized in a sheet-like fashion, while data from the centrum semiovale suggest fibers belonging to different fiber bundles are organized in a sheet-like arrangement. While I acknowledge the challenges in acquiring such high-resolution data, additional samples from different regions in the same animals and from different animals would help strengthen this claim.

      The main goal of the study is to bridge the organization of white matter across anatomical length scales and species. However, given the substantial difference in FOVs between the two imaging modalities used, and the absence of intermediate-resolution data, it remains difficult to effectively understand how these results can be used to inform conventional diffusion MRI. In this sense, the introduction does not do a good enough job of building a strong motivation for the scientific questions the authors are trying to answer with these experiments and for the specific methodology used.

      The cuprizone data represent a unique opportunity to explore the effect of demyelination on white matter tissue. However, this specific part of the study is not well motivated in the introduction and seems to represent a missed opportunity for further exploration of the qualitative and quantitative relationship between diffusion MRI and sub-micron tissue information (although unfortunately not within the same brain sample). This is especially true considering the diffusion MRI protocol for mice would allow extrapolation of advanced measures from different tissue compartments.

    1. eLife assessment

      This study describes important findings related to early disruptions in disinhibitory modulation exerted by VIP+ interneurons in CA1 in a transgenic model of Alzheimer's disease pathology. The authors provide a convincing analysis at the cellular, synaptic, network, and behavioral levels of how these changes correlate and might be related to behavioral impairments during these early stages of AD pathology.

    2. Reviewer #1 (Public Review):


      The work in the manuscript titled "Altered firing output of VIP interneurons and early dysfunctions in CA1 hippocampal circuits in the 3xTg mouse model of Alzheimer's disease" utilized patch-clamp techniques to explore the electrophysiological characteristics of VIP interneurons in the early stages of AD using the 3xTg mouse model. The study revealed that VIP interneurons exhibited prolonged action potentials and reduced firing rates. These changes could not be attributed to modifications in input signals or morphological transformations. The authors attributed aberrant VIP activity to the accumulation of beta-amyloid in those interneurons.

      The decreased frequency of VIP inhibitory events was associated with no observed changes in excitatory drive to these interneurons. Consequently, heightened activity in the general population of CA1 interneurons was observed during a decision-making task and an object recognition test. In light of these findings, the authors concluded that the altered firing patterns of VIP interneurons may initiate early-stage dysfunction in hippocampal CA1 circuits, potentially influencing the progression of AD pathology.


      Overall the work is novel and moves the field of Alzheimer's disease forward in a significant way. The manuscript reports a novel concept of aberrant activity in VIP interneurons during the early stages of AD thus contributing to dysfunctions of the CA1 microcircuit. This results in the enhancement of the inhibitory tone on the primary cells of CA1. Thus, the disinhibition by VIP interneurons of Principal Cells is dampened. The manuscript was skillfully composed, and the study was of strong scientific rigor featuring well-designed experiments. Necessary controls were present. Both sexes were included.


      (1) The authors attributed aberrant circuit activity to the accumulation of "Abeta intracellularly" inside IS-3 cells. That is problematic. The 6E10 antibody recognizes amyloid plaques as well as Amyloid Precursor Protein (APP) and the C99 fragment. There are no plaques at the ages at which the 3xTg mice were examined. Thus, the staining shown in Figure 1a is of APP/C99 inside neurons, not Abeta accumulations in neurons. At the ages of 3-6 months, 3xTg mice start producing Abeta oligomers and potentially tau oligomers as well (Takeda et al., 2013 PMID: 23640054; Takeda et al., 2015 PMID: 26458742 and others). Emerging literature suggests that Abeta and tau oligomers disrupt circuit function. Thus, a more likely explanation is that Abeta and tau oligomers disrupt the activity of VIP neurons.

      (2) Authors suggest that their animals do not exhibit loss of synaptic connections and show Figure 3d in support of that suggestion. However, imaging with confocal microscopy of 70-micron thick sections would not allow the resolution of pre- and post-synaptic terminals. More sensitive measures such as electron microscopy or array tomography are the appropriate techniques to pursue. It is important for the authors to either remove that data from the manuscript or address the limitations of their technique in the discussion section. There is a possibility of loss of synaptic connections in their mouse model at the ages examined.

    3. Reviewer #2 (Public Review):


      The submitted manuscript by Michaud and Francavilla et al., is a very interesting study describing early disruptions in the disinhibitory modulation exerted by VIP+ interneurons in CA1, in a triple transgenic model of Alzheimer's disease. They provide a comprehensive analysis at the cellular, synaptic, network, and behavioral level on how these changes correlate and might be related to behavioral impairments during these early stages of the disease.

      Main findings:

      - 3xTg mice show early Aβ accumulation in VIP-positive interneurons.

      - 3xTg mice show deficits in a spatially modified version of the novel object recognition test.

      - 3xTg mice VIP cells present slower action potentials and diminished firing frequency upon current injection.

      - 3xTg mice show diminished spontaneous IPSC frequency with slower kinetics in Oriens / Alveus interneurons.

      - 3xTg mice show increased O/A interneuron activity during specific behavioral conditions.

      - 3xTg mice show decreased pyramidal cell activity during specific behavioral conditions.


      This study is very important for understanding the pathophysiology of Alzheimer's disease and the crucial role of interneurons in the hippocampus in healthy and pathological conditions.


      Although results nicely suggest that deficits in VIP physiological properties are related to the differences in network activity, there is no demonstration of causality.

    1. eLife assessment

      This paper expands the genetic toolset that was previously developed by the Rao lab to introduce the conditional downregulation of neurotransmission components in Drosophila. As a proof of principle, the authors tested their new collection and provide evidence of the contribution of CNMamide (a neuropeptide) to the temporal control of locomotor activity patterns. These are overall important findings supported by compelling evidence.

    2. Reviewer #1 (Public Review):


      The paper of Mao et al. expands the genetic toolset that was previously developed by the Rao lab (Deng et al 2019) to introduce the conditional KO or downregulation of neurotransmission components in Drosophila. The authors then use these tools to investigate neurotransmission in the clock neurons of the Drosophila brain. They first test some known components and then analyze the contribution of the CNMa neuropeptide and its receptor to circadian behavior. The results indicate that CNMa acts from a subset of DN1ps (dorsal clock neurons) to set the phase of the morning peak of locomotor activity in light:dark cycles, with an advanced morning activity in the absence of the neuropeptide. Interestingly, the receptor for the PDF neuropeptide appears to be acting in some of the CNMa neurons to control morning activity.


      This is clearly a very useful new set of tools to restrict the manipulation of these components to specific neuronal populations, and overall (see specific points below), the paper convincingly shows that the tools indeed allow neuropeptides/receptors to be efficiently and specifically eliminated from subsets of neurons. The analysis of the CNMa function in the clock network reveals a new and interesting function for CNMa in the control of morning anticipation in LD conditions. This function appears to depend on CNMa-expressing DN1ps.

      Comment on revised version:

      I believe that the authors properly addressed the main points that were raised in my comment on version 1.

    3. Reviewer #2 (Public Review):

      Original Review:

      In this study Mao and co-workers deliver a substantial suite of genetic tools in support of the senior author's recent proposal to create a "chemoconnectomic" tool kit for the expression mapping and conditional disruption of specific neurotransmitter systems within fly neurons of interest. Specifically, they describe the creation of two toolsets for recombination-based and CRISPR/Cas9-based conditional knockouts of genes supporting neurotransmitter and neuromodulator function, and a Flp-Out and Split-LexA toolkit for the examination of gene expression within defined subsets of neurons. The authors report the creation of conditional genetic tools for the disruption/mapping of approximately 200 chemoconnectomic gene products, examine the general effectiveness of these tools in the fly brain, and apply them to the circadian clock network in an attempt to reveal new information regarding the transmitter/modulator systems involved in daily behavioral timing. The authors provide clear evidence of the effectiveness of the new methods along with a transparent assessment of the variability of the tools. In addition, they present evidence that the neuropeptide CNMa influences the morning peak of daily activity in the fly by regulating the timing of activity increases in anticipation of dawn.

      A major strength of the study is the transparent assessment of the effectiveness and variability of the conditional genetic approaches developed by the authors. The authors have largely achieved their aims and the study therefore represents a major delivery on the promise of chemoconnectomics made by the senior author in 2019 (Neuron, Vol. 101, p. 876). Though there are some concerns about the variability of knockout effectiveness, off target effects of the knockout strategies, and (especially) the accuracy of the gene expression approach, the tools created for this study will almost certainly be useful for the field and support a great deal of future work.

      Comments on revised version:

      The authors have responded to each of my concerns. Most importantly, they have made the discrepancies within the study and between the study and previously published work clearer to the reader. They have also corrected statements that are not consistent with the current state of the field. The issue regarding opposing effects of PDF signaling and CNMa, which was also raised by Reviewer One, still stands, notwithstanding the edits made to the text.

    4. Reviewer #3 (Public Review):


      Mao and colleagues generated powerful reagents to genetically analyse chemical communication (CCT) in the brain, and in the process uncovered a function for the CNMa neuropeptide expressed in a subset of DN1p neurons that contributes to the temporal organization of locomotor activity, i.e., the timing of morning anticipation.


      The strength of the manuscript lies in the generation/characterization of new tools for conditionally targeting a well-defined set of CCT genes, along with the design and testing of improved versions of Cas9 for efficient knockout. Such invaluable resources will be of interest to the whole community. The authors employed these tools and intersectional genetics to provide an alternative profiling of clock neurons, which is complementary to the ones already published. Furthermore, they uncovered a role for CNMamide, expressed in two DN1ps, in the timing of morning anticipation.


      All prior concerns have been addressed.

    1. eLife assessment

      The study presents an important ecosystem designed to support literature mining in biomedical research, showcasing a methodological framework that includes tools like Pubget for article collection and labelbuddy for text annotation. The solid evidence presented for these tools suggests they could streamline the analysis and annotation of scientific literature, potentially benefiting research across a range of biomedical disciplines. While the primary focus is on neuroimaging literature, the applicability of these methods and tools might extend further, offering useful advancements in the practices of meta-research and literature mining.

    2. Reviewer #1 (Public Review):


      In this paper, the authors present new tools to collect and process information from the biomedical literature that could typically be used in a meta-analytic framework. The tools have been specifically developed for the neuroimaging literature. However, many of their functions could be used in other fields. The tools mainly enable downloading batches of papers from the literature, extracting relevant information along with meta-data, and annotating the data. The tools are implemented in an open ecosystem that can be used from the command line or Python.


      The tools developed here are really valuable for the future of large-scale analyses of the biomedical literature. This is a very well-written paper. The presentation of the use of the tools through several examples corresponding to different scientific questions really helps the readers to foresee the potential application of these tools.


      The tools are command-based and store outcomes locally. So users who prefer to work only with GUI and web-based apps may have some difficulties. Furthermore, the outcomes of the tools are constrained by inherent limitations in the scientific literature, in particular, here the fact that only a small portion of the publications have full text openly available.

    3. Reviewer #2 (Public Review):


      In this manuscript, the authors described the litmining ecosystem that can flexibly combine automatic and manual annotation for meta-research.


      Software development is crucial for cumulative science and of great value to the community. However, such works are often greatly under-valued in the current publish-or-perish research culture. Thus, I applaud the authors' efforts devoted to this project. All the tools and repositories are public and can be accessed or installed without difficulty. The results reported in the manuscript also provide compelling evidence that the ecosystem is relatively mature.


      First and foremost, the logic flow of the current manuscript is difficult to follow.

      The second issue is that the results from the litmining ecosystem were not validated and the efficiency of using litmining was not quantified. To validate the results, it would be better to directly compare the results of litmining with recognized ground truth in each of the examples. To prove the efficiency of the current ecosystem, it would be better to use quantitative indices for comparing litmining and the other two approaches (in terms of time and/or other costs in a typical meta-research project).

      The third family of issues is about the functionality of the litmining ecosystem. As the authors mentioned, the ecosystem can be used for multiple purposes; however, the description here is not sufficient for researchers to incorporate the litmining ecosystem into their meta-research project. Imagine that a group of researchers is interested in using the litmining ecosystem to facilitate their meta-analyses: how should they incorporate litmining into their workflow? I have this question because, in a complete meta-analysis, researchers are required to (1) search in more than one database to ensure the completeness of their literature search; (2) screen the retrieved articles, which requires inspection of the abstract and the PDF; (3) search for all possible PDF files of the included articles instead of relying only on the open-access PDF files in the PMC database. That said, if researchers are interested in using litmining in a meta-analysis that follows reporting standards such as PRISMA, the following functionalities are crucial:

      (a) How to incorporate the literature search results from different databases;

      (b) After downloading the meta-data of articles from databases, how to identify whose PDF files can be downloaded from PMC and whose PDF files need to be obtained from other resources;

      (c) Is it possible to also annotate PDF files that were not downloaded by pubget?

      (d) How to maintain and update the meta-data and intermediate data for a meta-analysis by using litmining? For example, after searching in a database using a specific command and conducting their meta-analysis, researchers may need to update the search results and include items after a certain period.
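
      To illustrate the kind of functionality meant in points (a) and (b), here is a minimal sketch of merging search results from several databases and separating articles with a PMC identifier from those whose full text must be obtained elsewhere. It is plain Python and is not the pubget or labelbuddy API; the record fields (`doi`, `pmid`, `pmcid`, `title`) and all function names are assumptions made for the example.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip common URL/scheme prefixes; None if empty."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def record_key(record):
    """Choose a deduplication key: DOI first, then PMID, then the title."""
    doi = normalise_doi(record.get("doi"))
    if doi:
        return ("doi", doi)
    if record.get("pmid"):
        return ("pmid", str(record["pmid"]))
    return ("title", record.get("title", "").strip().lower())


def merge_search_results(*result_sets):
    """Merge records from several searches, filling in missing fields."""
    merged = {}
    for records in result_sets:
        for record in records:
            entry = merged.setdefault(record_key(record), {})
            for field, value in record.items():
                # Keep the first non-empty value seen for each field.
                if value and not entry.get(field):
                    entry[field] = value
    return list(merged.values())


def split_by_full_text(records):
    """Separate records with a PMC identifier from those without one."""
    in_pmc = [r for r in records if r.get("pmcid")]
    elsewhere = [r for r in records if not r.get("pmcid")]
    return in_pmc, elsewhere
```

      A known limitation of this sketch is that a record indexed only by DOI in one database and only by PMID in another will not be merged; a real workflow would need an identifier-conversion step, and guidance on this from the authors would be welcome.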

    1. eLife assessment

      This fundamental study provides an unprecedented understanding of the roles of different combinations of NaV channel isoforms in nociceptors' excitability, with relevance for the design of better strategies targeting NaV channels to treat pain. Although the experimental combination of electrophysiological, modeling, imaging, molecular biology, and behavioral data is convincing and supports the major claims of the work, some results remain inconclusive and need to be strengthened by further evidence. The work may be of broad interest to scientists working on pain, drug development, neuronal excitability, and ion channels.

    2. Reviewer #1 (Public Review):


      In this work, Xie, Prescott and colleagues have reevaluated the role of Nav1.7 in the excitability of nociceptive sensory neurons. They find that nociceptors can make use of different sodium channel subtypes to reach equivalent excitability. The existence of this degeneracy is critical to understanding neuronal physiology under normal and pathological conditions and could explain why Nav subtype-selective drugs have failed in clinical trials. More concretely, nociceptor repetitive spiking relies on Nav1.8 at DIV0 (and probably under normal conditions in vivo), but on Nav1.7 and Nav1.3 at DIV4-7 (and after inflammation in vivo).

      The conclusions of this paper are mostly well supported by data, and these findings should be of broad interest to scientists working on pain, drug development, neuronal excitability and ion channels.


      The authors have employed elegant electrophysiology experiments (including specific pharmacology and dynamic clamp) and computational simulations to study the excitability of a subpopulation of DRGs that would very likely match with nociceptors (they take advantage of using transgenic mice to detect Nav1.8-expressing neurons). They make a strong point showing the degeneracy that occurs at the ion channel expression level in nociceptors, adding this new data to previous observations in other neuronal types. They also demonstrate that the different Nav subtypes functionally overlap and are able to interchange their "typical" roles in action potential generation. As Xie, Prescott and colleagues argue, the functional implications of the degenerate character of nociceptive sensory neurons excitability need to be seriously taken into account regarding drug development and clinical trials with Nav subtype-selective inhibitors.

      In this revised version, the quality of the manuscript has been visibly improved. In my opinion, the questions and concerns raised by reviewers have been addressed clearly. After a detailed reading of this version and the comments to the reviewers, I have no additional comments or criticisms.

    3. Reviewer #2 (Public Review):


      The authors have noted in preliminary work that tetrodotoxin (TTX), which inhibits NaV1.7 and several other TTX-sensitive sodium channels, has differential effects on nociceptors, dramatically reducing their excitability under certain conditions but not under others. Partly because of this coincidental observation, the aim of the present work was to re-examine or characterize the role of NaV1.7 in nociceptor excitability and the effects on drug efficacy. The manuscript demonstrates that a NaV1.7-selective inhibitor produces analgesia only when nociceptor excitability is based on NaV1.7. More generally and comprehensively, the results show that nociceptors can achieve equivalent excitability through changes in differential NaV inactivation and NaV expression of different NaV subtypes (NaV 1.3/1.7 and 1.8). This can cause widespread changes in the role of a particular subtype over time. The degenerate nature of nociceptor excitability shows functional implications that make the assignment of pathological changes to a particular NaV subtype difficult or even impossible.

      Thus, the analgesic efficacy of NaV1.7- or NaV1.8-selective agents depends essentially on which NaV subtype controls excitability at a given time point. These results explain, at least in part, the poor clinical outcomes with the use of subtype-selective NaV inhibitors and therefore have major implications for the future development of Nav-selective analgesics.


      The results are clearly and impressively supported by the experiments and data shown. During the revision, the manuscript was consistently improved and the concerns of the first reviews were resolved. All methods are described in detail, and presumably, allow good reproducibility and were suitable to address the scientific question.

      The results showing that nociceptors can achieve equivalent excitability through changes in differential NaV inactivation and expression of different NaV subtypes are of great importance in the fields of basic and clinical pain research and sodium channel physiology and pharmacology, but also for a broad readership and community. The degenerate nature of nociceptor excitability, which is clearly shown and well supported by data has large functional implications. The results are of great importance because they may explain, at least in part, the poor clinical outcomes with the use of subtype-selective NaV inhibitors and therefore have major implications for the future development of Nav-selective analgesics.

      In summary, the authors achieved their overall aim of elucidating the role of NaV1.7 in nociceptor excitability and the effects on drug efficacy. The data support the conclusions, and clinical implications are highlighted as far as is currently justifiable given the still limited experience in translation. This appears well-considered, not too speculative, and ultimately appropriate.

      The main weaknesses of the first version were fixed during the revision:

      (i) After revising the manuscript, the initial weakness that the computer model was described superficially has been fixed. Important information was added to the main text and additional information, including the full code and equations and values are deposited on ModelDB or are given in the Supplementary information (Suppl. Table 5 & 6).

      (ii) The authors now comment that corresponding studies on protein levels or e.g. neuroinflammatory changes could support the characterization of the time course of membrane expression and cellular changes, but this should be addressed in future studies, as these analyses would also raise new questions, such as about membrane trafficking, post-translational modifications, etc. This is plausible and has now been appropriately addressed in the text.

      (iii) During the initial review the authors were asked to discuss the promising role of NaV1.7 in the light of clinical results. In their response, the authors confidently state that they "wish to avoid speculating on which particular clinical results are better explained because our study was not designed for that." They, however, emphasize their take-home message, which is well supported: "Instead, our take-home message (which is well supported; see Discussion on lines 309-321) is that NaV1.7-selective drugs may have a variable clinical effect because nociceptors' reliance on NaV1.7 is itself variable - much more than past studies would have readers believe. ... The challenge (as highlighted in the Abstract, lines 21-22) is that identifying the dominant Nav subtype to predict drug efficacy is difficult."

      Given this argument, it must be admitted that the decision not to present as yet unproven speculations is probably appropriate from a scientific point of view, and that it ultimately demonstrates the authors' critical assessment of their own data and of the limitations of the study. This is undoubtedly acceptable and, in retrospect, probably the right way to go.

    4. Reviewer #3 (Public Review):


      In this study the authors used patch-clamp to characterize the implication of various voltage-gated Na+ channels in the firing properties of mouse nociceptive sensory neurons. They claim that depending on the culture conditions NaV1.3, NaV1.7, and NaV1.8 have distinct contributions to action potential firing and that similar firing patterns can result from distinct relative roles of these channels.


      The paper addresses the important issue of understanding the lack of success of therapeutic strategies targeting NaV channels in the context of pain. Specifically, the authors test the hypothesis that different NaV channels contribute in a plastic manner to action potential firing, which may be the reason why it is difficult to target pain by inhibiting these channels.


      (1) - The main claim of this paper is that "nociceptors can achieve equivalent excitability using different combinations of NaV1.3, NaV1.7, and NaV1.8". From this, they allude to the manifestation of "degeneracy", a concept implying that a biological process can occur via distinct sets of underlying components.

      In my opinion, the analysis of the data is biased towards the authors' interpretation.

      - First, when comparing the excitability across neurons one should relate the response (in this case mean firing frequency) to the absolute size of the stimulus, not to the size of the stimulus normalized to the rheobase (see e.g., Fig. 1A). From this particular figure the authors conclude that the excitability is similar in the culture stages DIV0 and DIV4-7, but these data were not directly compared.

      - Second, the authors reach their conclusion from the comparison of the (average) firing rate determined over 1 s current stimulation in distinct conditions. However, this is not the only parameter that determines how sensory neurons might convey information. For instance, the time dependence of the instantaneous frequency, i.e. the actual firing pattern, may also be important.

      - Third, the use of 1 s of current stimulation might not be sufficient to characterize the firing pattern if one wants to obtain conclusions that could translate to clinical settings (i.e., sustained pain).

      - Fourth, as a matter of principle, the gating properties of NaV1.7 and NaV1.8 channels are not identical, and therefore their contributions to excitability should not be the same. A neuron in which NaV1.7 is the main contributor is expected to have a damping firing pattern due to cumulative channel inactivation, whereas another depending mainly on NaV1.8 is expected to display more sustained firing. This is actually seen in the results of the modelling.

      (2) - The quality of some recordings is dubious. The currents shown as TTX-sensitive in Fig. 1D look very strange (not like the ones at Baseline DIV4-7). These traces show abnormally fast inactivation and even transient deflections above zero current line. These are obvious artifacts of the subtraction procedure, probably due to unstable current amplitudes along the recording time. Similar odd-looking traces are shown in Fig. 3A.

      (3) - I would like to point out that the main Significance Statement of the manuscript reads "The analgesic efficacy of subtype-selective drugs hinges on which subtype controls excitability". Besides this statement being extremely obvious to anyone who knows a bit about pain signaling, the authors did not test the analgesic efficacy of any drug in this study.

      (4) - A critical issue in the manuscript is the unnecessary use of phrases that imply that biological entities have some sort of willpower, flirting with anthropomorphism and teleological language.

      Sentences such as "Nociceptive sensory neurons convey pain signals to the CNS using action potentials" (see the Abstract) should be avoided. Neurons do not really "use" action potentials, they have no will to do so. Action potentials are not tools or means to be "used" by neurons. There are many other examples of misuse of the verb "use" in many other sentences. These were pointed out during the revision phase, but unfortunately the authors refused to correct them.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper represents important findings when identifying untargeted metabolomics and its differences between metabolomes of different biological samples. GromovMatcher is the fantasy name for the soft development. The main idea behind it is built on the assumption of featuring and matching complex datasets. Although the manuscript reflects a solid analysis, it remains incomplete for validation with putative non-curated datasets.

      We are grateful to the eLife editor for taking the time and effort to assess our manuscript.

      We are, however, unsure of what the editor means by “it remains incomplete for validation with putative non-curated datasets”. As noted by Reviewer 2, manually curated datasets that could be used for validation are scarce. Most publicly available datasets do not contain sufficient information to establish a ground truth matching on which GromovMatcher, M2S, or metabCombiner can be tested. Even in the case where such a ground truth matching can be established, it must be performed by hand through a manual matching process which is extremely time-consuming and requires very specific expertise. This, in our opinion, only highlights the need for automatic alignment methods such as metabCombiner, M2S or GromovMatcher.

      We do agree that the performance of GromovMatcher (and its competitors) needs to be validated further, and we plan to continue validating GromovMatcher as additional data becomes available in EPIC and other cohorts. With that in mind, the lack of publicly available validation data is the reason why we conducted such an extensive simulation study, arguably more comprehensive than previous validations, exploring challenging settings that we believe reflect real-life scenarios (main text “Validation on ground-truth data” and Appendix 3). We would like to stress that this allows us to highlight previously ignored limitations of the previously published methods, metabCombiner and M2S.

      We wish to thank the editor and reviewers for their time and efforts in reviewing our manuscript which led to many significant additions to our paper. Namely we:

      • Performed an additional sensitivity analysis (Appendix 3) exploring how an imbalance in the number of features or samples between two studies being matched (e.g. the dataset split), affects the quality of matchings found by GromovMatcher, metabCombiner, and M2S.

      • Investigated how changing or removing the reference dataset (Appendix 5) in the EPIC study (main text “Application to EPIC data”), affects the results of GromovMatcher.

      • Improved alignment matrix visualizations in Fig. 3a for all four methods tested on the validation data, to highlight more clearly which feature matches were correctly identified or missed.

      The revised paper is uploaded as the file “main_elife_revision.pdf” where all revisions are highlighted in blue as well as a copy “main_elife_revision_nohighlights.pdf” where revisions are not highlighted.

      Public Reviews:

      Reviewer #1 (Public Review):


      The authors have implemented the Optimal Transport algorithm in GromovMatcher for comparing LC/MS features from different datasets. This paper gains significance in the proteomics field for performing meta-analysis of LC/MS data.


      The main strength is that GromovMatcher achieves significant performance metrics compared to other existing methods. The authors have done extensive comparisons to claim that GromovMatcher performs well.


      There are two weaknesses.

      (1) When the number of features is reduced the precision drops to ~0.8.

      We would like to clarify that this drop in precision occurs in the challenging setting where only a small proportion of metabolites are shared between both datasets (e.g., the overlap, or proportion of shared features, was 25% in our simulation study). When two untargeted metabolic datasets share only 25% of their features, this is a challenging setting for any automated matching method, as the vast majority (75%) of the features in both datasets must remain unmatched.

      In such settings, the reviewer correctly observes that the precision of the GromovMatcher algorithms (GM and GMT) drops to the range 0.80-0.85 (Figure 3b, top left panel). A precision of 0.8 or larger is still competitive with the alternative methods metabCombiner (mC) and M2S, whose precisions drop below 0.8 (see main text Fig. 3b, top left panel).

      Precision is measured as the number of metabolite pairs correctly matched divided by all matches identified by a method. In other words, even in the challenging setting when the number of shared features (true matches) between both datasets is small (e.g. low 25% overlap), upwards of 80% of the feature matches found by GromovMatcher are correct which is a very encouraging result.
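      As a minimal sketch of the definition above, precision can be computed from a set of reported matches and a set of ground-truth matches. Recall and F1 are included with their standard definitions because both are used elsewhere in this response. The function name and the toy feature pairs are illustrative assumptions, not taken from the GromovMatcher code.

```python
def matching_scores(found, truth):
    """Precision, recall and F1 for a set of feature matches.

    found, truth: iterables of (feature_in_dataset_1, feature_in_dataset_2) pairs.
    """
    found, truth = set(found), set(truth)
    correct = len(found & truth)  # matches that are both reported and true
    precision = correct / len(found) if found else 0.0
    recall = correct / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Toy example with made-up pairs: 5 matches reported, 4 of them correct.
truth = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)}
found = {(0, 0), (1, 1), (2, 2), (3, 3), (5, 6)}
precision, recall, f1 = matching_scores(found, truth)
print(precision, recall, f1)
```

      In this toy case four of the five reported matches are correct, which corresponds to the precision of roughly 0.8 discussed above.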

      (2) How applicable is the method for other non-human datasets?

      We thank the reviewer for raising this question. The crux of the matter concerning the application to animal data revolves around the hypothesis that correlations between metabolites in two different studies are preserved. Theoretically, the metabolome operates under similar principles in humans, governed by an underlying network of biochemical reactions. Consequently, in comparable human populations, the GM hypothesis is likely to hold to some extent.

      However, in practice, application to animal data is more complicated. Animal studies tend to have smaller sample sizes and often stem from intervention-driven scenarios, such as mice subjected to specific diets or chemicals. This results in deliberate alterations in metabolic structures which makes finding two comparable animal studies less likely. To investigate the reviewer’s question, we have searched through the two predominant LC-MS dataset repositories (MetaboLights and NIH Metabolomics Workbench) but did not find any pairs of comparable animal studies due to the reasons mentioned above. One potential strategy to navigate this issue could entail regressing the metabolic intensities against the variables that notably differ between the two animal populations and running GM using the residual intensities. This would be an interesting direction for future research and additional validation would be needed to test the robustness of GM in this setting.
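      The residual-based strategy suggested above could look roughly like the following sketch. It assumes a single numeric covariate (for example a hypothetical 0/1 diet indicator) and an ordinary least-squares line fitted per feature; a real analysis would likely involve several covariates, and none of the names below come from the GromovMatcher software.

```python
def residualize(intensities, covariate):
    """Residuals of one feature's intensities after a least-squares fit on one covariate."""
    n = len(intensities)
    mean_x = sum(covariate) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, intensities))
    slope = sxy / sxx if sxx else 0.0  # a constant covariate explains nothing
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, intensities)]


# Hypothetical data: four animals, two on a control diet (0) and two on an intervention diet (1).
diet = [0, 0, 1, 1]
feature = [1.0, 1.2, 3.0, 3.2]
residuals = residualize(feature, diet)
print(residuals)
```

      The residual intensities, rather than the raw ones, would then be used to compute the feature correlations that the matching relies on.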

      Reviewer #2 (Public Review):


      The goal of untargeted metabolomics is to identify differences between metabolomes of different biological samples. Untargeted metabolomics identifies features with specific mass-to-charge ratio (m/z) and retention time (RT). Matching those to specific metabolites based on the model compounds from databases is laborious and not always possible, which is why methods for comparing samples on the level of unmatched features are crucial.

      The main purpose of the GromovMatcher method presented here is to merge and compare untargeted metabolomes from different experiments. These larger datasets could then be used to advance biological analyses, for example, for the identification of metabolic disease markers. The main problem that complicates merging different experiments is m/z and RT vary slightly for the same feature (metabolite).

      The main idea behind the GromovMatcher is built on the assumption that if two features match between two datasets (that feature i from dataset 1 matches feature j from dataset 2, and feature k from dataset 1 matches feature l from dataset 2), then the correlations or distances between the two features within each of the datasets (i and k, and j and l) will be similar. The authors then use the Gromov-Wasserstein method to find the best matches matrix from these data.
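      The assumption described above can be illustrated with a small sketch: for a candidate set of matches, sum the squared differences between within-dataset distances over all pairs of matched features. This is only the hard-assignment form of a Gromov-Wasserstein-type objective, evaluated by brute force on made-up 3x3 distance matrices; the actual method solves a relaxed optimal transport problem, and the function name and data here are assumptions.

```python
from itertools import permutations


def distortion(D1, D2, matches):
    """Sum over pairs of matches (i, j), (k, l) of (D1[i][k] - D2[j][l]) ** 2."""
    total = 0.0
    for i, j in matches:
        for k, l in matches:
            total += (D1[i][k] - D2[j][l]) ** 2
    return total


# Made-up distances between three features in dataset 1.
D1 = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
# The same features in dataset 2, stored in a different order (0 -> 2, 1 -> 0, 2 -> 1).
D2 = [[0, 2, 1], [2, 0, 4], [1, 4, 0]]

true_matches = [(0, 2), (1, 0), (2, 1)]
naive_matches = [(0, 0), (1, 1), (2, 2)]
print(distortion(D1, D2, true_matches), distortion(D1, D2, naive_matches))

# Brute-force search over all one-to-one matchings (feasible only for tiny examples).
best = min(permutations(range(3)),
           key=lambda perm: distortion(D1, D2, list(enumerate(perm))))
print(best)
```

      The correct matching is the one that leaves the within-dataset distances unchanged, which is why it minimises the distortion in this toy example.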

      The variation in m/z between the same features in different experiments is a user-defined value and it is initially set to 0.01 ppm. There is no clear limit for RT deviations, so the method estimates a non-linear deviation (drift) of RT between two studies. GromovMatcher estimates the drift between the two studies and then discards the matching pairs where the drift would deviate significantly from the estimate. It learns the drift from a weighted spline regression.
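      The drift-based filtering step described above can be sketched as follows. To keep the example dependency-free, a weighted straight-line fit stands in for the weighted spline regression, and the cutoff (a multiple of the median absolute residual) is a hypothetical choice; neither is claimed to be what GromovMatcher implements.

```python
from statistics import median


def filter_by_rt_drift(pairs, weights, n_medians=3.0):
    """Keep candidate matches whose RT pair lies close to a weighted linear drift estimate.

    pairs: list of (rt_in_study_1, rt_in_study_2) for candidate matches.
    weights: one non-negative confidence weight per pair.
    """
    w_sum = sum(weights)
    mean_x = sum(w * x for w, (x, _) in zip(weights, pairs)) / w_sum
    mean_y = sum(w * y for w, (_, y) in zip(weights, pairs)) / w_sum
    sxx = sum(w * (x - mean_x) ** 2 for w, (x, _) in zip(weights, pairs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, (x, y) in zip(weights, pairs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Absolute deviation of each pair from the estimated drift.
    residuals = [abs(y - (intercept + slope * x)) for x, y in pairs]
    cutoff = n_medians * median(residuals)
    return [p for p, r in zip(pairs, residuals) if r <= cutoff]


# Made-up retention times (minutes): a drift of about +0.5 and one implausible pair.
pairs = [(1.0, 1.5), (2.0, 2.6), (3.0, 3.5), (4.0, 4.6), (5.0, 5.5), (3.0, 6.0)]
weights = [1.0, 1.0, 1.0, 1.0, 1.0, 0.1]
kept = filter_by_rt_drift(pairs, weights)
print(kept)
```

      Replacing the straight line with a spline would allow the estimated drift to bend, which is the non-linear case the reviewer describes.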

      The authors validate the performance of their GromovMatcher method by a validation experiment using a dataset of cord blood. They use 20 different splits and compare the GromovMatcher (both its GM and GMT iterations, whereby the GMT version uses the deviation from estimated RT drift to filter the matching matrix) with two other matching methods: M2S and metabCombiner.

      The second validation was done using a (scaled and centered) dataset of metabolics from cancer datasets from the EPIC cohort that was manually matched by an expert. This dataset was also used to show that using automatic methods can identify more features that are associated with a particular group of samples than what was found by manual matching. Specifically, the authors identify additional features connected to alcohol consumption.


      I see the main strength of this work in its combination of all levels of information (m/z, RT, and higher-order information on correlations between features) and using each of the types of information in a way that is appropriate for the measure. The most innovative aspect is using the Gromov-Wasserstein method to match the features based on distance matrices.

      We thank the reviewer for acknowledging this strength of our proposed GromovMatcher method.

      The authors of the paper identify two main shortcomings with previously established methods that attempt to match features from different experiments: a) all other methods require fine-tuning of user-defined parameters, and, more importantly, b) do not consider correlations between features. The main strength of the GromovMatcher is that it incorporates the information on distances between the features (in addition to also using m/z and RT).


      The first, minor, weakness I could identify is that there do not seem to be many manually curated datasets that could be used for validation.

      We thank the reviewer for raising this issue concerning manually curated validation data.

      Manually curated datasets available for validation purposes are indeed scarce. This stems from the laborious nature of matching features across diverse studies, hence the need for automatic matching methods. Our future strategy involves further validation of the GromovMatcher approach as more data becomes accessible in EPIC and other cohorts.

      The scarcity of real-life publicly available datasets that can be used for validation purposes is the reason why we conducted an extensive simulation study (main text “Validation on ground-truth data” and Appendix 3). It is notably thorough, arguably more comprehensive than previous validations, utilizes real-life untargeted data, and imitates situations where data originates from distinct untargeted metabolomics studies, complete with realistic noise parameters encompassing RT, m/z, and feature intensities. Our validation study comprehensively explores the performance of GromovMatcher, M2S, and metabCombiner, including in challenging realistic settings where there is a nonlinear drift in retention times, varying levels of feature overlaps between studies, normalizations of feature intensities, as well as imbalances in the number of features and samples present in the studies being matched.

      The second is also emphasized by the authors in the discussion. Namely, the method as it is set up now can be directly used only to compare two datasets.

      This is indeed a limitation that is common to all three methods considered in this paper. However, all these methods, GromovMatcher, M2S, and metabCombiner, can still be used to compare and pool multiple datasets using a multi-step procedure. Namely, this can be done by designating a 'reference' dataset and aligning all studies to it one by one. We take this exact approach in our paper when aligning the CS, HCC, and PC studies of the EPIC data in positive mode (main text “Application to EPIC data”). Namely, the HCC and PC studies are both aligned to the CS study by running GromovMatcher twice, and after obtaining these matchings, our analysis is restricted to those features in HCC and PC that are present in the CS study.
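      The multi-step procedure described above can be sketched as follows, assuming each pairwise run yields a mapping from reference-study features to features of the aligned study. The feature identifiers are invented for illustration, and the function is not part of any of the three tools.

```python
def features_shared_with_reference(matchings):
    """Reference features that were matched in every aligned study.

    matchings: dict mapping a study name to a dict
               {reference_feature: matched_feature_in_that_study}.
    """
    shared = None
    for study_matching in matchings.values():
        keys = set(study_matching)
        shared = keys if shared is None else shared & keys
    return sorted(shared or [])


# Hypothetical output of two pairwise runs, both aligned to the CS study.
matchings = {
    "HCC": {"cs_001": "hcc_017", "cs_002": "hcc_003", "cs_005": "hcc_040"},
    "PC": {"cs_001": "pc_009", "cs_005": "pc_021", "cs_007": "pc_002"},
}
shared = features_shared_with_reference(matchings)
print(shared)

# Pooled view: for each shared reference feature, its counterpart in each study.
pooled = {f: {study: m[f] for study, m in matchings.items()} for f in shared}
print(pooled)
```

      Because only features present in the reference survive, the result depends on which study is chosen as the reference.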

      After the reviewer’s comment, we have added an additional sensitivity analysis in Appendix 5, to compare the results produced by GromovMatcher depending on the choice of the reference study. Namely, setting the reference study to either the CS study or the HCC study, GromovMatcher identified 706 and 708 common features respectively, with an overlap of 640 features. This highlights that the choice of reference does matter to some extent. In our original analysis of the EPIC data, choosing CS as the reference was motivated by the fact that CS had the largest sample size (compared to HCC and PC) and a subset of features in HCC and PC were already matched by experts to the CS study which we could use for validation (see Loftfield et al. (2021). J Natl Cancer Inst.).

      As mentioned in the discussion section of our manuscript, the recently proposed multimarginal Gromov-Wasserstein algorithm (Beier, F., Beinert, R., & Steidl, G. (2023). Information and Inference) could potentially allow multiple metabolomic studies to be matched using one optimization routine (e.g. without the designation of a ‘reference study’ for matching). We have not explored this possibility in depth yet as fast numerical methods for multimarginal GW are still in their infancy. Also, such multimarginal methods rely on the computation and storage of coupling or matching matrices that are tensors where the number of dimensions is equal to the number of datasets being matched. Therefore, multimarginal methods have large memory costs, which currently precludes their application for the matching of multiple metabolomics datasets.
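      The memory argument above can be made concrete with a back-of-the-envelope calculation. It assumes a dense coupling tensor with one 8-byte entry per combination of features and a hypothetical 1,000 features per study; sparse or low-rank representations would change these numbers.

```python
def coupling_bytes(n_features, n_datasets, bytes_per_entry=8):
    """Memory needed for a dense coupling tensor with n_features along each of n_datasets axes."""
    return bytes_per_entry * n_features ** n_datasets


# Hypothetical studies with 1,000 features each.
for n_datasets in (2, 3, 4):
    gigabytes = coupling_bytes(1000, n_datasets) / 1e9
    print(n_datasets, "datasets:", gigabytes, "GB")
```

      Under these assumptions the pairwise coupling matrix needs about 8 MB, whereas three studies already require about 8 GB and four about 8 TB.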

      Reviewer #2 (Recommendations For The Authors):

      (1) I was struggling with the representation used in Figure 3a. The gray points overlayed over the green points on a straight line are difficult to visually quantify. I found that my eyes mainly focused on the pattern of the red dots.

      Figure 3a has been modified to improve visual clarity. Namely we have consistently reordered the rows and columns of the coupling matrices such that the true positive matches (green points) are spatially separated from the false negative matches (red points). Now the fraction of true positive and false negative matches can be appreciated much more clearly by eye in Figure 3a.

      (2) I would also like to add the caveat that I cannot judge whether the authors used the other two methods that they compare with GromovMatcher (the M2S and metabCombiner) optimally. But I also do not see any evidence that they did not. Hopefully one of the other reviewers can address that.

      We thank the reviewer for highlighting the comparison of our approach GromovMatcher to the other existing methods M2S and metabCombiner (mC). Both M2S and mC depend on tens of hyperparameters, each with a discrete or continuous set of values that must be properly optimized to infer accurate matchings between dataset features. We detail in Appendix 2 how the hyperparameters of the M2S and mC methods are optimally tuned to achieve the best possible performance on the validation ground-truth data. Namely, both in the simulation study and on EPIC data, we grid-search over all important hyperparameters in the M2S and mC methods and choose those parameter combinations that result in the highest F1 score, averaged over 20 random trials. We remark that no such hyperparameter optimization was performed for our GromovMatcher method. As shown in Figures 3 and 4 of the main text, we find that GromovMatcher outperforms M2S and mC even in these cases when the hyperparameters of M2S and mC are tuned to predict optimal feature matchings.
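      The tuning procedure described above (a grid search that keeps the parameter combination with the highest F1 score averaged over repeated trials) can be sketched generically. The hyperparameter names `rt_tol` and `mz_tol` and the toy scoring function are placeholders, not actual M2S or metabCombiner settings.

```python
from itertools import product
from statistics import mean


def grid_search(run_method, grid, n_trials=20):
    """Return the parameter combination with the highest mean score over n_trials.

    run_method(params, trial) must return a score such as F1 for one random trial.
    grid: dict mapping a hyperparameter name to the list of values to try.
    """
    best_params, best_score = None, float("-inf")
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = mean(run_method(params, trial) for trial in range(n_trials))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_f1(params, trial):
    """Stand-in for running a matching method on one random split (made-up formula)."""
    penalty = abs(params["rt_tol"] - 0.2) + abs(params["mz_tol"] - 0.01)
    return 0.9 - penalty + 0.001 * (trial % 2)


grid = {"rt_tol": [0.1, 0.2, 0.5], "mz_tol": [0.005, 0.01]}
best_params, best_score = grid_search(toy_f1, grid)
print(best_params, best_score)
```

      In a real comparison, `run_method` would run the matching method on one random trial and score its matches against the ground truth.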

      Given the large combinatorial space of hyperparameter choices, we believe we have thoroughly tested the important hyperparameter combinations that users of M2S and mC would be likely to explore in their own research.

      (3) Validation

      (3a) The first validation is done on a split cord blood dataset. I could not clearly see from the paper how sensitive the result is to the dataset split.

      We are grateful for the reviewer’s question and have included new experiments in Appendix 3 which show how the results of GromovMatcher, M2S, and MetabCombiner are affected by the dataset split. In our original manuscript, our validation ground-truth experiment began with an untargeted metabolomic dataset consisting of n = 499 samples and p = 4,712 metabolic features which is split equally into two datasets consisting of an equal number of samples n1 = n2 and an equal number of metabolic features p1 = p2. The features of these equal-sized datasets would then be matched by our method.

      Now in Appendix 3 (Figs. 1-3) we show the sensitivity of all three alignment methods (GromovMatcher, M2S, and metabCombiner) when we vary the ratio of samples in dataset 1 to dataset 2, given by n1/n2, the overlap in shared features between both datasets, and the fraction of metabolic features in dataset 1 that are not present in dataset 2, which affects the relative feature sizes of both datasets, p1/p2. We find that all alignment methods are able to maintain a consistent precision and recall score when these three dataset split parameters are varied. GromovMatcher achieves a higher precision and recall than M2S and metabCombiner for all choices of dataset split, agreeing with the validation experiment results from the main text (see main text Fig. 3). All three methods tested decrease in precision (without dropping in recall) when dataset 1 and dataset 2 contain an equal number of unshared features (i.e. when p1 = p2). Therefore, these sensitivity experiments in Appendix 3 show that our results in the main text are performed in the most challenging setting for the dataset split.

      (3b) The second validation was done using a (scaled and centered) dataset of metabolics from cancer datasets from the EPIC cohort that was manually matched by an expert. Here the authors observe that metabCombiner has good precision, but lags in recall. And M2S has a very similar performance to GromovMatcher. The authors explain this by the fact that the drift in RT between the two experiments is mostly linear and thus does not affect the M2S performance. Can the authors find a different validation dataset where the drift in RT is not linear? If yes, it would be interesting to add it to the paper.

      We thank the reviewer for raising this question. As mentioned above, curated validation datasets such as the EPIC study analyzed in our paper are very rare and we do not currently have a validation study with a nonlinear retention time drift.

      Nevertheless, we performed an additional analysis of simulated data (reported in Appendix 2 – “M2S hyperparameter experiments” and Appendix 2 – Table 1) that demonstrates the decrease in M2S performance when the simulated drift is nonlinear. As presented in Appendix 2 – Table 1, in a low overlap setting with a linear drift which corresponds to the EPIC data, precision and recall were 0.831 and 0.934 respectively, instead of 0.769 and 0.905 in the main analysis where the drift was nonlinear.

    2. eLife assessment

      The authors describe an important tool, GromovMatcher, that can be used to compare proteomic data from various experimental approaches. The underlying method is innovative, the algorithm is clearly described, and the validation that is presented is convincing.

    3. Reviewer #1 (Public Review):


      The authors have implemented the Optimal Transport algorithm in GromovMatcher for comparing LC/MS features from different datasets. This paper gains significance in the proteomics field for performing meta-analysis of LC/MS data.


      The main strength is that GromovMatcher achieves significant performance metrics compared to other existing methods. The authors have done extensive comparisons to claim that GromovMatcher performs well.


      The authors might also need to state in the abstract that, because of the limited availability of datasets, they tested/validated their tool using simulated data.

    4. Reviewer #2 (Public Review):


      The goal of untargeted metabolomics is to identify differences between metabolomes of different biological samples.

      Untargeted metabolomics identifies features with specific mass-to-charge-ratio (m/z) and retention time (RT). Matching those to specific metabolites based on the model compounds from databases is laborious and not always possible, which is why methods for comparing samples on the level of unmatched features are crucial.

      The main purpose of the GromovMatcher method presented here is to merge and compare untargeted metabolomes from different experiments. These larger datasets could then be used to advance biological analyses, for example, for identification of metabolic disease markers.

      The main problem that complicates merging different experiments is that m/z and RT vary slightly for the same feature (metabolite).

      The main idea behind the GromovMatcher is built on the assumption that if two features match between two datasets (that feature i from dataset 1 matches feature j from dataset 2, and feature k from dataset 1 matches feature l from dataset 2), then the correlations or distances between the two features within each of the datasets (i and k, and j and l) will be similar. The authors then use the Gromov-Wasserstein method to find the best matches matrix from these data.

      The variation in m/z between the same features in different experiments is a user-defined value and it is initially set to 0.01 ppm. There is no clear limit for RT deviations, so the method estimates a non-linear deviation (drift) of RT between two studies. GromovMatcher estimates the drift between two studies, and then discards the matching pairs where the drift would deviate significantly from the estimate. It learns the drift from a weighted spline regression.

      The authors validate the performance of their GromovMatcher method using a dataset of cord blood. They use 20 different splits and compare the GromovMatcher (both its GM and GMT iterations, whereby the GMT version uses the deviation from estimated RT drift to filter the matching matrix) with two other matching methods: M2S and metabCombiner.

      The second validation was done using a (scaled and centered) dataset of metabolics from cancer datasets from the EPIC cohort that were manually matched by an expert. This dataset was also used to show that using automated methods can identify more features that are associated with a particular group of samples than what was found by manual matching. Specifically, the authors identify additional features connected to alcohol consumption.


      I see the main strength of this work in its combination of all levels of information (m/z, RT, and higher-order information on correlations between features) and using each of the types of information in a way that is appropriate for the measure. The most innovative aspect is using the Gromov-Wasserstein method to match the features based on distance matrices.

      The authors of the paper identify two main shortcomings with previously established methods that attempt to match features from different experiments: a) all other methods require fine-tuning of user-defined parameters, and, more importantly, b) do not consider correlations between features. The main strength of the GromovMatcher is that it incorporates the information on distances between the features (in addition to also using m/z and RT).


      The main weakness is that there do not seem to be enough manually curated datasets that could be used for validation. It will therefore be important for the authors, and for the field in general, to keep validating and improving their methods if more datasets become available.

      The second weakness, as emphasized by the authors in the discussion, is that the method as currently set up can be used directly only to compare two datasets. I am confident that the authors will successfully implement novel algorithms to address this issue in the future.

    1. eLife assessment

      The large-conductance Ca2+ activated K+ channel BKCa has been reported to promote breast cancer progression. The present study presents convincing evidence that an intracellular subpopulation of this channel reprograms breast cancer cells towards the Warburg phenotype, one of the metabolic hallmarks of cancer. This important finding advances the field of cancer cell metabolism and has potential therapeutic implications.

    2. Reviewer #2 (Public Review):


      The large-conductance Ca2+ activated K+ channel (BK) has been reported to promote breast cancer progression, but it is not clear how. The present study, carried out in breast cancer cell lines, concludes that BK located in mitochondria reprograms cells towards the Warburg phenotype, one of the metabolic hallmarks of cancer.


      Strengths: The use of a wide array of modern complementary techniques, including metabolic imaging, respirometry, metabolomics and electrophysiology. On the whole, the experiments are astute and well designed, and appear carefully done. The use of BK knockout cells to control for the specificity of the pharmacological tools is a major strength. The manuscript is clearly written. There are many interesting original observations that may give birth to new studies.

      Weaknesses: The main conclusion regarding the role of a BK channel located in mitochondria is not sufficiently supported. Other aspects that could be improved are the interpretation of co-localization experiments and the calibration of Ca2+ dyes. These points are discussed in more detail in the following paragraphs:

      (1) May the metabolic effects be ascribed to a BK located in mitochondria? Unfortunately not, at least with the available evidence. While it is clear these cells have a BK in mitochondria (characteristic K+ currents detected in mitoplasts) and it is also well substantiated that the metabolic effects in intact cells are explained by an intracellular BK (paxilline effects absent in the BK KO), it does not follow that both observations are linked. Given that ectopic BK-DEC appeared at the surface, a confounding factor is the likely expression of BK in other intracellular locations such as ER, Golgi, endosomes, etc. To their credit, the authors acknowledge this limitation several times throughout the text ("...presumably mitoBK...") but not in other important places, particularly in the title and abstract.

      (2) mitoBK subcellular location. Pearson correlations of 0.6 and about zero were obtained between the locations of mitoGREEN on one side, and mRFP or RFP-GPI on the other (Figs. 1G and S1E). These are nice positive and negative controls. For BK-DECRFP, however, the Pearson correlation was about 0.2. What is the Z resolution of apotome imaging? Assuming an optimum optical section of 600 nm, as obtained with a 1.4 NA objective on a confocal, that mitochondria are typically 100 nm in diameter and that BK-DECRFP appears to stain more structures than mitoGREEN, the positive correlation of 0.2 may not reflect colocalization. For instance, it could be that BK-DECRFP is not just in mitochondria but also in a closely apposed organelle, e.g. the ER. Along the same line, why did BK-RFP also give a positive Pearson? Isn't that unexpected? Considering that BK-DEC was found by patch clamping at the plasma membrane, the subcellular targeting of the channel is suspect. Could it be that the endogenous BK-DEC does actually reside exclusively in mitochondria (a true mitoBK), but overflows to other membranes upon overexpression? Regarding immunodetection of BK in the mitochondrial Percoll preparation (Fig. S5), absence of NKA demonstrates absence of plasma membrane contamination, but does not inform about contamination by other intracellular membranes.

      (3) Calibration of fluorescent probes. The conclusion that BK blockers or BK expression affect resting Ca2+ levels should be better supported. Fluorescent sensors and dyes provide signals or ratios that need to be calibrated if comparisons between different cell types or experimental conditions are to be made. This is implicitly acknowledged here when monitoring ER Ca2+, with an elaborate protocol to deplete the organelle in order to achieve a reading at zero Ca2+.

      (4) Line 203. "...solely by the expression of BKCa-DECRFP in MCF-7 cells". Granted, the effect of BKCa-DECRFP on the basal FRET ratio appears stronger than that of BK-RFP, but it appears that the latter had some effect. Please provide the statistics of the latter against the control group (after calibration, see above).
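
      For readers less familiar with the colocalization metric discussed in point (2), the sketch below computes a Pearson correlation coefficient between two channels whose pixel intensities have been flattened into equal-length lists. It is a generic illustration under that assumption, with hypothetical names, and does not reproduce the image-analysis pipeline used in the manuscript.

```python
from math import sqrt


def pearson(channel_a, channel_b):
    """Pearson correlation between two equal-length lists of pixel intensities."""
    n = len(channel_a)
    if n != len(channel_b) or n < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / sqrt(var_a * var_b)
```

      Because the coefficient is computed pixel by pixel, two labels that sit in distinct but neighbouring structures within the same optical section can still yield a modest positive value, which is the concern raised about the coefficient of about 0.2.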

      The revised version of the manuscript has incorporated my suggestions to a very reasonable degree, in several cases with new experiments. The details of these improvements can be found in the correspondence.

    3. Reviewer #3 (Public Review):

      The original research article, titled "mitoBKCa is functionally expressed in murine and human breast cancer cells and promotes metabolic reprogramming" by Bischof et al, has demonstrated the underlying molecular mechanisms of alterations in the function of Ca2+ activated K+ channel of large conductance (BKCa) in the development and progression of breast cancer. The authors also proposed that targeting mitoBKCa in combination with established anti-cancer approaches, could be considered as a novel treatment strategy in breast cancer treatment.

      The paper is modified according to the reviewer's comments. Most of the queries raised by this reviewer were answered. However, the preclinical implication of this study can also be manifested in combinatorial treatment with known chemotherapeutic drugs which is lacking in this manuscript. Hopefully, the authors will consider this in their future study.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This important study reports a novel mechanism linking DHODH inhibition-mediated pyrimidine nucleotide depletion to antigen presentation. Alternative means of inducing antigen presentation provide therapeutic opportunities to augment immune checkpoint blockade for cancer treatment. While the solid mechanistic data in vitro are compelling, in vivo assessments of the functional relevance of this mechanism are still incomplete.

      Public Reviews:

      We thank all Reviewers for their insightful comments and excellent suggestions.

      Reviewer #1 (Public Review):

      The manuscript by Mullen et al. investigated the gene expression changes in cancer cells treated with the DHODH inhibitor brequinar (BQ), to explore the therapeutic vulnerabilities induced by DHODH inhibition. The study found that BQ treatment causes upregulation of antigen presentation pathway (APP) genes and cell surface MHC class I expression, which is mechanistically mediated by the CDK9/PTEFb pathway triggered by pyrimidine nucleotide depletion.

      No comment from authors

      The combination of BQ and immune checkpoint therapy demonstrated a synergistic (or additive) anti-cancer effect against xenografted melanoma, suggesting the potential use of BQ and immune checkpoint blockade as a combination therapy in clinical therapeutics.

      No comment from authors

      The interesting findings in the present study include demonstrating a novel cellular response in cancer cells induced by DHODH inhibition. However, a limitation of the study is that it does not directly examine whether the increased antigen presentation induced by DHODH inhibition actually contributed to the potentiation of the efficacy of immune checkpoint blockade (ICB).

      No comment from authors for preceding text, comment addresses the following text

      Moreover, the proposed mechanism, in which pyrimidine depletion increases the antigen presentation pathway via CDK9/PTEFb, was not validated by genetic KD or KO targeting the CDK9/PTEFb pathway.

      We appreciate this comment, and we would like to explain why we did not pursue these approaches. According to DepMap, CRISPR/Cas9-mediated knockout of CDK9 in cancer cell lines is almost universally deleterious, scoring as “essential” in 99.8% (1093/1095) of all cell lines tested (see Author response image 1 below). This makes sense, as P-TEFb is required for productive RNA polymerase II elongation of most mammalian genes. As such, it was not feasible to generate cell lines with stable genetic knockout of CDK9 to test our hypothesis.

      While knockdown of CDK9 by RNA interference could support our results, DepMap data seems to indicate that RNAi-mediated knockdown of CDK9 is generally ineffective in silencing its activity, as this perturbation scored as “essential” in only 6.2% (44/710) of tested cell lines. This suggests that incomplete depletion of CDK9 will likely not be sufficient to block APP induction downstream of nucleotide depletion. Furthermore, RNAi-mediated depletion of CDK9 may trigger transcriptional changes in the cell by virtue of its many documented protein-protein interactions, and it would be difficult to establish a consistent “time zero” at which point CDK9 protein depletion is substantial but secondary effects of this have not yet occurred to a significant degree. These factors constitute major limitations of experiments using RNAi-mediated knockdown of CDK9.

      Author response image 1.

      Essentiality score from CRISPR and RNAi perturbation of CDK9 in cancer cell lines https://depmap.org/portal/gene/CDK9?tab=overview&dependency=RNAi_merged

      At any rate, we provide evidence that three different inhibitors of CDK9 (flavopiridol, dinaciclib, and AT7519) all inhibit our effect of interest (Fig 4B). The same results were observed using a previously validated CDK9-directed proteolysis targeting chimera (PROTAC2), and this was reversed by addition of excess pomalidomide (Fig 4C), which correlated with the presence/absence of CDK9 on western blot under the exact same conditions (Fig 4D).

      It is formally possible that all CDK9 inhibitors we tested are blocking BQ-mediated APP induction by some shared off-target mechanism (or perhaps by two or more different off-target mechanisms) AND this CDK9-independent target also happens to be degraded by PROTAC2. However, this would be an extraordinarily non-parsimonious explanation for our results, and so we contend that we have provided compelling evidence for the requirement of CDK9 for BQ-mediated APP induction.

      Finally, high concentrations of BQ have been reported to show off-target effects, sensitizing cancer cells to ferroptosis, and the authors should discuss whether the dose used in the in vivo study reached the ferroptotic sensitizing dose or not.

      We are intrigued by the results shown to us by Reviewer #1 in the linked preprint (Mishima et al 2022, https://doi.org/10.21203/rs.3.rs-2190326/v1). We have also observed in our unpublished data that very high concentrations of BQ (>150µM) cause loss of cell viability that is not rescued by uridine supplementation and that occurs even in DHODH knockout cells. This effect of high-dose BQ must be DHODH-independent. We also agree that Mishima et al provide compelling evidence that the ferroptosis-sensitizing effect of high-dose BQ treatment is due (at least in large part) to inhibition of FSP1.

      Although we showed that DHODH is strongly inhibited in tumor cells in vivo (Fig 5C), we did not directly measure the concentration of BQ in the tumor or plasma. Sykes et al (PMID: 27641501) found that the maximum plasma concentration (Cmax) for [BQ]free following a single IP administration in C57Bl6/J mice (15mg/kg) is approximately 3µM, while the Cmax for [BQ]total was around 215µM. Because polar drug molecules bound to serum proteins (predominantly albumin) are not available to bind other targets, [BQ]free is the relevant parameter.

      Given a Cmax for [BQ]free of 3µM and half-life of 12.0 hours, we estimate that the steady-state [BQ]free with daily IP injections at this dose is around 4µM. Since we used an administration schedule of 10mg/kg every 24 hours, we estimate that the steady-state plasma [BQ]free in our system was 2.67µM (assuming initial Cmax of 2µM and half-life of 12.0 hours).

      To derive an upper-bound estimate for the Cmax of [BQ]free over the 12-day treatment period (Fig 5A-D), we will use the observed data for 15mg/kg dose, and we will assume that 1) there is no clearance of BQ whatsoever and 2) that [BQ]free increases linearly with increasing [BQ]total. This yields a maximum free BQ concentration of 12 x 3 = 36µM.
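
      The estimates above can be reproduced with a short calculation. The sketch below assumes first-order elimination and simple superposition of repeated doses, which is consistent with the quoted figures but is our reading rather than a stated pharmacokinetic model; the function names are hypothetical.

```python
def steady_state_peak(cmax_single, half_life_h, interval_h):
    """Peak concentration at steady state under repeated dosing.

    Assumes first-order elimination and linear superposition of doses,
    so peaks form a geometric series with ratio 0.5 ** (interval / half-life).
    """
    remaining = 0.5 ** (interval_h / half_life_h)
    return cmax_single / (1.0 - remaining)


def no_clearance_upper_bound(cmax_single, n_doses):
    """Upper bound if nothing is cleared: every dose adds its full Cmax."""
    return cmax_single * n_doses


print(round(steady_state_peak(3.0, 12.0, 24.0), 2))  # 15 mg/kg daily
print(round(steady_state_peak(2.0, 12.0, 24.0), 2))  # 10 mg/kg daily
print(no_clearance_upper_bound(3.0, 12))             # 12 daily doses, no clearance
```

      With a 12-hour half-life and a 24-hour dosing interval, one quarter of each peak remains at the next dose, so the steady-state peak is 4/3 of the single-dose Cmax. The three calls give 4.0, 2.67 and 36.0 µM, matching the values quoted in the text.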

      Therefore, we consider it very unlikely that plasma concentrations of free BQ in our experiment exceeded the lower limit of the ferroptosis-sensitizing dose range reported by Mishima et al. However, without direct pharmacokinetic analysis, we cannot say for sure what the maximal [BQ]free was under our experimental conditions.

      Reviewer #2 (Public Review):

      In their manuscript entitled "DHODH inhibition enhances the efficacy of immune checkpoint blockade by increasing cancer cell antigen presentation", Mullen et al. describe an interesting mechanism of inducing antigen presentation. The manuscript includes a series of experiments that demonstrate that blockade of pyrimidine synthesis with DHODH inhibitors (i.e. brequinar (BQ)) stimulates the expression of genes involved in antigen presentation. The authors provide evidence that BQ mediated induction of MHC is independent of interferon signaling. A subsequent targeted chemical screen yielded evidence that CDK9 is the critical downstream mediator that induces RNA Pol II pause release on antigen presentation genes to increase expression. Finally, the authors demonstrate that BQ elicits strong anti-tumor activity in vivo in syngeneic models, and that combination of BQ with immune checkpoint blockade (ICB) results in significant lifespan extension in the B16-F10 melanoma model. Overall, the manuscript uncovers an interesting and unexpected mechanism that influences antigen presentation and provides an avenue for pharmacological manipulation of MHC genes, which is therapeutically relevant in many cancers. However, a few key experiments are needed to ensure that the proposed mechanism is indeed functional in vivo.

      The combination of DHODH inhibition with ICB reflects more of an additive response instead of a synergistic combination. Moreover, the temporal separation of BQ and ICB raises the question of whether the induction of antigen presentation with BQ is persistent during the course of delayed ICB treatment. To confidently conclude that induction of antigen presentation is a fundamental component of the in vivo response to DHODH inhibition, the authors should examine whether depletion of immune cells can reduce the therapeutic efficacy of BQ in vivo.

      We concur with this assessment.

      Moreover, they should examine whether BQ treatment induces antigen presentation in non-malignant cells and APCs to determine the cancer specificity.

      Although we showed that this occurs in HEK-293T cells, we appreciate that this cell line is not representative of human cells of any organ system in vivo. So, we agree it is important to determine if DHODH inhibition induces antigen presentation in human tissues and professional antigen presenting cells, and this is an excellent focus for future studies.

      However, it should also be noted that increased antigen presentation in non-malignant host tissues would not be expected to generate an autoimmune response, because host tissues likely lack strong neoantigens, and whatever immunogenic peptides they may have would likely be presented via MHC-I at baseline (i.e. even in the absence of DHODH inhibitor treatment), since all nucleated cells express MHC-I.

      This argument is strongly supported by clinical experience/data, as DHODH inhibitors (leflunomide and teriflunomide) are commonly used to treat rheumatoid arthritis and multiple sclerosis. While the pathophysiology of these autoimmune syndromes is complex, it is thought that both diseases are driven by aberrant T-cell attack on host tissues, mediated by incorrect recognition of host antigens presented via MHC-I (as well as MHC-II) as “foreign.”

      If increased antigen presentation in host tissues (downstream of DHODH inhibition) could lead to a de novo autoimmune response, then administration of DHODH inhibitors would be expected to exacerbate T-cell driven autoimmune disease rather than ameliorate it. Randomized controlled trials have consistently found that treatment with DHODH inhibitors leads to improvement of rheumatoid arthritis and multiple sclerosis symptoms, which is the opposite of what one would expect if DHODH inhibitors are causing de novo autoimmune reactions in human patients.

      Finally, although the authors show that DHODH inhibition induces expression of both MHC-I and MHC-II genes at the RNA level, only MHC-I is validated by flow cytometry. Given the importance of MHC-II expression on epithelial cancers, including melanoma, MHC-II should be validated as well.

      We fully agree with this statement. We attempted to quantify cell surface MHC-II expression by FACS using the same method as for MHC-I (Figs 1G-H, 2D, and 3F). We did not detect cell surface MHC-II in any of our cancer cell lines, despite the use of high-dose interferon gamma and other stimulants (which robustly increase MHC-II mRNA in our system) in an attempt to induce expression. However, because we did not use cells known to express MHC-II as a positive control (e.g. B-cell leukemia cell lines or primary splenocytes), we do not know if our results are due to some technical failure (perhaps related to our protocol/reagents) or if they reflect a true absence of cell surface MHC-II in our cell lines.

      If the latter is true, that implies that either 1) MHC-II mRNA is not translated or 2) that it is translated, but our cancer cell lines lack one or more elements of the machinery required for MHC-II antigen presentation.

      In any case, it is important to determine if DHODH inhibition increases MHC-II at the cell surface of cancer cells using appropriate positive and negative controls, as this could have important implications for cancer immunotherapy.

      [As a minor point, melanoma is not an epithelial cancer, as it is derived from neural crest lineage cells (melanocytes)]

      Overall, the paper is clearly written and presented. With the additional experiments described above, especially in vivo, this manuscript would provide a strong contribution to the field of antigen presentation in cancer. The distinct mechanisms by which DHODH inhibition induces antigen presentation will also set the stage for future exploration into alternative methods of antigen induction.

      Reviewer #3 (Public Review):

      Mullen et al present an important study describing how DHODH inhibition enhances efficacy of immune checkpoint blockade by increasing cell surface expression of MHC I in cancer cells. DHODH inhibitors have been used in the clinic for many years to treat patients with rheumatoid arthritis and there has been a growing interest in repurposing these inhibitors as anti-cancer drugs. In this manuscript, the Singh group build on their previous work defining combinatorial strategies with DHODH inhibitors to improve efficacy. The authors identify an increase in expression of genes involved in the antigen presentation pathway and MHC I after BQ treatment and they narrow the mechanism to be strictly pyrimidine and CDK9/P-TEFb dependent. The authors rationalize that increased MHC I expression induced by DHODH inhibition might favor efficacy of dual immune checkpoint blockade. This combinatorial treatment prolonged survival in an immunocompetent B16F10 melanoma model.

      [No comment from authors]

      Previous studies have shown that DHODH inhibitors can increase expression of innate immunity-related genes but the role of DHODH and pyrimidine nucleotides in antigen presentation has not been previously reported. A strength of the manuscript is the use of multiple controls across a panel of cell lines to exclude off-target effects and to confirm that effects are exclusively dependent on pyrimidine depletion. Overall, the authors do a thorough characterization of the mechanism that mediates MHC I upregulation using multiple strategies. Furthermore, the in vivo studies provide solid evidence for combining DHODH inhibitors with immune checkpoint blockade.

      No comment from authors

      However, despite the use of multiple cell lines, most experiments are only performed in one cell line, and it is hard to understand why particular gene sets, cell lines or time points are selected for each experiment. It would be beneficial to standardize experimental conditions and confirm the most relevant findings in multiple cell lines.

      We appreciate this comment, and we understand how the use of various cell lines may seem puzzling. We would like to explain how our cell line panel evolved over the course of the study. Our first indication that BQ caused APP upregulation came from transcriptomics experiments (Figs 1A-D, S1A) performed as part of a previous study investigating BQ resistance (Mullen et al, 2023 Cancer Letters). In that study, we used CFPAC-1 as a model for BQ sensitivity and S2-013 as a model for BQ resistance. We did RNA sequencing +/- BQ in these cell lines to look for gene expression patterns that might underlie resistance/sensitivity to BQ. When analyzing this data, we serendipitously discovered the APP/MHC phenomenon, which gave rise to the present study.

      Our next step was to extend these findings to cancer cell lines of other histologies, and we prioritized cell lines derived from common cancer types for which immunotherapy (specifically ICB) is clinically approved. This is why A549 (lung adenocarcinoma), HCT116 (colorectal adenocarcinoma), A375 (cutaneous melanoma), and MDA-MB-231 (triple-negative breast cancer) cell lines were introduced.

      Because PDAC is considered to have an especially “immune-cold” tumor microenvironment, we reasoned that even dramatically increasing cancer cell antigen presentation may be insufficient to elicit an effective anti-tumor immune response in vivo. So we shifted our focus towards melanoma, because a subset of melanoma patients is very responsive to ICB and loss of antigen presentation (by direct silencing or homozygous loss-of-function mutations in MHC-I components such as B2M, or by functional loss of IFN-JAK1/2-STAT signaling) has been shown to mediate ICB resistance in human melanoma patients. This is why we extended our findings to B16F10 murine melanoma cells, intending to use them for in vivo studies with syngeneic immunocompetent recipient mice.

      The PDAC cell line MiaPaCa2 was introduced because a collaborator at our institution (Amar Natarajan) happened to have IKK2 knockout MiaPaCa2 cells, which allowed us to genetically validate our inhibitor results showing that IKK1 and IKK2 (crucial effectors for NF-kB signaling) are dispensable for our effect of interest.

      Ultimately, realizing that our results spanned various human and murine cell lines, we chose to use HEK-293T cells to validate the general applicability of our findings to proliferating cells in 2D culture, since HEK-293T cells (compared to our cancer cell lines) have relatively few genetic idiosyncrasies and express MHC-I at baseline.

      The differential in vivo survival depending on dosing schedule is interesting. However, this section could be strengthened with a more thorough evaluation of the tumors at endpoint.

      Overall, this is an interesting manuscript proposing a mechanistic link between pyrimidine depletion and MHC I expression and a novel therapeutic strategy combining DHODH inhibitors with dual checkpoint blockade. These results might be relevant for the clinical development of DHODH inhibitors in the treatment of solid tumors, a setting where these inhibitors have not shown optimal efficacy yet.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The main issue is that the study did not directly examine whether the increased antigen presentation induced by DHODH inhibition contributed to the potentiation of the efficacy of immune checkpoint blockade (ICB). The additional effect of BQ in the xenograft tumor study was not examined to determine whether it was due to increased antigen presentation in the cancer cells or merely to a cell cycle arrest effect of pyrimidine depletion in the tumor cells. The different administration timing of ICB with BQ treatment (Fig 5E) would not be sufficient to resolve this issue.

      We agree with this assessment, and we believe the experiment proposed by Reviewer #2 below (comparing the efficacy of BQ in Rag-null versus immunocompetent recipients) would address this question directly. We also think that using a more immunogenic cell line for this experiment (such as B16F10 transduced with ovalbumin or some other strong neoantigen) would be useful given the poor immunogenicity and lack of any defined strong neoantigen in B16F10 cells. An orthogonal approach would be to engraft cancer cells with or without B2M knockout into immunocompetent recipient mice (+/- BQ treatment) to further implicate MHC-I and antigen presentation. These questions will be addressed in future studies.

      (2) Additionally, in the in vivo study, the increase in surface MHC-I at the protein level following BQ treatment was not examined in the tumor samples, and it was not confirmed whether increased antigen presentation by BQ treatment actually promoted an anti-cancer immune response in immune cells. These data would be necessary to support the story presented in the study.

      We attempted to show this by immunohistochemistry, but unfortunately the anti-H2-Db antibody that we obtained for this purpose did not have satisfactory performance to assess this in our tissue samples harvested at necropsy.

      (3) The proposed mechanism, in which pyrimidine depletion increases the antigen presentation pathway via CDK9/PTEFb, was not validated by genetic KD or KO targeting the CDK9/PTEFb pathway. In general, results obtained only from inhibitor assays are limited by possible off-target effects.

      Please see our above reply to Reviewer #1 comment making this same point, where we spell out our rationale for not pursuing these experiments.

      (4) High concentrations of BQ (> 50 uM) have been reported to show off-target effects, sensitizing cancer cells to ferroptosis, an iron-mediated lipid peroxidation-dependent cell death, independent of DHODH inhibition (https://www.researchsquare.com/article/rs-2190326/v1). The authors should discuss whether the dose used in the in vivo study reached the ferroptosis-sensitizing dose.

      Please see our above reply to Reviewer #1 comment making this same point, where we explain why we are very confident that the BQ dose administered in our animal experiments was far below the minimum reported BQ dose required to sensitize cancer cells to ferroptosis in vitro.

      Reviewer #2 (Recommendations For The Authors):

      Major Points

      (1) According to the proposed model, BQ mediated induction of antigen presentation is a contributing factor to the efficacy of this therapeutic strategy. If this is true, then depletion of immune cells should reduce the therapeutic efficacy of BQ in vivo. The authors should perform the B16-F10 transplant experiments in either Rag null mice (if available) or with CD8/CD4 depletion. The expectation would be that T cell depletion (or MHC loss with genetic manipulation) should reduce the efficacy of BQ treatment. Absent this critical experiment, it is difficult to confidently conclude that induction of antigen presentation is a fundamental component of the in vivo response to DHODH inhibition.

      We agree with this assessment and the proposed experiment comparing the response in Rag-null versus immunocompetent recipients. We also think that using a more immunogenic cell line for this experiment (such as B16F10 transduced with ovalbumin or some other strong neoantigen) would be useful given the poor immunogenicity and lack of any defined strong neoantigen in B16F10 cells. An orthogonal approach would be to engraft cancer cells with or without B2M knockout into immunocompetent recipient mice (+/- BQ treatment) to further implicate MHC-I and antigen presentation. These questions will be addressed in future studies.

      (2) Does BQ treatment induce antigen presentation in non-malignant cells? APCs? If the induction of antigen presentation is not cancer specific and related to a pyrimidine depletion stress response, then there is a possibility that healthy tissues will also exhibit a similar phenotype, raising concerns about the specificity of a de novo immune response. The authors should examine antigen presentation genes in healthy tissues treated with BQ.

      We agree it is important to examine if our findings regarding nucleotide depletion and antigen presentation are true of APCs and other non-transformed cells, but we are not so concerned about the possibility of raising an immune response against non-malignant host tissues, as explained above. We have reproduced the relevant section below:

      “However, it should also be noted that increased antigen presentation in non-malignant host tissues would not be expected to generate an autoimmune response, because host tissues likely lack strong neoantigens, and whatever immunogenic peptides they may have would likely be presented via MHC-I at baseline, since all nucleated cells express MHC-I.

      This argument is strongly supported by clinical experience/data, as DHODH inhibitors (leflunomide and teriflunomide) are commonly used to treat rheumatoid arthritis and multiple sclerosis. While the pathophysiology of these autoimmune syndromes is complex, it is thought that both diseases are driven by aberrant T-cell attack on host tissues, mediated by incorrect recognition of host antigens presented via MHC-I (as well as MHC-II) as “foreign.”

      If increased antigen presentation in host tissues (downstream of DHODH inhibition) could lead to a de novo autoimmune response, then administration of DHODH inhibitors would be expected to exacerbate T-cell driven autoimmune disease rather than ameliorate it. Randomized controlled trials have consistently found that treatment with DHODH inhibitors leads to improvement of rheumatoid arthritis and multiple sclerosis symptoms, which is the opposite of what one would expect if DHODH inhibitors are causing de novo autoimmune reactions in human patients.”

      (3) In the title, the authors claim that DHODH enhances the efficacy of ICB. However, the experiment shown in Figure 5D does not demonstrate this. The Kaplan Meier curves reflect more of an additive response versus a synergistic combination. Furthermore, the concurrent treatment of BQ and ICB seems to inhibit the efficacy of ICB due to BQ toxicity in immune cells. This result seems to contradict the title.

      We do not agree with this assessment. Given that the effect of dual ICB alone was very marginal, while the effect of BQ monotherapy was quite marked, we cannot conclude from Fig 5 that BQ treatment inhibited ICB efficacy due to immune suppression.

      (4) Related to Point 3, the temporal separation of BQ and ICB raises the question of whether the induction of antigen presentation with BQ is persistent during the course of delayed ICB treatment. One explanation for the results is that BQ treatment reduces tumor burden and a subsequent course of ICB independently reduces it further, without the two therapies functioning in synergy. To address this, the authors should measure the duration of BQ-mediated induction of antigen presentation after stopping treatment.

      We agree that the alternative explanation proposed by Reviewer #2 is possible and we appreciate the suggestion to test the stability of APP induction after stopping BQ treatment.

      (5) In Figure 1, the authors show that DHODH inhibition induces expression of both MHC-I and MHC-II genes at the RNA level. However, they only validate MHC-I by flow cytometry. A simple experiment to evaluate the effect of BQ treatment on MHC-II surface expression would provide important additional mechanistic insight into the immunomodulatory effects of DHODH inhibition, especially given recent literature reinforcing the importance of MHC-II expression on epithelial cancers, including melanoma (Oliveira et al. Nature 2022).

      We fully agree with this statement. We attempted to quantify cell surface MHC-II expression by FACS using the same method as for MHC-I (Figs 1G-H, 2D, and 3F). We did not detect cell surface MHC-II in any of our cancer cell lines, despite the use of high-dose interferon gamma and other stimulants (which robustly increase MHC-II mRNA in our system) in an attempt to induce expression. However, because we did not use cells known to express MHC-II as a positive control (e.g. B-cell leukemia cell lines or primary splenocytes), we do not know if our results are due to some technical failure (perhaps related to our protocol/reagents) or if they reflect a true absence of cell surface MHC-II in our cell lines.

      If the latter is true, that implies either 1) that MHC-II mRNA is not translated or 2) that it is translated, but our cancer cell lines lack one or more elements of the machinery required for MHC-II antigen presentation.

      In any case, it is important to determine if DHODH inhibition increases MHC-II at the cell surface of cancer cells using appropriate positive and negative controls, as this could have important implications for cancer immunotherapy.

      [As a minor point, melanoma is not an epithelial cancer, as it is derived from neural crest lineage cells (melanocytes)]

      Minor Points

      (1) The authors show ChIP-seq tracks from Tan et al. for HLA-B. However, given the pervasive effect of Ter treatment across many HLA genes, the authors should either show tracks at additional loci, or provide a heatmap of read density across more loci. This would substantiate the mechanistic claim that RNA Pol II occupancy and activity across antigen presentation genes is the major driver of response to DHODH inhibition as opposed to mRNA stabilization/increased translation.

      We appreciate this suggestion. We have changed Fig 4 by replacing the HLA-B track (old Fig 4E) with a representation of fold change (Ter/DMSO) in Pol II occupancy versus fold change (Ter/DMSO) in mRNA abundance for 23 relevant genes (new Fig 4G); both of these datasets were obtained from the Tan et al manuscript. This new figure panel (Fig 4G) also shows linear regression analysis demonstrating that Pol II occupancy and mRNA expression are significantly correlated for APP genes. While we recognize that this data in itself is not formal proof of our hypothesis, it does strongly support the notion that increased transcription is responsible for the increased mRNA abundance of APP genes that we have observed.

      (2) A compelling way to demonstrate a change in antigen presentation is through mass spectrometry based immunopeptidomics. Performing immunopeptidomic analysis of BQ treated cell lines would provide substantial mechanistic insight into the outcome of BQ treatment. While this approach may be outside the scope of the current work, the authors should speculate on how this treatment may specifically alter the antigenic landscape where future directions would include empirical immunopeptidomics measurements.

      We fully agree with this comment. While the abundance of cancer cell surface MHC-I is an important factor for anticancer immunity, another crucial factor is the identity of peptides that are presented. Treatments that cause presentation of more immunogenic peptides can enhance T-cell recognition even in the absence of a relative change in cell surface MHC-I abundance.

      While we did not perform the immunopeptidomics experiments described, we can offer some speculation regarding this comment. As shown in Fig 1D-E, transcriptomics experiments suggest that immunoproteasome subunits (PSMB8, PSMB9, PSMB10) are upregulated upon DHODH inhibition. If this change in mRNA levels translates into greater immunoproteasome activity (which was not tested in our study), this would be expected to alter the repertoire of peptides available for presentation and could thereby change the immunopeptidome.

      However, this hypothesis requires direct testing, and we hope future studies will delineate the effects of DHODH inhibition and other cancer therapies on the immunopeptidome, as this area of research will have important clinical implications.

      (3) While the signaling through CDK9 seems convincing, it still does not provide a mechanistic link between depleted pyrimidines and CDK9 activity. The authors should speculate on the mechanism that signals to CDK9.

      We agree with the assessment. A mechanistic link between depleted pyrimidines and CDK9 activity will be a subject of future studies.

      (4) Related to minor point 2, the authors should consider a genetic approach to confirm the importance of CDK9. While the pharmacological approach, including multiple mechanistically distinct CDK9 inhibitors, provides strong evidence, an additional experiment with genetic depletion of CDK9 (CRISPR KO, shRNA, etc.) would provide compelling mechanistic confirmation.

      Reviewer #1 raised this very same point, and we agree. Please see our reply to Reviewer #1, which details why we did not pursue this approach and argues that the evidence we present is compelling even in absence of genetic manipulation.

      Additionally, please see the new Fig 4E and 4F. Fig 4E is a repeat of Fig 4B using HCT116 cells and shows that, in this cell line, CDK9 inhibitors (flavopiridol, dinaciclib, and AT7519) block BQ-mediated APP induction, while PROTAC2 does not. Figure 4F shows that (for reasons we cannot fully explain) PROTAC2 does not lead to CDK9 degradation in HCT116 cells. These data strongly implicate CDK9, because they exclude a CDK9-degradation-independent effect of PROTAC2.

      (5) Figure 2B needs a legend.

      Thank you for pointing this out. We have added a legend to Fig 2B.

      (6) The authors should comment in the discussion on how this strategy may be particularly useful in patients harboring genetic or epigenetic loss of interferon signaling, a known mechanism of ICB resistance. Perhaps DHODH inhibition could rescue MHC expression in cells that are deficient in interferon sensing.

      Thank you for this suggestion! We have amended the Discussion section to mention this important point. Please see paragraph 2 of the revised Discussion section where we have added the following text:

      “Because BQ-mediated APP induction does not require interferon signaling, this strategy may have particular relevance for clinical scenarios in which tumor antigen presentation is dampened by the loss or silencing of cancer cell interferon signaling, which has been demonstrated to confer both intrinsic and acquired ICB resistance in human melanoma patients.”

      Reviewer #3 (Recommendations For The Authors):

      The authors present convincing evidence of the mechanism by which pyrimidine nucleotides regulate MHC I levels and about the potential of combining DHODH inhibitors with dual immune checkpoint blockade (ICB). This is an interesting paper given the clinical relevance of DHODH inhibitors. The studies raise some questions, and some points might need clarifying as below:

      • In Figure 2C, why do the authors focus on these two genes in the uridine rescue? These are important genes mediating antigen presentation, but it might be more interesting to see how H2-Db and H2-Kb expression correlate with the protein data shown in Fig 2D. Fig. 2C-2D is a relevant control, so it would be important to validate in a different cancer cell line (e.g. one of the PDAC cell lines used for the RNAseq).

      We appreciate this comment. Although Fig 3C shows that BQ-induced expression of H2-Db, H2-Kb, and B2m is reversed by uridine (in B16F10 cells), we recognize that this was not the best placement for this data, as it can easily be overlooked here since uridine reversal is not the main point of Fig 3C. We have left Fig 3C as is, because we think that the uridine reversal demonstrated in that panel serves as a good internal positive control for reversal of BQ-mediated APP induction in that experiment.

      We have repeated the experiments shown in the original Fig 2C and substituted the original Fig 2C with a new Fig 2C and Fig S2B, which show both Tap1 and Nlrc5 as well as H2-Db, H2-Kb, and B2m after treatment with either BQ (new Fig 2C) or teriflunomide (new Fig S2B). The original Fig S2B is now Fig S2C, and it shows that uridine has no effect on the expression of any of the genes assayed in the new Fig 2C or S2B.

      The reversibility of cell surface MHC-I induction was also validated in HCT116 cells (Fig 3F). We included the uridine reversal in Fig 3F to avoid duplicating the control and BQ FACS data in multiple panels.

      We have also added the qPCR data for HCT116 cells showing this same phenotype (at the mRNA level), which is the new Fig S2D.

      We decided to prioritize HCT116 cells for our mechanistic studies (Figures S2D, S4A, and 4E-F) because previous reports indicate that this line is diploid and therefore less genetically deranged than our other cancer cell lines.

      • Figure 2F shows an elegant experiment to discard off-target effects related to cell death and to confirm that the increased MHC I expression is uniquely dependent on pyrimidines. DHODH has recently been implicated in ferroptosis, a highly immunogenic type of cell death. What are the authors' thoughts on BQ-induced ferroptosis as a possible contributor to the effects of ICB? Does BQ + ferroptosis inhibitor (ferrostatin) affect cell surface MHC I and/or expression of antigen processing genes?

      The potential role of DHODH in ferroptosis protection (Mao et al 2021) has important implications, so we are glad that multiple reviewers raised questions concerning ferroptosis. We did not directly test the effect of ferroptosis inducing agents (with or without BQ) on MHC-I/APP expression, but that is certainly a worthwhile line of investigation.

      The DHODH/ferroptosis issue is complicated by a study pointed out by Reviewer #1 that challenges the role of DHODH inhibition in BQ-mediated ferroptosis sensitization (Mishima et al, 2022). This study argues that high-dose BQ treatment causes FSP1 inhibition, and this underlies the effect of BQ on the cellular response to ferroptosis-inducing agents.

      Regardless of whether BQ-induced ferroptosis-sensitization is dependent on DHODH, FSP1, or some other factor, the Mao and Mishima studies agree that a relatively high dose of BQ is required to observe these effects (100-200µM for most cell lines and >50µM even in the most ferroptosis-sensitive cell lines). As we explained above, we consider it very unlikely that the in vivo BQ exposure in our experiments (Fig 5) was high enough to cause significant ferroptosis, especially in the absence of any dedicated ferroptosis-inducing agent (which is typically required to cause ferroptosis even in the presence of high-dose BQ).

      • The authors nail down the mechanism to CDK9 (Fig 4). However, all these experiments are performed in 293T cells. I would like to see a repeat of Fig. 4B in a cancer cell line (either PDAC or B16). Also, does BQ have any effect on CDK9 expression/protein levels?

      We have added two figure panels that address this comment (new Fig 4E and 4F). Figure 4E (which is a repeat of Fig 4B with HCT116 cells) shows that CDK9 inhibitors (flavopiridol, AT7519, and dinaciclib) reverse BQ-mediated APP induction in HCT116 cells (this agrees with Fig S4A showing that flavopiridol reverses MHC induction by various nucleotide synthesis inhibitors in this cell line), but PROTAC2 does not. Figure 4F shows that PROTAC2 (for reasons we cannot explain) does not cause CDK9 degradation in HCT116 cells. This adds further support to our thesis that CDK9 is a critical mediator of BQ-mediated APP induction (because how else can this pattern of results be explained?). The text of the Results section has been amended to reflect this.

      We chose to use HCT116 cells for this repeat experiment 1) to align with Fig S4A and 2) because, as previously mentioned, we consider HCT116 to be a good cell line for mechanistic studies because of its relative lack of idiosyncratic genetic features (compared to CFPAC-1, for example, which was derived from a patient with cystic fibrosis).

      • What are the differences in tumor size for the experiment shown in Figure 5E? What about tumor cell death in the ICB vs. BQ+ICB groups?

      Because this was a survival assay, direct comparisons of tumor volumes between groups were not possible at later time points, since mice that die or have to be euthanized are removed from their experimental group, which lowers the average group tumor burden at subsequent time points. Although tumor volume was the most common euthanasia criterion reached, a subset of mice were either found dead or had to be euthanized for other reasons attributed to their tumor burden (moribund state, inability to ambulate or stand, persistent bleeding from tumor ulceration, severe loss of body mass, etc.). This confounds any comparison of endpoint measurements (such as immunohistochemical quantification of tumor cell death markers, T-cell markers, etc.).
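
      The attrition effect described above can be illustrated with a minimal sketch. All volumes below are invented for illustration and are not data from this study; the cohort layout and the helper name `survivor_mean` are likewise assumptions. Each mouse's list ends at the time point at which it reached an endpoint and was removed.

```python
# Hypothetical tumor volumes (mm^3) per mouse at successive time points.
# These numbers are invented for illustration; they are NOT study data.
# A shorter list means the mouse was removed after its last measurement.
cohort = {
    "mouse_1": [500, 1200, 2100],        # removed after the third measurement
    "mouse_2": [400, 900, 1500, 1900],
    "mouse_3": [600, 1500, 2300],        # removed after the third measurement
    "mouse_4": [300, 700, 1100, 1400],
}


def survivor_mean(cohort, t):
    """Mean volume at time index t among mice that still have a measurement."""
    measured = [volumes[t] for volumes in cohort.values() if len(volumes) > t]
    return sum(measured) / len(measured) if measured else None


# Group mean at each time point, computed only over mice still on study.
means = [survivor_mean(cohort, t) for t in range(4)]
print(means)
```

      In this toy cohort every individual tumor grows at every time point, yet the group mean falls between the last two time points because the two largest tumors have left the group. This is the reason a late cross-group comparison of mean tumor volume would be misleading.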

      • The different response in the concurrent vs delayed treatment is very interesting. The authors suggest two possible mechanisms to explain this: "1) Concurrent BQ dampens the initial anticancer immune response generated by dual ICB, or b) cancer cell MHC-I and related genes are not maximally upregulated at the time of ICB administration with concurrent treatment". However, and despite the caveat of comparing the in vitro to the in vivo setting, Fig 2D shows upregulation of MHC I already at 24h of treatment in B16 cells. Have the authors checked T cell infiltration in the concurrent and delayed treatment setting?

      For the same reasons described in response to the preceding comment, tumors harvested upon mouse death/euthanasia from our survival experiment were not suitable for cross-cohort comparison of tumor endpoint measurements. An additional experiment in which mice are necropsied at a prespecified time point (before any mice have died or reached euthanasia criteria, as in the experiment for Fig 5A-D) would be required to answer this question.

      • Page 5, line 181: do the authors mean "nucleotide salvage inhibitors" instead of "synthesis"?

      We believe the reviewer is referring to the following sentence:

      “The other drugs screened included nucleotide synthesis inhibitors (5-fluorouracil, methotrexate, gemcitabine, and hydroxyurea), DNA damage inducers (oxaliplatin, irinotecan, and cytarabine), a microtubule targeting drug (paclitaxel), a DNA methylation inhibitor (azacytidine), and other small molecule inhibitors (Fig 2F).”

      In this context, we believe our use of “synthesis” instead of “salvage” is correct, because methotrexate and 5-FU inhibit thymidylate synthase (which mediates de novo dTTP synthesis), while gemcitabine and hydroxyurea inhibit ribonucleotide reductase (which mediates de novo synthesis of all dNTPs).

    2. Reviewer #1 (Public Review):

      The manuscript by Mullen et al. investigated the gene expression changes in cancer cells treated with the DHODH inhibitor brequinar (BQ), to explore the therapeutic vulnerabilities induced by DHODH inhibition. The study found that BQ treatment causes upregulation of antigen presentation pathway (APP) genes and cell surface MHC class I expression, which is mechanistically mediated by the CDK9/P-TEFb pathway triggered by pyrimidine nucleotide depletion. The combination of BQ and immune checkpoint therapy demonstrated a synergistic (or additive) anti-cancer effect against xenografted melanoma, suggesting the potential use of BQ and immune checkpoint blockade as a combination therapy in clinical therapeutics.

      The interesting findings in the present study include the demonstration of a novel cellular response in cancer cells induced by DHODH inhibition. However, a limitation of the study is that it does not directly examine whether the increased antigen presentation induced by DHODH inhibition actually contributed to the potentiation of immune checkpoint blockade (ICB) efficacy. Moreover, the proposed mechanism, in which pyrimidine depletion increases antigen presentation pathway expression via CDK9/P-TEFb, was not validated by genetic KD or KO targeting of the CDK9/P-TEFb pathway. Finally, high concentrations of BQ have been reported to show off-target effects, sensitizing cancer cells to ferroptosis, and the authors should discuss whether the dose used in the in vivo study reached the ferroptosis-sensitizing dose or not.

      Comment on the revised version:

      In their response letter, the authors appropriately addressed the reviewer's comments.

      However, it is unfortunate that these points are not reflected in the main text. Consequently, readers may encounter the same questions. The reviewer therefore recommends mentioning them, even if briefly, in the discussion or in a limitations section to address readers' concerns. In particular, it would be necessary to state that the dosage of BQ was lower than the reported pro-ferroptotic dose (PMID 37407687) and to acknowledge that the potential impact of immune cell depletion on the efficacy of BQ treatment was not examined, as both points bear on the proposed mechanism. The latter limitation is also raised by the other reviewer.

    3. Reviewer #2 (Public Review):

      In their manuscript entitled "DHODH inhibition enhances the efficacy of immune checkpoint blockade by increasing cancer cell antigen presentation", Mullen et al. describe an interesting mechanism of inducing antigen presentation. The manuscript includes a series of experiments that demonstrate that blockade of pyrimidine synthesis with DHODH inhibitors (i.e. brequinar (BQ)) stimulates the expression of genes involved in antigen presentation. The authors provide evidence that BQ mediated induction of MHC is independent of interferon signaling. A subsequent targeted chemical screen yielded evidence that CDK9 is the critical downstream mediator that induces RNA Pol II pause release on antigen presentation genes to increase expression. Finally, the authors demonstrate that BQ elicits strong anti-tumor activity in vivo in syngeneic models, and that combination of BQ with immune checkpoint blockade (ICB) results in significant lifespan extension in the B16-F10 melanoma model. Overall, the manuscript uncovers an interesting and unexpected mechanism that influences antigen presentation and provides an avenue for pharmacological manipulation of MHC genes, which is therapeutically relevant in many cancers. However, a few key experiments are needed to ensure that the proposed mechanism is indeed functional in vivo.

      Major Points:

      (1) According to the proposed model, BQ mediated induction of antigen presentation is a contributing factor to the efficacy of this therapeutic strategy. If this is true, then depletion of immune cells should reduce the therapeutic efficacy of BQ in vivo. The authors should perform the B16-F10 transplant experiments in either Rag null mice (if available) or with CD8/CD4 depletion. The expectation would be that T cell depletion (or MHC loss with genetic manipulation) should reduce the efficacy of BQ treatment. Absent this critical experiment, it is difficult to confidently conclude that induction of antigen presentation is a fundamental component of the in vivo response to DHODH inhibition.

      (2) Does BQ treatment induce antigen presentation in non-malignant cells? APCs? If the induction of antigen presentation is not cancer specific and related to a pyrimidine depletion stress response, then there is a possibility that healthy tissues will also exhibit a similar phenotype, raising concerns about the specificity of a de novo immune response. The authors should examine antigen presentation genes in healthy tissues treated with BQ.

      (3) In the title, the authors claim that DHODH inhibition enhances the efficacy of ICB. However, the experiment shown in Figure 5D does not demonstrate this. The Kaplan-Meier curves reflect more of an additive response than a synergistic combination. Furthermore, the concurrent treatment of BQ and ICB seems to inhibit the efficacy of ICB due to BQ toxicity in immune cells. When concurrently administered, the survival of the mice is the same as with brequinar alone, suggesting that the efficacy of ICB was diminished. However, if ICB is administered following an initial dose of BQ, there is an added survival benefit of a magnitude that is similar to ICB alone. This result seems to contradict the title. Furthermore, the authors should show the longitudinal growth curves of these tumors.

      (4) Related to Point 3, the temporal separation of BQ and ICB raises the question of whether the induction of antigen presentation with BQ is persistent during the course of delayed ICB treatment. One explanation for the results is that BQ treatment reduces tumor burden and a subsequent course of ICB independently reduces it further, without the two therapies functioning in synergy. To address this, the authors should measure the duration of BQ-mediated induction of antigen presentation after stopping treatment.

      (5) In Figure 1, the authors show that DHODH inhibition induces expression of both MHC-I and MHC-II genes at the RNA level. However, they only validate MHC-I by flow cytometry. A simple experiment to evaluate the effect of BQ treatment on MHC-II surface expression would provide important additional mechanistic insight into the immunomodulatory effects of DHODH inhibition, especially given recent literature reinforcing the importance of MHC-II expression on epithelial cancers, including melanoma (Oliveira et al. Nature 2022).

      Minor Points:

      (1) The authors show ChIP-seq tracks from Tan et al. for HLA-B. However, given the pervasive effect of Ter treatment across many HLA genes, the authors should either show tracks at additional loci, or provide a heatmap of read density across more loci. This would substantiate the mechanistic claim that RNA Pol II occupancy and activity across antigen presentation genes is the major driver of response to DHODH inhibition as opposed to mRNA stabilization/increased translation.

      (2) A compelling way to demonstrate a change in antigen presentation is through mass spectrometry based immunopeptidomics. Performing immunopeptidomic analysis of BQ treated cell lines would provide substantial mechanistic insight into the outcome of BQ treatment. While this approach may be outside the scope of the current work, the authors should speculate on how this treatment may specifically alter the antigenic landscape where future directions would include empirical immunopeptidomics measurements.

      (3) While the signaling through CDK9 seems convincing, it still does not provide a mechanistic link between depleted pyrimidines and CDK9 activity. The authors should speculate on the mechanism that signals to CDK9.

      (4) Related to minor point 2, the authors should consider a genetic approach to confirm the importance of CDK9. While the pharmacological approach, including multiple mechanistically distinct CDK9 inhibitors, provides strong evidence, an additional experiment with genetic depletion of CDK9 (CRISPR KO, shRNA, etc.) would provide compelling mechanistic confirmation.

      (5) The authors should comment in the discussion on how this strategy may be particularly useful in patients harboring genetic or epigenetic loss of interferon signaling, a known mechanism of ICB resistance. Perhaps DHODH inhibition could rescue MHC expression in cells that are deficient in interferon sensing.

      Overall, the paper is clearly written and presented. With the additional experiments described above, especially in vivo, this manuscript would provide a strong contribution to the field of antigen presentation in cancer. The distinct mechanisms by which DHODH inhibition induces antigen presentation will also set the stage for future exploration into alternative methods of antigen induction.

      Comments on latest version:

      The authors address the majority of the points raised in my previous review. However, no additional in vivo experiments were performed, which seems necessary for the major conclusions of the paper.

      I disagree with the authors' assessment of Major Point 3 in my review. I have updated the text of Major Point 3 in my public review to further clarify my position.

      My final assessment is that if the authors want to claim that DHODH inhibition potentiates immune checkpoint blockade, as is stated in the title, then further in vivo experimentation is needed.

    4. Reviewer #3 (Public Review):

      Mullen et al present an important study describing how DHODH inhibition enhances efficacy of immune checkpoint blockade by increasing cell surface expression of MHC I in cancer cells. DHODH inhibitors have been used in the clinic for many years to treat patients with rheumatoid arthritis and there has been a growing interest in repurposing these inhibitors as anti-cancer drugs. In this manuscript, the Singh group builds on their previous work defining combinatorial strategies with DHODH inhibitors to improve efficacy. The authors identify an increased expression of genes in the antigen presentation pathway and MHC I after BQ treatment which is mediated strictly by pyrimidine depletion and CDK9/P-TEFb. The authors rationalize that increased MHC I expression induced by DHODH inhibition might favor efficacy of dual immune checkpoint blockade. In fact, this combinatorial treatment prolonged survival in an immunocompetent B16F10 melanoma model.

      Previous studies have shown that DHODH inhibitors can increase expression of innate immunity-related genes but the role of DHODH and pyrimidine nucleotides in antigen presentation has not been previously reported. A strength of the manuscript is the solid in vitro mechanistic data supported by analysis in multiple cell lines. The in vivo data show compelling additive effects of DHODH inhibitors and ICB. However, more controls and experiments would be required to define the nature of these effects and to confirm that the mechanistic in vitro data is conserved in vivo.

      This is a relevant manuscript proposing a mechanistic link between pyrimidine depletion and MHC I expression and a novel therapeutic approach combining DHODH inhibitors with dual checkpoint blockade. These results might be relevant for the clinical development of DHODH inhibitors in the treatment of solid tumors, a setting where these have not shown optimal efficacy yet.

      Comments on revised version:

      The authors have addressed my questions regarding validation of gene expression in other cell lines. They have also provided an explanation about why in vivo evaluations could not be performed for the experiment in Figure 5E.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study, utilizing CITE-Seq to explore CML, is considered a useful contribution to our understanding of treatment response. However, the reviewers express concern about the incomplete evidence due to the small sample size and recommend addressing these limitations. Strengthening the study with additional patient samples and validation measures would enhance its significance.

      We thank the editors for the assessment of our manuscript. In view of the comments of the three reviewers, we have increased the number of CML patient samples analyzed to confirm all the major findings included in the manuscript. In total, more than 80 patient samples across different approaches have now been analyzed and incorporated in the revised manuscript.

      To the best of our knowledge, this is the first single-cell multiomics report in CML, and it differs substantially from the recent single-cell omics-based reports in which single modalities were measured one at a time (Krishnan et al., 2023; Patel et al., 2022). Thus, the sc-multiomic investigation of LSCs and HSCs from the same patient addresses a major gap in the field towards managing efficacy and toxicity of TKI treatment by enumerating the burden of CD26+CD35- LSCs and CD26-CD35+ HSCs and their ratio at diagnosis vs. 3 months of therapy. The findings suggest the design of a simpler and cheaper FACS assay to simultaneously stratify CML patients for TKI efficacy as well as hematologic toxicity.

      Reviewer 1:


      This manuscript by Warfvinge et al. reports the results of CITE-seq to generate single-cell multi-omics maps from BM CD34+ and CD34+CD38- cells from nine CML patients at diagnosis. Patients were retrospectively stratified by molecular response after 12 months of TKI therapy using European Leukemia Net (ELN) recommendations. They demonstrate heterogeneity of stem and progenitor cell composition at diagnosis, and show that compared to optimal responders, patients with treatment failure after 12 months of therapy demonstrate increased frequency of molecularly defined primitive cells at diagnosis. These results were validated by deconvolution of an independent previously published dataset of bulk transcriptomes from 59 CML patients. They further applied a BCR-ABL-associated gene signature to classify primitive Lin-CD34+CD38- stem cells as BCR:ABL+ and BCR:ABL-. They identified variability in the ratio of leukemic to non-leukemic primitive cells between patients, showed differences in the expression of cell surface markers, and determined that a combination of CD26 and CD35 cell surface markers could be used to prospectively isolate the two populations. The relative proportion of CD26-CD35+ (BCR:ABL-) primitive stem cells was higher in optimal responders compared to treatment failures, both at diagnosis and following 3 months of TKI therapy.


      The studies are carefully conducted and the results are very clearly presented. The data generated will be a valuable resource for further studies. The strengths of this study are the application of single-cell multi-omics using CITE-Seq to study individual variations in stem and progenitor clusters at diagnosis that are associated with good versus poor outcomes in response to TKI treatment. These results were confirmed by deconvolution of a historical bulk RNAseq data set. Moreover, they are also consistent with a recent report from Krishnan et al. and are a useful confirmation of those results. The major new contribution of this study is the use of gene expression profiles to distinguish BCR-ABL+ and BCR-ABL- populations within CML primitive stem cell clusters and then applying antibody-derived tag (ADT) data to define molecularly identified BCR:ABL+ and BCR-ABL- primitive cells by expression of surface markers. This approach allowed them to show an association between the ratio of BCR-ABL+ vs BCR-ABL- primitive cells and TKI response and study dynamic changes in these populations following short-term TKI treatment.


      One of the limitations of the study is the small number of samples employed, which is insufficient to make associations with outcomes with confidence. Although the authors discuss the potential heterogeneity of primitive stem cells, they do not directly address the heterogeneity of hematopoietic potential or response to TKI treatment in the results presented. Another limitation is that the BCR-ABL+ versus BCR-ABL- status of cells was not confirmed by direct sequencing for BCR-ABL. The BCR-ABL status of cells sorted based on CD26 and CD35 was evaluated in only two samples. We also note that the surface markers identified were previously reported by the same authors using different single-cell approaches, which limits the novelty of the findings. It will be important to determine whether the GEP and surface markers identified here are able to distinguish BCR-ABL+ and BCR-ABL- primitive stem cells later in the course of TKI treatment. Finally, although the authors do describe differential gene expression between CML and normal, BCR:ABL+ and BCR:ABL-, primitive stem cells, they have not as yet taken the opportunity to use these findings to address questions regarding biological mechanisms related to CML LSC that impact on TKI response and outcomes.

      Reviewer #1 (Recommendations For The Authors):

      Minor comment: Fig 4 legend -E and F should be C and D.

      We thank the reviewer for the positive assessment of our work. Here, we highlight the updates made to the revised manuscript in light of the feedback received.

      Minor comment: Fig 4 legend -E and F should be C and D.

      We have edited the revised manuscript accordingly.

      One of the limitations of the study is the small number of samples employed, which is insufficient to make associations with outcomes with confidence.

      Although we performed CITE-seq for 9 CML patient samples at diagnosis, we extended our investigations to include additional samples (e.g., large-scale deconvolution analysis of samples, Fig. 3C-E; qPCR for BCR::ABL1 status, Fig. 6A; and the ratio between CD35+ and CD26+ populations at diagnosis and during TKI therapy, Fig. 6C-D) as described in the manuscript.

      In comparison to scRNA-seq, multiomic CITE-seq involves preparation and sequencing of separate libraries corresponding to RNA and ADTs, which makes it even more resource demanding and limits our capacity to process an extensive number of patient samples. To confirm our findings in a larger cohort, we have therefore adopted a computational deconvolution approach, CIBERSORT, to analyze a larger number of independent samples (n=59). This reflects a growing, sustainable trend to study larger numbers of patients in the face of still prohibitively expensive but potentially insightful sc-omics approaches (for example, please see Zeng et al., A cellular hierarchy framework for understanding heterogeneity and predicting drug response in acute myeloid leukemia, Nature Medicine, 2022).

      However, in view of the comment, we have now substantially increased the number of analyzed patients in the revised manuscript. These include an increased number of patient samples to investigate the ratio between CD35 and CD26 marked populations at diagnosis and at 3 months of TKI therapy (from n=8 to n=12 with now 6 optimal responders and 5 treatment failure at diagnosis and after TKI therapy), qPCR for BCR::ABL1 expression status at diagnosis (from n=3 to n=9), and follow-up of BCR::ABL1 expression in three additional samples after TKI therapy. Moreover, we examined the CD26 and CD35 marked populations for expression of GAS2, one of our top candidate LSC signature genes, in three additional samples at diagnosis and at 3-month follow-up. Thus, >80 patient samples across different approaches have been analyzed to strengthen all major conclusions of the study.

      We emphasize that we were cautious in generalizing the observation obtained from any one approach and sought to confirm any major finding using at least one complementary method. As an example, although CITE-seq (n=9) showed altered frequency of all cell clusters between optimal and poor responders (Fig. 3B), we refrained from generalizing because our independent large-scale computational deconvolution analysis (n=59) only substantiated the altered proportion of primitive and myeloid cell clusters (Fig. 3E).
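
      The idea of estimating cluster proportions in bulk transcriptomes from single-cell-derived reference profiles can be illustrated with a minimal sketch. This is not CIBERSORT, which applies support vector regression across many cell types; it is a two-cluster least-squares toy. The function name, the signature vectors, and all numeric values are hypothetical placeholders, not data from the study.

```python
def estimate_primitive_fraction(bulk, primitive_sig, other_sig):
    """Least-squares estimate of f in: bulk ~ f * primitive_sig + (1 - f) * other_sig.

    All three arguments are equal-length lists of expression values for the
    same ordered set of signature genes (hypothetical inputs).
    """
    diff = [p - o for p, o in zip(primitive_sig, other_sig)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("signatures are identical; the fraction is not identifiable")
    numer = sum((b - o) * d for b, o, d in zip(bulk, other_sig, diff))
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, numer / denom))


# Hypothetical signature profiles (placeholder values only).
primitive_sig = [8.0, 1.0, 5.0, 0.5]
other_sig = [1.0, 6.0, 2.0, 4.0]

# Simulated bulk sample containing 30% primitive cells.
bulk = [0.3 * p + 0.7 * o for p, o in zip(primitive_sig, other_sig)]
fraction = estimate_primitive_fraction(bulk, primitive_sig, other_sig)
```

      In this toy setting the estimate recovers the simulated mixing proportion; real bulk data would need normalization, many more genes, and a method that is robust to noise.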

      Although the authors discuss the potential heterogeneity of primitive stem cells, they do not directly address the heterogeneity of hematopoietic potential or response to TKI treatment in the results presented.

      Thanks for noting the discussion on heterogeneity of the primitive stem cells. As described in the original manuscript, Figure 6D-E showed a relationship between heterogeneity and TKI therapy response. The results showed that the CD35+/CD26+ ratio within the HSC fraction was associated with this therapy response. We have now increased the number of patient samples analyzed and present the updated results in the revised manuscript (now Figure 6C-D). These observations set the stage for assessing whether long-term therapy outcome can also be influenced by heterogeneity at diagnosis.

      We have shown the hematopoietic potential of HSCs marked by CD35 expression in an independent parallel study and therefore only mentioned it concisely in the current manuscript. A combination of scRNA-seq, scATAC-seq and cell surface proteomics showed CD35+ cells at the apex of healthy human hematopoiesis, containing an HSC-specific epigenetic signature and molecular program, as well as possessing self-renewal capacity and multilineage reconstitution in vivo and in vitro. The preprint is available as Sommarin et al. ‘Single-cell multiomics reveals distinct cell states at the top of the human hematopoietic hierarchy’, bioRxiv; https://www.biorxiv.org/content/10.1101/2021.04.01.437998v2.full

      We also note that the surface markers identified were previously reported by the same authors using different single-cell approaches, which limits the novelty of the findings.

      Our current manuscript is indeed a continuation of, and builds on, our previous paper (Warfvinge R et al. Blood, 2017). In contrast to our previous report, which was limited to examination of only 96 genes per cell, CITE-seq allowed us to examine the molecular program of cells using unbiased global gene expression profiling. Finally, although CD26 appears once again as a reliable marker of BCR::ABL1+ primitive cells, CD35 emerges as a novel and previously undescribed marker of BCR::ABL1- residual stem cells. A combination of CD35 and CD26 allowed us to efficiently distinguish between the two populations housed within the Lin-34+38/low stem cell immunophenotype.

      Another limitation is that the BCR-ABL+ versus BCR-ABL- status of cells was not confirmed by direct sequencing for BCR-ABL. The BCR-ABL status of cells sorted based on CD26 and CD35 was evaluated in only two samples.

      Single-cell detection of fusion transcripts is challenging, with low detection sensitivity in single-cell RNA-seq, as has been noted previously (Krishnan et al. Blood, 2023, Giustacchini et al. Nature Medicine, 2017, Rodriguez-Meira et al. Molecular Cell, 2019). However, this is likely to change with the inclusion of target-specific probes in scRNA-seq library preparation protocols. Nonetheless, in view of the comment, we have included more patient samples (from the previous n=3 to the current n=10, including TKI-treated samples) for direct assessment of BCR-ABL1 status by qPCR analysis; the updated results are included in the revised manuscript (Figure 6A).

      It will be important to determine whether the GEP and surface markers identified here are able to distinguish BCR-ABL+ and BCR-ABL- primitive stem cells later in the course of TKI treatment.

      We performed qPCR to check for BCR::ABL1 status and the level of GAS2, one of the top genes expressed in CML cells, within CD26+ and CD35+ cells at diagnosis and following 3 months of TKI therapy. The results showed that while CD26+ cells are BCR::ABL1+, the CD35+ cells are BCR::ABL1- at both time points. Moreover, the expression of the LSC-specific gene GAS2 was specific to BCR::ABL1+ CD26+ cells both at diagnosis and following 3 months of TKI therapy. The new results are presented in figure 6B in the revised manuscript.

      Finally, although the authors do describe differential gene expression between CML and normal, BCR:ABL+ and BCR:ABL-, primitive stem cells they have not as yet taken the opportunity to use these findings to address questions regarding biological mechanisms related to CML LSC that impact on TKI response and outcomes.

      We agree with the reviewer that our major focus here was to characterize the cellular heterogeneity coupled to treatment outcome and therefore we did not delve deep into the molecular mechanisms underlying TKI response. However, in response to this comment, as mentioned above, we noted that one of the top genes in BCR::ABL1 cells (Fig. 4 C; right; in red), GAS2 (Growth Arrest Specific 2), was expressed both at diagnosis and during TKI therapy within CD26+ cells relative to CD35+ cells (updated figure 6B). Interestingly, GAS2 was also detected in CML LSCs in a recent scRNA-seq study (Krishnan et al. Blood, 2023), suggesting GAS2 upregulation could be a consistent molecular feature of CML cells. GAS2 has previously been noted as deregulated in CML (Janssen JJ et al. Leukemia, 2005, Radich J et al, PNAS, 2006) and linked to control of cell cycle, apoptosis, and response to Imatinib (Zhou et al. PLoS One, 2014). Future investigations are warranted to assess whether GAS2 could play a role in the outcome of long-term TKI therapy.

      Reviewer 2:


      The authors use single-cell "multi-omics" to study clonal heterogeneity in chronic myeloid leukemia (CML) and its impact on treatment response and resistance. Their main results suggest 1) Cell compartments and gene expression signatures both shared in CML cells (versus normal), yet 2) some heterogeneity of multiomic mapping correlated with ELN treatment response; 3) further definition of a unique combination of CD26 and CD35 surface markers associated with gene expression defined BCR::ABL1+ LSCs and BCR::ABL1- HSCs. The manuscript is well-written, and the method and figures are clear and informative. The results fit the expanding view of cancer and its therapy as a complex Darwinian exercise of clonal heterogeneity and the selective pressures of treatments.


      Cutting-edge technology by one of the expert groups of single-cell 'omics.


      Very small sample sizes, without a validation set. The obvious main problem with the study is that an enormous amount of results and conjecture arise from a very small data set: only nine cases for the treatment response section (three in each of the ELN categories), only two normal marrows, and only two patient cases for the division kinetic studies. Thus, it is very difficult to know the "noise" in the system - the stability of clusters and gene expression and the normal variation one might expect, versus patterns that may be reproducibly study artifact, effects of gene expression from freezing-thawing, time on the bench, antibody labeling, etc. This is not so much a criticism as a statement of reality: these elegant experiments are difficult, time-consuming, and very expensive. Thus in the Discussion, it would be helpful for the authors to just frankly lay out these limitations for the reader to consider. Also in the Discussion, it would be interesting for the authors to consider what's next: what type of validation would be needed to make these studies translatable to the clinic? Is there a clever way to use these data to design a faster/cheaper assay?

      We thank the reviewer for their appraisal of our manuscript. We take the opportunity to point out the updates in the revised manuscript in view of the comments.

      Very small sample sizes, without a validation set. The obvious main problem with the study is that an enormous amount of results and conjecture arise from a very small data set: only nine cases for the treatment response section (three in each of the ELN categories), only two normal marrows, and only two patient cases for the division kinetic studies.

      As the reviewer has noted, single-cell omics experiments remain resource demanding, thereby placing a limitation on the number of patients analyzed. As described above in response to the comments from reviewer 1, multiomic CITE-seq allows extraction of two modalities in comparison to a typical scRNA-seq; however, this also makes it even more limited in the number of samples that can be processed in a sustainable way. This was one of the motivations to analyze a larger number of independent samples (n=59) while benefiting from the insights gained from CITE-seq (n=9). Furthermore, by analyzing CD34+ cells from bone marrow and peripheral blood of CML patients, including both responders and non-responders after one year of Imatinib therapy, we were able to significantly diversify the patient pool, a diversity that was lacking in our CITE-seq patient pool. As mentioned above, this reflects a growing trend to analyze larger numbers of patients while anchoring the analysis on prohibitively expensive but potentially insightful sc-omics approaches (for example, please see Zeng et al., A cellular hierarchy framework for understanding heterogeneity and predicting drug response in acute myeloid leukemia, Nature Medicine, 2022).

      As emphasized above, we frequently sought to confirm the findings from one approach using a complementary method and independent samples. For example, although CITE-seq (n=9) showed altered frequency of all cell clusters between optimal and poor responders (Fig. 3B), we refrained from generalizing because an independent large-scale computational deconvolution analysis (n=59) only substantiated the altered proportion of primitive and myeloid clusters.

      In view of the comment, we have now increased the number of patients analyzed during the revision process. These include increased numbers to investigate the ratio between CD35+ and CD26+ populations at diagnosis, as well as 3 months of TKI therapy, qPCR for BCR::ABL1, and patients examined for GAS2, one of the top genes expressed in CML cells (see response to reviewer 1 for details). Altogether, >80 patient samples across different approaches were analyzed to strengthen the conclusions.

      During the revision, we have analyzed cells from 8 CML patients for cell cycle using gene activity scores. These results, together with the cell division kinetics data reported previously, are now described in supplementary figures 9C-F.

      It is very difficult to know the "noise" in the system - the stability of clusters and gene expression and the normal variation one might expect, versus patterns that may be reproducibly study artifact, effects of gene expression from freezing-thawing, time on the bench, antibody labeling, etc. This is not so much a criticism as a statement of reality: these elegant experiments are difficult, time-consuming, and very expensive. Thus in the Discussion, it would be helpful for the authors to just frankly lay out these limitations for the reader to consider.

      We agree with the reviewer that sc-omics approaches can be noisy despite continuing efforts to denoise single cell datasets through both experimental and bioinformatic innovations. Therefore, we have updated the discussion as recommended by the reviewer (paragraph 5 in the discussion).

      We also note that CITE-seq, in contrast to scRNA-seq alone, provides dual features: surface marker/protein as well as RNA for annotating the same cluster. In our manuscript, for example, cell clusters in the UMAP for normal BM (Fig. 1B) were described using both surface markers (Fig. 1C) and RNA (Fig. 1D), making the cluster identity robust. To further elaborate this approach, a new supplementary figure 1C shows annotations of clusters using both RNA and surface markers.

      To potentially address the issue of stability of clusters and gene expression, we compared the marker genes for major clusters from nBM from this study (supplementary table 4, Warfvinge et al.) with those described recently in a scRNA-seq study by Krishnan et al. (supplementary table 8, Blood, 2023) using Cell Radar, a tool that identifies and visualizes which hematopoietic cell types are enriched within a given gene set (description: https://github.com/KarlssonG/cellradar; direct link: https://karlssong.github.io/cellradar/). To compare, we used our in-house gene list for the major clusters as well as mapped the same number of top marker genes based on log2FC from corresponding cluster from Krishnan et al. as inputs to Cell Radar. The Cell Radar plot outputs are shown below.
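
      The input preparation described here, taking the same number of top marker genes by log2FC from the other study as are present in the in-house list, can be sketched as follows. This is a minimal illustration under stated assumptions: it does not call Cell Radar, the gene identifiers and fold-change values are hypothetical placeholders, and the Jaccard overlap is an added convenience for illustration, not part of the comparison reported in the response.

```python
def top_markers(log2fc_by_gene, n):
    """Return the n genes with the largest log2 fold change, highest first."""
    ranked = sorted(log2fc_by_gene.items(), key=lambda kv: kv[1], reverse=True)
    return [gene for gene, _ in ranked[:n]]


def jaccard(genes_a, genes_b):
    """Jaccard overlap between two gene lists (0.0 if both are empty)."""
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical in-house marker list for one cluster (placeholder identifiers).
in_house = ["GENE_A", "GENE_B", "GENE_C"]

# Hypothetical markers for the corresponding cluster in the other study.
other_study = {"GENE_A": 3.1, "GENE_C": 2.4, "GENE_D": 2.9, "GENE_E": 0.7}

# Match list sizes before comparing, as described in the response.
matched = top_markers(other_study, len(in_house))
overlap = jaccard(in_house, matched)
```

      Matching the list sizes keeps the two inputs comparable, since gene-set enrichment results can depend on how many genes are supplied.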

      Author response image 1.

      This approach showed broad similarities across clusters from this study with their counterparts from the other study suggesting the cluster identities reported here are likely to be robust. Please note these figures are for reviewer response only and not included in the final manuscript.

      Also in the Discussion, it would be interesting for the authors to consider what's next: what type of validation would be needed to make these studies translatable to the clinic? Is there a clever way to use these data to design a faster/cheaper assay?

      Our findings on CD26+ and CD35+ surface markers to enrich BCR::ABL1+ and BCR::ABL1- cells suggest that a simpler, faster and cheaper FACS panel could quantify leukemic and non-leukemic stem cells in CML patients. We anticipate that future investigations and clinical studies might examine whether CD26-CD35+ cells could be plausible candidates for restoring normal hematopoiesis once the TKI therapy diminishes the leukemic load, and whether patients with low counts of CD35+ cells at diagnosis have a relatively higher chance of developing hematologic toxicity such as cytopenia during therapy.

      We briefly mentioned this possibility in the discussion; however, we have now moved it to another paragraph to highlight the same. Please see paragraph 5 in the revised manuscript.

      Reviewer 3:


      In this study, Warfvinge and colleagues use CITE-seq to interrogate how CML stem cells change between diagnosis and after one year of TKI therapy. This provides important insight into why some CML patients are "optimal responders" to TKI therapy while others experience treatment failure. CITE-seq in CML patients revealed several important findings. First, substantial cellular heterogeneity was observed at diagnosis, suggesting that this is a hallmark of CML. Further, patients who experienced treatment failure demonstrated increased numbers of primitive cells at diagnosis compared to optimal responders. This finding was validated in a bulk gene expression dataset from 59 CML patients, in which it was shown that the proportion of primitive cells versus lineage-primed cells correlates to treatment outcome. Even more importantly, because CITE-seq quantifies cell surface protein in addition to gene expression data, the authors were able to identify that BCR/ABL+ and BCR/ABL- CML stem cells express distinct cell surface markers (CD26+/CD35- and CD26-/CD35+, respectively). In optimal responders, BCR/ABL- CD26-/CD35+ CML stem cells were predominant, while the opposite was true in patients with treatment failure. Together, these findings represent a critical step forward for the CML field and may allow more informed development of CML therapies, as well as the ability to predict patient outcomes prior to treatment.


      This is an important, beautifully written, well-referenced study that represents a fundamental advance in the CML field. The data are clean and compelling, demonstrating convincingly that optimal responders and patients with treatment failure display significant differences in the proportion of primitive cells at diagnosis, and the ratio of BCR-ABL+ versus negative LSCs. The finding that BCR/ABL+ versus negative LSCs display distinct surface markers is also key and will allow for a more detailed interrogation of these cell populations at a molecular level.


      CITE-seq was performed in only 9 CML patient samples and 2 healthy donors. Additional samples would greatly strengthen the very interesting and notable findings.

      Reviewer #3 (Recommendations For The Authors):

      My only recommendation is to bolster findings with additional CML and healthy donor samples.

      CITE-seq was performed in only 9 CML patient samples and 2 healthy donors. Additional samples would greatly strengthen the very interesting and notable findings.

      We thank the reviewer for the positive assessment of our manuscript. As mentioned in response to comments from reviewers 1 and 2, CITE-seq remains a resource-consuming single-cell method, potentially limiting the number of patients that can be analyzed. However, during the revision process, we have increased the number of patient samples analyzed for other assays; these include increased numbers to investigate the ratio between CD35+ and CD26+ populations at diagnosis and at 3 months of TKI therapy, qPCR for BCR::ABL1, and patients examined for GAS2, one of the top genes expressed in CML cells. Thus, >80 patient samples across different assays have been analyzed to strengthen the conclusions. (Please see the response to reviewer 1 for more details.)

    2. eLife assessment

      This study presents fundamental insights into the heterogeneity of chronic myeloid leukemia (CML) stem cells and their response to tyrosine kinase inhibitor therapy, shedding light on potential mechanisms underlying treatment failure. The study's robust methodology, supported by validation with bulk RNA-seq data and surface marker analysis, provides compelling evidence for the identified associations between cellular composition and treatment outcome. These findings contribute to our understanding of CML pathogenesis and may inform the development of more targeted therapeutic strategies.

    3. Reviewer #1 (Public Review):


      This manuscript by Warfvinge et al. reports the results of CITE-seq to generate single-cell multi-omics maps from BM CD34+ and CD34+CD38- cells from nine CML patients at diagnosis. Patients were retrospectively stratified by molecular response after 12 months of TKI therapy using European Leukemia Net (ELN) recommendations. They demonstrate heterogeneity of stem and progenitor cell composition at diagnosis, and show that compared to optimal responders, patients with treatment failure after 12 months of therapy demonstrate increased frequency of molecularly defined primitive cells at diagnosis. These results were validated by deconvolution of an independent previously published dataset of bulk transcriptomes from 59 CML patients. They further applied a BCR-ABL-associated gene signature to classify primitive Lin-CD34+CD38- stem cells as BCR:ABL+ and BCR:ABL-. They identified variability in the ratio of leukemic to non-leukemic primitive cells between patients, showed differences in expression of cell surface markers and determined that a combination of CD26 and CD35 cell surface markers could be used to prospectively isolate the two populations. The relative proportion of CD26-CD35+ (BCR:ABL-) primitive stem cells was higher in optimal responders compared to treatment failures, both at diagnosis and following 3 months of TKI therapy.


      The studies are carefully conducted and the results are very clearly presented. The data generated will be a valuable resource for further studies. The strengths of this study are the application of single-cell multi-omics using CITE-Seq to study individual variations in stem and progenitor clusters at diagnosis that are associated with good versus poor outcomes in response to TKI treatment. These results were confirmed by deconvolution of a historical bulk RNAseq data set. Moreover, they are also consistent with a recent report from Krishnan et al. and are a useful confirmation of those results. The major new contribution of this study is the use of gene expression profiles to distinguish BCR-ABL+ and BCR-ABL- populations within CML primitive stem cell clusters and then applying antibody-derived tag (ADT) data to define molecularly identified BCR:ABL+ and BCR-ABL- primitive cells by expression of surface markers. This approach allowed them to show an association between the ratio of BCR-ABL+ vs BCR-ABL- primitive cells and TKI response and study dynamic changes in these populations following short-term TKI treatment.


      The number of samples studied by CITE-Seq is limited. However, the authors have confirmed their key observations in additional samples. The BCR-ABL+ versus BCR-ABL- status of cells was not confirmed by direct sequencing for BCR-ABL. However, we recognize that the methodologies to perform these analyses on single cells are still evolving and the authors have shown that CD26 and CD35 expression can consistently identify BCR-ABL+ versus BCR-ABL- cells. It will be of interest to learn whether the GEP and surface markers identified here can distinguish BCR-ABL+ primitive stem cells later in the course of TKI treatment.

    4. Reviewer #3 (Public Review):


      In this study, Warfvinge and colleagues use CITE-seq to interrogate how CML stem cells change between diagnosis and after one year of TKI therapy. This provides important insight into why some CML patients are "optimal responders" to TKI therapy while others experience treatment failure. CITE-seq in CML patients revealed several important findings. First, substantial cellular heterogeneity was observed at diagnosis, suggesting that this is a hallmark of CML. Further, patients who experienced treatment failure demonstrated increased numbers of primitive cells at diagnosis compared to optimal responders. This finding was validated in a bulk gene expression dataset from 59 CML patients, in which it was shown that the proportion of primitive cells versus lineage-primed cells correlates to treatment outcome. Even more importantly, because CITE-seq quantifies cell surface protein in addition to gene expression data, the authors were able to identify that BCR/ABL+ and BCR/ABL- CML stem cells express distinct cell surface markers (CD26+/CD35- and CD26-/CD35+, respectively). In optimal responders, BCR/ABL- CD26-/CD35+ CML stem cells were predominant, while the opposite was true in patients with treatment failure. Together, these findings represent a critical step forward for the CML field and may allow more informed development of CML therapies, as well as the ability to predict patient outcomes prior to treatment.


      This is an important, beautifully written, well-referenced study that represents a fundamental advance in the CML field. The data are clean and compelling, demonstrating convincingly that optimal responders and patients with treatment failure display significant differences in the proportion of primitive cells at diagnosis, and the ratio of BCR-ABL+ versus negative LSCs. The finding that BCR/ABL+ versus negative LSCs display distinct surface markers is also key and will allow for more detailed interrogation of these cell populations at a molecular level.


      CITE-seq was performed in only 9 CML patient samples and 2 healthy donors. Additional samples would greatly strengthen the very interesting and notable findings.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We want to thank the reviewers for their thoughtful analysis and questions.

      A brief overview of the changes to the manuscript is provided here, with individual responses to the reviewer comments following.

      The methods section has been expanded to better explain the techniques used in our analyses. The CTCF binding data section has likewise been expanded to include more detail on the dataset and our analysis of its contents. All other requested clarifications have been added to the relevant areas of the results.

      Beyond specific requests from the reviewers, we made the following changes.

      We felt that a particular terminology choice on our part resulted in some confusion: the use of “SNPs” to refer to genetic variants within our Diversity Outbred samples. While we used SNPs that lay closest to the center of our haplotype predictions as our representative loci for each linkage disequilibrium block, this was done for computational purposes only. We did not focus most of our analyses on the haplotypes themselves, because of the uncertainty of which variants within an LD block actually participated in the genetic-epigenetic interactions we imputed.

      Thus, we edited the text to remove mention of “SNPs” unless our analysis did directly and deliberately profile SNPs themselves. In all other cases, we now refer to “haplotypes”, “genetic variants”, or “variants”. This should help increase clarity in the manuscript as a whole.
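
      The choice of a representative locus for each linkage disequilibrium block, as described above, can be illustrated with a minimal sketch. It assumes each block is summarized by start and end coordinates and that the block center is their midpoint; the function name and the coordinates are hypothetical, and the actual haplotype predictions may define the center differently.

```python
def representative_snp(block_start, block_end, snp_positions):
    """Return the SNP position closest to the midpoint of an LD block.

    The midpoint is an assumed stand-in for the "center" of the haplotype
    prediction; ties are resolved in favor of the first position listed.
    """
    if not snp_positions:
        raise ValueError("no SNPs available for this block")
    center = (block_start + block_end) / 2
    return min(snp_positions, key=lambda pos: abs(pos - center))


# Hypothetical block and SNP coordinates (placeholder values only).
chosen = representative_snp(1000, 2000, [1100, 1480, 1900])
```

      The chosen SNP serves only as a computational label for the block, which is why the text refers to haplotypes or variants rather than to the SNP itself.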

      A small error was discovered within the labelling and processing of regression model outputs in chromosome 14. A consistency check was run on all chromosomes, finding that only Chr 14 was affected. Chr 14 was rerun in its entirety to verify its results, with the previous results now archived within our databases uploaded on Synapse (see Methods for a link). All relevant calculations and figures were regenerated, resulting in an average shift of 1% or less across the manuscript. All analyses remain highly statistically significant.

      Responses to comments from Reviewer #1


      • Sequencing depth was retrieved from the original publication on the primary multiomics dataset. (Line 105-106)

      • A line was added regarding initial mouse genome alignment for the original publication: we explain the GigaMUGA genotyping array, used for the DO mESC samples. For our ChIP-seq data, we reword to specify: we used liftovers from imputed strain-specific genomes to B6 mm10. (Lines 108-110; 116-120; 168-170)

      • Aneuploidy removal is expanded upon in a similar fashion: the original QC identified chromosome-level gene expression differences to remove aneuploid samples. (Line 111)

      • Mention of the pre-publication use of an alternative null model has been removed, given its lack of relevance to the rest of the text. While it was interesting to compare to the standard null model, it amounts to a side note that distracts from the focus of the paper. (Line 137-139).

      • Descriptive subheadings have been added.

      Results - Line 179 (now Line 191) now points to Methods.

      • Line 189-200 (now Line 188-204): language altered to better explain our intent: We wished to perform an intrachromosomal scan across the whole genome for non-additive genetic-epigenetic interactions. However, there were computational limits to how many possible combinations of gene, haplotype, and ATAC-seq peak we could feasibly test. We thus generated a random subset of possible combinations. This was also performed to identify target regions for focused analyses.
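
      Drawing a random subset of gene, haplotype, and ATAC-seq peak combinations without enumerating every possible combination can be sketched as below. This is a minimal illustration assuming the three lists belong to the same chromosome; the function name, seed, and identifiers are hypothetical and do not reflect the authors' actual sampling code.

```python
import random


def sample_triples(genes, haplotypes, peaks, k, seed=0):
    """Draw k distinct (gene, haplotype, peak) combinations uniformly at random.

    Each combination is addressed by a single integer index, so the full
    Cartesian product never has to be materialized in memory.
    """
    total = len(genes) * len(haplotypes) * len(peaks)
    if k > total:
        raise ValueError("requested more combinations than exist")
    rng = random.Random(seed)
    triples = []
    for idx in rng.sample(range(total), k):
        g, rest = divmod(idx, len(haplotypes) * len(peaks))
        h, p = divmod(rest, len(peaks))
        triples.append((genes[g], haplotypes[h], peaks[p]))
    return triples


# Hypothetical identifiers for one chromosome (placeholders only).
genes = ["gene_1", "gene_2", "gene_3"]
haplotypes = ["hap_1", "hap_2", "hap_3", "hap_4"]
peaks = ["peak_1", "peak_2", "peak_3", "peak_4", "peak_5"]
subset = sample_triples(genes, haplotypes, peaks, k=10)
```

      Sampling indices rather than tuples keeps memory use proportional to the subset size, which matters when the number of possible combinations is very large.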

      • Line 195 (now line 206, expanded on in Line 210): Clarification added on the significance of our result: if non-additive genetic-epigenetic interactions were not a significant explanatory factor for gene expression, we would expect to see no enrichment of low p-value results. Instead, we see 0.07% of our models coming in at adj. p < 1 × 10⁻⁷.
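
      To make the notion of a non-additive interaction concrete, the following minimal sketch compares an additive regression (expression ~ genotype + accessibility) with one that adds a genotype × accessibility term, using the standard F statistic for nested models. It is an illustration under stated assumptions only: the model specification, covariates, and multiple-testing adjustment used in the manuscript are not reproduced here, and all values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(design, y):
    """Residual sum of squares of the ordinary least-squares fit y ~ design."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def interaction_f_stat(genotype, accessibility, expression):
    """F statistic (1 numerator df) for adding a genotype x accessibility term."""
    dof = len(expression) - 4  # residual degrees of freedom of the full model
    if dof <= 0:
        raise ValueError("need more than four samples")
    additive = [[1.0, g, a] for g, a in zip(genotype, accessibility)]
    full = [[1.0, g, a, g * a] for g, a in zip(genotype, accessibility)]
    rss_add, rss_full = rss(additive, expression), rss(full, expression)
    return (rss_add - rss_full) / (rss_full / dof)


# Hypothetical data: expression depends on accessibility only when genotype is 1.
genotype = [0, 0, 0, 0, 1, 1, 1, 1]
accessibility = [1, 2, 3, 4, 1, 2, 3, 4]
noise = [0.1, -0.1, 0.1, -0.1, -0.1, 0.1, -0.1, 0.1]
expression = [2 * g * a + e for g, a, e in zip(genotype, accessibility, noise)]
f_stat = interaction_f_stat(genotype, accessibility, expression)
```

      A large F statistic indicates that the interaction term explains variance the additive model cannot; converting it to a p-value requires the F distribution, which is omitted here to keep the sketch dependency-free.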

      • Line 199 (now Line 216): The requested calculations were run, and are now included in table S3. We found that within 4 Mb of a given gene, less than 10% of variants and ATAC peaks clustered closer to each other than they did to the gene they affected.

      Please note that this figure has a level of uncertainty due to linkage disequilibrium. Thus, rather than precisely answering the question “[are there haplotype-ATAC pairs] that are in the same locality but further away from the gene?”, we asked "is the ATAC peak closer than the gene to the point where we have the highest confidence of correctly calling the interacting genotype?". The relevant code has been deposited in our Synapse repository (see Methods for link).
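
      The reframed question, whether the ATAC peak lies closer than the gene to the position of highest genotype-call confidence, reduces to a distance comparison. The sketch below is a minimal illustration of that comparison; it is not the code deposited on Synapse, and the function names and coordinates are hypothetical.

```python
def peak_closer_than_gene(variant_pos, peak_pos, gene_pos):
    """True if the ATAC peak is nearer to the variant position than the gene is."""
    return abs(peak_pos - variant_pos) < abs(gene_pos - variant_pos)


def fraction_peak_closer(triples):
    """Fraction of (variant, peak, gene) position triples where the peak is closer."""
    if not triples:
        raise ValueError("no triples supplied")
    hits = sum(peak_closer_than_gene(v, p, g) for v, p, g in triples)
    return hits / len(triples)


# Hypothetical base-pair coordinates on one chromosome (placeholders only).
example_triples = [
    (100, 150, 500),  # peak 50 bp away, gene 400 bp away: peak is closer
    (100, 900, 300),  # peak 800 bp away, gene 200 bp away: gene is closer
    (100, 120, 110),  # peak 20 bp away, gene 10 bp away: gene is closer
]
closer_fraction = fraction_peak_closer(example_triples)
```

      Here the variant position stands in for the representative locus of the haplotype block, so the result inherits the linkage disequilibrium uncertainty noted above.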

      • Line 205 (now restructured in Line 221-228): The text has been edited to specify our intent. We are referring to a set of TAD-focused regression models we generated (see Methods) that comprehensively included all possible interactions between genes, and all haplotypes and ATAC peaks within +/- 1 TAD of the gene.

      • (Line 227): We specified that the previously-published TAD boundary dataset we used was retrieved from the Bing Ren lab’s Hi-C projects, which imputed locations of TAD boundaries in B6 mESCs.

      • We have relabeled Figure 1 and tweaked the surrounding text to clear up some confusing aspects. The Euler plots in Figure 1D-E reflect the fact that each ATAC-seq peak and haplotype can be in multiple relationships with local genes and regulatory factors. Some of these relationships will be simple correlation between their presence and gene expression, while others may co-regulate alongside independent regulatory factors, or engage in non-additive regulatory interactions.

      Because these non-additive regulatory interactions have not been comprehensively studied, we wished to determine whether there were any regulatory factors within our data that would not be detected as significant via more conventional methods, such as correlation analysis, mediation analysis, or regression analysis without an interaction term. Our Euler plots show that there are large subsets of both ATAC-seq peaks and haplotypes that are exclusively found in non-additive interactions. This is our justification for focusing on non-additive interactions for the rest of the paper.
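
      A toy example may help show why a factor can be invisible to analyses without an interaction term. The sketch below uses an invented, balanced 2 × 2 design (haplotype by peak accessibility) in which both marginal effects are zero while the interaction contrast is not. The data and function names are illustrative assumptions and are not taken from our models.

```python
from statistics import mean


def cell_means(samples):
    """Mean expression for each (haplotype, peak_open) combination."""
    groups = {}
    for hap, peak_open, expr in samples:
        groups.setdefault((hap, peak_open), []).append(expr)
    return {key: mean(values) for key, values in groups.items()}


def marginal_effect(samples, index):
    """Difference in mean expression between level 1 and level 0 of one factor."""
    level1 = [s[2] for s in samples if s[index] == 1]
    level0 = [s[2] for s in samples if s[index] == 0]
    return mean(level1) - mean(level0)


def interaction_contrast(samples):
    """Effect of accessibility on haplotype 1 minus its effect on haplotype 0."""
    m = cell_means(samples)
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Toy data: (haplotype, peak_open, expression). Purely illustrative.
toy = [
    (0, 0, 1.0), (0, 0, 1.0),
    (0, 1, 0.0), (0, 1, 0.0),
    (1, 0, 0.0), (1, 0, 0.0),
    (1, 1, 1.0), (1, 1, 1.0),
]
hap_effect = marginal_effect(toy, 0)
acc_effect = marginal_effect(toy, 1)
interaction = interaction_contrast(toy)
print(hap_effect, acc_effect, interaction)
```

      In this constructed case, neither factor shows any marginal association with expression, yet the effect of accessibility reverses between haplotypes, which only an interaction term captures.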

      • Line 256 (now Line 252-255): We further clarified the above in this section: correlation and mediation analyses were previously completed by the team which initially analyzed the DO mESC dataset (Skelly et al. 2020, Cell Stem Cell). They performed a correlation analysis between open chromatin and gene expression (Skelly et al. Fig. 2A), and identified expression quantitative trait loci (eQTL) (Skelly et al. Fig. 2E). We felt that more direct comparisons to the Skelly et al. data would distract readers from our focus on genetic-epigenetic interactions. Thus, we limited our discussion of non-interacting regulatory relationships to Figures 1-2, and a brief mention in Figure 5.

      • Line 290 (now Line 337): We pulled promoter locations from the FANTOM5 database of mouse promoters, and included analysis in both the text and Figure S4A-B.

      • (Line 475-476): We clarified “DO founder SNPs” to “SNPs from the non-reference DO founder strains”.

      • Line 472 (restructured in Lines 531-564): We have expanded on this section, including answers to the reviewer’s questions regarding ChIP-seq peak counts and overlap with the TAD map we used for our other analyses, and a fuller description of the strain-specific CTCF binding we identified in our ChIP-seq analysis.

      Responses to comments from Reviewer #2:

      (1) Typo corrected.

      (2) Lines 194-195 (now Line 206, expanded on in Line 210): We have expanded upon the intent and expectations of our analysis. In summary: if non-additive genetic-epigenetic interactions were not a significant explanatory factor for gene expression, we would expect to see no enrichment of low p-value results. Thus, we would expect 0.00001% of results to reach adj. p < 1 × 10⁻⁷. Instead, we see 0.07% of our models coming in at adj. p < 1 × 10⁻⁷, roughly four orders of magnitude greater than expected.

      (3) Lines 226-230 (Expanded on in Lines 252-276): We have relabeled Figure 1 and tweaked the surrounding text to clear up some confusing aspects. The percentages in the text are derived from the data summarized in the Euler plots in Figure 1D-E. These plots reflect the fact that each ATAC-seq peak and haplotype can be in multiple relationships with local genes and regulatory factors. Some of these relationships will be simple correlation between their presence and gene expression, while others may co-regulate alongside independent regulatory factors, or engage in non-additive regulatory interactions.

      (4) Line 261-263 (now lines 299-300): A companion to Figure 2B has been added (Fig. S3), which provides interaction counts for each ATAC-seq peak that contributed to Figure 2B. A horizontal line is included to highlight the locations of the highly-interacting ATAC peaks.

      (5) Analysis regarding Figure 3B had been removed from its original context. It has now been restored to the manuscript (Line 368-371).

    2. eLife assessment

      This important manuscript reports interactions between genetic variation, DNA accessibility, and chromatin structure in gene expression at a genome-wide scale. The authors found that most of these interactions occur within topologically associating domains (TADs) and that 3D genome structure data can be efficiently used to guide the discovery of significant genetic and epigenetic influences on gene expression. Overall, this convincing study highlights the importance of 3D chromatin structure in controlling how gene expression is regulated by genetic and epigenetic processes.

    3. Reviewer #1 (Public Review):

      This is an important manuscript that links gene expression to genetic variants and regions of open chromatin. The mechanisms of genetic gene regulation are essential to understanding how standing genetic variation translates to function and phenotype. This data set has the ability to add substantial insight into the field. In particular, the authors show how the relationships between variants, chromatin, and genes are spatially constrained by topologically associating domains.

    4. Reviewer #2 (Public Review):

      The experiments described in the manuscript are well designed and executed. Most of the data presented are of high quality, convincing, and in general support the conclusions made in the manuscript. This manuscript should be of great interest to the field of mammalian gene regulation and the approaches used here can have broader applications in studying genetic and epigenetic regulations of gene expression. The key finding reported here, the importance of 3D chromatin structure in controlling gene expression, although not unexpected, offers a better understanding of the physiological roles of TADs.

      Comments on revised version:

      I think the authors have substantially addressed reviewers' concerns. I have no further comments to add.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendation for the authors)

      I only have one comment for improvement of this study and it has to do with the comparison of simulators that they conducted. There are many other simulators around now, including scDesign3, spaSim, SPIDER, SRTSIM, etc. Are any of those methods worth including in the comparison?

      Indeed, many of the mentioned simulators did not exist when we initially developed synthspot, and upon closer examination, they are not directly comparable to our tool.

      • scDesign3: The runtime of scDesign3 is quite long as a result of its generative model. The example provided in its tutorial only simulates 183 genes and takes over seven minutes when using four cores on a system with Intel Xeon E5-2640 CPUs running at 2.5GHz. In a small downsampling analysis, we simulated 10, 50, 100, and 150 genes with scDesign3 and observed runtimes of 30, 130, 245, and 360 seconds, respectively. This seems to indicate a linear relationship between the number of genes and the runtime, therefore rendering it unsuitable for simulating whole-transcriptome datasets for deconvolution.

      • spaSim: spaSim focuses on modelling cell locations in different tissue structures but does not provide gene expression data. It is designed for testing cell colocalization capabilities rather than simulating gene expression.

      • SPIDER: Although SPIDER appears to have some overlap with our work, it seems to be in the early stages of development. The GitHub repository contains only two scripts without any documentation, and the preprint does not provide instructions on how to use the tool.

      • SRTSim: SRTSim explicitly states in its publication that it is not suitable for evaluating cell type deconvolution, as its focus is on simulating gene expression data without modelling cell type composition.

      • scMultiSim: scMultiSim, like scDesign3, is limited in its capability to model the entire transcriptome.
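
      The scaling argument for scDesign3 can be checked with a simple least-squares fit to the four runtimes reported above. The sketch below is illustrative only. The whole-transcriptome size of 20,000 genes is an assumed round number, and extrapolating this far beyond the measured range is necessarily rough.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Runtimes reported in the downsampling analysis above.
genes = [10, 50, 100, 150]
seconds = [30, 130, 245, 360]

slope, intercept = fit_line(genes, seconds)

TARGET_GENES = 20_000  # assumed whole-transcriptome size, for illustration
projected_hours = (slope * TARGET_GENES + intercept) / 3600
print(f"{slope:.2f} s per gene, about {projected_hours:.1f} h for {TARGET_GENES} genes")
```

      At a little over two seconds per gene, a single whole-transcriptome simulation would take on the order of half a day under this configuration, which is impractical when many synthetic datasets are needed.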

      Nonetheless, the inherent modularity of our Nextflow framework makes it possible for users to simply run the deconvolution methods on data that has been simulated by other simulators if need be.

      Additionally, we have added the following rationale for why we developed synthspot in “Synthspot allows simulation of artificial tissue patterns”:

      “On the other hand, general-purpose simulators are typically more focused on other inference tasks, such as spatial clustering and cell-cell communication, and are unsuitable for deconvolution. For instance, generative models and kinetic models like those of scDesign3 and scMultiSim are computationally intensive and unable to model entire transcriptomes. SRTSim focuses on modeling gene expression trends and does not explicitly model tissue composition, while spaSim only models tissue composition without gene expression.”

      The other aspect of the simulation comparison that I'm missing is some kind of spatial metric. There are metrics about feature correlation, sample-sample correlation, library size, etc. But, what about spatial correlation (e.g., Moran's I or similar). Perhaps comparing the distribution of Moran's I across genes in a simulated and real dataset would be a good first start.

      We would like to clarify that synthspot does not actually simulate the spatial location of spots, but synthetic regions where spots from the same region share similar compositions. Hence, incorporating a spatial metric in the comparison is not feasible. However, as RCTD is the only method that explicitly uses spot locations in its model (Supplementary Table 2, "Location information"), we believe that generating synthetic datasets with actual coordinates would not significantly impact the conclusions of the study.
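
      For reference, the metric named by the reviewer can be sketched as follows. Moran's I requires a spatial weight matrix built from spot coordinates or a neighbour graph, which is exactly the information synthspot does not produce. The chain-shaped neighbourhood and the toy values below are assumptions made purely for illustration.

```python
def morans_i(values, weights):
    """Moran's I for one feature, given a square spatial weight matrix."""
    n = len(values)
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


def chain_weights(n):
    """Binary weights for n spots on a line; only adjacent spots are neighbours."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
clustered = morans_i([1, 1, 1, 0, 0, 0], w)    # similar values sit together
alternating = morans_i([1, 0, 1, 0, 1, 0], w)  # neighbours always differ
print(clustered, alternating)
```

      Positive values indicate spatial clustering and negative values indicate dispersion. Without spot coordinates there is no principled way to define the weight matrix for synthspot output.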

      Reviewer #2 (Public Review)

      On the other hand, the authors state that in silver standard datasets one half of the scRNA-seq data was used for simulation and the other half was used as a reference for the algorithms, but the method of splitting the data, i.e., at random or proportionally by cell type, was not specified.

      The data was split proportionally by cell type. To clarify this, we have included an additional sentence in the main text under the first paragraph of “Cell2location and RCTD perform well in synthetic data”, as well as in Figure S2.
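
      A split that is proportional by cell type can be sketched as below. This is a minimal illustration under assumed names and toy labels, not the code used in the benchmark: each cell type is shuffled and halved separately, so both halves keep the original composition.

```python
import random
from collections import defaultdict


def split_by_cell_type(cell_types, seed=0):
    """Split cell indices into two halves, stratified by cell type."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for index, cell_type in enumerate(cell_types):
        by_type[cell_type].append(index)

    simulation, reference = [], []
    for cell_type in sorted(by_type):
        indices = by_type[cell_type]
        rng.shuffle(indices)
        half = len(indices) // 2
        simulation.extend(indices[:half])   # used to generate synthetic spots
        reference.extend(indices[half:])    # used as the deconvolution reference
    return sorted(simulation), sorted(reference)


# Toy labels; the cell type names are placeholders.
labels = ["neuron"] * 10 + ["astrocyte"] * 6 + ["microglia"] * 4
sim_idx, ref_idx = split_by_cell_type(labels)
print(len(sim_idx), len(ref_idx))
```

      With an odd number of cells in a type, this sketch assigns the extra cell to the reference half, which is an arbitrary choice.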

      Reviewer #2 (Recommendation for the authors)

      Figure legends in Figures 3, 4 and across most Supplementary material are almost illegible. Please consider increasing font size for better readability.

      Thank you for bringing this to our attention. The font size has been increased for all main and supplementary figures. Additionally, the supplementary figures have also been exported in higher resolution.

      Supplementary Notes Figure 2c reads "... total count per sampled multiplied by..."

      This has been corrected, as have the captions of Supplementary Notes Figures 3c and 4c, which had the same typo.

      Reviewer #3 (Public Review)

      The simulation setup has a significant weakness in the selection of reference single-cell RNAseq datasets used for generating synthetic spots. It is unclear why a mix of mouse and human scRNA-seq datasets were chosen, as this does not reflect a realistic biological scenario. This could call into question the findings of the "detecting rare cell types remains challenging even for top-performing methods" section of the paper, as the true "rare cell types" would not be as distinct as human skin cells in a mouse brain setting as simulated here.

      We appreciate the reviewer’s concern and would like to clarify that within one simulated dataset, we never mix mouse and human scRNA-seq data together. The synthetic spots generated for the silver standards are always sampled from a single scRNA-seq or snRNA-seq dataset. Specifically, for each of the seven public scRNA-seq datasets, we generate synthetic datasets with one of nine abundance patterns, resulting in a total of 63 synthetic datasets. These abundance patterns only affect the sampling priors that are used—the spots are still created with combinations of cells sampled from the same dataset.
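
      The principle described here can be illustrated with a simplified sketch. It is not synthspot's actual implementation: the function name, the representation of an abundance pattern as a dictionary of sampling priors, and the toy count matrix are all assumptions. The point is only that every cell contributing to a spot is drawn from the same single dataset.

```python
import random


def simulate_spot(counts, cell_types, type_priors, n_cells, rng):
    """Build one synthetic spot from cells of a single dataset.

    Cell types are drawn according to `type_priors` (standing in for an
    abundance pattern), one cell of each drawn type is picked at random,
    and the counts of the picked cells are summed gene by gene.
    """
    by_type = {}
    for index, cell_type in enumerate(cell_types):
        by_type.setdefault(cell_type, []).append(index)

    types = sorted(type_priors)
    chosen_types = rng.choices(
        types, weights=[type_priors[t] for t in types], k=n_cells
    )
    chosen_cells = [rng.choice(by_type[t]) for t in chosen_types]

    n_genes = len(counts[0])
    spot = [sum(counts[c][g] for c in chosen_cells) for g in range(n_genes)]
    composition = {t: chosen_types.count(t) / n_cells for t in types}
    return spot, composition


# Toy single-dataset input: 4 cells x 3 genes (illustrative values only).
toy_counts = [[1, 0, 2], [1, 0, 2], [0, 5, 0], [0, 5, 0]]
toy_types = ["A", "A", "B", "B"]

rng = random.Random(1)
spot, composition = simulate_spot(
    toy_counts, toy_types, {"A": 1.0, "B": 0.0}, n_cells=5, rng=rng
)
print(spot, composition)
```

      Changing the priors changes the composition of the spots but never the dataset from which the cells are drawn.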

      Furthermore, it is unclear why the authors developed Synthspot when other similar frameworks, such as SRTsim, exist. Have the authors explored other simulation frameworks?

      While there are other simulation frameworks available now, synthspot was designed to specifically address the requirements of our study, offering unique capabilities that make it suitable for deconvolution evaluation. Moreover, many of the simulators did not exist when we initially developed our tool. We have added the following rationale for why we developed synthspot in “Synthspot allows simulation of artificial tissue patterns”:

      “On the other hand, general-purpose simulators are typically more focused on other inference tasks, such as spatial clustering and cell-cell communication, and are unsuitable for deconvolution. For instance, generative models and kinetic models like those of scDesign3 and scMultiSim are computationally intensive and unable to model entire transcriptomes. SRTSim focuses on modeling gene expression trends and does not explicitly model tissue composition, while spaSim only models tissue composition without gene expression.”

      In our response to Reviewer 1 copied below, we also outline specific reasons why other simulators were not suitable for our benchmark:

      • scDesign3: The runtime of scDesign3 is quite long as a result of its generative model. The example provided in its tutorial only simulates 183 genes and takes over seven minutes when using four cores on a system with Intel Xeon E5-2640 CPUs running at 2.5GHz. In a small downsampling analysis, we simulated 10, 50, 100, and 150 genes with scDesign3 and observed runtimes of 30, 130, 245, and 360 seconds, respectively. This seems to indicate a linear relationship between the number of genes and the runtime, therefore rendering it unsuitable for simulating whole-transcriptome datasets for deconvolution.

      • spaSim: spaSim focuses on modelling cell locations in different tissue structures but does not provide gene expression data. It is designed for testing cell colocalization capabilities rather than simulating gene expression.

      • SPIDER: Although SPIDER appears to have some overlap with our work, it seems to be in the early stages of development. The GitHub repository contains only two scripts without any documentation, and the preprint does not provide instructions on how to use the tool.

      • SRTSim: SRTSim explicitly states in its publication that it is not suitable for evaluating cell type deconvolution, as its focus is on simulating gene expression data without modelling cell type composition.

      • scMultiSim: scMultiSim, like scDesign3, is limited in its capability to model the entire transcriptome.

      Finally, we would have appreciated the inclusion of tissue samples with more complex structures, such as those from tumors, where there may be more intricate mixing between cell types and spot types.

      We acknowledge the reviewer's suggestion and have incorporated a melanoma dataset from Karras et al. (2022) in response to this suggestion. This study profiled melanoma tumors by using both scRNA-seq and spatial technologies. The scRNA-seq consists of eight immune cell types and seven melanoma cell states. We have included this study as an additional silver standard and case study, the latter of which is presented in a separate section following the liver analysis (and a corresponding section in Methods).

      We found that method performances on synthetic datasets generated from this melanoma dataset follow previous trends (Figure S3-S5). However, the inclusion of the case study led to the following changes in the overall rankings: cell2location and RCTD are now tied for first place (previously RCTD ranked first), and Seurat and SPOTlight have swapped places. Despite these changes, the core messages and conclusions of our paper remain unchanged. All relevant figures (Figures 1a, 2, 3a, 4a, 6b, 7a, S3-S6, S9) have been updated to incorporate these new analyses and results.

      Reviewer #3 (Recommendation for the authors)

      To maintain consistency in the results, it is recommended to exclude the human scRNAseq set when generating synthetic spots. Furthermore, addressing the other significant weaknesses mentioned earlier would be beneficial.

      Please refer to our response to the public review where we address the same remark.

      It is essential to differentiate this work from previous benchmarking and simulation frameworks.

      In addition to the rationale on why we developed our own framework (see response to the public review), we have included the following text in the discussion that highlights our versatile approach when using a real spatial dataset for evaluation:

      “In the case studies, we demonstrated two approaches for evaluating deconvolution methods in datasets without an absolute ground truth. These approaches include using proportions derived from another sequencing or spatial technology as a proxy, and leveraging spot annotations, e.g., zonation or blood vessel annotations, that typically have already been generated for a separate analysis.”

      Furthermore, we conducted an extra analysis in the liver case study, generating synthetic datasets with one experimental protocol and using the remaining two as separate references (Figure S13). This further illustrates the usefulness of our simulation framework, which we mentioned by appending this sentence in the discussion:

      “As in our silver standards, users can select the abundance pattern most resembling the real tissue to generate the synthetic spatial dataset, as we have also demonstrated in the liver case study.”

    2. Reviewer #1 (Public Review):

      Cell type deconvolution is one of the early and critical steps in the analysis and integration of spatial omic and single cell gene expression datasets, and there are already many approaches proposed for the analysis. Sang-aram et al. provide an up-to-date benchmark of computational methods for cell type deconvolution.

      In doing so, they provide some (perhaps subtle) additional elements that I would say are above the average for a benchmarking study: i) a full Nextflow pipeline to reproduce their analyses; ii) methods implemented in Docker containers (which can be used by others to run their datasets); iii) a fairly decent assessment of their simulator compared to other spatial omics simulators. A key aspect of their results is that they are generally very concordant between real and synthetic datasets. It is also important that the authors include an appropriate "simpler" baseline method to compare against; surprisingly, several methods performed below this baseline. Overall, because of these elements, this study has the potential to set the standard of benchmarks higher.

      The only weakness of this study that I can readily see is that this is a very active area of research and we may see other types of data start to dominate (CosMx, Xenium) and new computational approaches will surely arrive. The Nextflow pipeline will make the prospect of including new reference datasets and new computational methods easier.

    3. Reviewer #2 (Public Review):

      In this manuscript, Sang-aram et al. provide a systematic methodology and pipeline for benchmarking cell type deconvolution algorithms for spatial transcriptomic data analysis in a reproducible manner. They developed a tissue pattern simulator that starts from single-cell RNA-seq data to create silver standards and used spatial aggregation strategies from real in situ-based spatial technologies to obtain gold standards. By using several established metrics combined with different deconvolution challenges, they systematically scored and ranked 12 different methods and assessed both functional and usability criteria. Altogether, they present a reusable and extendable platform and reach very similar conclusions to other deconvolution benchmarking papers, including that RCTD, SpatialDWLS and Cell2location typically provide the best results. Major strengths of the simulation engine include the ability to downsample and recapitulate several cell and tissue organization patterns.

      More specifically, the authors of this study sought to construct a methodology for benchmarking cell type deconvolution algorithms for spatial transcriptomic data analysis in a reproducible manner. The authors leveraged publicly available scRNA-seq, seqFISH, and STARMap datasets to create synthetic spatial datasets modeled after that of the Visium platform. It should be noted that the underlying experimental techniques of seqFISH and STARMap (in situ hybridization) do not parallel that of Visium (sequencing), which could potentially bias simulated data. Furthermore, to generate the ground truth datasets, cells and their corresponding count matrices are represented by simple centroids. Although this simplifies the analysis, it might not accurately reflect Visium spots, where cells could lie on a boundary and affect deconvolution results.

      The authors thoroughly and rigorously compare methods while addressing situational discrepancies in model performance, indicative of a strong analysis. The authors make a point to address both inter- and intra-dataset reference handling, which has a significant impact on performance, as the authors note in the text and conclusions. Indeed, supplying optimal reference data is important, and potentially the most important factor, for achieving the best performance; hence, it is important to understand that experimental design or sample matching is at least as important as selecting the ideal deconvolution tool.

      Similarly, the authors conclude that many methods are still outperformed by bulk deconvolution methods (e.g. MuSiC or NNLS); however, it should be noted that these 'bulk' methods are also among the most sensitive when using an external (inter) dataset (S10), which likely resembles the more realistic scenario for most labs.

      As the authors also discuss, it's important to realize that deconvolution approaches are typically part of larger exploratory data analysis (EDA) efforts and require users to change parameters and input data multiple times. Thus, running time, computing needs, and scalability are probably key factors that researchers would like to consider when looking to deconvolve their datasets.

      The authors achieve their aim to benchmark different deconvolution methods, and the results from their thorough analysis support the conclusion that creating cell type deconvolution algorithms that can handle both cell abundance and rarity throughout a given tissue sample is challenging.

      The reproducibility of the methods described will have significant utility for researchers looking to develop cell type deconvolution algorithms, as this platform will allow simultaneous replication of the described analysis and comparison to new methods.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1

      (1) Since you only included patients with early-onset preeclampsia in the study, I suggest revising the title to "Identification of novel syncytiotrophoblast membrane extracellular vesicle derived protein biomarkers in early-onset preeclampsia...."

      We have revised the title to specify early-onset preeclampsia.

      (2) Under methods, you state that placenta was obtained from women undergoing elective cesarean section. Was this because all the study patients were delivered before the onset of labor? Or were laboring patients specifically excluded from the study?

      Indeed, labor influences the extracellular vesicles (EVs) generated. To ensure consistency in our samples and avoid this variable, we chose placentas obtained from elective cesarean sections (CS) for our study.

      (3) In Table 1 on page 10, the 8th row (Birth weight grams) needs to be reformatted. The mean birthweights for normal pregnancy and preeclampsia should be the same.

      We have reformatted the table and now use ranges instead of brackets.

      (4) In the legend for Table 1, the sentence beginning on page 10, line 227, and continuing onto page 11, line 228, does not make sense. Part of the sentence was omitted inadvertently.

      We have modified this sentence to:

      Detergent treatment, which could break down EVs, with NP-40 confirmed that the majority (99%) of our samples were largely vesicular since only 0.1 ± 0.12% of BODIPY FL N-(2-aminoethyl)-maleimide and PLAP double-positive events were detected (a reduction of 99%) (Figure 1E and 1H).

      (5) As you acknowledge, the sample size (12 patients) was small. This is understandable because early-onset preeclampsia occurs in <1% of parturients. You could collaborate with other centers in future studies to increase the sample size.

      Thank you very much for your comment. We are willing to cooperate on future research and will try to expand our sample size in subsequent studies.

      Reviewer #2 (Recommendations For The Authors):

      (1) This is one of the many "catalogue" papers where placental exosome proteins in preeclampsia are profiled. Thus, the manuscript lacks novelty. The only novelty factor is the authors have isolated exosomes by a different method and even separated the small and large exosomes. However, there is no mention of how these exosomes differ from each other in terms of their functionality. Thus it is hard to judge the biological significance of this work.

      We appreciate your insights regarding the novelty of our study. While numerous papers have profiled placental exosome proteins in preeclampsia, our methodology for enriching sSTB-EVs (exosomes) offers a distinct perspective. We believe that the separation of sSTB-EVs (exosomes) and medium/large STB-EVs (microvesicles) introduces a differentiation that extends beyond mere profiling, with implications for their functionality. Previous studies have shown that placental EVs of different sizes have distinct characteristics (Zabel RR, et al. Enrichment and characterization of extracellular vesicles from ex vivo one-sided human placenta perfusion. Am J Reprod Immunol. 2021 Aug;86(2)). Furthermore, the way cells internalize and respond to EVs may depend on the size of the EV (Zhuang X, et al. Treatment of brain inflammatory diseases by delivering exosome encapsulated anti-inflammatory drugs from the nasal region to the brain. Mol Ther. 2011 Oct;19(10)). Therefore, it will be important for future studies to distinguish between EVs of different sizes.

      (2) The authors must demonstrate that these two types of EVs are also produced in vivo by detecting them in the serum of women.

      Thank you for the comment. Many previous studies have shown the two types of placental EVs in women's blood. The extensive review by Nakahara et al. (PMCID: PMC7755551) compiles studies that have specifically isolated various subtypes of placenta-derived EVs from maternal circulation. We have also addressed this point in the introduction.

      (3) The authors must compare the proteomes of serum-derived placental exosomes and the proteome of the STBs isolated from the perfusion experiments to judge how overlapping the outcomes are from those produced naturally and those produced under ex vivo conditions.

      We appreciate the reviewer's suggestion to compare the proteomes of serum-derived placental sSTB-EVs (exosomes) with those from STBs isolated through perfusion experiments. Indeed, such a comparison would provide valuable insights into the similarities and differences between naturally produced and ex vivo-generated sSTB-EVs (exosomes). However, isolating placental EVs from maternal circulation for comprehensive proteomic profiling presents challenges. It requires a volume of serum or plasma large enough to enable the isolation of placenta-specific EVs from among the numerous EVs in the circulation. In addition, it requires multiple intricate steps, such as ultracentrifugation followed by immunoprecipitation, each of which can lead to the loss of EVs. Furthermore, given the high concentration of lipoproteins in plasma relative to EVs, there is a significant risk of obtaining low-purity isolates from the outset. These challenges might compromise the comparability of results between placenta-specific EVs from maternal circulation and those from ex vivo perfusion. Nevertheless, we acknowledge the value of such an endeavor and will consider incorporating this aspect in future studies as EV and proteomic methodology and technology improve and become more sensitive.

      (4) I have a major issue with the chosen study subjects. While the study title and the manuscript mention preeclampsia, as per the inclusion criteria mentioned in lines 88-90, the patients will be HELLP syndrome. Please clarify what was used and modify the manuscript accordingly.

      Thank you very much for finding this error. Our patients had none of the features that would qualify them for HELLP syndrome. We have edited the text to:

      PE was defined as new (after 20 weeks) systolic blood pressure of 140 mmHg or diastolic pressure of 90 mmHg, proteinuria (protein/creatinine ratio of 30 mg/mmol or more). None of our patients had maternal acute kidney injury, liver dysfunction, neurological features, hemolysis, or thrombocytopenia.

      (5) It is hard to reconcile how only 15 proteins were identified in the placental extract while 300+ in EVs. There is a methodological issue in the mass spec or extraction. With such widely different denominators in the total proteins identified, it is hard to compare the outcomes in terms of the three sample types.

      We acknowledge the reviewer's concerns regarding the disparity in protein counts between the placental extract and the EVs. Ultimately, more is not necessarily better. Several factors might contribute to this discrepancy. Firstly, it is plausible that certain proteins exhibit selective affinity to varying sizes of EVs, leading to a more diverse range of proteins than the placental extract. We were also stringent in our analysis to enable us to select proteins whose biological differences are more likely to be reproducible with a different validatory method like a western blot. Additionally, although the placental extract might contain a higher total protein concentration, it doesn't necessarily translate to a richer diversity of disease-specific proteins. Considering these nuances when comparing protein outcomes across sample types is helpful.

      (6) I am unable to understand the terms least differentially expressed and most differentially expressed. Do the authors mean upregulated and downregulated? Please clarify and use the terms appropriately by providing fold change values.

      We appreciate the reviewer's request for clarification. We intended the terms 'least differentially expressed' and 'most differentially expressed' to provide a relative measure of expression. The terms are roughly equivalent to down- and upregulated. Regarding EVs, we avoid using the terms 'upregulated' and 'downregulated' as EVs act as transporters and do not possess regulatory functions per se. However, for the placenta, we recognize the relevance of these terms.

      (7) The data presented is very superficial and lacks methodological details. The authors should provide the total number of targets achieved after mass spec, the cutoffs used, the FDRs, and other details.

      We apologize for the omission. We have added these details to the method section.

      (8) It is not clear how these differentially abundant proteins were identified. What was the cutoff used? Were they identified in all the replicates?

      We apologize for the omission. We have added these details to the method section.

      (9) How many samples were subjected to the discovery cohort, and how many were in the validation cohort? Were they the same or different? If the samples were different, how many PE samples had differentially abundant proteins by both methods?

      The study utilized 12 samples for initial discovery and another 12 for western blot validation. The validation samples specifically targeted proteins of interest, rather than undergoing another comprehensive mass spectrometry analysis.

      (10) It is striking that the authors report the expression of prostatic acid phosphatase in the placenta. In my understanding of placental biology, this gene or protein is not known to be expressed by the placenta. Please perform immunofluorescence to demonstrate that this protein is indeed produced in the STBs.

      Research has revealed that even though it's called prostate-specific antigen, it's created in tissues other than the prostate, such as the placenta. Here are several references to support this claim: PMID: 10634405, PMID: 7533063, PMID: 8939403, and PMID: 8945610. Hence it is likely not beneficial to demonstrate what many researchers have already demonstrated.

      (11) Please validate the differential abundance of these proteins in the exosomes isolated from the plasma of women with and without preeclampsia. A serial measurement will be of high value to determine how early as compared to hypertension, these biomarkers can predict preeclampsia.

      We are validating each EV-carried marker individually in the circulation (plasma or serum), localizing them in the placenta, and performing downstream functional analysis. This article is already lengthy and would likely be too cumbersome to include the details of all individual proteins in this manuscript. However, we have already published papers on Siglec 6 (PMID: 32998819) and Neprilysin (PMID: 30929513), and others will be published soon. We agree that there will be a lot of value to serial measurement, not just in terms of how early as compared to hypertension, these biomarkers can predict preeclampsia but also as potentially a more sensitive or specific test. This would be the subject of subsequent papers.

      (12) The authors are recommended to carry out immunofluorescence to localize the differentially abundant proteins in the placental sections and show that they are specific to STBs.

      We have already provided a similar response earlier (see response to point 11). In addition, while it is preferable, the biomarkers don't necessarily need to be specific to STB. Not all biomarkers are mechanistic agents/targets, and not all mechanistic agents are biomarkers. However, mechanistic agents should preferably be placental-specific. For example, the total sFLT1, the most studied biomarker, is not exclusively synthesized in the placenta, even though the placental-specific isoform represents a small fraction of the total sFLT-1. For example, in the non-placental world, alkaline phosphatase (ALP) is not exclusively produced by the liver but is a ‘biomarker’ of cholestatic disease.

      (13) Table 1 should give the range and SD could be given as + instead of the bracket.

      Thank you for your suggestion. We have edited it accordingly.

      (14) It is necessary to provide the gestational age of the onset of hypertension to get a judgment of how long these women were preeclamptic, culminating in HELLP.

      We want to emphasize that none of our patients experienced HELLP syndrome. In the results section, we have included the gestational age at the time of diagnosis in the table for preeclampsia. It's crucial to understand that the gestational age at diagnosis is distinct from the gestational age when hypertension initially appeared. Detecting the exact gestational age of hypertension onset would be challenging, and it would likely require a prospective or randomized clinical trial with continuous monitoring, possibly on a daily basis. However, our study is retrospective. Thus, we can only comment on the gestational age at diagnosis.

      (15) For newborns the term Sex is used and not gender

      Thank you for your suggestion. We have edited it accordingly.

      (16) Figure 2 is stretched and hard to read

      Thank you for your suggestion. We have edited it accordingly by creating two separate images to promote readability.

      (17) Line 278 change the sentence "there fifteen (15) proteins in the placenta" to "there were fifteen (15) proteins in the placenta"

      Thank you for your suggestion. We have edited it accordingly.

      (18) Line 288 you mean least and not lease

      Thank you for your suggestion. We have edited it accordingly.

    2. eLife assessment

      This study presents valuable findings that could be utilized for identifying women at risk for preeclampsia before the onset of the disease. The novel aspect of this study lies in the utilization of exosomes of two different sizes. The data are solid: the methods, data, and analysis broadly support the claims. This work will be of interest to medical researchers and clinicians who work on preeclampsia and women's health.

    3. Reviewer #1 (Public Review):

      The authors' primary objective in this study was to identify differences between patients with preeclampsia and normal patients with respect to the placental syncytiotrophoblast extracellular vesicle proteome.

      A strength of this study is that the authors identified novel STB-EV protein markers that are more abundant in the placenta of patients with preeclampsia compared with normal controls. This contributes a little more to what is already known about STB-EV markers and preeclampsia. If these markers can be shown to be more abundant in maternal plasma of preeclampsia patients, it would be very useful for identifying patients who are at high risk for developing early-onset preeclampsia.

      Weaknesses include:

      (1) The small sample size. There were only 6 patients in the study group and 6 normal controls. However, this can be considered as a pilot study.

      (2) The normal controls were not matched with the study patients and the authors did not state how the controls were selected.

      (3) The authors state that the placenta samples were obtained at the time of elective cesarean section. However, it is likely that all the preeclampsia patients were delivered for clinical indications rather than electively. This should be clarified.

    4. Reviewer #2 (Public Review):


      Preeclampsia is a disorder of pregnancy that affects 4-5% of pregnancies worldwide. Identifying this condition early is clinically relevant as it will help clinicians to make management decisions to prevent adverse outcomes. The placenta holds a key to many pregnancy-related pathologies including preeclampsia and studies have shown many differences in the placenta of women with preeclampsia as compared to controls. However as the placenta cannot be collected directly during pregnancy, the exosomes secreted by it are considered a good alternative to tissue biopsy. In this study, the authors have compared the proteins in different sizes of exosomes from the placenta of women with and without preeclampsia. The idea is to eventually use these as biomarkers for early detection of preeclampsia.


      The novelty factor of this study is the use of two different-sized exosomes which has not been achieved earlier.


      The study measured the proteins at only a single time point after the disease has already occurred. However, the placenta is an ever-changing tissue throughout pregnancy and different proteins can come up at different times in pregnancy. Thus serial measurements are necessary and a single time point measurement is insufficient. The study has not validated the identified biomarkers in plasma or circulating placental exosomes from women with and without preeclampsia. Thus the utility of these findings in real-life situations cannot be judged from this work.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study advances our knowledge of how parasites evade the host complement immune system. The new cryo-EM structure of the trypanosome receptor ISG65 bound to complement component C3b is highly compelling and well-supported by biochemical experiments. This work will be of broad interest to parasitologists, immunologists, and structural biologists.

      We thank the reviewers and editorial team for this assessment of our work.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors set out to use structural biology (cryo-EM), surface plasmon resonance, and complement convertase assays to understand the mechanism(s) by which ISG65 dampens the cytotoxicity/cellular clearance to/of trypanosomes opsonised with C3b by the innate immune system.

      The cryo-EM structure adds significantly to the authors' previous crystallographic data because the latter was limited to the C3d sub-domain of C3b. Further, the in vitro convertase assay adds an additional functional dimension to this study.

      The authors have achieved their aims and the results support their conclusions.

      The role of complement in immunity to T. brucei (or lack thereof) has been a significant question in molecular parasitology for over 30 years. The identification of ISG65 as the C3 receptor and now this study providing mechanistic insights represents a major advance in the field.

      Reviewer #2 (Public Review):

      This is an excellent paper that uses structural work to determine the precise role of one of the few invariant proteins on the surface of the African trypanosome. This protein, ISG65, was recently determined to be a complement receptor and specifically a receptor of C3, whose binding to ISG65 led to resistance to complement-mediated lysis. But the molecular mechanism that underlies resistance was unknown.

      Here, through cryoEM studies, the authors reveal the interaction interface (two actually) between ISG65 and C3, and based on this, make inferences regarding downstream events in the complement cascade. Specifically, they suggest that ISG65 preferably binds the converted C3b (rather than the soluble C3). Moreover, while conversion to a C3bB complex is not blocked, the ability to bind complement receptors 1 and 3 is likely blocked.

      Of course, all this is work on proteins in isolation and the remaining question is - can this in fact happen on the membrane? The VSG-coated membrane is supposed to be incredibly dense (packed at the limits of physical density) and so it is unclear whether the interactions that are implied by the structural work can actually happen on the membrane of a live trypanosome. This is not necessarily a dig but it should be addressed in the manuscript perhaps as a caveat.

      We thank the reviewer for their positive response to our work. We fully agree with the reviewer about the caveats which come from this work being done in a biochemical context. We have addressed this in lines 223-24 and 327-333.

      Reviewer #3 (Public Review):

      The authors investigate the mechanisms by which ISG65 and C3 recognize and interact with each other. The major strength is the identification of an exo-site by determining the cryoEM structure of the complex, which suggests new intervention strategies. This is a solid body of work that has an important impact on parasitology, immunology, and structural biology.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A paper by Sulzen et al was published online on 27th April in Nature Communications that has a similarity (the cryo-EM structure) to this paper. This does not detract from the value of this paper. The authors should, however, include a "compare and contrast" section in this paper to explain similarities and differences in the conclusions. For example, while this paper demonstrates that ISG65 does not prevent C3 convertase activity, the Sulzen paper suggests it does prevent C5 convertase activity. The compatibility of these conclusions should be discussed.

      Two studies of ISG65 were published shortly after submission of this manuscript (Sulzen et al and Lorenzen et al) and we have added a brief comparison of the conclusions of these papers here. These mentions include lines 151, 155-6, 201-2, 274-278, 292-93 and 321-323. For a more in-depth comparison we have published an opinion piece in Trends in Parasitology, which discusses all three of these papers and which we also now reference here.

      Could the authors comment as to whether they think the association of C3b with the unstructured region of ISG65 comes about via S-S shuffling? I.e., is C3b first thioester linked to VSG and then this rearranges to ISG65 through C3b-ISG65 proximity?

      We thank the reviewer for the interesting suggestion. However, we are not aware of evidence showing that C3b, which has been conjugated to a target protein through its covalent ester bond, then becomes transferred to a second target protein. As ISG65 can bind to C3 as well as C3b, we think that the conjugate could form when ISG65-bound C3 converts to C3b, becomes reactive and, through proximity, is most likely to conjugate to ISG65. Whether this occurs to a substantial degree in trypanosomes, or whether it is more likely that ISG65 interacts with C3b which is already VSG-conjugated, requires further experiments. We have edited lines 217-222 to make this point more clearly.

      Reviewer #3 (Recommendations For The Authors):

      The authors previously reported that the ISG65 C-terminus is so flexible that it is not resolved in their 2022 ISG65-C3d (TED of C3b) crystal structure, which is also the case here in the cryo-EM structure of ISG65-C3b. Thus, I am wondering how C3b might find the flexible C-terminus and form a covalent bond.

      We think that the answer to the reviewer’s question relates to local concentration. When two reactive compounds are not attached together, then they diffuse freely in three dimensions and their likelihood of colliding and reacting is subject to the randomness of Brownian motion. However, if they bind together through an interaction distinct from the reactive residues, then this increases their relative local concentration and the likelihood of collision and reaction taking place. In the case of ISG65, this is coupled with the ability of ISG65 to bind to C3 before it converts to C3b and becomes reactive. The interaction of ISG65 with C3/C3b will therefore bring together the reactive residues and increase the probability that they will collide and form a conjugate. Our control with BSA, which does not bind to C3/C3b and does not form these conjugates, supports this conclusion. We have edited lines 217-222 to clarify.
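
      A minimal sketch of the local-concentration argument, assuming the simplest possible picture: once two partners are held together, one reactive group is confined to a sphere around the other. The tether radius and the bulk concentration below are hypothetical values chosen only for illustration; neither is taken from the manuscript.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def effective_molarity(radius_nm: float) -> float:
    """Molar concentration of one molecule confined to a sphere of the given radius."""
    radius_dm = radius_nm * 1e-8  # 1 nm = 1e-8 dm, so the volume is in litres (dm^3)
    volume_l = (4.0 / 3.0) * math.pi * radius_dm ** 3
    return 1.0 / (AVOGADRO * volume_l)


# Hypothetical illustration values (assumptions, not measured quantities).
TETHER_RADIUS_NM = 5.0  # assumed reach of the bound reactive group
BULK_CONC_M = 1e-6      # assumed concentration of a freely diffusing partner

local = effective_molarity(TETHER_RADIUS_NM)
print(f"effective local concentration: {local * 1e3:.2f} mM")
print(f"enrichment over bulk: {local / BULK_CONC_M:.0f}-fold")
```

      Under these assumptions the confined partner is present at millimolar effective concentration, far above a micromolar bulk value, which is the qualitative point of the argument. The real geometry of the flexible C-terminus is not captured by a sphere.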

      I also find it puzzling that deleting L2 or L3 in ISG65, which they found forming additional contacts with the CUB domain of C3b (12 times binding tighter), does not affect the ISG65-C3b conjugate formation in the in vitro C3 convertase formation assay.

      When we consider the affinities that the L2 and L3 loop deletion variants have for ISG65, and the concentration of ISG65 in the C3 convertase assay, we would predict that the conjugates still form with the L2 and L3 variants. This binding would therefore increase the relative local concentration of the reactive residues and ensure preferential conjugate formation, as we observe.
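
      The reasoning here can be sketched with a simple 1:1 binding model in which the ligand is in excess, so that the bound fraction is [L] / (Kd + [L]). All numbers below are hypothetical: the Kd values and the assay concentration are placeholders, and only the 12-fold ratio between them echoes the figure quoted by the reviewer.

```python
def fraction_bound(ligand_conc_m: float, kd_m: float) -> float:
    """Fraction of binding sites occupied for 1:1 binding with ligand in excess."""
    return ligand_conc_m / (kd_m + ligand_conc_m)


# Hypothetical values for illustration only (not taken from the manuscript).
ASSAY_CONC_M = 10e-6         # assumed ISG65 concentration in the assay
KD_FULL_M = 0.1e-6           # assumed Kd of the intact protein
KD_VARIANT_M = 12 * KD_FULL_M  # a 12-fold weaker variant

for label, kd in [("intact", KD_FULL_M), ("loop variant", KD_VARIANT_M)]:
    occupancy = fraction_bound(ASSAY_CONC_M, kd)
    print(f"{label}: fraction bound = {occupancy:.2f}")
```

      When the assay concentration is well above both Kd values, a 12-fold loss of affinity changes the occupancy only modestly, so the complex would still form. Whether this regime applies depends on the measured affinities and the actual assay concentration.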

      (1) Page 2 bottom line, "In particular, loop 2 forms a direct contact with the CUB domain of ISG65, centered around an electrostatic", ISG65 should be C3b.

      We thank the reviewer for spotting this. It has been corrected.

      (2) Page 4, "We found that ISG65 does not complete with either factor B or Factor D and does not block the binding of factor Bb (Figure 3b). This suggests that the C3 convertase can form in the presence of ISG65", "complete" should be "compete".

      It has been corrected.

      (3) Page 4, "revealed that in the presence of ISG65 a high molecular weight band appeared, which we identified through mass spectrometry to be a conjugate of ISG65 with C3b". There is no mass spectrometry data in the manuscript to support this.

      We agree with the reviewer that this data should be included in the paper and have now added it as Supplementary Table 3.

      (4) Page 5, "By inhibiting binding of CR2 to C3d, ISG65 will reduce the likelihood that B-cell receptor binding to trypanosome antigens will result in B-cell activation and antibody production." - this sentence is a bit confusing.

      We have clarified this point in lines 243-245.

      (5) Related to Figure 2a. "This structure reveals the two distinct interfaces formed between ISG65 and C3b (Figure 2a)." It would be clearer to label where interface 1 and interface 2 are in Figure 2a.

      We have now labelled interfaces 1 and 2 above the insets in Figure 2a.

      (6) Related to Figure 2C. I suggest mutagenesis to validate ISG65 L2/L3 - C3b CUB domain interaction, i.e. mutate ISG65 (N188, R187, Y190) and perform SPR with C3b.

      We agree with the reviewer that this experiment was a valuable validation of our structural data. To achieve this aim, we changed our SPR assay, coupling C3 variants to the chip surface in an orientation which would match their conjugation to a pathogen and allowing us to reliably compare the affinities of ISG65 variants. We then assessed the binding of ISG65, ISG65∆L2, and the ISG65L2N188A,H189A,Y190A proposed by the reviewer. As predicted from the structure, both loop 2 deletion and mutation reduced the affinity for C3b but did not affect the affinity for C3d, suggesting that the difference in affinity of ISG65 for C3b and C3d is due to the observed interface 2. This new data is described in lines 150-168 and is presented in Figure 2c.

      (7) Related to Figure 3a. Is the C3b only structure in the presence of ISG65 the real C3b only? Discussion can be added.

      Our cryoEM analysis of the ISG65-C3b mixture yielded three-dimensional classes which contained clear density for ISG65 and those in which there was no density for ISG65. While the reviewer is technically correct, and we cannot be 100% sure that there is not an entirely disordered ISG65 attached to these ‘unbound’ C3b, we think that this is extremely unlikely. In either case, these ‘unbound’ C3b are indistinguishable from other structures of C3b and the argument in the paper stands. We have added a clause in lines 178-179 to make this point.

      (8) Related to Figure 3e. There is no label for WT and deletion mutants. Also, L1 and L3 deletion does not seem to show on the gel.

      We have added these labels.

    2. eLife assessment

      This fundamental study significantly advances our understanding of how parasites evade the host complement immune system. The new cryo-EM structure of the trypanosome receptor ISG65 bound to complement component C3b is highly compelling and well-supported by biochemical experiments. This work will be of broad interest to parasitologists, immunologists, and structural biologists.

    3. Reviewer #1 (Public Review):

      The authors set out to use structural biology (cryo-EM), SPR and complement convertase assays to understand the mechanism(s) by which ISG65 dampens the cytotoxicity/cellular clearance to/of trypanosomes opsonised with C3b by the innate immune system.

      The cryo-EM structure adds significantly to the authors' previous crystallographic data because the latter was limited to the C3d sub-domain of C3b. Further, the in vitro convertase assay adds an additional functional dimension to this study.

      The authors have achieved their aims and the results support their conclusions.

      The role of complement in immunity to T. brucei (or lack thereof) has been a significant question in molecular parasitology for over 30 years. The identification of ISG65 as the C3 receptor and now this study providing mechanistic insights represents a major advance in the field.

      The authors have appropriately put their results into perspective with other recent reports on the role of ISG65.

    4. Reviewer #3 (Public Review):

      The authors investigate the mechanisms by which ISG65 and C3 recognize and interact with each other. The major strength is the identification of exo-site by determining the cryoEM structure of the complex, which suggests new intervention strategies. This is a solid body of work that has an important impact in parasitology, immunology, and structural biology.

      Comments on revised version:

      The authors have addressed all the previous concerns.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the Reviewing Editor and two additional reviewers for the insightful input they gave us on the first version of our manuscript on allosteric activity regulation of the anaerobic ribonucleotide reductase from Prevotella copri. We have revised the manuscript in the light of the reviewers' comments. In particular, we have added additional experiments using hydrogen-deuterium exchange mass spectrometry (HDX-MS) to probe the accessibility and mobility of different parts of the protein structure in the apo-state and in the presence of dATP/CTP and ATP/CTP. The results strongly confirm the binding of nucleotides to the activity and specificity sites, as seen biochemically and structurally. On the question of mobility of the glycyl radical domain, the HDX-MS experiments suggest an increased mobility in the presence of dATP, though the results are not as clear-cut as for the nucleotide binding. The HDX-MS analyses are complicated by the fact that they reflect all species in solution, which are evidently multiple for all states of PcNrdD. Finally, we have rephrased key parts of the results and discussion, and modified the title, to avoid any implication that we believe the glycyl radical domain becomes extensively disordered, but rather that it becomes more mobile to the extent that it cannot be seen in the cryo-EM structures.

      eLife assessment

      This study advances our understanding of the allosteric regulation of anaerobic ribonucleotide reductases (RNRs) by nucleotides, providing valuable new structural insight into class III RNRs containing ATP cones. The cryo-EM structural characterization of the system is solid, but other aspects of the manuscript, which are incomplete, could be improved by including additional functional characterization and more evidence for the proposed mechanism of inhibition by dATP. The work will be of interest to biochemists and structural biologists working on ribonucleotide reductases and other allosterically regulated enzymes.

      Public Reviews:

      Reviewer #1 (Public Review):

      The goal of this study is to understand the allosteric mechanism of overall activity regulation in an anaerobic ribonucleotide reductase (RNR) that contains an ATP-cone domain. Through cryo-EM structural analysis of various nucleotide-bound states of the RNR, the mechanism of dATP inhibition is found to involve order-disorder transitions in the active site. These effects appear to prevent substrate binding and a radical transfer needed to initiate the reaction.

      Strengths of the manuscript include the comprehensive nature of the work - including numerous structures of different forms of the RNR and detailed characterization of enzyme activity to establish the parameters of dATP inhibition. The manuscript could be improved, however, by performing additional experiments to establish that the mechanism of inhibition can be observed in other contexts and it is not an artifact of the structural approach. Additionally, some of the presentations of biochemical data could be improved to comply with standard best practices.

      The work is impactful because it reports initial observations about a potentially new mode of allosteric inhibition in this enzyme class. It also sets the stage for future work to understand the molecular basis for this phenomenon in more detail.

      We thank the editor and reviewers for their positive evaluation of the potential impact of our work. We completely agree that hypotheses based on structural data require orthogonal experimental verification. However, the number and consistency of the cryo-EM structures speak in favour of the data being representative of conditions in solution. We feel that in particular cryo-EM data should be relatively free of artefacts, e.g. biased or incorrect relative domain orientations, compared to crystallography, where crystal packing effects can affect these parameters. As we write in response to Reviewer #2, it has been difficult to propose a direct structural mechanism for transmission of the allosteric signal from the a-site in the ATP-cone to the active site and GRD given that the ATP-cones and linker are disordered in the dATP-bound dimers and only partly ordered in the dATP-bound tetramers. Further verification experiments will be performed in future but are outside the scope of the present article.

      We will improve the presentation of the biochemical data in a revised version.

      General comments:

      (1) It would be ideal to perform an additional experiment of some type to confirm the order-disorder phenomena observed in the cryo-EM structures to rule out the possibility that it is an artifact of the structure determination approach. Circular dichroism might be a possibility?

      Circular dichroism reports only on the approximate relative proportions of helix, sheet and loop structure in a protein, thus we believe that it would not be a sensitive enough tool to distinguish between ordered and disordered states. We are considering what alternative methods might be appropriate.

      (2) Does the disordering phenomenon of one subunit in the ATP-bound structures have any significance - could it be related to half-of-sites activity? Does this RNR exhibit half-of-sites activity?

      Half-of-sites activity has not been biochemically proven in any ribonucleotide reductase in spite of the fact that it was first suggested in 1987 (PMID: 3298261). However, strong structural indication was recently published in the form of the holo-complex of the class Ia ribonucleotide reductase from Escherichia coli, which is highly asymmetrical and in which productive contacts forming an intact proton-coupled electron transfer pathway are only formed between one of two pairs of monomers (PMID: 32217749). We have not been able to prove half-of-sites activity for PcNrdD due to low overall radical content, but the structural results are indeed consistent with such an activity.

      (3) Does the disordering of the GRD with dATP bound have any long-term impact on the stability of the Gly radical? I realize that the authors tested the ability to form the Gly radical in the presence of dATP in Fig. 4 of the manuscript. But it looks like they only analyzed the samples after 20 min of incubation. Were longer time points analyzed?

      Radical content was measured after 5 min and 20 min incubation; 5 min incubations (not included in the manuscript) consistently gave higher radical content compared to 20 min incubation. Longer time points were not analysed, as we assumed that the radical content would be even lower after 20 min.

      (4) Did the authors establish whether the effect of dATP inhibition on substrate binding is reversible? If dATP is removed, can substrates rebind?

      This is an interesting question. We measured KDs for dATP in the micromolar range and are hence confident that dATP binding is reversible. Our measurements do not, however, directly prove that inhibition of the enzyme is reversible. Nevertheless, it is worth noting that the protein as purified was precipitated and analysed by UV-visible spectroscopy. The as-purified PcNrdD contained 30% nucleotide contamination. The as-purified sample was then analysed by HPLC and we identified a major peak, corresponding to dATP/dADP. Therefore, purification conditions had to be optimised to remove the nucleotides. This is evidence that PcNrdD that has “seen” dATP can subsequently bind substrates in the presence of ATP. We will describe the purification more clearly in a revision.

      (5) In some figures (Fig. 6e, for example), the cryo-EM density map for the nucleotide component of the model is not continuous over the entire molecule. Can the authors comment on the significance of this phenomenon? Were the ligands validated in any way to ensure that the assignments were made correctly?

      Indeed we sometimes saw discontinuous density for the nucleotides, both in the active site and in the specificity site. However, the break was almost always near the C5’ carbon atom, which is common to all nucleotides. While we cannot readily explain this phenomenon, the nucleotides refined well with full occupancy, giving B-factors similar to those of the surrounding protein atoms. The identity of the nucleotide could always be inferred from a) the size of the base (purine or pyrimidine); b) the known nucleotide combinations added to the protein before grid preparation; c) prior knowledge on the combinations of effector and substrate that have been found valid for all RNRs since the first studies of allosteric specificity regulation.

      Reviewer #2 (Public Review):

      This manuscript describes the functional and structural characterization of an anaerobic (Class III) ribonucleotide reductase (RNR) with an ATP cone domain from Prevotella copri (PcNrdD). Most significantly, the cryo-EM structural characterization revealed the presence of a flap domain that connects the ATP cone domain and the active site and provides structural insights about how nucleotides and deoxynucleotides bind to this enzyme. The authors also demonstrated the catalytic functions and the oligomeric states. However, many of the biochemical characterizations are incomplete, and it is difficult to make mechanistic conclusions from the reported structures. The reported nucleotide-binding constants may not be accurate because of the design of the assays, which complicates the interpretation of the effects of ATP and dATP on PcNrdD oligomeric states. Importantly, statistical information was missing in most of the biochemical data. Also, while the authors concluded that the dATP binding makes the GRD flexible based on the absence of cryo-EM density for GRD in the dATP-bound PcNrdD, no other support was provided. There was also a concern about the relevance of the proposed GRD flexibility and the stability of the Gly radical. Overall, the manuscript provides structural insights about Class III RNR with ATP cone domain and how it binds ATP and dATP allosteric effectors. However, ambiguity remains about the molecular mechanism by which the dATP binding to the ATP cone domain inhibits the Class III RNR activity.


      (1) The manuscript reports the first near-atomic resolution structures of Class III RNR with ATP cone domain in complex with ATP and dATP. These structures revealed the NxN flap domain proposed to form an interaction network between the substrate, the linker to the ATP cone domain, the GRD, and loop 2 important for substrate specificity. The structures also provided insights into how ATP and dATP bind to the ATP cone domain of Class III RNR. Also, the structures suggested that the ATP cone domain is directly involved in the tetramer formation by forming an interaction with the core domain in the presence of dATP. These observations serve as an important basis for future study on the mechanism of allosteric regulation of Class III RNR.

      (2) The authors used a wide range of methodologies including activity assays, nucleotide binding assays, oligomeric state determination, and cryo-EM structural characterization, which were impressive and necessary to understand the complex allosteric regulation of RNR.

      (3) The activity assays demonstrated the catalytic function of PcNrdD and its ability to be activated by ATP and low-concentration dATP and inhibited by high-concentration dATP.

      (4) ITC and MST were used to show the ability of PcNrdD to bind NTP and dATP.

      (5) GEMMA was used successfully to determine the oligomeric state of PcNrdD, which suggested that PcNrdD exists in dimeric and tetrameric forms, whose ratio is affected by ATP and/or dATP.


      Weaknesses:

      (1) Activity assays.

      The activity assays were performed under conditions that may not represent the nucleotide reduction activity. The authors initiated the Gly radical formation and nucleotide reduction simultaneously. The authors also showed that the amount of Gly radical formation was different in the presence of ATP vs dATP. Therefore, it is possible that the observed Vmax is affected by the amount of Gly radical. In fact, some of the data fit poorly into the kinetic model. Also, the number of biological and technical replicates was not described, and no statistical information was provided for the curve fitting.

      The highest turnover activity of PcNrdD measured in the presence of ATP was 1.3 s-1 (470 nmol/min/mg), a kcat comparable to recently reported values for anaerobic and aerobic RNRs from Neisseria bacilliformis, Leeuwenhoekiella blandensis, Facklamia ignava, Thermus virus P74-23, and Aquifex aeolicus (PMID: 25157154, PMID: 29388911, PMID: 30166338, PMID: 34314684, PMID: 34941255). The general trend illustrated in Figure 1 is that ATP has an activating effect on enzyme activity, whereas high concentrations of dATP have an inactivating effect. This cannot be explained by suboptimal assay conditions, since our EPR results consistently show that more radical is formed in incubations with dATP than in incubations with ATP. The curve fitting methods used are listed in Materials and Methods (as specified in the Figure 1 legend), and standard errors for all specified curve fitting results (from triplicate experiments) are shown in Figure 1.

      (2) Binding assays.

      The interpretation of the binding assays is complicated by the fact that dATP binds both the a- and s-sites and ATP binds both the a-site and the active site. dATP may also bind the active site as the product. It is unknown whether ATP binds the s-site in PcNrdD. Despite this complexity, the binding assays were performed under conditions in which all the binding sites were available. Therefore, it is not clear which event these assays are reporting.

      Both ITC and MST experiments involving ATP and dATP binding to the a-site were performed in the presence of at least 1 mM GTP substrate (5 mM in MST) to fill the active site, and 1 mM dTTP effector to fill the s-site (specified in the legend to Figure 2). These conditions enable binding of ATP or dATP only to the a-site in the ATP-cone.

      (3) Oligomeric states.

      Due to the ambiguity in the kinetic parameters and the binding constants determined above, the effects of ATP and dATP on the oligomeric states are difficult to interpret. The concentrations of ATP used in these experiments (50 and 100 µM) were significantly lower than KL determined by the activity assays (780 µM), while it is close to the Kd values determined by ITC or MST (~25 µM). Since it is unclear what binding events ITC and MST are reporting, the data in Figure 3 does not provide support for the claimed effects of ATP binding. For the effects of dATP, the authors did not observe a significant difference in oligomeric states between 50 or 100 µM dATP alone vs 50 µM dATP and 100 µM CTP. The former condition has dATP ~ 2x higher than the Kd and KL (Figure 1b) and therefore could be considered as "inhibited". On the other hand, NrdD should be fully active under the latter condition. Therefore, these observations show no correlation between the oligomeric state and the catalytic activity.

      The results in Figure 3 show that in the presence of 100 µM ATP plus 100 µM CTP the oligomeric equilibrium is 64% dimers plus 36% tetramers, and in the presence of 50-100 µM dATP the oligomeric equilibrium is 32% dimers and 68% tetramers. We agree that there is no clear and strong correlation between oligomeric state and inhibition, and we will try to make this clearer in a revised version. Meanwhile, to add some clarity to our observations, SEC experiments at higher nucleotide concentrations will be done to strengthen them.

      (4) Effects of dATP binding on GRD structure

      One of the key conclusions of this manuscript is that dATP binding induces the dissociation of GRD from the active site. However, the structures did not provide an explanation for how the dATP binding affects the conformation of GRD or whether the dissociation of GRD is a direct consequence of dATP binding or it is due to the absence of nucleotide substrate. Also, Gly radical is unlikely to be stable when it is not protected from the bulk solvent. Therefore, it is unlikely that the GRD dissociates from the active site unless the inhibition by dATP is irreversible. Further evidence is needed to support the proposed mechanism of inhibition by dATP.

      We admit that it has been difficult to propose a direct structural mechanism for transmission of the allosteric signal from the a-site in the ATP-cone to the active site and GRD given that the ATP-cones and linker are disordered in the dATP-bound dimers and that the linker can only be partly modelled in the dATP-bound tetramers. Most likely dATP binding causes a change in the dynamics of the linker region and NxN flap that directly affects substrate binding and simultaneously causes disorder of the GRD, given that all are part of a connected system (described as “nexus” in the manuscript). The structures determined in the presence of dATP and CTP show that CTP cannot bind in the absence of an ordered NxN flap.

      In any case a major conclusion of the work is that dATP does not inhibit the anaerobic RNR by prevention of glycyl radical formation but by prevention of its subsequent transfer. We agree that further evidence is required to support the proposed mechanism, but given the extent of the data already presented in the manuscript, we feel that such studies should be the subject of a future publication.

      (5) Functional support for the observed structures.

      Evidence for connecting structural observations and mechanistic conclusions is largely missing. For example, the authors proposed that the interactions between the ATP cone domain and the core domain are responsible for tetramer formation. However, no biochemical evidence was provided to support this proposal. Similarly, the functional significance of the interaction through the NxN flap domain was not proved by mutagenesis experiments.

      We did actually make mutants to verify the observed interactions, but several of them did not behave well in our hands, e.g. with regard to protein stability. Since we have no evidence that oligomerisation is coupled to inhibition, and since we did not observe any conservation between protein sequences in the interaction area, we chose not to pursue this point further. The main merit of the tetramer structures is that they allowed a high-resolution view of dATP binding to the ATP-cone and a comparison to previously-observed ATP-cones. Nevertheless, mutation experiments, also including the NxN flap, could be the subject of future work.

      Reviewer #3 (Public Review):

      The manuscript by Bimai et al describes a structural and functional characterization of an anaerobic ribonucleotide reductase (RNR) enzyme from the human microbe, P. copri. More specifically, the authors aimed to characterize the mechanism by which (d)ATP modulates nucleotide reduction in this anaerobic RNR, using a combination of enzyme kinetics, binding thermodynamics, and cryo-EM structural determination. One of the principal findings of this paper is the ordering of an NxN 'flap' in the presence of ATP that promotes RNR catalysis and the disordering of both this flap and the glycyl radical domain (GRD) when the inhibitory effector, dATP, binds. The latter is correlated with a loss of substrate binding, which is the likely mechanism for dATP inhibition. It is important to note that the GRD is remote (>30 Å) from the binding site of the dATP molecule, suggesting long-range communication of the structural (dis)ordering. The authors also present evidence for a shift in oligomerization in the presence of dATP. The work does provide evidence for new insights/views into the subtle differences of nucleotide modulation (allostery) of RNR through long-range interactions.

      The strengths of the work are the impressive, in-depth structural analysis of the various regulated forms of PcRNR by (d)ATP using cryo-EM. The authors present seven different models in total, with striking differences in oligomerization and (dis)ordering of select structural features, including the GRD that is integral to catalysis. The authors present several, complementary biochemical experiments (ITC, MST, EPR, kinetics) aimed at resolving the binding and regulatory mechanism of the enzyme by various nucleotides. The authors present a good breadth of the literature in which the focus of allosteric regulation of RNRs has been on the aerobic orthologues.

      Given the resolution of some of the structures in the remote regions that appear to be of importance, the rigor of the work could have been improved by complementing these experimental studies with molecular dynamics (MD) simulations to reveal the dynamics of the GRD and loops/flaps at the active site.

      We have discussed with expert colleagues the possibility of carrying out MD simulations on the different states in order to study the differential effects of ATP and dATP binding on the dynamics of the GRD. However, they felt that the chance of obtaining meaningful results was low, particularly since some structural elements are missing from the models for both forms, in particular the linker between the ATP-cone and the core.

      The biochemical data supporting the loss of substrate binding with dATP association is compelling, but the binding studies of the (d)ATP regulatory molecules are not; the authors noted less-than-unity binding stoichiometries for the effectors.

      Most of the methods used measure only binding strength, not the number of binding sites (N), whereas ITC also measures number of sites. N is dependent on the integrity of the protein, i.e. the number of protein molecules in a preparation that are involved in binding, and quite often gives lower values than the theoretical number of binding sites.

      Also, the work would benefit from additional support for oligomerization changes using an additional biochemical/biophysical approach.

      SEC (chromatography), GEMMA (mass spectrometry) and cryo-EM were used to study oligomerization. Since each method has restrictions on nucleotide concentrations as well as protein concentrations that can be used, the results are not directly comparable, but all three methods indicate nucleotide dependent oligomerization changes. The SEC results will be included in a revised version.

      Overall, the authors have mostly achieved the aims of the manuscript. With focused modifications, including additional control experiments, the manuscript should be a welcome addition to the RNR field.

      Recommendations for the authors: Reviewer #1 (Recommendations For The Authors):

      (1) The last sentence of the abstract is not complete. The structures implicate a complex network of interactions in ... ? What do they implicate?

      A couple of words seem to have been missed from the abstract. We have rewritten the end of the abstract to emphasise better that the dynamical transitions involve a linked network of interactions and not just the GRD.

      (2) A reference is needed in the second sentence of the introduction.

      We have added a reference as requested.

      (3) Page 2, paragraph 2. The authors state "two beta subunits (NrdB) harboring a stable radical." This is not accurate. First of all, each beta subunit harbors its own cysteine oxidant. And in several subclasses, that oxidant is not a stable radical but an oxidized metal cluster. Please revise to improve accuracy and also provide appropriate references.

      We have revised the description and added a recent reference.

      (4) Page 4, Fig. 1, panels C and D. The fit of the curve to the data is pretty poor. Is there an explanation? Could the data be improved in some way? In general, it is also best practice nowadays to show the individual data points in addition to the error bars in plots like the ones shown in Figure 1. Please modify the plots to include the individual data points in this figure - and probably also the subsequent figures showing binding data.

      We have modified relevant panels in Figures 1, 2 and 5 as requested.

      (5) Page 12, first paragraph. The authors state that one of the monomers in the ATP-CTP structure is well ordered and the other is less ordered. It would be ideal to show in a figure the basis for this conclusion using the cryo-EM maps. The "less ordered" monomer appears to be fully modeled.

      Since the 2-fold axis of the dimer is vertical, the GRD of the left-hand monomer is hidden from view at the back of the molecule in Figure 6. For this monomer there was a small amount of density that allowed modelling of part of the glycyl radical loop (though not the tip containing the radical Gly itself) and the NxN flap, albeit with significantly higher mobility. We have illustrated this through an additional supplement for Figure 6 (figure supplement 2) in which the B-factors of the residues are shown both as a ribbon with radius proportional to the B-factor and through colouring. We hope that the four views in Figure 6 (figure supplement 2) together illustrate the relative mobility of different parts of the dimer.

      It would also be ideal to show the basis for the conclusion that the entire GRD is disordered in the dATP-bound dimer structure.

      Thank you for this suggestion. We have added a fifth supplement to Figure 8 in which we show the cryo-EM reconstruction for the dATP-bound dimer in two orientations, with the ATP-CTP-bound structure superimposed, which clearly shows that the entire GRD, the ATP-cones, linker and NxN flap are all disordered in both monomers.

      Reviewer #2 (Recommendations For The Authors):

      (1) Units to describe enzyme activity.

      • The unit for the specific activity in the main text (nmol/min•mg) is unusual. It is most likely a typo of nmol/min/mg or nmol/(min•mg).

      We have changed this to nmol/min/mg in the text.

      • The unit for the Vmax is unusual and should not be confused with the specific activity. By definition, Vmax is the velocity of a reaction at a defined enzyme concentration/amount. For example, if an assay of 10 mg enzyme yielded 470 nmol of product in 1 min, Vmax is 470 nmol/min, whereas the specific activity is 47 nmol/min/mg.

      The velocity as calculated above is ca 1.3 s-1. We have added kcat values to accompany the specific activities given.
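      The distinction drawn in this point between velocity, specific activity and kcat can be made concrete with a short unit conversion. The sketch below is illustrative only: the function names are hypothetical, the 10 mg example is the reviewer's, and the molar mass of the catalytic unit is not stated in this exchange, so the code only back-calculates the value implied by the two quoted figures (470 nmol/min/mg and 1.3 s-1) rather than assuming one.

```python
# Illustrative unit conversions; all function names are hypothetical.
NMOL_TO_MOL = 1e-9   # mol per nmol
MIN_TO_S = 60.0      # seconds per minute
MG_TO_G = 1e-3       # g per mg


def reaction_velocity(product_nmol, minutes):
    """Velocity in nmol/min for a given amount of enzyme in the assay."""
    return product_nmol / minutes


def specific_activity(product_nmol, minutes, enzyme_mg):
    """Specific activity in nmol/min/mg (velocity normalised to enzyme mass)."""
    return reaction_velocity(product_nmol, minutes) / enzyme_mg


def kcat_from_specific_activity(spec_act_nmol_min_mg, molar_mass_g_per_mol):
    """Turnover number in s^-1 for a catalytic unit of the given molar mass."""
    mol_per_s_per_g = spec_act_nmol_min_mg * NMOL_TO_MOL / MIN_TO_S / MG_TO_G
    return mol_per_s_per_g * molar_mass_g_per_mol


def implied_molar_mass(spec_act_nmol_min_mg, kcat_per_s):
    """Molar mass (g/mol) implied by a specific activity and a kcat."""
    mol_per_s_per_g = spec_act_nmol_min_mg * NMOL_TO_MOL / MIN_TO_S / MG_TO_G
    return kcat_per_s / mol_per_s_per_g


# Reviewer's example: 10 mg enzyme gives 470 nmol product in 1 min.
velocity = reaction_velocity(470.0, 1.0)          # nmol/min
spec_act = specific_activity(470.0, 1.0, 10.0)    # nmol/min/mg

# Values quoted in the response: 470 nmol/min/mg and 1.3 s^-1.
mass = implied_molar_mass(470.0, 1.3)             # g/mol, back-calculated
print(f"velocity = {velocity:.0f} nmol/min, specific activity = {spec_act:.0f} nmol/min/mg")
print(f"implied molar mass of the catalytic unit = {mass / 1000:.0f} kDa")
```

      The back-calculated mass is only a consistency check on the quoted numbers; which oligomeric unit it corresponds to is not addressed here.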

      (2) Steady-state kinetic analysis.

      • The steady-state kinetic analysis in Figure 1 needs to be repeated. While the nonlinear curve fitting for Figure 1a is reasonable, the fitted curves in Figures 1b, 1c, and 1d fall outside the error range. Consequently, the reported kinetic parameters are unlikely to be accurate. The authors should repeat the assays with different enzyme preparations to account for all the errors. If the fitted curve is still outside the error range, the kinetic model is likely incorrect, and the authors need to investigate different kinetic models.

      The replotted Figure 1 now includes two different experiments for 1b (four replicates in total).

      • The authors should report the number of replicates and the statistical data for the curve fitting.

      The figure legend has been updated with statistical data for all curve fits, and the number of replicates has been added.

      • The authors should report Vmax, Ki, and KL for Figure 1d.

      Results in Figures 1c and 1d are less straightforward than those in Figures 1a and 1b, where the s-site is filled with dTTP, favouring binding of GTP to the active site. The curve fit in Figure 1c is disturbed at high concentrations of ATP, which plausibly competes with the CTP substrate and results in inhibition by formed dATP. The curve fit in Figure 1d is less certain since reduction of substrate is low due to intrinsic CTP reduction in the absence of effector and partially overlapping activation and inhibition effects of dATP.

      • The authors should consider presenting the data in a log scale because of the complex nature of the activation/inhibition at the lower concentrations of dATP.

      Log scale plots are included as insets in Figures 1b and 1d.

      • The basal level of CTP reduction in the absence of an effector nucleotide should be reported with an error.

      The error value has been added in the figure legend for the basal level of CTP reduction in the absence of effector.

      (3) Equations for the kinetic analysis.

      • The equations should be numbered and referred to in the Figure 1 legend.

      All equations are specified and numbered in Materials and Methods. The equation used for each curve fit in the panels in Figure 1 is specified in the figure legend.

      • KL must be defined in the main text. I suppose this is Kd for ATP or dATP. The equation for KL determination is missing brackets for dNTP.

      KL (the concentration of an allosteric effector that gives half maximal enzyme activity) is defined in Materials and Methods where the equation is described. KL is not the same as KD (the dissociation constant for a ligand and its receptor). Brackets have been added to equation 1.

      • I believe dNTP in the first equation is incorrect because ATP was the ligand for Figures 1A and 1C.

      [dNTP] in the first equation has been changed to [NTP/dNTP] to indicate that both ribonucleotides and deoxyribonucleotides can bind.

      • The second equation can be expressed as dATP as I believe this is the only ligand that inhibits the enzyme.

      We prefer to keep the more general [dNTP] in the equation.

      • The equation used for the fitting in Figure 1d must be defined more clearly than "a combination of the two equations".

      The equation used for the curve fit in Figure 1d has been specified as equation 3 in Materials and Methods.

      (4) Design of the activity assays

      It is not clear if the activity assays report the rate of glycyl radical formation or nucleotide reduction. The authors mixed NrdD and NrdG and initiated the reaction by adding formate (essential for nucleotide reduction) and dithionite (Gly radical formation). The Gly radical formation is slow (in min time scale). The authors reported that ATP/dATP affected the rate of Gly radical formation and in the presence of ATP, Gly radical formation was incomplete even after 20 min. Therefore, it is possible that within the timescale of the activity assays (5 min), the reactions could be partially limited by the Gly radical formation, which may be the reason for the poor curve fitting.

      Activity assays were performed with 5 min pre-incubation without dithionite and formate (no glycyl radical formation) and 10 min incubation after addition of dithionite and formate (glycyl radical formation plus substrate reduction). During earlier tests, NrdD and NrdG were first preincubated in the presence of dithionite (glycyl radical formation) and after addition of formate the substrate reduction was monitored during 20 min. These experiments resulted in lower enzyme activity, whereas higher activity was achieved only upon formate addition to the preincubation reaction. We suppose that the presence of dithionite, which is a strong reducing agent, affected NrdD stability and the reaction was stabilised by the presence of formate at an earlier stage of the reaction. For the EPR conditions used in the paper, 5 min incubation gave higher radical content compared to 20 min, and the reported activity assay gave highest activity after 10 min incubation; kcat of 1.3 s-1.

      (5) Methods section for the activity assays.

      • The concentration of dTTP, ATP, and dATP used in the assays must be described.

      We thank the reviewer for pointing out this omission and we have now specified the concentrations used.

      • Although the authors mentioned that they changed the concentration of dTTP, such data were not presented. Is this correct? Did the authors fix the dTTP concentration for the GTP reduction?

      We apologise for the ambiguity and have specified that the dTTP concentration was fixed at 1 mM in the GTP experiments and that only the ATP or dATP concentrations were varied.

      (6) Discrepancy between Ki/KL and Kd.

      • There is a significant ambiguity remaining about the binding event that the ITC and MST results are reporting. Although dATP binds to both a- and s-sites and ATP binds to both active site and a-site, only a single binding event was observed in both cases. To distinguish the dATP binding to a- and s-sites and the active site, the authors should perform binding assays using mutant enzymes with only one of the binding sites available for dATP/ATP binding.

      MST and ITC were performed in presence of substrate (1 mM GTP) and s-site effector (1 mM dTTP in ITC experiments, and 5 mM dTTP in MST experiments), thus dATP is blocked from binding to the s-site and ATP from binding to the active site.

      • There are significant differences between Kd determined by MST or ITC and Ki/KL determined by the activity assays. Kd measurements were performed in the absence of the substrate nucleotides, while the assays required substrates. There may be complications from the presence of NrdG and the Gly radical formation. The authors must clearly describe all these complications and the discrepancy between Kd and Ki/KL.

      MST, ITC and enzyme assays were all performed in the presence of substrate, and enzyme assays also contained NrdG, which was not present in the MST and ITC analyses. While KD is a thermodynamic constant representing the affinity of ligand to its binding site - in our case an effector nucleotide to the ATP-cone, KL is a kinetic constant (the allosteric effector concentration that gives half maximal activity) representing the relationship between the effector concentration and the reaction speed and is affected by the enzyme turnover number (kcat). The relationship between KD, KL and Ki is further complicated by conformational and possibly oligomeric state changes of NrdD upon binding of allosteric effectors, which occurs on a slower time scale than the rapid exchange of nucleotides in allosteric sites.

      • The results of ATP/dATP copurification experiments shown in Figure 2 - figure supplement 1 show the preference of dATP binding over ATP. However, the results do not necessarily support the competition between ATP and dATP for binding to the ATP cone domain. It is still possible that dATP binding to the s-site diminishes the binding of ATP to the a-site.

      Our aim was to exclude the possibility that ATP and dATP can bind to the ATP-cone at the same time and not to study competition between the two. Nevertheless, to eliminate the possibility that dATP binding to the s-site could affect nucleotide binding to the a-site, in two out of three conditions described in the supplementary figure, the experiments were performed in the presence of dTTP to prevent binding of dATP to the s-site.

      (7) Oligomeric states.

      • The authors must present the GEMMA results without ATP or dATP. Otherwise, the effects of ATP and dATP on the oligomeric state are not clear.

      We cannot report GEMMA results without ATP or dATP because apo-PcNrdD was unstable in the GEMMA buffer and clogged the capillaries. Instead, SEC analysis was performed on apo-PcNrdD in a more suitable buffer and showed a homogeneous peak corresponding to a dimer (included as Figure 3 - figure supplement 1).

      • Figure 3 does not support the induction of a2 upon ATP binding. The concentrations of ATP used in these experiments (50 and 100 µM) were significantly lower than KL determined by the activity assays (780 µM), while it is close to the Kd values determined by ITC or MST (~25 µM). Since it is unclear what binding events ITC and MST are reporting, the data in Figure 3 does not provide support for the claimed effects of ATP binding.

      MST and ITC were performed in the presence of substrate (1 mM GTP) and s-site effector (1 mM dTTP in ITC experiments, and 5 mM dTTP in MST experiments), and they thus measure binding of ATP or dATP to the ATP cone. SEC analysis with 2 µM apo-PcNrdD and higher nucleotide concentrations (1 mM) was performed, confirming the presence of both dimers and tetramers in solution at different ratios depending on the addition of ATP or dATP. The SEC analysis, included as Figure 3 - figure supplement 1, confirms the existence of an equilibrium in solution.
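      The concentrations at issue in this point can be put side by side with a simple one-site hyperbola. The sketch below is illustrative only and rests on assumptions not established in this exchange: a single non-cooperative site, free ligand approximately equal to total ligand, the ~25 µM Kd taken as reporting a-site binding, and KL treated as the half-point of a hyperbolic activation curve. The function name is hypothetical.

```python
# Illustrative only: one-site, non-cooperative hyperbola with free ligand
# taken as equal to total ligand (an assumption, not a result from the paper).

def hyperbolic_fraction(ligand_uM: float, half_point_uM: float) -> float:
    """Fraction of the maximum reached at a given ligand concentration."""
    if ligand_uM < 0 or half_point_uM <= 0:
        raise ValueError("concentrations must be non-negative, half-point positive")
    return ligand_uM / (half_point_uM + ligand_uM)


KD_UM = 25.0    # approximate Kd quoted for ITC/MST
KL_UM = 780.0   # KL quoted from the activity assays

# For each ATP concentration used in GEMMA: (site occupancy, fraction of max activity)
table = {
    atp_uM: (hyperbolic_fraction(atp_uM, KD_UM), hyperbolic_fraction(atp_uM, KL_UM))
    for atp_uM in (50.0, 100.0)
}

for atp_uM, (occupancy, activation) in table.items():
    print(
        f"{atp_uM:5.0f} uM ATP: occupancy ~{occupancy:.2f}, "
        f"fraction of maximal activation ~{activation:.2f}"
    )
```

      Under these assumptions the same ATP concentrations correspond to a largely occupied site when judged by Kd but to a small fraction of maximal activation when judged by KL, which is the discrepancy the reviewer and authors are discussing.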

      • The effects of dATP must be presented more clearly. The authors did not observe a significant difference in oligomeric states between 50 or 100 µM dATP vs 50 µM dATP and 100 µM CTP. The former condition has dATP ~ 2x higher than the Kd and KL (Figure 1b) and therefore could be considered as "inhibited". On the other hand, NrdD should be fully active under the latter condition. The absence of difference in the oligomeric states between these two different conditions suggested to me that the oligomeric state does not regulate the NrdD activity. The authors seemed to indicate the same conclusion, but did not describe it clearly.

      We agree that the oligomeric state most likely does not regulate the NrdD activity and hope to have explained this better in the revised version.

      • Figure 3 legend mentioned a and b, but the figure was not labeled.

      We have corrected this.

      • The authors should triplicate the analysis and report the errors.

      Five scans were added for each trace to increase the signal-to-noise level (included in figure legend).

      (8) EPR characterization of Gly radical

      • The amount of Gly radical must be quantified by EPR. The authors must report how much NrdD has Gly radical.

      The concentration of NrdD (1 µM) in the activity assays is too low to be quantified by EPR. In the EPR experiment the glycyl radical content is given in the figure legend.

      • The authors claim that the Gly radical environment was similar based on the doublet feature. However, the doublet feature comes from the hyperfine splitting with the α proton, whose orientation relative to the radical p-orbital would not be affected by the conformation or the environment. Thus, this conclusion is incorrect and must be removed.

      We thank the reviewer for the clarifying comment and have removed our suggestion in the text.

      (9) Gly711 should be shown in Fig. 6e to help readers understand the last paragraph on page 12.

      The figure reference has been changed to Fig. 7, where this is shown more clearly. In Fig. 6e, inclusion of Gly711 would obscure other important information.

      (10) GRD structure with dATP

      The disorder of the GRD in the presence of dATP does not agree with the formation of the Gly radical under the same conditions. The Gly radical is unlikely to be stable if it is extensively exposed to solvent. Most likely, the observed cryo-EM structures represent a conformation irrelevant to Gly radical formation.

      We agree that the glycyl radical is unlikely to be stable if exposed to solvent. We believe that the GRD is not completely disordered but most likely made more mobile through rigid body movements of the domain to an extent that makes it invisible in the cryo-EM maps. It is most likely still in the vicinity of the active site, shielding the glycyl radical. Our new HDX-MS results show a small but tangible increase in mobility of the GRD in the presence of dATP compared to ATP. Of course the differences in dynamics remain to be confirmed. It is worth noting that the group of Catherine Drennan at MIT published a conference abstract more than a year ago that suggested a similar pattern of ordered/dynamic GRDs, based on crystal structures, though the details have not yet been published (https://doi.org/10.1096/fasebj.2022.36.S1.R3407).

      We also agree that the cryo-EM structures do not show the GRD conformation relevant to Gly radical formation, as this has been shown spectroscopically for the GRE pyruvate formate lyase to require large conformational changes in the GRD and also the presence of the activase. However, revealing this conformation would be a completely different project. We postulate that inactivation proceeds by prevention of radical transfer to the substrate, not by prevention of its formation.

      We have altered the wording in several places in the revised manuscript, including the title, to avoid using the term “disorder”, as this may imply (partial) unfolding, and we certainly do not wish to imply that.

      (11) The difference between dATP and ATP binding

      From the presented structures, it was not clear how the absence of 2'-OH affects the oligomeric state and the structure of the GRD. The low resolution of the ATP-bound structure precluded the comparison between the ATP and dATP-bound structures.

      We agree that a detailed analysis of the differences between ATP- and dATP-bound structures requires higher resolution structures, particularly of the ATP-bound form. This will be the subject of future studies.

      (12) Conclusion about the disordered GRD.

      • The authors should describe the reason why the dATP binding affected the structure of GRD. The authors did not discuss why dATP binding affected the folding or mobility of GRD. Since this is the key conclusion of this manuscript and the authors are making this conclusion based on the absence of the ordered GRD structure (hence the negative results), the authors should carefully describe why the dATP binding does not allow the binding/folding of GRD in the position observed in the ATP-bound structure.

      As mentioned in our response to point 4 in this reviewer’s Public Review, it is difficult to propose a direct structural mechanism for transmission of the allosteric signal from the a-site in the ATP-cone to the active site and GRD given that the ATP-cones and linker are disordered in the dATP-bound dimers and that the linker cannot be completely modelled even in the dATP-bound tetramers. Our first hypothesis was that the ATP-cone might work by a steric occlusion mechanism, but the reality appears more complex. Most likely dATP binding causes a change in the dynamics of the linker region and NxN flap that directly affects substrate binding and simultaneously causes higher mobility of the GRD, given that all are part of a connected system. The structures determined in the presence of dATP and CTP show that CTP cannot bind in the absence of an ordered NxN flap. We hope that future structural studies of NrdDs from other organisms may shed further light on this mechanism.

      • The authors should test if the dATP inhibition is reversible for PcNrdD. If dATP binding induces dissociation of GRD from the active site and makes GRD flexible, Gly radical would most likely be quenched by formate or other components in the assay solution. If dATP inhibition is reversible, it is hard to believe that Gly radical dissociates completely from the active site.

      As-purified PcNrdD contains dATP and, after removal of bound nucleotides, can bind substrate in the presence of ATP. The as-purified PcNrdD protein contained 30% nucleotide contamination. After precipitation, HPLC analysis identified a major peak corresponding to dATP/dADP. Purification conditions were optimised to remove the nucleotides and we have added this information to the purification description.

      (13) Functional support for the observed structures.

      Similar to X-ray crystallography, cryo-EM is a highly selective method that requires the selection of particles that can be analyzed with sufficient resolution. This means that the analysis could be biased towards the protein conformations stable on the cryo-EM grid. Consequently, testing the structural observations by functional characterization of mutant enzymes is critical. However, the authors did not perform such functional characterizations and made conclusions purely based on the structural observations.

      We acknowledge this limitation. We constructed several mutations located at the tetrameric interface between the ATP-cone and the core protein based on the cryo-EM structure of dATP-loaded NrdD. Unfortunately, these mutant proteins were unstable and underwent cleavage.

      (14) Other minor points:

      • In the introduction, the authors stated "The presence and function of the ATP-cone domain distinguish anaerobic RNRs from the other members of the large glycyl radical enzyme (GRE) family that are otherwise structurally and mechanistically related (Backman et al., 2017)." This statement is misleading because GREs are functionally diverse.

      We have removed the words “and mechanistically” to reduce ambiguity.

      • p. 12, e.g. should be removed.

      We are not sure what is meant here. Does the reviewer mean p. 21 “The interactions are mostly hydrophobic but are reinforced by several H-bonds, e.g. between Gln3D-Gln458A, Ser53D–Gln458A, Arg11D-Asp468A, the main chain amide of Ile12D and Tyr557A.”?

      Reviewer #3 (Recommendations For The Authors):

      Overall, the work presents an impressive and in-depth structural view of the conformational changes stemming from the interactions of (d)ATP allosteric effector molecules that are interrelated to RNR function. The manuscript is written clearly and provides a solid overview of RNR chemistry. The cryo-EM data show striking differences between ATP and dATP bound forms, though in select regions, the resolution is not good enough for strong interpretations of the finer details.

      (1) In cryo-EM structures, dATP appears to shift the oligomerization equilibrium from nearly all dimeric forms (absence of dATP) to a mixture of both dimeric and tetrameric species (presence of dATP). The examination of the oligomeric composition in solution using the GEMMA - a mass spectral technique - showed somewhat similar trends, though given the magnitude of the differences, it was less compelling. Have the authors considered a complementary solution technique, such as analytical SEC or dynamic light scattering that could provide further support for the change in oligomerization as observed in the cryo-EM?

      SEC analysis with 2 µM apoPcNrdD and higher nucleotide concentrations (1 mM) was performed, confirming the presence of both dimer and tetramer in solution at different ratios depending on the addition of ATP or dATP. The SEC analysis, included as Figure 3 - figure supplement 1, confirms the existence of an equilibrium in solution.

      (2) The protein as isolated from the final SEC shows a predominant peak corresponding to aggregate protein. It would be helpful if the authors ran an analytical SEC on the protein sample that is more refined to see how much soluble dimer/tetramer vs. aggregate protein there is. This could impact the kinetic and thermodynamic analysis of effector interactions. Further, the second major peak is labeled as 'monomer'. Is the protein isolated as a monomer and then forms dimer upon effector binding? It is unclear. The authors should consider presenting the SEC standards for the given column and buffer condition so that a reasonable estimate of the oligomerization status of the isolated protein can be assigned.

      Is it possible that the reviewer took Figure 1 - supplementary figure 2a to show PcNrdD rather than PcNrdG? The figure supplement corresponds to the as-isolated SEC analysis of the activase (PcNrdG), which shows two main peaks, corresponding to aggregates and monomer. The monomeric peak was reinjected and showed no further aggregation states. It is currently not known which oligomeric state the activase adopts upon binding to PcNrdD and glycyl radical formation. None of the other SEC figures in the manuscript has a predominant peak corresponding to aggregated protein.

      (3) More details are needed for the ITC section. The ITC methods are not clear. What is the exact composition of the ligand solution being titrated into the protein solution? It is unclear how the less-than-unity binding stoichiometry was determined and what it means. Is the n value for the monomer, dimer, or tetramer forms? It is concerning that n < 1 is observed for dATP binding in the ITC whereas there are 3 dATP bound/subunit in the cryo-EM. For completeness, titration of a buffer into protein solution (no ligand) should be conducted and presented to demonstrate that the heats produced in Figure 2 correspond to the ligand only (and not a buffer mismatch).

      ITC experiments were performed in the presence of 1 mM GTP (c-site) and 1 mM dTTP (s-site). In ITC analyses, the N value is usually the least accurate of the fitted parameters and depends strongly on the concentration of active protein in the sample. The N values described in the current study are in the same range as values reported for ATP-cones in other RNRs and NrdR (Rozman Grinberg et al. 2018a, 2018b, 2022; McKethan and Spiro 2013). The results most likely reflect two high-affinity binding sites for dATP and one high-affinity binding site for ATP. Different nucleotide concentrations were used in the cryo-EM and ITC experiments.

      (4) It is intriguing that the binding of dATP doesn't quell the glycyl radical. In fact, it appears that, as the authors suggest, the amount of glycyl radical might be increased in these samples. However, the cryo-EM data indicates that the GRD is disordered. It is unclear how these would be correlated, as one would not expect a disordered structural element to maintain such a potent oxidant.

      As already written above, we do not wish to imply that the GRD is completely or even highly disordered, just that its dynamics increase in the presence of dATP. Otherwise we completely agree that a very exposed Gly radical is incompatible with its stability. It could be that the amount of disorder is exaggerated somewhat by the vitrification process in cryo-EM. We have tried to reword some of the text to emphasise higher mobility rather than disorder.

      It has been difficult to propose a direct structural mechanism for transmission of the allosteric signal from the a-site in the ATP-cone to the active site and GRD given that the ATP-cones and linker are disordered in the dATP-bound dimers and that the linker cannot be completely modelled even in the dATP-bound tetramers. We initially thought that a steric occlusion mechanism might be at play, but the reality appears more complex. Most likely dATP binding causes a change in the dynamics of the linker region and NxN flap that directly affects substrate binding and simultaneously causes higher mobility of the GRD, given that all are part of a connected system. The structures determined in the presence of dATP and CTP show that CTP cannot bind in the absence of an ordered NxN flap. We hope that future structural studies of NrdDs from other organisms may shed further light on this mechanism.

      (5) It is a bit difficult to keep track of the myriad of structural information and differences amongst the various nucleotide-dependent conditions. It would be useful for the authors to add a summary figure that depicts the various oligomers, orientations, and (dis)ordered structural elements with cartoon representations.

      Thank you for this suggestion. It has been added as Figure 11.

      (6) The mechanism by which (d)ATP binding changes the (dis)ordering of select loops based on the current cryo-EM data is unclear (even the authors agree). The addition of molecular dynamics (MD) simulations on two different structures to reveal the network or structural communication would be a great addition to the work and validate the structural data.

      We have discussed this with a colleague who is an expert in MD. Their advice was that such simulations would be very difficult given that some amino-acids are missing in both of the relevant starting structures (ATP-CTP and dATP-CTP dimer) and could give very variable results. Thus we chose to do complementary experiments with hydrogen-deuterium exchange mass spectrometry (HDX-MS) instead. The results are included in the revised manuscript.

      Minor points

      (1) There are some conflicting reports as to whether P. copri is considered a human 'pathogen'. According to Yeoh, et al Scientific Reports 2022, P. copri is one of the predominant microbes in the human gut and is linked to a positive impact on metabolism. Perhaps the addition of a citation that provides support for it as a pathogen would clarify the statement on p. 3.

      We have added a recent reference (Nii T, Maeda Y, Motooka D, et al. (2023) Genomic repertoires linked with pathogenic potency of arthritogenic Prevotella copri isolated from the gut of patients with rheumatoid arthritis. Ann Rheum Dis 82: 621-629. doi: 10.1136/annrheumdis-2022-222881).

      (2) In Figure 3, the number of dimers/tetramers for dATP (100 uM) does not add up to 100. What is the other 2%?

      Thank you for pointing this out - it has been corrected.

      (3) The data in Figures 5C and D do show slight changes that could be fit and interpreted as a 'weak' interaction. Thus, the statement on p 9 "where dATP-loaded PcNrdD could bind neither GTP nor CTP" should be changed to indicate that the interactions are weak (or that the nucleotides weakly associate).

      The text and the figure have been changed according to the reviewer’s suggestion.

    2. eLife assessment

      This study advances our understanding of the allosteric regulation of anaerobic ribonucleotide reductases (RNRs) by nucleotides, providing valuable new structural insight into class III RNRs containing ATP cones. The cryo-EM structural characterization of the system is solid, but some open questions remain about the interpretation of activity/binding assays and the newly incorporated HDX-MS results. The work will be of interest to biochemists and structural biologists working on ribonucleotide reductases and other allosterically regulated enzymes.

    3. Reviewer #1 (Public Review):

      The goal of this study is to understand the allosteric mechanism of overall activity regulation in an anaerobic ribonucleotide reductase (RNR) that contains an ATP-cone domain. Through cryo-EM structural analysis of various nucleotide-bound states of the RNR, the mechanism of dATP inhibition is found to involve order-disorder transitions in the active site. These effects appear to prevent binding of substrate and a radical transfer needed to initiate the reaction.

      Strengths of the manuscript include the comprehensive nature of the work - including both numerous structures of different forms of the RNR and detailed characterization of enzyme activity to establish the parameters of dATP inhibition. The manuscript has been improved in a revision by performing additional experiments to help corroborate certain aspects of the study. But these new experiments do not address all of the open questions about the structural basis for mechanism. Additionally, some questions about the strength of biochemical data and fit of binding or kinetic curves to data that were raised by other referees still remain. Some experimental observations are not consistent with the proposed model. For example, why does dATP enhance Gly radical formation when the proposed mechanism of dATP inhibition involves disorder in the Gly radical domain?

      The work is impactful because it reports initial observations about a potentially new mode of allosteric inhibition in this enzyme class. It also sets the stage for future work to understand the molecular basis for this phenomenon in more detail.

    4. Reviewer #3 (Public Review):

      The manuscript by Bimai et al describes a structural and functional characterization of an anaerobic ribonucleotide reductase (RNR) enzyme from the human microbe, P. copri. More specifically, the authors aimed to characterize the mechanism by which (d)ATP modulates nucleotide reduction in this anaerobic RNR, using a combination of enzyme kinetics, binding thermodynamics, and cryo-EM structural determination, complemented by hydrogen-deuterium exchange (HDX). One of the principal findings of this paper is the ordering of a NxN 'flap' in the presence of ATP that promotes RNR catalysis and the disordering (or increased protein dynamics) of both this flap and the glycyl radical domain (GRD) when the inhibitory effector, dATP, binds. The latter is correlated with a loss of substrate binding, which is the likely mechanism for dATP inhibition. It is important to note that the GRD is remote (>30 Å) from the binding site of the dATP molecule, suggesting long-range communication of the structural (dis)ordering. The authors also present evidence for a shift in oligomerization in the presence of dATP. The work does provide evidence for new insights/views into the subtle differences of nucleotide modulation (allostery) of RNR, in a class III system, through long-range interactions.

      The strengths of the work are the impressive, in-depth structural analysis of the various regulated forms of PcRNR by (d)ATP using cryo-EM. The authors present seven different models in total, with striking differences in oligomerization and (dis)ordering of select structural features, including the GRD that is integral to catalysis. The authors present several, complementary biochemical experiments (ITC, MST, EPR, kinetics) aimed at resolving the binding and regulatory mechanism of the enzyme by various nucleotides. The authors present a good breadth of the literature in which the focus of allosteric regulation of RNRs has been on the aerobic orthologues.

      The addition of hydrogen-deuterium exchange mass spectrometry (HDX-MS) complements the results originating from cryo-EM data. Most notable is the observation of the enhanced exchange (albeit quite subtle) of the GRD domain in the presence of dATP that matches the loss of structural information in this region in the cryo-EM data. The most pronounced and compelling HDX results are seen in the form of dATP-induced protection of peptides immediately adjacent to the β-hairpin at the s-site, where dATP is expected to bind based on cryo-EM. It is clear that the presence of dATP increases the rigidity of this region.

      Weaknesses: The discussion of the change in peptide mobility in the N-terminal region is complicated by the presence of bimodal mass spectral features, and this may prevent detailed interpretation of the data, especially for the select peptide region that shows opposite trends upon nucleotide association. Further, the HDX data in the NxN flap is unchanged upon nucleotide binding (ATP, dATP, or CTP), despite changes observed in the cryo-EM data.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Firstly, the authors place a great deal of emphasis on the impact of the Hif1-a inhibitor PX-478. The literature surrounding this inhibitor and its mode of action indicates that it is not a direct inhibitor of activity but that its greatest impact is on the production of Hif1-a. The authors do include another inhibitor as a control, Echinomycin, but it does not appear to be as biologically active and the panel of experiments conducted with this is extremely limited. I would be more comfortable with a full Seahorse experimental panel for Echinomycin, similar to SFig 2.G as performed with PX-478.

      We thank the reviewer for their comment highlighting the different mechanisms of action of the HIF-1α inhibitors used in this article. While echinomycin inhibits the binding of HIF-1α to the hypoxia response element (HRE), thereby blocking the DNA-binding capability of HIF-1α, PX-478 inhibits HIF-1α deubiquitination, decreases HIF-1α mRNA expression, and reduces HIF-1α translation. We have included a paragraph explaining this difference in the new version of the manuscript (page 9). In addition, we extended the panel of experiments performed with echinomycin, which confirmed a marked inhibition of the glycolytic pathway when DCs were stimulated with irradiated Mtb in the presence of echinomycin as assessed by SCENITH (new Figure S3H).

      Similarly, it would be of value to have Seahorse profiling that directly excludes FAO from the metabolic profile through the use of Etomoxir as an inhibitor of fatty acid oxidation, which one would assume would have no impact on the metabolic response.

      To estimate the contribution of FAO towards fueling protein synthesis in DCs stimulated with iMtb, the FAO inhibitor etomoxir was incorporated into the SCENITH method as previously described (Adamik et al., 2022). Overall, FAO dependence was found to be less than 10% in DCs, regardless of their activation state. While mitochondrial dependence is reduced after iMtb stimulation, there is no difference in FAO dependence, suggesting that OXPHOS is primarily driven by glucose in iMtb-stimulated cells. This is consistent with the HIF-1α-induced increase in glucose metabolism-related genes. We have adjusted the results section to include this new result (new Figure S1).

      Aside from these minor points, I believe this to be a rigorous study.

      Reviewer #2 (Recommendations For The Authors):

      In Fig. 1 and Fig. 2, the authors conclude that Mtb rewires the metabolism of Mo-DCs and induces both glycolysis and OXPHOS. The data shows that infection with iMtb or Mtb increases glucose uptake and lactate release, suggesting an increase in glycolysis. However, an increase in lactate is not a measure of glycolysis. Lactate is a byproduct of glycolysis; the end product of glycolysis is pyruvate.

      We are grateful for the reviewer's comment, as it gives us the opportunity to explain the conceptual framework on which we based our study. Traditionally, pyruvate has been considered to be the end product of glycolysis when oxygen is present and lactate the end product under hypoxic conditions. However, numerous studies have shown that lactate is produced even under aerobic conditions (Brooks, 2018). Therefore, we frame this work in accordance with the view that glycolysis begins with glucose as its substrate and terminates with the production of lactate as its main end product (Rogatzki, Ferguson, Goodwin, & Gladden, 2015; Schurr, 2023; Schurr & Schurr, 2017).

      Secondly, since the authors have access to the Agilent Extracellular Flux Analyzer, they should have performed detailed ECAR/OCR measurements to conclusively demonstrate that both glycolysis and OXPHOS are increased in Mo. This is especially important for OXPHOS because the only readout shown for OXPHOS is an increase in mitochondrial mass (figure 1 G, H), which is not acceptable. Overall, the data does not indicate that Mtb triggers OXPHOS in the dendritic cells. It only indicates dead iMtb increases the mass of mitochondria in DCs.

      The reviewer’s advice is well appreciated. However, we would like to clarify what may be a misunderstanding; that is, the assays alluded to by the reviewer were not performed on monocytes but on DCs. As advised by the reviewer, we now include the OCR measurements by Seahorse and describe the figures according to their order of appearance in the new version of the manuscript.

      What happens to the mitochondrial mass when infected with live Mtb?

      In response to the reviewer’s question, we determined the mitochondrial mass in DCs infected with live Mtb. In contrast to DCs treated with irradiated Mtb, those infected with live bacteria showed a clear reduction of their mitochondrial mass (modified Figure 1G). This result indicates that, although both Mtb-infected and irradiated Mtb-exposed DCs show a clear increase in their glycolytic activity, divergent responses are observed in terms of mitochondrial mass.

      It will be best if the authors indicate in the figure headings that dead Mtb was used.

      We agree with the reviewer. For figures 1-3, we applied the term “Mtb” in the figure headings since both irradiated and viable bacteria were used for the corresponding experiments. In figures 4-5, the term “iMtb” (alluding to irradiated Mtb) was used in the figure headings as suggested by the reviewer. For the remaining figures, the term “iMtb” was indicated in their legends when dead bacteria were used to stimulate DCs.

      E.g., Figure 1F; what does live Mtb do to GLUT1 levels etc etc?

      In response to the reviewer’s question, we included new data about Glut1 expression in DCs infected with live Mtb in the latest version of the manuscript. In line with the increase in glucose uptake shown in figure 1B, we observed an increase in the percentage of Glut1 positive DCs upon Mtb infection (new Figure 1F, lower panels). The increase in Glut1 expression strengthens the notion that DCs activate their glycolytic activity in response to the infection, as demonstrated by the elevated release of lactate, glucose consumption, HIF-1α expression, LDHA expression (Figure 1) and glycolytic activity (Figure 2, SCENITH results with viable Mtb). Therefore, these data strongly support the induction of glycolysis by Mtb (either viable or irradiated) in DCs.

      Also, we found that they were still able to activate CD4+ T cells from PPD+ donors in response to iMtb. This activation of CD4 T cells with iMtb in the presence of a HIF-1alpha inhibitor is expected, as iMtb is dead and not virulent. What happens when the cells are infected with live virulent Mtb?

      We would like to clarify the main purpose of the DC-T cell co-culture assays in the presence of the HIF-1α inhibitors. To characterize the impact of HIF-1α on DC functionality, we assessed the capacity of DCs to activate autologous CD4+ T cells when stimulated with iMtb in the presence of HIF-1α inhibitors. To this end, we used iMtb merely as a source of antigens to load DCs and evaluate the effect of HIF-1α inhibition on the activation of antigen-specific T cells. The use of viable Mtb may introduce confounding factors, such as pathogen-triggered inhibitory mechanisms (e.g., EsxH secretion by Mtb (Portal-Celhay et al., 2016)), which would prevent us from reaching conclusions about the role of HIF-1α. Thus, we consider that the use of live bacteria for this experiment is out of the scope of this manuscript.

      The authors demonstrated that CD16+ monocytes from TB patients have higher glycolytic capacity than healthy controls Fig 7. The authors should differentiate TB patient monocytes into DCs and measure their bioenergetics to test if infection alters their glycolysis and OXPHOS.

      In agreement with the reviewer, the determination of metabolic pathways in DCs differentiated from monocytes of TB patients is a key aspect of this work. Accordingly, the bioenergetic determinations of DCs generated from monocytes from TB patients versus healthy subjects are now illustrated in Figures 6F (lactate release) and 6G (SCENITH profile).

      In the discussion, the authors state that "pathologically active glycolysis in monocytes from TB patients leads to poor glycolytic induction and migratory capacities of monocyte-derived DCs." However, the data from Fig. 1 and 2 show that treatment with iMtb or Mtb induces glycolysis in MoDCs. How do the authors explain these contrasting results?

      We thank the reviewer for pointing out this issue. Figures 1 and 2 show DCs differentiated from monocytes of healthy donors (HS). In this case, DCs from HS respond to Mtb by inducing a glycolytic and migratory profile. Yet, in the case of monocytes isolated from TB patients, these cells exhibit an early glycolytic profile from the beginning of differentiation, ultimately yielding DCs with low glycolytic capacity and low migratory activity in response to Mtb. We included this explanation in the discussion (page 18) to better clarify this issue.

      Also, the term "pathological" active glycolysis (Introduction and Discussion) is an inappropriate term.

      As requested by the reviewer, we excluded the term “pathological” to describe the phenomenon reported in this study.

      Lastly, it should be shown whether the DCs generated from CD16+ monocyte from TB patients generate tolerogenic and/or aberrant DCs, which have lower glycolytic and migration capacity compared to the CD16- monocyte population. In Figure 7B, the authors should discuss why the CD16+ monocyte population has lower glycolytic capacity compared to CD16- monocytes in healthy donors. Furthermore, in contrast to the TB patients, do DCs generated from CD16+ monocyte in healthy donors have increased glycolytic and migration capacity compared to CD16- monocyte (because these monocytes showed lower glycolytic capacity)? Furthermore, if there is no difference in glycolytic capacity among the three monocyte populations in TB patients, on what basis was it concluded that DCs generated only from the CD16+ monocyte population may be the cause of lower migration capacity? The authors state in Figure 7F that the DMOG pretreatment matches the situation where the Mo-DCs from TB patients showed reduced migration. Did the authors check the Hif-1alpha levels in monocytes obtained from TB patients?

      We appreciate this in-depth analysis by the reviewer because it allows us to clarify some interpretations of the SCENITH results in Figure 7B. It is important to keep in mind that with the SCENITH technique we can only infer the relative contributions of the metabolic pathways, not the absolute magnitudes of those contributions. In this regard, it is key to note that the amount of lactate released during the first hours of the TB monocyte culture is much higher than that released by monocytes from healthy subjects (HS, Figure 7A), even though most monocytes, which are CD14+ CD16-, have comparable glycolytic capacities in HS and TB. Another example of how to interpret SCENITH results can be found in Figure 2, where a lower mitochondrial dependence is observed in iMtb-stimulated DCs (Figure 2A), while the absolute ATP production associated with OXPHOS is in fact higher as measured by Seahorse (Figure 2D). Therefore, the glycolytic capacity is not a direct readout of the magnitude of glycolysis, but of its contribution to total metabolism. The low levels of lactate released from HS monocytes likely reflect their low activation state and low metabolic activity compared to TB monocytes. In this regard, we have previously demonstrated that monocytes from pulmonary TB patients display an activated phenotype (Balboa et al., 2011). The fact that there is no difference between the glycolytic capacities of TB and HS CD16- monocytes indicates that their proportional contributions to protein synthesis are comparable (again, without inferring their absolute values, which may be very different).
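
      The distinction between relative and absolute readouts can be illustrated with a minimal sketch. It assumes the percentage formulas commonly given for the SCENITH method (puromycin signal under control, 2-deoxyglucose, oligomycin, and combined-inhibitor conditions); these formulas are not stated in this response, and the function name and all input values are hypothetical, chosen only for illustration.

```python
def scenith_profile(co, dg, o, dgo):
    """Relative SCENITH parameters from puromycin signal (e.g. MFI).

    Assumed inputs (hypothetical names): co = no inhibitor,
    dg = 2-deoxyglucose, o = oligomycin, dgo = both inhibitors.
    """
    span = co - dgo  # protein synthesis that is sensitive to the inhibitors
    if span <= 0:
        raise ValueError("control signal must exceed the combined-inhibitor signal")
    glucose_dependence = 100.0 * (co - dg) / span
    mitochondrial_dependence = 100.0 * (co - o) / span
    return {
        "glucose_dependence": glucose_dependence,
        "mitochondrial_dependence": mitochondrial_dependence,
        "glycolytic_capacity": 100.0 - mitochondrial_dependence,
        "fao_aao_capacity": 100.0 - glucose_dependence,
    }


# Two hypothetical populations: the second has a five-fold higher absolute
# signal in every condition, standing in for a more metabolically active cell.
low_activity = scenith_profile(co=1000, dg=400, o=700, dgo=200)
high_activity = scenith_profile(co=5000, dg=2000, o=3500, dgo=1000)

# Every value is a percentage of the inhibitor-sensitive signal, so scaling
# all four conditions by the same factor leaves the profile unchanged.
print(low_activity == high_activity)
```

      Under these assumptions the two populations return identical profiles despite a five-fold difference in absolute signal, which is why an absolute measurement such as lactate release or Seahorse is needed alongside the percentages.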

      Beyond the previous clarification, the reviewer's proposal to isolate subsets of monocytes is a very interesting idea. However, the experimental approach is very difficult based on the amount of blood we can obtain from patients. The cohort of patients included in this work comprises very severe patients and we are given up to 15-20 ml of peripheral blood from each. This volume of blood yields up to 10 million PBMC with approximately 1 million monocytes. If we separate the monocyte subsets, the recovered cells per condition will be insufficient to perform the intended assays.

      Nevertheless, we have incorporated new evidence that TB disease is associated with an increased activation and glycolytic profile of circulating CD16+ monocytes.

      i) First, we show that the baseline glycolytic capacity of CD16+ monocytes correlates with time since the onset of TB-related symptoms (new Figure 7C).

      ii) Second, we performed high-throughput Gene Set Enrichment Analysis (GSEA) on transcriptomic data (GEO accession number: GSE185372) of CD14+CD16-, CD14+CD16+ and CD14dimCD16+ monocytes isolated from individuals with active TB, latent TB (IGRA+), as well as from TB negative healthy controls (IGRA-). We found that, unlike oxidative phosphorylation, glycolysis tends to be enriched in active TB in both CD14+CD16+ and CD14dimCD16+ monocytes (new Figure 7D).

      iii) Third, we measured the expression of HIF-1α in monocyte subsets by FACS and found that this transcription factor is expressed at higher levels in CD16+ monocyte subsets from TB patients compared to their counterparts from healthy donors (new Figure 8A). We consider that this result justifies the assays shown in Figure 8B-C, in which we prematurely activated HIF-1α in healthy donor monocytes during early differentiation to DCs and measured its impact on the migration of the generated DCs.

      In the Discussion, the authors mention that circulating monocytes from TB patients differentiate from DCs with low immunogenic potential. However, the authors have not shown any immunological defect in any of their data with monocytes from TB patients. In the proxy model mentioned in Figure 7, they have in fact shown that these preconditioned DCs have higher CD86 expression. Can the authors explain/show data to justify the statement in the first paragraph of the Discussion?

      We agree with the reviewer on this observation. Our findings are limited to the generation of DCs with low migratory potential (low chemotactic activity towards CCL21 of DCs differentiated from TB patient monocytes shown in figure 6H and of DCs generated from pre-conditioned monocytes shown in figure 8C). We have modified that part of the discussion to clarify this point, replacing “immunogenic” with “migratory”.

      The authors should note that oxamate is a competitive inhibitor of the enzyme lactate dehydrogenase and not glycolysis. Also, LDHA catalyzes the conversion from pyruvate to lactate and not the other way around (Results, page 6).

      This comment relates to the first one by the reviewer, in which the dogma of glycolysis was discussed. According to the new conception of glycolysis, it begins with glucose as its substrate and terminates with the production of lactate as its main end product.

      The following statements by the authors on page 6 are incorrect: "Because irradiated and viable Mtb induced comparable activation of glycolysis, we subsequently performed all our assays with irradiated Mtb only in the rest of the study due to biosafety reasons." and: "To our knowledge, this is the first study addressing the metabolic status and migratory activity of Mo-DCs from TB patients."

      We deleted the first sentence and reworded the second sentence as "To our knowledge, this is the first study to address how the metabolic status of monocytes from TB patients influences the migratory activity of further differentiated DCs".

      The Discussion reads as if live Mtb was used in the experiments, which is not the case. This should be corrected.

      We replaced Mtb with iMtb throughout the Discussion wherever irradiated Mtb was used. In some cases, "Mtb stimulation" was used instead of "Mtb infection".

      Minor Comments:

      (1) In Figure 1F legend "Quantification of Glut1+ cells plotted to the right". The underlined part should be "plotted below".

      It was corrected.

      (2) In Figure 1H. Please describe the quantitation method and describe how many cells or the number/size of fields were used to quantitate mitochondria.

      For mitochondrial morphometric analysis, TEM images were quantified with the ImageJ "analyze particles" plugin in thresholded images, with size (μm²) settings from 0.001 to infinity. For quantification, 8–10 cells from random fields (1000x magnification) per condition were analyzed. We included this information in the methods section of the new version of the manuscript.

      (3) Please mention the number of independent experimental repeats for each experimental data set and figure.

      In each figure, the number of independent experiments is indicated by individual dots.

      (4) In Figure 2A legend, "PER; left panel" should be PER; lower panel and "OCR; right panel" should be OCR; upper panel.

      It was corrected.

      References for reviewers

      Adamik, J., Munson, P. V., Hartmann, F. J., Combes, A. J., Pierre, P., Krummel, M. F., … Butterfield, L. H. (2022). Distinct metabolic states guide maturation of inflammatory and tolerogenic dendritic cells. Nature Communications, 13(1), 1–19. https://doi.org/10.1038/s41467-022-32849-1

      Balboa, L., Romero, M. M., Basile, J. I., Sabio y Garcia, C. A., Schierloh, P., Yokobori, N., … Aleman, M. (2011). Paradoxical role of CD16+CCR2+CCR5+ monocytes in tuberculosis: efficient APC in pleural effusion but also mark disease severity in blood. Journal of Leukocyte Biology. https://doi.org/10.1189/jlb.1010577

      Brooks, G. A. (2018). The Science and Translation of Lactate Shuttle Theory. Cell Metabolism. https://doi.org/10.1016/j.cmet.2018.03.008

      Portal-Celhay, C., Tufariello, J. M., Srivastava, S., Zahra, A., Klevorn, T., Grace, P. S., … Philips, J. A. (2016). Mycobacterium tuberculosis EsxH inhibits ESCRT-dependent CD4+ T-cell activation. Nature Microbiology, 2, 16232. https://doi.org/10.1038/NMICROBIOL.2016.232

      Rogatzki, M. J., Ferguson, B. S., Goodwin, M. L., & Gladden, L. B. (2015). Lactate is always the end product of glycolysis. Frontiers in Neuroscience, 9, 22. https://doi.org/10.3389/FNINS.2015.00022

      Schurr, A. (2023). From rags to riches: Lactate ascension as a pivotal metabolite in neuroenergetics. Frontiers in Neuroscience, 17, 1145358. https://doi.org/10.3389/FNINS.2023.1145358

      Schurr, A. (2017). Lactate, Not Pyruvate, Is the End Product of Glucose Metabolism via Glycolysis. Carbohydrate. https://doi.org/10.5772/66699

    2. eLife assessment

      This useful study tests the hypothesis that Mycobacterium tuberculosis infection increases glycolysis in monocytes, which alters their capacity to migrate to lymph nodes as monocyte-derived dendritic cells. The authors conclude that infected monocytes are metabolically pre-conditioned to differentiate, with reduced expression of Hif1a and a glycolytically exhaustive phenotype, resulting in low migratory and immunologic potential. However, the evidence is incomplete as the use of live and dead mycobacteria still limits the ability to draw firm conclusions. The study will be of interest to microbiologists and infectious disease scientists.

    3. Reviewer #3 (Public Review):

      In the revised manuscript by Maio et al, the authors examined the bioenergetic mechanisms involved in the delayed migration of DCs during Mtb infection. The authors performed a series of in vitro infection experiments including bioenergetic experiments using the Agilent Seahorse XF, and glucose uptake and lactate production experiments. Also, data from SCENITH is included in the revised manuscript as well as some clinical data. This is a well-written manuscript and addresses an important question in the TB field. A remaining weakness is the use of dead (irradiated) Mtb in several of the new experiments and claims where iMtb data were used to support live Mtb data. Another notable weakness lies in the authors' insistence on asserting that lactate is the ultimate product of glycolysis, rather than acknowledging a large body of historical data in support of pyruvate's role in the process. This raises a perplexing issue highlighted by the authors: if Mtb indeed upregulates glycolysis, one would expect that inhibiting glycolysis would effectively control TB. However, the reality contradicts this expectation. Lastly, the examination of the bioenergetics of cells isolated from TB patients undergoing drug therapy, rather than studying them at their baseline state, is a weakness.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Thank you for your continued review and for providing insightful suggestions. Below, I share some unpublished new findings related to the MYRF ChIP, comment on the potential interplay between myrf-1 and myrf-2, and describe the modifications we've implemented to address the reviewers' comments.

      (1) MYRF-1 ChIP

      Our collaboration with the modERN (Model Organism Encyclopedia of Regulatory Networks) project has recently yielded MYRF ChIP data. The results demonstrate clear and consistent MYRF binding across samples, notably on the lin-4 promoter. Given the significant detail and extensive description required to adequately present these findings, we have decided it is impractical to include them in the current paper. These results will be more suitably published in a separate ongoing study focused on MYRF's regulatory targets during larval development.

      (2) Inter-regulation between myrf-1 and myrf-2

      We acknowledge the interpretation that myrf-2 may act as a genetic antagonist to myrf-1, as suggested by the delayed arrest in myrf-1; myrf-2 double mutants and a trend towards increased lin-4 expression in myrf-2 mutants. Additionally, our unpublished data suggest an elevated myrf-2 expression peak in myrf-1 null mutants during the L1-L2 transition, indicating a potential mutual repressive interaction between myrf-1 and myrf-2.

      On the other hand, myrf-1 and myrf-2 exhibit functional redundancy in DD synaptic rewiring and lin-4 expression. A gain of function in myrf-2 promotes early DD synaptic rewiring. Furthermore, three independent co-immunoprecipitation analyses targeting myrf-1::gfp, myrf-2::gfp, and pan-1::gfp confirm a tight association between myrf-1 and myrf-2 in vivo. These findings challenge the notion of myrf-2 primarily antagonizing myrf-1, or vice versa.

      We propose a model where myrf-1 and myrf-2 collaborate and are functionally redundant, with compensatory elevated expression when one paralog is absent. For instance, the loss of myrf-1 triggers upregulation of myrf-2, which, though insufficient on its own, accelerates the transcriptional program and exacerbates system deterioration, leading to accelerated death. How exactly this takes place is currently unclear. We note that the MYRF ChIP data show MYRF binding at both the myrf-1 and myrf-2 genes.

      Given the complexity of these interactions, we have chosen not to delve deeply into this discussion in the paper without more direct evidence, which would require detailed analysis.

      (3) Revisions Addressing Reviewer Suggestions

      (a) We have revised our interpretation of the mScarlet signal changes in myrf-1(ybq6) and myrf-2(ybq42) mutants to reflect a more nuanced understanding of their potential genetic relationship, as highlighted in the main text.

      “The mScarlet signals exhibit a marked reduction in the putative null mutant myrf-1(ybq6) (Figure 1D, E). Intriguingly, in the putative null myrf-2(ybq42) mutants, there is a noticeable trend towards increased mScarlet signals, although this increase does not reach statistical significance (Figure 2C, D).”

      (b) In response to feedback on Figure 2 and the characterization of lin-4(umn84) mutants, we've included a new series of images showing lin-4(umn84)/+ and lin-4(umn84) signals through larval stages, presented as Figure 2 Figure Supplement 2. This addition clarifies the functional status of lin-4 nulls in our study.

      “Our observations revealed that mScarlet signals were not detected in early L1 larvae (Figure 2C-F; Figure 2 Figure Supplement 2).”

      (c) To improve the clarity of Fig 6, we've added indicator arrows in the red, green, and merge channels, enhancing the visualization of the signals.

      We appreciate the opportunity to clarify these points and hope that our revisions and additional data address the concerns raised.

    2. Reviewer #1 (Public Review):

      In this work, the authors set out to ask whether the MYRF family of transcription factors, represented by myrf-1 and myrf-2 in C. elegans, have a role in the temporally controlled expression of the miRNA lin-4. The precisely timed onset of lin-4 expression in the late L1 stage is known to be a critical step in the developmental timing ("heterochronic") pathway, allowing worms to move from the L1 to the L2 stage of development. Despite the importance of this step of the pathway, the mechanisms that control the onset of lin-4 expression are not well understood.

      Overall, the paper provides convincing evidence that MYRF factors have a key role in promoting lin-4 expression in young larvae. Using state-of-the-art techniques (knock-in reporters and conditional alleles), the authors show that MYRF factors are essential for lin-4 activation and act cell-autonomously. Results using some unusual gain-of-function alleles are supported by consistent results using other approaches. The authors also provide evidence supporting the idea that MYRF factors activate lin-4 by directly activating its promoter. Because these results are an indirect test of this, further experiments will be necessary to conclusively determine whether lin-4 is indeed a direct target of MYRF factors. myrf-1 and myrf-2 likely function redundantly to activate lin-4; potential complex interactions between these two genes will be an interesting area for future work.

      Overall, the paper's results are convincing. The important findings on miRNA regulation and the control of developmental timing will make this work of interest to a broad range of developmental biologists.

    3. Reviewer #2 (Public Review):


      In this manuscript, the authors examine how temporal expression of the lin-4 microRNA is transcriptionally regulated.

      Comments on revised version:

      In the revised manuscript, the authors have suitably addressed my original concerns.

      Aims achieved: The aims of the work are now achieved.

      Impact: This study shows that a single transcription factor (MYRF-1) is important for the regulation of multiple microRNAs that are expressed early in development to control developmental timing.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We greatly appreciate the reviewers' and editors' comments and suggestions on our manuscript "Transposable elements regulate thymus development and function." We performed additional analyses to validate our results and rephrased some manuscript sections according to the comments. We believe these changes significantly increase the solidity of our conclusions. Our point-by-point answer to the reviewers' and editors' comments is detailed below. New data and analyses are shown in Figure 1d, Figure 2g and h, Figure 5e and f, Figure 1 – figure supplement 1, Figure 2 – figure supplement 2, Figure 3 – figure supplement 1 and 2, Figure 4 – figure supplement 2, Figure 5 – figure supplement 1, as well as the corresponding text sections.

      Reviewer #1:

      (1) The authors sometimes made overstatements largely due to the lack or shortage of experimental evidence.

      For example, in Figure 4, the authors concluded that thymic pDCs produced higher copies of TE-derived RNAs to support the constitutive expression of type-I interferons in thymic pDCs, unlike peripheral pDCs. However, the data show only a correlation between the distinct TE expression pattern in pDCs and the abundance of dsRNAs. The evidence is too weak to support a function for TEs in the production of interferon. Even if pDCs express a distinct type and amount of TE-derived transcripts, it may be a negligible amount compared to the total cellular RNAs. How many TE-derived RNAs potentially form the dsRNAs? Are they over-expressed in pDCs?

      The data interpretation requires more caution to connect the distinct results of transcriptome data to the biological significance.

      We contend that our manuscript combines the attributes of a research article (novel concepts) and a resource article (datasets of TEs implicated in various aspects of thymus function). The critical strength of our work is that it opens entirely novel research perspectives. We are unaware of previous studies on the role of TEs in the human thymus. The drawback is that, as with all novel multi-omic systems biology studies, our work provides a roadmap for a multitude of future mechanistic studies that could not be realized at this stage. Indeed, we performed wet lab experiments to validate some but not all conclusions: i) presentation of TE-derived MAPs by TECs and ii) formation of dsRNAs in thymic pDCs. In response to Reviewer #1, we performed supplementary analyses to increase the robustness of our conclusions. Also, we indicated when conclusions relied strictly on correlative evidence and clarified the hypotheses drawn from our observations.

      Regarding the Reviewer's questions about TE-derived dsRNAs, LINE, LTR, and SINE elements all have the potential to generate dsRNAs, given their highly repetitive nature and bi-directional transcription (1). As ~32% of TE subfamilies are overexpressed in pDCs, we hypothesized that these TE sequences might form dsRNA structures in these cells. To address the Reviewer's concerns regarding the amount of TE-derived RNAs among total cellular RNAs, we also computed the percentage of reads assigned to TEs in the different subsets of thymic APCs (see Reviewer 1 comment #4).

      (2) Lack of generality of specific examples. This manuscript discusses the whole genomic picture of TE expression. In addition, one good way is to focus on the specific example to clearly discuss the biological significance of the acquisition of TEs for the thymic APC functions and the thymic selection.

      In figure 2, the authors focused on ETS-1 and its potential target genes ZNF26 and MTMR3, however, the significance of these genes in NK cell function or development is unclear. The authors should examine and discuss whether the distinct features of TEs can be found among the genomic loci that link to the fundamental function of the thymus, e.g., antigen processing/presentation.

      We thank the Reviewer for this highly relevant comment. We investigated the genomic loci associated with NK cell biology to determine if ETS1 peaks would overlap with TE sequences in protein-coding genes' promoter region. Figure 2h illustrates two examples of ETS1 significant peaks overlapping TE sequences upstream of PRF1 and KLRD1. PRF1 is a protein implicated in NK cell cytotoxicity, whereas KLRD1 (CD94) dimerizes with NKG2 and regulates NK cell activation via interaction with the nonclassical MHC-I molecule HLA-E (2, 3). Thus, we modified the section of the manuscript addressing these results to include these new analyses:

      "Finally, we analyzed publicly available ChIP-seq data of ETS1, an important TF for NK cell development (4), to confirm its ability to bind TE sequences. Indeed, 19% of ETS1 peaks overlap with TE sequences (Figure 2g). Notably, ETS1 peaks overlapped with TE sequences (Figure 2h, in red) in the promoter regions of PRF1 and KLRD1, two genes important for NK cells' effector functions (2, 3)."

      (3) Since the deep analysis of the dataset yielded many intriguing suggestions, why not add a discussion of the biological reasons and significance? For example, in Figure 1, why is TE expression negatively correlated with proliferation? cTEC-TE is mostly postnatal, while mTEC-TE is more embryonic. What does this mean?

      We thank the Reviewer for this comment. To our knowledge, the relationship between cell division and transcriptional activity of TEs has not been extensively studied in the literature. However, a recent study has shown that L1 expression is induced in senescent cells. We therefore added the following sentences to our Discussion:

      "The negative correlation between TE expression and cell cycle scores in the thymus is coherent with recent data showing that transcriptional activity of L1s is increased in senescent cells (5). A potential rationale for this could be to prevent deleterious transposition events during DNA replication and cell division."

      We also added several discussion points regarding the regulation of TEs by KZFPs to answer concerns raised by Reviewer 2 (see Reviewer 2 comment #1).

      (4) To consolidate the experimental evidence about pDCs and TE-derived dsRNAs, one option is to show the amount of TE-derived RNA copies among total RNAs. The immunohistochemistry analysis in figure 4 requires additional data to demonstrate that overlapped staining was not caused by technical biases (e.g. uneven fixation may cause the non-specifically stained regions/cells). To show this, authors should have confirmed not only the positive stainings but also the negative staining (e.g. CD3, etc.). Another possible staining control was showing that non-pDC (CD303- cell fractions in this case) cells were less stained by the ds-RNA probe.

      We thank the Reviewer for this suggestion. We computed the proportion of reads in each cell assigned to two groups of sequences known to generate dsRNAs: TEs and mitochondrial genes (1). These analyses showed that the proportion of reads assigned to TEs is higher in pDCs than other thymic APCs by several orders of magnitude (~20% of all reads). In contrast, reads derived from mitochondrial genes had a lower abundance in pDCs. We included these results in Figure 4 – figure supplement 2 and included the following text in the Results section entitled "TE expression in human pDCs is associated with dsRNA structures":

      "To evaluate if these dsRNAs arise from TE sequences, we analyzed in thymic APC subsets the proportion of the transcriptome assigned to two groups of genomic sequences known as important sources of dsRNAs, TEs and mitochondrial genes (1). Strikingly, whereas the percentage of reads from mitochondrial genes was typically lower in pDCs than in other thymic APCs, the proportion of the transcriptome originating from TEs was higher in pDCs (~22%) by several orders of magnitude (Figure 4 – figure supplement 2)."

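      To make the read-proportion comparison above concrete, the following minimal sketch computes, for each cell, the percentage of reads assigned to TEs and to mitochondrial genes, then averages within each APC subset. It is an illustration only and not the code used in the study: the input schema (`cell_type`, `te`, `mito`, `total`) and the toy counts are hypothetical, and averaging per-cell percentages is an assumption about how subsets were summarized.

```python
from collections import defaultdict
from statistics import mean


def percent_reads_by_group(cells):
    """Mean per-cell percentage of reads from TEs and mitochondrial genes.

    `cells` is an iterable of dicts with hypothetical keys:
    'cell_type', 'te', 'mito', 'total' (raw read counts per cell).
    """
    per_type = defaultdict(lambda: {"TE": [], "mito": []})
    for cell in cells:
        total = cell["total"]
        if total == 0:
            continue  # skip empty cells to avoid division by zero
        per_type[cell["cell_type"]]["TE"].append(100.0 * cell["te"] / total)
        per_type[cell["cell_type"]]["mito"].append(100.0 * cell["mito"] / total)
    return {
        cell_type: {group: mean(values) for group, values in groups.items()}
        for cell_type, groups in per_type.items()
    }


# Toy counts, invented purely for illustration.
toy_cells = [
    {"cell_type": "pDC", "te": 200, "mito": 50, "total": 1000},
    {"cell_type": "pDC", "te": 300, "mito": 30, "total": 1000},
    {"cell_type": "cDC1", "te": 10, "mito": 100, "total": 1000},
]
result = percent_reads_by_group(toy_cells)
```
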
      As a negative control for the immunofluorescence experiments, we used CD123- cells. Indeed, flow cytometry analysis showed that the magnetically enriched CD303+ fraction was around 90% pure, as revealed by double staining with CD123 and CD304 (two additional markers of pDCs): CD123- cells were also CD304-/lo, showing that these cells are non-pDCs. Thus, we decided to compare the dsRNA signal between CD123+ cells (pDCs) and CD123- cells (non-pDCs). The difference between CD123+ and CD123- cells was striking (Figure 4d).

      Author response image 1.

      Reviewer #1 (Recommendations For The Authors):

      It was sometimes difficult for me to recognize the dot plots representing low expression against the white background. e.g., figure 1 supplement 1.

      We thank the Reviewer for their comment, and we modified Figure 1 – figure supplement 1 as well as Figure 3 – figure supplement 2 to improve the contrast between dots and background.

      Reviewer #2:

      Reviewer #2 (Recommendations For The Authors):

      (1) In the abstract, results and discussion, the following conclusions are drawn that are not supported by the data: a) TEs interact with multiple transcription factors in thymic cells, b) TE expression leads to dsRNA formation, activation of RIG-I/MDA5 and secretion of IFN-alpha, c) TEs are regulated by cell proliferation and expression of KZFPs in the thymus. All these statements derive from correlations. Only one TF has ChIP-seq data associated with it, dsRNA formation and/or IFN-alpha secretion could be independent of TE expression, and whilst KZFPs most likely regulate TEs in the thymus, the data do not demonstrate it. The authors also seem to suggest that AIRE, FEZF2 and CHD4 regulate TEs directly, but binding is not shown. The manuscript needs a thorough revision to be absolutely clear about the correlative nature of the described associations.

      We agree with Reviewer #2 that some of the conclusions in our initial manuscript were not fully supported by experimental data. In the revised manuscript, we clearly indicated when conclusions relied strictly on correlative evidence and clarified the hypotheses drawn from our observations. Regarding the regulation of TE expression by AIRE, FEZF2, and CHD4, we reanalyzed publicly available ChIP-seq data of AIRE and FEZF2 in murine mTECs. For AIRE, we confirmed that ~30% of AIRE's statistically significant peaks overlap with TE sequences (see Reviewer 2, comment #6 for more details on read alignment and peak calling), confirming its ability to bind to TE sequences directly. We added these results to the main figures (Figure 5f) and modified the section "AIRE, CHD4, and FEZF2 regulate distinct sets of TE sequences in murine mTECs" as follows:

      "[…]. As a proof of concept, we validated that 31.42% of AIRE peaks overlap with TE sequences by reanalyzing ChIP-seq data, confirming AIRE's potential to bind TE sequences (Figure 5f)."

      A reanalysis of FEZF2's ChIP-seq data yielded no significant peaks while using stringent criteria. For this reason, we decided to exclude these data and only use AIRE as a proof of concept.

      Regarding KZFPs, we agree with Reviewer #2 that their impact on TE expression is probably significantly underestimated in our data. A potential reason for this is that KZFP expression is typically low; thus, transcriptomic signals from KZFPs could have been missed by the low depth of scRNA-seq. We mentioned this point in the Discussion:

      "On the other hand, the contribution of KZFPs to TE regulation in the thymus is likely underestimated due to their typically low expression (6) and scRNA-seq's limit of detection."

      (2) On the technical side, there are many dangers about analyzing RNA-seq data at the subfamily level and without stringent quality control checks. Outputs may be greatly confounded by pervasive transcription (see PMID 31425522), DNA contamination, and overlap of TEs with highly expressed genes. Whether TE transcripts are independent units or part of a gene also has important implications for the conclusions drawn. I would say that for most purposes of this work, an analysis restricted to independent TE transcripts, with appropriate controls for DNA contamination, would provide great reassurances that the results from subfamily-level analyses are sound. Showing examples from the genome browser throughout would also help.

      We agree with the Reviewer that contamination could have interfered with TE quantification. We used FastQ Screen (7) to evaluate the contamination of our human scRNA-seq data. As illustrated in the Figure below, most reads aligned with the human genome, and there were no reads uniquely assigned to another species analyzed, confirming the high purity of our dataset.

      Author response image 2.

      As stated by the Reviewer, pervasive expression is another factor that can lead to overestimation of TE expression. To evaluate if pervasive expression impacted the results of our differential expression analysis of TEs between APC subsets, we visualized read alignment to TE sequences using a genome browser. We selected two samples containing the highest numbers of mTEC(II) and pDCs (T07_TH_EPCAM and FCAImmP7277556, respectively) and used STAR to align reads to the human genome (GRCh38). We then visualized read alignment to randomly selected loci of two subfamilies identified as overexpressed by mTEC(II) or pDCs (HERVE-int and Harlequin-int, respectively). The examples below show that the signal detected is specific to the TE sequences located in introns. Even though this visualization cannot guarantee that pervasive expression did not affect TE quantification in any way, it increases the confidence that the signal detected by our analyses genuinely originates from TE expression.

      Author response image 3.

      Author response image 4.

      Author response image 5.

      Author response image 6.

      Author response image 7.

      (3) Related to the above, it would be useful to describe in the main text (and methods) how multi-mapping reads are being handled. It wasn't clear to me how kallisto handles this, and it has implications for the results. In the analysis suggested above, only uniquely mapped reads would have to be used, despite its limitations.

      We agree with the Reviewer that this information regarding assignment of multimapping reads is important. Kallisto uses an expectation-maximization (EM) algorithm to deal with multimapping reads, a strategy used by several algorithms developed to study TE expression (8). Briefly, the EM algorithm reassigns multimapping reads based on the number of uniquely mapped reads assigned to each sequence. Thus, we added the following details to the methods section:

      "Preprocessing of the scRNA-seq data was performed with the kallisto and bustools workflow (9); kallisto uses an expectation-maximization algorithm to reassign multimapping reads based on the frequency of unique mappers at each sequence."

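      The expectation-maximization idea described above can be sketched on a toy example. This is a deliberately simplified illustration, not kallisto's implementation: it ignores sequence length and equivalence classes, and the function name `em_assign` and the read lists are hypothetical. Each multimapping read is split among its compatible sequences in proportion to their current abundance estimates, which are then re-estimated from the resulting fractional counts.

```python
def em_assign(reads, n_iter=100):
    """Fractionally assign reads to sequences with a basic EM loop.

    `reads` is a list of tuples; each tuple holds the names of the
    sequences a read is compatible with (one name = uniquely mapped).
    Returns the expected read count per sequence.
    """
    names = sorted({name for read in reads for name in read})
    # Start from uniform relative abundances.
    abundance = {name: 1.0 / len(names) for name in names}
    counts = {name: 0.0 for name in names}
    for _ in range(n_iter):
        counts = {name: 0.0 for name in names}
        # E-step: split each read according to current abundances.
        for read in reads:
            denom = sum(abundance[name] for name in read)
            for name in read:
                counts[name] += abundance[name] / denom
        # M-step: update abundances from the fractional counts.
        total = sum(counts.values())
        abundance = {name: counts[name] / total for name in names}
    return counts


# Toy input: 3 reads unique to A, 1 unique to B, 4 compatible with both.
toy_reads = [("A",)] * 3 + [("B",)] * 1 + [("A", "B")] * 4
expected_counts = em_assign(toy_reads)
```

      In this toy case the shared reads end up split 3:1 in favour of A, mirroring the ratio of uniquely mapped reads.
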
      (4) Whilst I liked the basic idea, I am not convinced that correlating TE and TF expression is a good strategy for identifying TE-TF associations at enhancers. Enhancers express very low levels of short transcripts, which I doubt would be detected in low-depth scRNA-seq data. The transcripts the authors are using to make such associations may therefore have nothing to do with the enhancer roles of TEs. I would limit these analyses to cell types for which there is histone modification data and correlate TF expression with that instead.

      We agree with the Reviewer that it would have been interesting to correlate the expression of TFs with signals of histone marks at TE sequences. However, we could not perform this analysis because we did not have matched data of histone marks throughout thymic development. Therefore, we adopted an alternative, well-suited strategy.

      Our strategy to identify TE enhancer candidates is depicted in Figure 2a: i) correlation between the expression of the TF and the TE subfamily, ii) presence of the TF binding motif in the sequence of the TE enhancer candidate, and iii) colocalization of the TE enhancer candidate with significant peaks of H3K27ac and H3K4me3 in the same cell type from the ENCODE Consortium ChIP-seq data. We limited our analyses to the eight cell types present both in our dataset and the ENCODE Consortium: B cells, CD4 Single Positive T cells (CD4 SP), CD8 Single Positive T cells (CD8 SP), dendritic cells (DC), monocytes and macrophages (Mono/Macro), NK cells, Th17, and Treg.
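
      The three criteria above amount to a conjunction of filters, which the following minimal sketch expresses. It is illustrative only: the `TEFeature` record, its field names, and the toy loci are hypothetical, and each criterion is reduced to a precomputed boolean rather than derived from expression or ChIP-seq data.

```python
from dataclasses import dataclass

# The eight cell types shared by the dataset and ENCODE, as listed in the text.
ENCODE_CELL_TYPES = {
    "B cells", "CD4 SP", "CD8 SP", "DC",
    "Mono/Macro", "NK cells", "Th17", "Treg",
}


@dataclass(frozen=True)
class TEFeature:
    te_locus: str                # hypothetical locus identifier
    tf: str                      # transcription factor being tested
    cell_type: str
    tf_te_correlated: bool       # criterion (i)
    has_tf_motif: bool           # criterion (ii)
    overlaps_histone_peak: bool  # criterion (iii)


def is_enhancer_candidate(feature):
    """A locus is kept only if all three criteria hold in an eligible cell type."""
    return (
        feature.cell_type in ENCODE_CELL_TYPES
        and feature.tf_te_correlated
        and feature.has_tf_motif
        and feature.overlaps_histone_peak
    )


# Toy records, invented for illustration.
toy_features = [
    TEFeature("locus_1", "ETS1", "NK cells", True, True, True),
    TEFeature("locus_2", "ETS1", "NK cells", True, False, True),
    TEFeature("locus_3", "ETS1", "mTEC", True, True, True),
]
kept = [f.te_locus for f in toy_features if is_enhancer_candidate(f)]
```
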

      (5) Figure 2G: binding of ETS1 is unconvincing. Were there statistically meaningful peaks called in these regions? It would be good to also show a metaplot/heatmap of ETS1 profile over all elements of relevant subfamilies. Showing histone marks on the genome browser snapshots would also be useful. Is there any transcriptional evidence that the specific Alus shown act as alternative promoters?

      We agree with the Reviewer that the examples provided were not particularly convincing. Thus, we reanalyzed the data to determine if statistically significant ETS1 peaks (see the answer to Reviewer 2's comment #6 for details on the methods) located near gene transcription start sites overlapped with TEs. We thereby provided examples of significant ETS1 peaks overlapping TE sequences in the promoter region of two prototypical NK cell protein-coding genes (Figure 2h).

      (6) Why was -k 10 used with bowtie2? This will map the same read to multiple locations in the genome, increasing read density at more repetitive (younger) TEs. The authors should use either default settings, being clear about the outcome (random assignment of multimapping reads to one location), or use only uniquely aligned reads.

      We thank the Reviewer for their comment and agree that using the -k 10 parameter with bowtie2 was not optimal for TE analysis. To improve the strength of our analyses, we reanalyzed all ChIP-seq data of our manuscript (Figure 2g and h, Figure 5e and f) using the following strategy: alignment with bowtie2 using default parameters except --very-sensitive, multimapping read removal with samtools view -q 10, removal of duplicate reads with samtools markdup -r, peak calling with macs2 using the -m 5 50 parameter, and removal of peaks overlapping ENCODE's blacklist regions with bedtools intersect.

      These new analyses strengthen our evidence that TEs interact with multiple genes that regulate thymic development and function. We updated the results sections concerning ChIP-seq data analyses and the Methods section to include this information:

      "ChIP-seq reads were aligned to the reference Homo sapiens genome (GRCh38) using bowtie2 (version 2.3.5) (10) with the --very-sensitive parameter. Multimapping reads were removed using the samtools view function with the -q 10 parameter, and duplicate reads were removed using the samtools markdup function with the -r parameter (11). Peak calling was performed with macs2 with the -m 5 50 parameter (12). Peaks overlapping with the ENCODE blacklist regions (13) were removed with bedtools intersect (14) with default parameters. Overlap of ETS1 peaks with TE sequences was determined using bedtools intersect with default parameters. BigWig files were generated using the bamCoverage function of deeptools2 (15), and genomic tracks were visualized in the UCSC Genome Browser (16)."

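      The final overlap step (the share of peaks that intersect at least one TE) can be illustrated in a few lines. This sketch is a stand-in for the bedtools-based analysis, not a reproduction of it: it assumes BED-style half-open intervals and counts any overlap of one base or more, and the coordinates and the function name `fraction_overlapping` are hypothetical. The brute-force comparison is adequate only for small inputs.

```python
from collections import defaultdict


def fraction_overlapping(peaks, tes):
    """Fraction of peaks overlapping at least one TE interval.

    Both inputs are lists of (chrom, start, end) tuples using
    BED-style half-open coordinates [start, end).
    """
    tes_by_chrom = defaultdict(list)
    for chrom, start, end in tes:
        tes_by_chrom[chrom].append((start, end))
    hits = sum(
        1
        for chrom, p_start, p_end in peaks
        if any(
            t_start < p_end and p_start < t_end
            for t_start, t_end in tes_by_chrom[chrom]
        )
    )
    return hits / len(peaks) if peaks else 0.0


# Toy coordinates, invented for illustration.
toy_peaks = [
    ("chr1", 100, 200), ("chr1", 300, 400),
    ("chr2", 50, 150), ("chr2", 500, 600), ("chr3", 0, 10),
]
toy_tes = [("chr1", 150, 250), ("chr1", 400, 500), ("chr2", 0, 60)]
overlap_fraction = fraction_overlapping(toy_peaks, toy_tes)
```

      Note that the peak ending at position 400 and the TE starting at 400 are adjacent rather than overlapping under half-open coordinates, so that peak is not counted.
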
      (7) Figure 1d needs a y axis scale. Could the authors also provide details of how the random distribution of TE expression was generated?

      We agree with the Reviewer that Figure 1d was incomplete and made the appropriate modifications. Regarding the random distribution, we reproduced our dataset containing the expression of 809 TE subfamilies in 18 cell populations. For each combination of TE subfamily and cell type, we randomly assigned an "expression pattern" as identified by the hierarchical clustering of Figure 1b. Then, we computed the maximal occurrence of an expression pattern across cell types for each TE subfamily to generate the distribution curve in Figure 1d. We added the following details to the Methods section to clarify how the random distribution was generated:

      "As a control, a random distribution of the expression of 809 TE subfamilies in 18 cell populations was generated. A cluster (cluster 1, 2, or 3) was randomly attributed for each combination of TE subfamily and cell type, and the maximal occurrence of a given cluster across cell types was then computed for each TE subfamily. Finally, the distributions of LINE, LTR, and SINE elements were compared to the random distribution with Kolmogorov-Smirnov tests."

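      The randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: it assumes each of the three clusters is drawn with equal probability (the text does not specify the sampling weights), and the function name and seed are hypothetical. The resulting list of per-subfamily maxima is the null distribution that observed LINE, LTR, and SINE distributions would be compared against, for example with a two-sample Kolmogorov-Smirnov test such as `scipy.stats.ks_2samp`.

```python
import random
from collections import Counter


def random_max_occurrence(n_subfamilies=809, n_cell_types=18,
                          clusters=(1, 2, 3), seed=0):
    """Null distribution of the most frequent cluster count per subfamily.

    For every TE subfamily, a cluster is drawn uniformly at random for
    each cell type; the function records how many cell types share the
    most common cluster.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    maxima = []
    for _ in range(n_subfamilies):
        assigned = [rng.choice(clusters) for _ in range(n_cell_types)]
        maxima.append(max(Counter(assigned).values()))
    return maxima


null_maxima = random_max_occurrence()
```

      With 18 cell types and three clusters, every value necessarily lies between 6 and 18, which gives a quick sanity check on the output.
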
      (8) The motif analysis requires a minimum of 1 locus from each TE subfamily containing it in order to be reported, but this seems like a really low threshold that will output a lot of noise. What is the rationale here?

      We agree with the Reviewer that this threshold might appear low. Nonetheless, these analyses ultimately aimed to identify TE promoter and enhancer candidates. Hence, we did not want to put an arbitrary threshold at a higher value (e.g., a certain number or percentage of all loci of a given TE subfamily), as this might create a bias based on the total number of loci of a given TE subfamily. Moreover, our rationale was that a TE locus might act as a promoter/enhancer even if it is the only locus of its subfamily containing a TF binding site.

      Even though this strategy might have created some noise in the analyses of interactions between TFs and TEs of Figure 2 (panels a-e), we are confident that our bootstrap strategy efficiently removed low-quality identifications based on low correlation values or expression of TF and TE in low percentages of cells. Additionally, the subsequent analyses on TE promoter and enhancer candidates were performed exclusively for the TE loci containing TF binding sites to avoid adding noise to these analyses.

      (9) Figure 4e: is this a log2 enrichment? If not, the enrichments for some of the gene sets are not so high.

      The enrichment values represented in Figure 4e are not log-transformed. It is essential to highlight that gene set enrichment values were computed for each possible pair of thymic APCs (e.g., pDC vs. cDC1, pDC vs. mTEC(II), etc.), and the values represented in Figure 4e are an average of each comparison pictured at the bottom of the UpSet plot.

      However, we agree with Reviewer 2 that the average enrichment value is not extremely high. We thus made the following modifications to the Results section ("TE expression in human pDCs is associated with dsRNA structures") to better represent it:

      "Notably, thymic pDCs harbored moderate yet significant enrichment of gene signatures of RIG-I and MDA5-mediated IFN ɑ/β signaling compared to all other thymic APCs (Figure 4e and Supplementary file 1 – Table 8)."

      (10) Please be clear on results subtitles when these refer to mouse.

      We apologize for the confusion and modified the subtitles to clarify if the results refer to mouse or human data.

      (11) Figure 1 - figure supplement 2: "assignation" should be 'assignment'.

      We thank the Reviewer for their keen eye and changed the title of Figure 1 – figure supplement 2.

      (1) Sadeq S, Al-Hashimi S, Cusack CM, Werner A. Endogenous Double-Stranded RNA. Noncoding RNA. 2021;7(1).

      (2) Kim N, Kim M, Yun S, Doh J, Greenberg PD, Kim TD, et al. MicroRNA-150 regulates the cytotoxicity of natural killers by targeting perforin-1. J Allergy Clin Immunol. 2014;134(1):195-203.

      (3) Gunturi A, Berg RE, Forman J. The role of CD94/NKG2 in innate and adaptive immunity. Immunol Res. 2004;30(1):29-34.

      (4) Taveirne S, Wahlen S, Van Loocke W, Kiekens L, Persyn E, Van Ammel E, et al. The transcription factor ETS1 is an important regulator of human NK cell development and terminal differentiation. Blood. 2020;136(3):288-98.

      (5) De Cecco M, Ito T, Petrashen AP, Elias AE, Skvir NJ, Criscione SW, et al. L1 drives IFN in senescent cells and promotes age-associated inflammation. Nature. 2019;566(7742):73-8.

      (6) Huntley S, Baggott DM, Hamilton AT, Tran-Gyamfi M, Yang S, Kim J, et al. A comprehensive catalog of human KRAB-associated zinc finger genes: insights into the evolutionary history of a large family of transcriptional repressors. Genome Res. 2006;16(5):669-77.

      (7) Wingett SW, Andrews S. FastQ Screen: A tool for multi-genome mapping and quality control. F1000Res. 2018;7:1338.

      (8) Lanciano S, Cristofari G. Measuring and interpreting transposable element expression. Nat Rev Genet. 2020;21(12):721-36.

      (9) Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. 2016;34(5):525-7.

      (10) Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357-9.

      (11) Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10(2).

      (12) Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137.

      (13) Amemiya HM, Kundaje A, Boyle AP. The ENCODE Blacklist: Identification of Problematic Regions of the Genome. Sci Rep. 2019;9(1):9354.

      (14) Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841-2.

      (15) Ramirez F, Ryan DP, Gruning B, Bhardwaj V, Kilpert F, Richter AS, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44(W1):W160-5.

      (16) Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al. The human genome browser at UCSC. Genome Res. 2002;12(6):996-1006.

    2. Reviewer #2 (Public Review):


      Larouche et al show that TEs are broadly expressed in thymic cells, especially in mTECs and pDCs. Their data suggest a possible involvement of TEs in thymic gene regulation and IFN-alpha secretion. They also show that at least some TE-derived peptides are presented by MHC-I in the thymus.


      The idea of high/broad TE expression in the thymus as a mechanism for preventing TE-mediated autoimmunity is certainly an attractive one, as is their involvement in IFN-alpha secretion therein. The analyses and experiments presented here are therefore a very useful primer for more in-depth experiments, as the authors point out towards the end of the discussion.


      There are many dangers about analysing RNA-seq data at the subfamily level. Outputs may be greatly confounded by pervasive transcription, DNA contamination, and overlap of TEs with highly expressed genes. Whether TE transcripts are independent units or part of a gene also has important implications for the conclusions drawn. The authors have tried to mitigate against some of these issues, but they have not been completely ruled out.

    3. eLife assessment

      This important study shows, based on analyses of single-cell RNA-seq data sets of thymus cells, that transposable elements (TEs) are broadly expressed in thymic stromal cells, especially in medullary thymic epithelial cells and plasmacytoid dendritic cells. The authors also show that at least some TE-derived peptides are presented by MHC-I molecules in the thymus. The study provides solid findings supporting a role of TEs in thymic T-cell selection and immune self-tolerance.

    4. Reviewer #1 (Public Review):


      Transposable Elements (TEs) are exogenously acquired DNA regions that have played important roles in the evolutionary acquisition of various biological functions. TEs may have been important in the evolution of the immune system, but their role in thymocytes has not been fully clarified.

      Using the human thymus scRNA dataset, the authors suggest the existence of cell type-specific TE functions in the thymus. In particular, it is interesting to show that there is a unique pattern in the type and expression level of TEs in thymic antigen-presenting cells, such as mTECs and pDCs, and that they are associated with transcription factor activities. Furthermore, the authors suggested that TEs may be non-redundantly regulated in expression by Aire, Fezf2, and Chd4, and that some TE-derived products are translated and present as proteins in thymic antigen-presenting cells. These findings provide important insights into the evolution of the acquired immune system and the process by which the thymus acquires its function as a primary lymphoid tissue.


      (1) By performing single-cell level analysis using scRNA-seq datasets, the authors extracted essential information on heterogeneity within the cell population. It is noteworthy that this revealed the diversity of expression not only of known autoantigens but also of TEs in thymic antigen-presenting cells.

      (2) The attempt to use mass spectrometry to confirm the existence of TE-derived peptides is worthwhile, even if the authors did not obtain data on as many transcripts as expected.

      (3) The use of public data sets and the clearly stated methods of analysis improved the transparency of the results.


      (1) The authors sometimes made overstatements largely due to the lack or shortage of experimental evidence.

      For example in Figure 4, the authors concluded that thymic pDCs produced higher copies of TE-derived RNAs to support the constitutive expression of type-I interferons in thymic pDCs, unlike peripheral pDCs. However, the data showed only a correlation between the distinct TE expression pattern in pDCs and the abundance of dsRNAs. We are compelled to say that the evidence is far too weak to support a role for TEs in interferon production. Even if pDCs express a distinct type and amount of TE-derived transcripts, it may be a negligible amount compared to the total cellular RNAs. How many TE-derived RNAs potentially form the dsRNAs? Are they over-expressed in pDCs?

      The data interpretation requires more caution to connect the distinct results of transcriptome data to the biological significance.

      We contend that our manuscript combines the attributes of a research article (novel concepts) and a resource article (datasets of TEs implicated in various aspects of thymus function). The critical strength of our work is that it opens entirely novel research perspectives. We are unaware of previous studies on the role of TEs in the human thymus. The drawback is that, as with all novel multi-omic systems biology studies, our work provides a roadmap for a multitude of future mechanistic studies that could not be realized at this stage. Indeed, we performed wet lab experiments to validate some but not all conclusions: i) presentation of TE-derived MAPs by TECs and ii) formation of dsRNAs in thymic pDCs. In response to Reviewer #1, we performed supplementary analyses to increase the robustness of our conclusions. Also, we indicated when conclusions relied strictly on correlative evidence and clarified the hypotheses drawn from our observations. Regarding the Reviewer's questions about TE-derived dsRNAs, LINE, LTR, and SINE elements all have the potential to generate dsRNAs, given their highly repetitive nature and bi-directional transcription (1). As ~32% of TE subfamilies are overexpressed in pDCs, we hypothesized that these TE sequences might form dsRNA structures in these cells. To address the Reviewer's concerns regarding the amount of TE-derived RNAs among total cellular RNAs, we also computed the percentage of reads assigned to TEs in the different subsets of thymic APCs (see Reviewer 1 comment #4).

      ------

      I appreciate the authors' efforts to improve the quality of this valuable paper. The additional data proposed by the authors enhanced the possibility that the non-negligible amount of RNAs in pDCs is derived from TE elements. Their biological roles and significance will be demonstrated in future research.

      (2) Lack of generality of specific examples. This manuscript discusses the whole genomic picture of TE expression. In addition, one good way is to focus on the specific example to clearly discuss the biological significance of the acquisition of TEs for the thymic APC functions and the thymic selection.

      In Figure 2, the authors focused on ETS-1 and its potential target genes ZNF26 and MTMR3, however, the significance of these genes in NK cell function or development is unclear. The authors should examine and discuss whether the distinct features of TEs can be found among the genomic loci that link to the fundamental function of the thymus, e.g., antigen processing/presentation.

      We thank the Reviewer for this highly relevant comment. We investigated the genomic loci associated with NK cell biology to determine if ETS1 peaks would overlap with TE sequences in protein-coding genes' promoter region. Figure 2h illustrates two examples of ETS1 significant peaks overlapping TE sequences upstream of PRF1 and KLRD1. PRF1 is a protein implicated in NK cell cytotoxicity, whereas KLRD1 (CD94) dimerizes with NKG2 and regulates NK cell activation via interaction with the nonclassical MHC-I molecule HLA-E (2, 3). Thus, we modified the section of the manuscript addressing these results to include these new analyses: "Finally, we analyzed publicly available ChIP-seq data of ETS1, an important TF for NK cell development (4), to confirm its ability to bind TE sequences. Indeed, 19% of ETS1 peaks overlap with TE sequences (Figure 2g). Notably, ETS1 peaks overlapped with TE sequences (Figure 2h, in red) in the promoter regions of PRF1 and KLRD1, two genes important for NK cells' effector functions (2, 3)."

      ------

      I am convinced by the authors' explanation that TE elements may contribute to the functions of NK cells. However, since I have understood that the main topic of this paper is about the thymus and thymic antigen-presenting cells, the mention of NK cells seems abrupt and unconnected to me. NK cells are a type of innate lymphocyte that arise in the bone marrow, and thymus is dispensable for their development and function. The readers might expect to find something more fundamental regarding the function of the thymus and immunological tolerance.

      (3) Since the deep analysis of the dataset yielded many intriguing suggestions, why not add a discussion of the biological reasons and significance? For example, in Figure 1, why is TE expression negatively correlated with proliferation? cTEC-TE is mostly postnatal, while mTEC-TE is more embryonic. What does this mean?

      We thank the Reviewer for this comment. To our knowledge, the relationship between cell division and transcriptional activity of TEs has not been extensively studied in the literature. However, a recent study has shown that L1 expression is induced in senescent cells. We therefore added the following sentences to our Discussion: "The negative correlation between TE expression and cell cycle scores in the thymus is coherent with recent data showing that transcriptional activity of L1s is increased in senescent cells (5). A potential rationale for this could be to prevent deleterious transposition events during DNA replication and cell division." We also added several discussion points regarding the regulation of TEs by KZFPs to answer concerns raised by Reviewer 2 (see Reviewer 2 comment #1).

      ------

      I agree on the possibility suggested by the authors.

      (4) To consolidate the experimental evidence about pDCs and TE-derived dsRNAs, one option is to show the amount of TE-derived RNA copies among total RNAs. The immunohistochemistry analysis in Figure 4 requires additional data to demonstrate that overlapped staining was not caused by technical biases (e.g. uneven fixation may cause the non-specifically stained regions/cells). To show this, authors should have confirmed not only the positive stainings but also the negative staining (e.g. CD3, etc.). Another possible staining control was showing that non-pDC (CD303- cell fractions in this case) cells were less stained by the ds-RNA probe.

      We thank the Reviewer for this suggestion. We computed the proportion of reads in each cell assigned to two groups of sequences known to generate dsRNAs: TEs and mitochondrial genes (1). These analyses showed that the proportion of reads assigned to TEs is higher in pDCs than other thymic APCs by several orders of magnitude (~20% of all reads). In contrast, reads derived from mitochondrial genes had a lower abundance in pDCs. We included these results in Figure 4 - figure supplement 2 and included the following text in the Results section: "To evaluate if these dsRNAs arise from TE sequences, we analyzed in thymic APC subsets the proportion of the transcriptome assigned to two groups of genomic sequences known as important sources of dsRNAs, TEs and mitochondrial genes (1). Strikingly, whereas the percentage of reads from mitochondrial genes was typically lower in pDCs than in other thymic APCs, the proportion of the transcriptome originating from TEs was higher in pDCs (~22%) by several orders of magnitude (Figure 4 - figure supplement 2)." As a negative control for the immunofluorescence experiments, we used CD123- cells. Indeed, flow cytometry analysis showed that the magnetically enriched CD303+ fraction was around 90% pure, as revealed by double staining with CD123 and CD304 (two additional markers of pDCs): CD123- cells were also CD304-/lo, showing that these cells are non-pDCs. Thus, we decided to compare the dsRNA signal between CD123+ cells (pDCs) and CD123- cells (non-pDCs). The difference between CD123+ and CD123- cells was striking (Figure 4d).

      ------

      Although the technical concerns about immunostaining were not resolved, it is understandable that it would be difficult to rerun the experiment since the authors used precious human thymi as the experimental material. Co-immunostaining requires careful interpretation, so a careful experimental setup is needed.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      The very detailed insights gained by the authors into allosteric regulation require very specialized techniques in this study. This poses a challenge to communicate the methods, the results, and the meaning of the results to a broader audience. In some places, the authors overcome this challenge better than in others.

      Following this reviewer’s suggestions, we have extensively revised the text, making the text more understandable to a broader audience.

      The manuscript does not show up on BioRxiv.

      The manuscript is now deposited in bioRxiv (doi: 10.1101/2023.09.12.557419).

      Fig3: GS-ES2 transition: the changes appear minimal in the illustration.

      As suggested by this reviewer, we have re-examined the GS-ES2 transition and clearly defined the structural characteristics of the conformationally excited state 2 (ES2). As shown in the revised Fig. 3 of the main text, the ground state (GS) features a π-π packing between the aromatic rings of F100 and Y156, as well as a cation-π stacking between R308 and F102. In the ES2 state, these interactions are disrupted, while a new π-π packing interaction is formed between F100 and F102. We added new comments in the main text clarifying the structural interactions that characterize each state.

      GS-ES1 transition: how is the K72-E91 salt bridge disrupted? How do you define the formation/disruption of a salt bridge? The current figure does not make this very clear and the K72-E91 salt bridge appears to be intact in ES1. Maybe the authors could replace the dotted K72-E91 line with a dotted line and distance?

      As stated above, we revised Fig. 3 highlighting the differences between the two states. The K72 and E91 salt bridge is formed when the distance between Nζ of K72 and Oε of E91 is shorter than 4.0 Å (the typical cutoff for a salt bridge). In the ES1 state, the outward movement of the αC helix increases the distance to over 4.5 Å, disrupting the salt bridge.
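      The distance criterion above can be written as a short check. This is a sketch under stated assumptions, not the authors' analysis script: the function names are hypothetical, the coordinates in the usage note are invented for illustration, and taking the shorter of the two carboxylate oxygen distances is an assumption, since the response does not say which E91 oxygen is measured.

```python
import math

SALT_BRIDGE_CUTOFF = 4.0  # Å, the cutoff quoted in the response


def shortest_n_o_distance(lys_nitrogen, glu_oxygens):
    """Shortest distance (Å) from the lysine side-chain nitrogen to either
    glutamate carboxylate oxygen (assumption: the minimum is used)."""
    return min(math.dist(lys_nitrogen, oxygen) for oxygen in glu_oxygens)


def salt_bridge_formed(lys_nitrogen, glu_oxygens, cutoff=SALT_BRIDGE_CUTOFF):
    """True when the N-O distance is below the cutoff."""
    return shortest_n_o_distance(lys_nitrogen, glu_oxygens) < cutoff
```

      For example, with a hypothetical nitrogen at `(0.0, 0.0, 0.0)`, oxygens at `(2.8, 0.0, 0.0)` and `(3.5, 1.5, 0.0)` give an intact bridge, whereas oxygens at `(4.6, 0.0, 0.0)` and `(5.0, 1.0, 0.0)` give a disrupted one. Applied frame by frame to a trajectory, the same check would yield the fraction of time the bridge is formed in each state.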

      L251: Could the authors remind the reader why they are only comparing V104 and I150? Could they give a little context as to why they consider the agreement to be good? It appears that they would be statistically different, so a little context for what comprises a good agreement in the literature may be helpful.

      Our mutagenesis studies show that V104 and I150 are key residues for allosteric communication, and if mutated, result in well-folded but inactive kinases (Sci Adv. doi: 10.1126/sciadv.1600663). Importantly, V104 and I150 show two distinct populations in the CEST experiments that can be directly related to the GS and ES states. Regarding the fitting of these residues, we obtained a good agreement with the direction of the chemical shifts, which supports the hypothesized GS -> ES structural transition. The lack of a quantitative agreement between the chemical shifts of the experimental and simulated excited state is not surprising for two reasons: a) all state-of-the-art simulations fall short in sampling slow conformational interconversions, and b) the uncertainty of the SHIFTX algorithm for the prediction of 13C chemical shifts of methyl groups is quite large. Finally, we would like to point out that most NMR relaxation-dispersion experiments (CEST and CPMG) are performed for the backbone 15N, 13Calpha and 1H resonances, which have been used to calculate the structures of the intermediate states (Neudecker, P. et al. Science, 2012, 336, doi: 10.1126/science.1214203) and yield reasonable agreement with the prediction for metastable states derived from Markov Models (Olsson, S. J. Am. Chem. Soc., 2017, 139, doi: 10.1021/jacs.6b09460). To the best of our knowledge, there is no literature reporting on calculations of the 13C CEST profiles for methyl groups from MD simulations, and remarkably, we found a reasonably good agreement between experimental and predicted chemical shifts (see Fig. 5C).

      Just to clarify: the calculated CS values are informed by experimental CS values that were used in the calculation?

      We used the backbone chemical shifts as the restraints only in the metadynamics simulations. We used the chemical shifts of the methyl groups and their corresponding excited states to verify the ES2 state.

      Figure 8: in its current form this potentially exciting result is lost on the average reader.

      We modified Fig. 8 of the main text, making the intra- and inter-residue correlations visible to the reader.

      Reviewer #2:

      While the alphaC-beta4 loop is a conserved feature of protein kinases, the residues within this loop vary across various kinase families and groups, enabling group and family-specific control of activity through cis and trans acting elements. F102 in PKA interacts with co-conserved residues in the C-tail, which has been proposed to function as a cis regulatory element. The authors should elaborate on the conformational changes in the C-tail, particularly in the arginine that packs against F102, in the results and discussion. This would further extend the impact and scope of the manuscript, which is currently confined to PKA.

      As suggested by this reviewer, we re-analyzed the time-dependent interactions between F102 and R308 at the C-tail. As this reviewer suspected, these interactions differentiate the ES2 from the GS state. In the GS state, there is a stable cation-π interaction between F102 and R308, which becomes transient in the ES2 state (Fig. 3). For the F100A mutant, the interactions between F102 and R308 have lower occurrence relative to the WT enzyme, i.e., a weaker interaction between the αC-β4 loop and the C-tail (see new Figure 6 - figure supplement 1). The latter supports our conclusion that the structural coupling between the C-tail and the two lobes of the enzyme decreases for the F100A mutant. We added more comments in the main text.

      FAIR standards of making the data accessible and reproducible are not directly addressed.

      We have deposited all our NMR data on the Data Repository Site at the University of Minnesota, DRUM (https://hdl.handle.net/11299/261043).

      The MD data and conformational states would be a valuable resource for the community and should be shared via some open-source repositories.

      Due to the large size of the simulations (>500 GB), we could not deposit them in the Data Repository Site at the University of Minnesota (DRUM). We are actively working with the personnel at DRUM to upload all the trajectories to an alternate site. However, these data will be available to the public immediately upon request.

      The authors state that ES1 and ES2 states are novel and not observed in previous crystal structures. The authors should quantify this through comparisons with PKA inactive states and with other AGC kinases.

      We apologize for the confusion. We now clarify that the ES1 is a well-known inactivation pathway. As suggested by this reviewer, we now report a few examples of active and inactive conformations of PKA-C and other kinases (see new Figure 3 – figure supplement 2). Briefly, ES1 corresponds to the typical αC-out conformation found for PKA-C bound to inhibitors or in the R194A mutant. A similar conformation is present for Src, Abl, and CDK2. The αC-out conformation features a disrupted β3K-αCE salt bridge, which is key for active kinases. In contrast, the transition GS-ES2 is not present in the inactive conformations deposited in the PDB.

      Based on the results, can the authors speculate on the impact of oncogenic mutations in the alphaC-beta4 loop in PKA?

      We now include additional comments and another citation that further supports our findings. In short, the activation of a kinase is generated by mutation insertions that stabilize the αC-β4 loop as pointed out by Kannan and Zhang (see references 28, 30, and 68). In contrast, mutations that destabilize this allosteric site (e.g., F100A) are inactivating, disrupting the structural couplings of the two lobes (our work).

      Reviewer #3:

      The manuscript is somewhat difficult to read even for kinase experts, and even harder for the layman. The difficulty partially arises from mixing technical description of the simulations with structural interpretation of the results, which is more intuitive, and partially arises from the assumption that readers are familiar with kinase architecture and its key elements (the aC helix, the APE motif, etc).

      We revised the text and modified Fig. 1 in the main text to make the paper more accessible to the general audience.

      The authors haven't done a good job describing the ES2 state intuitively. From my examination of the figures, it appears that in the ES2 state, the kinase domain is more elongated and the N and the C lobes are relatively less engaged than in the ground state. This may or may not be exactly right, but a more intuitive description of the ES2 state is needed.

      As suggested by this reviewer, we include a better description of the ES2 state of the kinase and the structural details of the inactivation pathway. Also, we checked the radius of gyration of the two lobes for GS and ES2. ES2 is slightly more elongated with an Rg of 20.3 ± 0.1 Å as compared to the GS state (20.0 ± 0.2 Å). This marginal difference is consistent with our characterization of the local packing around the αC-β4 loop, in which the lack of stable interactions with the αE helix and the C-tail in the ES2 state makes the overall structure less compact.
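      For readers unfamiliar with the quantity, the radius of gyration is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below shows the standard definition only; which atoms the authors selected, whether they weighted by mass, and how they averaged over frames are not stated in the response, so the function name and the equal-mass default are assumptions.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Rg = sqrt(sum_i m_i * |r_i - r_cm|^2 / sum_i m_i), in the units of coords.

    coords: list of (x, y, z) tuples; masses: optional list of weights
    (assumption: equal masses when none are given)."""
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)
```

      Two equal-mass points at `(-1, 0, 0)` and `(1, 0, 0)` give an Rg of 1.0, and doubling their separation doubles Rg, which is why a more elongated domain reports a larger value.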

      The authors need to introduce and give a brief description of technical terms such as CV (collective variable), PC (principal component) etc.

      We now specify both collective variables and principal components and include those definitions in the Methods section. Briefly, to characterize the complex conformational transitions of PKA-C, we utilize collective variables (Figure 2 – figure supplement 1). We chose these variables based on structural motifs described in the literature to define local and global structural transitions (Camilloni, C., Vendruscolo, M. Biochemistry, 2015, 54, 7470; Kukic, P. et al. Structure, 2015, 23, 745). On the other hand, we utilized principal component analysis to compare the conformational changes of the kinase in the same two-dimensional space, revealing the two lowest frequencies that define the global motions of the enzyme (Figures 7C, D, and E).

      The following paper should be discussed, as it discusses similar ATP/substrate binding of Src kinase based on an extensive network that largely overlaps with the discussed PKA network. Foda, et al. "A dynamically coupled allosteric network underlies binding cooperativity in Src kinase." Nature communications 6.1 (2015): 5939.

      We apologize for missing this citation. Indeed, it makes our finding more general as allosteric cooperativity is key in other kinases such as Src and ERK2. We included this in the Discussion section.

      The CHESCA analysis appears to be an add-on that doesn't add much value. It is difficult to digest. I'd suggest considering moving it to the SI.

      We understand this concern. We rewrote part of the paper to link the NMR analysis of the correlated chemical shifts, as described by the CHESCA matrices, to the MD calculations.

    2. eLife assessment

      This important study provides an example of integrating computational and experimental approaches that lead to new insights into the energy landscape of a model kinase. Compelling use of molecular dynamics simulations and NMR spectroscopy provides a conformational description of active and excited states of the kinase, one of which has not been captured in previously solved crystal structures. Overall, this comprehensive study expands our understanding of the architecture and allosteric features of the conserved bilobal kinase domain structure.

    3. Reviewer #1 (Public Review):


      The authors use insights into the dynamics of the PKA kinase domain, obtained by NMR experiments, to inform MD simulations that generate an energy landscape of PKA kinase domain conformational dynamics.


      The authors integrate strong experimental data through the use of state-of-the-art MD studies and derive detailed insights into allosteric communication in PKA kinase. Comparison of wt kinase with a mutant (F100A) shows clear differences in the allosteric regulation of the two proteins. These differences can be rationalized by NMR and MD results. During the revision process, the authors have addressed the reviewers' comments adequately and have improved the accessibility of the manuscript to a wider audience.

    4. Reviewer #3 (Public Review):


      Combining several MD simulation techniques (NMR-constrained replica-exchange metadynamics, Markov State Model, and unbiased MD), the authors identified the aC-beta4 loop of PKA kinase as a switch crucially involved in PKA nucleotide/substrate binding cooperativity. They identified a previously unreported excited conformational state of PKA (ES2) that this switch controls, and characterized ES2 energetics with respect to the ground state. Based on translating the simulations into chemical shifts and NMR characterization of PKA WT and an aC-beta4 mutant, the authors made a convincing case that the simulation-suggested excited state is indeed an excited state observed by NMR, thus giving the excited state conformational details.


      This work incorporates extensive simulation work, new NMR data, and in vitro biochemical analysis. It stands out in its comprehensiveness, and I think it made a great case.


      The manuscript is somewhat difficult to read even for kinase experts, and even harder for the layman. The difficulty partially arises from mixing the technical description of the simulations with the structural interpretation of the results, which is more intuitive, and partially arises from the assumption that readers are familiar with kinase architecture and its key elements (the aC helix, the APE motif, etc).

    1. eLife assessment

      This is a useful study that identifies circadian changes in the gene expression profile of cultured mouse astrocytes. Mechanistic details linking circadian rhythmicity in HERP, a regulator of calcium signals in the endoplasmic reticulum, to altered phosphorylation of Connexin 43 remain currently incomplete. With improved manuscript clarity and statistical analysis, this work could be of interest to the field of astrocyte and circadian biology.

    2. Reviewer #1 (Public Review):


      In Ryu et al., the authors use a cortical mouse astrocyte culture system to address the functional contribution of astrocytes to circadian rhythms in the brain. The authors' starting point is transcriptional output from serum-shocked culture, comparative informatics with existing tools and existing datasets. After fairly routine pathway analyses, they focus on the calcium homeostasis machinery and one gene, Herp, in particular. They argue that Herp is rhythmic at both mRNA and protein levels in astrocytes. They then use a calcium reporter targeted to the ER, mitochondria, or cytosol and show that Herp modulates calcium signaling as a function of circadian time. They argue that this occurs through the regulation of inositol receptors. They claim that the signaling pathway is clock-controlled by a limited examination of Bmal1 knockout astrocytes. Finally, they switch to calcium-mediated phosphorylation of the gap junction protein Connexin 43 but do not directly connect HERP-mediated circadian signaling to these observations. While these experiments address very important questions related to the critical role of astrocytes in regulating circadian signaling, the mechanistic arguments for HERP function, its role in circadian signaling through inositol receptors, the connection to gap junctions, and ultimately, the functional relevance of these findings are only partially substantiated by experimental evidence.


      - The paper provides useful datasets of astrocyte gene expression in circadian time.

      - Identifies HERP as a rhythmic output of the circadian clock.

      - Demonstrates the circadian-specific sensitivity of ATP -> calcium signaling.

      - Identifies possible rhythms in both Connexin 43 phosphorylation and rhythmic movement of calcium between cells.


      - It is not immediately clear why the authors chose to focus on Ca2+ homeostasis or Herp from their initial screens, as neither was the "most rhythmic" pathway in their primary analyses.

      - It would have been interesting (and potentially important) to know whether various methods of cellular synchronization would also render HERP rhythmic (e.g., temperature, forskolin, etc). If Herp is indeed relatively astrocyte-specific and rhythmic, it should be easy to assess its rhythmicity in vivo.

      - The authors show that Herp suppression reduces ATP-mediated suppression of calcium whereas it initially increases Ca2+ in the cytosol and mitochondria and then suppresses it. The dynamics of the mitochondrial and cytosolic responses are not discussed in any detail and it is unclear what their direct relationship is to Herp-mediated ER signaling. What is the explanation for Herp (which is thought to be ER-specific) affecting calcium signaling in other organelles?

      - What is the functional significance of promoting ATP-mediated suppression of calcium in ER?

      - The authors then nicely show that the effect of ATP is dependent on intrinsic circadian timing but do not explain why these effects are antiphase in cytosol or mitochondria. Moreover, the ∆F/F for calcium in mitochondria and cytosol both rise, cross the abscissa, and then diminish - strongly suggesting a biphasic signaling event. Therefore, one wonders whether measuring the area under the curve is the most functionally relevant measurement of the change.

      - Why are mitochondrial and cytosolic calcium not also demonstrated for Bmal1 KO astrocytes?

      - The authors claim that Herp acts by regulating the degradation of ITPRs but this hypothesis - rather central to the mechanisms proposed in this study - is not experimentally substantiated.

      - There is no clear demonstration of the functional relevance of the circadian rhythms of ATP-mediated calcium signaling.
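
      The concern raised in the list above about summarising a biphasic ∆F/F response by a single area under the curve can be illustrated with a minimal sketch. It is not taken from the manuscript: the trace is a hypothetical sine cycle standing in for a response that rises, crosses the abscissa, and then undershoots, and the function names are illustrative assumptions.

```python
import math

def trapezoid(y, dt):
    """Integrate evenly sampled values with the trapezoidal rule."""
    return dt * (sum(y) - 0.5 * (y[0] + y[-1]))

def summarize_trace(dff, dt):
    """Return the net AUC alongside separate positive and negative lobes."""
    positive = [max(v, 0.0) for v in dff]  # keep only the rise above baseline
    negative = [min(v, 0.0) for v in dff]  # keep only the undershoot
    return {
        "net_auc": trapezoid(dff, dt),
        "positive_auc": trapezoid(positive, dt),
        "negative_auc": trapezoid(negative, dt),
        "peak": max(dff),
        "trough": min(dff),
    }

# Hypothetical biphasic trace (assumption): one full sine cycle sampled at 10 ms.
dt = 0.01
trace = [math.sin(2 * math.pi * i * dt) for i in range(101)]
summary = summarize_trace(trace, dt)
```

      In this toy case the net area is close to zero even though both lobes are substantial, which is why reporting the positive and negative phases (or peak and trough) separately may be more informative than a single integral.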

    3. Reviewer #2 (Public Review):


      The article entitled "Circadian regulation of endoplasmic reticulum calcium response in mouse cultured astrocytes" submitted by Ryu and colleagues describes the circadian control of astrocytic intracellular calcium levels in vitro.


      The authors used a variety of technical approaches that are appropriate.


      Statistical analysis is poor and could lead to a misinterpretation of the data.

      Several conceptual issues have been identified.

      Overinterpretation of the data should be avoided. This is a mechanistic paper done completely in vitro; all references to the in vivo situation are speculative and should be avoided.

    4. Reviewer #3 (Public Review):

      Astrocyte biology is an active area of research and this study is timely and adds to a growing body of literature in the field. The RNA-seq, Herp expression, and Ca2+ release data across wild-type, Bmal1 knockout, and Herp knockdown cellular models are robust and lend considerable support to the study's conclusions, highlighting their importance. Despite these strengths, the manuscript presents a gap in elucidating the dynamics of HERP and the involvement of ITPR1/2 in modulating Ca2+ release patterns and their circadian variations, which remains insufficiently supported and characterized. While the Connexin data underscore the importance of rhythmic Ca2+ release triggered by ATP, the relationship here appears correlational and the role of HERP and ITPR in Cx function remains to be characterized. Moreover, enhancing the manuscript's clarity and readability could significantly benefit the presentation and comprehension of the findings.

    1. eLife assessment

      This fundamental work substantially advances our understanding of cell migration, especially that of the cranial neural crest. The additional evidence provided to support the conclusion is exceptional, with rigorous biochemical assays for the materials used and intensive genetic interventions. The work will be of broad interest to developmental biologists and cell biologists.

    2. Author Response

      The following is the authors’ response to the original reviews.

      We thank the two reviewers for their very thoughtful suggestions and the editors for writing the eLife assessment. We will submit a revised manuscript that addresses most comments and include a point-by-point response to the reviewers. We will provide evidence that overexpression of the HtrA1 protease and knockdown of its inhibitor SerpinE2 reduce the development of neural crest-derived cartilage elements in the head of Xenopus embryos. This will be done by whole mount in situ hybridization, using a probe for the chondrogenic marker Sox9. We will also provide two time-lapse movies showing (1) collective migration of cranial neural crest cells in culture and (2) failure of these cells to adhere to fibronectin upon SerpinE2 depletion. We will discuss in more depth how the SerpinE2-HtrA1 proteolytic pathway and its target, the heparan sulfate proteoglycan Syndecan-4, might regulate FGF signaling and suggest a model in which serpin secreted by the leader cells and the protease released by the follower cells might establish a chemotactic FGF gradient for the directed migration of the neural crest cohort. The criticism that other factors such as proliferation and cell survival might contribute to the observed craniofacial phenotypes upon misexpression of SerpinE2 and HtrA1, and that it remains unclear to what extent the mechanism reported here is conserved in the trunk neural crest, is valid. The reason we focused on the more amenable cranial neural crest in the Xenopus embryo and used a multitude of approaches – structure-function studies, biochemical analyses, in vitro explant assays and epistatic experiments in vivo – was to validate a central finding: that an extracellular proteolytic pathway involving a serpin, a protease and a proteoglycan regulates collective cell migration by a double-inhibition mechanism.

    1. eLife assessment

      This important study reports a novel approach to studying cerebellar function based on the idea of selective recruitment using fMRI. It provides convincing evidence for task-dependent gating of neocortical input to the cerebellum during a motor task and a working memory task. The study will be of interest to a broad cognitive neuroscience audience.

    2. Reviewer #1 (Public Review):

      This is an interesting and well-written paper reporting on a novel approach to studying cerebellar function based on the idea of selective recruitment using fMRI. The study is well-designed and executed. Analyses are sound and results are properly discussed. The paper makes a significant contribution to broadening our understanding of the role of the cerebellum in human behavior.

      - While the authors provide a compelling case for the link between BOLD and the cerebellar cortical input layer, there remains considerable unexplained variance. Perhaps the authors could elaborate a bit more on the assumption that BOLD signals mainly reflect the input side of the cerebellum (see for example King et al., eLife. 2023 Apr 21;12:e81511).

      - The current approach does not appear to take the non-linear relationships between BOLD and neural activity into account.

      - The authors may want to address in more detail the issue of closed loops, as well as the underlying neuroanatomy, including the deep cerebellar nuclei and pontine nuclei, in the context of their current cerebello-cortical correlational approach. They may also want to consider the contribution of other brain areas such as the basal ganglia and hippocampus.

      - What about the direct projections of mossy fibers to the DCN that actually bypass the cerebellar cortex?

    3. Reviewer #2 (Public Review):


      Shahshahani and colleagues used a combination of statistical modelling and whole-brain fMRI data in an attempt to separate the contributions of cortical and cerebellar regions in different cognitive contexts.


      * The manuscript uses a sophisticated integration of statistical methods, cognitive neuroscience, and systems neurobiology.

      * The authors use multiple statistical approaches to ensure robustness in their conclusions.

      * The consideration of the cerebellum as not a purely 'motor' structure is excellent and important.


      * Two of the foundational assumptions of the model - that cerebellar BOLD signals reflect granule cells > Purkinje neurons and that corticocerebellar connections are relatively invariant - are still open topics of investigation. It might be helpful for the reader if these ideas could be presented in a more nuanced light.

      * The assumption that cortical BOLD responses in cognitive tasks should be matched irrespective of cerebellar involvement does not cohere with the idea of 'forcing functions' introduced by Houk and Wise.

    1. Author Response

      Reviewer #1 (Public Review):

      Theoretical principles of viscous fluid mechanics are used here to assess likely mechanisms of transport in the ER. A set of candidate mechanisms is evaluated, making good use of imaging to represent ER network geometries. Evidence is provided that the contraction of peripheral sheets provides a much more credible mechanism than the contraction of individual tubules, junctions, or perinuclear sheets.

      The work has been conducted carefully and comprehensively, making good use of underlying physical principles. There is a good discussion of the role of slip; sensible approximations (low volume fraction, small particle size, slender geometries, pragmatic treatment of boundary conditions) allow tractable and transparent calculations; clear physical arguments provide useful bounds; stochastic and deterministic features of the problem are well integrated.

      We thank the reviewer for their positive assessment of our work.

      There are just a couple of areas where more discussion might be warranted, in my view.

      (1) The energetic cost of tubule contraction is estimated, but I did not see an equivalent estimate for the contraction of peripheral sheets. It might be helpful to estimate the energetic cost of viscous dissipation in generated flows at higher frequencies.

      This is a good point. We will also include an energetic cost estimate for the contractions of peripheral sheets in the revised manuscript.

      The mechanism of peripheral sheet contraction is unclear: do ATP-driven mechanisms somehow interact with thermal fluctuations of membranes?

      The new energetic estimates in the revision might help constrain possible hypotheses for the mechanism(s) driving peripheral sheet contraction, and suggest if a dedicated ATP-driven mechanism is required.

      (2) Mutations are mentioned in the abstract but not (as far as I could see) later in the manuscript. It would be helpful if any consequences for pathologies could be developed in the text.

      We are grateful for this suggestion. The need to rationalise pathology associated with the subtle effects of mutations in ER morphogens is indeed pointed out as one factor motivating the study of the interplay between ER structure and performance. In the revised manuscript, we plan to include a brief discussion potentially linking malfunction of ER morphogens to luminal transport, integrating additional freshly published data.

      Reviewer #2 (Public Review):


      This study explores theoretically the consequences of structural fluctuations of the endoplasmic reticulum (ER) morphology called contractions on molecular transport. Most of the manuscript consists of the construction of an interesting theoretical flow field (physical model) under various hypothetical assumptions. The computational modeling is followed by some simulations.


      The authors are focusing their attention on testing the hypothesis that a local flow in the tubule could be driven by tubular pinching. We recall that trafficking in the ER is considered to be mostly driven by diffusion at least at a spatial scale that is large enough to account for averaging of any random flow occurring from multiple directions [note that this is not the case for plants].

      We thank the reviewer. We have indeed explored here the possibilities of active transport, focusing especially on transport over the length scale of single tubules, as a result of structural fluctuations, and found tubular pinching to be ineffective compared to e.g. peripheral sheets fluctuations. In the revised version we plan to add text mentioning what is known about the ER in plants.


      The manuscript extensively details the construction of the theoretical model, occupying a significant portion of the manuscript. While this section contains interesting computations, its relevance and utility could be better emphasized, perhaps warranting a reorganization of the manuscript to foreground this critical aspect.

      Overall, the manuscript appears highly technical with limited conclusive insights, particularly lacking predictions confirmed by experimental validation. There is an absence of substantial conclusions regarding molecular trafficking within the ER.

      We sought to balance the theoretical/computational details of our model with the biophysical conclusions drawn from its predictions. Given the model's complexity and novelty, it was essential to elucidate the theoretical underpinnings comprehensively, in order to allow others to implement it in the future with additional, or different, parameters. To maintain clarity and focus in the main text, we have judiciously relegated extensive technical details to the methods section or supplementary materials, and divided the text into stand-alone section headings allowing the reader to skip through to conclusions.

      The primary focus of our manuscript is to introduce and explore, via our theoretical model, the interplay between ER structure dynamics and molecular transport. Our approach, while in silico, generates concrete predictions about the physical processes underpinning luminal motion within the ER. For instance, our findings challenge the previously postulated role of small tubular contractions in driving luminal flow, instead highlighting the potential significance of local flat ER areas—empirically documented entities—for facilitating such motion.

      Furthermore, by deducing what type of transport may or may not occur within the range of possible ER structural fluctuations, our model offers detailed predictions designed to bridge the gap between theoretical insight and experimental verification. These predictions detail the spatial and temporal parameters essential for effective transport, delineating plausible values for these parameters. We hope that the model’s predictions will invite experimentalists to devise innovative methodologies to test them. We plan to introduce text edits to the revised version to clarify these.

    2. eLife assessment

      This study explores the physical principles underlying fluid flow and luminal transport within the endoplasmic reticulum; its important contribution is to highlight the strong physical constraints imposed by viscous dissipation in nanoscopic tubular networks. In particular, the work presents convincing evidence that commonly discussed mechanisms such as tubular contraction are unlikely to be at the origin of the observed transport velocities. As this study is solely theoretical and concerned with order of magnitude estimates, its main conclusions await experimental validation. The work will be of relevance to cell biologists and physicists interested in organelle dynamics.

    3. Reviewer #1 (Public Review):

      Theoretical principles of viscous fluid mechanics are used here to assess likely mechanisms of transport in the ER. A set of candidate mechanisms is evaluated, making good use of imaging to represent ER network geometries. Evidence is provided that the contraction of peripheral sheets provides a much more credible mechanism than the contraction of individual tubules, junctions, or perinuclear sheets.

      The work has been conducted carefully and comprehensively, making good use of underlying physical principles. There is a good discussion of the role of slip; sensible approximations (low volume fraction, small particle size, slender geometries, pragmatic treatment of boundary conditions) allow tractable and transparent calculations; clear physical arguments provide useful bounds; stochastic and deterministic features of the problem are well integrated.

      There are just a couple of areas where more discussion might be warranted, in my view.

      (1) The energetic cost of tubule contraction is estimated, but I did not see an equivalent estimate for the contraction of peripheral sheets. It might be helpful to estimate the energetic cost of viscous dissipation in generated flows at higher frequencies. The mechanism of peripheral sheet contraction is unclear: do ATP-driven mechanisms somehow interact with thermal fluctuations of membranes?

      (2) Mutations are mentioned in the abstract but not (as far as I could see) later in the manuscript. It would be helpful if any consequences for pathologies could be developed in the text.

    4. Reviewer #2 (Public Review):


      This study explores theoretically the consequences of structural fluctuations of the endoplasmic reticulum (ER) morphology called contractions on molecular transport. Most of the manuscript consists of the construction of an interesting theoretical flow field (physical model) under various hypothetical assumptions. The computational modeling is followed by some simulations.


      The authors are focusing their attention on testing the hypothesis that a local flow in the tubule could be driven by tubular pinching. We recall that trafficking in the ER is considered to be mostly driven by diffusion at least at a spatial scale that is large enough to account for averaging of any random flow occurring from multiple directions [note that this is not the case for plants].


      The manuscript extensively details the construction of the theoretical model, occupying a significant portion of the manuscript. While this section contains interesting computations, its relevance and utility could be better emphasized, perhaps warranting a reorganization of the manuscript to foreground this critical aspect.

      Overall, the manuscript appears highly technical with limited conclusive insights, particularly lacking predictions confirmed by experimental validation. There is an absence of substantial conclusions regarding molecular trafficking within the ER.

    1. Author Response

      The following is the authors’ response to the original reviews.


      Recommendation #1: Address potential confounds in the experimental design:

      (1a) Confounding factors between baseline and early learning. While the visual display of the curved line remains constant, there are at least three changes between these two phases: 1) the presence of reward feedback (the focus of the paper); 2) a perturbation introduced to draw a hidden, mirror-symmetric curved line; 3) instructions provided to use reward feedback to trace the line on the screen (intentionally deceitful). As such, it remains unclear which of these factors are driving the changes in both behavior and BOLD signals between the two phases. The absence of a veridical feedback phase in which participants received reward feedback associated with the shown trajectory seems like a major limitation.

      (1b) Confounding Factors Between Early and Late Learning. While the authors have focused on interpreting changes from early to late due to the explore-exploit trade-off, there are three additional factors possibly at play: 1) increasing fatigue, 2) withdrawal of attention, specifically related to individuals who have either successfully learned the perturbation within the first few trials or those who have simply given up, or 3) increasing awareness of the perturbation (it is not clear whether subjective reports about perturbation awareness were measured). I understand that fMRI research is resource-intensive; however, it is not clear how to rule out these alternatives with their existing data without additional control groups. [Another reviewer added the following: Why did the authors not acquire data during a control condition? How can we be confident that the neural dynamics observed are not due to the simple passage of time? Or if these effects are due to the task, what drives them? The reward component, the movement execution, increased automaticity?]

      We have opted to address both of these points above within a single reply, as together they suggest potential confounding factors across the three phases of the task. We would agree that, if the results of our pairwise comparisons (e.g., Early > Baseline or Late > Early) were considered in isolation from one another, then these critiques of the study would be problematic. However, when considering the pattern of effects across the three task phases, we believe most of these critiques can be dismissed. Below, we first describe our results in this context, and then discuss how they address the reviewers’ various critiques.

      Recall that from Baseline to Early learning, we observe an expansion of several cortical areas (e.g., core regions in the DMN) along the manifold (red areas in Fig. 4A, see manifold shifts in Fig. 4C) that subsequently exhibit contraction during Early to Late learning (blue areas in Fig. 4B, see manifold shifts in Fig. 4D). We show this overlap in brain areas in Author response image 1 below, panel A. Notably, several of these brain areas appear to contract back to their original, Baseline locations along the manifold during Late learning (compare Fig. 4C and D). This is evidenced by the fact that many of these same regions (e.g., DMN regions, in Author response image 1 panel A below) fail to show a significant difference between the Baseline and Late learning epochs (see Author response image 1 panel B below, which is taken from supplementary Fig 6). That is, the regions that show significant expansion and subsequent contraction (in Author response image 1 panel A below) tend not to overlap with the regions that significantly changed over the time course of the task (in Author response image 1 panel B below).

      Author response image 1.

      Note that this basic observation above is not only true of our regional manifold eccentricity data, but also of the underlying functional connectivity data associated with individual brain regions. To make this second point clearer, we have modified and annotated our Fig. 5 and included it below. Note the reversal in seed-based functional connectivity from Baseline to Early learning (leftmost brain plots) compared to Early to Late learning (rightmost brain plots). That is, it is generally the case that for each seed-region (A-C) the areas that increase in seed-connectivity with the seed region (in red; leftmost plot) are also the areas that decrease in seed-connectivity with the seed region (in blue; rightmost plot), and vice versa. [Also note that these connectivity reversals are conveyed through the eccentricity data — the horizontal red line in the rightmost plots denotes the mean eccentricity of these brain regions during the Baseline phase, helping to highlight the fact that the eccentricity of the Late learning phase reverses back towards this Baseline level].

      Author response image 2.

      Critically, these reversals in brain connectivity noted above directly counter several of the critiques noted by the reviewers. For instance, this reversal pattern of effects argues against the idea that our results during Early Learning can be simply explained by the (i) presence of reward feedback, (ii) presence of the perturbation or (iii) instructions to use reward feedback to trace the path on the screen. Indeed, all of these factors are also present during Late learning, and yet many of the patterns of brain activity during this time period revert back to the Baseline patterns of connectivity, where these factors are absent. Similarly, this reversal pattern strongly refutes the idea that the effects are simply due to the passage of time, increasing fatigue, or general awareness of the perturbation. Indeed, if any of these factors alone could explain the data, then we would have expected a gradual increase (or decrease) in eccentricity and connectivity from Baseline to Early to Late learning, which we do not observe. We believe these are all important points for interpreting the data, which we failed to mention in our original manuscript when discussing our findings.

      We have now rectified this in the revised paper, where we now write in our Discussion:

      “Finally, it is important to note that the reversal pattern of effects noted above suggests that our findings during learning cannot be simply attributed to the introduction of reward feedback and/or the perturbation during Early learning, as both of these task-related features are also present during Late learning. In addition, these results cannot be simply explained due to the passage of time or increasing subject fatigue, as this would predict a consistent directional change in eccentricity across the Baseline, Early and Late learning epochs.”
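
      The logic of this argument can be made concrete with a minimal sketch. It is an illustration only and not the statistical analysis used in the study, which relied on pairwise contrasts between epochs; the function name, the tolerance parameter and the example values are all hypothetical. A region whose eccentricity rises and then falls (or falls and then rises) is labelled a reversal, whereas a drift driven by time or fatigue alone would be expected to appear as a monotonic change.

```python
def classify_pattern(baseline, early, late, tol=0.0):
    """Label the change in a measure across three task epochs.

    Returns "reversal" when the Baseline-to-Early and Early-to-Late changes
    have opposite signs, "monotonic" when they share a sign, and "flat" when
    either change is within the tolerance.
    """
    first_change = early - baseline
    second_change = late - early
    if abs(first_change) <= tol or abs(second_change) <= tol:
        return "flat"
    if first_change * second_change < 0:
        return "reversal"
    return "monotonic"

# Hypothetical eccentricity values (Baseline, Early, Late) for two regions.
expansion_then_contraction = classify_pattern(1.0, 2.0, 1.1)  # "reversal"
steady_drift = classify_pattern(1.0, 1.5, 2.0)                # "monotonic"
```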

      However, having said the above, we acknowledge that one potential factor that our findings cannot exclude is that they are (at least partially) attributable to changes in subjects’ state of attention throughout the task. Indeed, one can certainly argue that Baseline trials in our study don’t require a great deal of attention (after all, subjects are simply tracing a curved path presented on the screen). Likewise, for subjects that have learned the hidden shape, the Late learning trials are also likely to require limited attentional resources (indeed, many subjects at this point are simply producing the same shape trial after trial). Consequently, the large shift in brain connectivity that we observe from Baseline to Early Learning, and the subsequent reversion back to Baseline-levels of connectivity during Late learning, could actually reflect a heightened allocation of attention as subjects are attempting to learn the (hidden) rewarded shape. However, we do not believe that this would reflect a ‘confound’ of our study per se — indeed, any subject who has participated in a motor learning study would agree that the early learning phase of a task is far more cognitively demanding than Baseline trials and Late learning trials. As such, it is difficult to disentangle this ‘attention’ factor from the learning process itself (and in fact, it is likely central to it).

      Of course, one could have designed a ‘control’ task in which subjects must direct their attention to something other than the learning task itself (e.g., a divided attention paradigm; Taylor & Thoroughman, 2007, 2008) and/or perform a secondary task concurrently (Codol et al., 2018; Holland et al., 2018), but we know that this type of manipulation impairs the learning process itself. Thus, in such a case, it wouldn’t be obvious to the experimenter what they are actually measuring in brain activity during such a task. And, to extend this argument even further, it is true that any sort of brain-based modulation can be argued to reflect some ‘attentional’ process, rather than modulations related to the specific task-based process under consideration (in our case, motor learning). In this regard, we are sympathetic to the views of Richard Andersen and colleagues who have eloquently stated that “The study of how attention interacts with other neural processing systems is a most important endeavor. However, we think that over-generalizing attention to encompass a large variety of different neural processes weakens the concept and undercuts the ability to develop a robust understanding of other cognitive functions.” (Andersen & Cui, 2007, Neuron). In short, it appears that different fields/researchers have alternate views on the usefulness of attention as an explanatory construct (see also articles from Hommel et al., 2019, “No one knows what attention is”, and Wu, 2023, “We know what attention is!”), and we personally don’t have a dog in this fight. We only highlight these issues to draw attention (no pun intended) to the fact that it is not trivial to separate these different neural processes during a motor learning study.

      Nevertheless, we do believe these are important points worth flagging for the reader in our paper, as they might have similar questions. To this end, we have now included in our Discussion section the following text:

      “It is also possible that some of these task-related shifts in connectivity relate to shifts in task-general processes, such as changes in the allocation of attentional resources (Bédard and Song, 2013; Rosenberg et al., 2016) or overall cognitive engagement (Aben et al., 2020), which themselves play critical roles in shaping learning (Codol et al., 2018; Holland et al., 2018; Song, 2019; Taylor and Thoroughman, 2008, 2007; for a review of these topics, see Tsay et al., 2023). Such processes are particularly important during the earlier phases of learning when sensorimotor contingencies need to be established. While these remain questions for future work, our data nevertheless suggest that this shift in connectivity may be enabled through the PMC.”

      Finally, we should note that, at the end of testing, we did not assess participants' awareness of the manipulation (i.e., that they were, in fact, being rewarded based on a mirror image path). In hindsight, this would have been a good idea and provided some value to the current project. Nevertheless, it seems clear, based on several of the learning profiles observed (e.g., subjects who exhibited very rapid learning during the Early Learning phase, more on this below), that many individuals became aware of a shape approximating the rewarded path. Note that we have included new figures (see our responses below) that give a better example of what fast versus slower learning looks like. In addition, we now note in our Methods that we did not probe participants about their subjective awareness of the perturbation:

      “Note that, at the end of testing, we did not assess participants’ awareness of the manipulation (i.e., that they were, in fact, being rewarded based on a mirror image path of the visible path).”

      Recommendation #2: Provide more behavioral quantification.

      (2a) The authors chose to only plot the average learning score in Figure 1D, without an indication of movement variability. I think this is quite important, to give the reader an impression of how variable the movements were at baseline, during early learning, and over the course of learning. There is evidence that baseline variability influences the 'detectability' of imposed rotations (in the case of adaptation learning), which could be relevant here. Shading the plots by movement variability would also be important to see if there was some refinement of the movement after participants performed at the ceiling (which seems to be the case ~ after trial 150). This is especially worrying given that in Fig 6A there is a clear indication that there is a large difference between subjects' solutions on the task. One subject exhibits almost a one-shot learning curve (reaching a score of 75 after one or two trials), whereas others don't seem to really learn until near the end. What does this between-subject variability mean for the authors' hypothesized neural processes?

      In line with these recommendations, we have now provided much better behavioral quantification of subject-level performance in both the main manuscript and supplementary material. For instance, in a new supplemental Figure 1 (shown below), we now include mean subject (+/- SE) reaction times (RTs), movement times (MTs) and movement path variability (our computation of these measures is now defined in our Methods section).

      As can be seen in the figure, all three of these variables tended to decrease over the course of the study, though we note there was a noticeable uptick in both RTs and MTs from the Baseline to Early learning phase, once subjects started receiving trial-by-trial reward feedback based on their movements. With respect to path variability, it is not obvious that there was a significant refinement of the paths created during late learning (panel D below), though there was certainly a general trend for path variability to decrease over learning.

      Author response image 3.

      Behavioral measures of learning across the task. (A-D) shows average participant reward scores (A), reaction times (B), movement times (C) and path variability (D) over the course of the task. In each plot, the black line denotes the mean across participants and the gray banding denotes +/- 1 SEM. The three equal-length task epochs for subsequent neural analyses are indicated by the gray shaded boxes.

      In addition to these above results, we have also created a new Figure 6 in the main manuscript, which now solely focuses on individual differences in subject learning (see below). Hopefully, this figure clarifies key features of the task and its reward structure, and also depicts (in movement trajectory space) what fast versus slow learning looks like in the task. Specifically, we believe that this figure now clearly delineates for the reader the mapping between movement trajectory and the reward score feedback presented to participants, which appeared to be a source of confusion based on the reviewers’ comments below. As can be clearly observed in this figure, trajectories that approximated the ‘visible path’ (black line) resulted in fairly mediocre scores (see score color legend at right), whereas trajectories that approximated the ‘reward path’ (dashed black line, see trials 191-200 of the fast learner) resulted in fairly high scores. This figure also more clearly delineates how fPCA loadings derived from our functional data analysis were used to derive subject-level learning scores (panel C).

      Author response image 4.

      Individual differences in subject learning performance. (A) Examples of a good learner (bordered in green) and poor learner (bordered in red). (B) Individual subject learning curves for the task. Solid black line denotes the mean across all subjects whereas light gray lines denote individual participants. The green and red traces denote the learning curves for the example good and poor learners denoted in A. (C) Derivation of subject learning scores. We performed functional principal component analysis (fPCA) on subjects’ learning curves in order to identify the dominant patterns of variability during learning. The top component, which encodes overall learning, explained the majority of the observed variance (~75%). The green and red bands denote the effect of positive and negative component scores, respectively, relative to mean performance. Thus, subjects who learned more quickly than average have a higher loading (in green) on this ‘Learning score’ component than subjects who learned more slowly (in red) than average. The plot at right denotes the loading for each participant (open circles) onto this Learning score component.

      The reviewers note that there are large individual differences in learning performance across the task. This was clearly our hope when designing the reward structure of this task, as it would allow us to further investigate the neural correlates of these individual differences (indeed, during pilot testing, we sought out a reward structure to the task that would allow for these intersubject differences). The subjects who learn early during the task end up having higher fPCA scores than the subjects who learn more gradually (or learn the task late). From our perspective, these differences are a feature, and not a bug, and they do not negate any of our original interpretations. That is, subjects who learn earlier on average tend to contract their DAN-A network during the early learning phase whereas subjects who learn more slowly on average (or learn late) instead tend to contract their DAN-A network during late learning (Fig. 7).

      (2b) In the methods, the authors stated that they scaled the score such that even a perfectly traced visible path would always result in an imperfect score of 40 points. What happens if a subject scores perfectly on the first try (which seemed to have happened for the green highlighted subject in Fig 6A), but is then permanently confronted with a score of 40 or below? Wouldn't this result in an error-clamp-like (error-based motor adaptation) design for this subject and all other high performers, which would vastly differ from the task demands for the other subjects? How did the authors factor in the wide between-subject variability?

      We think the reviewers may have misinterpreted the reward structure of the task, and we apologize for not being clearer in our descriptions. The reward score that subjects received after each trial was based on how well they traced the mirror-image of the visible path. However, all the participant can see on the screen is the visible path. We hope that our inclusion of the new Figure 6 (shown above) makes the reward structure of the task, and its relationship to movement trajectories, much clearer. We should also note that, even for the highest performing subject (denoted in Fig. 6), it still required approximately 20 trials for them to reach asymptotic performance.

      (2c) The study would benefit from a more detailed description of participants' behavioral performance during the task. Specifically, it is crucial to understand how participants' motor skills evolve over time. Information on changes in movement speed, accuracy, and other relevant behavioral metrics would enhance the understanding of the relationship between behavior and brain activity during the learning process. Additionally, please clarify whether the display on the screen was presented continuously throughout the entire trial or only during active movement periods. Differences in display duration could potentially impact the observed differences in brain activity during learning.

      We hope that our inclusion of the new Supplementary Figure 1 (shown above) addresses the reviewers’ recommendation. Generally, we find that RTs, MTs and path variability all decrease over the course of the task. We think this relates to the early learning phase being more attentionally demanding, and requiring more conscious effort, than the later learning phases.

      Also, yes, the visible path was displayed on the screen continuously throughout the trial, and only disappeared at the 4.5 second mark of each trial (when the screen was blanked and the data was saved off for 1.5 seconds prior to commencement of the next trial; 6 seconds total per trial). Thus, there were no differences in display duration across trials and phases of the task. We have now clarified this in the Methods section, where we now write the following:

      “When the cursor reached the target distance, the target changed color from red to green to indicate that the trial was completed. Importantly, other than this color change in the distance marker, the visible curved path remained constant and participants never received any feedback about the position of their cursor.”

      (2d) It is unclear from plots 6A, 6B, and 1D how the scale of the behavioral data matches with the scaling of the scores. Are these the 'real' scores, meaning 100 on the y-axis would be equivalent to 40 in the task? Why then do all subjects reach an asymptote at 75? Or is 75 equivalent to 40 and the axis labels are wrong?

      As indicated above, we clearly did a poor job of describing the reward structure of our task in our original paper, and we now hope that our inclusion of Figure 6 makes things clear. A ‘40’ score on the y-axis would indicate that a subject has perfectly traced the visible path whereas a perfect ‘100’ score would indicate that a subject has perfectly traced the (hidden) mirror image path.

      The fact that several of the subjects reach asymptote around 75 is likely a byproduct of two factors. Firstly, the subjects performed their movements in the absence of any visual error feedback (they could not see the position of a cursor that represented their hand position), which had the effect of increasing motor variability in their actions from trial to trial. Secondly, there appears to be an underestimation among subjects regarding the curvature of the concealed, mirror-image path (i.e., that the rewarded path actually had an equal but opposite curvature to that of the visible path). This is particularly evident in the case of the top-performing subject (illustrated in Figure 6A) who, even during late learning, failed to produce a completely arched movement.

      (2e) Labeling of Contrasts: There is a consistent issue with the labeling of contrasts in the presented figures, causing confusion. While the text refers to the difference as "baseline to early learning," the label used in figures, such as Figure 4, reads "baseline > early." It is essential to clarify whether the presented contrast is indeed "baseline > early" or "early > baseline" to avoid any misinterpretation.

      We thank the reviewers for catching this error. Indeed, the intended label was Early > Baseline, and this has now been corrected throughout.

      Recommendation #3. Clarify which motor learning mechanism(s) are at play.

      (3a) Participants were performing at a relatively low level, achieving around 50-60 points by the end of learning. This outcome may not be that surprising, given that reward-based learning might have a substantial explicit component and may also heavily depend on reasoning processes, beyond reinforcement learning or contextual recall (Holland et al., 2018; Tsay et al., 2023). Even within our own data, where explicit processes are isolated, average performance is low and many individuals fail to learn (Brudner et al., 2016; Tsay et al., 2022). Given this, many participants in the current study may have simply given up. A potential indicator of giving up could be a subset of participants moving straight ahead in a rote manner (a heuristic to gain moderate points). Consequently, alterations in brain networks may not reflect exploration and exploitation strategies but instead indicate levels of engagement and disengagement. Could the authors plot the average trajectory and the average curvature changes throughout learning? Are individuals indeed defaulting to moving straight ahead in learning, corresponding to an average of 50-60 points? If so, the interpretation of brain activity may need to be tempered.

      We can do one better, and actually give you a sense of the learning trajectories for every subject over time. In the figure below, which we now include as Supplementary Figure 2 in our revision, we have plotted, for each subject, a subset of their movement trajectories across learning trials (every 10 trials). As can be seen in the diversity of these trajectories, the average trajectory and average curvature would do a fairly poor job of describing the pattern of learning-related changes across subjects. Moreover, it is not obvious from looking at these plots the extent to which poor learning subjects (i.e., subjects who never converge on the reward path) actually ‘give up’ in the task — rather, many of these subjects still show some modulation (albeit minor) of their movement trajectories in the later trials (see the purple and pink traces). As an aside, we are also not entirely convinced that straight ahead movements, which we don’t find many of in our dataset, can be taken as direct evidence that the subject has given up.

      Author response image 5.

      Variability in learning across subjects. Plots show representative trajectory data from each subject (n=36) over the course of the 200 learning trials. Coloured traces show individual trials over time (each trace is separated by ten trials, e.g., trial 1, 10, 20, 30, etc.) to give a sense of the trajectory changes throughout the task (20 trials in total are shown for each subject).

      We should also note that we are not entirely opposed to the idea of describing aspects of our findings in terms of subject engagement versus disengagement over time, as such processes are related at some level to exploration (i.e., cognitive engagement in finding the best solution) and exploitation (i.e., cognitively disengaging and automating one’s behavior). As noted in our reply to Recommendation #1 above, we now give some consideration of these explanations in our Discussion section, where we now write:

      “It is also possible that these task-related shifts in connectivity relate to shifts in task-general processes, such as changes in the allocation of attentional resources (Bédard and Song, 2013; Rosenberg et al., 2016) or overall cognitive engagement (Aben et al., 2020), which themselves play critical roles in shaping learning (Codol et al., 2018; Holland et al., 2018; Song, 2019; Taylor and Thoroughman, 2008, 2007; for a review of these topics, see Tsay et al., 2023). Such processes are particularly important during the earlier phases of learning when sensorimotor contingencies need to be established. While these remain questions for future work, our data nevertheless suggest that this shift in connectivity may be enabled through the PMC.”

      (3b) The authors are mixing two commonly used paradigms, reward-based learning, and motor adaptation, but provide no discussion of the different learning processes at play here. Which processes were they attempting to probe? Making this explicit would help the reader understand which brain regions should be implicated based on previous literature. As it stands, the task is hard to interpret. Relatedly, there is a wealth of literature on explicit vs implicit learning mechanisms in adaptation tasks now. Given that the authors are specifically looking at brain structures in the cerebral cortex that are commonly associated with explicit and strategic learning rather than implicit adaptation, how do the authors relate their findings to this literature? Are the learning processes probed in the task more explicit, more implicit, or is there a change in strategy usage over time? Did the authors acquire data on strategies used by the participants to solve the task? How does the baseline variability come into play here?

      As noted in our paper, our task was directly inspired by the reward-based motor learning tasks developed by Dam et al., 2013 (Plos One) and Wu et al., 2014 (Nature Neuroscience). What drew us to these tasks is that they allowed us to study the neural bases of reward-based learning mechanisms in the absence of subjects also being able to exploit error-based mechanisms to achieve learning. Indeed, when first describing the task in the Results section of our paper we wrote the following:

      “Importantly, because subjects received no visual feedback about their actual finger trajectory and could not see their own hand, they could only use the score feedback — and thus only reward-based learning mechanisms — to modify their movements from one trial to the next (Dam et al., 2013; Wu et al., 2014).”

      If the reviewers are referring to ‘motor adaptation’ in the context in which that terminology is commonly used — i.e., the use of sensory prediction errors to support error-based learning — then we would argue that motor adaptation is not a feature of the current study. It is true that in our study subjects learn to ‘adapt’ their movements across trials, but this shaping of the movement trajectories must be supported through reinforcement learning mechanisms (and, of course, supplemented by the use of cognitive strategies as discussed in the nice review by Tsay et al., 2023). We apologize for not being clearer in our paper about this key distinction and we have now included new text in the introduction to our Results to directly address this:

      “Importantly, because subjects received no visual feedback about their actual finger trajectory and could not see their own hand, they could only use the score feedback — and thus only reward-based learning mechanisms — to modify their movements from one trial to the next (Dam et al., 2013; Wu et al., 2014). That is, subjects could not use error-based learning mechanisms to achieve learning in our study, as this form of learning requires sensory errors that convey both the change in direction and magnitude needed to correct the movement.”

      With this issue aside, we are well aware of the established framework for thinking about sensorimotor adaptation as being composed of a combination of explicit and implicit components (indeed, this has been a central feature of several of our other recent neuroimaging studies that have explored visuomotor rotation learning, e.g., Gale et al., 2022 PNAS, Areshenkoff et al., 2022 eLife, Standage et al., 2023 Cerebral Cortex). However, there has been comparatively little work done on these parallel components within the domain of reinforcement learning tasks (though see Codol et al., 2018; Holland et al., 2018; van Mastrigt et al., 2023; see also the Tsay et al., 2023 review), and as far as we can tell, nothing has been done to date in the reward-based motor learning area using fMRI. By design, we avoided using descriptors of ‘explicit’ or ‘implicit’ in our study because our experimental paradigm did not allow a separate measurement of those two components of learning during the task. Nevertheless, it seems clear to us from examining the subjects’ learning curves (see Supplementary Figure 2 above) that individuals who learn very quickly are using strategic processes (such as action exploration to identify the best path) to enhance their learning. As we noted in an above response, we did not query subjects after the fact about their strategy use, which admittedly was a missed opportunity on our part.

      Author response image 6.

      With respect to the comment on baseline variability and its relationship to performance, this is an interesting idea and one that was explored in the Wu et al., 2014 Nature Neuroscience paper. Prompted by the reviewers, we have now explored this idea in the current data set by testing for a relationship between movement path variability during baseline trials (all 70 baseline trials, see Supplementary Figure 1D above for reference) and subjects’ fPCA score on our learning task. However, when we performed this analysis, we did not observe a significant positive relationship between baseline variability and subject performance. Rather, we actually found a trend towards a negative relationship (though this was non-significant; r=-0.2916, p=0.0844). Admittedly, we are not sure what conclusions can be drawn from this analysis, and in any case, we believe it to be tangential to our main results. We provide the results (at right) for the reviewers if they are interested. This may be an interesting avenue for exploration in future work.
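
      For readers who want to see how the reported correlation and its p-value relate, the sketch below converts a Pearson r into a t statistic and a two-sided p-value using only the Python standard library. It is an illustration rather than the analysis code used for the study: the sample size of 36 is taken from the caption of the subject trajectory figure and is assumed to apply to this correlation, and the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the density with Simpson's rule.

    The integral runs over [0, |t|]; steps must be even.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_mass


def pearson_r_to_p(r, n):
    """Return (t statistic, two-sided p) for a Pearson r from n pairs."""
    df = n - 2
    t_stat = r * math.sqrt(df / (1 - r * r))
    return t_stat, two_sided_p(t_stat, df)


# r is the value reported in the text; n = 36 is an assumption (see lead-in).
t_stat, p_value = pearson_r_to_p(-0.2916, 36)
print(f"t({36 - 2}) = {t_stat:.3f}, two-sided p = {p_value:.4f}")
```

      Under these assumptions the t statistic is about -1.78 on 34 degrees of freedom, and the resulting two-sided p-value falls between 0.08 and 0.09, in line with the non-significant trend described in the text.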

      Recommendation #4: Provide stronger justification for brain imaging methods.

      (4a) Observing how brain activity varies across these different networks is remarkable, especially how sensorimotor regions separate and then contract with other, more cognitive areas. However, does the signal-to-noise ratio in each area/network influence manifold eccentricity and limit the possible changes in eccentricity during learning? Specifically, if a region has a low signal-to-noise ratio, it might exhibit minimal changes during learning (a phenomenon perhaps relevant to null manifold changes in the striatum due to low signal-to-noise); conversely, regions with higher signal-to-noise (e.g., motor cortex in this sensorimotor task) might exhibit changes more easily detected. As such, it is unclear how to interpret manifold changes without considering an area/network's signal-to-noise ratio.

      We appreciate where these concerns are coming from. First, we should note that the timeseries data used in our analysis were z-transformed (mean zero, 1 std) to allow normalization of the signal both over time and across regions (and thus mitigate the possibility that the changes observed could simply reflect mean overall signal changes across different regions). Nevertheless, differences in signal intensity across brain regions — particularly between cortex and striatum — are well-known, though it is not obvious how these differences may manifest in terms of a task-based modulation of MR signals.
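
      As a minimal sketch of what this z-transformation does, the example below rescales two regional time series with different raw baselines to mean zero and unit standard deviation. The signal values and the function name are hypothetical, and the choice of the population standard deviation is an assumption, since the estimator used in the study is not stated here.

```python
from statistics import fmean, pstdev


def zscore(timeseries):
    """Rescale a time series to mean 0 and standard deviation 1."""
    mu = fmean(timeseries)
    sigma = pstdev(timeseries)  # population SD (assumed estimator)
    if sigma == 0:
        raise ValueError("a constant time series cannot be z-scored")
    return [(x - mu) / sigma for x in timeseries]


# Hypothetical raw signals (arbitrary units) with different baselines.
cortex_raw = [620.0, 630.0, 625.0, 615.0, 635.0]
striatum_raw = [448.0, 452.0, 450.0, 447.0, 453.0]

cortex_z = zscore(cortex_raw)
striatum_z = zscore(striatum_raw)

# After the transform both regions share the same mean and spread, so
# differences in raw intensity no longer enter a correlation between them.
print(round(fmean(cortex_z), 6), round(pstdev(cortex_z), 6))
print(round(fmean(striatum_z), 6), round(pstdev(striatum_z), 6))
```

      The transform removes differences in mean and scale between regions, but it does not remove differences in how much of each region's variance is noise, which is why the tSNR comparison that follows is still informative.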

      To examine this issue in the current data set, we extracted, for each subject and time epoch (Baseline, Early and Late learning), the raw scanner data (in MR arbitrary units, a.u.) for the cortical and striatal regions and computed (1) the mean signal intensity, (2) the standard deviation of the signal (Std) and (3) the temporal signal-to-noise ratio (tSNR; calculated as mean/Std). Note that in the fMRI connectivity literature tSNR is often the preferred SNR measure as it normalizes the mean signal based on the signal’s variability over time, thus providing a general measure of overall ‘signal quality’. The results of this analysis, averaged across subjects and regions, are shown below.
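
      The three quantities described above can be sketched as follows for a single region. This is an illustrative outline only: the time series, the epoch boundaries and the function names are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import fmean, stdev


def tsnr(timeseries):
    """Return (mean, Std, tSNR) for one raw time series, tSNR = mean / Std."""
    mean_signal = fmean(timeseries)
    std_signal = stdev(timeseries)  # sample SD over time (assumed estimator)
    return mean_signal, std_signal, mean_signal / std_signal


def epoch_tsnr(timeseries, epochs):
    """Compute (mean, Std, tSNR) separately for each named epoch.

    epochs maps an epoch name to (start, stop) volume indices.
    """
    return {name: tsnr(timeseries[start:stop])
            for name, (start, stop) in epochs.items()}


# Hypothetical raw signal (a.u.) for one region, three volumes per epoch.
signal = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 97.0, 100.0]
epochs = {"Baseline": (0, 3), "Early": (3, 6), "Late": (6, 9)}

for name, (m, s, ratio) in epoch_tsnr(signal, epochs).items():
    print(f"{name}: mean={m:.1f}, Std={s:.2f}, tSNR={ratio:.1f}")
```

      In a full analysis these per-region, per-epoch values would then be averaged across regions and subjects, as described in the text.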

      Author response image 7.

      Note that, as expected, the overall signal intensity (left plot) of cortex is higher than in the striatum, reflecting the closer proximity of cortex to the receiver coils in the MR head coil. In fact, the signal intensity in cortex is approximately 38% higher than that in the striatum (i.e., (~625 - 450)/450). However, the signal variation in cortex is also greater than in striatum (middle plot), in this case by approximately 100% (i.e., (~5 - 2.5)/2.5). The result is that the tSNR (mean/Std) for our data set and the ROI parcellations we used is actually greater in the striatum than in cortex (right plot). Thus, all else being equal, there seems to have been sufficient tSNR in the striatum for us to have detected motor-learning related effects. As such, we suspect the null effects for the striatum in our study actually stem from two sources.
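
      The arithmetic behind these comparisons can be checked with a few lines of Python. The inputs are the approximate values quoted in the text, not the underlying data, and taking the ratio of the averaged mean and averaged Std only approximates a tSNR that was computed per region and then averaged.

```python
# Approximate values quoted in the text (MR arbitrary units).
cortex_mean, striatum_mean = 625.0, 450.0
cortex_std, striatum_std = 5.0, 2.5


def percent_greater(a, b):
    """Percentage by which a exceeds b."""
    return 100.0 * (a - b) / b


mean_excess = percent_greater(cortex_mean, striatum_mean)
std_excess = percent_greater(cortex_std, striatum_std)

# Ratio of the averaged quantities; an approximation to the averaged tSNR.
cortex_tsnr = cortex_mean / cortex_std
striatum_tsnr = striatum_mean / striatum_std

print(f"mean signal: cortex exceeds striatum by {mean_excess:.1f}%")
print(f"signal Std:  cortex exceeds striatum by {std_excess:.1f}%")
print(f"approximate tSNR: cortex {cortex_tsnr:.0f}, striatum {striatum_tsnr:.0f}")
```

      Because the variability in cortex exceeds that in striatum by a larger proportion than the mean signal does, the ratio comes out higher for the striatum (roughly 180 versus 125 with these rounded inputs).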

      The first likely source is the relatively lower number of striatal regions (12) as compared to cortical regions (998) used in our analysis, coupled with our use of PCA on these data (which, by design, identifies the largest sources of variation in connectivity). In future studies, this imbalance could be rectified by using finer parcellations of the striatum (even down to the voxel level) while keeping the same parcellation of cortex (i.e., equate the number of ‘regions’ in each of striatum and cortex). The second likely source is our use of a striatal atlas (the Harvard-Oxford atlas) that divides brain regions based on their neuroanatomy rather than their function. In future work, we plan on addressing this latter concern by using finer, more functionally relevant parcellations of striatum (such as in Tian et al., 2020, Nature Neuroscience). Note that we sought to capture these interrelated possible explanations in our Discussion section, where we wrote the following:

      “While we identified several changes in the cortical manifold that are associated with reward-based motor learning, it is noteworthy that we did not observe any significant changes in manifold eccentricity within the striatum. While clearly the evidence indicates that this region plays a key role in reward-guided behavior (Averbeck and O’Doherty, 2022; O’Doherty et al., 2017), there are several possible reasons why our manifold approach did not identify this collection of brain areas. First, the relatively small size of the striatum may mean that our analysis approach was too coarse to identify changes in the connectivity of this region. Though we used a 3T scanner and employed a widely-used parcellation scheme that divided the striatum into its constituent anatomical regions (e.g., hippocampus, caudate, etc.), both of these approaches may have obscured important differences in connectivity that exist within each of these regions. For example, areas such as the hippocampus and caudate are not homogenous areas but themselves exhibit gradients of connectivity (e.g., head versus tail) that can only be revealed at the voxel level (Tian et al., 2020; Vos de Wael et al., 2021). Second, while our dimension reduction approach, by design, aims to identify gradients of functional connectivity that account for the largest amounts of variance, the limited number of striatal regions (as compared to cortex) necessitates that their contribution to the total whole-brain variance is relatively small. Consistent with this perspective, we found that the low-dimensional manifold architecture in cortex did not strongly depend on whether or not striatal regions were included in the analysis (see Supplementary Fig. 6). As such, selective changes in the patterns of functional connectivity at the level of the striatum may be obscured using our cortex x striatum dimension reduction approach.
      Future work can help address some of these limitations by using both finer parcellations of striatal cortex (perhaps even down to the voxel level) (Tian et al., 2020) and by focusing specifically on changes in the interactions between the striatum and cortex during learning. The latter can be accomplished by selectively performing dimension reduction on the slice of the functional connectivity matrix that corresponds to functional coupling between striatum and cortex.”
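
      To make the idea of the striatum-by-cortex slice concrete, the sketch below extracts the off-diagonal block of a square connectivity matrix. It assumes, purely for illustration, that regions are ordered with all cortical regions first and striatal regions last; the function name and the toy matrix are hypothetical.

```python
def striatum_cortex_block(fc, n_cortex, n_striatum):
    """Return the striatum-by-cortex block of a square connectivity matrix.

    fc is a list of rows; regions are assumed to be ordered with the
    n_cortex cortical regions first and the n_striatum striatal regions last.
    """
    n = n_cortex + n_striatum
    if len(fc) != n or any(len(row) != n for row in fc):
        raise ValueError("fc must be square with n_cortex + n_striatum regions")
    # Striatal rows, cortical columns: one row of cortical coupling per
    # striatal region.
    return [row[:n_cortex] for row in fc[n_cortex:]]


# Toy example: 4 'cortical' and 2 'striatal' regions, entries encode (row, col).
n_cortex, n_striatum = 4, 2
toy_fc = [[10 * i + j for j in range(n_cortex + n_striatum)]
          for i in range(n_cortex + n_striatum)]

block = striatum_cortex_block(toy_fc, n_cortex, n_striatum)
print(len(block), len(block[0]))  # number of striatal rows, cortical columns
```

      With the 998 cortical and 12 striatal regions mentioned above, the same call would return a 12 x 998 block, and dimension reduction applied to that block alone would no longer be dominated by cortico-cortical variance.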

      (4b) Could the authors clarify how activity in the dorsal attention network (DAN) changes throughout learning, and how these changes also relate to individual differences in learning performance? Specifically, on average, the DAN seems to expand early and contract late, relative to the baseline. This is interpreted to signify that the DAN exhibits lesser connectivity followed by greater connectivity with other brain regions. However, in terms of how these changes relate to behavior, participants who go against the average trend (DAN exhibits more contraction early in learning, and expansion from early to late) seem to exhibit better learning performance. This finding is quite puzzling. Does this mean that the average trend of expansion and contraction is not facilitative, but rather detrimental, to learning? [Another reviewer added: The authors do not state any explicit hypotheses, but only establish that DMN coordinates activity among several regions. What predictions can we derive from this? What are the authors looking for in the data? The work seems more descriptive than hypothesis-driven. This is fine but should be clarified in the introduction.]

      These are good questions, and we are glad the reviewers appreciated the subtlety here. The reviewers are indeed correct that the relationship of the DAN-A network to behavioral performance appears to go against the grain of the group-level results that we found for the entire DAN network (which we note is composed of both the DAN-A and DAN-B networks). That is, subjects who exhibited greater contraction from Baseline to Early learning and likewise, greater expansion from Early to Late learning, tended to perform better in the task (according to our fPCA scores). However, on this point it is worth noting that it was mainly the DAN-B network which exhibited group-level expansion from Baseline to Early Learning whereas the DAN-A network exhibited negligible expansion. This can be seen in Author response image 8 below, which shows the pattern of expansion and contraction (as in Fig. 4), but instead broken down into the 17-network parcellation. The red asterisk denotes the expansion from Baseline to Early learning for the DAN-B network, which is much greater than that observed for the DAN-A network (which is basically around the zero difference line).

      Author response image 8.

      Thus, it appears that the DAN-A and DAN-B networks are modulated to a different extent during the task, which likely contributes to the perceived discrepancy between the group-level effects (reported using the 7-network parcellation) and the individual differences effects (reported using the finer 17-network parcellation). Based on the reviewers’ comments, this seems like an important distinction to clarify in the manuscript, and we have now described this nuance in our Results section where we now write:

      “...Using this permutation testing approach, we found that it was only the change in eccentricity of the DAN-A network that correlated with Learning score (see Fig. 7C), such that the more the DAN-A network decreased in eccentricity from Baseline to Early learning (i.e., contracted along the manifold), the better subjects performed at the task (see Fig. 7C, scatterplot at right). Consistent with the notion that changes in the eccentricity of the DAN-A network are linked to learning performance, we also found the inverse pattern of effects during Late learning, whereby the more that this same network increased in eccentricity from Early to Late learning (i.e., expanded along the manifold), the better subjects performed at the task (Fig. 7D). We should note that this pattern of performance effects for the DAN-A — i.e., greater contraction during Early learning and greater expansion during Late learning being associated with better learning — appears at odds with the group-level effects described in Fig. 4A and B, where we generally find the opposite pattern for the entire DAN network (composed of the DAN-A and DAN-B subnetworks). However, this potential discrepancy can be explained when examining the changes in eccentricity using the 17-network parcellation (see Supplementary Figure 8). At this higher resolution level we find that these group-level effects for the entire DAN network are being largely driven by eccentricity changes in the DAN-B network (areas in anterior superior parietal cortex and premotor cortex), and not by mean changes in the DAN-A network. By contrast, our present results suggest that it is the contraction and expansion of areas of the DAN-A network (and not DAN-B network) that are selectively associated with differences in subject learning performance.”

      Finally, re: the reviewers’ comments that we do not state any explicit hypotheses etc., we acknowledge that, beyond our general hypothesis stated at the outset about the DMN being involved in reward-based motor learning, our study is quite descriptive and exploratory in nature. So little work has been done in this research area (i.e., using manifold learning approaches to study motor learning with fMRI) that it would be disingenuous to have any stronger hypotheses than those stated in our Introduction. Thus, to make the exploratory nature of our study clear to the reader, we have added the following text (in red) to our Introduction:

      “Here we applied this manifold approach to explore how brain activity across widely distributed cortical and striatal systems is coordinated during reward-based motor learning. We were particularly interested in characterizing how connectivity between regions within the DMN and the rest of the brain changes as participants shift from learning the relationship between motor commands and reward feedback, during early learning, to subsequently using this information, during late learning. We were also interested in exploring whether learning-dependent changes in manifold structure relate to variation in subject motor performance.”

      We hope these changes now make the intention of our study clear.

      (4c) The paper examines a type of motor adaptation task with a reward-based learning component. This, to me, strongly implicates the cerebellum, given that it has a long-established crucial role in adaptation and has recently been implicated in reward-based learning (see work by Wagner & Galea). Why is there no mention of the cerebellum, and why was it left out of this study? This is especially puzzling given that the authors state in the abstract that they examine cortical and subcortical structures. It's evident from the methods that the authors either did not acquire data from the cerebellum or had too small a FOV to fully cover it (34 slices at 4 mm thickness gives 136 mm, which is likely a bit short to fully cover the cerebellum in many participants). What was the rationale behind this methodological choice? It would be good to clarify this for the reader. Related to this, the authors need to rephrase their statements on 'whole-brain' connectivity matrices or analyses - it is not whole-brain when it excludes the cerebellum.

      As we noted above, we do not believe this task to be a motor adaptation task, in the sense that subjects are not able to use sensory prediction errors (and thus error-based learning mechanisms) to improve their performance. Rather, because subjects are denied this sensory error feedback, they are only able to use reinforcement learning processes, along with cognitive strategies (nicely covered in Tsay et al., 2023), to improve performance. Nevertheless, we recognize that the cerebellum has been increasingly implicated in facets of reward-based learning, particularly within the rodent domain (e.g., Wagner et al., 2017; Heffley et al., 2018; Kostadinov et al., 2019, etc.). In our study, we did indeed collect data from the cerebellum but did not include it in our original analyses, as we wanted (1) the current paper to build on prior work in the human and macaque reward-learning domain (which focuses solely on striatum and cortex, and which rarely discusses cerebellum; see Averbeck & O’Doherty, 2022 and Klein-Flugge et al., 2022 for recent reviews), and (2) the cerebellum to be a more targeted focus of future work (specifically, we plan on focusing on striatal-cerebellar interactions during learning, which are hypothesized based on the neuroanatomical tract tracing work of Bostan and Strick, etc.). We hope the reviewers respect our decisions in this regard.

      Nevertheless, we acknowledge that our statements about ‘whole-brain’ connectivity, and our vagueness about what we mean by ‘subcortex’, may be confusing for the reader. We have now removed and/or corrected such references throughout the paper (however, note that in some cases it is difficult to avoid reference to “whole-brain”, e.g., “whole-brain correlation map” or “whole-brain false discovery rate correction”, which is standard terminology in the field).

      In addition, we are now explicit in our Methods section that the cerebellum was not included in our analyses.

      “Each volume comprised 34 contiguous (no gap) oblique slices acquired at a ~30° caudal tilt with respect to the plane of the anterior and posterior commissure (AC-PC), providing whole-brain coverage of the cerebrum and cerebellum. Note that for the current study, we did not examine changes in cerebellar activity during learning.”

      (4d) The authors centered the matrices before further analyses to remove variance associated with the subject. Why not run a PCA on the connectivity matrices and remove the PC that is associated with subject variance? What is the advantage of first centering the connectivity matrices? Is this standard practice in the field?

      Centering in some form has become reasonably common in the functional connectivity literature, as there is considerable evidence that task-related (or cognitive) changes in whole-brain connectivity are dwarfed by static, subject-level differences (e.g., Gratton et al., 2018, Neuron). If covariance matrices were ordinary scalar values, then isolating task-related changes could be accomplished simply by subtracting a baseline scan or mean score; but because the space of covariance matrices is non-Euclidean, the actual computations involved in this subtraction are more complex (see our Methods). Fundamentally (and conceptually), however, our procedure is simply ordinary mean-centering, adapted to this non-Euclidean space. Despite the added complexity, there is considerable evidence that such computations, adapted directly to the geometry of the space of covariance matrices, outperform simpler methods that treat covariance matrices as arrays of real numbers (e.g., naive subtraction; see Dodero et al. and Ng et al., references below). Moreover, our previous work has found that this procedure works quite well to isolate changes associated with different task conditions (Areshenkoff et al., 2021, Neuroimage; Areshenkoff et al., 2022, eLife).

      Although PCA can be adapted to work well with covariance-matrix-valued data, it would at best be a less direct solution than simply subtracting subjects' mean connectivity. This is because the top components from applying PCA would be dominated both by subject-specific effects (not of interest here) and by the large-scale connectivity structure typically observed in component-based analyses of whole-brain connectivity (i.e., the principal gradient), whereas changes associated with task condition (the thing of interest here) would be buried among the less reliable components. By contrast, our procedure directly isolates these task changes.
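
      To make the idea of "mean-centering adapted to a non-Euclidean space" concrete, the following minimal sketch centers one subject's covariance matrices using the log-Euclidean metric: each matrix is mapped to its matrix logarithm, the subject's mean log-matrix is subtracted, and the result is mapped back. This is a simplified stand-in and not necessarily the construction described in the Methods (which may use a different Riemannian metric or a transport step); the function names and the synthetic matrices are assumptions for illustration only.

```python
import numpy as np


def _apply_to_eigenvalues(mat, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * fn(vals)) @ vecs.T


def spd_log(mat):
    """Matrix logarithm of a symmetric positive-definite (SPD) matrix."""
    return _apply_to_eigenvalues(mat, np.log)


def sym_exp(mat):
    """Matrix exponential of a symmetric matrix (the result is SPD)."""
    return _apply_to_eigenvalues(mat, np.exp)


def center_subject(covs):
    """Remove one subject's mean connectivity in the log-Euclidean sense.

    covs: array of shape (n_epochs, n_regions, n_regions), each SPD.
    """
    logs = np.stack([spd_log(c) for c in covs])
    mean_log = logs.mean(axis=0)
    return np.stack([sym_exp(lg - mean_log) for lg in logs])


def random_spd(rng, n):
    """Synthetic SPD matrix standing in for a covariance matrix."""
    a = rng.standard_normal((n, 3 * n))
    return a @ a.T / (3 * n) + 0.1 * np.eye(n)


rng = np.random.default_rng(0)
covs = np.stack([random_spd(rng, 5) for _ in range(3)])  # e.g. 3 task epochs
centered = center_subject(covs)
# After centering, the subject's log-Euclidean mean is the identity matrix,
# i.e. the mean of the matrix logarithms is (numerically) zero.
residual = np.mean([spd_log(c) for c in centered], axis=0)
```

      The centered matrices remain positive definite, which is the practical advantage over subtracting covariance matrices entry by entry.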

      References cited above:

      Dodero, L., Minh, H. Q., San Biagio, M., Murino, V., & Sona, D. (2015, April). Kernel-based classification for brain connectivity graphs on the Riemannian manifold of positive definite matrices. In 2015 IEEE 12th international symposium on biomedical imaging (ISBI) (pp. 42-45). IEEE.

      Ng, B., Dressler, M., Varoquaux, G., Poline, J. B., Greicius, M., & Thirion, B. (2014). Transport on Riemannian manifold for functional connectivity-based classification. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2014: 17th International Conference, Boston, MA, USA, September 14-18, 2014, Proceedings, Part II 17 (pp. 405-412). Springer International Publishing.

      (4e) Seems like a missed opportunity that the authors just use a single, PCA-derived measure to quantify learning, where multiple measures could have been of interest, especially given that the introduction established some interesting learning-related concepts related to exploration and exploitation, which could be conceptualized as movement variability and movement accuracy. It is unclear why the authors designed a task that was this novel and interesting, drawing on several psychological concepts, but then chose to ignore these concepts in the analysis.

      We were disappointed to hear that the reviewers did not appreciate our functional PCA-derived measure to quantify subject learning. This is a novel data-driven analysis approach that we have previously used with success in recent work (e.g., Areshenkoff et al., 2022, eLife) and, from our perspective, we thought it was quite elegant that we were able to describe the entire trajectory of learning across all participants along a single axis that explained the majority (~75%) of the variance in the patterns of behavioral learning data. Moreover, the creation of a single behavioral measure per participant (what we call a ‘Learning score’, see Fig. 6C) helped simplify our brain-behavior correlation analyses considerably, as it provided a single measure that accounts for the natural auto-correlation in subjects’ learning curves (i.e., that subjects who learn quickly also tend to be better overall learners by the end of the learning phase). It also avoids the difficulty (and sometimes arbitrariness) of having to select specific trial bins for behavioral analysis (e.g., choosing the first 5, 10, 20 or 25 trials as a measure of ‘early learning’, and so on). Of course, one of the major alternatives to our approach would have involved fitting an exponential to each subject’s learning curve and taking measures like learning rate etc., but in our experience these types of models don’t always fit well, or yield robust/reliable parameters, at the individual subject level. To strengthen the motivation for our approach, we have now included the following text in our Results:

      “To quantify this variation in subject performance in a manner that accounted for the auto-correlation in learning performance over time (i.e., subjects who learned more quickly tend to exhibit better performance by the end of learning), we opted for a pure data-driven approach and performed functional principal component analysis (fPCA; (Shang, 2014)) on subjects’ learning curves. This approach allowed us to isolate the dominant patterns of variability in subjects’ learning curves over time (see Methods for further details; see also Areshenkoff et al., 2022).”
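
      The logic of deriving a single score per subject from a set of learning curves can be illustrated with a short sketch. Proper fPCA usually involves smoothing or a basis expansion of the curves; the version below is a simplification that applies ordinary PCA (via SVD) to discretized curves. The function name `learning_scores` and the synthetic exponential curves are assumptions for illustration and do not reproduce the study's data or pipeline.

```python
import numpy as np


def learning_scores(curves):
    """Project subjects' learning curves onto their first principal component.

    curves: array of shape (n_subjects, n_trials).
    Returns (scores on PC1, explained-variance ratio of every component).
    Note that the sign of a principal component is arbitrary.
    """
    centered = curves - curves.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u[:, 0] * s[0]
    return scores, explained


# Synthetic curves (not study data): subjects differ in learning rate.
rng = np.random.default_rng(1)
trials = np.arange(40)
rates = rng.uniform(0.02, 0.2, size=20)
curves = 100 * (1 - np.exp(-np.outer(rates, trials)))
curves += rng.normal(0, 2, size=curves.shape)

scores, explained = learning_scores(curves)
```

      Because fast learners sit above the group mean on nearly every trial, a single component captures most of the between-subject variance in this toy example, which mirrors the motivation given above for using one score per subject.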

      In any case, the reviewers may be pleased to hear that in current work in the lab we are using more model-based approaches to attempt to derive sets of parameters (per participant) that relate to some of the variables of interest described by the reviewers, but that we relate to much more dynamical (shorter-term) changes in brain activity.

      (4f) Overall Changes in Activity: The manuscript should delve into the potential influence of overall changes in brain activity on the results. The choice of using Euclidean distance as a metric for quantifying changes in connectivity is sensitive to scaling in overall activity. Therefore, it is crucial to discuss whether activity in task-relevant areas increases from baseline to early learning and decreases from early to late learning, or if other patterns emerge. A comprehensive analysis of overall activity changes will provide a more complete understanding of the findings.

      These are good questions and we are happy to explore this in the data. However, as mentioned in our response to query 4a above, it is important to note that the timeseries data for each brain region was z-scored prior to analysis, with the aim of removing any mean changes in activity levels (note that this is a standard preprocessing step when performing functional connectivity analysis, given that mean signal changes are not the focus of interest in functional connectivity analyses).
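
      The effect of z-scoring on mean activity and on connectivity estimates can be shown with a small sketch. It assumes z-scoring over a whole time series (whether this was applied per scan or per epoch is not stated here), and the signals are synthetic. It shows that z-scoring removes offsets and gain changes, and that Pearson correlation is unaffected by them.

```python
from statistics import fmean, pstdev


def zscore(series):
    """Standardize a series to zero mean and unit (population) variance."""
    mu, sd = fmean(series), pstdev(series)
    return [(v - mu) / sd for v in series]


def pearson(x, y):
    """Pearson correlation computed as the mean product of z-scores."""
    zx, zy = zscore(x), zscore(y)
    return fmean(a * b for a, b in zip(zx, zy))


# Synthetic signals (not study data) for two regions.
region_a = [1.0, 2.5, 2.0, 3.5, 3.0, 4.5, 4.0, 5.5]
region_b = [0.5, 1.0, 2.0, 1.5, 3.0, 2.5, 4.0, 3.5]
# The same region with a large offset and gain change, mimicking an
# overall change in activity level.
region_a_shifted = [3.0 * v + 50.0 for v in region_a]

mean_after = fmean(zscore(region_a_shifted))
r_original = pearson(region_a, region_b)
r_shifted = pearson(region_a_shifted, region_b)
```

      Here `mean_after` is zero up to rounding error and the two correlations agree, so a uniform shift or rescaling of a region's signal cannot by itself change a correlation-based connectivity estimate.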

      To further emphasize these points, we have taken our z-scored timeseries data and calculated the mean signal for each region within each task epoch (Baseline, Early and Late learning, see panel A in figure below). The point of showing this data (where each z-score map looks near identical across the top, middle and bottom plots) is to demonstrate just how minuscule the mean signal changes are in the z-scored timeseries data. This point can also be observed when plotting the mean z-score signal across regions for each epoch (see panel B in figure below). Here we find that Baseline and Early learning have a near identical mean activation level across regions (albeit with slightly different variability across subjects), whereas there is a slight increase during late learning, though it should be noted that our y-axis, which measures in the thousandths, really magnifies this effect.

      To more directly address the reviewers’ comments, using the z-score signal per region we have also performed the same statistical pairwise comparisons (Early > Baseline and Late > Early) as we performed in the main manuscript Fig. 4 (see panel C in Author response image 9 below). In this plot, areas in red denote an increase in activity from Baseline to Early learning (top plot) and from Early to Late learning (bottom plot), whereas areas in blue denote a decrease for those same comparisons. The important thing to emphasize here is that the spatial maps resulting from this analysis are generally quite different from the maps of eccentricity that we report in Fig. 4 in our paper. For instance, in the figure below, we see significant changes in the activity of visual cortex between epochs, but this is not found in our eccentricity results (compare with Fig. 4). Likewise, in our eccentricity results (Fig. 4), we find significant changes in the manifold positioning of areas in medial prefrontal cortex (MPFC), but this is not observed in the activation levels of these regions (panel C below). Again, we are hesitant to make too much of these results, as the activation differences denoted as significant in the figure below are likely to be an effect on the order of thousandths of a z-score (e.g., 0.002 > 0.001), but this hopefully assuages reviewers’ concerns that our manifold results are solely attributable to changes in overall activity levels.

      We are hesitant to include the results below in our paper as we feel that they don’t add much to the interpretation (as the purpose of z-scoring was to remove large activation differences). However, if the reviewers strongly believe otherwise, we would consider including them in the supplement.

      Author response image 9.

      Examination of overall changes in activity across regions. (A) Mean z-score maps across subjects for the Baseline (top), Early Learning (middle) and Late learning (bottom) epochs. (B) Mean z-score across brain regions for each epoch. Error bars represent +/- 1 SEM. (C) Pairwise contrasts of the z-score signal between task epochs. Positive (red) and negative (blue) values show significant increases and decreases in z-score signal, respectively, following FDR correction for region-wise paired t-tests (at q<0.05).
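
      The FDR step named in the caption can be sketched with the Benjamini-Hochberg procedure, which is the usual method behind a threshold written as q<0.05; whether the analysis used exactly this variant is an assumption. The p-values below are made up for illustration and stand in for the output of region-wise paired t-tests.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans marking which tests survive FDR control.

    Finds the largest rank k such that the k-th smallest p-value is
    <= q * k / m, then rejects the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical region-wise p-values (not study data).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(pvals, q=0.05)
```

      With these ten values only the two smallest p-values pass, even though five are below an uncorrected 0.05 threshold.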

    1. eLife assessment

      This important study reports the fungal composition and its interaction with bacteria in the Caesarean section scar diverticulum. The data are solid and supportive of the conclusion. This work will be of interest to researchers and clinicians who work on women's health.

    2. Reviewer #2 (Public Review):


      Shotgun data have been analysed to obtain fungal and bacterial organisms abundance. Through their metabolic functions and through co-occurrence networks, a functional relationship between the two types of organisms can be inferred. By means of metabolomics, function-related metabolites are studied in order to deepen the fungus-bacteria synergy.


      Data obtained in bacteria correlate with data from other authors.

      The study of metabolic "interactions" between fungi and bacteria is quite new.

      The inclusion of metabolomics data to support the results is a great contribution.


      All my concerns have been clarified.

    1. eLife assessment

      This useful manuscript describes a proteomic analysis of plasma from subjects before and after an exercise regime consisting of endurance and resistance exercise. The work identifies a putative new exerkine, CD300LG, and finds associations of this protein with aspects of insulin sensitivity and angiogenesis, but the evidence to support the main claims remains incomplete.

    2. Reviewer #1 (Public Review):


      In this paper, proteomics analysis of the plasma of human subjects that underwent an exercise training regime consisting of a combination of endurance and resistance exercise led to the identification of several proteins that were responsive to exercise training. Confirming previous studies, many exercise-responsive secreted proteins were found to be involved in the extra-cellular matrix. The protein CD300LG was singled out as a potential novel exercise biomarker and the subject of numerous follow-up analyses. The levels of CD300LG were correlated with insulin sensitivity. The analysis of various open-source datasets led to the tentative suggestion that CD300LG might be connected with angiogenesis, liver fat, and insulin sensitivity. CD300LG was found to be most highly expressed in subcutaneous adipose tissue and specifically in venular endothelial cells. In a subset of subjects from the UK Biobank, serum CD300LG levels were positively associated with several measures of physical activity - particularly vigorous activity. In addition, serum CD300LG levels were negatively associated with glucose levels and type 2 diabetes. Genetic studies hinted at these associations possibly being causal. Mice carrying alterations in the CD300LG gene displayed impaired glucose tolerance, but no change in fasting glucose and insulin. Whether the production of CD300LG is changed in the mutant mice is unclear.


      The specific proteomics approach conducted to identify novel proteins impacted by exercise training is new. The authors are resourceful in the exploitation of existing datasets to gain additional information on CD300LG.


      While the analyses of multiple open-source datasets are necessary and useful, they lead to relatively unspecific correlative data that collectively do not sufficiently advance our knowledge of CD300LG and merely represent the starting point for more detailed investigations. Additional, more targeted experiments on CD300LG are necessary to gain a better understanding of its role and of the mechanism by which exercise training may influence CD300LG levels. One should also be careful about relying on external data for experiments as delicate as mouse phenotyping. Can the authors vouch for the quality of the data collected?

    3. Reviewer #2 (Public Review):


      This manuscript from Lee-Odegard et al reports proteomic profiling of exercise plasma in humans, leading to the discovery of CD300LG as a secreted exercise-inducible plasma protein. Correlational studies show associations of CD300LG with glycemic traits. Lastly, the authors query available public data from CD300LG-KO mice to establish a causal role for CD300LG as a potential link between exercise and glucose metabolism. However, the strengths of this manuscript were balanced by the moderate to major weaknesses. Therefore in my opinion, while this is an interesting study, the conclusions remain preliminary and are not fully supported by the experiments shown so far.


      (1) Data from a well-phenotyped human cohort showing exercise-inducible increases in CD300LG.

      (2) Associations between CD300LG and glucose and other cardiometabolic traits in humans, that have not previously been reported.

      (3) Correlation to CD300LG mRNA levels in adipose provides additional evidence for exercise-inducible increases in CD300LG.


      (1) CD300LG is by sequence a single-pass transmembrane protein that is exclusively localized to the plasma membrane. How CD300LG can be secreted remains a mystery. More evidence should be provided to understand the molecular nature of circulating CD300LG. Is it full-length? Is there a cleaved fragment? Which epitope on CD300LG does the Olink assay bind? Does transfection of CD300LG into cells in vitro result in secreted CD300LG?

      (2) There is a growing recognition of specificity issues with both the Olink and SomaLogic platforms. Therefore, it is critical that the authors use antibodies, targeted mass spectrometry, or some other method to validate that CD300LG really is increased, instead of relying on the Olink data alone.

      (3) It is insufficient simply to query the IMPC phenotyping data for CD300LG; the authors should obtain the animals and reproduce or determine the glucose phenotypes in their own hands. In addition, this would allow the investigators to answer key questions like the phenotype of these animals after a GTT, whether glucose production or glucose uptake is affected, whether insulin secretion in response to glucose is normal, effects of high-fat diet, and other standard mouse metabolic phenotyping assays.

      (4) I was unable to find when plasma was collected at the 12-week time point. Was it immediately after the last bout of exercise (an acute response) or some time after the end of the training protocol (trained state)?

    4. Reviewer #3 (Public Review):


      This manuscript by Liu et al. presents a case that CAPSL mutations are a cause of familial exudative vitreoretinopathy (FEVR). Attention was initially focused on the CAPSL gene from whole exome sequence analysis of two small families. The follow-up analyses included studies in which CAPSL was manipulated in endothelial cells of mice and multiple iterations of molecular and cellular analyses. Together, the data show that CAPSL influences endothelial cell proliferation and migration. Molecularly, transcriptomic and proteomic analyses suggest that CAPSL influences many genes/proteins that are also downstream targets of MYC and may be important to the mechanisms.


      This multi-pronged approach found a previously unknown function for CAPSL in endothelial cells and pointed to MYC pathways as high-quality candidates in the mechanism.


      Two issues shape the overall impact for me. First, the unreported population frequency of the variants in the manuscript makes it unclear if CAPSL should be considered an interesting candidate possibly contributing to FEVR, or possibly a cause. Second, it is unclear if the identified variants act dominantly, as indicated in the pedigrees. The studies in mice utilized homozygotes for an endothelial cell-specific knockout, leaving uncertainty about what phenotypes might be observed if mice heterozygous for a ubiquitous knockout had instead been studied.

      In my opinion, the following scientific issues are specific weaknesses that should be addressed:

      (1) Please state in the manuscript the number of FEVR families that were studied by WES. Please also describe if the families had been selected for the absence of known mutations, and/or what percentage lack known pathogenic variants.

      (2) A better clinical description of family 3104 would enhance the manuscript, especially the father. It is unclear what "manifested with FEVR symptoms, according to the medical records" means. Was the father diagnosed with FEVR? If the father has some iteration of a mild case, please describe it in more detail. If the lack of clinical images in the figure is indicative of a lack of medical documentation, please note this in the manuscript.

      (3) The TGA stop codon can in some instances also influence splicing (PMID: 38012313). Please add a bioinformatic assessment of splicing prediction to the assays and report its output in the manuscript.

      (4) More details regarding the use of a "loxp-flanked allele of CAPSL" are needed. Is this an existing allele? If so, what is the allele and its citation? If new (as suggested by S1), the newly generated CAPSL mutant mouse strain needs to be entered into the MGI database and assigned an official allele name, which should then be used in the manuscript. Who generated the strain (presumably a core or company?) must also be described.

      (5) The statement in the methods "All mice used in the study were on a C57BL/6J genetic background," should be better defined. Was the new allele generated on a pure C57BL/6J genetic background, or bred to be some level of congenic? If congenic, to what generation? If unknown, please either test and report the homogeneity of the background, or consult with nomenclature experts (such as available through MGI) to adopt the appropriate F?+NX type designation. This also pertains to the Pdgfb-iCreER mice, which reference 43 describes as having been generated in an F2 population of C57BL/6 X CBA and did not designate the sub-strain of C57BL/6 mice. It is important because one of the explanations for missing heritability in FEVR may be a high level of dependence on genetic background. From the information in the current description, it is also not inherently obvious that the mice studied did not harbor confounding mutations such as rd1 or rd8.

      (6) In my opinion, more experimental detail is needed regarding Figures 2 and 3. How many fields, of how many retinas and mice were analyzed in Figure 2? How many mice were assessed in Figure 3?

      (7) I suggest adding into the methods whether P-values were corrected for multiple tests.

    1. Reviewer #2 (Public Review):


      The authors have developed a marker selection and k-means (k=2)-based binary clustering algorithm for the first-level supervised clustering of CyTOF data. They built a seamless pipeline that offers the multiple functionalities required for CyTOF data analysis.
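
      To illustrate what a k-means (k=2) split on a single marker amounts to, here is a minimal sketch that divides one marker's expression values into a "negative" and a "positive" group. It is not the toolkit's implementation; the function name `two_means_1d` and the expression values are hypothetical.

```python
def two_means_1d(values, max_iter=100):
    """Label each value as marker-positive (True) or marker-negative (False).

    One-dimensional k-means with k=2: centroids start at the minimum and
    maximum, and each value is assigned to the nearer centroid until the
    centroids stop moving.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # No spread in the data, so there is nothing to split.
        return [False] * len(values)
    for _ in range(max_iter):
        threshold = (lo + hi) / 2
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    threshold = (lo + hi) / 2
    return [v > threshold for v in values]


# Hypothetical transformed expression values for one marker across 7 cells.
expression = [0.1, 0.3, 0.2, 4.8, 5.1, 0.4, 4.5]
is_positive = two_means_1d(expression)
```

      A split of this kind only makes sense for markers whose expression is roughly bimodal, which is the limitation raised elsewhere in these reviews.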


      The strength of the study is the potential use of the pipeline for the CyTOF community as a wrapper for multiple functions required for the analysis. The concept of the first line of binary clustering with known markers can be practically powerful.


      The weakness of the study is that there's little conceptual novelty in the algorithms suggested from the study and the benchmarking is done in limited conditions.

    2. eLife assessment

      This valuable manuscript presents ImmCellTyper, a new toolkit for CyTOF data analysis. The semi-supervised clustering tool, BinaryClust, integrates prior biological knowledge and demonstrates competitive performance in various benchmarks, but there is room for strengthening the evidence base by addressing concerns about incomplete benchmarking results and the limited consideration of CyTOF markers with binary distribution. Overall, the manuscript offers solid potential for enhancing CyTOF data analysis methodologies.

    3. Reviewer #1 (Public Review):


      This manuscript presented a useful toolkit designed for CyTOF data analysis, which integrates 5 key steps as an analytical framework. A semi-supervised clustering tool was developed, and its performance was tested in multiple independent datasets. The tool was compared to human experts as well as supervised and unsupervised methods.


      The study employed multiple independent datasets to test the pipeline. A new semi-supervised clustering method was developed.


      The examination of the whole pipeline is incomplete. Descriptions or justifications are lacking for some analyses.

    4. Reviewer #3 (Public Review):


      ImmCellTyper is a new toolkit for Cytometry by time-of-flight data analysis. It includes BinaryClust, a semi-supervised clustering tool (which takes into account prior biological knowledge), designed for automated classification and annotation of specific cell types and subpopulations. ImmCellTyper also integrates a variety of tools to perform data quality analysis, batch effect correction, dimension reduction, unsupervised clustering, and differential analysis.


      The proposed algorithm takes into account the prior knowledge.

      The results on different benchmarks indicate competitive or better performance (in terms of accuracy and speed) depending on the method.


      The proposed algorithm considers only CyTOF markers with binary distribution.

    1. eLife assessment

      This valuable study provides new insight into how non-synaptic interactions affect the activity of adjacent gustatory neurons housed within the same sensillum. The electrophysiological, behavioral, and genetic data supporting the study's conclusions are solid, although the inclusion of additional control experiments would strengthen the study. This work will be of interest to neuroscientists studying chemosensory processing or regulation of neuronal excitability.

    2. Reviewer #1 (Public Review):


      This study identifies new types of interactions between Drosophila gustatory receptor neurons (GRNs) and shows that these interactions influence sensory responses and behavior. The authors find that HCN, a hyperpolarization-activated cation channel, suppresses the activity of GRNs in which it is expressed, preventing those GRNs from depleting the sensillum potential, and thereby promoting the activity of neighboring GRNs in the same sensilla. HCN is expressed in sugar GRNs, so HCN dampens the excitation of sugar GRNs and promotes the excitation of bitter GRNs. Impairing HCN expression in sugar GRNs depletes the sensillum potential and decreases bitter responses, especially when flies are fed on a sugar-rich diet, and this leads to decreased bitter aversion in a feeding assay. The authors' conclusions are supported by genetic manipulations, electrophysiological recordings, and behavioral assays.


      (1) Non-synaptic interactions between neurons that share an extracellular environment (sometimes called "ephaptic" interactions) have not been well-studied, and certainly not in the insect taste system. A major strength of this study is the new insight it provides into how these interactions can impact sensory coding and behavior.

      (2) The authors use many different types of genetic manipulations to dissect the role of HCN in GRN function, including mutants, RNAi, overexpression, ectopic expression, and neuronal silencing. Their results convincingly show that HCN impacts the sensillum potential and has both cell-autonomous and nonautonomous effects that go in opposite directions. There are a couple of conflicting or counterintuitive results, but the authors discuss potential explanations.

      (3) Experiments comparing flies raised on different food sources suggest an explanation for why the system may have evolved the way that it did: when flies live in a sugar-rich environment, their bitter sensitivity decreases, and HCN expression in sugar GRNs helps to counteract this decrease.


      (1) The genetic manipulations were constitutive (e.g. Ih mutations, RNAi, or misexpression), and depleting Ih from birth could lead to compensatory effects that change the function of the neurons or sensillum. Using tools to temporally control Ih expression could help to confirm the results of this study.

      (2) The behavioral experiment shows a striking loss of bitter sensitivity, but it was only conducted for one bitter compound at one concentration. It is not clear how general this effect is. The same is true for some of the bitter GRN electrophysiological experiments that only tested one compound and concentration.

      (3) Several experiments using the Gal4/UAS system only show the Gal4/+ control and not the UAS/+ control (or occasionally neither control). Since some of the measurements in control flies seem to vary (e.g., spiking rate), it is important to compare the experimental flies to both controls to ensure that any observed effects are in fact due to the transgene expression.

      (4) I was surprised that manipulations of sugar GRNs (e.g. Ih knockdown, Gr64a-f deletion, or Kir silencing) can impact the sensillum potential and bitter GRN responses even in experiments where no sugar was presented. I believe the authors are suggesting that the effects of sugar GRN activity (e.g., from consuming sugar in the fly food prior to the experiment) can have long-lasting effects, but it wasn't entirely clear if this is their primary explanation or on what timescale those long-lasting effects would occur. How much / how long of a sugar exposure do the flies need for these effects to be triggered, and how long do those effects last once sugar is removed?

      (5) The authors mention that HCN may impact the resting potential in addition to changing the excitability of the cell through various mechanisms. It would be informative to record the resting potential and other neuronal properties, but this is very difficult for GRNs, so the current study is not able to determine exactly how HCN affects GRN activity.

    3. Reviewer #2 (Public Review):


      In this manuscript, the authors start by showing that an HCN loss-of-function mutation causes a decrease in spiking in bitter GRNs (bGRNs) while leaving the sweet GRN (sGRN) response in the same sensillum intact. They show that a perturbation of HCN channels in sweet-sensing neurons causes a similar decrease while increasing the response of sugar neurons. They were also able to rescue the response by exogenous expression. Ectopic expression of HCN in bitter neurons had no effect. Next, they measure the sensillum potential and find that it is also affected by HCN channel perturbation. These findings lead them to speculate that HCN in sGRNs increases sGRN spiking, which in turn affects bGRNs. To test this idea, they carried out multiple perturbations aimed at decreasing sGRN activity. They found that decreasing sGRN activity, either by using a receptor mutant or by expressing Kir (a K+ channel) in sGRNs, increased bGRN responses. These responses also increase the sensillum potential. Finally, they show that these changes are behaviorally relevant, as conditions that increase sGRN activity decrease avoidance of bitter substances.


      There is solid evidence that perturbation of sweet GRNs affects bitter GRN in the same sensillum. The measurement of transsynaptic potential and how it changes is also interesting and supports the authors' conclusion.

      Weaknesses: The ionic basis of how perturbation in GRN affects the transepithelial potential which in turn affects the second neuron is not clear.

    4. Reviewer #3 (Public Review):

      Ephaptic inhibition between neurons housed in the same sensilla has been long discovered in flies, but the molecular basis underlying this inhibition is underexplored. Specifically, it remains poorly understood which receptors or channels are important for maintaining the transepithelial potential between the sensillum lymph and the hemolymph (known as the sensillum potential), and how this affects the excitability of neurons housed in the same sensilla.

      Lee et al. used single-sensillum recordings (SSR) of the labellar taste sensilla to demonstrate that the HCN channel, Ih, is critical for maintaining sensillum potential in flies. Ih is expressed in sugar-sensing GRNs (sGRNs) but affects the excitability of both the sGRNs and the bitter-sensing GRNs (bGRNs) in the same sensilla. Ih mutant flies have decreased sensillum potential, and bGRNs of Ih mutant flies have a decreased response to the bitter compound caffeine. Interestingly, ectopic expression of Ih in bGRNs also increases sGRN response to sucrose, suggesting that Ih-dependent increase in sensillum potential is not specific to Ih expressed in sGRNs. The authors further demonstrated, using both SSR and behavior assays, that exposure to sugars in the food substrate is important for the Ih-dependent sensitization of bGRNs. The experiments conducted in this paper are of interest to the chemosensory field. The observation that Ih is important for the activity in bGRNs albeit expressed in sGRNs is especially fascinating and highlights the importance of non-synaptic interactions in the taste system.

      Despite the interesting results, this paper is not written in a clear and easily understandable manner. It uses poorly defined terms without much elaboration, contains sentences that are borderline unreadable even for those in the narrower chemosensory field, and many figures can clearly benefit from more labeling and explanation. It certainly needs a bit of work.

      Below are the major points:

      (1) Throughout the paper, it is assumed that Ih channels are expressed in sugar-sensing GRNs but not bitter-sensing GRNs. However, both this paper and citation #17, another paper from the same lab, contain only circumstantial evidence for the expression of Ih channels in sGRNs. A simple co-expression analysis, using the Ih-T2A-GAL4 line and Gr5a-LexA/Gr66a-LexA line, all of which are available, could easily demonstrate the co-expression. Including such a figure would significantly strengthen the conclusion of this paper.

      (2) Throughout this paper, it is often unclear which class of labellar taste sensilla is being recorded. S-a, S-b, I-a, and I-b sensilla all have different sensitivities to bitters and sugars. Each figure should clearly indicate which sensilla are being recorded. Justification should be provided if recordings from different classes of sensilla are being pooled together for statistics.

      (3) In many figures, there is a lack of critical control experiments. Examples include Figures 1C-F (lacking UAS control), Figure 2I-J (lacking UAS control), Figure 4E (lacking the UAS and GAL4 control, and it is also strange to compare Gr64f > RNAi with Gr66a > RNAi, instead of with parental GAL4 and UAS controls.), and Figure 5D (lacking UAS control). Without these critical control experiments, it is difficult to evaluate the quality of the work.

      (4) Figure 2A could benefit from more clarification about what exactly is being recorded here. The text is confusing: a considerable amount of text is spent on explaining the technical details of how SP is recorded, but very little text about what SP represents, which is critical for the readers. The authors should clarify in the text that SP is measuring the potential between the sensillar lymph, where the dendrites of GRNs are immersed, and the hemolymph. Adding a schematic figure to show that SP represents the potential between the sensillar lymph and hemolymph would be beneficial.

      (5) The sGRN spiking rate in Figure 4B deviates significantly from previous literature (Wang, Carlson, eLife 2022; Jiao, Montell PNAS 2007, as examples), and the response to sucrose in the control flies is not dosage-dependent, which raises questions about the quality of the data. Why are the responses to sucrose not dosage-dependent? The responses are clearly not saturated at these (10 mM to 100 mM) concentrations.

      (6) In Figure 4C, instead of showing the average spike rate of the first five seconds and the next 5 seconds, why not show a peristimulus time histogram? It would help the readers tremendously, and it would also show how quickly the spike rate adapts to overexpression and control flies. Also, since taste responses adapt rather quickly, a 500 ms or 1 s bin would be more appropriate than a 5-second bin.

      (7) Lines 215 - 220. The authors state that the presence of sugars in the culture media would expose the GRNs to sugar constantly, without providing much evidence. What is the evidence that the GRNs are being activated constantly in flies raised with culture media containing sugars? The sensilla are not always in contact with the food.

      (8) Line 223. To show that bGRN spike rates in Ih mutant flies "decreased even more than WT", you need to compare the difference in spike rates between the sorbitol group and the sorbitol + sucrose group, which is not what is currently shown.

      (9) To help readers better understand the proposed mechanisms here, including a schematic figure would be helpful. This should show where Ih is expressed, how Ih in sGRNs impacts the sensillum potential, how elevated sensillum potential increases the electrical driving force for the receptor current, and affects the excitability of the bGRNs in the same sensilla, and how exposure to sugar is proposed to affect ion homeostasis in the sensillum lymph.
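
      Point (6) above asks for a peristimulus time histogram (PSTH) with 500 ms or 1 s bins instead of two 5-second averages. The following is only a minimal sketch of that binning, not the authors' analysis: the function name `psth`, the 10 s recording window, and the spike times in the usage example are hypothetical and chosen for illustration.

```python
def psth(trials, bin_width=0.5, t_start=0.0, t_stop=10.0):
    """Trial-averaged firing rate per bin.

    trials    : list of trials, each a list of spike times in seconds,
                measured from stimulus onset (hypothetical input format).
    bin_width : bin size in seconds (0.5 s, as suggested in point (6)).
    Returns (left bin edges in seconds, mean rate in spikes/s per bin).
    """
    if not trials:
        raise ValueError("at least one trial is required")
    n_bins = int(round((t_stop - t_start) / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t_start <= t < t_stop:
                # Clamp guards against floating-point rounding at the last edge.
                idx = min(int((t - t_start) / bin_width), n_bins - 1)
                counts[idx] += 1
    edges = [t_start + i * bin_width for i in range(n_bins)]
    # Divide by bin width and trial count to get a mean rate in spikes/s.
    rates = [c / (bin_width * len(trials)) for c in counts]
    return edges, rates


# Made-up spike times for two trials, only to exercise the function.
example_trials = [
    [0.1, 0.2, 0.3, 0.6, 1.4, 6.0],
    [0.05, 0.4, 0.7, 2.2],
]
edges, rates = psth(example_trials, bin_width=0.5)
```

      With bins this narrow, a fast-adapting response appears as a high first bin followed by a decay, which a single 5-second average would hide.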

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):


      The authors aim to address a critical challenge in the field of bioinformatics: the accurate and efficient identification of protein binding sites from sequences. Their work seeks to overcome the limitations of current methods, which largely depend on multiple sequence alignments or experimental protein structures, by introducing GPSite, a multi-task network designed to predict binding residues of various molecules on proteins using ESMFold.


      • Benchmarking. The authors provide a comprehensive benchmark against multiple methods, showcasing the performances of a large number of methods in various scenarios.

      • Accessibility and Ease of Use. GPSite is highlighted as a freely accessible tool with user-friendly features on its website, enhancing its potential for widespread adoption in the research community.

      RE: We thank the reviewer for acknowledging the contributions and strengths of our work!


      • Lack of Novelty. The method primarily combines existing approaches and lacks significant technical innovation. This raises concerns about the original contribution of the work in terms of methodological development. Moreover, the paper reproduces results and analyses already presented in previous literature, without providing novel analysis or interpretation. This further diminishes the contribution of this paper to advancing knowledge in the field.

      RE: The novelty of this work is primarily manifested in four key aspects. Firstly, although we have employed several existing tools such as ProtTrans and ESMFold to extract sequence features and predict protein conformations, these techniques were hardly explored in the field of binding site prediction. We have successfully demonstrated the feasibility of substituting multiple sequence alignments with language model embeddings and training with predicted structures, providing a new solution to overcome the limitations of current methods for genome-wide applications. Secondly, though a few methods tend to capture geometric information based on protein surfaces or atom graphs, surface calculation and property mapping are usually time-consuming, while message passing on full atom graphs is memory-consuming and thus challenging to process long sequences. Besides, these methods are sensitive towards details and errors in the predicted structures. To facilitate large-scale annotations, we have innovatively applied geometric deep learning to protein residue graphs for comprehensively capturing backbone and sidechain geometric contexts in an efficient and effective manner (Figure 1). Thirdly, we have not only exploited multi-task learning to integrate diverse ligands and enhance performance, but also shown its capability to easily extend to the binding site prediction of other unseen ligands (Figure 4 D-E). Last but not least, as a “Tools and Resources” article, we have provided a fast, accurate and user-friendly webserver, as well as constructed a large annotation database for the sequences in Swiss-Prot. Leveraging this database, we have conducted extensive analyses on the associations between binding sites and molecular functions, biological processes, and disease-causing mutations (Figure 5), indicating the potential of our tool to unveil unexplored biology underlying genomic data.

      We have now revised the descriptions in the “The geometry-aware protein binding site predictor (GPSite)” section to highlight the novelty of our work in a clearer manner:

      “In conclusion, GPSite is distinguished from the previous approaches in four key aspects. First, profiting from the effectiveness and low computational cost of ProtTrans and ESMFold, GPSite is liberated from the reliance on MSA and native structures, thus enabling genome-wide binding site prediction. Second, unlike methods that only explore the Cα models of proteins 25,40, GPSite exploits a comprehensive geometric featurizer to fully refine knowledge in the backbone and sidechain atoms. Third, the employed message propagation on residue graphs is global structure-aware and time-efficient compared to the methods based on surface point clouds 21,22, and memory-efficient unlike methods based on full atom graphs 23,24. Residue-based message passing is also less sensitive towards errors in the predicted structures. Last but not least, instead of predicting binding sites for a single molecule type or learning binding patterns separately for different molecules, GPSite applies multi-task learning to better model the latent relationships among different binding partners.”

      • Benchmark Discrepancies. The benchmark results vary, especially between the initial comparisons and those with PeSTo. GPSite achieves a PR AUC of 0.484 on the global benchmark but a PR AUC of 0.61 on the benchmark against PeSTo. For consistency, PeSTo should be included in the benchmark against all other methods. The variation suggests potential issues with the benchmark set or the stability of the method. This inconsistency needs to be addressed to validate the reliability of the results.

      RE: We thank the reviewer for the constructive comments. Since our performance comparison experiments involved numerous competitive methods whose training sets are disparate, it was difficult to compare or rank all these methods fairly using a single test set. Given the substantial overlap between our protein-binding site test set and the training set of PeSTo, we meticulously re-split our entire protein-protein binding site dataset to generate a new test set that avoids any overlap with the training sets of both GPSite and PeSTo and performed a separate evaluation, where GPSite achieves a higher AUPR than PeSTo (0.610 against 0.433). This is quite common in this field. For instance, in the study of PeSTo (Nat Commun 2023), the comparisons of PeSTo with MaSIF-site, SPPIDER, and PSIVER were conducted using one test set, while the comparison with ScanNet was performed on a separate test set.

      Based on the reviewer’s suggestion, we have now replaced this experiment with a direct comparison with PeSTo using the datasets from PeSTo, in order to enhance the completeness and convincingness of our results. The corresponding descriptions are now added in Appendix 1-note 2, and the results are added in Appendix 2-table 4. For convenience, we also attach the note and table here:

      “Since 340 out of 375 proteins in our protein-protein binding site test set share > 30% identity with the training sequences of PeSTo, we performed a separate comparison between GPSite and PeSTo using the training and test datasets from PeSTo. By re-training with simply the same hyperparameters, GPSite achieves better performance than PeSTo (AUPR of 0.824 against 0.797) as shown in Appendix 2-table 4. Furthermore, when using ESMFold-predicted structures as input, the performance of PeSTo decreases substantially (AUPR of 0.691), and the superiority of our method will be further reflected. As in 24, the performance of ScanNet is also included (AUPR of 0.720), which is also largely outperformed by GPSite.”

      Author response table 1.

      Performance comparison of GPSite with ScanNet and PeSTo on the protein-protein binding site test set from PeSTo 24

      Note: The performance of ScanNet and PeSTo are directly obtained from 24. PeSTo* denotes evaluation using the ESMFold-predicted structures as input. The metrics provided are the median AUPR, median AUC and median MCC. The best/second-best results are indicated by bold/underlined fonts.

      • Interface Definition Ambiguity. There is a lack of clarity in defining the interface for the binding site predictions. Different methods are trained using varying criteria (surfaces in MaSIF-site, distance thresholds in ScanNet). The authors do not adequately address how GPSite's definition aligns with or differs from these standards and how this issue was addressed. It could indicate that the comparison of those methods is unreliable and unfair.

      RE: We thank the reviewer for the comments. The precise definition of ligand-binding sites is elucidated in the “Benchmark datasets” section. Specifically, the datasets of DNA, RNA, peptide, ATP, HEM and metal ions used to train GPSite were collected from the widely acknowledged BioLiP database [PMID: 23087378]. In BioLiP, a residue is defined as a binding residue if the smallest atomic distance between the target residue and the ligand is < 0.5 Å plus the sum of the van der Waals radii of the two nearest atoms. Meanwhile, most comparative methods regarding these ligands were also trained on data from BioLiP, thereby ensuring fair comparisons.

      However, since BioLiP does not include data on protein-protein binding sites, studies for protein-protein binding site prediction may adopt slightly distinct label definitions, as the reviewer suggested. Here, we employed the protein-protein binding site data from our previous study [PMID: 34498061], where a protein-binding residue was defined as a surface residue (relative solvent accessibility > 5%) that lost more than 1 Å² absolute solvent accessibility after protein-protein complex formation. This definition was initially introduced in PSIVER [PMID: 20529890] and widely applied in various studies (e.g., PMID: 31593229, PMID: 32840562). SPPIDER [PMID: 17152079] and MaSIF-site [PMID: 31819266] have also adopted similar surface-based definitions as PSIVER. On the other hand, ScanNet [PMID: 35637310] employed an atom distance threshold of 4 Å to define contacts while PeSTo [PMID: 37072397] used a threshold of 5 Å. However, it is noteworthy that current methods in this field including ScanNet (Nat Methods 2022) and PeSTo (Nat Commun 2023) directly compared methods using different label definitions without any alignment in their benchmark studies, likely due to the subtle distinctions among these definitions. For instance, the study of PeSTo directly performed comparisons with ScanNet, MaSIF-site, SPPIDER, and PSIVER. Therefore, we followed these previous works, directly comparing GPSite with other protein-protein binding site predictors.

      In the revised “Benchmark datasets” section, we have now provided more details for the binding site definitions in different datasets to avoid any potential ambiguity:

      “The benchmark datasets for evaluating binding site predictions of DNA, RNA, peptide, ATP, and HEM are constructed from BioLiP”; “A binding residue is defined if the smallest atomic distance between the target residue and the ligand is < 0.5 Å plus the sum of the Van der Waal’s radius of the two nearest atoms”; “Besides, the benchmark dataset of protein-protein binding sites is directly from 26, which contains non-redundant transient heterodimeric protein complexes dated up to May 2021. Surface regions that become solvent inaccessible on complex formation are defined as the ground truth protein-binding sites. The benchmark datasets of metal ion (Zn2+, Ca2+, Mg2+ and Mn2+) binding sites are directly from 18, which contain non-redundant proteins dated up to December 2021 from BioLiP.”
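
      The label definitions discussed in this response can be written as three small predicates, which makes their differences concrete. This is a hedged sketch only: the function names and the atom representation are hypothetical, the radii are Bondi van der Waals radii (BioLiP's own radius table may differ), and whether the original tools use strict or non-strict inequalities is an assumption.

```python
import math

# Bondi van der Waals radii in angstroms. Assumption: used here for
# illustration; the table BioLiP actually uses may differ.
VDW_RADIUS = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80, "P": 1.80}


def closest_pair(residue_atoms, ligand_atoms):
    """Closest atom pair as (distance, residue element, ligand element).

    Each atom is a hypothetical (element, (x, y, z)) tuple, coordinates in Å.
    """
    return min(
        (math.dist(r_xyz, l_xyz), r_elem, l_elem)
        for r_elem, r_xyz in residue_atoms
        for l_elem, l_xyz in ligand_atoms
    )


def is_binding_vdw(residue_atoms, ligand_atoms, margin=0.5):
    """BioLiP-style rule: distance < margin + sum of the two vdW radii."""
    d, r_elem, l_elem = closest_pair(residue_atoms, ligand_atoms)
    return d < margin + VDW_RADIUS[r_elem] + VDW_RADIUS[l_elem]


def is_binding_cutoff(residue_atoms, partner_atoms, cutoff):
    """Fixed-distance rule (4 Å for ScanNet, 5 Å for PeSTo per the text)."""
    d, _, _ = closest_pair(residue_atoms, partner_atoms)
    return d < cutoff


def is_binding_asa(rsa, asa_free, asa_complex, min_rsa=0.05, min_loss=1.0):
    """Surface rule: RSA > 5% and more than 1 Å² of ASA lost on complexation."""
    return rsa > min_rsa and (asa_free - asa_complex) > min_loss


# Made-up coordinates: the closest pair is C...O at about 4.3 Å.
residue = [("N", (0.0, 0.0, 0.0)), ("C", (1.5, 0.0, 0.0))]
partner = [("O", (5.8, 0.0, 0.0))]
labels = (
    is_binding_vdw(residue, partner),          # 4.3 < 0.5 + 1.70 + 1.52 = 3.72 ?
    is_binding_cutoff(residue, partner, 4.0),  # ScanNet-style threshold
    is_binding_cutoff(residue, partner, 5.0),  # PeSTo-style threshold
)
```

      In this toy case only the 5 Å rule labels the residue as binding, which illustrates why the same structure can yield different ground-truth labels under different definitions.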

      While GPSite demonstrates the potential to surpass state-of-the-art methods in protein binding site prediction, the evidence supporting these claims seems incomplete. The lack of methodological novelty and the unresolved questions in benchmark consistency and interface definition somewhat undermine the confidence in the results. Therefore, it's not entirely clear if the authors have fully achieved their aims as outlined.

      The work is useful for the field, especially in disease mechanism elucidation and novel drug design. The availability of genome-scale binding residue annotations GPSite offers is a significant advancement. However, the utility of this tool could be hampered by the aforementioned weaknesses unless they are adequately addressed.

      RE: We thank the reviewer for acknowledging the advancement and value of our work, as well as pointing out areas where improvements can be made. As discussed above, we have now carried out the corresponding revisions in the revised manuscript to enhance the completeness and clarity of our work.

      Reviewer #2 (Public Review):


      This work provides a new framework, "GPsite" to predict DNA, RNA, peptide, protein, ATP, HEM, and metal ions binding sites on proteins. This framework comes with a webserver and a database of annotations. The core of the model is a Geometric featurizer neural network that predicts the binding sites of a protein. One major contribution of the authors is the fact that they feed this neural network with predicted structure from ESMFold for training and prediction (instead of native structure in similar works) and a high-quality protein Language Model representation. The other major contribution is that it provides the public with a new light framework to predict protein-ligand interactions for a broad range of ligands.

      The authors have demonstrated the interest of their framework with mostly two techniques: ablation and benchmark.


      • The performance of this framework as well as the provided dataset and web server make it useful to conduct studies.

      • The ablations of some core elements of the method, such as the protein Language Model part, or the input structure are very insightful and can help convince the reader that every part of the framework is necessary. This could also guide further developments in the field. As such, the presentation of this part of the work can hold a more critical place in this work.

      RE: We thank the reviewer for recognizing the contributions of our work and for noting that our experiments are thorough.


      • Overall, we can acknowledge the important effort of the authors to compare their work to other similar frameworks. Yet, the lack of homogeneity of training methods and data from one work to the other makes the comparison slightly unconvincing, as the authors pointed out. Overall, the paper puts significant effort into convincing the reader that the method is beating the state of the art. Maybe, there are other aspects that could be more interesting to insist on (usability, interest in protein engineering, and theoretical works).

      RE: We sincerely appreciate the reviewer for the constructive and insightful comments. As to the concern of training data heterogeneity raised by the reviewer, it is noteworthy that current studies in this field, such as ScanNet (Nat Methods 2022) and PeSTo (Nat Commun 2023), directly compare methods trained on different datasets in their benchmark experiments. Therefore, we have adhered to the paradigm in these previous works. According to the detailed recommendations by the reviewer, we have now improved our manuscript by incorporating additional ablation studies regarding the effects of training procedure and language model representations, as well as case studies regarding the predicted structure’s quality and GPSite-based function annotations. We have also refined the Discussion section to focus more on the achievements of this work. A comprehensive point-by-point response to the reviewer’s recommendations is provided below.

      Reviewer #2 (Recommendations For The Authors):

      Major comments:

      Overall, I think the work is slightly disserved by its presentation. Some improvements could be made to the paper to better highlight the significance of your contribution.

      RE: We thank the reviewer for recognizing the significance of our work!

      • Line 188: "As expected, the performance of these methods mostly decreases substantially utilizing predicted structures for testing because they were trained with high-quality native structures."

      This is a major ablation that was not performed in this case. You used the predicted structure to train, while the other did not. One better way to assess the interest of this approach would be to compare the performance of a network trained with only native structure to compare the leap in performance with and without this predicted structure as you did after to assess the interest of some other aspect of your method such as single to multitask.

      RE: We thank the reviewer for the valuable recommendation. We have now assessed the benefit of training with predicted instead of native structures, which brings an average AUPR increase of 4.2% as detailed in Appendix 1-note 5 and Appendix 2-table 9. For convenience, we also attach the note and table here:

      “We examined the performance under different training and evaluation settings as shown in Appendix 2-table 9. As expected, the model yields exceptional performance (average AUPR of 0.656) when trained and evaluated using native structures. However, if this model is fed with predicted structures of the test proteins, the performance substantially declines to an average AUPR of 0.573. This trend aligns with the observations for other structure-based methods as illustrated in Figure 2. More importantly, in the practical scenario where only predicted structures are available for the target proteins, training the model with predicted structures (i.e., GPSite) results in superior performance than training the model with native structures (average AUPR of 0.594 against 0.573), probably owing to the consistency between the training and testing data. For completeness, the results in Appendix 3-figure 2 are also included where GPSite is tested with native structures (average AUPR of 0.637).”

      Author response table 2.

      Performance comparison on the ten binding site test sets under different training and evaluation settings

      Note: The numbers in this table are AUPR values. “Pep” and “Pro” denote peptide and protein, respectively. “Avg” means the average AUPR values among the ten test sets. “native” and “predicted” denote applying native and predicted structures as input, respectively.

      • Line 263: "ProtTrans consistently obtains competitive or superior performance compared to the MSA profiles, particularly for the target proteins with few homologous sequences (Neff < 2)."

      This seems a bit far-fetched. We can see clearly in the figure that the performance is far superior for Neff < 2, but the performance seems rather similar for higher Neff. Could the authors evaluate numerically the significance of the improvement? MSA profiles outperform GPSite on 4 intervals and I don't know the distribution of the data.

      RE: We thank the reviewer for the valuable suggestion. We have now revised this sentence to avoid any potential ambiguity:

      “As evidenced in Figure 4B and Appendix 2-table 8, ProtTrans consistently obtains competitive or superior performance compared to the MSA profile. Notably, for the target proteins with few homologous sequences (Neff < 2), ProtTrans surpasses MSA profile significantly with an improvement of 3.9% on AUC (P-value = 4.3×10⁻⁸).”

      The detailed significance tests and data distribution are now added in Appendix 2-table 8 and attached below as Author response-table 3 for convenience:

      Author response table 3.

      Performance comparison between GPSite and the baseline model using MSA profile for proteins with different Neff values in the combined test set of the ten ligands

      Note: Significance tests are performed following the procedure in 12,25. If P-value < 0.05, the difference between the performance is considered statistically significant.

      • Line 285: "We first visualized the distributions of residues in this dataset using t-SNE, where the residues are encoded by raw feature vectors encompassing ProtTrans embeddings and DSSP structural properties, or latent embedding vectors from the shared network of GPSite. "

      Wouldn't embedding from single-task be more relevant to show the interest of multi-task training here? Is the difference that big when comparing embeddings from single-task training to embeddings from multi-task training? Otherwise, I think the evidence from Figure 4e is sufficient, the interest of multitasking could be well-shown by single-task vs. multi-task AUPR and a few examples or predictions that are improved.

      RE: We thank the reviewer for the comment. In the second paragraph of the “The effects of protein features and model designs” section, we have compared the performance of multi-task and single-task learning. However, the visualization results in Figure 4D are related to the third paragraph, where we conducted a downstream exploration of the possibility to extend GPSite to other unseen ligands. This is based on the hypothesis that the shared network in GPSite may have captured certain common ligand-binding mechanisms during the preceding multi-task training process. We visualized the distributions of residues in an unseen carbohydrate-binding site dataset using t-SNE, where the residues are encoded by raw feature vectors (ProtTrans and DSSP), or latent embedding vectors from the shared network trained before. Although the shared network has not been specifically trained on the carbohydrate dataset, the latent representations from GPSite effectively improve the discriminability between the binding and non-binding residues as shown in Figure 4D. This finding indicates that the shared network trained on the initial set of ten molecule types has captured common binding mechanisms and may be applied to other unseen ligands.

      We have now added more descriptions in this paragraph to avoid potential ambiguity:

      “Residues that are conserved during evolution, exposed to solvent, or inside a pocket-shaped domain are inclined to participate in ligand binding. During the preceding multi-task training process, the shared network in GPSite should have learned to capture such common binding mechanisms. Here we show how GPSite can be easily extended to the binding site prediction for other unseen ligands by adopting the pre-trained shared network as a feature extractor. We considered a carbohydrate-binding site dataset from 54 which contains 100 proteins for training and 49 for testing. We first visualized the distributions of residues in this dataset using t-SNE 55, where the residues are encoded by raw feature vectors encompassing ProtTrans embeddings and DSSP structural properties, or latent embedding vectors from the shared network of GPSite trained on the ten molecule types previously.”

      • Line 291: "Employing these informative hidden embeddings as input features to train a simple MLP exhibits remarkable performance with an AUC of 0.881 (Figure 4E), higher than that of training a single-task version of GPSite from scratch (AUC of 0.853) or other state-of-the-art methods such as MTDsite and SPRINT-CBH."

      Is it necessary to introduce other methods here? The single-task vs multi-task seems enough for what you want to show?

      RE: We thank the reviewer for the comment. As discussed above, here we aim to show the potential of GPSite for the binding site prediction of unseen ligand (i.e., carbohydrate) by adopting the pre-trained shared network as a feature extractor. Thus, we think it’s reasonable to also include the performance of other state-of-the-art methods in this carbohydrate benchmark dataset as baselines.

      • Line 321: "Specifically, a protein-level binding score can be generated for each ligand by averaging the top k predicted scores among all residues. Empirically, we set k to 5 for metal ions and 10 for other ligands, considering that the binding interfaces of metal ions are usually smaller."

      Since binding sites are usually not localized on one single amino-acid, we can expect that most of the top k residues are localized around the same area of the protein both spatially and along the sequence. Is it something you observe and could consider in your method?

      RE: We thank the reviewer for the comment. We employed a straightforward method (top-k average) to convert GPSite’s residue-level annotations into protein-level annotations, where k was set empirically based on the distributions of the numbers of binding residues per sequence observed in the training set. We have not put much effort into optimizing this strategy since it mainly serves as a proof-of-concept experiment (Figure 5 A-C) to show the potential of GPSite in discriminating ligand-binding proteins. We have now revised this sentence to better explain how we selected k:

      “Specifically, a protein-level binding score indicating the overall binding propensity to a specific ligand can be generated by averaging the top k predicted scores among all residues. Empirically, we set k to 5 for metal ions and 10 for other ligands, considering the distributions of the numbers of binding residues per sequence observed in the training set.”
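
      The top-k averaging described in this quote can be sketched in a few lines. This is an illustration under stated assumptions rather than the GPSite implementation: the function names, the ligand codes, and the behaviour for proteins with fewer than k residues (average over all available residues) are hypothetical choices; only the values k = 5 for metal ions and k = 10 for other ligands come from the text.

```python
# Hypothetical ligand codes for the four metal ions named in the response.
METAL_IONS = {"ZN", "CA", "MG", "MN"}


def top_k_for(ligand):
    """k = 5 for metal ions and k = 10 for all other ligands, as in the text."""
    return 5 if ligand in METAL_IONS else 10


def protein_level_score(residue_scores, k):
    """Average of the k highest residue-level binding scores.

    Assumption: if the protein has fewer than k residues, all residues
    are averaged instead of raising an error.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = sorted(residue_scores, reverse=True)[:k]
    if not top:
        raise ValueError("residue_scores must not be empty")
    return sum(top) / len(top)


# Made-up per-residue scores for a short toy sequence.
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.05, 0.6, 0.5]
zn_score = protein_level_score(scores, top_k_for("ZN"))    # mean of the top 5
dna_score = protein_level_score(scores, top_k_for("DNA"))  # only 8 residues exist
```

      Because the score ignores where the top-k residues sit in the structure, it does not distinguish one compact site from several scattered high-scoring residues, which is the point raised in the reviewer's question.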

      As for the question raised by the reviewer, we can indeed expect that most of the top k predicted binding residues tend to cluster into several but not necessarily one area. For instance, certain macromolecules like DNA may interact with several protein surface patches due to their elongated structures (e.g., Author response-figure 1A). Another case may be a protein binding to multiple molecules of the same ligand type (e.g., Author response-figure 1B).

      Author response image 1.

      The structures of 4XQK (A) and 4KYW (B) in PDB.

      • Line 327: The accuracy of the GPSite protein-level binding scores is further validated by the ROC curves in Figure 5B, where GPSite achieves satisfactory AUC values for all ligands except protein (AUC of 0.608).

      Here may be a good place to compare yourself with others, do other frameworks experience the same problem? If so, AUC and AUPR are not relevant here, can you expose some recall scores for example?

      RE: We thank the reviewer for the valuable recommendation. We have conducted comprehensive method comparisons in the preceding “GPSite outperforms state-of-the-art methods” section, where GPSite surpasses all existing frameworks across various ligands. Here, the genome-wide analyses of Swiss-Prot in Figure 5 serve as a downstream demonstration of GPSite’s capacity for large-scale annotations. We did not compare with other methods here since most of them are time-consuming or memory-consuming, and thus unable to process sequences of substantial quantity or length. For example, it takes about 8 min for the MSA-based method GraphBind to annotate a protein with 500 residues, while it just takes about 20 s for GPSite (see Appendix 3-figure 1 for detailed runtime comparison). It is also challenging for the atom-graph-based method PeSTo to process structures larger than 100 kDa (~1000 residues) on a 32 GB GPU as the authors suggested, while GPSite can easily process structures containing up to 2500 residues on a 16 GB GPU.

      Regarding the recall score mentioned by the reviewer, GPSite achieves a recall of 0.95 (threshold = 0.5) for identifying protein-binding proteins. This indicates that GPSite can accurately identify positive samples, but it also tends to misclassify negative samples as positive. In our original manuscript, we claimed that “This may be ascribed to the fact that protein-protein interactions are ubiquitous in living organisms while the Swiss-Prot function annotations are incomplete”. To better support this claim, we have now added two examples in Appendix 1-note 7, where GPSite confidently predicted the presence of the “protein binding” function (GO:0005515). Notably, this function was absent in these two proteins in the Swiss-Prot database at the time of manuscript preparation (release: 2023-05-03), but has been included in the latest release of Swiss-Prot (release: 2023-11-08). For convenience, we also attach the note here:

      “As depicted in Figure 5A, GPSite assigns relatively high prediction scores to the proteins without “protein binding” function in the Swiss-Prot annotations, leading to a modest AUC value of 0.608 (Figure 5B). This may be ascribed to the fact that protein-protein interactions are ubiquitous in living organisms while the Swiss-Prot function annotations are incomplete. To support this hypothesis, we present two proteins as case studies, both sharing < 20% sequence identity with the protein-binding training set of GPSite. The first case is Aminodeoxychorismate synthase component 2 from Escherichia coli (UniProt ID: P00903). GPSite confidently predicted this protein as a protein-binding protein with a high prediction score of 0.936. Notably, this protein was not annotated with the “protein binding” function (GO:0005515) or any of its GO child terms in the Swiss-Prot database at the time of manuscript preparation (https://rest.uniprot.org/unisave/P00903?format=txt&versions=171, release: 2023-05-03). However, in the latest release of Swiss-Prot (https://rest.uniprot.org/unisave/P00903?format=txt&versions=174, release: 2023-11-08) during manuscript revision, this protein is annotated with the “protein heterodimerization activity” function (GO:0046982), which is a child term of “protein binding”. In fact, the heterodimerization activity of this protein has been validated through experiments in the year of 1996 (PMID: 8679677), indicating the potential incompleteness of the Swiss-Prot annotations. The other case is Hydrogenase-2 operon protein HybE from Escherichia coli (UniProt ID: P0AAN1), which was also predicted as a protein-binding protein by GPSite (score = 0.909). Similarly, this protein was not annotated with the “protein binding” function in the Swiss-Prot database at the time of manuscript preparation (https://rest.uniprot.org/unisave/P0AAN1?format=txt&versions=108). However, in the latest release of Swiss-Prot (https://rest.uniprot.org/unisave/P0AAN1?format=txt&versions=111), this protein is annotated with the “preprotein binding” function (GO:0070678), which is a child term of “protein binding”. In fact, the preprotein binding function of this protein has been validated through experiments in the year of 2003 (PMID: 12914940). These cases demonstrate the effectiveness of GPSite for completing the missing function annotations in Swiss-Prot.”
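
      The combination discussed in this response (high recall at a threshold of 0.5 together with many negatives scored as positive) can be made concrete with a small sketch. The labels and scores below are invented toy values, and treating a score equal to the threshold as positive is an assumption; the sketch only shows how recall and the false-positive rate are computed separately from the same predictions.

```python
def recall_and_fpr(labels, scores, threshold=0.5):
    """Return (recall, false-positive rate) at a fixed score threshold.

    Assumption: a score >= threshold counts as a positive prediction.
    """
    tp = fn = fp = tn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return recall, fpr


# Toy data: 1 = annotated as protein-binding, 0 = not annotated (made-up values).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.40, 0.60, 0.70, 0.20, 0.55]

recall, fpr = recall_and_fpr(labels, scores)
```

      In this toy case three of four positives and three of four unannotated proteins exceed the threshold, so recall and false-positive rate are both high. If some of the unannotated proteins are in fact binders that the database has not yet recorded, as the note argues, the measured false-positive rate overstates the true error.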

      • Line 381: 'Despite the noteworthy advancements achieved by GPSite, there remains scope for further improvements. Given that the ESM Metagenomic Atlas 34 provides 772 million predicted protein structures along with pre-computed language model embeddings, self-supervised learning can be employed to train a GPSite model for predicting masked sequence and structure attributes, or maximizing the similarity between the learned representations of substructures from identical proteins while minimizing the similarity between those from different proteins using a contrastive loss function training from scratch. Additional opportunities for upgrade exist within the network architecture. For example, a variational Expectation-Maximization (EM) framework 58 can be adopted to handle the hierarchical graph structure inherent in proteins, which contains the top view of the residue graph and the bottom view of the atom graph inside a residue. Such an EM procedure enables training two separate graph neural networks for the two views while simultaneously allowing interaction and mutual enhancement between the two modules. Meta-learning could also be explored in this multi-task scenario, which allows fast adaptation to unseen tasks with limited labels.'

      I think this does not belong here. It feels like half of your discussion is not talking about the achievements of this paper but future very specific directions. Focus on the take-home arguments (performances of the model, ability to predict a large range of tasks, interest in key components of your model, easy use) of the paper and possible future direction but without being so specific.

      RE: We thank the reviewer for the valuable suggestion. We have now simplified the discussions on the future directions notably:

      “Despite the noteworthy advancements achieved by GPSite, there remains scope for further improvements. GPSite may be improved by pre-training on the abundant predicted structures in ESM Metagenomic Atlas, and then fine-tuning on binding site datasets. Besides, the hidden embeddings from ESMFold may also serve as informative protein representations. Additional opportunities for upgrade exist within the network architecture. For example, a variational Expectation-Maximization framework can be adopted to handle the hierarchical atom-to-residue graph structure inherent in proteins. Meta-learning could also be explored in this multi-task scenario, which allows fast adaptation to unseen tasks with limited labels.”

      • Overall there is also a lack of displayed structure. You should try to select a few examples of binding sites that were identified correctly by your method and not by others, if possible get some insights on why. Also, some negative examples could be interesting so as to have a better idea of the interest.

      RE: We thank the reviewer for the valuable recommendation. We have performed a case study for the structure of the glucocorticoid receptor in Figure 3 D-H to illustrate a potential reason for the robustness of GPSite. Moreover, we have now added a case study in Appendix 1-note 3 and Appendix 3-figure 5 to explain why GPSite sometimes is not as accurate as the state-of-the-art structure-based method. For convenience, we also attach the note and figure here:

      “Here we present an example of an RNA-binding protein, i.e., the ribosome biogenesis protein ERB1 (PDB: 7R6Q, chain m), to illustrate the impact of predicted structure’s quality. As shown in Appendix 3-figure 5, ERB1 is an integral component of a large multimer structure comprising protein and RNA chains (i.e., the state E2 nucleolar 60S ribosome biogenesis intermediate). Likely due to the neglect of interactions from other protein chains, ESMFold fails to predict the correct conformation of the ERB1 chain (TM-score = 0.24). Using this incorrect predicted structure, GPSite achieves an AUPR of 0.580, lower than GraphBind input with the native structure (AUPR = 0.636). However, the performance of GraphBind substantially declines to an AUPR of 0.468 when employing the predicted structure as input. Moreover, if GPSite adopts the native structure for prediction, a notable performance boost can be obtained (AUPR = 0.681).”

      Author response image 2.

      The prediction results of GPSite and GraphBind for the ribosome biogenesis protein ERB1. (A) The state E2 nucleolar 60S ribosome biogenesis intermediate (PDB: 7R6Q). The ribosome biogenesis protein ERB1 (chain m) is highlighted in blue, while other protein chains are colored in gray. The RNA chains are shown in orange. (B) The RNA-binding sites on ERB1 (colored in red). (C) The ESMFold-predicted structure of ERB1 (TM-score = 0.24). The RNA-binding sites are also mapped onto this predicted structure (colored in red). (D-G) The prediction results of GPSite and GraphBind for the predicted and native ERB1 structures. The confidence of the predictions is represented with a gradient of color from blue for non-binding to red for binding.

      Minor comments:

      • Line 169: "Note that since our test sets may partly overlap with the training sets of these methods, the results reported here should be the upper limits for the existing methods."

      Yes, but they were potentially not trained on the most recent structures in that case. These methods could also see improved performance with an updated training set.

      RE: We thank the reviewer for the comment. We have now deleted this sentence.

      • Line176: "Since 358 of the 375 proteins in our protein-binding site test set share > 30% identity with the training sequences of PeSTo, we re-split our protein-binding dataset to generate a test set of 65 proteins sharing < 30% identity with the training set of PeSTo for a fair evaluation."

      Too specific to be here in my opinion.

      RE: We thank the reviewer for the comment. We have now moved these details to Appendix 1-note 2. The description in the main text here is now more concise:

      “Given the substantial overlap between our protein-binding site test set and the training set of PeSTo, we conducted separate training and comparison using the datasets of PeSTo, where GPSite still demonstrates a remarkable improvement over PeSTo (Appendix 1-note 2).”

      • Figure 2. The authors should try to either increase Fig A's size or increase the font size. This could probably be done by compressing the size of Figure C into a single figure.

      RE: We thank the reviewer for the suggestion. We have now increased the font size in Figure A. Besides, the figures in the final version of the manuscript should be clearer where we could upload SVG files.

      • Have you tried using embeddings from more structure-aware pLM such as ESM Fold embeddings (fine-tuned) or ProstTrans (that may be more recent than this study)?

      RE: We thank the reviewer for the insightful comment. We have not yet explored the embeddings from structure-aware pLM, but we acknowledge its potential as a promising avenue for future investigation. We have now added this point in our Discussion section:

      “Besides, the hidden embeddings from ESMFold may also serve as informative protein representations.”

      Reviewer #3 (Public Review):


      The authors of this work aim to address the challenge of accurately and efficiently identifying protein binding sites from sequences. They recognize the limitations of current methods, including reliance on multiple sequence alignments or experimental protein structures and the under-explored geometry of the structure, which limit performance and genome-scale applications. The authors have developed a multi-task network called GPSite that predicts binding residues for a range of biologically relevant molecules, including DNA, RNA, peptides, proteins, ATP, HEM, and metal ions, using a combination of sequence embeddings from protein language models and ESMFold-predicted structures. Their approach attempts to extract residual and relational geometric contexts in an end-to-end manner, surpassing current sequence-based and structure-based methods.


      • The GPSite model's ability to predict binding sites for a wide variety of molecules, including DNA, RNA, peptides, and various metal ions.

      • Based on the presented results, GPSite outperforms state-of-the-art methods in several benchmark datasets.

      • GPSite adopts predicted structures instead of native structures as input, enabling the model to be applied to a wider range of scenarios where native structures are rare.

      • The authors emphasize the low computational cost of GPSite, which enables rapid genome-scale binding residue annotations, indicating the model's potential for large-scale applications.

      RE: We thank the reviewer for recognizing the significance and value of our work!


      • One major advantage of GPSite, as claimed by the authors, is its efficiency. Although the manuscript mentioned that the inference takes about 5 hours for all datasets, it remains unclear how much improvement GPSite can offer compared with existing methods. A more detailed benchmark comparison of running time against other methods is recommended (including the running time of different components, since some methods like GPSite use predicted structures while some use native structures).

      RE: We thank the reviewer for the valuable suggestion. Empirically, it takes about 5-20 min for existing MSA-based methods to make predictions for a protein with 500 residues, while it only takes about 1 min for GPSite (including structure prediction). However, it is worth noting that some predictors in our benchmark study are solely available as webservers, and it is challenging to compare the runtime between a standalone program and a webserver due to the disparity in hardware configurations. Therefore, we have now included comprehensive runtime comparisons between the GPSite webserver and other top-performing servers in Appendix 3-figure 1 to illustrate the practicality and efficiency of our method. For convenience, we also attach the figure here as Author response-figure 3. The corresponding description is now added in the “GPSite outperforms state-of-the-art methods” section:

      “Moreover, GPSite is computationally efficient, achieving comparable or faster prediction speed compared to other top-performing methods (Appendix 3-figure 1).”

      Author response image 3.

      Runtime comparison of the GPSite webserver with other top-performing servers. Five protein chains (i.e., 8HN4_B, 8USJ_A, 8C1U_A, 8K3V_A and 8EXO_A) comprising 100, 300, 500, 700, and 900 residues, respectively, were selected for testing, and the average runtime is reported for each method. Note that a significant portion of GPSite’s runtime (75 s, indicated in orange) is allocated to structure prediction using ESMFold.

      • Since the model uses predicted protein structure, the authors have conducted some studies on the effect of the predicted structure's quality. However, only the 0.7 threshold was used. A more comprehensive analysis with several different thresholds is recommended.

      RE: We thank the reviewer for the comment. We assessed the effect of the predicted structure's quality by evaluating GPSite’s performance on high-quality (TM-score > 0.7) and low-quality (TM-score ≤ 0.7) predicted structures. We did not employ multiple thresholds (e.g., 0.3, 0.5, and 0.7), as the majority of proteins in the test sets were accurately predicted by ESMFold. Specifically, as shown in Figure 3B, Appendix 3-figure 3 and Appendix 2-table 5, the numbers of proteins with TM-score ≤ 0.7 are small in most datasets (e.g., 42 for DNA and 17 for ATP). Consequently, there is insufficient data available for analysis with lower thresholds, except for the RNA test set. Notably, Figure 3C presents a detailed inspection of the 104 proteins with TM-score < 0.5 in the RNA test set. Within this subset, GPSite consistently outperforms the state-of-the-art structure-based method GraphBind with predicted structures as input, regardless of the prediction quality of ESMFold. Only in cases where structures are predicted with extremely low quality (TM-score < 0.3) does GPSite fall behind GraphBind input with native structures. This result further demonstrates the robustness of GPSite. We have now added clearer explanations in the “GPSite is robust for low-quality predicted structures” section:

      “Figure 3B and Appendix 3-figure 3 show the distributions of TM-scores between native and predicted structures calculated by US-align in the ten benchmark datasets, where most proteins are accurately predicted with TM-score > 0.7 (see also Appendix 2-table 5)”; “Given the infrequency of low-quality predicted structures except for the RNA test set, we took a closer inspection of the 104 proteins with predicted structures of TM-score < 0.5 in the RNA test set.”
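
      The multi-threshold analysis requested by the reviewer amounts to binning proteins by TM-score and summarising a per-protein metric within each bin. The following is a minimal sketch under stated assumptions: the bin edges (0.3, 0.5, 0.7) follow the thresholds mentioned in the response, the choice to place a score equal to an edge in the lower bin is ours, and the records are made-up values rather than results from the paper.

```python
from bisect import bisect_left
from statistics import mean

# Assumed bin edges, taken from the thresholds named in the response.
EDGES = [0.3, 0.5, 0.7]
LABELS = ["<=0.3", "(0.3, 0.5]", "(0.5, 0.7]", ">0.7"]


def stratify_by_tm_score(records):
    """Group (tm_score, metric) pairs into TM-score bins.

    Returns {bin_label: (count, mean_metric)}; mean_metric is None
    for an empty bin.
    """
    bins = {label: [] for label in LABELS}
    for tm_score, metric in records:
        # bisect_left puts a score equal to an edge into the lower bin,
        # so TM-score = 0.7 is counted as low quality (<= 0.7).
        bins[LABELS[bisect_left(EDGES, tm_score)]].append(metric)
    return {
        label: (len(values), mean(values) if values else None)
        for label, values in bins.items()
    }


# Hypothetical (TM-score, per-protein AUPR) pairs, for illustration only.
records = [(0.25, 0.50), (0.45, 0.60), (0.70, 0.70), (0.72, 0.90), (0.95, 0.80)]
summary = stratify_by_tm_score(records)
```

      Reporting the count alongside the mean makes the limitation raised in the response visible: a bin holding only a handful of proteins yields a mean that is too noisy to interpret.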

      • To demonstrate the robustness of GPSite, the authors performed a case study on human GR containing two zinc fingers, where the predicted structure is not perfect. The analysis could benefit from a more detailed explanation of why the model can still infer the binding site correctly even though the input structural information is slightly off.

      RE: We thank the reviewer for the comment. We have actually explained the potential reason for the robustness of GPSite in the second paragraph of the “GPSite is robust for low-quality predicted structures” section. In summary, although the whole structure of this protein is not perfectly predicted, the local structures of the binding domains of peptide, DNA and Zn2+ are actually predicted accurately as evidenced by the superpositions of the native and predicted structures in Figure 3D and 3E. Therefore, GPSite can still make reliable predictions. We have now revised this paragraph to explain these more clearly:

      “Figure 3D shows the structure of the human glucocorticoid receptor (GR), a transcription factor that binds DNA and assembles a coactivator peptide to regulate gene transcription (PDB: 7PRW, chain A). The DNA-binding domain of GR also consists of two C4-type zinc fingers to bind Zn2+ ions. Although the structure of this protein is not perfectly predicted (TM-score = 0.72), the local structures of the binding domains of peptide and DNA are actually predicted accurately as viewed by the superpositions of the native and predicted structures in Figure 3D and 3E. Therefore, GPSite can correctly predict all Zn2+ binding sites and precisely identify the binding sites of DNA and peptide with AUPR values of 0.949 and 0.924, respectively (Figure 3F, G and H).”

      • To analyze the relatively low AUC value for protein-protein interactions, the authors claimed that it is "due to the fact that protein-protein interactions are ubiquitous in living organisms while the Swiss-Prot function annotations are incomplete", which is unjustified. It is highly recommended to support this claim by showing at least one example where GPSite's prediction is a valid binding site that is not present in the current Swiss-Prot database or via other approaches.

      RE: We thank the reviewer for the valuable recommendation. To support this claim, we have now added two examples in Appendix 1-note 7, where GPSite confidently predicted the presence of the “protein binding” function (GO:0005515). Notably, this function was absent in these two proteins in the Swiss-Prot database at the time of manuscript preparation (release: 2023-05-03), but has been included in the latest release of Swiss-Prot (release: 2023-11-08). For convenience, we also attach the note below:

      “As depicted in Figure 5A, GPSite assigns relatively high prediction scores to the proteins without “protein binding” function in the Swiss-Prot annotations, leading to a modest AUC value of 0.608 (Figure 5B). This may be ascribed to the fact that protein-protein interactions are ubiquitous in living organisms while the Swiss-Prot function annotations are incomplete. To support this hypothesis, we present two proteins as case studies, both sharing < 20% sequence identity with the protein-binding training set of GPSite. The first case is Aminodeoxychorismate synthase component 2 from Escherichia coli (UniProt ID: P00903). GPSite confidently predicted this protein as a protein-binding protein with a high prediction score of 0.936. Notably, this protein was not annotated with the “protein binding” function (GO:0005515) or any of its GO child terms in the Swiss-Prot database at the time of manuscript preparation (https://rest.uniprot.org/unisave/P00903?format=txt&versions=171, release: 2023-05-03). However, in the latest release of Swiss-Prot (https://rest.uniprot.org/unisave/P00903?format=txt&versions=174, release: 2023-11-08) during manuscript revision, this protein is annotated with the “protein heterodimerization activity” function (GO:0046982), which is a child term of “protein binding”. In fact, the heterodimerization activity of this protein has been validated through experiments in the year of 1996 (PMID: 8679677), indicating the potential incompleteness of the Swiss-Prot annotations. The other case is Hydrogenase-2 operon protein HybE from Escherichia coli (UniProt ID: P0AAN1), which was also predicted as a protein-binding protein by GPSite (score = 0.909). Similarly, this protein was not annotated with the “protein binding” function in the Swiss-Prot database at the time of manuscript preparation (https://rest.uniprot.org/unisave/P0AAN1?format=txt&versions=108). However, in the latest release of Swiss-Prot (https://rest.uniprot.org/unisave/P0AAN1?format=txt&versions=111), this protein is annotated with the “preprotein binding” function (GO:0070678), which is a child term of “protein binding”. In fact, the preprotein binding function of this protein has been validated through experiments in the year of 2003 (PMID: 12914940). These cases demonstrate the effectiveness of GPSite for completing the missing function annotations in Swiss-Prot.”

      • The authors reported that many GPSite-predicted binding sites are associated with known biological functions. Notably, for RNA-binding sites, there is a significantly higher proportion of translation-related binding sites. The analysis could benefit from a further investigation into this observation, such as analyzing the percentage of such interactions in the training set. In addition, if there is sufficient data, it would also be interesting to see the cross-interaction-type performance of the proposed model, e.g., train the model on a dataset excluding specific binding sites and test its performance on that class of interactions.

      RE: We thank the reviewer for the suggestion. We would like to clarify that the analysis in Figure 5C was conducted at “protein-level” instead of “residue-level”. As described in the second paragraph of the “Large-scale binding site annotation for Swiss-Prot” section, a protein-level ligand-binding score was assigned to a protein by averaging the top k residue-level predicted binding scores. This protein-level score indicates the overall binding propensity of the protein to a specific ligand. We gathered the top 20,000 proteins with the highest protein-level binding scores for each ligand and found that their biological process annotations from Swiss-Prot were consistent with existing knowledge. We have now revised the corresponding sentence to explain these more clearly:

      “Exploiting the residue-level binding site annotations, we could readily extend GPSite to discriminate between binding and non-binding proteins of various ligands. Specifically, a protein-level binding score indicating the overall binding propensity to a specific ligand can be generated by averaging the top k predicted scores among all residues.”

      As for the cross-interaction-type performance raised by the reviewer, we have now conducted cross-type evaluations to investigate the specificity of the ligand-specific MLPs and the inherent similarities among different ligands in Appendix 1-note 6 and Appendix 2-table 10. For convenience, we also attach the note and table here:

      “We conducted cross-type evaluations by applying different ligand-specific MLPs in GPSite for the test sets of different ligands. As shown in Appendix 2-table 10, for each ligand-binding site test set, the corresponding ligand-specific network consistently achieves the best performance. This indicates that the ligand-specific MLPs have specifically learned the binding patterns of particular molecules. We also noticed that the cross-type performance is reasonable for the ligands sharing similar properties. For instance, the DNA-specific MLP exhibits a reasonable AUPR when predicting RNA-binding sites, and vice versa. Similar trends are also observed between peptide and protein, as well as among metal ions as expected. Interestingly, the cross-type performance between ATP and HEM is also acceptable, potentially attributed to their comparable molecular weights (507.2 and 616.5, respectively).”

      Author response table 4.

      Cross-type performance by applying different ligand-specific MLPs in GPSite for the test sets of different ligands

      Note: “Pep” and “Pro” denote peptide and protein, respectively. The numbers in this table are AUPR values. The best/second-best result in each test set is indicated by bold/underlined font.
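
      The cross-type evaluation behind this table can be sketched as filling a matrix of AUPR values, one cell per (test set, ligand-specific head) pair, and then checking which head scores best on each test set. Everything below is an illustrative assumption: the data layout, the two-ligand toy example and its numbers are invented, and AUPR is approximated by step-wise average precision with ties broken by input order.

```python
def average_precision(labels, scores):
    """Step-wise average precision, a common estimate of AUPR."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank residues from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this positive
    return total / n_pos


def cross_type_matrix(labels_by_test, scores_by_head):
    """Return matrix[test_set][head] = average precision."""
    return {
        test: {
            head: average_precision(labels, scores_by_head[head][test])
            for head in scores_by_head
        }
        for test, labels in labels_by_test.items()
    }


# Toy residue labels per test set (made-up values).
labels_by_test = {"DNA": [1, 0, 1, 0], "RNA": [0, 1, 1, 0]}

# scores_by_head[head][test_set]: scores from one head on one test set.
scores_by_head = {
    "DNA": {"DNA": [0.9, 0.1, 0.8, 0.2], "RNA": [0.3, 0.6, 0.4, 0.5]},
    "RNA": {"DNA": [0.6, 0.5, 0.4, 0.3], "RNA": [0.1, 0.9, 0.7, 0.2]},
}

matrix = cross_type_matrix(labels_by_test, scores_by_head)
best_head = {test: max(row, key=row.get) for test, row in matrix.items()}
```

      In this toy example the matched head is best on each test set while the mismatched head still ranks binding residues well above chance, which mirrors the qualitative pattern the note describes for DNA and RNA.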

    1. Author Response

      eLife assessment

      The authors report that optogenetic inhibition of hippocampal axon terminals in retrosplenial cortex impairs the performance of a delayed non-match to place task. The significance of findings elucidating the role of hippocampal projections to the retrosplenial cortex in memory and decision-making behaviors is important. However, the strength of evidence for the paper's claims is currently incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):


      This is a study on the role of the retrosplenial cortex (RSC) and the hippocampus in working memory. Working memory is a critical cognitive function that allows temporary retention of information for task execution. The RSC, which is functionally and anatomically connected to both primary sensory (especially visual) and higher cognitive areas, plays a key role in integrating spatial-temporal context and in goal-directed behaviors. However, the specific contributions of the RSC and the hippocampus in working memory-guided behaviors are not fully understood due to a lack of studies that experimentally disrupt the connection between these two regions during such behaviors.

      In this study, researchers employed eArch3.0 to silence hippocampal axon terminals in the RSC, aiming to explore the roles of these brain regions in working memory. Experiments were conducted where animals with silenced hippocampal axon terminals in the RSC performed a delayed non-match to place (DNMP) task. The results indicated that this manipulation impaired memory retrieval, leading to decreased performance and quicker decision-making in the animals. Notably, the authors observed that the effects of this impairment persisted beyond the light-activation period of the opsin, affecting up to three subsequent trials. They suggest that disrupting the hippocampal-RSC connection has a significant and lasting impact on working memory performance.


      They conducted a study exploring the impact of direct hippocampal inputs into the RSC, a region involved in encoding spatial-temporal context and transferring contextual information, on spatial working memory tasks. Utilizing eArch3.0 expressed in hippocampal neurons via the viral vector AAV5-hSyn1-eArch3.0, they aimed to bilaterally silence hippocampal terminals located at the RSC in rats pre-trained in a DNMP task. They discovered that silencing hippocampal terminals in the RSC significantly decreased working memory performance in eArch+ animals, especially during task interleaving sessions (TI) that alternated between trials with and without light delivery. This effect persisted even in non-illuminated trials, indicating a lasting impact beyond the periods of direct manipulation. Additionally, they observed a decreased likelihood of correct responses following TI trials and an increased error rate in eArch+ animals, even after incorrect responses, suggesting an impairment in error-corrective behavior. This contrasted with baseline sessions where no light was delivered, and both eArch+ and control animals showed low error rates.


      While I agree with the authors that the role of hippocampal inputs to the RSC in spatial working memory is understudied and merits further investigation, I find that the optogenetic experiment, a core part of this manuscript that includes viral injections, could be improved. The effects were rather subtle, rendering some of the results barely significant and possibly too weak to support major conclusions.

      We thank Reviewer#1 for carefully and critically reading our manuscript, and for the valuable comments provided. The judged “subtlety” of the effects stems from a perspective according to which a quantitatively lower effect bears less biological significance for cognition. We disagree with this perspective and find it rather reductive for several reasons.

      Once seen in the context of the animal’s ecology, subtle impairments can be life-threatening precisely because of their subtlety, leading the animal to confidently rely on a defective capacity, for such events as remembering the habitual location of a predator, or food source.

      Also, studies in animal cognition often undertake complete, rather than graded, suppression of a given mechanism (in the same sense as that of “knocking out” a gene that is relevant for behaviour), leading to a gravely, rather than gradually, impaired model system, to the point of not allowing a hypothetical causal link to be mechanistically revealed beyond its mere presence. This often hinders a thorough interpretation of the perturbed factor’s role. If a caricatural analogy is allowed, it would be as if we were to study the role of an animal’s legs by chopping them both off and observing the resulting behaviour.

      In our study we conclude that silencing HIPP inputs in RSC perturbs cognition enough to impair behaviour while not disabling the animal entirely, as such allowing for behaviour to proceed, and for our observation of graded, decreased (not absent), proficiency under optogenetic silencing. So rather than weak, we would say the results are statistically significant, and biologically realistic.

      Additionally, no mechanistic investigation was conducted beyond referencing previous reports to interpret the core behavioral phenotypes.

      We fully agree with this being a weakness, as we wish we could have done more mechanistic studies to find out exactly what Arch activation is doing to HIPP-RSC transmission, which neurons are being affected, and perhaps in the future dissect its circuit determinants. We have all these goals firmly in mind and hope we can address them soon.

      Reviewer #2 (Public Review):

      The authors examine the impact of optogenetic inhibition of hippocampal axon terminals in the retrosplenial cortex (RSP) during the performance of a working memory T-maze task. Performance on a delayed non-match-to-place task was impaired by such inhibition. The authors also report that inhibition is associated with faster decision-making and that the effects of inhibition can be observed over several subsequent trials. The work seems reasonably well done and the role of hippocampal projections to retrosplenial cortex in memory and decision-making is very relevant to multiple fields. However, the work should be expanded in several ways before one can make firm conclusions on the role of this projection in memory and behavior.

      We thank Reviewer#2 for carefully and critically reading our manuscript, and for the valuable comments provided.

      (1) The work is very singular in its message and the experimentation. Further, the impact of the inhibition on behaviour is very moderate. In this sense, the results do not support the conclusion that the hippocampal projection to retrosplenial cortex is key to working memory in a navigational setting.

      As we have mentioned in response to Reviewer#1, the judged “very moderate” effect stems from a perspective according to which a quantitatively lower effect bears less biological significance for cognition, precluding its consideration as “key” for behaviour. We disagree with this perspective and find it rather reductive for several reasons. Once seen in the context of the animal’s ecology, quantitatively lower impairments in working memory are no less key for this cognitive capacity, and can be life-threatening precisely because of their subtlety, leading the animal to confidently rely on a defective capacity, for such events as remembering the habitual location of a predator, or food source. Furthermore, studies in animal cognition often undertake complete, rather than graded, suppression of a given mechanism (in the same sense as “knocking out” a gene that is relevant for behaviour), leading to a gravely, rather than gradually, impaired model system, to the point of not allowing a hypothetical causal link to be mechanistically revealed beyond its mere presence. This often hinders a thorough interpretation of its role.

      In our study we conclude that silencing HIPP inputs in RSC perturbs cognition enough to impair behaviour while not disabling the animal entirely, as such allowing for behaviour to proceed, and for our observation of graded, decreased (not absent), proficiency under optogenetic silencing. So rather than weak, we would say the results are statistically significant, and biologically realistic.

      (2) There are no experiments examining other types of behavior or working memory. Given that the animals used in the studies could be put through a large number of different tasks, this is surprising. There is no control navigational task. There is no working memory test that is non-spatial. Such results should be presented in order to put the main finding in context.

      It is hard to gainsay this point. The more thorough and complete a behavioural characterization is, the more informative is the study, from every angle you look at it. While we agree that other forms of WM would be quite interesting in this context, we also cannot ignore the fact that DNMP is widely tested as a WM task, one that is biologically plausible, sensitive to perturbations of neural circuitry known to be at play therein, and fully accepted in the field. Faced with the impossibility of running further studies, for lack of additional funding and human resources, we chose to run this task.

      A control navigational task would, in our understanding, be used to assess whether silencing HIPP projections to RSC would affect (spatial?) navigation, rather than WM, thus explaining the observed impairment. To this we have the following to say: spatial navigation is a very basic cognitive function, one that relies on body orientation relative to spatial context, on keeping an updated representation of such spatial context, (“alas”, as memory), and on guiding behaviour according to acquired knowledge about spatial context. Some of these functions are integral to spatial working memory; as such, they might indeed be affected.

      Dissecting the determinants of spatial WM is indeed an ongoing effort, one that was not the intention of the current study, but also one that we keep in mind and hope to address in the future.

      A non-spatial WM task would indeed vastly solidify our claims beyond spatial WM, onto WM. We have, for this reason, changed the title of the manuscript which now reads “spatial working memory”.

      (3) The actual impact of the inhibition on activity in RSP is not provided. While this may not be strictly necessary, it is relevant that the hippocampal projection to RSP includes, and is perhaps dominated by inhibitory inputs. I wonder why the authors chose to manipulate hippocampal inputs to RSP when the subiculum stands as a much stronger source of afferents to RSP and has been shown to exhibit spatial and directional tuning of activity. The points here are that we cannot be sure what the manipulation is really accomplishing in terms of inhibiting RSP activity (perhaps this explains the moderate impact on behavior) and that the effect of inhibiting hippocampal inputs is not an effective means by which to study how RSP is responsive to inputs that reflect environmental locations.

      We fully agree that neural recordings addressing the effect of silencing on RSC neural activity are relevant. We do wish we could have provided more mechanistic studies, to find out exactly what Arch activation is doing to HIPP-RSC transmission and which neurons are being affected, and thus to dissect its circuit determinants. We keep all these goals in mind and hope we can address them soon. Subiculum, which we mention in the Introduction, is indeed a key player in this complex circuitry, one whose hypothetical influence is the subject of experimental studies which will certainly reveal many other key elements.

      (4) The impact of inhibition on trials subsequent to the trial during which optical stimulation was actually supplied seems trivial. The authors themselves point to evidence that activation of the hyperpolarizing proton pump is rather long-lasting in its action. Further, each sample-test trial pairing is independent of the prior or subsequent trials. This finding is presented as a major finding of the work, but would normally be relegated to supplemental data as an expected outcome given the dynamics of the pump when activated.

      We disagree that this finding is “trivial”, and object to the considerations of “normalcy”, which we are left wondering about.

      In the absence of neurophysiological experiments (for the reasons stated above) to address this interesting finding, we chose to interpret it in light of (the few) published observations, such being the logical course of action in scientific reporting, given the present circumstances.

      Evidence for such a prolonged effect in the context of behaviour is scarce (to our knowledge only the one we cite in the manuscript). As such, it is highly relevant to report it, and give it the relevance we do in our manuscript, rather than “relegating it to supplementary data”, as the reviewer considers being “normal”.

      In the DNMP task the consecutive sample-test pairs are explicitly not independent, as they are part of the same behavioural session. This is illustrated by the simple phenomenon of learning, namely the intra-session learning curves, and the well-known behavioral trial-history effects. The brain does not simply erase such information during the ITI.

      (5) In the middle of the first paragraph of the discussion, the authors make reference to work showing RSP responses to "contextual information in egocentric and allocentric reference frames". The citations here are clearly deficient. How is the Nitzan 2020 paper at all relevant here?

      Nitzan 2020 reports the propagation of information from HIPP to CTX via SUB and RSC, thus providing a conduit for mnemonic information between the two structures, alternative to the one we target, thus providing thorough information concerning the HIPP-RSC circuitry at play during behaviour.

      Alexander and Nitz 2015 precisely cite the encoding, and conjunction, of two types of contextual information, internal (ego-) and external (allocentric).

      The subsequent reference is indeed superfluous here.

      We thank Reviewer#2 for calling our attention to the fact that references for this information are inadequate and lacking. We have now cited (Gill et al., 2011; Miller et al., 2019; Vedder et al., 2017) and refer readers to the review (Alexander et al., 2023) for the purpose of illustrating the encoding of information in the two reference frames. In addition, we have substantially edited the Introduction and Discussion sections, and suppressed unnecessary passages.

      (6) The manuscript is deficient in referencing and discussing data from the Smith laboratory that is similar. The discussion reads mainly like a repeat of the results section.

      Please see above. We thank Reviewer#2 for this comment; we have now re-written the Discussion such that it is less of a summary of the Results and more focused on their implications and future directions.

    2. eLife assessment

      The authors report that optogenetic inhibition of hippocampal axon terminals in retrosplenial cortex impairs the performance of a delayed non-match to place task. The significance of findings elucidating the role of hippocampal projections to the retrosplenial cortex in memory and decision-making behaviors is important. However, the strength of evidence for the paper's claims is currently incomplete.

    3. Reviewer #1 (Public Review):


      This is a study on the role of the retrosplenial cortex (RSC) and the hippocampus in working memory. Working memory is a critical cognitive function that allows temporary retention of information for task execution. The RSC, which is functionally and anatomically connected to both primary sensory (especially visual) and higher cognitive areas, plays a key role in integrating spatial-temporal context and in goal-directed behaviors. However, the specific contributions of the RSC and the hippocampus in working memory-guided behaviors are not fully understood due to a lack of studies that experimentally disrupt the connection between these two regions during such behaviors.

      In this study, researchers employed eArch3.0 to silence hippocampal axon terminals in the RSC, aiming to explore the roles of these brain regions in working memory. Experiments were conducted where animals with silenced hippocampal axon terminals in the RSC performed a delayed non-match to place (DNMP) task. The results indicated that this manipulation impaired memory retrieval, leading to decreased performance and quicker decision-making in the animals. Notably, the authors observed that the effects of this impairment persisted beyond the light-activation period of the opsin, affecting up to three subsequent trials. They suggest that disrupting the hippocampal-RSC connection has a significant and lasting impact on working memory performance.


      They conducted a study exploring the impact of direct hippocampal inputs into the RSC, a region involved in encoding spatial-temporal context and transferring contextual information, on spatial working memory tasks. Utilizing eArch3.0 expressed in hippocampal neurons via the viral vector AAV5-hSyn1-eArch3.0, they aimed to bilaterally silence hippocampal terminals located at the RSC in rats pre-trained in a DNMP task. They discovered that silencing hippocampal terminals in the RSC significantly decreased working memory performance in eArch+ animals, especially during task interleaving sessions (TI) that alternated between trials with and without light delivery. This effect persisted even in non-illuminated trials, indicating a lasting impact beyond the periods of direct manipulation. Additionally, they observed a decreased likelihood of correct responses following TI trials and an increased error rate in eArch+ animals, even after incorrect responses, suggesting an impairment in error-corrective behavior. This contrasted with baseline sessions where no light was delivered, and both eArch+ and control animals showed low error rates.


      While I agree with the authors that the role of hippocampal inputs to the RSC in spatial working memory is understudied and merits further investigation, I find that the optogenetic experiment, a core part of this manuscript that includes viral injections, could be improved. The effects were rather subtle, rendering some of the results barely significant and possibly too weak to support major conclusions. Additionally, no mechanistic investigation was conducted beyond referencing previous reports to interpret the core behavioral phenotypes.

    4. Reviewer #2 (Public Review):

      The authors examine the impact of optogenetic inhibition of hippocampal axon terminals in the retrosplenial cortex (RSP) during the performance of a working memory T-maze task. Performance on a delayed non-match-to-place task was impaired by such inhibition. The authors also report that inhibition is associated with faster decision-making and that the effects of inhibition can be observed over several subsequent trials. The work seems reasonably well done and the role of hippocampal projections to retrosplenial cortex in memory and decision-making is very relevant to multiple fields. However, the work should be expanded in several ways before one can make firm conclusions on the role of this projection in memory and behavior.

      (1) The work is very singular in its message and the experimentation. Further, the impact of the inhibition on behavior is very moderate. In this sense, the results do not support the conclusion that the hippocampal projection to retrosplenial cortex is key to working memory in a navigational setting.

      (2) There are no experiments examining other types of behavior or working memory. Given that the animals used in the studies could be put through a large number of different tasks, this is surprising. There is no control navigational task. There is no working memory test that is non-spatial. Such results should be presented in order to put the main finding in context.

      (3) The actual impact of the inhibition on activity in RSP is not provided. While this may not be strictly necessary, it is relevant that the hippocampal projection to RSP includes, and is perhaps dominated by inhibitory inputs. I wonder why the authors chose to manipulate hippocampal inputs to RSP when the subiculum stands as a much stronger source of afferents to RSP and has been shown to exhibit spatial and directional tuning of activity. The points here are that we cannot be sure what the manipulation is really accomplishing in terms of inhibiting RSP activity (perhaps this explains the moderate impact on behavior) and that the effect of inhibiting hippocampal inputs is not an effective means by which to study how RSP is responsive to inputs that reflect environmental locations.

      (4) The impact of inhibition on trials subsequent to the trial during which optical stimulation was actually supplied seems trivial. The authors themselves point to evidence that activation of the hyperpolarizing proton pump is rather long-lasting in its action. Further, each sample-test trial pairing is independent of the prior or subsequent trials. This finding is presented as a major finding of the work, but would normally be relegated to supplemental data as an expected outcome given the dynamics of the pump when activated.

      (5) In the middle of the first paragraph of the discussion, the authors make reference to work showing RSP responses to "contextual information in egocentric and allocentric reference frames". The citations here are clearly deficient. How is the Nitzan 2020 paper at all relevant here?

      (6) The manuscript is deficient in referencing and discussing data from the Smith laboratory that is similar. The discussion reads mainly like a repeat of the results section.

    1. Author Response

      The following is the authors’ response to the current reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Hats off to the authors for taking time to decipher the seemingly subtle but important differences between the Gnai2/3 double mutant and Ptx mutant phenotypes. These results further illustrate the dynamic requirement of Gnai/o in hair bundle establishment. I have some minor suggestions for the authors to consider and it is up to the authors to decide whether to incorporate them:

      We decided to make the current (revised) version the version of record, and we explain why below. Please include these comments in the review+rebuttal material.

      (1) The abstract could be modified to reflect the revised interpretations of the results.

      Response: the abstract is high-level and the changes in interpretation in the revised manuscript do not modify the message there. Briefly, the abstract only states that Gnai2; Gnai3 double mutants recapitulate two defects previously only observed with pertussis toxin. There is no claim about the timing or dose of GNAI proteins involved.

      (2) The three rows of OHCs are like a different beast from each other. Mireille Montcouquiol's lab has demonstrated that there is a differential requirement for Gnai3 in hair bundle orientation among the three rows of OHCs. The results described in this manuscript support this notion as well.

      To clarify, Gnai3 inactivation does not affect OHC orientation. Only pertussis toxin, and in this work Gnai2; Gnai3 double mutants, do. The Montcouquiol lab showed different degrees of OHC1, OHC2 and OHC3 misorientation upon use of pertussis toxin in vitro using cochlear explants (Ezan et al 2013). We showed the same thing in vivo using transgenic models (Tarchini et al 2013; Tarchini et al 2016). The different OHC responses by row and corresponding citations are mentioned in several locations in the manuscript, including first on line 112 in the Introduction and in Fig. 1C in a graphical summary.

      (3) I wonder if "compensate" or "redundancy" may be a better term to use than "rescue" in the Discussion and figure.

      “Rescue” is used in the Discussion on lines 603 and 604. We think that “rescue” is appropriate to refer to the ability of GNAI2 to compensate for the loss of GNAI1 and GNAI3 in a mutant context. We would argue that these different wordings are largely interchangeable and do not change the message.

      Author Response

      The following is the authors’ response to the original reviews.

      We really appreciate the time the reviewers spent reading and commenting on the original manuscript. Although they were positive already, we decided to spend some time to address the main comments with new experiments as thoroughly as possible in a new manuscript version. We also heavily edited some sections accordingly: 1) We delayed pertussis toxin activation in hair cells with Atoh1-Cre to show that the resulting misorientation phenotype is delayed compared to FoxG1-Cre results, as also seen in Gnai2; Gnai3 double mutants. It follows that Gnai2; Gnai3 and pertussis mutants do share a similar misorientation profile, and that GNAI proteins are required to normally reverse OHC1-2 (from medial to lateral), but also to maintain the lateral orientation, at least transiently. 2) We experimentally verified that one of our GNAI antibodies can indeed detect GNAI1, and consequently that absence of signal in Gnai2; Gnai3 double mutants is evidence that GNAI1 is not involved in apical hair cell polarization. We believe these changes strengthen the manuscript and its conclusions.

      Reviewer #1 (Public Review):

      A subclass of inhibitory heterotrimeric guanine nucleotide-binding protein subunits, GNAI, has been implicated in sensory hair cell formation, namely the establishment of hair bundle (stereocilia) orientation and staircase formation. However, the former role of hair bundle orientation has only been demonstrated in mutants expressing pertussis toxin, which blocks all GNAI subunits, but not in mutants with a single knockout of any of the Gnai genes, suggesting that there is a redundancy among various GNAI proteins in this role. Using various conditional mutants, the authors concluded that GNAI3 is the primary GNAI protein required for hair bundle morphogenesis, whereas hair bundle orientation requires both GNAI2 and GNAI3.


      Various compound mutants were generated to decipher the contribution of individual GNAI1, GNAI2, GNAI3 and GNAIO in the establishment of hair bundle orientation and morphogenesis. The study is thorough with detailed quantification of hair bundle orientation and morphogenesis, as well as auditory functions.


      While the hair bundle orientation phenotypes in the Foxg1-cre; Gnai2-/-; Gnai3 lox/lox (double mutants) appear more severe than those observed in Ptx cKO mutants, it may be an oversimplification to attribute the differences to more GNAI function in the Ptx cKO mutants. The phenotypes between the double mutants and Ptx cKO mutants appear qualitatively different. For example, assuming the milder phenotypes in the Ptx cKO are due to incomplete loss of GNAI function, one would expect the Ptx phenotype would be reproducible by some combination of compound mutants among various Gnai genes. Such information was not provided. Furthermore, of all the double mutant specimens analyzed for hair bundle orientation (Fig. 8), the hair bundle/kinocilium position started out normally in the lateral quadrant at E17.5 but failed to be maintained by P0. This does not appear to be the case for Ptx cKO, in which all affected hair cells showed inverted orientation by E17.5. It is not clear whether this is the end-stage of bundle orientation in Ptx cKO, and the kinocilium position started out normal, similar to the double mutants before the age of analysis at E17.5. Understanding these differences may reveal specific requirements of individual GNAI subunits or whether other factors are being affected in the Ptx mutants.

      This criticism was very useful and prompted new experiments as well as a change in data presentation and a fundamental rewrite regarding hair cell orientation. These changes are detailed below. Of note, however, please let us clarify that the original manuscript did show that the ptxA orientation phenotype is reproduced to some extent in Gnai2; Gnai3 double mutants (previously Fig. 8 and corresponding text line 505). We showed that OHC1-2 are also inverted in the double mutant, although at a later differentiation stage. We recognize that similarities in hair cell misorientation between ptxA and Gnai2; Gnai3 DKO were not explained and discussed well enough. This part of the manuscript has been re-worked extensively, and we hope that along with new results, comparisons between mutant models are easier to follow and understand. We notably fully adopted the idea that there are qualitative differences between ptxA and Gnai2; Gnai3 mutants, and not only a difference in the remaining “dose” of GNAI activity.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Comments related to clarification of the weakness:

      (1) In general, hair bundle orientation in the double mutants is established in the lateral quadrant of the cochlea before being inverted (Fig. 8). These results are intriguing because the lateral orientation is the correct position for these hair bundles normally and Gnai proteins are thought to be required to get the kinocilium to the lateral position. This process appears to proceed normally in the double mutants but the kinocilium reverted to the medial default position over time, which suggests that Gnai2 and Gnai3 are only required for the maintenance and not the establishment of the kinocilium in the lateral position. Is this phenotype qualitatively similar in the Ptx cKO?

      We addressed these issues with two types of modifications to the data:

      (1) We modified the eccentricity threshold used at E17.5 in Fig. 8 (orientation) to be more stringent, using 0.4 (instead of 0.25 previously) in both controls and mutants. This means that we now only graph the orientation of cells where eccentricity is more marked. The rationale is that at early stages, it is challenging to distinguish immature vs defective near-symmetrical cells. We kept a threshold of 0.25 at P0 when the hair cell apical surface is larger and better differentiated (Fig. 8C-D). Importantly, the dataset remains rigorously identical. This change usefully highlights that a large proportion of OHC1 is in fact inverted (oriented medially) at E17.5 in Gnai2; Gnai3 double mutants at the cochlear mid, as also seen in the ptxA model at the same stage and position (see new Fig. 8A). At the E17.5 base (Fig. 8B), a slightly more mature position, the outcome is unchanged (the majority of OHC1 are inverted using either a 0.25 or 0.4 threshold in double mutants and in ptxA).

      Interestingly however, the orientation trend is unchanged for OHC2: OHC2 remain oriented largely laterally (i.e. normally) at the E17.5 mid and base in Gnai2; Gnai3 double mutants even with a raised eccentricity threshold, whereas by contrast OHC2 in ptxA are inverted at these stages and positions. In the double mutant, OHC2 only become inverted at the P0 base (Fig. 8D). This suggests that there are similarities (OHC1) but also differences (OHC2) between the two mouse models, and that double mutants show a delay in adopting an inverted orientation compared to ptxA. Of note, OHC2 have been shown to differentiate later than OHC1 (for example, Anniko 1983 PMID:6869851).

      (2) To directly test the idea that the misorientation phenotype (inverted OHC1-2) is comparable between the two models but delayed in Gnai2; Gnai3 mutants, we performed a new experiment and added new results in the manuscript. We delayed ptxA action by using Atoh1-Cre (postmitotic hair cells) instead of FoxG1-Cre (otic progenitors). Remarkably, this produced a pattern of OHC1-2 misorientation more similar to Gnai2; Gnai3 mutants: at the E17.5 base and P0 apex, OHC2 were still largely oriented laterally (normally) in Atoh1-Cre; ptxA as in Gnai2; Gnai3 mutants whereas at the P0 base a large proportion of OHC2 were inverted (Fig. 8 Supp 1B). OHC1 were inverted at all stages and positions in the Atoh1-Cre as in the FoxG1-Cre; ptxA model. For Atoh1-Cre; ptxA, we only illustrated OHC1 and OHC2 and did not add E17.5 mid or P0 mid results because other cell types and stage/positions did not provide additional insight. In addition, we are well aware that the full FoxG1-Cre; ptxA and Gnai2; Gnai3 results for 4 cell types (IHC, OHC1-3) and 5 stages/positions are already a lot of data for cell orientation.

      These results suggest that:

      (a) The normal reversal of OHC1-2 to adopt a lateral orientation needs to be maintained, at least transiently, and that maintenance also relies on GNAI/O (Results starting line 529. Discussion line 621).

      (b) ptxA is more severe than Gnai2; Gnai3 when it comes to OHC1-2 orientation (Figure 9, role b). Conversely, Gnai2; Gnai3 is obviously more severe when it comes to symmetry-breaking (Fig. 9, role a) and hair bundle morphogenesis (Fig. 9, c). It follows that the two early GNAI/O activities are qualitatively different and not just based on dose. This is essentially what this Reviewer correctly pointed out, and we have fully edited both Results and Discussion accordingly. We now speculate that the difference may lie in the identity of the necessary GNAI/O protein for each role. Any GNAI/O proteins acting as a switch downstream of the GPR156 receptor may relay orientation information (Fig. 9, role b), making ptxA a particularly effective disruption strategy since it downregulates all GNAI/O proteins. In contrast, symmetry-breaking may rely more specifically on GNAI2 and GNAI3, and ptxA is not expected to achieve a loss-of-function of GNAI2 and GNAI3 as extensive as a double targeted genetic inactivation of the corresponding genes. Please see new Results starting line 526 and Discussion starting line 603. We consequently abandoned the notion that increased doses of GNAI/O are required for each role, and we also clarify that symmetry-breaking (a) and orientation (b) occur at the same time (Fig. 9).

      (2) P0 may not be late enough a stage to assess phenotype maturity in the double mutants. For example, it is not clear from the basal P0 results whether the IHC will acquire an inverted phenotype or just misorientation in the lateral side.

      For context, the OHC1-2 misorientation pattern in the ptxA model at P0 does represent the end stage, as the same pattern is observed in adults (illustrated in Fig. 2A). In addition, OHC1-2 that express ptxA are inverted as soon as they break planar symmetry, and this was established at E16.5 in a previous publication where ptxA and Gpr156 misorientation patterns were compared and shown to be identical (Kindt et al., 2021 Supp. fig. 5C-D). However, we clearly failed to mention these important results in the original manuscript. We now cite Figure 2 for adult defects (line 522), and provide a citation for OHC1-2 inversion being observed from the earliest stage of hair cell differentiation (Kindt et al., 2021) (line 519).

      The vast majority of Gnai2; Gnai3 double mutants die before weaning but the single specimen we managed to collect at P21 also showed inverted OHC1-2 (representative example in Fig. 2A). Again, we previously failed to point out this important result. We now do so on lines 214 and 555. This is further evidence that OHC1-2 misorientation is in fact similar in the ptxA and Gnai2; Gnai3 models (but milder and delayed in the latter).

      When it comes to IHCs and OHC3s however, the situation is less clear. These cell types are mildly misoriented in ptxA and Gpr156 mutants, but IHCs in particular appear severely misoriented in Gnai2; Gnai3 mutants based on the position of the basal body (Fig. 8). However, very dysmorphic hair bundles can pull on the basal body via the kinocilium and affect its position, which obscures hair cell orientation inferred from the basal body and subsequent interpretations. We do not dwell on IHC and OHC3 and their orientation in Gnai2; Gnai3 mutants in the revision since we do not observe similar orientation defects in a different mouse model and lack sufficient adult data.

      Suggestions to improve upon the manuscript for readers:

      (1) Line 294, indicate on the figure the staining in bare zone and tips of stereocilia on row 1.

      Pertains to Figure 4. In A, we now point out the bare zone and stereocilia tips with arrows and arrowheads, respectively (as in other figures).

      (2) Fig. 8 schematic diagram, the labels of the line and 90° side by side is misleading.

      We added black ticks for 0, 90, 180, 270 degree references. In contrast, the hair cell angle represented was switched to magenta.

      (3) Fig. 7 legend, redundancy towards the end of the paragraph.

      Thank you for catching this issue. A large portion of the legend was indeed accidentally repeated and is now deleted.

      (4) Line 490-493, Another plausible explanation is that other factors besides Gnai2 and Gnai3 are involved in breaking symmetry during bundle establishment.

      We now acknowledge that other proteins besides GNAI/O may be involved (Discussion line 614). That said, the notion that we do not achieve sufficient and/or early enough GNAI loss is supported for example by the Beer-Hammer 2018 study where no defects in symmetry-breaking or orientation were reported in their Gnai2 flox/flox; Gnai3 flox/flox model (Discussion new Line 637).

      (5) Line 518, the base were largely inverted (Figure 8B). Should Fig 8A be cited instead of 8B?

      Fig. 8B has graphs for the E17.5 cochlear base where OHC1-2 are inverted in both ptxA and Gnai2;3 DKO models. Fig. 8A has graphs of the E17.5 cochlear mid (less differentiated hair cells) where an inversion was not obvious previously, but is now clear although only partial in Gnai2; Gnai3 DKO (see above; raised eccentricity threshold). In the context of the previous text, this citation was thus correct. However, this section has been heavily modified to better compare Gnai2; Gnai3 DKO and ptxA and is hopefully less confusing in the revised version.

      Reviewer #2 (Public Review):

      Jarysta and colleagues set out to define how similar GNAI/O family members contribute to the shape and orientation of stereocilia bundles on auditory hair cells. Previous work demonstrated that loss of particular GNAI proteins, or inhibition of GNAIs by pertussis toxin, caused several defects in hair bundle morphogenesis, but open questions remained which the authors sought to address. Some of these questions include whether all phenotypes resulting from expression of pertussis toxin stemmed from GNAI inhibition; which GNAI family members are most critical for directing bundle development; whether GNAI proteins are needed for basal body movements that contribute to bundle patterning. These questions are important for understanding how tissue is patterned in response to planar cell polarity cues.

      To address questions related to the GNAI family in auditory hair cell development, the authors assembled an impressive and nearly comprehensive collection of mouse models. This approach allowed for each Gnai and Gnao gene to be knocked out individually or in combination with each other. Notably, a new floxed allele was generated for Gnai3 because loss of this gene in combination with Gnai2 deletion was known to be embryonic lethal. Besides these lines, a new knockin mouse was made to conditionally express untagged pertussis toxin following cre induction from a strong promoter. The breadth and complexity involved in generating and collecting these strains makes this study unique, and likely the authoritative last word on which GNAI proteins are needed for which aspect of auditory hair bundle development.

      Appropriate methods were employed by the authors to characterize auditory hair bundle morphology in each mouse line. Conclusions were carefully drawn from the data and largely based on excellent quantitative analysis. The main conclusions are that GNAI3 has the largest effect on hair bundle development. GNAI2 can compensate for GNAI3 loss in early development but incompletely in late development. The Gnai2 Gnai3 double mutant recapitulates nearly all the phenotypic effects associated with pertussis toxin expression and also reveals a role for GNAIs in early movement of the basal body. Although these results are not entirely unexpected based on earlier reports, the current results both uncover new functions and put putative functions on more solid ground.

      Based on this study, loss of GNAI1 and GNAO show a slight shortening of the tallest row of stereocilia but no other significant changes to bundle shape. Antibody staining shows no change in GNAI localization in the Gnai1 knockout, suggesting that little to no protein is found in hair cells. One caveat to this interpretation is that the antibody, while proposed to cross-react with GNAI1, is not clearly shown to immunolabel GNAI1. More than anything, this reservation mostly serves to illustrate how challenging it is to nail down every last detail. In turn, the comprehensive nature of the current study seems all the more impressive.

      (1) The original manuscript quantified stereocilia properties in Gnai1 and Gnai2 single mutants, and in Gnai1; Gnai2 double mutants using non-parametric tests (Mann-Whitney) for comparisons. This approach indeed suggested subtle reduction in row 1 height in IHCs in all 3 mutants. We did not quantify stereocilia features in Gnao1 mutants but could not observe defects (new Fig. 2 Supp. 1E-F). In fact, we could not observe defects in Gnai1 and Gnai2 single mutants, and in Gnai1; Gnai2 double mutants either. For this reason we have been ambivalent about reporting defects for Gnai1 and Gnai2 single and Gnai1; Gnai2 double mutants.

      In the revision, we applied a nested (hierarchical) t-test to avoid pseudo-replication (Eisner 2021; PMID: 33464305; https://pubmed.ncbi.nlm.nih.gov/33464305/). In our data, the nested t-tests structure measurements by animal instead of having all stereocilia or other cell measurements treated as independent values. This more stringent approach no longer finds row 1 height reduction significant in single Gnai1 or Gnai2 mutants, or in Gnai1; Gnai2 double mutants. We modified the text accordingly in Results and Discussion. Nested t-tests were applied uniformly across the manuscript and, besides IHC measurements in Fig. 2, now also apply to bare zone surface area in Fig. 6 and eccentricity in Fig. 7. For these experiments, in contrast, previous conclusions are not changed. We think that this more careful statistical treatment is a closer representation of the data in terms of the conclusions we can safely make.
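
      As a rough illustration of why structuring measurements by animal is more stringent, the sketch below compares a test on pooled cell-level values with a test on per-animal means. It is a simplified stand-in under stated assumptions, not the nested t-test used in the revision: the helper names (`welch_t`, `animal_means`, `pooled`) and all measurement values are hypothetical.

      ```python
      from math import sqrt
      from statistics import mean, variance


      def welch_t(a, b):
          """Welch t statistic and approximate degrees of freedom for two samples."""
          va, vb = variance(a) / len(a), variance(b) / len(b)
          t = (mean(a) - mean(b)) / sqrt(va + vb)
          df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
          return t, df


      def animal_means(group):
          """Collapse each animal's cell measurements to a single mean."""
          return [mean(cells) for cells in group.values()]


      def pooled(group):
          """Treat every cell as independent (the pseudo-replicated view)."""
          return [x for cells in group.values() for x in cells]


      # Hypothetical row 1 heights: animal id -> per-cell measurements
      control = {"c1": [4.9, 5.1, 5.0], "c2": [5.3, 5.2, 5.4], "c3": [4.7, 4.8, 4.6]}
      mutant = {"m1": [4.8, 4.9, 4.7], "m2": [5.0, 5.1, 5.2], "m3": [4.5, 4.4, 4.6]}

      t_naive, df_naive = welch_t(pooled(control), pooled(mutant))
      t_animal, df_animal = welch_t(animal_means(control), animal_means(mutant))
      ```

      With these made-up values, the animal-level comparison has about 4 degrees of freedom instead of about 16 and a smaller t statistic, which is the direction in which an apparent cell-level difference can lose significance once animals are treated as the unit of replication.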

      (2) The reviewer's criticism about antibody specificity is accurate and fair, and is fully addressed in the revised manuscript. We provide a phylogeny cartoon as Figure 1A to compare the GNAI/O proteins and highlight how closely related they are in sequence. To validate the assumption that our approach would detect GNAI1 if it were present in hair cells, we took a new dual experimental approach in the revision. First, we electroporated Gnai1, Gnai2 and Gnai3 expression constructs in the E13.5 inner ear and tested whether the two GNAI antibodies used in the study can detect ectopic GNAI1 in Kölliker's organ. This revealed that “ptGNAI2” detects GNAI1 very well (in addition to GNAI2), but that “scbtGNAI3” does not detect GNAI1 efficiently (although it does detect GNAI3 very well). Second, to verify in vivo that “ptGNAI2” can detect endogenous GNAI1, we immunolabeled the gallbladder epithelium in Gnai1 mutants and littermate controls using the “ptGNAI2” antibody. Based on IMPC consortium data* about the Gnai1 LacZ mouse strain, Gnai1 is specifically expressed in the adult gallbladder. We could verify that signals detected in the Gnai1 mutants were visually reduced in comparison to littermate controls. We have now added this validation step in Results line 309 and the data in Fig. 4 Supp. 1A-B.


      Reviewer #2 (Recommendations For The Authors):

      Minor comments that may marginally improve clarity.

      Abstract line 24: delete "nor polarized" because polarization cannot be assessed since the protein is undetectable.

      This is a fair point, now deleted.

      Consider revising: Lines 80-82; 188-202 (the order in which the mutants were presented was hard to follow for me); 239-240.

      Lines 80-82: Used to read as "Ptx recapitulates severe stereocilia stunting and immature-looking hair bundles observed when GPSM2 or both GNAI2 and GNAI3 are inactivated."

      Line 88: Was now changed to "Ptx provokes immature-looking hair bundles with severely stunted stereocilia, mimicking defects in Gpsm2 mutants and Gnai2; Gnai3 double mutants".

      Lines 188-202: This was the first paragraph describing adult stereocilia defects in the different Gnai/o mouse strains. We completely rewrote the entire section to reflect the order in which the strains appear in Figure 2, hopefully making the text easier to follow because it better matches panels in Fig. 2. We also made several other modifications to streamline comparisons and better introduce the orientation defects that are later detailed at neonate stages.

      Lines 239-240: Used to read "GNAI2 makes a clear contribution since stereocilia defects increase in severity when GNAI loss extends from GNAI3 to both GNAI2 and GNAI3".

      Line 247: Was now changed to "GNAI2 makes a clear contribution since Gnai3neo stereocilia defects dramatically increase in severity when GNAI2 is absent as well in Gnai2; Gnai3 double mutants."

      Line 164: hardwired is unclear. Conserved?

      We modified this sentence as follows: Line 171: "We reasoned that apical HC development is probably highly constrained and less likely to be influenced by genetic heterogeneity compared to susceptibility to disease, for example."

      Line 299: It is not clear why GNAI1 is a better target than GNAI3. This phrase is repeated in line 303, I suspect inadvertently. Is there evidence that this antibody detects GNAI1, perhaps in another tissue? Line 308: GNAI1 may also not be detected by this antibody.

      Please see point 2 above. We removed these hypothetical statements entirely and we instead now experimentally show that one of the two commercial antibodies used can readily detect GNAI1 (yet does not detect signal in hair cells when GNAI2 and GNAI3 are absent in Fig. 4F).

    2. eLife assessment

      This study examines an important aspect of the development of the auditory system, the role of guanine nucleotide-binding protein subunits, GNAIs, in stereociliary bundle formation and orientation, by examining bundle phenotypes in multiple compound GNAI mutants. The experiments are highly rigorous and thorough and include detailed quantifications of bundle morphologies and changes. The depth and care of the study are impressive, with convincing results regarding the roles of GNAIs in stereociliary bundle development. Further, the reviewers believe this to be the definitive study of the role of GNAIs in bundle orientation and development.

    3. Reviewer #1 (Public Review):

      A subclass of inhibitory heterotrimeric guanine nucleotide-binding protein subunits, GNAI, has been implicated in sensory hair cell formation, namely the establishment of hair bundle (stereocilia) orientation and staircase formation. However, the former role of hair bundle orientation has only been demonstrated in mutants expressing pertussis toxin, which blocks all GNAI subunits, but not in mutants with a single knockout of any of the Gnai genes, suggesting that there is a redundancy among various GNAI proteins in this role. Using various conditional mutants, the authors concluded that GNAI3 is the primary GNAI protein required for hair bundle morphogenesis, whereas hair bundle orientation requires both GNAI2 and GNAI3.


      Various compound mutants were generated to decipher the contribution of individual GNAI1, GNAI2, GNAI3 and GNAIO in the establishment of hair bundle orientation and morphogenesis. The study is thorough with detailed quantification of hair bundle orientation and morphogenesis, as well as auditory functions.

      The revised manuscript has clarified the phenotypic differences raised between the Gnai2/3 double mutants and Ptx mutant phenotypes and resolved the weakness pointed out in the previous submission. These results further illustrate the dynamic requirement of Gnai/O in hair bundle establishment and are an important contribution to the field.

    4. Reviewer #2 (Public Review):

      Jarysta and colleagues set out to define how similar GNAI/O family members contribute to the shape and orientation of stereocilia bundles on auditory hair cells. Previous work demonstrated that loss of particular GNAI proteins, or inhibition of GNAIs by pertussis toxin, caused several defects in hair bundle morphogenesis, but open questions remained which the authors sought to address. Some of these questions include whether all phenotypes resulting from expression of pertussis toxin stemmed from GNAI inhibition; which GNAI family members are most critical for directing bundle development; whether GNAI proteins are needed for basal body movements that contribute to bundle patterning. These questions are important for understanding how tissue is patterned in response to planar cell polarity cues.

      To address questions related to the GNAI family in auditory hair cell development, the authors assembled an impressive and nearly comprehensive collection of mouse models. This approach allowed for each Gnai and Gnao gene to be knocked out individually or in combination with each other. Notably, a new floxed allele was generated for Gnai3 because loss of this gene in combination with Gnai2 deletion was known to be embryonic lethal. Besides these lines, a new knockin mouse was made to conditionally express untagged pertussis toxin following cre induction from a strong promoter. The breadth and complexity involved in generating and collecting these strains makes this study unique, and likely the authoritative last word on which GNAI proteins are needed for which aspect of auditory hair bundle development.

      Appropriate methods were employed by the authors to characterize auditory hair bundle morphology in each mouse line. Conclusions were carefully drawn from the data and largely based on excellent quantitative analysis. The main conclusions are that GNAI3 has the largest effect on hair bundle development. GNAI2 can compensate for GNAI3 loss in early development but incompletely in late development. The Gnai2 Gnai3 double mutant recapitulates nearly all the phenotypic effects associated with pertussis toxin expression and also reveals a role for GNAIs in early movement of the basal body. This comprehensive study builds on earlier reports, both uncovering new functions and putting previously putative functions on solid ground.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Public Review):

      Major Weaknesses:

      The assertion that MOCAT can be rapidly applied in hospital pathology departments seems overstated due to the limited availability of light-sheet microscopes outside research labs. In the first rebuttal letter, authors explain the limitations of other microscopes more readily available in hospitals. This explanation relies on your own investigations and practical experience on the matter, so including them in some part of the manuscript would be beneficial.

      We appreciate the reviewer's comments and have added a discussion on the limitations of microscopes that are more readily available in hospitals in our text:

      Revised manuscript, line 305-316:

      “3.3 Microscopy options for imaging centimeter-sized specimens

      Optical sectioning techniques are crucial for obtaining high-quality volumetric images. Techniques such as confocal microscopy, multi-photon microscopy, and light-sheet microscopy filter out-of-focus signals, resulting in sharp images of individual planes. In our study, we used light-sheet microscopy and multi-point confocal (i.e., spinning disc) for imaging centimeter-sized specimens because of their scanning speeds. While two-photon and confocal microscopy offer high-resolution imaging of smaller volumes, they are not ideal for scanning entire tissues because of their prolonged scanning times.

      Non-optical sectioning wide-field fluorescence microscopes, like the Olympus BX series or ZEISS Axio imager series, can also be used to scan samples up to about 3.5mm thick with long working distance objective lenses. In these cases, deconvolution algorithms are required to eliminate out-of-focus signals. However, it should be noted that the epifluorescence system might reduce fluorescent intensity in deeper regions within the samples.”

      Refractive index matching is a critical point in the protocol, the one providing final transparency. Authors utilized the commercial solutions NFC1 and NFC2 (Nebulum, Taiwan) with a known refractive index, but whose composition is not disclosed. My knowledge on the organic chemistry around refractive index matching is limited, but if users don't really know what is going on in this final step, the whole protocol would rely on a single world-wide provider and troubleshooting would be guesswork. I suggest that you try to validate the approach with solutions of known composition, or at least provide the solutions sold by other providers.

      We appreciate the reviewer's suggestions. Based on our experience, the CUBIC-R solution developed by Ueda's team also serves as an effective RI-matching solution in the MOCAT pipeline. Its only drawback is the potential reddening of the specimen, likely due to the light-responsive component, antipyrine. We have now added this information to the Methods section:

      Revised manuscript, line 492-496:

      “Refractive index (RI) matching. Before imaging, the specimens were RI-matched by being immersed in NFC1 (RI = 1.47) and NFC2 (RI = 1.52) solutions (Nebulum, Taipei, Taiwan). Each immersion lasted for one day at room temperature. Alternatively, RI-matching can also be accomplished by immersing specimens in a 1:1 dilution of CUBIC-R[28] for one day, followed by pure CUBIC-R for an additional day.”

      Reviewer #2 (Recommendations For The Authors):

      A comment on the name of the protocol, MOCAT. I am sorry to bring this now, and not before. But, I strongly recommend another name for the procedure. My concern is that the present name "MOCAT" refers to the problem, and NOT to the actual solution provided by you. See, the problem to solve is: to perform Multiplex labeling Of Centimeter-sized Archived Tissue (MOCAT), but it says nothing about HOW you did it: heat-induced antigen retrieval and Tween20-delipidation for centimeter-scale FFPE specimens. In summary, I strongly recommend that the acronym of the procedure refers more to the "solution" than to the "problem", and for me this is important because otherwise the acronym is not fair with present and future techniques pretending to provide a novel solution to the same problem. Another way to put it is that researchers can own their proposed solutions, but they do not own the problem to be solved.

      We appreciate the reviewer's suggestions. In response to their concerns, we have renamed the procedure presented in this study as Heat-Induced FFPE-based Tissue Clearing, with the acronym HIF-Clear. This change reflects the critical step in our procedure. Corresponding updates have also been made in the manuscript.

    2. eLife assessment

      The reprocessing and reanalysis of archived samples can yield further insights from past experiments. Here, a useful procedure to perform tissue clearing and immunolabeling on large-scale formalin-fixed paraffin-embedded brain specimens is convincingly evaluated on a set of archival pathology specimens, and its applicability to further such samples is analyzed. This method will be of interest to both neuroscientists and pathologists.

    3. Reviewer #1 (Public Review):

      In this study, Lin et al. developed a protocol, termed HIF-Clear, to perform tissue clearing and labelling on large-scale FFPE mouse brain specimens. They have optimized protocols for dewaxing and adequate delipidation of FFPE tissues to enable deep immunolabelling, even for whole mouse brains. This was useful for the study of disease models such as in an astrocytoma model to evaluate spatial architecture of the tumour and its surrounding microenvironment. It was also used in a traumatic brain injury model to quantify changes in vasculature density and differences in monoaminergic innervation. They have also demonstrated the potential of multi-round immunolabelling using photobleaching, as well as expansion microscopy with FFPE samples using HIF-Clear.

      Comments on revised version:

      The revised manuscript by Lin et al is much improved with a more detailed methods description. There are only a few minor comments for the authors that are still valid:

      - Some procedures, including the basic HIF-Clear protocol, seem to produce marked tissue expansion that is not mentioned in the manuscript. Users should take this fact into consideration when making measurements.

      - The authors have provided a comparison between mouse and human brain samples in Figure S12. However, it is misleading to mention that the "fluorescent signals are comparable at varying depth" as the figure clearly showed a lack of continuous staining, especially for SMI312 at 900 µm depth, and human brain tissue showed considerably increased background signal (likely due to endogenous lipofuscin, which has autofluorescent properties). Also, this is difficult to assess in the present design of the experiment because, at different depths, the tissue and the antigen themselves may change, making it difficult to make a direct staining comparison with other depths.

    4. Reviewer #2 (Public Review):

      The manuscript details an investigation aimed at developing a protocol to render centimeter-scale formalin-fixed paraffin-embedded specimens optically transparent and suitable for deep immunolabeling. The authors evaluate various detergents and conditions for epitope retrieval such as acidic or basic buffers combined with high temperatures in entire mouse brains that had been paraffin-embedded for months. They use various protein targets to test active immunolabeling and light-sheet microscopy registration of such preparations to validate their protocol. The final procedure, called the MOCAT pipeline, briefly involves 1% Tween 20 in citrate buffer, heated in a pressure cooker at 121 °C for 10 minutes. The authors also note that part of the delipidation is achieved by the regular procedure.

      Major Strengths

      - The simplicity and ease of implementation of the proposed procedure using common laboratory reagents distinguish it favorably from more complex methods.

      - Direct comparisons with existing protocols and exploration of alternative conditions enhance the robustness and practicality of the methodology.

      Final considerations

      The evidence presented supports the effectiveness of the proposed method in rendering thick FFPE samples transparent and facilitating repeated rounds of immunolabeling.

      The developed procedure holds promise for advancing tissue and 3D-specific determination of proteins of interest in various settings, including hospitals, basic research, and clinical labs, particularly benefiting neuroscience research.

      The methodological findings suggest that MOCAT could have broader applications beyond FFPE samples, differentiating it from other tissue-clearing approaches in that the equipment and chemicals needed are broadly accessible.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      This manuscript aims to understand the biological mechanisms underlying neuropsychiatric symptoms in Parkinson's disease by characterizing subtypes of neurons in the dorsal raphe nucleus and defining their susceptibility to the degeneration of dopaminergic and adrenergic systems in the brain. This study was well-designed, the results were presented beautifully, and the manuscript was well-written. Here are some comments that may help to improve the overall quality of this work.

      We thank the reviewer for the kind comments.

      Major concerns:

      The current study utilized an intrastriatal 6-OHDA injection, which raises the possibility that the observed electrophysiological and morphological changes of DRN5-HT and DRNDA neurons (Figs 3-6) may be due to the direct effects of 6-OHDA to DRN5-HT and DRNDA neurons projecting to the dorsal striatum (at least for DRN5-HT neurons). This possibility requires further clarification and discussion.

      6-OHDA is a catecholamine neurotoxin with low selectivity for serotonin neurons. However, changes in the levels of serotonin have been observed with high doses of 6-OHDA. In our study, we used lower concentrations of 6-OHDA, which did not affect the levels of serotonin (Suppl. Fig. 4D), or the number of DRN5-HT neurons (Suppl. Fig. 5B). Concerning the possible effect of 6-OHDA on DRNDA neurons, we did not observe any modification in the number of these cells in response to the administration of 6-OHDA (Suppl. Fig. 5C; lines 170-175).

      How does the loss of nigrostriatal dopamine neurons affect the electrophysiology and morphology of DRNDA neurons (Figs. 5-6)? What are the potential circuit mechanisms?

      The dopaminergic system in the midbrain and the DRN constitute two highly interconnected nuclei and hence there are multiple possible circuit mechanisms that could explain how loss of nigrostriatal dopaminergic neurons affects DRNDA neurons. First, DRNDA neurons are directly innervated by dopaminergic neurons in the SNc and VTA and hence loss of SNc inputs might evoke acute as well as homeostatic changes in DRNDA (Lin et al., 2020; Pinto et al., 2019). Second, midbrain dopaminergic neurons are in turn innervated by the DRN (Watabe-Uchida et al., 2012) and loss of postsynaptic dopaminergic neurons might affect all neuron types in the DRN that target the midbrain. Third, GABAergic populations in the midbrain have been shown to target DRN5-HT neurons and might potentially also target other local cell types such as DRNDA (Li et al., 2019). Another possible pathway is the bidirectional connection between the striatum and the DRN (Pollak-Dorocic et al., 2014). DA depletion in the striatum may affect the GABAergic projection to the DRN and in turn modify the properties of postsynaptic DRN neurons.

      The potential circuit mechanisms are now included in the introduction (lines 58-59).

      Whether these intrastriatal 6-OHDA mice exhibited nonmotor deficits (e.g., anxiety) that may be related to the observed changes in the DRN? Such behavioral data would enhance the overall conclusions of this work.

      The PD model utilized in this study displays non-motor deficits, including depression- and anxiety-like behavior (Masini et al. 2021, Ztaou et al., 2018). This is now highlighted in the manuscript (lines 167-169).

      Minor issues:

      The panels of Fig. 2 should be re-labelled to match the descriptions in the main text (L. 142-158).

      Fig.2 now matches the descriptions in the main text.

      Fig. 4D was missing from the figure, which does not match the descriptions in the main text (L. 193-204).

      Fig. 4D includes the parameters describing the dendritic branching and starts with the last graph on the right in the second row of the panel.

      Line 409: Extra "as" after "average"

      Corrected in revised manuscript.

      Fig 3G: Missed asterisks.

      Corrected in revised manuscript (Fig. 3G).

      Details of how action potential parameters were quantified should be stated and specified in the methods.

      We have now added a section called ‘Quantification of electrophysiological parameters’ in the methods where we explain how the electrophysiological properties are defined and quantified (lines 407-439).

      "Parkinson's disease" in the title should be revised to "parkinsonism"

      Corrected in revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      (1) Throughout the paper, there are numerous inaccuracies and inconsistencies in the figures, which impede the clear understanding of this paper. For example, there are discrepancies between the labeling of the main figures (sub-panels) and the corresponding manuscript (Figure 2, Figure 4).

      Corrected in the revised manuscript.

      The statistical presentations are inaccurate in several figures (Figure 3E, 3G), making it difficult to distinguish which data is statistically meaningful. Furthermore, the number of cells presented in each figure is ambiguous in the figure legend. It would be better to avoid expressions such as 'n = 28 - 43 cells per group', as in line 456 (Figure 1I). Please provide the exact number of cells for each graph.

      We agree with the reviewer, and we have now added the precise n numbers for each panel in the corresponding legends in Fig 1, Fig 3, and Fig 5. Please note that some analysis was restricted to recordings where neurons fired close to their average spontaneous firing frequency (e.g. 1Hz for DRN5-HT) to allow for a fair comparison of the data across groups and that therefore the n numbers vary in different panels.

      In some figures, the value of n in the graph seems different from the value of n in the figure legends (Figure 2G-I, Figure 4, Figure 6). Collectively, these inaccurate figures and the manuscript weaken the general credibility of the data presented.

      We apologize for the misunderstanding, but in the chosen type of graph, data points with equal values overlap. The numbers described in the figure legend are correct.

      (2) Some of the authors' claims in this paper are not supported by quantitative analysis, but only by sample recording traces or simple descriptions. For example, in line 97, the authors mentioned, "no differences when comparing TH-positive to TH-negative neurons". But there are no data actually analyzing these two groups in Supplementary Figure 2A.

      In addition, in line 103, there is a claim that "DRN DA neurons showed that they share several properties characteristics of other DA populations located in the SNc and the ventral tegmental area". However, this claim is backed up only by a few sample traces in Figure 1E.

      The statement (lines 110-111), "a relative constant action potential (AP) amplitude", is also not supported by appropriate quantitative analysis but only by sample recording traces.

      In our study we found a small subset of DAT-tdTomato positive neurons which did not stain positive for TH after the slice recordings. In 5 of 6 of these neurons (recorded in sham), the electrophysiological properties did not differ from other TH-positive neurons. This is visualized in Suppl. Fig. 2A. The absence of any statistical difference was also confirmed by a Mann-Whitney U test comparing the TH-negative to the TH-positive DRNDA neurons (no significant differences in all 6 of 6 properties shown in Suppl. Fig. 2A). Additionally, all these cells were DAT-positive, further supporting their classification as dopaminergic neurons. Therefore, we suspect that the lack of TH staining is likely caused by the tissue processing itself. Please note that all our immunohistochemistry was run on slices after several hours of patch-clamping procedures. Finally, including or excluding this small subset of neurons in the present study does not change any of the results presented and data were therefore pooled. We have now clarified this in more detail in the results section and in Suppl. Fig. 2A (lines 100-103).
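
      For readers unfamiliar with the rank-based comparison mentioned above, the following is a minimal sketch of how a Mann-Whitney U statistic is computed for two small groups. The function name `mann_whitney_u` and the example values are hypothetical and are not taken from the recordings; p-values would normally come from a statistics package.

      ```python
      def mann_whitney_u(x, y):
          """Return (U_x, U_y) for two samples, using midranks for tied values."""
          # Tag each value with its group (0 = x, 1 = y) and sort by value.
          combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
          ranks = [0.0] * len(combined)
          i = 0
          while i < len(combined):
              # Find the run of tied values starting at position i.
              j = i
              while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
                  j += 1
              midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
              for k in range(i, j + 1):
                  ranks[k] = midrank
              i = j + 1
          rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
          u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
          return u_x, len(x) * len(y) - u_x


      # Hypothetical values of one property in two groups of neurons
      group_a = [1.0, 2.0]
      group_b = [2.0, 3.0]
      u_a, u_b = mann_whitney_u(group_a, group_b)
      ```

      A U value close to half of the product of the two group sizes indicates heavily overlapping distributions, whereas a value near zero indicates that one group ranks almost entirely below the other.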

      We have moved the comparison of hallmark properties found in DRNDA neurons as well as in dopaminergic neurons in the midbrain from the results section to the discussion (lines 281-283).

      The claim that DRN5HT neurons have a comparatively constant action potential amplitude compared to DRNDA neurons is supported by quantitative analysis shown in Fig 1I (left panel, “AP drop rate”), while the representative example traces are shown in Fig 1G.

      (3) In the legend of Figure 2, the mouse used in this experiment is mentioned with two different names (wild-type mice in line 463 and sham-lesion mice in line 465). Is this a mistake? Or did the authors intentionally use the brain samples from sham-lesion mice for Figure 2?

      Figure 2 shows data in control conditions (Sham-lesion in our case), both from wild-type and Dat-Tomato. The text has been changed to avoid misunderstandings.

      (4) While the primary claim of this paper is the differential alterations of DRN 5-HT and DA neurons in a mouse PD model, the observed changes in the DRN neurons of the 'DA only lesion model' are minor compared with those of the 'DA and NA lesions model'. Therefore, it looks like NA depletion has a more critical role in the DRN neurons of 6-OHDA-lesion mice than DA depletion. To understand the results of this paper better, it would be great if the authors could provide additional data from the 'NA only lesion model'.

      We agree with the reviewer, and we have now added a new set of experiments in which we selectively lesioned noradrenergic cells by injecting 6-OHDA unilaterally into the LC. The new data are presented in supplementary figure 6 in the revised manuscript. We find that selective lesioning of the NA system affects DRNDA and DRN5-HT neurons mildly, suggesting that the concomitant lesion of the DA and NA systems is particularly impactful (possibly because of interactions between these two systems).

      (5) In Figure 3B and Figure 5B, only the 6-OHDA+DMI group shows significant differences from the sham group. This finding might be attributed to the effect of DMI itself, not to the nigrostriatal DA degeneration without NA degeneration. Thus, adding the 'DMI-only group' in all experiments will strengthen the conclusion of this paper.

      The effect of one acute administration of desipramine was temporally limited to the stereotactic intervention (lines 373-375), which was performed several weeks before the electrophysiological and morphological analyses. Given that the half-life of desipramine is approximately 24 h (Nagy and Johansson, 1975), we believe that its impact was limited to the neuroprotection of NA neurons from 6-OHDA toxicity.
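
      The order of magnitude behind this argument can be checked with a short calculation. The sketch below assumes simple first-order (exponential) elimination, the approximately 24 h half-life cited above, and a 21-day interval between injection and recording (the three-week gap described in the methods). The one-compartment model and the function name are illustrative assumptions, not part of the authors' analysis.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of a single dose left after first-order elimination."""
    return 0.5 ** (elapsed_hours / half_life_hours)


half_life_h = 24.0     # approximate half-life cited in the response
elapsed_h = 21 * 24.0  # assumed 3-week gap between DMI injection and recording

remaining = fraction_remaining(elapsed_h, half_life_h)
# 21 half-lives have elapsed, so the remaining fraction is 2**-21.
print(f"fraction of dose remaining: {remaining:.2e}")
```

      Under these assumptions less than one millionth of the dose would remain at the time of recording, which is consistent with the argument that any direct drug effect was confined to the period around surgery.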

      (6) DRN 5-HT neurons are known to exhibit cellular heterogeneity, and in particular their electrophysiological properties are quite heterogeneous (Bernat Kocsis. 2006; J.V. Schweimer. et al. 2011). Furthermore, 5-HT neurons in the distinct subregions of the DRN display different membrane properties (LaTasha K. Crawford, 2010). Therefore, not all DRN 5-HT neurons can be regarded as electrophysiologically identical. Given that the molecular identity of all recorded cells was confirmed with neurobiotin in this paper, it would be better to show that recorded cells are not biased toward certain subregions of DRN.

      In addition, providing more comprehensive descriptions of the electrophysiological features used in PCA analysis would be beneficial in understanding the electrophysiological profiling of DRN neurons explained in this paper.

      Although several studies have revealed electrophysiological and molecular heterogeneity within the DRN5-HT population, we did not observe any significant differences within the DRN5-HT neurons recorded in this study. We compared the properties of DRN5HT neurons recorded more anterior to those recorded in the posterior DRN as well as neurons found in more ventral locations to those in more dorsal locations (data not shown). We would like to point out that the largest differences within serotonergic neuron populations described by previous studies were often found when comparing those located in the medial raphe nucleus (MRN) to those found in the DRN. Calizo et al., (2011) showed for example significant differences in the input resistance and AHP amplitude between MRN5HT and DRN5HT neurons. These two properties as well as the AP amplitude, AP threshold, AP duration, and tau did however not differ between DRN subregions in their study - and neither in ours. We extended our Suppl. Fig 1 and mapped the location of DRN5HT and DRNDA neurons recorded in sham (Suppl. Fig 1D).

      Overall, we’ve sampled neurons along the anterior-posterior and dorsal-ventral axes of the DRN, while on the medial-lateral axis, recorded DRN neurons were located medially.

      We agree with the reviewer that a comprehensive description of the electrophysiological features was missing in the manuscript, and we have therefore added a new section in the materials and methods where we explain in detail how each parameter was measured and analyzed (‘Quantification of electrophysiological parameters’, lines 407-439). This section also provides detailed information about the five properties underlying the PCA shown in figure 1 (i.e. delay to the first action potential, action potential drop rate, action potential rise time, duration of the afterhyperpolarization, and capacitance).

      (7) Some sample images presented in this paper contain information that can conflict with the previous research. In Figures 4B and 6B, TH expression was significantly increased in the DMI pretreatment group compared to the control group. However, several studies have shown that the administration of DMI decreases TH expression levels (Komori et al.1992; Nestler et al.1990). Therefore, it would be great if the authors further explained how the pretreatment of DMI with 6-OHDA affects TH level within the DRN.

      Figures 4B and 6B do not show any quantification of TH expression. The difference observed in the representative pictures is incidental and due to the variable expression of TH across the slice. Moreover, as mentioned in the response to point 5, mice were subjected to a single injection of DMI immediately preceding the stereotactic intervention (lines 373-375). In contrast, the decrease in TH expression reported by Komori et al. 1992 and Nestler et al. 1990 was observed in response to chronic (two weeks) administration of DMI.

      (8) This paper lacks direct evidence to demonstrate whether DMI pretreatment could effectively protect against NA depletion. Therefore, in addition to TH expression levels, it is important to provide data to confirm the intact NA levels (or NA axons) after DMI treatment.

      NA levels in the striatum were measured by enzyme-linked immunosorbent assay (ELISA) and are reported in Suppl. Fig. 4 in the revised manuscript.

      (9) It would be great if the authors specifically explained why 6-OHDA was injected into the striatum (neither MFB nor SNc) to make a mouse model of PD.

      Mice were injected in the dorsal striatum to produce a partial bilateral lesion of the dopamine and noradrenaline systems. This model reproduces the initial stages of PD and also recapitulates several non-motor symptoms of PD, including affective disorders, which may be related to changes in serotonergic and dopaminergic transmission in the dorsal raphe. In contrast, injections in the MFB and SNc quickly produce a severe motor phenotype closer to a late stage of the disease and cannot be done bilaterally. The striatal model has been successfully used in other publications (Kravitz et al., 2010, Masini et al., 2021, Ztaou et al., 2018, Chen et al., 2014, Branchi et al., 2008, Marques et al. 2019, Tadaiesky et al., 2008, Matheus et al., 2016, Silva et al., 2016).

      (10) Supplementary Figures 2 and 3 were erroneously cut on the right side. These figure images should be replaced with the correct ones.

      We thank the reviewer for noticing and we have now replaced the figures with the correct ones.

      (11) There should be more explanations about tdTomato-positive but non-TH neurons in Supplementary Figure 2. It is strange to regard TH-negative neurons as DA neurons although these neurons have DA neuron-like electrophysiological properties. If these tdTomato-positive but non-TH neurons cannot release DA, can we say these are DA neurons?

      In our study we found a small subset of DAT-tdTomato positive neurons which did not stain positive for TH afterwards. In 5 of 6 of these neurons (recorded in sham), the electrophysiological properties did not differ from other TH-positive neurons. This is visualized in Suppl. Fig 2A. The absence of any statistical difference was also confirmed by a Mann-Whitney U test comparing the TH-negative to the TH-positive DRNDA neurons (no significant differences in any of the 6 properties shown in Suppl. Fig 2A). Additionally, all these cells were DAT-positive, further supporting their classification as dopaminergic neurons. Therefore, we suspect that the lack of TH staining is likely caused by the tissue processing itself. Please note that all our immunohistochemistry was run on slices after several hours of patch-clamping procedures. Finally, including or excluding this small subset of neurons in the present study does not change any of the results presented and the data were therefore pooled. We have now clarified this in more detail in the results section and in Suppl. Fig 2A (lines 100-103).
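
      For readers unfamiliar with the rank-based comparison mentioned above, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test. It is not the authors' analysis code: the function name and the sample values are hypothetical, and the p-value uses a normal approximation without tie or continuity correction, which is crude for groups as small as six cells (an exact test from a statistics package would be preferable in practice).

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using midranks and a normal approximation."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    values = [v for v, _ in pooled]
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # Extend j over a run of tied values so they share one midrank.
        while j + 1 < len(values) and values[j + 1] == values[i]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, min(p, 1.0)


# Made-up values standing in for one electrophysiological property.
th_negative = [410.0, 385.0, 450.0, 395.0, 430.0, 402.0]
th_positive = [400.0, 420.0, 390.0, 445.0, 415.0, 380.0, 435.0, 405.0]
u_stat, p_value = mann_whitney_u(th_negative, th_positive)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

      A large p-value from such a test indicates only that no difference was detected; with very small groups the test has limited power.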

      Reviewer #3 (Recommendations For The Authors):

      The authors report using a parametric statistical test, the t-test. The t-test makes the assumption that the data are normally distributed. Most biological data is not distributed normally, and with smaller datasets, it is difficult to say whether the underlying distribution would be normally distributed. I would recommend using the non-parametric versions of the same test (eg Mann-Whitney U test), which is likely to give a similar result while being more conservative given the potential for non-normal distribution.

      All electrophysiological data were first tested for normality before running the corresponding statistical test (either t-test for normal distributed data or Mann-Whitney U test for non-normally distributed data). The morphological data are now analyzed by the Mann-Whitney U test (lines 484-494).

      The authors state that mice were treated with 6-OHDA at 3 months, then brain slices were prepared 3 weeks later, making them about 4 months old. I could not find the age of sham/control mice and 6-OHDA/desipramine mice in the methods section. Were sham/controls and 6-OHDA slices prepared in an interleaved fashion?

      Sham and 6-OHDA+DMI mice underwent surgery at 3 months and the brain slices were prepared 3 weeks later, as for the 6-OHDA mice. We have now clarified this in the methods (line 381).

      While desipramine is relatively selective as a norepinephrine reuptake inhibitor, it also can prevent serotonin reuptake. Could this mechanism also protect DRN neurons from the effects of 6-OHDA?

      Although desipramine has some affinity for the serotonin reuptake site, this affinity is 100-fold lower than that described for the noradrenaline reuptake site (Richelson and Pfenning, 1984, Gillman, 2007). Moreover, in our study the 6-OHDA injection in the dorsal striatum did not cause any direct damage to DRN5-HT neurons, as shown by the 5-HT measurement and DRN5-HT counting (Suppl. Fig. 4D, Suppl. Fig. 5A,B), so we can exclude that the effects observed in the DMI+6-OHDA group are related to a protection of the serotonergic system exerted by a single injection of desipramine.

      On line 168, the authors use the abbreviation NA for noradrenergic. Was this abbreviation previously defined in the manuscript?

      Yes, the abbreviation is defined in the introduction (line 73).

      On line 45, the authors cite that the DRN-5HT subpopulation accounts for 30-50% of the DRN neurons. It would be helpful to know approximately what percentage of the DRN neurons belong to the DRNDA subpopulation as well.

      To the best of our knowledge, there is unfortunately no detailed analysis of the prevalence of DRNDA neurons in mice available. Previous studies in rats have estimated that this population comprises around 1000 neurons (Descarries et al., 1986). According to Calizo et al. (2011), the number of any non-serotonergic neuron population (releasing dopamine or other neurotransmitters) in the DRN is one third to one tenth less than the number of DRN5-HT neurons. But please note that this study was also performed in rats (line 55).

      While I appreciate that the authors did not over-interpret their findings, it would be useful to comment (in the Discussion) on how their findings could/should be used in interpreting other studies using 6-OHDA, as well as the relationship of their findings to loss of 5-HT and/or DRN neurons in Parkinson's Disease itself.

      In the manuscript, we refer to the utility of the 6-OHDA model for the study of a wide range of non-motor symptoms. We have now described, in this model, how the loss of midbrain dopaminergic and noradrenergic neurons affects the electrophysiological and morphological properties of DRN5-HT and DRNDA neurons. This information will allow for a more precise assessment of the mechanisms involved in the affective and cognitive aspects of PD symptomatology (lines 354-356).

    2. eLife assessment

      This important work provides a convincing dataset of neuronal heterogeneity in the raphe nucleus, including their physiological properties, morphology, and susceptibility to the neurodegeneration of noradrenaline and dopamine systems in the Parkinsonian state. These findings suggest a significant interplay between catecholaminergic systems in healthy and parkinsonian conditions, as well as neuronal structure and function. Such findings provide a strong foundation for basic scientists as well as pre-clinical researchers interested in the role of dorsal raphe neurons in Parkinson's disease.

    3. Reviewer #1 (Public Review):


      People with Parkinson's disease often experience a variety of nonmotor symptoms, the biological bases of which remain poorly understood. Johansson et al. began to study the potential roles of dorsal raphe nucleus (DRN) degeneration in the pathophysiology of neuropsychiatric symptoms in PD.


      Boi et al validated a transgenic reporter mouse line that can reliably label dopaminergic neurons in the DRN. This brain region shows severe neurodegeneration and has been proposed to contribute to the manifestation of neuropsychiatric symptoms in PD. Using this mouse line (and others), Boi and colleagues characterized electrophysiological and morphological phenotypes of dopaminergic and serotoninergic neurons in the raphe nucleus. This study involved very careful topographical registration of recorded neurons to brain slices for post hoc immunohistochemical validation of cell identity, making it an elegant and thorough piece of work.

      In relevance to PD pathophysiology, the authors evaluated the physiological and morphological changes of DRN serotoninergic and dopaminergic neurons after a partial loss of nigrostriatal dopamine neurons, which serves as a mouse model of early parkinsonian pathology. Moreover, the authors identified a series of physiological and morphological changes of subtypes of DRN neurons that depend on nigral dopaminergic neurodegeneration, LC noradrenergic neurodegeneration, or both. Indeed, this work highlights the importance of LC noradrenergic degeneration in PD pathophysiology.

      Overall, this is a well-designed study with high significance to the Parkinson's research field.

    4. Reviewer #2 (Public Review):

      In this paper, Boi et al. thoroughly classified the electrophysiological and morphological characteristics of serotonergic and dopaminergic neurons in the DRN and examined the alterations of these neurons in the 6-OHDA-induced mouse PD model. Using whole-cell patch clamp recording, they found that 5-HT and dopamine (DA) neurons in the DRN are electrophysiologically well-distinguished from each other. In addition, they characterized distinct morphological features of 5-HT and DA neurons in the DRN. Notably, these specific features of 5-HT and DA neurons in the DRN exhibited different changes in the 6-OHDA-induced PD model. Then the authors utilized desipramine (DMI) to separate the effects of nigrostriatal DA depletion and noradrenaline (NA) depletion, both of which are induced by 6-OHDA. Interestingly, protection from NA depletion by DMI pretreatment reversed the changes in 5-HT neurons, while having a minor impact on the changes in DA neurons in the DRN. These data indicate that the NA lesion plays a more critical role than the DA lesion in the 6-OHDA-induced alterations of DRN 5-HT neurons.

      Overall, this study provides foundational data on the 5-HT and DA neurons in the DRN and their potential involvement in PD symptoms. Given the defects of the DRN in PD, this paper may offer insights into the cellular mechanisms that may underlie non-motor symptoms associated with PD. Despite the importance of the primary claim proposed by the authors, however, the interpretation of the authors on some DMI experiments is not explained well.

    5. Reviewer #3 (Public Review):


      Using ex vivo electrophysiology and morphological analysis, Boi et al. investigate the electrophysiological and morphological properties of serotonergic and dopaminergic subpopulations in the dorsal raphe nucleus (DRN). They performed labor-intensive and rigorous electrophysiology with posthoc immunohistochemistry and neuronal reconstruction to delineate the two major cell classes in the DRN: DRN-DA and DRN-5HT, named according to their primary neurotransmitter machinery. They find that the dopaminergic (DRN-DA) and serotonergic (DRN-5HT) neurons are electrophysiologically and morphologically distinct, and are altered following striatal injection of the toxin 6-OHDA. However, these alterations were largely prevented in DRN-5HT neurons by pre-treatment with desipramine. These findings suggest an important interplay between catecholaminergic systems in healthy and parkinsonian conditions, as well as a relationship between neuronal structure and function.


      • Large, well-validated dataset that will be a resource for others.

      • Complementary electrophysiological and anatomical characterizations.

      • Conclusions are justified by the data.

      • Relevant for basic scientists interested in DRN cell types and physiology.

      • Relevant for those interested in serotonin and/or DRN neurons in Parkinson's Disease.


      Given the scope of the authors' questions and hypotheses, I did not identify any major weaknesses.

    1. Author Response

      We are writing this response letter with regard to the insightful feedback you provided on our manuscript titled "A metabolic modeling-based framework for predicting trophic dependencies in native rhizobiomes of crop plants", submitted for consideration in eLife.

      We sincerely appreciate the thorough and constructive reviews, which recognized the intentions behind our work. We intend to fully address all points raised by the reviewers in our revised manuscript. Specifically, we plan to incorporate targeted revisions to address concerns raised during the review process, with a focus on benchmarking and validating our framework to enhance its reliability and accuracy.

      We believe that the current revision would improve the consistency and quality of the framework, making it a suitable tool for the characterization of microbial trophic interactions in diverse biological landscapes.

      Thank you once again for both your time and dedication in reviewing our manuscript, as well as the constructive review.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      (1) Substantial revision of the claims and interpretation of the results is needed, especially in the setting of additional data showing enhanced erythrophagocytosis with decreased RBC lifespan.

      Thank you for your valuable feedback and suggestion for a substantial revision of the claims and interpretation of our results. We acknowledge the importance of considering additional data that shows enhanced erythrophagocytosis with decreased RBC lifespan. In response, we have revised our manuscript and incorporated additional experimental data to support and clarify our findings.

      (1) In our original manuscript, we reported a decrease in the number of splenic red pulp macrophages (RPMs) and phagocytic erythrocytes after hypobaric hypoxia (HH) exposure. This conclusion was primarily based on our observations of reduced phagocytosis in the spleen.

      (2) Additional experimental data on RBC labeling and erythrophagocytosis:

      • Experiment 1 (RBC labeling and HH exposure)

      We conducted an experiment where RBCs from mice were labeled with PKH67 and injected back into the mice. These mice were then exposed to normobaric normoxia (NN) or HH for 7 or 14 days. The subsequent assessment of RPMs in the spleen using flow cytometry and immunofluorescence detection revealed a significant decrease in both the population of splenic RPMs (F4/80hiCD11blo, new Figure 5A and C) and PKH67-positive macrophages after HH exposure (as depicted in new Figure 5A and C-E). This finding supports our original claim of reduced phagocytosis under HH conditions.

      Author response image 1.

      • Experiment 2 (erythrophagocytosis enhancement)

      To examine the effects of enhanced erythrophagocytosis, we injected Tuftsin after administering PKH67-labelled RBCs. Our observations showed a significant decrease in PKH67 fluorescence in the spleen, particularly after Tuftsin injection compared to the NN group. This result suggests a reduction in RBC lifespan when erythrophagocytosis is enhanced (illustrated in new Figure 7, A-B).

      Author response image 2.

      (3) Revised conclusions:

      • The additional data from these experiments support our original findings by providing a more comprehensive view of the impact of HH exposure on splenic erythrophagocytosis.

      • The decrease in phagocytic RPMs and phagocytic erythrocytes after HH exposure, along with the observed decrease in RBC lifespan following enhanced erythrophagocytosis, collectively suggest a more complex interplay between hypoxia, erythrophagocytosis, and RBC lifespan than initially interpreted.

      We think that these revisions and additional experimental data provide a more robust and detailed understanding of the effects of HH on splenic erythrophagocytosis and RBC lifespan. We hope that these changes adequately address the concerns raised and strengthen the conclusions drawn in our manuscript.

      (2) F4/80 high; CD11b low cells are true RPMs, whereas the cells the authors are presenting appear to be splenic monocytes / pre-RPMs. To discuss RPM function requires the presentation of these cells specifically rather than general cells in the proper area of the spleen.

      Thank you for your feedback requesting a substantial revision of our claims and interpretation, particularly considering additional data showing enhanced erythrophagocytosis with decreased RBC lifespan. In response, we have thoroughly revised our manuscript and included new experimental data that further elucidate the effects of HH on RPMs and erythrophagocytosis.

      (1) Re-evaluation of RPMs population after HH exposure:

      • Flow cytometry analysis (new Figure 3G, Figure 5A and B): We revisited the analysis of RPMs (F4/80hiCD11blo) in the spleen after 7 and 14 days of HH exposure. Our revised flow cytometry data consistently showed a significant decrease in the RPMs population post-HH exposure, reinforcing our initial findings.

      Author response image 3.

      Author response image 4.

      • In situ expression of RPMs (Figure S1, A-D):

      We further confirmed the decreased population of RPMs through in situ co-staining with F4/80 and CD11b, and F4/80 and CD68, in spleen tissues. These results clearly demonstrated a significant reduction in F4/80hiCD11blo (Figure S1, A and B) and F4/80hiCD68hi (Figure S1, C and D) cells following HH exposure.

      Author response image 5.

      (2) Single-cell sequencing analysis of splenic RPMs:

      • We conducted a single-cell sequencing analysis of spleen samples post 7 days of HH exposure (Figure S2, A-C). This analysis revealed a notable shift in the distribution of RPMs, predominantly associated with Cluster 0 under NN conditions, to a reduced presence in this cluster after HH exposure.

      • Pseudo-time series analysis indicated a transition pattern change in spleen RPMs, with a shift from Cluster 2 and Cluster 1 towards Cluster 0 under NN conditions, and a reverse transition following HH exposure (Figure S2, B and D). This finding implies a decrease in resident RPMs in the spleen under HH conditions.

      (3) Consolidated findings and revised interpretation:

      • The comprehensive analysis of flow cytometry, in situ staining, and single-cell sequencing data consistently indicates a significant reduction in the number of RPMs following HH exposure.

      • These findings, taken together, strongly support the revised conclusion that HH exposure leads to a decrease in RPMs in the spleen, which in turn may affect erythrophagocytosis and RBC lifespan.

      Author response image 6.

      In conclusion, our revised manuscript now includes additional experimental data and analyses, strengthening our claims and providing a more nuanced interpretation of the impact of HH on spleen RPMs and related erythrophagocytosis processes. We believe these revisions and additional data address your concerns and enhance the scientific validity of our study.

      (3) RBC retention in the spleen should in any case be measured quantitatively, e.g., with proper flow cytometry, to determine whether it is increased or decreased.

      Thank you for your query regarding the quantitative measurement of RBC retention in the spleen, particularly in relation to HH exposure. We have utilized a combination of techniques, including flow cytometry and histological staining, to investigate this aspect comprehensively. Below is a summary of our findings and methodology.

      (1) Flow cytometry analysis of labeled RBCs:

      • Our study employed both NHS-biotin (new Figure 4, A-D) and PKH67 labeling (new Figure 4, E-H) to track RBCs in mice exposed to HH. Flow cytometry results from these experiments (new Figure 4, A-H) showed a decrease in the proportion of labeled RBCs over time, both in the blood and spleen. Notably, the proportion of fluorescently labeled RBCs fell significantly more under NN exposure than under HH exposure, in both blood and spleen. The observed decrease in labeled RBCs was initially counterintuitive, as we expected an increase in RBC retention due to reduced erythrophagocytosis. However, this decrease can be attributed to the significantly increased production of RBCs following HH exposure, diluting the proportion of labeled cells.

      • Specifically, for blood, the biotin-labeled RBCs decreased by 12.06% under NN exposure and by 7.82% under HH exposure, while the PKH67-labeled RBCs decreased by 9.70% under NN exposure and by 4.09% under HH exposure. For spleen, the biotin-labeled RBCs decreased by 3.13% under NN exposure and by 0.46% under HH exposure, while the PKH67-labeled RBCs decreased by 1.16% under NN exposure and by 0.92% under HH exposure. These findings suggest that HH exposure leads to a decrease in the clearance rate of RBCs.
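
      The comparison in the two points above can be summarised as a ratio of the HH decrease to the NN decrease for each tissue and label. The short sketch below copies the percentages from this response; the dictionary layout and function name are illustrative assumptions, and the response does not state here whether the values are percentage points or relative changes.

```python
# Decreases in labeled RBCs (%), copied from the figures quoted in the response.
decrease = {
    ("blood", "biotin"): {"NN": 12.06, "HH": 7.82},
    ("blood", "PKH67"): {"NN": 9.70, "HH": 4.09},
    ("spleen", "biotin"): {"NN": 3.13, "HH": 0.46},
    ("spleen", "PKH67"): {"NN": 1.16, "HH": 0.92},
}


def hh_to_nn_ratio(values):
    """Ratio below 1 means the labeled fraction fell less under HH than NN."""
    return values["HH"] / values["NN"]


ratios = {key: hh_to_nn_ratio(v) for key, v in decrease.items()}
for (tissue, label), ratio in ratios.items():
    print(f"{tissue:6s} {label:6s} HH/NN = {ratio:.2f}")
```

      All four ratios are below 1, matching the statement that labeled RBCs declined less under HH than under NN in both compartments.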

      Author response image 7.

      (2) Detection of erythrophagocytosis in spleen:

      To assess erythrophagocytosis directly, we labeled RBCs with PKH67 and analyzed their uptake by splenic macrophages (F4/80hi) after HH exposure. Our findings (new Figure 5, D-E) indicated a decrease in PKH67-positive macrophages in the spleen, suggesting reduced erythrophagocytosis.

      Author response image 8.

      (3) Flow cytometry analysis of RBC retention:

      Our flow cytometry analysis revealed a decrease in PKH67-positive RBCs in both blood and spleen (Figure S4). We postulated that this was due to increased RBC production after HH exposure. However, this method might not accurately reflect RBC retention, as it measures the proportion of PKH67-labeled RBCs relative to the total number of RBCs, which increased after HH exposure.
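
      The caveat described here is a dilution effect: the labeled proportion can fall even if no labeled cell is cleared, simply because new unlabeled cells are added. The sketch below illustrates this with entirely hypothetical cell counts; the function name and numbers are assumptions chosen only to make the arithmetic visible.

```python
def labeled_fraction(labeled: float, unlabeled: float) -> float:
    """Proportion of labeled RBCs among all RBCs in the sample."""
    return labeled / (labeled + unlabeled)


labeled = 100.0  # hypothetical labeled RBC count, held constant (no clearance)

before = labeled_fraction(labeled, unlabeled=900.0)   # baseline pool
after = labeled_fraction(labeled, unlabeled=1900.0)   # extra unlabeled RBCs produced

print(f"before: {before:.1%}, after: {after:.1%}")
```

      In this toy case the labeled proportion halves although the number of labeled cells is unchanged, which is why a proportion-based readout alone cannot distinguish clearance from increased production.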

      Author response image 9.

      (4) Histological and immunostaining analysis:

      Histological examination using HE staining and band3 immunostaining in situ (new Figure 6, A-D, and G-H) revealed a significant increase in RBC numbers in the spleen after HH exposure. This was further confirmed by detecting retained RBCs in splenic single cells using Wright-Giemsa composite stain (new Figure 6, E and F) and retained PKH67-labelled RBCs in spleen (new Figure 6, I and J).

      Author response image 10.

      (5) Interpreting the data:

      The comprehensive analysis suggests a complex interplay between increased RBC production and decreased erythrophagocytosis in the spleen following HH exposure. While flow cytometry indicated a decrease in the proportion of labeled RBCs, histological and immunostaining analyses demonstrated an actual increase in RBC retention in the spleen. These findings collectively suggest that while overall RBC production is upregulated following HH exposure, the spleen's capacity for erythrophagocytosis is concurrently diminished, leading to increased RBC retention.

      (6) Conclusion:

      Taken together, our results indicate a significant increase in RBC retention in the spleen post-HH exposure, likely due to reduced residual RPMs and erythrophagocytosis. This conclusion is supported by a combination of flow cytometry, histological staining, and immunostaining techniques, providing a comprehensive view of RBC dynamics under HH conditions. We think these findings offer a clear quantitative measure of RBC retention in the spleen, addressing the concerns raised in your question.

      (4) Numerous other methodological problems as listed below.

      We appreciate your question, which highlights the importance of using multiple analytical approaches to understand complex physiological processes. Please find below our point-by-point response to the methodological comments.

      Reviewer #1 (Recommendations For The Authors):

      (1) Decreased BM and spleen monocytes due to increased liver monocyte migration is unclear. There is no evidence that this happens or why it would be a reasonable hypothesis, even in splenectomized mice.

      Thank you for highlighting the need for further clarification and justification of our hypothesized decrease in BM and spleen monocytes due to increased monocyte migration to the liver, particularly in the context of splenectomized mice. Indeed, our study has not explicitly verified an augmentation in mononuclear cell migration to the liver in splenectomized mice.

      Nonetheless, our investigations have revealed a notable increase in monocyte migration to the liver after HH exposure. Noteworthy is our discovery of a significant upregulation in colony stimulating factor-1 (CSF-1) expression in the liver, observed after both 7 and 14 days of HH exposure (data not included). This observation was substantiated through flow cytometry analysis (as depicted in Figure S4), which affirmed an enhanced migration of monocytes to the liver. Specifically, we noted a considerable increase in the population of transient macrophages, monocytes, and Kupffer cells in the liver following HH exposure.

      Author response image 11.

      Considering these findings, we hypothesize that hypoxic conditions may activate a compensatory mechanism that directs monocytes towards the liver, potentially linked to the liver’s integral role in the systemic immune response. In accordance with these insights, we intend to revise our manuscript to reflect the speculative nature of this hypothesis more accurately, and to delineate the strategies we propose for its further empirical investigation. This amendment ensures that our hypothesis is presented with full consideration of its speculative basis, supported by a coherent framework for future validation.

      (2) While the F4/80+CD11b+ population is decreased, this is mainly driven by CD11b, and the F4/80+-alone population is significantly increased. This is counter to the hypothesis.

      Thank you for addressing the apparent discrepancy in our findings concerning the F4/80+CD11b+ population and the increase in the F4/80+ alone population, which seems to contradict our initial hypothesis. Your observation is indeed crucial for the integrity of our study, and we appreciate the opportunity to clarify this matter.

      (1) Clarification of flow cytometry results:

      • In response to the concerns raised, we revisited our flow cytometry experiments with a focus on more clearly distinguishing the cell populations. Our initial graph had some ambiguities in cell grouping, which might have led to misinterpretations.

      • The revised flow cytometry analysis, specifically aimed at identifying red pulp macrophages (RPMs) characterized as F4/80hiCD11blo in the spleen, demonstrated a significant decrease in the F4/80 population. This finding is now in alignment with our immunofluorescence results.
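
      To make the gating definition concrete, the sketch below shows how events could be classified as F4/80hiCD11blo and counted as a fraction of all events. It is an illustration only: the thresholds, the `Event` class and the function names are hypothetical, and real gates are set per experiment from controls rather than from fixed cut-offs.

```python
from dataclasses import dataclass


@dataclass
class Event:
    f480: float   # F4/80 fluorescence intensity (arbitrary units)
    cd11b: float  # CD11b fluorescence intensity (arbitrary units)


# Hypothetical gate boundaries, chosen only for this illustration.
F480_HI = 1e4
CD11B_LO = 1e3


def is_rpm(event: Event) -> bool:
    """True when the event falls in the F4/80-high, CD11b-low gate."""
    return event.f480 >= F480_HI and event.cd11b < CD11B_LO


def rpm_fraction(events) -> float:
    """Fraction of events inside the RPM gate (0.0 for an empty sample)."""
    events = list(events)
    if not events:
        return 0.0
    return sum(is_rpm(e) for e in events) / len(events)


sample = [Event(2e4, 5e2), Event(3e4, 5e3), Event(5e3, 2e2), Event(1.5e4, 8e2)]
print(f"RPM fraction: {rpm_fraction(sample):.2f}")
```

      Gating on both markers at once is what separates this subset from cells that are F4/80-positive but CD11b-high, which is the distinction the reviewer asked for.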

      Author response image 12.

      Author response image 13.

      (2) Revised data and interpretation:

      • The results presented in new Figure 3G and Figure 5 (A and B) consistently indicate a notable reduction in the RPMs population following HH exposure. This supports our revised understanding that HH exposure leads to a decrease in the specific macrophage subset (F4/80hiCD11blo) in the spleen.

      We’ve updated our manuscript to reflect these new findings and interpretations. The revised manuscript details the revised flow cytometry analysis and discusses the potential mechanisms behind the observed changes in macrophage populations.

      (3) HO-1 expression cannot be used as a surrogate to quantify number of macrophages as the expression per cell can decrease and give the same results. In addition, the localization of effect to the red pulp is not equivalent to an assertion that the conclusion applies to macrophages given the heterogeneity of this part of the organ and the spleen in general.

      Thank you for your insightful comments regarding the use of HO-1 expression as a surrogate marker for quantifying macrophage numbers, and for pointing out the complexity of attributing changes in HO-1 expression specifically to macrophages in the splenic red pulp. Your observations are indeed valid and warrant a detailed response.

      (1) Role of HO-1 in macrophage activity:

      • In our study, HO-1 expression was not utilized as a direct marker for quantifying macrophages. Instead, it was considered an indicator of macrophage activity, particularly in relation to erythrophagocytosis. HO-1, being upregulated in response to erythrophagocytosis, serves as an indirect marker of this process within splenic macrophages.

      • The rationale behind this approach was that increased HO-1 expression, induced by erythrophagocytosis in the spleen’s red pulp, could suggest an augmentation in the activity of splenic macrophages involved in this process.

      (2) Limitations of using HO-1 as an indicator:

      • We acknowledge your point that HO-1 expression per cell might decrease, potentially leading to misleading interpretations if used as a direct quantifier of macrophage numbers. The variability in HO-1 expression per cell indeed presents a limitation in using it as a sole indicator of macrophage quantity.

      • Furthermore, your observation about the heterogeneity of the spleen, particularly the red pulp, is crucial. The red pulp is a complex environment with various cell types, and asserting that changes in HO-1 expression are exclusive to macrophages could oversimplify this complexity.
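
      The limitation acknowledged above can be made concrete with a minimal sketch. It assumes, purely for illustration, that a bulk HO-1 signal behaves like the product of cell number and mean per-cell expression; the function name and all numbers are hypothetical and are not data from the study.

```python
def bulk_signal(n_cells: int, expression_per_cell: float) -> float:
    """Hypothetical bulk marker signal: cell count times mean per-cell expression."""
    return n_cells * expression_per_cell


# Illustrative values only (not measurements from the manuscript).
baseline = bulk_signal(n_cells=1000, expression_per_cell=2.0)

# Scenario A: half the cells are lost, per-cell expression unchanged.
fewer_cells = bulk_signal(n_cells=500, expression_per_cell=2.0)

# Scenario B: cell number unchanged, per-cell expression halved.
lower_expression = bulk_signal(n_cells=1000, expression_per_cell=1.0)

# Both scenarios give the same bulk signal, so the bulk readout alone
# cannot distinguish cell loss from reduced expression per cell.
print(baseline, fewer_cells, lower_expression)
```

      Under this simplified assumption, a cell-type-specific count (for example, by flow cytometry) is needed alongside the HO-1 readout to separate the two scenarios.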

      (3) Addressing the concerns:

      • To address these concerns, we propose to supplement our HO-1 expression data with additional specific markers for macrophages. This would help in correlating HO-1 expression more accurately with macrophage numbers and activity.

      • We also plan to conduct further studies to delineate the specific cell types in the red pulp contributing to HO-1 expression. This could involve techniques such as immunofluorescence or immunohistochemistry, which would allow us to localize HO-1 expression to specific cell populations within the splenic red pulp.

      We’ve revised our manuscript to clarify the role of HO-1 expression as an indirect marker of erythrophagocytosis and to acknowledge its limitations as a surrogate for quantifying macrophage numbers.

      (4) line 63-65 is inaccurate as red cell homeostasis reaches a new steady state in chronic hypoxia.

      Thank you for pointing out the inaccuracy in lines 63-65 of our manuscript regarding red cell homeostasis in chronic hypoxia. Your feedback is invaluable in ensuring the accuracy and scientific integrity of our work. We’ve revised lines 63-65 to reflect that red cell homeostasis reaches a new steady state in chronic hypoxia.

      (5) Eryptosis is not defined in the manuscript.

      Thank you for highlighting the omission of a definition for eryptosis in our manuscript. We acknowledge the significance of precisely defining such key terminologies, particularly when they play a crucial role in the context of our research findings. Eryptosis, a term referenced in our study, is a specialized form of programmed cell death unique to erythrocytes. Similar to apoptosis in other cell types, eryptosis is characterized by distinct physiological changes including cell shrinkage, membrane blebbing, and the externalization of phosphatidylserine on the erythrocyte surface. These features are indicative of the RBC lifecycle and its regulated destruction process.

      However, it is pertinent to note that our current study does not extensively delve into the mechanisms or implications of eryptosis. Our primary focus has been to elucidate the effects of HH exposure on the processes of splenic erythrophagocytosis and the resultant impact on the lifespan of RBCs. Given this focus, and to maintain the coherence and relevance of our manuscript, we have decided to exclude specific discussions of eryptosis from our revised manuscript. This decision aligns with our aim to provide a clear and concentrated exploration of the influence of HH exposure on RBCs dynamics and splenic function.

      We appreciate your input, which has significantly contributed to enhancing the clarity and accuracy of our manuscript. The revision ensures that our research is presented with a focused scope, aligning closely with our experimental investigations and findings.

      (6) Physiologically, there is no evidence that there is any "free iron" in cells, making line 89 point inaccurate.

      Thank you for highlighting the concern regarding the reference to "free iron" in cells in line 89 of our manuscript. The term "free iron" in our manuscript was intended to refer to divalent iron (Fe2+), rather than unbound iron ions freely circulating within cells. We acknowledge that the term "free iron" might lead to misconceptions, as it implies the presence of unchelated iron, which is not physiologically common due to the potential for oxidative damage. To rectify this and provide clarity, we’ve revised line 89 of our manuscript to reflect our meaning more accurately. Instead of "free iron," we use "divalent iron (Fe2+)" to avoid any misunderstanding regarding the state of iron in cells. We also ensure that any implications drawn from the presence of Fe2+ in cells are consistent with current scientific literature and understanding.

      (7) Fig 1f no stats

      We appreciate your critical review and suggestions, which help in improving the accuracy and clarity of our research. We’ve revised the statistical chart for the new Figure 1F.

      (8) Splenectomy experiments demonstrate that erythrophagocytosis is almost completely replaced by functional macrophages in other tissues (likely Kupffer cells in the liver). There is only a minor defect, and there is no data on whether it is in fact the liver or other organs that provide this replacement function, which makes the assertions in lines 345-349 significantly overstated.

      Thank you for your critical assessment of our interpretation of the splenectomy experiments, especially concerning the role of erythrophagocytosis by macrophages in other tissues, such as Kupffer cells in the liver. We appreciate your observation that our assertions may be overstated and acknowledge the need for more specific data to identify which organs compensate for the loss of splenic erythrophagocytosis.

      (1) Splenectomy experiment findings:

      • Our findings in Figure 2D do indicate that in the splenectomized group under NN conditions, erythrophagocytosis is substantially compensated for by functional macrophages in other tissues. This is an important observation that highlights the body's ability to adapt to the loss of splenic function.

      • However, under HH conditions, our data suggest that the spleen plays an important role in managing erythrocyte turnover, as indicated by the significant impact of splenectomy on erythrophagocytosis and subsequent erythrocyte dynamics.

      (2) Addressing the lack of specific organ identification:

      • We acknowledge that our study does not definitively identify which organs, such as the liver or others, take over the erythrophagocytosis function post-splenectomy. This is an important aspect that needs further investigation.

      • To address this, we also plan to perform additional experiments that could more accurately point out the specific tissues compensating for the loss of splenic erythrophagocytosis. This could involve tracking labeled erythrocytes or using specific markers to identify macrophages actively engaged in erythrophagocytosis in various organs.

      (3) Revising manuscript statements:

      Considering your feedback, we’ve revised the statements in lines 345-349 (lines 378-383 in revised manuscript) to enhance the scientific rigor and clarity of our research presentation.

      (9) M1 vs M2 macrophage experiments are irrelevant to the main thrust of the manuscript, there are no references to support the use of only CD16 and CD86 for these purposes, and no stats are provided. It is also unclear why bone marrow monocyte data is presented and how it is relevant to the rest of the manuscript.

      Thank you for your critical evaluation of the relevance and presentation of the M1 vs. M2 macrophage experiments in our manuscript. We appreciate your insights, especially regarding the use of specific markers and the lack of statistical analysis, as well as the relevance of bone marrow monocyte data to our study's main focus.

      (1) Removal of M1 and M2 macrophage data:

      Based on your feedback and our reassessment, we agree that the results pertaining to M1 and M2 macrophages did not align well with the main objectives of our manuscript. Consequently, we have decided to remove the related content on M1 and M2 macrophages from the revised manuscript. This decision was made to ensure that our manuscript remains focused and coherent, highlighting our primary findings without the distraction of unrelated or insufficiently supported data.

      The use of only CD16 and CD86 markers for M1 and M2 macrophage characterization, without appropriate statistical analysis, was indeed a methodological limitation. We recognize that a more comprehensive set of markers and rigorous statistical analysis would be necessary for a meaningful interpretation of M1/M2 macrophage polarization. Furthermore, the relevance of these experiments to the central theme of our manuscript was not adequately established. Our study primarily focuses on erythrophagocytosis and red pulp macrophage dynamics under hypobaric hypoxia, and the M1/M2 polarization aspect did not contribute significantly to this narrative.

      (2) Clarification on bone marrow monocyte data:

      Regarding the inclusion of bone marrow monocyte data, we acknowledge that its relevance to the main thrust of the manuscript was not clearly articulated. In the revised manuscript, we provide a clearer rationale for its inclusion and how it relates to our primary objectives.

      (3) Commitment to clarity and relevance:

      We are committed to ensuring that every component of our manuscript contributes meaningfully to our overall objectives and research questions. Your feedback has been instrumental in guiding us to streamline our focus and present our findings more effectively.

      We appreciate your valuable feedback, which has led to a more focused and relevant presentation of our research. These changes enhance the clarity and impact of our manuscript, ensuring that it accurately reflects our key research findings.

      (10) Biotinylated RBC clearance is enhanced, demonstrating that RBC erythrophagocytosis is in fact ENHANCED, not diminished, calling into question the founding hypothesis that the manuscript proposes.

      Thank you for your critical evaluation of our data on biotinylated RBC clearance, which suggests enhanced erythrophagocytosis under HH conditions. This observation indeed challenges our founding hypothesis that erythrophagocytosis is diminished in this setting. Below is a summary of our findings and methodology.

      (1) Interpretation of RBC labeling results:

      Both the earlier NHS-biotin labeling results (new Figure 4, A-D) and the current PKH67 labeling results (new Figure 4, E-H) showed that the number of labeled RBCs decreased with time after injection. RBC production, in both bone marrow and spleen, was significantly increased following HH exposure, so the proportion of labeled RBCs detected by flow cytometry in the blood and spleen was consistently lower than in the NN group. However, the magnitude of the decline in labeled RBCs in blood and spleen was significantly smaller after HH exposure than under NN exposure. Specifically, for blood, the biotin-labeled RBCs decreased by 12.06% under NN exposure and by 7.82% under HH exposure, while the PKH67-labeled RBCs decreased by 9.70% under NN exposure and by 4.09% under HH exposure. For spleen, the biotin-labeled RBCs decreased by 3.13% under NN exposure and by 0.46% under HH exposure, while the PKH67-labeled RBCs decreased by 1.16% under NN exposure and by 0.92% under HH exposure.

      Author response image 14.
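
      For readers who wish to compare the two conditions directly, the short sketch below tabulates the decreases quoted in the text above and computes the NN minus HH difference for each tissue and label. The values are copied from this response; the dictionary layout and helper name are illustrative choices, and the calculation implies no statistical test.

```python
# Decreases in labeled RBCs (%) as quoted in the response text.
reported_decrease = {
    ("blood", "biotin"): {"NN": 12.06, "HH": 7.82},
    ("blood", "PKH67"): {"NN": 9.70, "HH": 4.09},
    ("spleen", "biotin"): {"NN": 3.13, "HH": 0.46},
    ("spleen", "PKH67"): {"NN": 1.16, "HH": 0.92},
}


def nn_minus_hh(values: dict) -> float:
    """Difference between the NN and HH decreases, rounded to 2 decimals."""
    return round(values["NN"] - values["HH"], 2)


differences = {key: nn_minus_hh(v) for key, v in reported_decrease.items()}

for (tissue, label), diff in differences.items():
    print(f"{tissue:6s} {label:6s} NN - HH = {diff:.2f}")
```

      A positive difference in every row corresponds to the smaller decline reported under HH exposure.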

      (2) Increased RBCs production under HH conditions:

      It's important to note that RBCs production, including from bone marrow and spleen, was significantly increased following HH exposure. This increase in RBCs production could contribute to the decreased proportion of labeled RBCs observed in flow cytometry analyses, as there are more unlabeled RBCs diluting the proportion of labeled cells in the blood and spleen.
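
      The dilution argument can be illustrated with a minimal sketch. It assumes a fixed pool of labeled RBCs with no clearance at all and a hypothetical 50% expansion of the unlabeled pool; the cell counts are invented for illustration and do not come from our experiments.

```python
def labeled_fraction(labeled: float, unlabeled: float) -> float:
    """Proportion of labeled cells among all RBCs in the sample."""
    return labeled / (labeled + unlabeled)


# Hypothetical starting pool: 100 labeled and 900 unlabeled cells.
labeled_cells = 100.0
unlabeled_cells = 900.0
fraction_before = labeled_fraction(labeled_cells, unlabeled_cells)

# Assume no labeled cell is cleared, but new (unlabeled) RBC production
# expands the unlabeled pool by 50%.
expanded_unlabeled = unlabeled_cells * 1.5
fraction_after = labeled_fraction(labeled_cells, expanded_unlabeled)

# The labeled proportion falls even though the labeled count is unchanged.
print(f"before: {fraction_before:.3f}, after: {fraction_after:.3f}")
```

      In this toy setting the labeled proportion drops purely through dilution, which is why the proportion of labeled cells alone cannot be read as a clearance rate.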

      (3) Analysis of erythrophagocytosis in RPMs:

      Our analysis of PKH67-labeled RBCs content within RPMs following HH exposure showed a significant reduction in the number of PKH67-positive RPMs in the spleen (new Figure 5). This finding suggests a decrease in erythrophagocytosis by RPMs under HH conditions.

      Author response image 15.

      (4) Reconciling the findings:

      The apparent contradiction between enhanced RBC clearance (suggested by the reduced proportion of labeled RBCs) and reduced erythrophagocytosis in RPMs (indicated by fewer PKH67-positive RPMs) may be explained by the increased overall production of RBCs under HH. This increased production could mask the actual erythrophagocytosis activity in terms of the proportion of labeled cells. Therefore, while the proportion of labeled RBCs decreases more significantly under HH conditions, this does not necessarily indicate an enhanced erythrophagocytosis rate, but rather an increased dilution effect due to higher RBCs turnover.

      (5) Revised interpretation and manuscript changes:

      Given these factors, we have updated our manuscript to reflect this detailed interpretation and clarify the implications of the increased RBC production under HH conditions for our observations of labeled RBC clearance and erythrophagocytosis. We appreciate your insightful feedback, which has prompted a careful re-examination of our data and interpretations. We hope that these revisions provide a more accurate and comprehensive understanding of the effects of HH on erythrophagocytosis and RBC dynamics.

      (11) Legend in Fig 4c-4d looks incorrect and Fig 4e-4f is very non-specific, since Wright stain does not provide evidence of what type of cells these are, making for a significant overstatement in the contribution of this data to "confirming" increased erythrophagocytosis in the spleen under HH exposure (line 395-396).

      Thank you for your insightful observations regarding the data presentation and figure legends in our manuscript, particularly in relation to Figure 4 (renamed as Figure 6 in the revised manuscript) and the use of Wright-Giemsa composite staining. We appreciate your constructive feedback and acknowledge the importance of presenting our data with utmost clarity and precision.

      (1) Amendments to Figure legends:

      We recognize the necessity of rectifying inaccuracies in the legends of the previously labeled Figure 4C and D. Corrections have been implemented to ensure the legends accurately describe the data presented. Additionally, we acknowledge the error concerning the description of Wright staining. The method employed in our study is Wright-Giemsa composite staining, which, unlike Wright staining that solely stains cytoplasm (RBC), is capable of staining both nuclei and cytoplasm.

      (2) Addressing the specificity of Wright-Giemsa Composite staining:

      Our approach involved quantifying RBC retention using Wright-Giemsa composite staining on single splenic cells post-perfusion at 7 and 14 days post HH exposure. We understand and appreciate your concerns regarding the nonspecific nature of Wright staining. Although Wright stain is a general hematologic stain and not explicitly specific for certain cell types, its application in our study aimed to provide preliminary insights. The spleen cells, devoid of nuclei and thus likely to be RBCs, were stained and observed post-perfusion, indicating RBC retention within the spleen.

      (3) Incorporating additional methods for RBC identification:

      To enhance the specificity of our findings, we integrated supplementary methods for RBC identification in the revised manuscript. We employed band3 immunostaining (in the new Figure 6, C-D and G-H) and PKH67 labeling (Figure 6, I-J) for a more targeted identification of RBCs. Band3, serving as a reliable marker for RBCs, augments the specificity of our immunostaining approach. Likewise, PKH67 labeling affords a direct and definitive means to assess RBC retention in the spleen following HH exposure.

      Author response image 16 (same as Author response image 10).

      (4) Revised interpretation and manuscript modifications:

      Based on these enhanced methodologies, we have refined our interpretation of the data and accordingly updated the manuscript. The revised narrative underscores that our conclusions regarding reduced erythrophagocytosis and RBC retention under HH conditions are corroborated by not only Wright-Giemsa composite staining but also by band3 immunostaining and PKH67 labeling, each contributing distinctively to our comprehensive understanding.

      We are committed to ensuring that our manuscript precisely reflects the contribution of each method to our findings and conclusions. Your thorough review has been invaluable in identifying and rectifying areas for improvement in our research report and interpretation.

      (12) Ferroptosis data in Fig 5 is not specific to macrophages and Fer-1 data confirms the expected effect of Fer-1 but there is no data that supports that Fer-1 reverses the destruction of these cells or restores their function in hypoxia. Finally, these experiments were performed in peritoneal macrophages which are functionally distinct from splenic RPM.

      Thank you for your critique of our presentation and interpretation of the ferroptosis data in Figure 5 (renamed as Figure 9 in the revised manuscript), as well as your observations regarding the specificity of the experiments to macrophages and the effects of Fer-1. We value your input and acknowledge the need to clarify these aspects in our manuscript.

      (1) Clarification on cell type used in experiments:

      • We appreciate your attention to the details of our experimental setup. The experiments presented in Figure 9 were indeed conducted on splenic macrophages, not peritoneal macrophages, as incorrectly mentioned in the original figure legend. This was an error in our manuscript, and we have revised the figure legend accordingly to accurately reflect the cell type used.

      (2) Specificity of ferroptosis data:

      • We recognize that the data presented in Figure 9 need to be more explicitly linked to the specific macrophage population being studied. In the revised manuscript, we ensure that the discussion around ferroptosis data is clearly situated within the framework of splenic macrophages.

      • We also provide additional methodological details in the 'Methods' section to reinforce the specificity of our experiments to splenic macrophages.

      (3) Effects of Fer-1 on macrophage function and survival:

      • Regarding the effect of Fer-1, we agree that while our data confirms the expected effect of Fer-1 in inhibiting ferroptosis, we have not provided direct evidence that Fer-1 reverses the destruction of macrophages or restores their function in hypoxia.

      • To address this, we propose additional experiments to specifically investigate the impact of Fer-1 on the survival and functional restoration of splenic macrophages under hypoxic conditions. This would involve assessing not only the inhibition of ferroptosis but also the recovery of macrophage functionality post-treatment.

      (4) Revised interpretation and manuscript changes:

      • We’ve revised the relevant sections of our manuscript to reflect these clarifications and proposed additional studies. This includes modifying the discussion of the ferroptosis data to more accurately represent the cell types involved and the limitations of our current findings regarding the effects of Fer-1.

      • The revised manuscript presents a more detailed interpretation of the ferroptosis data, clearly describing what our current experiments demonstrate and what remains to be investigated.

      We are grateful for your insightful feedback, which has highlighted important areas for improvement in our research presentation. We think that these revisions will enhance the clarity and scientific accuracy of our manuscript, ensuring that our findings and conclusions are well-supported and precisely communicated.

      Reviewer #2 (Recommendations For The Authors):

      The following questions and remarks should be considered by the authors:

      (1) The methods should clearly state whether the HH was discontinued during the 7 or 14 day exposure for cleaning, fresh water etc. Moreover, how was CO2 controlled? The procedure for splenectomy needs to be described in the methods.

      Thank you for your inquiry regarding the specifics of our experimental methods, particularly the management of HH exposure and the procedure for splenectomy. We appreciate your attention to detail and the importance of these aspects for the reproducibility and clarity of our research.

      (1) HH exposure conditions:

      In our experiments, mice were continuously exposed to HH for the entire duration of 7 or 14 days, without interruption for activities such as cleaning or providing fresh water. This uninterrupted exposure was crucial for maintaining consistent hypobaric conditions throughout the experiment. The hypobaric chamber was configured to ensure a ventilation rate of 25 air exchanges per minute. This high ventilation rate was effective in regulating the concentration of CO2 inside the chamber, thereby maintaining a stable environment for the mice.

      (2) The splenectomy was performed as follows:

      After anesthesia, the mice were placed in a supine position, and their limbs were fixed. The abdominal operation area was shaved, disinfected, and covered with a sterile towel. A median incision was made in the upper abdomen, followed by laparotomy to locate the spleen. The spleen was then carefully pulled out through the incision. The arterial and venous directions in the splenic pedicle were examined, and two vascular forceps were used to clamp all the tissue in the main trunk of blood vessels below the splenic portal. The splenic pedicle was cut between the forceps to remove the spleen. The end of the proximal hepatic artery was clamped with a vascular clamp, and double or through ligation was performed to secure the site. The abdominal cavity was then cleaned to ensure there was no bleeding at the ligation site, and the incision was closed. Post-operatively, the animals were housed individually. Generally, they were able to feed themselves after recovering from anesthesia and did not require special care.

      We hope this detailed description addresses your queries and provides a clear understanding of the experimental conditions and procedures used in our study. These methodological details are crucial for ensuring the accuracy and reproducibility of our research findings.

      (2) The lack of changes in MCH needs explanation. During stress erythropoiesis some limit in iron availability should cause MCH decrease, particularly if the authors claim that macrophages for rapid iron recycling are decreased. Fig 1A is dispensable. Fig 1G NN control 14 days does not make sense since it is higher than 7 days of HH.

      Thank you for your inquiry regarding the lack of changes in Mean Corpuscular Hemoglobin (MCH) in our study, particularly in the context of stress erythropoiesis and decreased macrophage-mediated iron recycling. We appreciate the opportunity to provide further clarification on this aspect.

      (1) Explanation for stable MCH levels:

      • Our research identified a decrease in erythrophagocytosis and iron recycling in the spleen following HH exposure. Despite this, the MCH levels remained stable. This observation can be explained by considering the compensatory roles of other organs, particularly the liver and duodenum, in maintaining iron homeostasis.

      • Specifically, our investigations revealed an enhanced capacity of the liver to engulf RBCs and process iron under HH conditions. This increased hepatic erythrophagocytosis likely compensates for the reduced splenic activity, thereby stabilizing MCH levels.

      (2) Role of hepcidin and DMT1 expression:

      Additionally, hypoxia is known to influence iron metabolism through the downregulation of Hepcidin and upregulation of Divalent Metal Transporter 1 (DMT1) expression. These alterations lead to enhanced intestinal iron absorption and increased blood iron levels, further contributing to the maintenance of MCH levels despite reduced splenic iron recycling.

      (3) Revised Figure 1 and data presentation:

      To address the confusion regarding the data presented in Figure 1G, we have made revisions in our manuscript. The original Figure 1G, which did not align with the expected trends, has been removed. In its place, we have included a statistical chart of Figure 1F in the new version of Figure 1G. This revision will provide a clearer and more accurate representation of our findings.

      (4) Manuscript updates and future research:

      • We have updated our manuscript to incorporate these explanations, ensuring that the rationale behind the stable MCH levels is clearly articulated. This includes a discussion on the role of the liver and duodenum in iron metabolism under hypoxic conditions.

      • Future research could explore in greater detail the mechanisms by which different organs contribute to iron homeostasis under stress conditions like HH, particularly focusing on the dynamic interplay between hepatic and splenic functions.

      We thank you for your insightful question, which has prompted a thorough re-examination of our findings and interpretations. We believe that these clarifications will enhance the overall understanding of our study and its implications in the context of iron metabolism and erythropoiesis under hypoxic conditions.

      (3) Fig 2 the difference between sham and splenectomy is really marginal and not convincing. Is there also a difference at 7 days? Why does the spleen size decrease between 7 and 14 days?

      Thank you for your observations regarding the marginal differences observed between sham and splenectomy groups in Figure 2, as well as your inquiries about spleen size dynamics over time. We appreciate this opportunity to clarify these aspects of our study.

      (1) Splenectomy vs. Sham group differences:

      • In our experiments, the difference between the sham and splenectomy groups under HH conditions, though subtle, was consistent with our hypothesis regarding the spleen's role in erythrophagocytosis and stress erythropoiesis. Under NN conditions, no significant difference was observed between these groups, which aligns with the expectation that the spleen's contribution is more pronounced under hypoxic stress.

      (2) Spleen size dynamics and peak stress erythropoiesis:

      • The observed splenic enlargement prior to 7 days can be attributed to a combination of factors, including the retention of RBCs and extramedullary hematopoiesis, which is known to be a response to hypoxic stress.

      • Prior research has shown that splenic stress erythropoiesis triggered by hypoxia typically peaks within 3 to 7 days. This is consistent with our Toluidine Blue (TO) staining results, which indicated that the response peaks at day 7 (Figure 1, F-G). After this peak, extramedullary hematopoiesis declines, which could explain the decrease in spleen size between 7 and 14 days.

      • This pattern of splenic response under prolonged hypoxic stress is corroborated by studies such as those conducted by Wang et al. (2021), Harada et al. (2015), and Cenariu et al. (2021). These references collectively underscore that the spleen undergoes significant dynamism in reaction to sustained hypoxia. This dynamism is initially manifested as an enlargement of the spleen, attributable to escalated erythropoiesis and erythrophagocytosis. Subsequently, as these processes approach normalization, a regression in spleen size ensues.

      We’ve revised our manuscript to include a more detailed explanation of these splenic dynamics under HH conditions, referencing the relevant literature to provide a comprehensive context for our findings. We will also consider performing additional analysis or providing further data on spleen size changes at 7 days to support our observations and ensure a thorough understanding of the splenic response to hypoxic stress over time.

      (4) Fig 3 B the clusters should be explained in detail. If the decrease in macrophages in Fig 3K/L is responsible for the effect, why does splenectomy not have a much stronger effect? How do the authors know which cells died in the calcein stained population in Fig 3D?

      Thank you for your insightful questions regarding the details of our data presentation in Figure 3, particularly about the identification of cell clusters and the implications of macrophage reduction. We appreciate the opportunity to address these aspects and clarify our findings.

      (1) Explanation of cell clusters in Figure 3B:

      • In the revised manuscript, we have included detailed notes for each cell population represented in Figure 3B (Figure 3D in revised manuscript). These notes provide a clearer understanding of the cell types present in each cluster, enhancing the interpretability of our single-cell sequencing data.

      • This detailed annotation will help readers to better understand the composition of the splenic cell populations under study and how they are affected by hypoxic conditions.

      (2) Impact of splenectomy vs. macrophage reduction:

      • The interplay between the reduction in macrophage populations, as evidenced by our single-cell sequencing data, and the ramifications of splenectomy presents a multifaceted scenario. Notably, the observed decline in macrophage numbers following HH exposure does not straightforwardly equate to a comparable alteration in overall splenic function, as might be anticipated with splenectomy.

      • In the context of splenectomy under HH conditions, a significant escalation in the RBCs count was observed, surpassing that in non-splenectomized mice exposed to HH. This finding underscores the spleen's critical role in modulating RBCs dynamics under HH. It also indirectly suggests that the diminished phagocytic capacity of the spleen following HH exposure contributes to an augmented RBCs count, albeit to a lesser extent than in the splenectomy group. This difference is attributed to the fact that, while the number of RPMs in the spleen post-HH is reduced, they are still present, unlike in the case of splenectomy, where they are entirely absent.

      • Splenectomy entails the complete removal of the spleen, thus eliminating a broad spectrum of functions beyond erythrophagocytosis and iron recycling mediated by macrophages. The nuanced changes observed in our study may be reflective of the spleen's diverse functionalities and the organism's adaptive compensatory mechanisms in response to the loss of this organ.

      (3) Calcein stained population in Figure 3D:

      • Regarding the identification of cell death in the calcein-stained population in Figure 3D (Figure 3A in revised manuscript), we acknowledge that the specific cell types undergoing death could not be distinctly determined from this analysis alone.

      • The calcein staining method allows for the visualization of live (calcein-positive) and dead (calcein-negative) cells, but it does not provide specific information about the cell types. The decrease in macrophage population was inferred from the single-cell sequencing data, which offered a more precise identification of cell types.

      (4) Revised manuscript and data presentation:

      • Considering your feedback, we have revised our manuscript to provide a more comprehensive explanation of the data presented in Figure 3, including the nature of the cell clusters and the interpretation of the calcein staining results.

      • We have also updated the manuscript to reflect the removal of Figure 3K/L results and to provide a more focused discussion on the relevant findings.

      We are grateful for your detailed review, which has helped us to refine our data presentation and interpretation. These clarifications and revisions will enhance the clarity and scientific rigor of our manuscript, ensuring that our conclusions are well-supported and accurately conveyed.

      (5) Is the reduced phagocytic capacity in Fig 4B significant? Erythrophagocytosis is compromised due to the considerable spontaneous loss of labelled erythrocytes; could other assays help? (potentially by a modified Chromium release assay?). Is it necessary to stimulate phagocytosis to see a significant effect?

      Thank you for your inquiry regarding the significance of the reduced phagocytic capacity observed in Figure 4B, and the potential for employing alternative assays to elucidate erythrophagocytosis dynamics under HH conditions.

      (1) Significance of reduced phagocytic capacity:

      The observed reduction in the amplitude of fluorescently labeled RBCs in both the blood and spleen under HH conditions suggests a decrease in erythrophagocytosis. This is indicative of a diminished phagocytic capacity, particularly when contrasted with NN conditions.

      (2) Investigation of erythrophagocytosis dynamics:

      To delve deeper into erythrophagocytosis under HH, we employed Tuftsin to enhance this process. Following the injection of PKH67-labeled RBCs and subsequent HH exposure, we noted a significant decrease in PKH67 fluorescence in the spleen, particularly marked after the administration of Tuftsin. This finding implies that stimulated erythrophagocytosis can influence RBC lifespan.

      (3) Erythrophagocytosis under normal and hypoxic conditions:

      Under normal conditions, the reduction in phagocytic activity is less apparent without stimulation. However, under HH conditions, our findings demonstrate a clear weakening of the phagocytic effect. While we established that promoting phagocytosis under NN conditions affects RBC lifespan, the impact of enhanced phagocytosis under HH on RBC numbers was not explicitly investigated.

      (4) Potential for alternative assays:

      Considering the considerable spontaneous loss of labeled erythrocytes, alternative assays such as a modified Chromium release assay could provide further insights. Such assays might offer a more nuanced understanding of erythrophagocytosis efficiency and the stability of labeled RBCs under different conditions.

      (5) Future research directions:

      The implications of these results suggest that future studies should focus on comparing the effects of stimulated phagocytosis under both NN and HH conditions. This would offer a clearer picture of the impact of hypoxia on the phagocytic capacity of macrophages and the subsequent effects on RBC turnover.

      In summary, our findings indicate a diminished erythrophagocytic capacity, with enhanced phagocytosis affecting RBC lifespan. Further investigation, potentially using alternative assays, would be beneficial to comprehensively understand the dynamics of erythrophagocytosis in different physiological states.

      (6) Can the observed ferroptosis be influenced by bi- and not trivalent iron chelators?

      Thank you for your question regarding the potential influence of bi- and trivalent iron chelators on ferroptosis under hypoxic conditions. We appreciate the opportunity to discuss the implications of our findings in this context.

      (1) Analysis of iron chelators on ferroptosis:

      In our study, we did not specifically analyze the effects of bi- and trivalent iron chelators on ferroptosis under hypoxia. However, our observations with Deferoxamine (DFO), a well-known iron chelator, provide some insights into how iron chelation may influence ferroptosis in splenic macrophages under hypoxic conditions.

      (2) Effect of DFO on oxidative stress markers:

      Our findings showed that under 1% O2, there was an increase in Malondialdehyde (MDA) content, a marker of lipid peroxidation, and a decrease in Glutathione (GSH) content, indicative of oxidative stress. These changes are consistent with the induction of ferroptosis, which is characterized by increased lipid peroxidation and depletion of antioxidants. Treatment with Ferrostatin-1 (Fer-1) and DFO effectively reversed these alterations. This suggests that DFO, like Fer-1, can mitigate ferroptosis in splenic macrophages under hypoxia, primarily by impacting MDA and GSH levels.

      Author response image 17.

      (3) Potential role of iron chelators in ferroptosis:

      The effectiveness of DFO in reducing markers of ferroptosis indicates that iron availability plays a crucial role in the ferroptotic process under hypoxic conditions. It is plausible that both bi- and trivalent iron chelators could influence ferroptosis, given their ability to modulate iron availability within cells. Since ferroptosis is an iron-dependent form of cell death, chelating iron, irrespective of its valence state, could potentially disrupt the process by limiting the iron necessary for the generation of reactive oxygen species and lipid peroxidation.

      (4) Additional research and manuscript updates:

      Our study highlights the need for further research to explore the differential effects of various iron chelators on ferroptosis, particularly under hypoxic conditions. Such studies could provide a more comprehensive understanding of the role of iron in ferroptosis and the potential therapeutic applications of iron chelators. We update our manuscript to include these findings and discuss the potential implications of iron chelation in the context of ferroptosis under hypoxic conditions. This will provide a broader perspective on our research and its significance in understanding the mechanisms of ferroptosis.

    2. Reviewer #2 (Public Review):

      The authors aimed at elucidating the development of high altitude polycythemia which affects mice and men staying in a hypoxic atmosphere at high altitude (hypobaric hypoxia; HH). HH causes increased erythropoietin production which stimulates the production of red blood cells. The authors hypothesize that increased production is only partially responsible for exaggerated red blood cell production, i.e. polycythemia, but that decreased erythrophagocytosis in the spleen contributes to high red blood cell counts.

      The main strength of the study is the use of a mouse model exposed to HH in a hypobaric chamber. However, not all of the reported results are convincing due to some smaller effects which one may doubt to result in the overall increase in red blood cells as claimed by the authors. Moreover, direct proof for reduced erythrophagocytosis is compromised due to a strong spontaneous loss of labelled red blood cells, although effects of labelled E. coli phagocytosis are shown.

      Comments on latest version:

      The authors have partly addressed my comments.

      (1) The response to my question regarding unchanged MCH is a kind of "hand waving" - maybe it would require substantially more extensive work to clarify this issue.

      (2) The moderate if not marginal difference in normal vs splenectomy argues against a significant role of the spleen - even if the difference was slightly larger in HH

      (3) There is still overinterpretation of data. My Q was: Is the reduced phagocytic capacity in Fig 4B significant? Response: "This is indicative of a diminished phagocytic capacity, particularly when contrasted with NN conditions." I guess that is a "no".

      (4) I assume my question with respect to bi- or trivalent iron chelators was misunderstood.

      In general, as indicated above, it is an interesting hypothesis which is corroborated by data in several instances. Maybe the scientific community should decide whether it is all in all conclusive.

    3. Reviewer #3 (Public Review):

      The manuscript by Yang et al. investigated in mice how hypobaric hypoxia can modify the RBC clearance function of the spleen, a concept that is of interest. Via interpretation of their data, the authors proposed a model that hypoxia causes an increase in cellular iron levels, possibly in RPMs, leading to ferroptosis, and downregulates their erythrophagocytic capacity.

      Comments on revised version:

      The manuscript has now improved with all the new data, supporting the model proposed by the authors. However, it remains not very easy to follow for the conclusions and experimental details. Some of the most important remaining comments are listed below:

      (1) Lines 401-406 - The conclusions in this new fragment sound a bit overstated - the authors do not directly measure erythrophagocytosis capacity, only the total RBC parameters in the circulation. The increase is also very mild biologically between sham and splenectomized mice in HH conditions.

      (2) scRNA seq data are still presented in a way that is very difficult to understand. The readers could not see from the graphics that macrophages are depleted. The clusters are not labelled - some clusters in the bin 'macrophages+DC' seem actually to be more represented in Fig. 3E; Fig. 3F does not correspond to Fig. 3D. It would be maybe more informative to present like in Figure D side by side NN versus HH? The authors could consider moving the data from supplements that relate to RPMs to the main figure and making it consistent for the Clusters - eg, the authors show data for Cluster 0 in the supplement, and the same Cluster is not marked as macrophages in the main figure. This is quite difficult to follow.

      (3) Figure 3G likely has mislabeled axes for F4/80 and CD11b - such mistakes should be avoided in a second revised version of the manuscript, and this data is now redundant with the data shown as new Figure 5A.

      (4) The data from new Figure 4 should be better mentioned in the main body of the manuscript - all panels are mentioned twice in the text, first speaking about the decline of labelled RBCs and second referring to phagocytic capacity, whereas this figure only illustrates the decline of labelled RBCs, not directly phagocytic capacity of RPMs. What is lacking, as opposed to typical RBC life span assay, is the time '0' ('starting point') - this is particularly important as we can observe a big drop in labelled RBCs for eg 7 days between NN and HH group, actually implying increased removal of labelled RBCs within the first days of hypoxia exposure. What should be better labelled in this figure is that the proportion of RBCs are labelled RBCs not all RBCs (Y axis in individual panels). Overall, the new Figure 4 brings new data to the study, but how it is presented and discussed is not at the 'state-of-the-art' level (eg, missing the time '0') and is not very straightforward to the reader.

      (5) In Figure 7, the experiments with Tuftsin are not very easy to follow, especially for the major conclusions. In panels A and B, the focus is the drug itself under NN conditions, with RBC removal as a readout. Then, in the next panels, the authors introduce HH, and then look at the F4/80 and iron staining. What was exactly the major point the authors wanted to make here?

      (6) The data from Figure 8 are informative but do not address the individual cell types - eg, a drop in HO1 or FT may be due to the depletion of RPMs. An increase of TFR1 could be due to the retention of RBCs, the same as maybe labile iron. The data from PBMC are only very loosely linked to these phenotypes observed in the total spleen, and the reason for the regulation of the same proteins in PBMC might be different. It goes back to the data in Figure 3A-C, where also total splenocytes are investigated for their viability.

      (7) Can the authors provide the data for the purity (eg cell surface markers) of their primary splenic macrophage cultures? Only ensuring that these are macrophages or addressing the readouts from Figure 8 in RPMs could link ferroptosis to RPMs under HH conditions.

      (8) None of the data are presented as individual data points, contrary to what is now widely applied in papers.

      (9) No gating strategies are nicely illustrated or described.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study provides insights into the IDA peptide with dual functions in development and immunity. The approach used is solid and helps to define the role of IDA in a two-step process, cell separation followed by activation of innate defenses. The main limitation of the study is the lack of direct evidence linking signaling by IDA and its HAE receptors to immunity. As such the work remains descriptive but it will nevertheless be of interest to a wide range of plant cell biologists.

      We thank the reviewers for thoroughly reading our manuscript. We have used their comments and suggestions to improve the manuscript. Below is a response to the reviewers' comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      The paper titled 'A dual function of the IDA peptide in regulating cell separation and modulating plant immunity at the molecular level' by Olsson Lalun et al., 2023 aims to understand how IDA-HAE/HSL2 signalling modulates immunity, a pathway that has previously been implicated in development. This is a timely question to address as conflicting reports exist within the field. IDL6/7 have previously been shown to negatively regulate immune signalling, disease resistance and stress responses in leaf tissue, however IDA has been shown to positively regulate immunity through the shedding of infected tissues. Moreover, recently the related receptor NUT/HSL3 has been shown to positively regulate immune signalling and disease resistance. This work has the potential to bring clarity to this field, however the manuscript requires some additional work to address these questions. This is especially the case as it contradicts some previous work with IDL peptides which are perceived by the same receptor complexes.

      Can IDA induce pathogen resistance? Does the infiltration of IDA into leaf tissue enhance or reduce pathogen growth? Previously it has been shown that IDL6 makes plants more susceptible. Is this also true for IDA? Currently cytoplasmic calcium influx and apoplastic ROS are overinterpreted as immune responses - these can also be induced by many developmental cues e.g. CLE40 induced calcium transients. Whilst gene expression is more specific, it is also true that treatment with synthetic peptides, which are recognised by LRR-RKs, can induce immune gene expression, especially in the short term, even when that is not their in vivo function e.g. doi.org/10.15252/embj.2019103894.

      We thank the reviewer for the concerns raised and agree that further experiments including pathogen assays would strengthen the link between IDA signaling and immunity and we plan for such experiments in future work. We have, however, modified the discussion to include the possible role of IDA induced Ca2+ and ROS during development. We have recently published a preprint (accepted for publication in JXB) (Galindo-Trigo et al., 2023, https://doi.org/10.1101/2023.09.12.557497) strengthening the link between IDA and defense by identifying WRKY transcription factors that regulate IDA expression through a Y1H assay.

      This paper shows that receptors other than hae/hsl2 are genetically required to induce defense gene expression, it would have been interesting to see what phenotype would be associated with higher order mutants of closely related haesa/haesa-like receptors. Indeed recently HSL1 has been shown to function as a receptor for IDA/IDL peptides. Could the triple mutant suppress all response? Could the different receptors have distinct outputs? For example for FRK1 gene expression the hae hsl2 mutant has an enhanced response. Could defence gene expression be primarily mediated by HSL1 with subfunctionalisation within this clade?

      We agree that it would be interesting to also include HSL1 in our studies. However, the focus of this study has been on HAE and HSL2 and we wanted to explore their role in IDA induced defense responses. Including HSL1 in these studies will require generation of multiple transgenic lines and repeating most of the experiments and are experiments we will consider in a follow up study together with pathogen assays (that would also address the main concern raised in the comment above). We have however, modified the text to include the known function of HSL1 and discuss the possibility of subfunctionalisation of this receptor clade.

      One striking finding of the study is the strong additive interaction between IDA and flg22 treatment on gene expression. Do the authors also see this for co-treatment of different peptides with flg22, or is this unique function of IDA? Is this receptor dependent (HAE/HSL1/HSL2)?

      This is a good question. Since our study focuses on the IDA signaling pathway we preferentially tested if the additive effect observed between flg22 and mIDA was also observed when mIDA was combined with another peptide involved in defense. The endogenous peptide PIP1 has previously been shown to amplify flg22 signaling (Hou et al., 2014, doi:10.1371/journal.ppat.1004331). In that study it is shown that co-treatment with flg22 and PIP1 gives increased resistance to Pseudomonas PstDC3000 compared to when plants are treated with each peptide separately. In the same study, the authors also show reduced flg22-induced transcriptional activity of two defense-related genes, WRKY33 and PR, in the receptor like kinase7 (rlk7) mutant (the receptor perceiving PIP1). To investigate whether PIP1 would give the same additive effect with mIDA as that observed between flg22 and mIDA, we co-treated seedlings with PIP1 and mIDA. We observed no enhanced transcriptional activity of FRK1, MYB51 and PEP3 in tissue from plants treated with both PIP1 and mIDA peptides compared to single exposure. These results are presented in supplementary figure 11. In conclusion we do not think mIDA acts as a general amplifier of all immune elicitors in plants.

      It is interesting how tissue specific calcium responses are in response to IDA and flg22, suggesting the cellular distribution of their cognate receptors. However, one striking observation made by the authors as well, is that the expression of promoter seems to be broader than the calcium response. Indicating that additional factors are required for the observed calcium response. Could diffusion of the peptide be a contributing factor, or are only some cells competent to induce a calcium response?

      It is interesting that the authors look for floral abscission phenotypes in cngc and rbohd/f mutants to conclude for genetic requirement of these in floral abscission. Do the authors have a hypothesis for why they failed to see a phenotype for the rbohd/f mutant as was published previously? Do you think there might be additional players redundantly mediating these processes?

      It is a possibility that diffusion of the peptide plays a role in the observed response. In a biological context we would assume that the local production of the peptides plays an important role in the cellular responses. In our experimental setup, we add the peptide externally and we can therefore assume that the overlaying cells get in contact with the peptide before cells in the inner tissues and this could be affecting the response recorded. However, our results show that there are differences between flg22 and mIDA induced responses even when the application of the peptides is performed in the same manner, indicating that the difference in the response is not primarily due to the diffusion rate of the peptides but is likely due to different factors being present in different cells. To acquire a better picture of the distribution of receptor expression in the root tissue and to investigate in which cells the receptors have an overlapping expression pattern, we have included results in figure 6 showing plant lines co-expressing transcriptional reporters of FLS2 and HAE or HSL2.

      Can you observe callose deposition in the cotyledons of the 35S::HAE line? Are the receptors expressed in native cotyledons? This is the only phenotype tested in the cotyledons.

      We thank the reviewer for this valuable comment. We have now conducted a callose deposition assay on the 35S:HAE line. Indeed, we observe callose depositions when cotyledons from a 35S:HAE line are treated with mIDA. We have included these results in figure 4 and have adjusted the text regarding the callose assay accordingly. In addition, we have analyzed the promoter activity of pHAE in cotyledons and we observe weak promoter activity. These results are included as supplementary figure 1d.

      Are flg22-induced calcium responses affected in hae hsl2?

      The experiment suggested by the reviewer is an important control to ensure that the hae hsl2-Aeq line can respond to a Ca2+ inducing peptide signaling through a different receptor than HAE or HSL2. One would expect to see a Ca2+ response in this line to the flg22 peptide. We performed this experiment and surprisingly we could not detect a flg22 induced Ca2+ signal in the hae hsl2 mutant. As it is unlikely that the Ca2+ response triggered by flg22 is dependent on HAE and HSL2 we have to assume that the lack of response is due to a malfunction of the Aeq sensor in this line. As a control to measure the amount of Aeq present in the cells we treat the Aeq seedlings with 2 M CaCl2 and measure the luminescence constantly for 180 seconds (Ranf et al., 2012, DOI: 10.1093/mp/ssr064). The CaCl2 treatment disrupts the cells and releases the Aeq sensor into the solution where it will react with Ca2+ and release the total possible response in the sample (Lmax) in form of a luminescent peak. When treating the hae hsl2-Aeq line with CaCl2 we observe a luminescent peak, indicating the presence of the sensor; however, the response is reduced compared to WT seedlings expressing Aeq. Given the sensitivity of FLS2 to flg22 one would still expect to see a Ca2+ peak in the hae hsl2-Aeq line even if the amount of sensor is reduced. Given that this is not the case, we have to assume that localization or conformation of the sensor is somehow affected in this line or that there is another biological explanation that we cannot explain at the moment.

      We have therefore opted to omit the results using the hae hsl2 Aeq lines from the manuscript and are in the process of mutating HAE and HSL2 by CRISPR-Cas9 in the Aeq background to verify that the mIDA triggered Ca2+ response is dependent on HAE and HSL2.

      Reviewer #2 (Public Review):

      Lalun and co-authors investigate the signalling outputs triggered by the perception of IDA, a plant peptide regulating organ abscission. The authors observed that IDA perception leads to a transient influx of Ca2+, to the production of reactive oxygen species in the apoplast, and to an increased accumulation of transcripts which are also responsive to an immunogenic epitope of bacterial flagellin, flg22. The authors show that IDA is transcriptionally upregulated in response to several biotic and abiotic stimuli. Finally, based on the similarities in the molecular responses triggered by IDA and elicitors (such as flg22) the authors proposed that IDA has a dual function in modulating abscission and immunity. The manuscript is rather descriptive and provides little information regarding IDA signalling per se. A potential functional link between IDA signalling and immune signalling remains speculative.

      We thank the reviewer for the concerns raised and agree that further experiments including pathogen assays would strengthen the link between IDA signaling and immunity and plan for such experiments in future work.

      Reviewer #3 (Public Review):

      Previously, the essential role of the IDA peptide and HAESA receptor families has been shown in driving various cell separation processes such as abscission of flowers as a natural developmental process, of leaves as a defense mechanism when plants are under pathogenic attack or at the lateral root emergence and root tip cell sloughing. In this work, Olsson et al. show for the first time the possible role of IDA peptide in triggering plant innate immunity after the cell separation process occurred. Such an event has been previously proposed to take place in order to seal open remaining tissue after cell separation to avoid creating an entry point for opportunistic pathogens.

      The elegant experiments in this work demonstrate that IDA peptide is triggering the defense-associated marker genes together with immune specific responses including release of ROS and intracellular Ca2+. Thus, the work highlights an intriguing direct link between endogenous cell wall remodeling and plant immunity. Moreover, the upregulation of IDA in response to abiotic and especially biotic stimuli provides a valuable indication for potential involvement of HAE/IDA signalling in other processes than plant development.

      We are pleased that the reviewer finds our findings linking IDA to defense interesting and would like to thank the reviewer for this positive feedback.


      The various methods and different approaches chosen by the authors consolidate the additional new role for a hormone-peptide such as IDA. The involvement of IDA in triggering of the immunity complex process represents a further step in understanding what happens after cell separation occurs. The Ca2+ and ROS imaging and measurements together with using the hae hsl2 and hae hsl2 p35S::HAE-YFP genotypes provide a robust quantification of defense responses activation. While Ca2+ and ROS can be detected after applying the IDA treatment after the occurrence of cell separation it is adequately shown that the enzymes responsible for ROS production, RBOHD and RBOHF, are not implicated in the floral abscission.

      Furthermore, IDA production is triggered by biotic and abiotic factors such as flg22, a bacterial elicitor, fungi, mannitol or salt, while the mature IDA is activating the production of FRK1, MYB51 and PEP3, genes known for being part of plant defense process.

      Thank you.


      Even though a clear involvement of IDA in activating the after-cell separation immune system is shown, the use of the p35S:HAE-YFP line represents a weak point in the scientific demonstration. The mentioned line is driving the HAE receptor by a constitutive promoter, capable of loading the plant with HAE protein without discriminating on a specific tissue. Since it is known that the IDA family consists of more members distributed in various tissues, it is very difficult to fully differentiate the effects of HAE present ubiquitously.

      We agree with this statement. Nevertheless, it is important to note that the responses we have observed are not detectable in WT plants that do not (over)express the HAE receptors, suggesting that the ROS and callose deposition are induced by the addition of mIDA peptide and not the potential presence of the endogenous IDL peptides.

      The co-localization of HAE/HSL2 and FLS2 receptors is a valuable point to address since in the present work, the marker lines presented do not get activated in the same cell types of the root tissues which renders the idea of nanodomains co-localization (as hypothetically written in the discussion) rather unlikely.

      Thank you for raising an important aspect of our study. It is true that not all cells in the root which have promoter activity for FLS2 also exhibit promoter activity for either HAE or HSL2. However, we have observed that certain cells in the roots show promoter activity for both receptors. In the revised version of the manuscript, we have included plants expressing transcriptional reporters for both FLS2 and HAE or HSL2 using different fluorescent proteins. We have investigated overlapping promoter activity both at sites of lateral roots, in the tip of the primary root and in the abscission zone. Our results show overlapping expression of the transcriptional reporters in certain cells, indicating that FLS2 and HAE or HSL2 are likely to be found in some of the same cells during plant development. We also observe cells where only one or none of the promoters are active.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Supplementary Figure 3: re-labelling of y axis; 200 rather than 200,00 for clarity.

      This has been addressed.

      Supplementary Figure 2: It would be good to include the age of the seedlings used to study calcium influx in the legend.

      This has been addressed.

      Supplementary Figure 1: rephrase 'IDA induces ROS production in Arabidopsis'.

      This has been addressed.

      The use of chelating agents to establish the need of calcium from extracellular space is a clear experiment supporting the calcium response phenotype specific to IDA treatment in seedlings. Removing the last asparagine (N) and using it as a peptide that fails to elicit calcium response could simply be because the peptide is shorter or has different chemical properties. Therefore, a scrambled sequence would have been a better control.

      We thank the reviewer for the suggestion of using a scrambled peptide as a negative control; however, we find it unlikely that mIDA∆N69 could induce any activity based on previous work. Results from the crystal structure of mIDA bound to the HAE receptor and ligand-receptor interaction studies (10.7554/eLife.15075) show that the last asparagine in the mIDA peptide is essential for detectable binding to the HAE receptor and that a peptide lacking this amino acid does not have any activity. We will, however, in future experiments also include a scrambled version of the peptide as an additional control.

      Reviewer #2 (Recommendations For The Authors):

      Please find below specific comments:

      (1) Most of the molecular outputs triggered by IDA can be considered as common molecular marks of plant peptide signalling; they do not represent strong evidence of a potential function of IDA in modulating immunity. For instance, perception of CIF peptides, which control the establishment of the Casparian strips, regulates the production of reactive oxygen species, and the transcription of genes associated with immune responses (Fujita et al., The EMBO Journal 2020). It should also be considered that FRK1, whose function remains unknown, may be involved in both immunity and abscission and that the upregulation of FRK1 upon IDA treatment is not indicative of active modulation of immune signalling by IDA.

      This is a fair point raised by the reviewer and we now address in the manuscript that ROS and Ca2+ are hallmarks of both plant development and defense. The function of FRK1 is not known; however, it is unlikely that the upregulation of FRK1 in response to mIDA plays a role in the developmental progression of abscission, as it is not temporally regulated during the abscission process, making it an unlikely candidate in the regulation of cell separation (Cai & Lashbrook, 2008, https://doi.org/10.1104/pp.107.110908). We do, however, agree that further experiments including pathogen assays would strengthen the link between IDA signaling and immunity, and we plan such experiments in future work.

      (2) It remains unknown whether IDA modulates immunity. For instance, does IDA perception promote resistance to bacteria (bacterial proliferation, disease symptoms)? Is IDA genetically required for plant disease resistance? Is the IDA signalling pathway genetically required for transcriptional changes induced by flg22, such as the increase in FRK1 transcripts? In addition, the authors propose that the function of IDA in modulating immune signalling prevents bacterial infection in tissue exposed to stress(es). Does loss of function of IDA or of its corresponding receptors lead to changes in the ability of bacteria to colonise plant roots upon stress(es)?

      Please see the comment above regarding pathogen assays.

      (3) Several aspects of the work appear to correspond to preliminary investigation. For instance, the authors analyse loss-of-function mutants for genes encoding Ca2+-permeable channels (CNGCs) which are transcriptionally active during the onset of abscission (Sup. Figure 5). None of the single mutants presents an abscission defect. These observations provide no information regarding the identity of the channel(s) involved in IDA-induced calcium influx.

      We agree with the reviewer that we have not been able to identify the channels responsible for the IDA-induced calcium influx. Given the redundancy among many of the members of this multigenic family, a future approach to identify the proteins responsible for the IDA-triggered calcium response could be to create multiple KO mutants by CRISPR-Cas9.

      (4) Using H2DCF-DA, the authors observed a decrease in ROS accumulation in the abscission zone of rbohd/rbohf double KO line (Sup Figure 5c) but describe in the text that ROS production in this zone does not depend on RBOHD and RBOHF (L220). Please clarify.

      This has now been clarified in the text.

      (5) The authors describe that the rbohd/rbohf double KO presents a lower petal break-strength, which they describe as an indication of premature cell wall loosening, and that petals of rbohd/rbohf abscised one position earlier than in WT. Yet, the authors postulate that IDA-induced ROS production does not regulate abscission but may regulate additional responses. Instead, the data seem to indicate that ROS production by RBOHD and RBOHF regulates the timing of abscission. In addition, it would have been interesting to test whether the IDA signalling pathway regulates ROS production in the abscission zone.

      The rbohd and rbohf double mutants show several phenotypes associated with developmental stress; the mild phenotype observed with regard to premature abscission (by one position) could be caused by the phenotype of the double mutant rather than being related to ROS production. Indeed, it has been suggested that the lignified brace in the AZ, dependent on ROS production by the aforementioned RBOHs, is necessary for the correct concentration of cell modifying enzymes (Lee et al., 2018, https://doi.org/10.1016/j.cell.2018.03.060). The precocious abscission in this double mutant clearly shows this not to be the case. We have tried to do a ROS burst assay on AZ tissue/flowers with the mIDA peptide but have not been successful with this approach. A ROS sensor expressed in AZ tissue would be a valuable tool to address whether IDA signalling regulates ROS production in AZs.

      (6) In Sup. Figure 5a, it would be of interest to have a direct comparison of the transcript accumulation of the presented CNGCs and RBOHDs with that of other members of these multigenic families.

      The CNGC and RBOH gene expression profiles shown in the figure are those of the family members expressed during the developmental progress of floral abscission in stamen AZs. Since there is no difference in the temporal expression of the other family members (and most are either not expressed or very weakly expressed in this tissue), it is not possible to do this comparison (Cai & Lashbrook, 2008, https://doi.org/10.1104/pp.107.110908).

      (7) L251-253, since IDAdeltaN69 cannot be perceived by its receptors, the absence of induction of pIDA::GUS by IDAdeltaN69 compared to flg22 cannot be seen as a sign of specificity in the peptide-induced increase in IDA promoter activity.

      We have rephrased this in the text.

      (8) Please provide quantitative and statistical analysis of the calcium measurement presented in sup figure 3.

      This has been addressed.

      (9) L339-341; This sentence is unclear to me, please rephrase.

      We have rephrased this in the text.

      Reviewer #3 (Recommendations For The Authors):

      (1) In order to assess the role of CNGCs in abscission process, it would be more interesting to see the effect on the Ca2+ pattern and ROS signaling after application of mIDA on cngc and rbohf rbohd mutants.

      We agree with this statement; studies of mIDA-induced ROS and Ca2+ in these mutants will provide valuable information on the regulation of the response. We are in the process of making the lines needed to perform these experiments. However, since this requires crossing genetically encoded sensors into each mutant and generating higher-order mutants, it is a long process.

      (2) With regard to the ROS production (Sup Fig. 1), the application of mIDA can trigger ROS in p35S::HAE:YFP lines, but not in the wild-type plant, which is according to the text "most likely due to the absence of HAE expression" in leaves. The experiment on callose deposition is performed in wild-type cotyledons where no callose deposition could be observed after mIDA treatment (Fig. 4a,b). The conclusion from text is that IDA "is not involved in promoting deposition of callose as a long-term defence response". It appears more likely that neither ROS nor callose can be observed in wild-type plants due to the lack of HAE expression. Therefore, the callose experiment should include the p35S::HAE:YFP lines. The experiment as it is does not allow to draw any conclusion on HAE/IDA involvement in callose formation.

      We fully agree with this comment; thank you for pointing this out. We have now performed the callose experiment with the 35S:HAE lines. Please see our answer to reviewer #1.

      (3) Between Sup Fig. 3 and Sup Fig. 5, two different systems were used to assess the floral stage. Harmonising the floral stages would make it easier to relate them to the levels of HAE/HSL2 expression and hence potentially to the onset of cell-wall degradation.

      We now used the same system to assess floral stages throughout the whole manuscript.

      (4) For the Fig. 1 and 2, it will be helpful to mention the genotype used for imaging/quantification of Ca2+.

      This has been addressed.

      (5) Some of the abbreviations are not introduced as full-text at their first time use in the text, such as: mIDA (Line 68), Ef-Tu (line 85), NADPH (line 77).

      The abbreviations have now been introduced.

      (6) In the legend of Fig. 5 (lines 897 and 898)- in the figure description, the box plots are identified as light gray and dark gray, while in the panel a of the figure the box plots are colored in red and blue.

      Thank you for pointing this out, this has now been corrected.

      (7) In figure 1 and 2. the authors write that the number of replicates is 10 (n=10) but data represents a single analysis. Please provide the quantitative ROI analysis, demonstrating that the observed example is representative. This is particularly important since the authors claim very specific changes in pattern of Ca signaling between mIDA and FLG22 treatments (Line 148).

      (8) Figure 4: please use alternative scaling on the Y axis instead of breaks.

      This has now been fixed.

      (9) Figure 5: it is not clear what n=4 refers to when the authors state three independent replicates. In figure 6 they state 4 technical reps and 3 biological reps. Please ensure this is similar across all descriptions.

      We have now ensured the correct information in all descriptions.

    2. eLife assessment

      This manuscript presents valuable findings on the role of a plant peptide in coordinating developmental and immune response signaling. The evidence supporting the claims, while mainly descriptive and somewhat limited due to the main conclusions being drawn from overexpression lines, is mostly solid. The findings are interesting, they align with existing models, and they are of relevance to plant pathologists and developmental biologists.

    3. Reviewer #1 (Public Review):

      A descriptive manuscript investigating the ability of a peptide, implicated in development, to induce signalling responses indicative of immunity. The work clearly documents the ability of the synthetic peptide to induce these responses, and opens future work to link this back to physiology.

      Comments on revised version:

      Congratulations to the authors for the improvements to the manuscript.

      I still have reservations, as raised by other reviewers, about whether the outputs observed can definitively be classified as immune/defence outputs without assaying an impact upon microbial growth. Indeed, this is challenging to address as many of the outputs are shared by multiple pathways. This is especially the case here as the peptide could have different effects in different tissues or cells with different expression levels of the receptors (e.g. hypothetically - no expression = no effect; weak expression - cell wall loosening and susceptibility; high expression - strong response and 'defence' response). I do however appreciate that the authors have toned down some of the conclusions regarding the defence response and also they included further reference to outputs also being from developmental pathways.

    4. Reviewer #3 (Public Review):

      Previously, the essential role of the IDA peptide and HAESA receptor families has been shown in driving various cell separation processes, such as abscission of flowers as a natural developmental process, abscission of leaves as a defense mechanism when plants are under pathogenic attack, lateral root emergence and root tip cell sloughing. In this work, Olsson et al. show for the first time the possible role of the IDA peptide in triggering plant innate immunity after the cell separation process has occurred. Such an event has been previously proposed to take place in order to seal the tissue left open after cell separation and so avoid creating an entry point for opportunistic pathogens. The elegant experiments in this work demonstrate that the IDA peptide triggers defense-associated marker genes together with immune-specific responses, including release of ROS and intracellular Ca2+. Thus, the work highlights an intriguing direct link between endogenous cell wall remodeling and plant immunity. Moreover, the upregulation of IDA in response to abiotic and especially biotic stimuli provides a valuable indication of the potential involvement of HAE/IDA signalling in processes other than plant development.

      Comments on revised version:

      We thank the authors for addressing our previous comments. Overall, we are satisfied with the improvements and appreciate the hard work that has gone into this manuscript. We wish you all the best on the further publication pathway.

    1. Reviewer #2 (Public Review):

      The authors used a whole genome CRISPR screen to identify targetable synthetic lethalities associated with PPM1D mutations, known poor prognosis and currently undruggable factors in leukemia. The authors identified the cytosolic superoxide dismutase (SOD1, Cu/Zn SOD) as a major protective factor in PPM1D mutant vs. wt cells, and their study investigates associated mechanisms of this protection. Using both genetic depletion and small molecule inhibitors of SOD1, the authors conclude that SOD1 loss exacerbates mitochondrial dysfunction, ROS levels and DNA damage phenotypes in PPM1D mutant cells, decreasing cell growth in AML cells. The data strongly support that PPM1D mutant cells have high levels of total peroxides and elevated DNA breaks, and that genetic depletion of SOD1 decreases cell growth in two AML cell lines. However, the authors don't explain how superoxide radical (which is not damaging by itself) induces such damage, the on-target effects of the SOD1 inhibitors at the concentrations used are not clear, the increase in total hydroperoxides is not supported by loss of SOD1, the changes in mitochondrial function are small, and there is no assessment of how the expression or function of mitochondrial SOD2, which dismutates mitochondrial superoxide, is altered. Overall these studies do not distinguish between signal vs. damaging aspects of ROS in their models and do not rule out an alternate hypothesis that loss of SOD1 increases superoxide production by cytosolic NADPH activity which would significantly alter ROS-driven regulation of kinase/phosphatase signal modulation, affecting cell growth and proliferation as well as DNA repair. Additionally, with the exception of growth defects demonstrated with sgSOD1, the majority of data are acquired using two chemical inhibitors, LCS1 and ATN-224, without supporting evidence that these inhibitors are acting in an on-target manner.

      Overall, the authors address an important problem by seeking targetable vulnerabilities in PPM1D mutant AML cells, it is clear SOD1 deletion induces strong growth defects in the AML cell lines tested, most of the approaches are appropriate for the outcomes being evaluated, and the data are technically solid and well-presented. The major weakness lies in which redox pathways and ROS species are evaluated, how the resulting data are interpreted, and gaps in the follow-up experiments. Due to these omissions, as currently presented, the broader impact of these findings is unclear.

      These specific concerns are outlined in detail below and I offer some suggestions regarding how to clarify the mechanisms underlying their initial observation of SOD1 synthetic lethality:

      (1) Fig. 1 - SOD1 appears to be clustered with several other genes in the volcano plot (including FANC proteins). Did any other ROS-detoxifying enzymes show similar fitness scores? The effects of the SOD1 sgRNA are striking, however it would be useful to see qPCR or immunoblot data confirming robust depletion.

      Does SOD1 co-expression in PPM1D-mutant patient AML correspond to poorer disease outcomes? This can be evaluated in publicly available patient datasets and would support the idea of SOD1 synthetic lethality.

      It would also be useful to know (given the subsequent results) whether expression of SOD2, the mitochondrial superoxide dismutase, is altered in response to SOD1 loss.

      (2) Fig. 2 - What are the relative SOD1 levels in the mutant PPM1D vs. wt. cell lines? The effects of the chemical inhibitors are stronger in MOLM-13 than the other two lines. These data could also point to whether LCS-1 and ATN-224 cytotoxicity is on-target or off-target at these concentrations, which is a key issue not currently addressed in these studies. This is a particular concern as the OCI-AML2 line shows a stronger growth defect with CRISPR SOD1 KO (in Fig 1) but the smallest effects with these chemical inhibitors.

      While endogenous mitochondrial superoxide levels are elevated in PPM1D mutant lines, it is entirely unclear why SOD1 inhibition should affect mitochondrial superoxide as it detoxifies cytosolic superoxide. Also unclear why DCFDA signal (which measures total hydroperoxides) is *increased* under SOD1 inhibition - SOD1 dismutates superoxide radicals into hydrogen peroxide, therefore unless SOD2 is compensating for SOD1 loss, one might expect hydroperoxides to be lower (unless some entirely different oxidase is increasing their levels). None of these outcomes appear to be considered. Finally, it is not explained how lipid peroxidation, which requires production of hydroxyl or similarly high potency radicals, is being caused by increased superoxide or peroxides. One possibility is there is an increase in labile iron, in which case this phenotype would be rescued by the iron chelator desferal, and by the lipophilic antioxidant, ferrostatin.

      Do the sgSOD1 cells also show similar increases in MitoSox green, DCFDA and BODIPY signal? These experiments would clarify whether the effects with the inhibitors are directly related to SOD1 loss or if they represent off-target effects from the inhibitors and/or compensatory changes in SOD2.

      (3) Fig. 3 - the effects on mitochondrial respiratory parameters, while statistically significant, do not seem biologically striking. Also, these data are shown for OCI-AML2 cells which show the smallest cytotoxic effects with the SOD1 inhibitors among the 3 lines tested. They do however show the most robust growth defect with sgSOD1. This discrepancy could suggest that mitochondrial dysfunction does not underlie the observed growth defect and/or the inhibitor cytotoxicity is not on-target. Ideally mitochondrial profiling should also be carried out on this cell line with inducible SOD1 depletion. Have the authors assessed whether the mitochondrial Bcl family proteins are affected by the inhibitors?

      (4) Fig. 4 - Currently the data in this figure do not support the authors claim that PPM1D-mutant cells have impaired antioxidant defense mechanisms, leading to an elevation in ROS levels and reliance on SOD1 for protection. It should be noted that oxidative stress specifically refers to adverse cellular effects of increasing ROS, not baseline levels of various redox parameters. Ideally levels of GSSG/GSH would be a better measure of potential redox stress tolerance than the total antioxidant capacity assay. Finally, oxidative stress can be assessed by challenging the wt and mutant PPM1D cell lines with oxidant stressors such as paraquat which elevates superoxide or drugs like erastin which elevate mitochondrial ROS. The immunoblot shows negligible changes in the antioxidant proteins assayed. Again, this blot should include SOD2 which is the most relevant antioxidant in the context of mitochondrial superoxide.

      (5) Fig. 5 - These data support that DNA breaks are elevated in PPM1D mutant vs. wt cells. However, the data with the chemical SOD1 inhibitor again do not convince that the enhanced levels are due to on-target effects on SOD1. Use of the alkaline comet assay is appropriate for these studies and the 8-oxoguanine data do indicate contributions from oxidative DNA base damage. But these are unlikely to result directly from altered superoxide levels, as this species cannot directly oxidize DNA bases or cause DNA strand breaks.

      The following points summarize my specific experimental and textual recommendations:

      (1) These studies require an assessment of on-target efficacy of the inhibitors at the relevant concentration ranges. Ideally, they should have minimal effects against SOD1 knockout cell lines (acute challenge at a time point before the growth defects become apparent) and show better efficacy in SOD1-overexpressing lines. Key experiments (changes in superoxide, OCR profiling, DNA alkaline comet assay) would be more convincing if they are carried out with SOD1 knockout lines to compare against the inhibitor effects (3-4 days after introducing sgSOD1 when growth defects are not apparent).

      (2) Instead of using NAC, which elevates glutathione synthesis but also has several known side-effects, the authors may want to determine whether Tempol, a SOD mimetic can rescue the effects of SOD1 knockout or inhibition. This would directly prove that SOD1 functional loss underlies the observed growth defect and cytotoxicity from genetic SOD1 knockdown or chemical inhibition.

      (3) The complete lack of consideration of SOD2 in these studies is a missed opportunity as it reduces mitochondrial superoxide levels but elevates hydrogen peroxide levels. It would be very interesting to see whether SOD1 inhibition leads to compensatory increases in SOD2. SOD2 can be easily measured by immunoblot. Furthermore, measuring total superoxide via hydroethidium in a flow cytometric assay vs. mitochondrial ROS in PPM1D mut vs. wt cells and under SOD1 knockout would enable a determination of which species dominates (cytosolic or mitochondrial). These experiments are required to fill some logical gaps in interpretation of their redox data.

      (4) Given the DNA breaks observed in PPM1D mutant cells, it is highly recommended the authors assess whether iron levels are elevated in mut vs. wt cells and whether desferal can rescue observed SOD1 inhibition defects.

      (5) The authors may want to assess whether Rac1 or NADPH oxidase activity is altered in the SOD1 KO in wt vs. PPM1D cells. Their results may be the consequence of compromised ROS-driven survival signaling or DNA repair rather than direct ROS-induced damage, which is not caused directly by superoxide (or hydrogen peroxide).

      (6) It is recommended the discussion focus more strongly on how the signaling function of superoxide vs. its reactions with other molecular entities to induce genotoxic outcomes could be contributing to the observed phenotypes. The discussion of FANC proteins, which were targets with similar fitness scores but not experimentally investigated at all, is an unwarranted digression.

    2. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their insightful and constructive comments on our work, which have helped to strengthen the manuscript. In response to the additional suggestions provided by the reviewers, we have made revisions by adding or replacing five main figures and three supplementary figures, refining the text, and clarifying certain conclusions. Detailed responses to the reviewers’ points can be found below.

      Additional experiments, textual changes, or modulation of claims are needed to address weaknesses in the SOD1 portion of the study. Specifically:

      A) These studies require an assessment of the on-target efficacy of the inhibitors at the relevant concentration ranges. Ideally, they should have minimal effects against SOD1 knockout cell lines (an acute challenge at a time point before the growth defects become apparent) and show better efficacy in SOD1-overexpressing lines. Key experiments (changes in superoxide, OCR profiling, DNA alkaline comet assay) would be more convincing if they were carried out with SOD1 knockout lines to compare against the inhibitor effects (3-4 days after introducing sgSOD1 when growth defects are not apparent). In addition, SOD activity should be measured directly following inhibitor treatment.

      We agree with the reviewers that the distinction between on- and off-target effects of the pharmacologic SOD1 inhibitors is a critical point to address. We have validated that SOD activity is reduced following treatment with ATN-224 in Figure 2 – Figure supplement 1A.

      Nevertheless, we acknowledge that the potential for off-target effects of these inhibitors cannot be completely ruled out. To address this concern, we have incorporated a discussion regarding the potential off-target effects of both LCS-1 and ATN-224.

      B) Assays should be included to support that SOD1 activity is altered. ATN-224 and LCS-1 are used to inhibit SOD1 function in the majority of the experiments, which should be supported by SOD activity assays to confirm SOD inhibition. Further, the concentration of ATN-224 used in this paper (12.5 uM) is beyond the concentration of what has been reported to inhibit SOD1 function in human blood cells. In Figure 4D, the authors demonstrate comparable SOD1 total protein levels in WT and PPM1D-mutant cells. However, the authors should further address whether PPM1D-mutation alters SOD1 activity via SOD activity assays.

      We thank the reviewers for these suggestions. We have performed SOD activity assays which confirmed that SOD activity is inhibited upon treatment with ATN-224 at two concentrations (6.25 and 12.5 uM). Although we also performed these assays on LCS-1-treated cells, in our hands we did not see reduced SOD activity. However, LCS-1 has been shown to inhibit SOD activity in other publications including PMID: 21930909 and PMID: 32424294. From these assays, we have also found that PPM1D-mutant cells had increased SOD activity at baseline, despite having similar levels of SOD1 protein. These data have been added to Figure 2–Figure supplement 1A.

      C) Some conclusions are not fully supported by the data provided. The authors claimed that "upon inhibition of SOD1, there was an increase in ROS that was specific to the mutant cells" in Figure 2E. Comparison of ROS levels among untreated, ATN-224, and LCS-1 of PPM1D-mutant cells should have been made and the statistics analysis among these groups should have been provided. Moreover, in Figure 2-Figure Supplement 1E, LCS-1 treatment does not increase ROS levels in PPM1D mutant LCLs. Performing these experiments with control and SOD1 deletion cells would have strengthened the results. Along with this point, the authors should comment on why SOD2 is not identified as a top hit in the CRISPR screen, as SOD2 deletion accumulates superoxide in cells.

      After performing additional statistical analyses for Figure 2E, we found that the minor increase in ROS levels in the mutant cells after SOD1 inhibition was not statistically significant. We have revised the text accordingly.

      As for why SOD2 was not identified as a top hit, we postulate that this may be due to inherent dependency of the WT cell lines on SOD2.

      D) Fig. 1 - SOD1 appears to be clustered with several other genes in the volcano plot (including FANC proteins). Did any other ROS-detoxifying enzymes show similar fitness scores? The effects of the SOD1 sgRNA are striking, however, it would be useful to see qPCR or immunoblot data confirming robust depletion.

      Thank you for your suggestion. We have validated the loss of SOD1 protein expression after SOD1 sgRNA deletion by immunoblot and have added this data to Figure 1– figure supplement 1D. While other ROS-detoxifying enzymes were not significantly enriched in the top 37 hits, interestingly, the Fanconi Anemia pathway also has roles in counteracting oxidative stress. FA-deficient cells have mitochondrial dysfunction and redox imbalance, and several of the FA family proteins are implicated in mitophagy. Therefore, there may be an interesting interplay between SOD1 and the FA pathway that is worth highlighting in the discussion of our manuscript even though there was no experimental investigation performed.

      E) Fig. 2 - What are the relative SOD1 levels in the mutant PPM1D vs. WT. cell lines? The effects of the chemical inhibitors are stronger in MOLM-13 than in the other two lines. These data could also point to whether LCS-1 and ATN-224 cytotoxicity are on-target or off-target at these concentrations, which is a key issue not currently addressed in these studies. This is a particular concern as the OCI-AML2 line shows a stronger growth defect with CRISPR SOD1 KO (in Fig 1) but the smallest effects with these chemical inhibitors. The authors should also include SOD1 levels for Figure 1D and Figure 4-Figure supplement 1C.

      SOD1 protein expression is similar between WT and PPM1D-mutant cell lines and the loss of SOD1 after SOD1 sgRNA deletion was validated by immunoblot. These data have been added to Figure 1- figure supplement 1D and Figure 4D.

      F) Does SOD1 co-expression in PPM1D-mutant patient AML correspond to poorer disease outcomes? This can be evaluated in publicly available patient datasets and would support the idea of SOD1 synthetic lethality.

      Unfortunately, there are no publicly available patient datasets with sufficient cases of de novo PPM1D-mutant AML to assess this question.

      G) While endogenous mitochondrial superoxide levels are elevated in PPM1D mutant lines, it is entirely unclear why SOD1 inhibition should affect mitochondrial superoxide as it detoxifies cytosolic superoxide. Also unclear why the DCFDA signal (which measures total hydroperoxides) is increased under SOD1 inhibition - SOD1 dismutates superoxide radicals into hydrogen peroxide, therefore unless SOD2 is compensating for SOD1 loss, one might expect hydroperoxides to be lower (unless some entirely different oxidase is increasing their levels). None of these outcomes appear to be considered. Finally, it is not explained how lipid peroxidation, which requires the production of hydroxyl or similarly high-potency radicals, is being caused by increased superoxide or peroxides. One possibility is there is an increase in labile iron, in which case this phenotype would be rescued by the iron chelator desferal, and by the lipophilic antioxidant, ferrostatin.

      We measured intracellular labile iron levels by flow cytometry by staining the cells with FerroOrange at baseline and after SOD1 inhibition with our pharmacologic inhibitors (ATN-224 at 12.5 uM and LCS-1 at 1.25 uM). Across the three leukemia cell lines, we saw variable results in iron levels with no appreciable patterns (see below). Therefore, we cannot make conclusions about the contribution of labile iron to our observed phenotypes.

      Author response image 1.

      H) Do the sgSOD1 cells also show similar increases in MitoSox green, DCFDA, and BODIPY signal? These experiments would clarify whether the effects of the inhibitors are directly related to SOD1 loss or if they represent off-target effects from the inhibitors and/or compensatory changes in SOD2.

      We do not observe changes in SOD2 in the several contexts in which we have examined this. We cannot exclude off-target effects of the inhibitors, and have clarified this in the text.

      I) The authors may want to assess whether Rac1 or NADPH oxidase activity is altered in the SOD1 KO in WT vs. PPM1D cells. Their results may be the consequence of compromised ROS-driven survival signaling or DNA repair rather than direct ROS-induced damage, which is not caused directly by superoxide (or hydrogen peroxide).

      We appreciate the reviewer’s recommendations. However, due to time constraints, we regret not being able to assess Rac1 or NADPH oxidase activity. Nevertheless, we recognize the possibility of altered ROS-driven signaling rather than ROS-induced damage as a driver of our phenotype and have incorporated this possibility into our discussion.

      J) Fig. 3 - the effects on mitochondrial respiratory parameters, while statistically significant, do not seem biologically striking. Also, these data are shown for OCI-AML2 cells which show the smallest cytotoxic effects with the SOD1 inhibitors among the 3 lines tested. They do however show the most robust growth defect with sgSOD1. This discrepancy could suggest that mitochondrial dysfunction does not underlie the observed growth defect and/or the inhibitor cytotoxicity is not on-target. Ideally, mitochondrial profiling should also be carried out on this cell line with inducible SOD1 depletion. Have the authors assessed whether the mitochondrial Bcl family proteins are affected by the inhibitors?

      We assessed a few members of the mitochondrial Bcl-family proteins including MCL-1, BCL-2, and BCL-XL during the revision process. PPM1D-mutant cells have mildly increased expression of these anti-apoptotic proteins at baseline and the expression is not altered by pharmacologic SOD1 inhibition (see Author response image 2 below). Due to time constraints, we were unable to perform seahorse assays and mitochondrial profiling in the SOD1-deletion cells.

      Author response image 2.

      K) Fig. 4 - Currently the data in this figure do not support the authors' claim that PPM1D-mutant cells have impaired antioxidant defense mechanisms, leading to an elevation in ROS levels and reliance on SOD1 for protection. It should be noted that oxidative stress specifically refers to adverse cellular effects of increasing ROS, not baseline levels of various redox parameters. Ideally, levels of GSSG/GSH would be a better measure of potential redox stress tolerance than the total antioxidant capacity assay. Finally, oxidative stress can be assessed by challenging the wt and mutant PPM1D cell lines with oxidant stressors such as paraquat which elevates superoxide, or drugs like erastin which elevate mitochondrial ROS. The immunoblot shows negligible changes in the antioxidant proteins assayed. Again, this blot should include SOD2 which is the most relevant antioxidant in the context of mitochondrial superoxide.

      We measured intracellular glutathione levels by flow cytometry and found that PPM1D-mutant cells had a greater proportion of cells with low levels of GSH. These data have been added as Figure 4D. We have also repeated the western blot to look at the antioxidant proteins catalase, SOD1, and thioredoxin after SOD1 deletion and pharmacologic SOD1 inhibition. We evaluated SOD2 protein levels in these experiments, as suggested. Smooth muscle actin (SMA) is included in the antibody cocktail as a loading control. However, it is unclear to us why PPM1D-mutant cells consistently have significantly higher levels of SMA. Therefore, we included a separate loading control, vinculin. The repeated western blots showed a clearer difference between WT and PPM1D-mutant cells: PPM1D-mutant cells have decreased levels of catalase and thioredoxin. These blots also show that SOD2 levels may be mildly increased in the PPM1D-mutant cells at baseline but are not significantly upregulated upon SOD1 inhibition. We have replaced the original immunoblot from Figure 4D with the revised blots that more clearly demonstrate the reduced levels of catalase and thioredoxin, now Figure 4E.

      L) Fig. 5 - These data support that DNA breaks are elevated in PPM1D mutant vs. wt cells. However, the data with the chemical SOD1 inhibitor again do not convince us that the enhanced levels are due to on-target effects on SOD1. Use of the alkaline comet assay is appropriate for these studies and the 8-oxoguanine data do indicate contributions from oxidative DNA base damage. But these are unlikely to result directly from altered superoxide levels, as this species cannot directly oxidize DNA bases or cause DNA strand breaks.

      Thank you to the reviewers for raising this point. We have performed comet assays in SOD1-deletion cells to look at levels of DNA damage. Consistent with the reviewers’ point, we do not see a significant increase in DNA breaks after SOD1 deletion. We have removed the data using the SOD1 inhibitor and instead show the COMET analysis in the PPM1D-mut and SOD1-KO cells (see Figure 5F). We now make the point that increased DNA damage with SOD1 loss cannot explain the vulnerability of the double-mutant cells.

      M) Instead of using NAC, which elevates glutathione synthesis but also has several known side effects, the authors may want to determine whether Tempol, a SOD mimetic can rescue the effects of SOD1 knockout or inhibition. This would directly prove that SOD1 functional loss underlies the observed growth defect and cytotoxicity from genetic SOD1 knockdown or chemical inhibition.

      This is an excellent suggestion; we have added comments to this effect into the discussion.

      N) It is recommended the discussion focus more strongly on how the signaling function of superoxide vs. its reactions with other molecular entities to induce genotoxic outcomes could be contributing to the observed phenotypes. The discussion of FANC proteins, which were targets with similar fitness scores but not experimentally investigated at all, is an unwarranted digression.

      Thank you for this recommendation. We have expanded the discussion to focus more on the signaling functions of superoxide. However, considering the role of the Fanconi Anemia pathway in mitigating DNA damage and oxidative stress, we believe the discussion of the FANC proteins is important due to the possible intersection with SOD1. Therefore, we have refined this portion of the discussion to focus more on the interplay between SOD1 and FA.

      O) The complete lack of consideration of SOD2 in these studies is a missed opportunity as it reduces mitochondrial superoxide levels but elevates hydrogen peroxide levels. It would be very interesting to see whether SOD1 inhibition leads to compensatory increases in SOD2. SOD2 can be easily measured by immunoblot. Furthermore, measuring total superoxide via hydroethidium in a flow cytometric assay vs. mitochondrial ROS in PPM1D mut vs. wt cells and under SOD1 knockout would enable a determination of which species dominates (cytosolic or mitochondrial). These experiments are required to fill some logical gaps in the interpretation of their redox data.

      During the revision process, we have included SOD2 in our studies and have found that loss of SOD1 via genetic deletion and pharmacologic inhibition does not lead to compensatory increases in SOD2 (Figure 4D). Additionally, we have measured cytoplasmic superoxide levels using dihydroethidium to differentiate between cytoplasmic vs. mitochondrial superoxide. We found that at baseline levels, the mutant cells also harbored more cytoplasmic superoxide. We have added this figure as Figure 2C and moved the original mitochondrial superoxide data to Figure 2-figure supplement 1C.

      P) Given the DNA breaks observed in PPM1D mutant cells, it is highly recommended that the authors assess whether iron levels are elevated in mut vs. wt cells and whether desferal can rescue observed SOD1 inhibition defects. Also, it has been reported that PPM1D promotes homologous recombination by forming a stable complex with BRCA1-BARD1, thereby enhancing their recruitment to double-strand break sites. The authors should comment on why there is no difference in repair via HR in WT and PPM1D mutant cells in Figure 5C.

      Please see comment G regarding our findings about iron levels.

      The reviewers pose an interesting question as to why there is no difference in HR repair between WT and mutant cells, given the reported role of PPM1D in promoting HR. We have addressed this question in the main text. We believe that several factors can limit the extent of HR enhancement in PPM1D-mutant cells. For example, HR is typically confined to the S/G2 phase and thus may be constrained by cell cycling, among other regulatory mechanisms.

      Other comments:

      A) The authors described in the Method section that "The CRISPR Screen PPM1D mutant Cas9-expressing OCI-AML2 cell lines were transduced with lentivirus library supernatant." The authors need to provide information on whether the MOI of the CRISPR screen has been well controlled to ensure that the majority of the cell population has a single copy of sgRNA transduction.

      We performed a lentiviral titer curve prior to the screen to determine the volume of viral supernatant to add for a multiplicity of infection (MOI) of 0.3. This important detail has been added to our Methods.
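
      To illustrate why a low MOI addresses the single-copy concern, the following is a minimal sketch that assumes the number of integrations per cell follows a Poisson distribution with mean equal to the MOI. The Poisson model and the function name `single_copy_fraction` are illustrative assumptions and are not taken from the Methods.

```python
import math


def single_copy_fraction(moi: float) -> float:
    """Fraction of transduced cells carrying exactly one integration.

    Assumes integrations per cell are Poisson-distributed with mean `moi`
    (a modeling assumption, not a measured quantity).
    """
    if moi <= 0:
        raise ValueError("MOI must be positive")
    p_zero = math.exp(-moi)        # cells with no integration
    p_one = moi * math.exp(-moi)   # cells with exactly one integration
    # Condition on the cell having received at least one integration.
    return p_one / (1.0 - p_zero)


if __name__ == "__main__":
    print(f"{single_copy_fraction(0.3):.3f}")
```

      Under this assumption, roughly 86% of transduced cells at an MOI of 0.3 would carry a single sgRNA, and the fraction falls as the MOI rises.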

      B) The study convincingly shows differences between parental leukemic cells and the PPM1D mutants but one important control is missing in experiments related to Fig. 2 and 3. All PPM1D mutant clones used in this study were subjected to the blasticidin selection of the transduced cells to generate cells stably expressing Cas9 and subsequently, the clones with successful PPM1D targeting were expanded. The authors should demonstrate that increased ROS production is not just a consequence of the lentiviral transduction and antibiotic selection and that it corresponds to increased PPM1D activity in PPM1D mutant cells. To do that, authors could compare PPM1D clones to parental cells that underwent the same selection procedure (OCI-AML2-Cas9 cells and OCI-AML3-Cas9 cells).

      It is true that the parental OCI-AML2 and OCI-AML3 cell lines underwent four days of blasticidin selection to create the stably expressing Cas9 cell lines. However, after the four-day period, the blasticidin was removed from the cell culture media. From there, we introduced the PPM1D mutations into the Cas9-expressing “WT” cell lines using the RNP-based CRISPR/Cas9 delivery method, and single cells were then sorted into 96-well plates. Clones were expanded and validated using Sanger sequencing, TIDE analysis, and western blot. In all of our assays, we compare the WT Cas9 cells to the PPM1D-mutant Cas9 cells. Additionally, the cells have been expanded and passaged several times after blasticidin selection. Therefore, we believe it is unlikely that there are residual ROS-inducing effects from the antibiotic treatment.

      C) The authors mention that they identified 3530 genes differentially expressed in parental and PPM1D mutant cells (line 267) but it is unclear what was the threshold for statistical significance. They mention FDR<0.05 in the Methods but show GSEA analysis with FDR<0.25 in Figure 4A. Source data for Fig. 4 is missing and the list of differentially expressed genes is not shown.

      The source data files for Figures 1 and 4 will be uploaded with the revised manuscript. Upon reviewing the source data, we noticed an error in the number of differentially expressed genes. We have corrected this in line 274 and you will see that this correlates with Figure 4-source data 1. For the thresholds, we used an FDR<0.05 for the differential gene expression analysis, and an FDR <0.25 in the GSEA, which is an appropriate threshold for GSEA. We have clarified these thresholds in the methods section.
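
      As an illustration of how the two FDR cutoffs act on the same kind of quantity, the sketch below computes Benjamini-Hochberg adjusted p-values and applies both thresholds. This assumes a Benjamini-Hochberg-style adjustment; the response does not state which procedure was used, GSEA computes its FDR differently (by permutation), and the p-values shown are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.20]  # hypothetical raw p-values
    q = benjamini_hochberg(example)
    print([round(x, 4) for x in q])
    print("pass FDR<0.05:", [x < 0.05 for x in q])
    print("pass FDR<0.25:", [x < 0.25 for x in q])
```

      With these made-up inputs, only the first test passes the stricter 0.05 cutoff, while all four pass the more permissive 0.25 cutoff conventionally used for exploratory gene-set analysis.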

      D) Include a definition of MFI in Figure legend Fig.2 and also in the Methods section. The unit should be indicated at both the x and y axes.

      We have defined MFI in the figure legends and methods sections and have updated the figures accordingly.

      E) Legend to Figure 2 - Figure Supplement 1 E should define the grey and pink columns (likely WT and mutants LCLs).

      Thank you. We have defined the grey and pink columns as WT and PPM1D-mutant cell lines, respectively, for Figure 2 – Figure supplement 2D and E.

      F) Reporter assays in Fig. 5 convincingly show that NHEJ capacity is reduced in PPM1D mut cells. In the text, the authors state that this might reflect the impact of PPM1D on LSD1 (line 365). Although this might be the case, other options are equally possible. It would be appropriate to include a reference to the ability of PPM1D to counteract gH2AX and ATM which generate the most upstream signals in DDR.

      Thank you to the reviewers for raising this excellent point. We have revised the text to discuss how the effects of PPM1D on γH2AX and ATM may influence NHEJ.

      G) The authors correctly state that truncation of PPM1D leads to protein stabilization (line 85) and that it is present in U2OS cells (line 355). These observations were first reported by Kleiblova et al., 2013, and therefore one reviewer believes that this reference should be included. This study also identified a truncating PPM1D mutation in colon adenocarcinoma HCT116 cells, and the role of PPM1D mutation in promoting the growth of colon cancer has subsequently been tested in an animal model (Burocziova et al., 2019).

      Thank you. We have added this reference to our text in line 360.

    1. eLife assessment

      This useful study aims to quantify associations between regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - with several respiratory outcomes (6 different outcomes) up to several years later in time. Weaknesses were identified in the design of the study, such as the measurement of the primary outcome and also the potential of bias which is inherent to the study design, which means the manuscript provides incomplete evidence.

    2. Author Response

      Reviewer #1 (Public Review):


      The current study aims to quantify associations between the regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - with several respiratory outcomes up to several years later in time. There are 6 respiratory outcomes included: risk of influenza, pneumonia, COVID-19, other respiratory tract infections, as well as COVID-19 severity and mortality.


      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication.

      Thank you for pointing out the strengths of our article. We also sincerely thank the reviewer for raising several concerns and providing significant suggestions to improve our manuscript. We will revise our manuscript according to our provisional responses.


      (1) The main exposure of interest seems to be only measured at one time-point in time (at study enrollment) while patients are considered many years at risk afterwards without knowing their exposure status at the time of experiencing the outcome. As indicated by the authors, PPI are sometimes used for only short amounts of time. It seems biologically implausible that an infection was caused by using PPI for a few weeks many years ago.

      We agree with the reviewer, and this is one of the limitations of the UK Biobank data. We might identify potential long-term PPI users by defining the users that have certain indications, since they tend to regularly take PPI for a long period rather than only short amounts of time. We will evaluate the effect modification for the subgroup of potential long-term PPI users.

      (2) Previous studies have shown that by focusing on prevalent users of drugs, one often induces several biases such as collider stratification bias, selection bias through depletion of susceptible, etc.

      Due to the limitations of the data from the UK Biobank, including the lack of information on the initiation of medications and close follow-up, we can only use a prevalent-user design to evaluate the associations between PPI use and respiratory outcomes. We will discuss this further in the limitations section.

      (3) It seems Kaplan Meier curves are not adjusted for confounding through e.g. inverse probability weighting. As such the KM curves are currently not informative (or the authors need to make clearer that curves are actually adjusted for measured confounding).

      We will provide Kaplan Meier curves adjusted for confounding by inverse probability weighting according to the reviewer’s suggestion.
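
      The adjustment described above can be sketched as a weighted Kaplan-Meier estimator, in which each participant contributes an inverse-probability-of-treatment weight instead of a count of one. This is a minimal illustration, assuming propensity scores have already been estimated from the measured confounders; the function names and toy inputs are hypothetical and do not reflect the analysis code used in the study.

```python
def ipw_weights(treated, propensity):
    """Inverse probability of treatment weights.

    Exposed participants get 1/ps, unexposed get 1/(1 - ps), where ps is the
    estimated probability of exposure given measured confounders.
    """
    return [1.0 / ps if a else 1.0 / (1.0 - ps)
            for a, ps in zip(treated, propensity)]


def weighted_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier estimate for one exposure group.

    Returns (time, survival) pairs at each distinct event time. With all
    weights equal to 1 this reduces to the ordinary Kaplan-Meier estimate.
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Weighted number still at risk just before time t.
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        # Weighted number of events occurring exactly at time t.
        failed = sum(w for ti, e, w in zip(times, events, weights)
                     if e and ti == t)
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    w = ipw_weights([1, 1, 1, 1], [0.5, 0.25, 0.5, 0.8])
    print(weighted_kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1], w))
```

      In practice one curve would be computed per exposure group, each using the weights of the participants in that group, so that the curves are compared in a pseudo-population balanced on the measured confounders.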

      (4) Throughout the manuscript the authors seem to misuse the term multivariate (using one model with e.g. correlated error terms to assess multiple outcomes at once) when they seem to mean multivariable.

      We will correct the misused terms throughout the manuscript according to the reviewer’s suggestions.

      (5) Given multiple outcomes are assessed there is a clear argument for accounting for multiple testing, which following the logic of the authors used in terms of claiming there is no association when results are not significant may change their conclusions. More high-level, the authors should avoid the pitfall of stating there is evidence of absence if there is only an absence of evidence in a better way (no statistically significant association doesn't mean no relationship exists).

      We will revise our interpretation of the results, especially for those without statistically significant associations based on the reviewer’s advice.

      (6) While the authors claim that the quantitative bias analysis does show results are robust to unmeasured confounding, I would disagree with this. The e-values are around 2 and it is clearly not implausible that there are one or more unmeasured risk factors that together or alone would have such an effect size. Furthermore, if one would use the same (significance) criteria as used by the authors for determining whether an association exists, the required effect size for an unmeasured confounder to render effects 'statistically non-significant' would be even smaller.

      We agree with the reviewer that there might still exist one or more unmeasured risk factors that have effect sizes larger than 2. Therefore, we could not state that the results are robust to unmeasured confounding based on the current analysis, and this would be a limitation of our study. We will add the above information to the discussion section.
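
      For readers unfamiliar with the E-value, the sketch below shows the standard formula for a risk ratio, E = RR + sqrt(RR × (RR − 1)), with ratios below 1 inverted first. Applying it to a hazard ratio treats the hazard ratio as an approximate risk ratio, which is an assumption that holds best for rare outcomes; the input values are illustrative and are not estimates from the manuscript.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio (VanderWeele and Ding formula).

    Ratios below 1 are inverted first so protective and harmful
    associations are treated symmetrically.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    # Illustrative input only: a ratio of 4/3 gives an E-value of 2.
    print(round(e_value(4 / 3), 3))
```

      An E-value of about 2 therefore corresponds to an observed ratio of only about 1.33. Applying the same function to the confidence limit nearest the null gives the smaller confounder strength needed to make the association non-significant, which is the reviewer's second point.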

      (7) Some patients are excluded due to the absence of follow-up, but it is unclear how that is determined. Is there potentially some selection bias underlying this where those who are less healthy stop participating in the UK biobank?

      We will provide the details for the determination of absence of follow-up in the UK Biobank and illustrate whether it potentially induced selection bias.

      (8) Given that the exposure is based on self-report how certain can we be that patients e.g. do know that their branded over-the-counter drugs are PPI (e.g. guardium tablets)? Some discussion around this potential issue is lacking.

      In the data collection of the UK Biobank, the participants can enter the generic or trade name of the treatment on the touchscreen to match the medications they used. We will discuss this important issue in the discussion section.

      (9) Details about the deprivation index are needed in the main text as this is a UK-specific variable that will be unfamiliar to most readers.

      We will provide details about the deprivation index in the manuscript.

      (10) It is unclear how variables were coded/incorporated from the main text. More details are required, e.g. was age included as a continuous variable and if so was non-linearity considered and how?

      Age was included as a continuous variable. We will provide information on whether non-linearity was considered in our manuscript.

      (11) The authors state that Schoenfeld residuals were tested, but don't report the test statistics. Could they please provide these, e.g. it would already be informative if they report that all p-values are above a certain value.

      We will provide the test statistics for the Schoenfeld residuals.

      (12) The authors would ideally extend their discussion around unmeasured confounding, e.g. using the DAGs provided in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832226/, in particular (but not limited to) around severity and not just presence/absence of comorbidities.

      We will use the DAGs provided by the article (PMC7832226) to extend our discussion around unmeasured confounding, especially the severity of comorbidities.

      (13) The UK biobank is known to be highly selected for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. The potential problems this might create in terms of collider stratification bias - as highlighted here for example: https://www.nature.com/articles/s41467-020-19478-2 - should be discussed in greater detail and also appreciated more when providing conclusions.

      We agree with the reviewer that the highly selective nature of the UK Biobank might create collider stratification bias for the evaluation of COVID-19-related outcomes. We will further discuss this in detail and be cautious when generating conclusions.  

      Reviewer #2 (Public Review):


      Zeng et al investigate in an observational population-based cohort study whether the use of proton pump inhibitors (PPIs) is associated with an increased risk of several respiratory infections among which are influenza, pneumonia, and COVID-19. They conclude that compared to non-users, people regularly taking PPIs have increased susceptibility to influenza, pneumonia, as well as COVID-19 severity and mortality. By performing several different statistical analyses, they try to reduce bias as much as possible, to end up with robust estimates of the association.


      The study comprehensively adjusts for a variety of critical covariates and by using different statistical analyses, including propensity-score-matched analyses and quantitative bias analysis, the estimates of the associations can be considered robust.

      We thank the reviewer for highlighting the strengths of our article. We will further revise our manuscript according to the reviewer’s suggestions.


      As it is an observational cohort study there still might be bias. Information on the dose or duration of acid suppressant use was not available, but might be of influence on the results. The outcome of interest was obtained from primary care data, suggesting that only infections as diagnosed by a physician are taken into account. Due to the self-limiting nature of the outcome, differences in health-seeking behavior might affect the results.

      We will try to adjust or provide discussions about the above factors, including the dose/duration of PPI use, outcome assessment, and health-seeking behavior.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents valuable findings on Legionella pneumophila effector proteins that target host vesicle trafficking GTPases during infection and more specifically modulate ubiquitination of the host GTPase Rab10. The evidence supporting the claims of the authors is solid, although it remains unclear how modification of the GTPase Rab10 with ubiquitin supports Legionella virulence and the impact of ubiquitination during LCV formation. The work will be of interest to colleagues studying animal pathogens as well as cell biologists in general.

      We greatly appreciate the positive and valuable feedback from the editors and the reviewers. Following their suggestions, we added substantial new experimental data and discussed the implications of our findings for Legionella virulence in terms of the biology of its replication niche. Please find our point-by-point responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, Kubori and colleagues characterized the manipulation of the host cell GTPase Rab10 by several Legionella effector proteins, specifically members of the SidE and SidC family. They show that Rab10 undergoes both conventional ubiquitination and noncanonical phosphoribose-ubiquitination, and that this posttranslational modification contributes to the retention of Rab10 around Legionella vacuoles.


      Legionella is an emerging pathogen of increasing importance, and dissecting its virulence mechanisms allows us to better prevent and treat infections with this organism. How Legionella and related pathogens exploit the function of host cell vesicle transport GTPases of the Rab family is a topic of great interest to the microbial pathogenesis field. This manuscript investigates the molecular processes underlying Rab10 GTPase manipulation by several Legionella effector proteins, most notably members of the SidE and SidC families. The finding that MavC conjugates ubiquitin to SdcB to regulate its function is novel, and sheds further light on the complex network of ubiquitin-related effectors from Lp. The manuscript is well written, and the experiments were performed carefully and examined meticulously.


      Unfortunately, in its current form this manuscript offers only little additional insight into the role of effector-mediated ubiquitination during Lp infection beyond what has already been published. The enzymatic activities of the SidC and SidE family members were already known prior to this study, as was the importance of Rab10 for optimal Lp virulence. Likewise, it had previously been shown that SidE and SidC family members ubiquitinate various host Rab GTPases, like Rab33 and Rab1. The main contribution of this study is to show that Rab10 is also a substrate of the SidE and SidC family of effectors. What remains unclear is if Rab10 is indeed the main biological target of SdcB (not just 'a' target), and how exactly Rab10 modification with ubiquitin benefits Lp infection.

      Reviewer #1 (Recommendations for The Authors):

      Major points of concern

      (1) The authors show that SdcB increases Rab10 levels on LCVs at later times of infection and conclude that this is its main biological role. An alternative explanation may be that Rab10 is not 'the main' target of SdcB but merely 'a' target, which may explain why the effect of SdcB on Rab10 accumulation on LCV is only detectable after several hours of infection. An unbiased omics-based approach to identify the actual host target(s) of SdcB may be needed to confirm that Rab10 modification by SdcB is biologically relevant.

      We totally agree with your comment that SdcB should have multiple targets considering the abundance of ubiquitin observed on the LCVs when SdcB was expressed (Figure 3). However, the effect of SdcB on Rab10 accumulation at the later time point (7 h) (current Figure 4e) was well supported by the new data showing that the SdcB-mediated ubiquitin conjugation to Rab10 was highly detected at this time point (new Figure 4c). We have attempted a comprehensive search for interaction partners of the ANK domain of SdcB. This analysis is planned to be included in our ongoing study. We therefore decided not to add the data to this manuscript.

      (2) The authors show that Rab10 within cell lysate is ubiquitinated and conclude that ubiquitination of Rab10 is directly responsible for its retention on the LCV. What is the underlying molecular mechanism for this retention? Are GAP proteins prevented from binding and deactivating Rab10? This may be worth testing.

      It would be a fantastic hypothesis that a Rab10GAP is involved in the regulation of Rab10 localization on the LCV. However, as far as we know, GAP proteins against Rab10 have not been identified yet. This will be an important issue to address once a Rab10GAP is identified.

      (3) Related to this, an alternative explanation would be that Rab10 retention is an indirect effect where inactivators of Rab10, such as host cell GAP proteins, are the main target of SidE/C family members and sent for degradation (see point #1). Can the authors show that Rab10 on the LCV is indeed ubiquitinated?

      The possible involvement of a putative Rab10GAP is currently untestable as it is not known. To address whether Rab10 located on the LCV is ubiquitinated or not, we conducted the critical experiments using active Rab10 (QL) and inactive Rab10 (TN) (new Figure 4a, new Figure 4-figure supplement 1). As revealed for Rab1 (Murata et al., Nature Cell Biol. 2006; Ingmundson et al., Nature 2007), Rab10 is expected to be recruited to the LCV as a GDP-bound inactive form and converted to a GTP-bound active form on the LCV. The new results clearly demonstrated that GTP-locked Rab10QL is preferentially ubiquitinated upon infection, strongly supporting the model that Rab10 is ubiquitinated “on the LCV” by the SidE and SidC family ligases.

      (4) Also, on what residue(s) is Rab10 ubiquitinated? Jeng et al. (Cell Host Microbe, 2019, 26(4): 551-563) suggested that K102, K136, and K154 of Rab10 are modified during Lp infection. How does substituting those residues affect the residency of Rab10 on LCVs? Addressing these questions may ultimately help to uncover if the growth defect of a sidE gene cluster deletion strain is due to its inability to ubiquitinate and retain Rab10 on the LCV.

      Thank you for the suggestion. We conducted mutagenesis of the three Lys residues of Rab10 and subjected the resulting derivative to ubiquitination analysis (new Figure 1-figure supplement 1). Substitution of the Lys residues with Ala did not abrogate the ubiquitination upon Lp infection. This result indicates that ubiquitination sites are present at other residue(s), including the PR-ubiquitination site(s), raising the possibility that disruption of sidE genes would be detrimental for intracellular growth of L. pneumophila because of failure of Rab10 retention.

      (5) The authors proposed that "the SidE family primarily contributes towards ubiquitination of Rab10". In this case, what is the significance of SdcB-mediated ubiquitination of Rab10 during Lp infection?

      We found that the major contribution of SdcB is retention of Rab10 until the late stage of infection. This claim was supported by our new data (new Figure 4c) as mentioned above (response to comment #1).

      (6) The contribution of SdcB to ubiquitination of Rab10 relative to SidC and SdcA is unclear. SidC is shown to be unaffected by MavC. In this case, SidC can ubiquitinate Rab10 regardless of the regulatory mechanism of SdcB by MavC. This is not further being examined or discussed in the manuscript.

      The effect of intrinsic MavC is apparent at the later stage (9 h) of infection (Figure 7c) when SdcB gains its activity (see above). We therefore do not think that any effect of MavC on the SidC/SdcA activities, which are effective in the early stage, would impact Rab10 localization. However, without specific experiments addressing this issue, possible MavC effects on SidC/SdcA are beyond the scope of this manuscript.

      (7) When is Rab10 required during Lp infection? The authors showed that Rab10 levels at LCV are rather stable from 1hr to 7hr post infection. If MavC regulates the activity of SdcB, when does this occur?

      While the Rab10 levels on the LCV (~40%) are stable during 1-7 h post infection (Figure 2b), they decreased to ~20% at 9 h after infection (Figure 7c) (the description was added in lines 304-306). Rab10 seems to be required for optimal LCV biogenesis over the early to late stages, but may not be required at the maturation stage (9 h). We validated the effect of MavC on the Rab10 localization at this time point (Figure 7c). These observations allowed us to build the scheme described in Figure 7d. We revised the illustration in new Figure 7d according to the helpful suggestions from both the reviewers.

      (8) Previous analyses by MS showed that ubiquitination of Rab10 in Lp-infected cells decreases over time (from 1 hpi to 8 hpi - Cell Host Microbe, 2019, 26(4): 551-563). How does this align with the findings made here that Rab10 levels on the LCV and likely its ubiquitination levels increase over time?

      We carefully compared the Rab10 ubiquitination at 1 h and 7 h after infection (new Figure 1-figure supplement 1b). This analysis showed that the level of its ubiquitination decreased over time, in agreement with the previous report. Nevertheless, Rab10 was still significantly ubiquitinated at 7 h, which we believe causes the sustained retention of Rab10 on the LCV at this time point. We added the observation in lines 146-148.

      (9) Polyubiquitination of Rab10 was not detected in cells ectopically producing SdcB and SdeA lacking its DUB domain (Figure 7 - figure supplement 2). Does SdcB actually ubiquitinate Rab10 (see also point #5)? Along the same line, it is curious to find that the ubiquitination pattern of Rab10 is not different for LpΔsidC/ΔsdcA compared to LpΔsidC/ΔsdcA/ΔsdcB (Figure 1C). The actual contribution of SdcB to ubiquitinating Rab10 compared to SidC/SdcA thus needs to be clarified.

      Thank you for the important point. We currently hypothesize that SidC/SdcA/SdcB-mediated ubiquitin conjugation can occur only in the presence of PR-ubiquitin on Rab10 (either directly on the PR-ubiquitin or on other residue(s) of Rab10). Failure to detect the polyubiquitination in the transfection condition (Figure 7-figure supplement 2) suggests that this specific ubiquitin conjugation can occur only under a restricted condition, i.e. only “on the LCV”. We added this description in the discussion section (lines 334-335). The lack of difference between the ΔsidCΔsdcA and ΔsidCΔsdcAΔsdcB strains (Figure 1C, 1 h infection) can be explained by the result that SdcB gains activity at the later stages (see above).

      Minor comments

      In Figures 4b and 7b, the authors show a quantification of "Rab10-positive LCVs/SdcB-positive LCVs". Why this distinction? It raises the question of what the percentage of Rab10-positive/SdcB-negative LCVs might be.

      We chose this method of quantification because we wanted only to see the effect of SdcB on the Rab10 localization. To distinguish between SdcB-positive and SdcB-negative LCVs, we would need to rely on the blue DAPI signal to visualize internal bacteria, which we considered technically difficult in this specific analysis.

      The band of FLAG-tagged SdcB was not detected by immunoblot using anti-FLAG antibody (Figure 5). The authors hypothesized that "disappearance of the SdcB band can be caused by auto-ubiquitination, as SdcB has an ability to catalyze auto-ubiquitination with a diverse repertoire of E2 enzymes". This can be easily confirmed by using MG-132 to inhibit proteasomal degradation of polyubiquitinated substrates.

      We conducted the experiment using MG-132 as suggested and found that proteasomal degradation is not the cause of the disappearance of the band (new Figure 5-figure supplement 2, added description in lines 228-233). SdcB is actually not degraded. Instead, its polyubiquitination causes its apparent loss by spreading the SdcB signal over multiple bands in the gel.

      In Figure 5F, the authors mentioned that "HA-UbAA did not conjugate to SdcB", whereas "shifted band detected by FLAG probing plausibly represents conjugation of cellular intrinsic Ub". The same argument was made in Figure 6B. These claims should be confirmed by immunoblot using anti-Ub antibody.

      Thank you. We added the data using anti-Ub antibody (P4D1) (Figure 6f, new third panel).

      Figure 7A: In cell producing MavC, SdcB is clearly present on LCV. However, in Figure 5A, SdcB was not detected by immunoblot in cells ectopically expressing MavC-C74A. What is the interpretation for these results?

      SdcB was not degraded in the cells; rather, its apparent molecular weight shifted owing to polyubiquitination (see above). The detection of SdcB in the IF images (Figure 7a) supports this claim.

      Reviewer #2 (Public Review):

      This manuscript explores the interplay between Legionella Dot/Icm effectors that modulate ubiquitination of the host GTPase Rab10. Rab10 undergoes phosphoribosyl-ubiquitination (PR-Ub) by the SidE family of effectors which is required for its recruitment to the Legionella containing vacuole (LCV). Through a series of elegant experiments using effector gene knockouts, co-transfection studies and careful biochemistry, Kubori et al further demonstrate that:

      (1) The SidC family member SdcB contributes to the polyubiquitination (poly-Ub) of Rab10 and its retention at the LCV membrane.

      (2) The transglutaminase effector, MavC acts as an inhibitor of SdcB by crosslinking ubiquitin at Gln41 to lysine residues in SdcB.

      Some further comments and questions are provided below.

      (1) From the data in Figure 1, it appears that the PR-Ub of Rab10 precedes and in fact is a prerequisite for poly-Ub of Rab10. The authors imply this, but there is no explicit statement. Isn't this the case?

      Yes, we think that it is the case. We revised the description in the text accordingly (lines 326-327).

      (2) The complex interplay of Legionella effectors and their meta-effectors targeting a single host protein (as shown previously for Rab1) suggests the timing and duration of Rab10 activity on the LCV is tightly regulated. How does the association of Rab10 with the LCV early during infection and then its loss from the LCV at later time points impact LCV biogenesis or stability? This could be clearer in the manuscript and the summary figure does not illustrate this aspect.

      Thank you for pointing out this important issue. Association of Rab10 with the LCV is thought to be beneficial for L. pneumophila, as Rab10 has been identified as a factor that supports bacterial growth in cells (Jeng et al., 2019). We speculate that its loss from the LCV at the later stage of infection would also be beneficial, since the LCV may need to move on to the maturation stage, in which a different membrane-fusion process may proceed. As this is too speculative, we made only a minor modification to the discussion section (lines 356-358). We also modified the summary figure (revised Figure 7d) to illustrate the time course.

      (3) How do the activities of the SidE and SidC effectors influence the amount of active Rab10 on the LCV (not just its localisation and ubiquitination)?

      We agree that it is an important point. We tested the active Rab10 (QL) and inactive Rab10 (TN) for their ubiquitination and LCV-localization profiles (new Figure 4a, b; new Figure 4-figure supplements 1 and 2). These analyses led us to the unexpected finding that the active form of Rab10 is the preferential target of the effector-mediated manipulation. See also our response to Reviewer 1’s comment #3. Thank you very much for your insightful suggestion.

      (4) What is the fate of PR-Ub and then poly-Ub Rab10? How does poly-Ub of Rab10 result in its persistence at the LCV membrane rather than its degradation by the proteasome?

      We have not revealed the molecular mechanism in this study. We believe that it is an important question to be solved in the future. We added a sentence to this effect in the discussion section (lines 376-378).

      (5) Mutation of Lys518, the amino acid in SdcB identified by mass spec as modified by MavC, did not abrogate SdcB Ub-crosslinking, which leaves open the question of how MavC does inhibit SdcB. Is there any evidence of MavC mediated modification to the active site of SdcB?

      The active site of SdcB (C57) is required for the modification (Figure 5b), but it is not likely to be the target residue, as the MavC transglutaminase activity restricts the target residues to Lys. It would be expected that multiple Lys residues on SdcB can be modified by MavC to disturb the catalytic activity.

      (6) I found it difficult to understand the role of the ubiquitin glycine residues and the transglutaminase activity of MavC on the inhibition of SdcB function. Is structural modelling using AlphaFold, for example, helpful to explain this?

      We conducted an AlphaFold analysis of SdcB-Ub. Unfortunately, when the glycine residues of Ub were placed in the catalytic pocket of SdcB, Q41 of Ub did not fit the expected position on SdcB (K518). The formation of the ternary complex (MavC-Ub-SdcB) probably changes the overall conformation of its components. A crystal structure analysis or more detailed molecular modeling would be required to resolve the issue.

      (7) Are the Lys mutants of SdcB still active in poly-Ub of Rab10?

      We performed the experiment and found that the K518R K891R mutant of SdcB retains E3 ligase activity at a level similar to that of the wild type upon infection (new Figure 6-figure supplement 2) (lines 283-284). The level was actually slightly higher than that of the wild type. This result may suggest that blocking the modification sites can rescue SdcB from MavC-mediated downregulation.

      Reviewer #2 (Recommendations For The Authors):

      see above

    2. eLife assessment

      This important study explores the interplay between Legionella Dot/Icm effectors that modulate ubiquitination of the host GTPase Rab10, which undergoes phosphoribosyl-ubiquitination by the SidE family of effectors, which in turn are required for Rab10 recruitment to the Legionella containing vacuole (LCV). The evidence supporting the claims of the authors is convincing. The study is not only relevant for the microbiology community, but will also be of interest to colleagues in the broader fields of membrane trafficking and general cell biology.

    3. Reviewer #1 (Public Review):

      This study presents valuable data on effector proteins (=virulence factors) used by the bacterial pathogen Legionella pneumophila that target host vesicle trafficking GTPases during infection. The evidence supporting the claims of the authors is robust, and the data suggest a sophisticated interplay between multiple effectors with the goal of temporarily exploiting host cell Rab10 during infection.

      The authors have done a nice job addressing my earlier concerns. I have no further criticism about the revised paper.

    4. Reviewer #2 (Public Review):

      This manuscript explores the interplay between Legionella Dot/Icm effectors that modulate ubiquitination of the host GTPase Rab10. Rab10 undergoes phosphoribosyl-ubiquitination (PR-Ub) by the SidE family of effectors which is required for its recruitment to the Legionella containing vacuole (LCV). Through a series of elegant experiments using effector gene knockouts, co-transfection studies and careful biochemistry, Kubori et al further demonstrate that:

      (1) The SidC family member SdcB contributes to the polyubiquitination (poly-Ub) of Rab10 and its retention at the LCV membrane.

      (2) The transglutaminase effector, MavC acts as an inhibitor of SdcB by crosslinking ubiquitin at Gln41 to lysine residues in SdcB.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study applies voltage clamp fluorometry to provide new information about the function of serotonin-gated ion channels 5-HT3AR. The authors convincingly investigate structural changes inside and outside the orthosteric site elicited by agonists, partial agonists, and antagonists, helping to annotate existing cryo-EM structures. This work confirms that the activation of 5-HT3 receptors is similar to other members of this well-studied receptor superfamily. The work will be of interest to scientists working on channel biophysics but also drug development targeting ligand-gated ion channels.

      Public Reviews:

      All reviewers agreed that these results are solid and interesting. However, reviewers also raised several concerns about the interpretation of the data and some other aspects related to data analysis and discussion that should be addressed by the authors. Essential revisions should include:

      (1) Please try to explicitly distinguish between a closed pore and a resting or desensitized state of the pore, to help in clarity.

      (2) Add quantification of VCF data (e.g. sensor current kinetics, as suggested by reviewer #2) or better clarify/discuss the VCF quantitative aspects that are taken into account to reach some conclusions (reviewer #3).

      (3) Review and add relevant foundational work relevant to this study that is not adequately cited.

      (4) Revise the text according to all recommendations raised by the reviewers and listed in the individual reviews below.

      We have revised the text to address all four points. See the answers to referees’ recommendations.

      Reviewer #1 (Public Review):


      This study brings new information about the function of serotonin-gated ion channels 5-HT3AR, by describing the conformational changes occurring during ligand binding. These results can be potentially extrapolated to other members of the Cys-loop ligand-gated ion channels. By combining fluorescence microscopy with electrophysiological recordings, the authors investigate structural changes inside and outside the orthosteric site elicited by agonists, partial agonists, and antagonists. The results are convincing and correlate well with the observations from cryo-EM structures. The work will be of important significance and broad interest to scientists working on channel biophysics but also drug development targeting ligand-gated ion channels.


      The authors present an elegant and well-designed study to investigate the conformational changes on 5-HT3AR where they combine electrophysiological and fluorometry recordings. They determined four positions suitable to act as sensors for the conformational changes of the receptor: two inside and two outside the agonist binding site. They make a strong point showing how antagonists produce conformational changes inside the orthosteric site, as agonists do, but these changes fail to spread to the lower part of the ECD, in agreement with previous studies and cryo-EM structures. They also show how some loss-of-function mutant receptors elicit conformational changes (changes in fluorescence) after partial agonist binding but fail to produce measurable ionic currents, pointing to intermediate states that are stabilized in these conditions. The four fluorescence sensors developed in this study may be good tools for further studies on characterizing drugs targeting the 5-HT3R.


      Although the major conclusions of the manuscript seem well justified, some of the comparison with the structural data may be vague. The claim that monitoring these silent conformational changes can offer insights into the allosteric mechanisms contributing to signal transduction is not unique to this study and has been previously demonstrated by using similar techniques with other ion channels.

      The referee emphasizes that “some of the comparison with the structural data may be vague”. To better illustrate the structural reorganizations seen in the cryo-EM structures and used for VCF data interpretation, we added a new supplementary figure 3. It shows a superimposition of Apo, setron-bound and 5-HT-bound structures, with reorganization of loop C and the Cys-loop consistent with VCF data.

      Reviewer #2 (Public Review):


      This study focuses on the 5-HT3 serotonin receptor, a pentameric ligand-gated ion channel important in chemical neurotransmission. There are many cryo-EM structures of this receptor with diverse ligands bound, however assignment of functional states to the structures remains incomplete. The team applies voltage-clamp fluorometry to measure, at once, both changes in ion channel activity, and changes in fluorescence. Four cysteine mutants were selected for fluorophore labeling, two near the neurotransmitter site, one in the ECD vestibule, and one at the ECD-TMD junction. Agonists, partial agonists, and antagonists were all found to yield similar changes in fluorescence, a proxy for conformational change, near the neurotransmitter site. The strength of the agonist correlated to a degree with propagation of this fluorescence change beyond the local site of neurotransmitter binding. Antagonists failed to elicit a change in fluorescence at the vestibular and the ECD-TMD junction sites. The VCF results further turned up evidence supporting intermediate (likely pre-active) states.


      The experiments appear rigorous, the problem the team tackles is timely and important, the writing and the figures are for the most part very clear. We sorely need approaches orthogonal to structural biology to annotate conformational states and observe conformational transitions in real membranes- this approach, and this study, get right to the heart of what is missing.


      The weaknesses in the study itself are overall minor, I only suggest improvements geared toward clarity. What we are still missing is application of an approach like this to annotate the conformation of the part of the receptor buried in the membrane; there is important debate about which structure represents which state, and that is not addressed in the current study.

      Reviewer #3 (Public Review):


      The authors have examined the 5-HT3 receptor using voltage clamp fluorometry, which enables them to detect structural changes at the same time as the state of receptor activation. These are ensemble measurements, but they enable a picture of the action of different agonists and antagonists to be built up.


      The combination of rigorously tested fluorescence reporters with oocyte electrophysiology is a solid development for this receptor class.


      The interpretation of the data is solid but relevant foundational work is ignored. Although the data represent a new way of examining the 5-HT3 receptor, nothing that is found is original in the context of the superfamily. Quantitative information is discussed but not presented.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Here are some suggestions that may help to improve the manuscript:

      • Page 6, point 2), typo: "L131W is positioned more profound in each ECD, its side chain (...)"

      “profound” has been corrected to “profoundly”.

      • Fig 1C: Why not compare 5-HT responses for the four sensors studied? If the reason is the low currents elicited by 5-HT on I160C/Y207W sensor, could you comment on this effect that is not observed for the other full agonist tested (mCPBG)?

      The point of this figure (Fig 1G) is to show currents that desensitize, in order to follow the evolution of the fluorescence signal during desensitization. For the I160C/Y207W sensor, where 5-HT becomes a partial agonist, we therefore judged it more appropriate to use mCPBG, which acts as a more potent agonist, to elicit currents with a clear desensitization component. We have added a sentence in the legend of the figure to explain this choice more clearly.

      • Page 9, paragraph 2: "However, concentration-response curves on V106C/L131W show a small yet visible decorrelation of fluorescence and current (...)" Statistical analysis on EC50c and EC50f will help to see this decorrelation.

      Statistical analysis (unpaired t test) has been added to figure 3 panel A.

      • Page 10, paragraph 1: the authors describe how "different antagonists promote different degrees of local conformational changes". Does it have any relation to the efficacy or potency of these antagonists? Is there any interpretation for this result?

      Since setrons are competitive antagonists, the concept of efficacy of these molecules is unclear. Concerning potency, no correlation between affinity and fluorescence variation is observed. For instance, ondansetron and alosetron bind with similar nanomolar affinity to the 5-HT3R (Thompson & Lummis Curr Pharm Des. 2006;12(28):3615-30) but elicit different fluorescence variations on both S204C and I160C/Y207W sensors.

      • Fig. 1 panel A, graph to far right: axis label is cut ("current (uA)/..."). Colors of graph A - right are not clearly distinguishable e.g. cyan from green.

      The fluorescent green color that describes the mutant has been changed into limon color which is more clearly distinguishable from cyan.

      • Why is R219C/F142W not selected in the study? Are the signals comparable to the chosen R219C/Y140W?

      We have chosen not to select R219C/F142W because the current elicited by this construct was lower than the current elicited by the construct R219C/Y140W. Moreover, the residue F142 belongs to the FPF motif of the Cys-loop, which is essential for gating (Polovinkin et al, 2018, Nature).

      • Fig. 1 legend typo: "mutated in tryptophan”

      “in” has been changed to “into”.

      • Fig. 2: yellow color (graphs in panel B) is very hard to read.

      Yellow color has been darkened to yellow/brown to allow easy reading.

      • Fig. 4 is too descriptive and undermines the information of the study. It could be improved e.g. by representing specific structures or partial structures involved. As an additional minor comment, some colors in the figure are hard to differentiate, e.g. magenta and purple.

      We have added relevant specific structures involved, namely loop C, the Cys-loop and pre-M1 loop to clarify. The intensity of magenta and purple has been increased to help differentiate the two sensor positions.

      • Fig S1C: it is confusing to see the same color pattern for the single mutants without the W. I would recommend to label each trace to make it clearer.

      Labelling of the traces corresponding to the single mutants has been added.

      • Fig S2: Indicating the statistical significance in the graph for the mutants with different desensitization properties compared to the WT receptor will help its interpretation.

      The statistical significance of the difference in the desensitization properties has been added to Figure S2.

      Reviewer #2 (Recommendations For The Authors):

      Overall comments for the authors:

      Selection of cysteine mutants and engineered Trp sites is clear and logical. VCF approach with controls for comparing the functionality of WT vs. mutants, and labeled with unlabeled receptor, is well explained and satisfying. The finding that desensitization involves little change in ECD conformation makes sense. It is somewhat surprising, at least superficially, to find that competitive antagonists promote changes in fluorescence in the same 'direction' and amplitude as strong agonists, however, this is indeed consistent with the structural biology, and with findings from other groups testing different labeling sites. Importantly, the team finds that antagonist-binding changes in deltaF do not spread beyond the region near the neurotransmitter site. The finding that most labeling sites in the ECD, in particular those not in/near the neurotransmitter site, fail to report measurable fluorescence changes, is noteworthy. It contrasts with findings in GlyR, as noted by the authors, and supports a mechanism where most of each subunit's ECD behaves as a rigid body.

      Specific questions/comments:

      I am confused about the sensor current kinetics. Results section 2) states that all sensors share the same current desensitization kinetics, while Results section 5) states that the ECD-TMD site and the vestibule site sensors exhibit faster desensitization. SF1C, right-most panel of R219C suggests the mutation and/or labeling here dramatically changes apparent activation and deactivation rates measured by TEVC. Both activation and deactivation upon washout appear faster in this one example. Data for desensitization are not shown here but are shown in aggregate in earlier panels. It is a bit surprising that activation and deactivation would both change but no effect on desensitization. Indeed, it looks like, in Fig. 1G, that desensitization rate is not consistent across all constructs. Can you please confirm/clarify?

      TEVC and VCF recordings in this study show significant variability in both the apparent desensitization and deactivation kinetics. For desensitization, this is illustrated by the TEVC experiments in Figure S2, where the currents remaining after 45 seconds of 5-HT perfusion and the rate constants of desensitization are measured on different oocytes from different batches. Therefore, the differences in desensitization kinetics shown in Fig. 1G are not significant, the aim of the figure being solely to illustrate that no variation of fluorescence is observed during the desensitization phase. A sentence has been added to the legend of Fig. 1G to clarify this point. We also revised the first paragraph of Results section 5, clearly stating that the slight tendency toward faster desensitization of the V106C/L131W and R219C/Y140W sensors is not significant.

      An alternative to the conclusion-like title of Results section 2) is that the ECD (and its labels) does not undergo notable conformational changes between activated and desensitized states.

      This is a good point and we have added a sentence at the end of results section 2 to present this idea.

      I find the discussion paragraph on partial agonist mechanisms, starting with "However," to be particularly important but at times hard to follow. Please try to revise for clarity. I am particularly excited to understand how we can understand/improve assignments of cryo-EM structures using the VCF (or other) approaches. As examples of where I struggled, near the top of p. 11, related to the partial agonist discussion, there is an assumption about the pore being either activated, or resting. Is it not also possible that partial agonists could stabilize a desensitized state of the pore? Strictly speaking, the labeling sites and current measurements do not distinguish between pre-active resting and desensitized channel conformations/states. However, the cryo-EM structures can likely help fill in the missing information there- with all the normal caveats. Please try to explicitly distinguish between a closed pore and a resting or desensitized state of the pore, to help in clarity.

      We have revised the section, and hope it is clearer now. We notably state more explicitly the argument for annotating partial agonist-bound closed structures as pre-active, mainly from kinetic considerations of the VCF experiments. We also mention and cite a paper by the Chakrapani group published on 4 January 2024 (Felt et al, Nature Communications), where they present the structures of the m5HT3AR bound to partial agonists, with a set of conformations fully consistent with our VCF data.

      This statement likely needs references: "...indirect experiments of substituted cysteine accessibility method (SCAM) and VCF experiments suggested that desensitization involves weak reorganizations of the upper part of the channel that holds the activation gate, arguing for the former hypothesis."

      Reference Polovinkin et al, Nature, 2018, has been added.

      I respectfully suggest toning down this language a little bit: "VCF allowed to characterize at an unprecedented resolution the mechanisms of action of allosteric effectors and allosteric mutations, to identify new intermediate conformations and to propose a structure-based functional annotation of known high-resolution structures." This VCF stands strongly without unclear claims about unprecedented resolution. What impresses me most are the findings distinguishing how agonists/partial agonists/antagonists share a conserved action in one area and not in another, the observations consistent with intermediate states, and the efforts to integrate these simultaneous current and conformation measurements with the intimidating array of EM structures.

      We thank the referee for the positive comments. We have removed “unprecedented resolution” and revised the sentences.

      It is beyond the scope of the current study, but I am curious what the authors think the hurdles will be to tracking conformation of the pore domain- an area where non-cryo-EM based conformational measurements are sorely needed to help annotate the EM structures.

      We fully agree with the referee that structures of the TMD are very divergent between the various conditions depending on the membrane surrogate. We are at the moment working on this region by VCF, incorporating the fluorescent unnatural amino acid ANAP.


      (1) P. 5, m5-HT3R: Please clarify that this refers to the mouse receptor, if that is correct.

      OK, “mouse” has been added.

      (2) Fig. 1D, I suggest moving the 180-degree arrow to the right so it is below but between the two exterior and vestibular views.

      Ok, it has been done.

      (3) Please add a standard 2D chemical structure of MTS-TAMRA, and TAMRA attached to a cysteine, to Fig 1.

      A standard chemical structure has been added for the two isomers of MTS-TAMRA.

      (4) Please label subpanels in Fig. 1G with the identity of the label site.

      The subpanels have been labelled.

      Reviewer #3 (Recommendations For The Authors):

      This is solid work but I mainly have suggestions about placing it in context.

      (1) Abstract "Data show that strong agonists promote a concerted motion of all sensors during activation, "

      The concept of sensors here is the fluorescent labels? I did not find this meaningful until I read the significance statement.

      We have specified “fluorescently-labelled” before sensors in the abstract.

      (2) p4 "each subunit in the 5-HT3A pentamer...." this description would be identical for any pentameric LGIC so the authors should beware of a misleading specificity. This goes for other phrases in this paragraph. However, the summary of the 5HT specific results is very good.

      About the description of the structure, we added “The 5-HT3AR displays a typical pLGIC structure, where….”.

      (3) This paper is very nicely put together and generally explains itself well. The work is rigorous and comprehensive. But the meaning of quenching (by local Trp) seems straightforward, but it is not made explicit in the paper. Why doesn't simple labelling (single Cys) at this site work? And can we have a more direct demonstration of the advantage of including the Trp (not in the supplementary figure?) All this information is condensed into the first part of figure 1 (the graph in Figure 1A). Figure 1 could be split and the principle of the introduced quenching could be more clearly shown

      We have detailed in a few more sentences the principle of the TrIQ approach. In addition, to be more explicit, the significant differences in fluorescence between sensors with and without tryptophan have been added to Figure 1 (screening panel), and a sentence has been added to the legend of this figure.

      (4) p10 "VCF measurements are also remarkably coherent with the atomic structures showing an open pore (so called F, State 2 and 5-HT asymmetric states), "

      This statement is intriguing. What do these names or concepts represent? Are they all the same thing? Where do the names come from? What is meant here? Three different concepts, all consistent? Or three names for the same concept?

      We have tried to clarify the statement by referring to the PDB entries of the structures.

      (5) "Fluorescence and VCF studies identified similar intermediate conformations for nAChRs, ⍺1-GlyRs and the bacterial homolog GLIC(21,32-35). "

      Whilst this is true, the motivation for such ideas came from earlier work identifying intermediates from electrophysiology alone (such as the flip state (Burzomato et al 2004), the priming state (Mukhtasimova et al 2009) and the conformational wave in ACh channels (Grosman et al 2000)). It would be appropriate to mention some of this earlier work.

      We have incorporated and described these references in the discussion. Of note, we fully quoted these references in our previous papers on the subject (Menny 2017, Lefebvre 2021, Shi 2023), but the referee is right in asking to quote them again.

      (6) "A key finding of the study is the identification of pre-active intermediates that are favored upon binding of partial agonists and/or in the presence of loss-of-function mutations. "

      Even more fundamental, the idea of a two-state equilibrium for neurotransmitter receptors was discarded in 1957 according to the action of partial agonists.

      DEL CASTILLO J, KATZ B (1957) Interaction at end-plate receptors between different choline derivatives. Proc R Soc Lond B Biol Sci

      So to discover this "intermediate" - that is, bound but minimal activity - in the present context seems a bit much. It is a big positive of this paper that the results are congruent with our expectations, but I cannot see value in posing the results as an extension of the 2-state equilibrium (for which there are anyway other objections).

      As for intermediates being favoured by loss of function mutations, this concept is already well established in glycine receptors (Plested et al 2007, Lape et al 2012) and doubtless in other cases too.

      I do get the point that the authors want to establish a basis in 5-HT3 receptors, but these previous works suggest the results are somewhat expected. This should be commented on.

      We also agree. We replace “key finding” by “key observation”, quote most of the references proposed, and explicitly conclude that “The present work thus extends this idea to the 5HT3AR, together with providing structural blueprints for cryo-EM structure annotation”.

      (7) "In addition, VCF data allow a quantitative estimate of the complex allosteric action of partial agonists, that do not exclusively stabilize the active state and document the detailed phenotypes of various allosteric mutations."

      Where is this provided? If the authors are not motivated to do this, I have some doubts that others will step in. If it is not worth doing, it's probably not worth mentioning either.

      Language has been toned down to: “In addition, VCF data give insights in the action of partial agonists, that do not exclusively stabilize the active state and document the phenotypes of various allosteric mutations."

      (8) Figure 1G please mark which construct is which.

      This has been added to Figure 1G.

    2. eLife assessment

      This valuable study applies voltage clamp fluorometry to provide new information about the function of serotonin-gated ion channels 5-HT3AR. The authors convincingly investigate structural changes inside and outside the orthosteric site elicited by agonists, partial agonists, and antagonists, helping to annotate existing cryo-EM structures. This work confirms that the activation of 5-HT3 receptors is similar to other members of this well-studied receptor superfamily. The work will be of interest to scientists working on channel biophysics but also drug development targeting ligand-gated ion channels.

    3. Reviewer #1 (Public Review):


      This study brings new information about the function of serotonin-gated ion channels 5-HT3AR, by describing the conformational changes the receptor undergoes during ligand binding. These results can potentially be extrapolated to other members of the Cys-loop ligand-gated ion channels. By combining fluorescence microscopy with electrophysiological recordings, the authors investigate structural changes inside and outside the orthosteric site elicited by agonists, partial agonists, and antagonists. The results are convincing and correlate well with the observations from cryo-EM structures. The work will be of significance and broad interest to scientists working on channel biophysics but also drug development targeting ligand-gated ion channels.


      The authors present an elegant and well-designed study to investigate the conformational changes of 5-HT3AR, in which they combine electrophysiological and fluorometry recordings. They determined four positions suitable to act as sensors for the conformational changes of the receptor: two inside and two outside the agonist binding site. They make a strong point showing how antagonists produce conformational changes inside the orthosteric site similar to those produced by agonists, but these changes fail to spread to the lower part of the ECD, in agreement with previous studies and cryo-EM structures. They also show how some loss-of-function mutant receptors elicit conformational changes (changes in fluorescence) after partial agonist binding but fail to produce measurable ionic currents, pointing to intermediate states that are stabilized in these conditions. The four fluorescence sensors developed in this study may be good tools for further studies on characterizing drugs targeting the 5-HT3R. The major conclusions of the manuscript seem well justified.


      Weaknesses have been very well addressed during the review process.

    4. Reviewer #2 (Public Review):


      This study focuses on the 5-HT3 serotonin receptor, a pentameric ligand-gated ion channel important in chemical neurotransmission. There are many cryo-EM structures of this receptor with diverse ligands bound; however, assignment of functional states to the structures remains incomplete. The team applies voltage-clamp fluorometry to measure, at once, both changes in ion channel activity and changes in fluorescence. Four cysteine mutants were selected for fluorophore labeling: two near the neurotransmitter site, one in the ECD vestibule, and one at the ECD-TMD junction. Agonists, partial agonists, and antagonists were all found to yield similar changes in fluorescence, a proxy for conformational change, near the neurotransmitter site. The strength of the agonist correlated to a degree with propagation of this fluorescence change beyond the local site of neurotransmitter binding. Antagonists failed to elicit a change in fluorescence at the vestibule or the ECD-TMD junction sites. The VCF results further turned up evidence supporting intermediate (likely pre-active) states.


      The experiments appear rigorous, the problem the team tackles is timely and important, and the writing and the figures are for the most part very clear. We sorely need approaches orthogonal to structural biology to annotate conformational states and observe conformational transitions in real membranes; this approach, and this study, get right to the heart of what is missing.


      The weaknesses in the study itself are overall minor, I only suggest improvements geared toward clarity. What we are still missing is application of an approach like this to annotate the conformation of the part of the receptor buried in the membrane; there is an important debate about which structure represents which state, and that is not addressed in the current study.

    5. Reviewer #3 (Public Review):


      The authors have examined the 5-HT3 receptor using voltage clamp fluorometry, which enables them to detect structural changes at the same time as the state of receptor activation. These are ensemble measurements, but they enable an impressive scheme of the action of different agonists and antagonists to be built up. The growing array of structural snapshots of 5-HT3 receptors is used to good effect to understand the results.


      The combination of rigorously tested fluorescence reporters with oocyte electrophysiology across a large panel of ligands is a solid development for this receptor type.


      In their revision, the authors corrected all the weaknesses of the original submission.

    1. Author Response

      Provisional response

      We would like to thank the reviewers for taking the time to review our manuscript, for providing useful suggestions for improvement, and for highlighting the significance of our approach.

      Reviewer #1 (Public Review):


      The authors demonstrate that it is possible to carry out eQTL experiments for the model eukaryote S. cerevisiae, in "one pot" preparations, by using single-cell sequencing technologies to simultaneously genotype and measure expression. This is a very appealing approach for investigators studying genetic variation in single-celled and other microbial systems, and will likely inspire similar approaches in non-microbial systems where comparable cell mixtures of genetically heterogeneous individuals could be achieved.


      While eQTL experiments have been done for nearly two decades (the corresponding author's lab are pioneers in this field), this single-cell approach creates the possibility for new insights about cell biology that would be extremely challenging to infer using bulk sequencing approaches. The major motivating application shown here is to discover cell occupancy QTL, i.e. loci where genetic variation contributes to differences in the relative occupancy of different cell cycle stages. The authors dissect and validate one such cell cycle occupancy QTL, involving the gene GPA1, a G-protein subunit that plays a role in regulating the mating response MAPK pathway. They show that variation at GPA1 is associated with proportional differences in the fraction of cells in the G1 stage of the cell cycle. Furthermore, they show that this bias is associated with differences in mating efficiency.

      We thank the reviewer for recognizing the strengths of our overall approach and our dissection of the functional consequences of the W82R variant of GPA1.


      While the experimental validation of the role of GPA1 variation is well done, the novel cell cycle occupancy QTL aspect of the study is somewhat underexploited. The cell occupancy QTLs that are mentioned all involve loci that the authors have identified in prior studies that involved the same yeast crosses used here. It would be interesting to know what new insights, besides the "usual suspects", the analysis reveals. For example, in Cross B there is another large effect cell occupancy QTL on Chr XI that affects the G1/S stage. What candidate genes and alleles are at this locus?

      We thank the reviewer for this suggestion. We plan to expand the section on cell cycle occupancy QTL in our revision.

      And since cell cycle stages are not biologically independent (a delay in G1, could have a knock-on effect on the frequency of cells with that genotype in G1/S), it would seem important to consider the set of QTLs in concert.

      We thank the reviewer for this suggested clarification. In our revision, we will clarify that the cell cycle occupancy phenotype represents the proportion of cells assigned to a given stage. As the reviewer correctly notes, a change in the proportion of cells in one stage may alter the proportion of cells in other stages, and this could result in cell cycle occupancy QTL for multiple stages. We will make efforts to consider the cell cycle occupancy QTLs in concert in the revised manuscript.
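
      As a minimal sketch of the point above, assuming hypothetical per-cell stage assignments (the function name, stage labels, and cell counts are illustrative and are not data from the study), occupancy can be computed as the fraction of cells of a genotype assigned to each stage. Because the fractions sum to one, a gain in one stage must be offset by losses in others:

```python
from collections import Counter


def stage_occupancy(stage_labels):
    """Return the fraction of cells assigned to each cell cycle stage."""
    counts = Counter(stage_labels)
    total = sum(counts.values())
    return {stage: n / total for stage, n in counts.items()}


# Hypothetical stage assignments for cells carrying each allele at one locus.
cells_allele_a = ["G1"] * 40 + ["G1/S"] * 20 + ["S"] * 20 + ["G2/M"] * 20
cells_allele_b = ["G1"] * 55 + ["G1/S"] * 15 + ["S"] * 15 + ["G2/M"] * 15

occ_a = stage_occupancy(cells_allele_a)
occ_b = stage_occupancy(cells_allele_b)

# Per-stage difference in occupancy between alleles. The differences sum to
# zero, so an effect on one stage necessarily appears in other stages too.
diffs = {stage: occ_b[stage] - occ_a[stage] for stage in occ_a}
```

      In this toy example a single change (more cells in G1) produces apparent effects at every other stage, which is why occupancy QTL for different stages are best interpreted jointly.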

      Reviewer #2 (Public Review):

      Boocock and colleagues present an approach whereby eQTL analysis can be carried out by scRNA-Seq alone, in a one-pot-shot experiment, due to genotypes being able to be inferred from SNPs identified in RNA-Seq reads. This approach obviates the need to isolate individual spores, genotype them separately by low-coverage sequencing, and then perform RNA-Seq on each spore separately. This is a substantial advance and opens up the possibility to straightforwardly identify eQTLs over many conditions in a cost-efficient manner. Overall, I found the paper to be well-written and well-motivated, and have no issues with either the methodological/analytical approach (though eQTL analysis is not my expertise), or with the manuscript's conclusions.

      We thank the reviewer for recognizing the significant contributions our work makes to the field.

      393 segregant experiment:

      For the experiment with the 393 previously genotyped segregants, did the authors examine whether averaging the expression by genotype for single cells gave expression profiles similar to the bulk RNA-Seq data generated from those genotypes? Also, is it possible (and maybe not, due to the asynchronous nature of the cell culture) to use the expression data to aid in genotyping for those cells whose genotypes are ambiguous? I presume it might be if one has a sufficient number of cells for each genotype, though, for the subsequent one-pot experiments, this is a moot point.

      We thank the reviewer for this comment. While we could expand the analysis along these lines, this is not relevant for the subsequent one-pot eQTL experiments, as the reviewer notes, and is therefore beyond the scope of the manuscript. We will make the data available so that anyone interested can try these analyses.

      Figure 1B:

      Is UMAP necessary to observe an ellipse/circle - I wouldn't be surprised if a simple PCA would have sufficed, and given the current discussion about whether UMAP is ever appropriate for interpreting scRNA-Seq (or ancestry) data, it seems the PCA would be a preferable approach. I would expect that the periodic elements are contained in 2 of the first 3 principal components. Also, it would be nice if there were a supplementary figure similar to Figure 4 of Macosko et al (PMID 26000488) to indeed show the cell cycle dependent expression.

      We thank the reviewer for this comment. We too have been following the debate on the utility of UMAP for scRNA-seq, and in our revision we will provide an alternative visualization of the cell cycle. We will also generate a supplementary figure similar to Figure 4 of Macosko et al. to visualize cell-cycle-dependent gene expression.

      Aging, growth rate, and bet-hedging:

      The mention of bet-hedging reminded me of Levy et al (PMID 22589700), where they saw that Tsl1 expression changed as cells aged and that this impacted a cell's ability to survive heat stress. This bet-hedging strategy meant that the older, slower-growing cells were more likely to survive, so I wondered a couple of things. Is it possible from single-cell data to identify either an aging or a growth rate signature? A number of papers from David Botstein's group culminated in a paper that showed that they could use a gene expression signature to predict instantaneous growth rate (PMID 19119411) and I wondered a) if this is possible from single-cell data, and b) whether in the slower growing cells they see markers of aging, whether these two signatures might impact the ability to detect eQTLs, and if they are detected, whether they could in some way be accounted for to improve detection.

      We thank the reviewer for this comment and suggested analyses. We are not sure whether one can see gene expression signatures of aging in yeast scRNA-seq data. We believe that such analyses are beyond the scope of this work, but we will make the data available so that anyone interested can try them.

      AIL vs. F2 segregants:

      I'm curious if the authors have given thought to the trade-offs of developing advanced intercross lines for scRNA-Seq eQTL analysis. My impression is that AIL provides better mapping resolution, but at the expense of having to generate the lines. It might be useful to see some discussion on that.

      We thank the reviewer for their comment. We will include some discussion of the trade-offs of different experimental designs in our revision.

      10x vs SPLiT-Seq

      10x is a well established, but fairly expensive approach for scRNA-Seq - I wondered how the cost of the 10x approach compares to the previously used approach of genotyping segregants and performing bulk RNA-Seq, and how those costs would change if one used SPLiT-Seq (see PMID 38282330).

      We will provide some ballpark estimates of the costs, and we will discuss the trade-offs of different scRNA-seq technologies in our revision.

    2. eLife assessment

      This manuscript describes the mapping of natural DNA sequence variants that affect gene expression and its noise, as well as cell cycle timing, using as input single-cell RNA-sequencing of progeny from crosses between wild yeast strains. The method represents an important advance in the study of natural genetic variation. The findings, especially given the follow-up validation of the phenotypic impact of a mapped locus of major effect, provide convincing support for the rigor and utility of the method.

    3. Reviewer #1 (Public Review):


      The authors demonstrate that it is possible to carry out eQTL experiments for the model eukaryote S. cerevisiae, in "one pot" preparations, by using single-cell sequencing technologies to simultaneously genotype and measure expression. This is a very appealing approach for investigators studying genetic variation in single-celled and other microbial systems, and will likely inspire similar approaches in non-microbial systems where comparable cell mixtures of genetically heterogeneous individuals could be achieved.


      While eQTL experiments have been done for nearly two decades (the corresponding author's lab are pioneers in this field), this single-cell approach creates the possibility for new insights about cell biology that would be extremely challenging to infer using bulk sequencing approaches. The major motivating application shown here is to discover cell occupancy QTL, i.e. loci where genetic variation contributes to differences in the relative occupancy of different cell cycle stages. The authors dissect and validate one such cell cycle occupancy QTL, involving the gene GPA1, a G-protein subunit that plays a role in regulating the mating response MAPK pathway. They show that variation at GPA1 is associated with proportional differences in the fraction of cells in the G1 stage of the cell cycle. Furthermore, they show that this bias is associated with differences in mating efficiency.


      While the experimental validation of the role of GPA1 variation is well done, the novel cell cycle occupancy QTL aspect of the study is somewhat underexploited. The cell occupancy QTLs that are mentioned all involve loci that the authors have identified in prior studies that involved the same yeast crosses used here. It would be interesting to know what new insights, besides the "usual suspects", the analysis reveals. For example, in Cross B there is another large effect cell occupancy QTL on Chr XI that affects the G1/S stage. What candidate genes and alleles are at this locus? And since cell cycle stages are not biologically independent (a delay in G1, could have a knock-on effect on the frequency of cells with that genotype in G1/S), it would seem important to consider the set of QTLs in concert.

    4. Reviewer #2 (Public Review):

      Boocock and colleagues present an approach whereby eQTL analysis can be carried out by scRNA-Seq alone, in a one-pot-shot experiment, due to genotypes being able to be inferred from SNPs identified in RNA-Seq reads. This approach obviates the need to isolate individual spores, genotype them separately by low-coverage sequencing, and then perform RNA-Seq on each spore separately. This is a substantial advance and opens up the possibility to straightforwardly identify eQTLs over many conditions in a cost-efficient manner. Overall, I found the paper to be well-written and well-motivated, and have no issues with either the methodological/analytical approach (though eQTL analysis is not my expertise), or with the manuscript's conclusions.

      I do have several questions/comments.

      393 segregant experiment:

      For the experiment with the 393 previously genotyped segregants, did the authors examine whether averaging the expression by genotype for single cells gave expression profiles similar to the bulk RNA-Seq data generated from those genotypes? Also, is it possible (and maybe not, due to the asynchronous nature of the cell culture) to use the expression data to aid in genotyping for those cells whose genotypes are ambiguous? I presume it might be if one has a sufficient number of cells for each genotype, though, for the subsequent one-pot experiments, this is a moot point.

      Figure 1B:

      Is UMAP necessary to observe an ellipse/circle - I wouldn't be surprised if a simple PCA would have sufficed, and given the current discussion about whether UMAP is ever appropriate for interpreting scRNA-Seq (or ancestry) data, it seems the PCA would be a preferable approach. I would expect that the periodic elements are contained in 2 of the first 3 principal components. Also, it would be nice if there were a supplementary figure similar to Figure 4 of Macosko et al (PMID 26000488) to indeed show the cell cycle dependent expression.

      Aging, growth rate, and bet-hedging:

      The mention of bet-hedging reminded me of Levy et al (PMID 22589700), where they saw that Tsl1 expression changed as cells aged and that this impacted a cell's ability to survive heat stress. This bet-hedging strategy meant that the older, slower-growing cells were more likely to survive, so I wondered a couple of things. Is it possible from single-cell data to identify either an aging or a growth rate signature? A number of papers from David Botstein's group culminated in a paper that showed that they could use a gene expression signature to predict instantaneous growth rate (PMID 19119411) and I wondered a) if this is possible from single-cell data, and b) whether in the slower growing cells they see markers of aging, whether these two signatures might impact the ability to detect eQTLs, and if they are detected, whether they could in some way be accounted for to improve detection.

      AIL vs. F2 segregants:

      I'm curious if the authors have given thought to the trade-offs of developing advanced intercross lines for scRNA-Seq eQTL analysis. My impression is that AIL provides better mapping resolution, but at the expense of having to generate the lines. It might be useful to see some discussion on that.

      10x vs SPLiT-Seq

      10x is a well established, but fairly expensive approach for scRNA-Seq - I wondered how the cost of the 10x approach compares to the previously used approach of genotyping segregants and performing bulk RNA-Seq, and how those costs would change if one used SPLiT-Seq (see PMID 38282330).

    1. Author Response

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for their insightful comments and recommendations. We have extensively revised the manuscript in response to the valuable feedback. We believe the result is a more rigorous and thoughtful analysis of the data. Furthermore, our interpretation and discussion of the findings is more focused and highlights the importance of the circuit and its role in the response to stress. Thank you for helping to improve the presented science.

      Key changes made in response to the reviewers' comments include:

      • Revision of statistical analyses for nearly all figures, with the addition of a new table of summary statistics to include F and/or t values alongside p-values.

      • Addition of statistical analyses for all fiber photometry data.

      • Examination of data for possible sex-dependent effects.

      • Clarification of breeding strategies and genotype differences, with added details to methods to improve clarity.

      • Addressing concerns about the specificity of virus injections and the spread, with additional details added to methods.

      • Modification of terminology related to goal-directed behavior based on reviewer feedback, including removal of the term from the manuscript.

      • Clarification and additional data on the use of photostimulation and its effects, including efforts to inactivate neurons for further insight, despite technical challenges.

      • Correction of grammatical errors throughout the manuscript.

      Reviewer 1:

      Despite the manuscript being generally well-written and easy to follow, there are several grammatical errors throughout that need to be addressed.

      Thank you for highlighting this issue. Grammatical errors have been fixed in the revised version of the manuscript.

      Only p values are given in the text to support statistical differences. This is not sufficient. F and/or t values should be given as well.

      In response to this critique and similar comments from Reviewer 2, we re-evaluated our approach to statistical analyses and extensively revised the analyses for nearly all figures. We also added a new table of summary statistics (Supplemental Table 1) containing the type of analysis, statistic, comparison, multiple comparisons, and p value(s). For Figures 4C-E, 5C, 6C-E, 7H-I, and 8H, we analyzed the data using two-way repeated measures (RM) ANOVA, which examined the main effect of time (either number of sessions or stimulation period) within the same animal, the main effect of genotype (Cre+ vs Cre-), and whether there was an interaction. For Supplemental Figure 7A we also conducted a two-way RM ANOVA in Cre+ mice, with time as one factor and activity state (number of port activations in the active vs inactive nose port) as the other. For Figures 5D-E we conducted a two-way mixed model ANOVA that accounted and corrected for missing data. For figures that compared only two groups of data (Figures 5F-L, 6F, 8C-D, 8I, and Supp 6F-G) we used a two-tailed t-test. If our question and/or hypothesis required multiple comparisons between or within treatments, we conducted Bonferroni’s multiple comparisons test for post hoc analysis (we note which groups we compared in Supplemental Table 1). For figures that did or did not show a change in calcium activity (Figures 3G, 3I-K, 7B, 7D-E, 8E-F), we compared waveform confidence intervals (Jean-Richard-Dit-Bressel, Clifford, McNally, 2020). The time windows used for comparison are noted in Supplemental Table 1, along with whether the comparisons were significant at the 95%, 99%, and 99.9% thresholds.

      None of the comparisons that were significant in the prior analyses fell below the thresholds for significance. Of those found to be not significantly different, only one change was noted. In Figure 6E there was now a significant baseline difference between Cre+ and Cre- mice, with Cre- mice taking longer to first engage the port compared to Cre+ mice (p=0.045). Although the more rigorous approach to the statistical analyses did not change our interpretations, we feel it enhanced the paper and thank the reviewer for pushing this improvement.
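
      To illustrate the post hoc step described above, the following is a minimal sketch of the textbook Bonferroni adjustment. It is not the authors' analysis code or the procedure of any particular statistics package; the function name and the raw p-values are hypothetical.

```python
def bonferroni_adjust(p_values, alpha=0.05):
    """Multiply each p-value by the number of comparisons (capped at 1.0)
    and flag which comparisons remain significant at the given alpha."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical raw p-values from three post hoc comparisons.
raw_p = [0.004, 0.020, 0.300]
adjusted, reject = bonferroni_adjust(raw_p)
```

      With three comparisons, a raw p-value of 0.020 is adjusted to roughly 0.06 and no longer passes a 0.05 threshold, which is why the number and identity of the compared groups need to be reported alongside the p-values.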

      Moreover, the fibre photometry data does not appear to have any statistical analyses reported - only confidence intervals are represented in the figures, without any mention of whether the null hypothesis, that the elevations in activity observed are not different from the baseline, can be rejected.

      This is particularly important where there is ambiguity, such as in Figure 3K, where the spontaneous activity of the animal appears to correlate with a spike in activity but the text mentions that there is no such difference. Without statistics, this is difficult to judge.

      Thank you for highlighting this critical point and providing an opportunity to strengthen our manuscript. We added statistical analyses of all fiber photometry data using a recently described approach based on waveform confidence intervals (Jean-Richard-Dit-Bressel, Clifford, McNally, 2020). In the statistical summary (Supplemental Table 1) we note the time window that we used for comparison in each analysis and whether the comparisons were significant at the 95%, 99%, and 99.9% thresholds. Thank you for highlighting this and helping make the manuscript stronger.
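
      The general idea behind a confidence-interval test on a waveform can be sketched as follows. This is a simplified percentile bootstrap written for illustration only; it is not the published method in full (which includes further safeguards omitted here) and not the authors' code. The function name, parameters, and toy trials are assumptions.

```python
import random


def bootstrap_waveform_ci(trials, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of the mean waveform at each time point.

    `trials` is a list of equal-length lists (one waveform per trial).
    Whole trials are resampled with replacement on each iteration.
    """
    rng = random.Random(seed)
    n_trials = len(trials)
    n_time = len(trials[0])
    boot_means = []
    for _ in range(n_boot):
        sample = rng.choices(trials, k=n_trials)
        boot_means.append(
            [sum(trial[t] for trial in sample) / n_trials for t in range(n_time)]
        )
    lower, upper = [], []
    for t in range(n_time):
        column = sorted(means[t] for means in boot_means)
        lower.append(column[int((alpha / 2) * n_boot)])
        upper.append(column[int((1 - alpha / 2) * n_boot) - 1])
    return lower, upper


# Toy baseline-subtracted trials: the first two time points fluctuate around
# zero, the last two are consistently elevated.
trials = [
    [1.0 if i % 2 == 0 else -1.0,
     -1.0 if i % 2 == 0 else 1.0,
     2.0 + 0.1 * i,
     3.0 - 0.1 * i]
    for i in range(10)
]
lower, upper = bootstrap_waveform_ci(trials)

# A time point is flagged when its CI excludes zero (the baseline).
flags = [lo > 0 or hi < 0 for lo, hi in zip(lower, upper)]
```

      Running the same procedure with alpha set to 0.01 or 0.001 gives the stricter 99% and 99.9% thresholds mentioned above.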

      With respect to Figure 3K, we are not certain we understood the spike in activity the reviewer referred to. Figures 3J and 3K include both velocity data (gold) and the Ca2+ dependent signal (blue). We used episodes of velocity that were comparable to the avoidance response during the ambush test and found no significant differences in the Ca2+ signal when gating around changes in velocity in the absence of a stressor (Supplemental Table 1). This is in contrast to the significant change in Ca2+ signal following a mock predator ambush (Figure 3J). We interpret these data together to indicate that locomotion does not correlate with an increase in calcium activity in SuMVGLUT2+::POA neurons, but that coping with a stressor does. This conclusion is further examined in Supplemental Figure 5, including a cross-correlation analysis to test for a temporally offset relationship between velocity and Ca2+ signal in SuMVGLUT2+::POA neurons.

      The use of photostimulation only is unfortunate, it would have been really nice to see some inactivation of these neurons as well. This is because of the well-documented issues with being able to determine whether photostimulation is occurring in a physiological manner, and therefore makes certain data difficult to interpret. For instance, with regards to the 'active coping' behaviours - is this really the correct characterisation of what's going on? I wonder if the mice simply had developed immobile responding as a coping strategy but when they experience stimulation of these neurons that they find aversive, immobility is not sufficient to deal with the summative effects of the aversion from the swimming task as well as from the neuronal activation? An inactivation study would be more convincing.

      We agree with the reviewer that experiments demonstrating the necessity of SuMVGLUT2+::POA neurons would have added to the story here. We carried out multiple experiments aimed at addressing questions about the necessity of SuMVGLUT2+::POA neurons in stress coping behaviors, specifically the forced swim assay. Efforts included employing chemogenetic, optogenetic, and tetanus toxin-based methods. We observed no effects on locomotor activity or stress coping. These experiments are both technically difficult and challenging to interpret. Interpretation of negative results, as we obtained, is particularly difficult because of potential technical confounds. Selective targeting of SuMVGLUT2+::POA neurons for inhibition requires three viral injections and two recombination steps, increasing variability and reducing the number of neurons impacted. Alternatively, photoinhibition targeting SuMVGLUT2+::POA cells can be done using Retro-AAV injected into POA and a fiber implant over SuM. We tried both approaches. The data obtained were difficult to interpret because questions arose about adequate coverage of the SuMVGLUT2+::POA population by virally expressed constructs and/or light spread. The challenge of adequate coverage to effectively prevent output from the targeted population is further confounded by challenges inherent in neural inhibition, specifically determining whether the inhibition created at the cellular level is adequate to block output in the context of excitatory inputs, or whether neurons must first be engaged in a particular manner for inhibition to be effective. Baseline neural activity, release probability, and post-synaptic effects could all be relevant, which photo-inhibition will potentially not resolve. So, while the trend is to always show “necessary and sufficient” effects, we’ve tried nearly everything, and we simply cannot conclude much from our mixed results.

      There are also well-established problems with existing photo-inhibition methods, which are often ignored even as people use and tout these methods. We have a lot of expertise in photo-inhibition optogenetics, and indeed have used it with some success and developed new methods, yet in this particular case we are unable to draw conclusions related to inhibition. People have experienced similar challenges in locus coeruleus neurons, which have very low basal activity: inhibition with chemogenetics is very hard, as is inhibition with optogenetic pump-based approaches, because the neurons fire robust rebound APs. We have spent almost 2.5 years trying to get this to work in this circuit because reviewers have been insistent on this result for the paper to be conclusive. Unfortunately, it simply isn’t possible in our view until we know more about the cell types involved. This is all in spite of experience using the approach in many other publications.

      We also employed less selective approaches, such as injecting AAV-DIO-tetanus toxin light chain (Tettox) constructs directly into SuM VGLUT2-Cre mice, but found off-target effects impacting animal wellbeing and impeding behavioral testing due to viral spread to surrounding areas.

      While we are disappointed at being unable to directly address questions about the necessity of SuMVGLUT2+::POA neurons in active coping with experimental data, we were unable to obtain results allowing for clear interpretation across numerous other domains the reviewers requested. We also feel strongly that until we have a clear picture of the molecular cell type architecture in the SuM, and Cre-drivers to target subsets of neurons, this question will be difficult to resolve for any group. We are working now on RNAseq and related spatial transcriptomics efforts in the SuM and examining additional behavioral paradigms to resolve these issues, so stay tuned for future publications.

      Accordingly, we avoid making statements relating to necessity in the manuscript, in spite of having several lines of physiological data showing strong, robust correlations with behavior related to the SuMVGLUT2+::POA circuit.

      Nose poke is only nominally instrumental as it cannot be shown to have a unique relationship with the outcome that is independent of the stimuli-outcome relationships (in the same way that a lever press can, for example). Moreover, there is nothing here to show that the behaviours are goal-directed.

      Thank you for highlighting this point. We have removed the goal-directed terminology from the manuscript. Since the mice perform highly selective (active vs inactive) port activation robustly across multiple days of training, the behavior likely transitions to habitual behavior. We only tested the valuation of stimulus termination on the final day of training, with a time-limited progressive ratio test. With respect to lever press versus active port activation, we are unclear how using a lever in this context would offer a different interpretation. Lever pressing may be more sensitive to changes in valuation when compared to nose poke port activation (Atalayer and Rowland 2008); however, in this study the focus of the operant behavior is separating innate behaviors from learned action–outcome instrumental behaviors for threat response (LeDoux and Daw 2018). The robust, highly selective activation of the active port illustrated in Figure 6 fits as an action–outcome instrumental behavior wherein mice learn to engage the active but not inactive port to terminate photostimulation. The first activation of the port occurs through exploration of the arena but, as demonstrated by the number of active port activations and the decline in time to the first active port engagement, mice expressing ChR2eYFP learn to engage the port to terminate the stimulation. To aid in illustrating this point we have added Supplemental Figure 7 showing active and inactive port activations for both Cre+ and Cre- mice. This adds clarity to the high rate of selective port activation driven by stimulation of SuMVGLUT2+::POA neurons compared to controls. Eliminating the goal-directed terminology and providing additional data narrows and supports one of the key points of the operant experiment.

      With regards to Figure 1: This is a nice figure, but I wonder if some quantification of the pathways and their density might be helpful, perhaps by measuring the intensity of fluorescence in image J (as these are processes, not cell bodies that can be counted)? Mind you, they all look pretty dense so perhaps this is not necessary! However, because the authors are looking at projections in so-called 'stress-engaged regions', the amygdala seems conspicuous by its absence. Did the authors look in the amygdala and find no projections? If so it seems that this would be worth noting.

      This is an interesting question, but it has proven to be technically very challenging. We consulted with several leaders in the field who routinely use complementary viral tracing methods. We were unable to devise a method to provide a satisfactorily meaningful quantitative (as opposed to qualitative) approach to compare SuMVGLUT2+::POA to SuMVGLUT2+ projections. A few limitations are present that hinder a meaningful quantitative approach. One limitation was the need for different viral strategies to label the two populations. Labeling SuMVGLUT2+::POA neurons requires using VGLUT2-Flp mice with two injections into the POA and one into SuM. Two recombinase steps were required, reducing efficiency of overlap. This combination of viral injections, particularly the injections of RetroAAVs in the POA, can induce significant quantitative variability due to tropism, efficacy, and variability of retro-viral methods, and viral infection generally. These issues are often totally ignored in similar studies across the “neural circuit” landscape, but that doesn’t make them less relevant here.

      Although people in the field do this and show quantification, we believe that it can be a quite misleading read-out of functionally relevant circuitry, given that neurotransmitter release ultimately is amplified by receptors post-synaptically, and many examples of robust behavioral effects have been observed with low fiber labeling in complementary tracing methods (McCall, Siuda et al. 2017). In contrast, the broader SuMVGLUT2+ population was labeled using a single injection into the SuM. This means there is likely more efficient expression of the fluorophore. Additionally, in areas that contain both terminals and passing fibers, interpreting the fluorescent signal is challenging. Together, these factors limit a meaningful quantitative comparison and make interpretation difficult. In this context, we focused on a conservative qualitative presentation to demonstrate two central points: 1) SuMVGLUT2+::POA neurons are a subset of SuMVGLUT2+ neurons that project to specific areas, excluding dentate gyrus, and 2) they arborize extensively to multiple areas which have been linked to threat responses. We agree that there is much to be learned about how different populations in SuM connect to targets in different regions of the brain, and we will continue to examine this question with different techniques. A meaningful quantitative study comparing projections is technically complex and, we feel, beyond our ability for this study.

      Also, for the reasons above we do not believe that quantification provides exceptional clarity with respect to the putative function of the circuit, glutamate released, or other cotransmitters given known amplification at the post-synaptic side of the circuit.

      With regard to the amygdala, other studies on SuM projections have found efferent projections to amygdala (Ottersen, 1980; Vertes, 1992). In our study we were unable to definitively determine projections from SuMVGLUT2+::POA neurons to amygdala, which if present are not particularly dense. For this reason we were conservative and do not comment on this particular structure.

      I would suggest removing the term goal-directed from the manuscript and just focusing on the active vs. passive distinction.

      We removed the use of goal-directed. Thank you for helping us clarify our terminology.

      The effect observed in Figure 7I is interesting, and I'm wondering if a rebound effect is the most likely explanation for this. Did the authors inhibit the VGAT neurons in this region at any other times and observe a similar rebound? If such a rebound was not observed it would suggest that it is something specific about this task that is producing the behaviour. I would like it if the authors could comment on this.

      We agree that results showing the change in coping strategy (passive to active) in forced swim after but not during stimulation of SuMVGAT+ neurons is quite interesting (Figure 7I). This experiment activated SuMVGAT+ neurons during a section of the forced swim assay and mice showed a robust shift to mobility after the stimulation of SuMVGAT+ neurons stopped. We did not carry out inhibition of SuMVGAT+ neurons in this manuscript. As the reviewer suggested, strong inhibition of local SuM neurons, including SUMVGLUT2+::POA neurons, could lead to rebound activity that may shift coping behaviors in confusing ways. We agree this is an interesting idea but do not have data to support the hypothesis further at this time.

      Reviewer 2

      (1) These are very difficult, small brain regions to hit, and it is commendable to take on the circuit under investigation here. However, there is no evidence throughout the manuscript that the authors are reliably hitting the targets and the spread is comparable across experiments, groups, etc., decreasing the significance of the current findings. There are no hit/virus spread maps presented for any data, and the representative images are cropped to avoid showing the brain regions lateral and dorsal to the target regions. In images where you can see the adjacent regions, there appears expression of cell bodies (such as Supp 6B), suggesting a lack of SuM specificity to the injections.

      We agree with the reviewer that the areas studied are small and technically challenging to hit. This was one of the driving motivations for using multiple tools in tandem to restrict the area targeted for stimulation. Approaches included using retrograde AAVs to express ChR2eYFP in SuMVGLUT2+::POA neurons, thereby restricting expression to VGLUT2+ neurons that project to the POA. Targeting was further limited by placement of the optic fiber over cell bodies in SuM. Thus, only neurons that are VGLUT2+, project to the POA, and were close enough to the fiber were activated by photostimulation. Regrettably, we were not able to compile images from mice where the fiber was misplaced, leading to loss of behavioral effects. We would have liked to provide that here to address this comment. Unfortunately, generating heat maps for injections is not possible for anatomic studies that use unlabeled recombinase as part of an intersectional approach. Also, the injection site of a retroAAV can be difficult to locate accurately because neurons remote to the injection site and their processes are labeled.

      Experiments described in Supplemental Figure 6B on VGAT neurons in SuM were designed and interpreted to support the point that SuMVGLUT2+::POA neurons are a distinct population that does not overlap with GABAergic neurons. For this point it is important that we targeted SuM, but highly confined targeting is not needed to support the central interpretation of the data. We do see labeling in SuM in VGAT-Cre mice, but photostimulation of SuMVGAT+ neurons does not generate the behavioral changes seen with activation of SuMVGLUT2+::POA neurons. As the reviewer points out, SuM is a small target and viral injection is likely to spread beyond the anatomic boundaries to other VGAT+ neurons in the region, which are not the focus here. The activation would be restricted by the spread of light from the fiber over SuM (estimated to be about a 200 µm sphere in all directions). We did not further examine projections or localization of VGAT+ neurons in this study but focused on the differential behavioral effects of SuMVGLUT2+::POA neurons.

      (2) In addition, the whole brain tracing is very valuable, but there is very little quantification of the tracing. As the tracing is the first several figures and supp figure and the basis for the interpretation of the behavior results, it is important to understand things including how robust the POA projection is compared to the collateral regions, etc. Just a rep image for each of the first two figures is insufficient, especially given the above issue raised. The combination of validation of the restricted expression of viruses, rep images, and quantified tracing would add rigor that made the behavioral effects have more significance.

      For example, in Fig 2, how can one be sure that the nature of the difference between the nonspecific anterograde glutamate neuron tracing and the Sum-POA glutamate neuron tracing is real when there is no quantification or validation of the hits and expression, nor any quantification showing the effects replicate across mice? It could be due to many factors, such as the spread up the tract of the injection in the nonspecific experiment resulting in the labeling of additional regions, etc.

      Relatedly, in Supp 4, why isn’t C normalized to DAPI, which they show, or area? Similar for G what is the mcherry coverage/expression, and why isn’t Fos normalized to that?

      Thank you for highlighting the importance and value of the anatomy. Two points based on the anatomic studies are central to our interpretation of the experimental data. First, SuMVGLUT2+::POA neurons are a distinct population within the SuM. We show this by demonstrating they are not GABAergic and that they do not project to dentate gyrus. Projections from SuM to dentate gyrus have been described in multiple studies (Boulland et al., 2009; Haglund et al., 1987; Hashimotodani et al., 2018; Vertes, 1992) and we demonstrate them here for SuMVGLUT2+ cells. Using an intersectional approach in VGLUT2-Flp mice we show SuMVGLUT2+::POA neurons do not project to dentate gyrus. We show cell bodies of SuMVGLUT2+::POA neurons located in SuM across multiple figures including clear brain images. Thus, SuMVGLUT2+::POA neurons are SuM neurons that are not GABAergic and that send projections to a distinct subset of targets, most notably excluding dentate gyrus. Second, SuMVGLUT2+::POA neurons arborize, sending projections to multiple regions. We show this using a combinatorial genetic and viral approach to restrict expression of eYFP to only neurons that are in SuM (based on viral injection), project to the POA (based on retrograde AAV injection in POA), and are VGLUT2+ (VGLUT2-Flp mice). Thus, any eYFP labeled projection comes from SuMVGLUT2+::POA neurons. We further confirmed projections using retroAAV injection into areas identified using anterograde approaches (Supplemental Figure 2). As discussed above in replies to Reviewer 1, we feel limitations are present that preclude meaningful quantitative analysis. We thus opted for a conservative interpretation as outlined.

      Prior studies have shown efferent projections from SuM to many areas, and projections to dentate gyrus have received substantial attention (Boulland et al., 2009; Haglund, Swanson, and Kohler, 1984; Hashimotodani et al., 2018; Soussi et al., 2010; Vertes, 1992; Pan and McNaughton, 2004). We saw many of the same projections from SuMVGLUT2+ neurons. We found no projections from SuMVGLUT2+::POA neurons to dentate gyrus (Figure 2). Our description of the SuM projection to dentate gyrus is not new, but finding a population of neurons in SuM that does not project to dentate gyrus but does project to other regions in hippocampus is new. This finding cannot be explained by spread of the virus in the tract or non-selective labeling.

      (3) The authors state that they use male and female mice, but they do not describe the n’s for each experiment or address sex as a biological variable in the design here. As there are baseline sex differences in locomotion, stress responses, etc., these could easily factor into behavioral effects observed here.

      Sex-specific effects are possible; however, the studies presented here were not designed or powered to directly examine them. A point about experimental design that helps mitigate strong sex-dependent effects is that the paradigms we used often examined baseline (pre-stimulation) behavior, how behavior changed during stimulation, and how behavior returned (or not) to baseline after stimulation. Thus, we test changes in the behavior of individual animals. Although we had limited statistical power, we conducted analyses to examine the effects of sex as a variable in the experiments and found no differences between males and females.

      (4) In a similar vein as the above, the authors appear to use mice of different genotypes (however the exact genotypes and breeding strategy are not described) for their circuit manipulation studies without first validating that baseline behavioral expression, habituation, stress responses are not different. Therefore, it is unclear how to interpret the behavioral effects of circuit manipulation. For example in 7H, what would the VGLUT2-Cre mouse with control virus look like over time? Time is a confound for these behaviors, as mice often habituate to the task, and this varies from genotype to genotype. In Fig 8H, it looks like there may be some baseline differences between genotypes- what is normal food consumption like in these mice compared to each other? Do Cre+ mice just locomote and/or eat less? This issue exists across the figures and is related to issues of statistics, potential genotype differences, and other experimental design issues as described, as well as the question about the possibility of a general locomotor difference (vs only stress-induced). In addition, the authors use a control virus for the control groups in VGAT-Cre manipulation studies but do not explain the reasoning for the difference in approach.

      Thank you for highlighting the need for greater clarity about the breeding strategies used and for these related questions. We address the breeding strategy first and then turn to the additional concerns raised. We have added details to the methods section to address this point. For VGLUT2-Cre mice we used littermate controls from a Cre/WT x WT/WT cross. The VGLUT2-Cre line (RRID:IMSR_JAX:028863) (Vong L, et al. 2011) used here has been used in many other reports. We are not aware of any reports indicating a phenotype associated with the addition of the IRES-Cre to the Slc17a6 locus, and there is no expected impact on expression of VGLUT2. Also, we see in many of the experiments here that the baseline behaviors (Figures 4, 5, and 7) are not different between the Cre+ and Cre- mice. For VGAT-Cre mice we used a different breeding strategy that allowed us to achieve greater control of the composition of litters and more efficient cohorts. A Cre/Cre x WT/WT cross yielded all Cre/WT litters. The AAV injected, ChR2eYFP or eYFP, allowed us to balance the cohort.

      Regarding Figure 7H, which shows time immobile on the second day of a swim test, data from the Cre- mice demonstrate the natural course of progression during the second day of the test. The control mice in the VGAT-Cre cohort (Figure 7I) show a similar trend. The change in behavior during the stimulation period in the Cre+ mice is caused by the activation of SuMVGLUT2+::POA neurons. The behavioral shift largely, but not completely, returns to baseline when the photostimulation stops. We have no reason to believe a VGLUT2-Cre+ mouse injected with control AAV to express eYFP would be different from a WT littermate injected with AAV expressing ChR2eYFP in a Cre-dependent manner.

      Turning to concerns related to Figure 8H, which shows data from fasted mice quantifying time spent interacting with a chow pellet immediately after its presentation, we found no significant difference between the control and Cre+ mice. We are unaware of any evidence indicating that the two groups should have a different baseline, since the Cre insertion is not expected to alter gene expression and we are unaware of reports of a phenotype relating to feeding and the presence of the transgene in this mouse line. Even if there were a small baseline shift, this would not explain the large abrupt shift induced by the photostimulation. As noted above, we saw shifts in behavior abruptly induced by the initiation of photostimulation when compared to baseline in multiple experiments. This shift would not be explained by a hypothetical difference in the baseline behaviors of littermates.

      (5) The statistics used throughout are inappropriate. The authors use serial Mann-Whitney U tests without a description of data distributions within and across groups. Further, they do not use any overall F tests even though most of the data are presented with more than two bars on the same graph. Stats should be employed according to how the data are presented together on a graph. For example, stats for pre-stim, stim, and post-stim behavior X between Cre+ and Cre- groups should employ something like a two-way repeated measures ANOVA, with post-hoc comparisons following up on those effects and interactions. There are many instances in which one group changes over time or there could be overall main effects of genotype. Not only is serially using Mann-Whitney tests within the same panel misleading and statistically inaccurate, but it cherry-picks the comparisons to be made to avoid more complex results. It is difficult to comprehend the effects of the manipulations presented without more careful consideration of the appropriate options for statistical analysis.

      We thank the reviewer for pointing this out and suggesting alternative analyses; we agree with the assessment on this topic. Therefore, we have extensively revised the statistical approach to our data using the suggested approach. Reviewer 1 made a similar comment, and we would like to point to our reply to Reviewer 1’s second point regarding what we changed and added in the new statistical analyses. Further, we have added to the paper a full table detailing the statistical values for each figure.
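
      For readers unfamiliar with the design the reviewer suggests, the following is a minimal sketch of a two-way mixed (repeated measures) ANOVA with genotype as the between-subject factor and period (pre-stim, stim, post-stim) as the within-subject factor. It is an illustration only, not the analysis code used for the manuscript: the function name `mixed_anova`, the balanced-design restriction, and all numbers in `example` are assumptions made up for this sketch. It returns sums of squares, degrees of freedom, and F ratios; p-values would come from the F distribution with the listed degrees of freedom, typically via a statistics package.

```python
from statistics import mean


def mixed_anova(data):
    """Two-way mixed ANOVA for a balanced design (illustrative sketch).

    data maps each group label (between-subject factor, e.g. genotype) to a
    list of subjects; each subject is a list of repeated measurements
    (within-subject factor, e.g. pre-stim, stim, post-stim).
    """
    groups = list(data)
    a = len(groups)                # number of groups
    n = len(data[groups[0]])       # subjects per group
    b = len(data[groups[0]][0])    # repeated measurements per subject
    for g in groups:
        if len(data[g]) != n or any(len(s) != b for s in data[g]):
            raise ValueError("this sketch only handles balanced designs")

    grand = mean(x for g in groups for s in data[g] for x in s)
    g_mean = {g: mean(x for s in data[g] for x in s) for g in groups}
    p_mean = [mean(s[j] for g in groups for s in data[g]) for j in range(b)]
    cell = {g: [mean(s[j] for s in data[g]) for j in range(b)] for g in groups}

    # Between-subject partition: genotype effect and subjects within genotype.
    ss_group = n * b * sum((g_mean[g] - grand) ** 2 for g in groups)
    ss_subj = b * sum((mean(s) - g_mean[g]) ** 2
                      for g in groups for s in data[g])
    # Within-subject partition: period, genotype x period, and residual.
    ss_period = a * n * sum((m - grand) ** 2 for m in p_mean)
    ss_inter = n * sum((cell[g][j] - g_mean[g] - p_mean[j] + grand) ** 2
                       for g in groups for j in range(b))
    ss_error = sum((s[j] - cell[g][j] - mean(s) + g_mean[g]) ** 2
                   for g in groups for s in data[g] for j in range(b))

    df_group, df_subj = a - 1, a * (n - 1)
    df_period, df_inter = b - 1, (a - 1) * (b - 1)
    df_error = a * (n - 1) * (b - 1)
    ms_subj = ss_subj / df_subj      # error term for the genotype effect
    ms_error = ss_error / df_error   # error term for within-subject effects

    return {
        "genotype": {"ss": ss_group, "df": df_group,
                     "F": (ss_group / df_group) / ms_subj},
        "period": {"ss": ss_period, "df": df_period,
                   "F": (ss_period / df_period) / ms_error},
        "genotype x period": {"ss": ss_inter, "df": df_inter,
                              "F": (ss_inter / df_inter) / ms_error},
        "subjects within genotype": {"ss": ss_subj, "df": df_subj, "F": None},
        "residual": {"ss": ss_error, "df": df_error, "F": None},
    }


# Made-up values (one row per mouse: pre-stim, stim, post-stim).
example = {
    "Cre+": [[10.0, 2.0, 9.0], [12.0, 3.0, 10.0], [11.0, 4.0, 11.0]],
    "Cre-": [[10.0, 10.0, 9.0], [12.0, 11.0, 11.0], [11.0, 12.0, 10.0]],
}
result = mixed_anova(example)
for effect, row in result.items():
    print(effect, row)
```

      A significant genotype x period interaction is the term that captures a stimulation effect specific to Cre+ mice; post-hoc comparisons would then follow up on that interaction rather than being chosen panel by panel.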


      (6) What does the signal look like at the terminals in the POA? Any suggestion from the data that the projection to the POA is important?

      This is an interesting question that we will pursue in future investigations into the roles of the POA. We used the projection to the POA from SuM to identify a subpopulation in SuM, and we were surprised to find the extensive arborization of these neurons to many areas associated with threat responses. We focused on the cell bodies as “hubs” with many “spokes”. Extensive studies are needed to understand the roles of individual projections and their targets. There is also the hypothetical technical challenge of manipulating one projection without activating retrograde propagation of action potentials to the soma. At the current time we have no specific insights into the roles of the isolated projection to POA. Interpretation of experiments activating only one “spoke” of the hub would be challenging. Simple terminal stimulation experiments are challenged by the need to separate POA projections from activation of passing fibers targeting more anterior structures of the accumbens and septum.

      (7) Is this distinguishing active coping behavior without a locomotor phenotype? For example, Fig. 5I and other figure panels show a distance effect of stimulation (but see issues raised about the genotype of comparison groups). In addition, locomotor behavior is not included for many behaviors, so it is hard to completely buy the interpretation presented.

      We agree with the reviewer and thank them for highlighting this fundamental challenge in studies examining active coping behaviors in rodents, which require movement. Additionally, actively responding to threatening stressors would include increased locomotor activity. Separation of movement alone from active coping can be challenging. Because of these concerns we undertook experiments using diverse behavioral paradigms to examine the elicited behaviors and the recruitment of SuMVGLUT2+::POA neurons to stressors. We conducted experiments to directly examine behaviors evoked by photoactivation of SuMVGLUT2+::POA neurons. In these experiments we observed a diversity of behaviors including increased locomotion and jumping but also treading/digging (Figure 4). These are behaviors elicited in mice by threatening and noxious stimuli. An increase in running or jumping alone could signify a specific locomotor effect, but this is not what was observed. Based on these behaviors, we expected to find evidence of increased movement in open field (Figure 5G-I) and light dark choice (Figure 5J-L) assays. For many of the assays, reporting distance traveled is not practical. An important set of experiments that argues against a generic increase in locomotion is the operant behavior experiments, which require the animal to engage in a learned behavior while receiving photostimulation of SuMVGLUT2+::POA neurons (Figure 6). This is particularly true for testing using a progressive ratio when the time of ongoing photostimulation is longer, yet animals actively and selectively engage the active port (Figure 6G-H). Further, we saw a shift in behavioral strategy induced by photoactivation in the forced swim test (Figure 7H). Thus, activation of SuMVGLUT2+::POA neurons elicited a range of behaviors that included swimming, jumping, treading, and learned responses, not just increased movement.
Together these data strongly argue that SuMVGLUT2+::POA neurons do not only promote increased locomotor behavior. We interpret these data together with the data from fiber photometry studies to show SuMVGLUT2+::POA neurons are recruited during acute stressors, contribute to the aversive affective component of stress, and promote active behaviors without constraining the behavioral pattern.

      Regarding genotype, we address this in comments above as well but believe that clarifying the use of litter mates, the extensive use of the VGLUT2-Cre line by multiple groups, and experimental design allowing for comparison to baseline, stimulation evoked, and post stimulation behaviors within and across genotypes mitigate possible concerns relating to the genotype.

      (8) What is the role of GABA neurons in the SuM and how does this relate to their function and interaction with glutamate neurons? In Supp 8, GABA neuron activation also modulates locomotion and in Fig 7 there is an effect on immobility, so this seems pretty important for the overall interpretation and should probably be mentioned in the abstract.

      Thank you for noting these interesting findings. We added text to the abstract to highlight these findings. Possible roles of GABAergic neurons in SuM extend beyond the scope of the current study, particularly since SuM neurons have been shown to release both GABA and glutamate (Li Y, Bao H, Luo Y, et al. 2020; Root DH, Zhang S, Barker DJ et al. 2018). GABAergic neurons regulate dentate gyrus (Ajibola MI, Wu JW, Abdulmajeed WI, Lien CC 2021), REM sleep (Billwiller F, Renouard L, Clement O, Fort P, Luppi PH 2017), and novelty processing (Chen S, He L, Huang AJY, Boehringer R et al. 2020). The populations of exclusively GABAergic vs dual neurotransmitter neurons in SuM require further dissection to be understood. How they may relate to SuMVGLUT2+::POA neurons requires further investigation.

      Questions about figure presentation:

      (9) In Fig 3, why are heat maps shown as a single animal for the first couple and a group average for the others?

      Thank you for highlighting this point for further clarification. We modified the labels in the figure to help make clear which panels are from one animal across multiple trials and which are from multiple animals. In the ambush assay each animal had only one trial, to avoid habituation to the mock predator. Accordingly, we do not have multiple trials for each animal in this test. In contrast, the dunk assay (10 trials/animal) and the shock assay (5 trials/animal) had multiple trials for each animal. When there are multiple trials per animal, we present data from a representative animal together with the aggregate data.

      Why is the temporal resolution for J and K different even though the time scale shown is the same?

      Thank you for noticing this error carried forward from a prior draft of the figure so we could correct it. We replaced the image in 3J with a more correctly scaled heatmap.

      What is the evidence that these signal changes are not due to movement per se?

      Thank you for the question. There are two points of evidence. First, all the 465 nm excitation (Ca2+ dependent) data were collected in interleaved fashion with 415 nm (isosbestic) excitation data. The isosbestic signal is derived from GCaMP emission but is independent of Ca2+ binding (Martianova E, Aronson S, Proulx CD. 2019). This approach, time-division multiplexing, allows the calcium-dependent signal to be corrected for changes that are most often due to mechanical changes. The second piece of evidence is experimental. Using multiple cohorts of mice, we examined if the change in Ca2+ signal was correlated with movement. We used the threshold of velocity of movement seen following the ambush. We found no correlation between high velocity movements and Ca2+ signal (Figure 3K), including in cross-correlational analysis (Supplemental Figure 5). Based on these points together we conclude the change in the Ca2+ signal in SuMVGLUT2+::POA neurons is not due to movement-induced mechanical changes, and we find no correlation to movement unless a stressor is present, i.e. mock predator ambush or forced swim. Further, the stressors evoke very different locomotor responses: fleeing, jumping, or swimming.
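
      To make the logic of the isosbestic correction concrete, here is a minimal sketch of one common way such a correction is done: the 415 nm control trace is fitted to the 465 nm trace by ordinary least squares, and dF/F is computed relative to the fitted control. This is an assumption-laden illustration, not the processing pipeline used in the manuscript or the exact procedure of the cited reference; the function names `fit_line` and `motion_corrected_dff` and the synthetic traces are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def motion_corrected_dff(signal_465, control_415):
    """dF/F of the Ca2+-dependent trace relative to the fitted control trace."""
    slope, intercept = fit_line(control_415, signal_465)
    fitted = [slope * c + intercept for c in control_415]
    return [(s - f) / f for s, f in zip(signal_465, fitted)]


# Synthetic traces (made up): a movement artifact at index 2 dims BOTH
# channels, while a Ca2+ transient at index 5 appears only in the 465 nm channel.
control_415 = [1.0, 1.0, 0.8, 1.0, 1.0, 1.0, 1.0, 1.0]
signal_465 = [2.0, 2.0, 1.6, 2.0, 2.0, 2.6, 2.0, 2.0]

dff = motion_corrected_dff(signal_465, control_415)
print([round(v, 3) for v in dff])
```

      In this toy example the shared artifact is removed by the fit, whereas the transient that is present only in the Ca2+-dependent channel survives as a positive dF/F deflection.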

      (10) In Fig 4, the authors carefully code various behaviors in mice. While they pick a few and show them as bars, they do not show the distribution of behaviors in Cre- vs Cre+ mice before manipulation (to show they have similar behaviors) or how these behaviors shift categories in each group with stimulation. Which behaviors in each group are shifting to others across the stim and post-stim periods compared to pre-stim?

      This is an important point. We selected behaviors to highlight in Figure 4C-E because these behaviors are exhibited in response to stress (De Boer & Koolhaas, 2003; van Erp et al., 1994). For the highlighted behaviors (jumping, treading/digging, and grooming) we show baseline (pre-photostimulation), stimulation, and post-stimulation periods for Cre+ and Cre- mice with the values for each animal plotted. We show all nine behaviors as a heat map in Figure 4B. The panels show changes that may occur as a function of time and show changes induced by photostimulation.

      The heatmaps demonstrate that photostimulation of SUMVGLUT2+::POA neurons causes a suppression of walking, grooming, and immobile behaviors with an increase in jumping, digging/treading, and rapid locomotion. After stimulation stops, there is an increase in grooming and time immobile. The control mice show a range of behaviors with no shifts noted with the onset or termination of photostimulation.

      Of note, issues of statistics, genotype, and SABV are important here. For example, the hint that treading/digging may have a slightly different pre-stim basal expression, it seems important to first evaluate strain and sex differences before interpreting these data.

      We examined the effects of sex as a biological variable in the experiments reported in the manuscript and found no differences between males and females in any of the experiments where we had enough animals of each sex (minimum of 5 mice) for meaningful comparisons. We did this by comparing means and SEM of males and females within each group (e.g. Cre+ males vs Cre+ females, Cre- males vs Cre- females) and then conducting a t-test to see if there was a difference. For figures that show time as a variable (e.g. Figure 6C-E), we compared males and females with time and sex as main factors (including multiple comparisons if needed). We found no significant main effects of sex or sex-by-time interactions. Because of this, and to maximize statistical power, we decided to keep males and females together in all the analyses presented in the manuscript. It is worth noting also that the core of the experimental design employed is a change in behavior caused by photostimulation. The mice are also the same strain, with the only difference being the modification to add an IRES and sequence for Cre behind the coding sequence of the Slc17a6 (VGLUT2) gene.

      (11) Why do the authors use 10 Hz stimulation primarily? is this a physiologically relevant stim frequency? They show that they get effects with 1 Hz, which can be quite different in terms of plasticity compared to 10 Hz.

      Thank you for raising this important question. Because tests like open field and forced swim are subject to habituation and cannot be run multiple times per animal, a single test frequency was needed for consistency across multiple experiments. The frequency of 10 Hz was selected because it falls within the range of reported firing rates for SuM neurons (Farrell et al., 2021; Pedersen et al., 2017) and because of the robust but submaximal effects seen in the real-time place preference assays. Identification of the native firing rates during stress response would be ideal, but gathering these data for the identified population remains a daunting task.

      (12) In Fig 5A-F, it is unclear whether locomotion differences are playing a role. Entrances (which are low for both groups) are shown but distance traveled or velocity are not.

      In B, there is no color in the lower left panel. Where are these mice spending their time? How is the entirety of the upper left panel brighter than the lower left? If the heat map is based on time distribution during the session, there should be more color in between blue and red in the lower left when you start to lose the red hot spots in the upper left, for example. That is, the mice have to be somewhere in the apparatus. If the heat map is based on distance, it would seem the Cre- mice move less during the stim.

      We appreciate the opportunity to address this question, and the attention to detail the reviewer applied to our paper. In the real-time place preference test (RTPP), stimulation was only provided while the animal was on the stimulation side. Mice quickly leave the stimulation side of the arena, as seen in the supplemental video, particularly at the higher frequencies. Thus, the time during which stimulation is applied is quite low. During trials using higher-frequency stimulation, the mice often retreat to a corner after entering the stimulation side. Changes in locomotor activity alone could drive changes in the number of entrances, but we did not find this. In regard to the heat map, the color scale is set dynamically for each of the paired examples that are pulled from a single trial. To maximize the visibility of differences within each pair, the color scale does not transfer between trials. As a result, in the example for 10 Hz the mouse spent a larger amount of time in the area corresponding to the lower right corner of the image, and the maximum value of the color scale is assigned to that region. As seen in the supplemental video, mice often retreated to the corner of the non-stimulation side after entering the stimulation side. The control animal did not spend a concentrated amount of time in any one region; thus there is a lack of warmer colors. In contrast, in the baseline condition both Cre+ and Cre- mice spent time in areas distributed across both sides of the arena, as expected. As a result, the maximum value in the heat map is lower and more areas are coded in warmer colors, allowing for easier visual comparison within the pair. Using the scale for the 10 Hz pair across all pairs leads to mostly dark images. We considered ways to optimize visualization across and within pairs and focused on the within-pair comparison.
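
      The effect of per-pair versus shared color scaling can be illustrated with a small sketch. The occupancy grids and function names below are hypothetical and are not taken from the tracking software or the data in Figure 5B; they only show why a single hot spot in one panel pushes the rest of the pair toward cooler colors.

```python
def grid_max(occupancy):
    """Largest time-in-bin value in a 2D occupancy grid."""
    return max(max(row) for row in occupancy)


def normalize(occupancy, vmax):
    """Scale time-in-bin values to [0, 1] given the color-scale maximum."""
    return [[min(v / vmax, 1.0) for v in row] for row in occupancy]


def pair_scaled(pair):
    """Color scale set from the maximum within one Cre+/Cre- pair."""
    vmax = max(grid_max(g) for g in pair.values())
    return {name: normalize(g, vmax) for name, g in pair.items()}


def globally_scaled(pairs):
    """Color scale shared across all pairs (the alternative that was not used)."""
    vmax = max(grid_max(g) for pair in pairs for g in pair.values())
    return [{name: normalize(g, vmax) for name, g in pair.items()} for pair in pairs]


# Hypothetical time-in-bin values in seconds; not data from the study.
baseline_pair = {"cre_pos": [[30, 25], [28, 32]], "cre_neg": [[27, 31], [29, 26]]}
stim_pair = {"cre_pos": [[2, 5], [10, 300]], "cre_neg": [[80, 75], [85, 77]]}

within = pair_scaled(stim_pair)          # control panel stays below 0.3 everywhere
shared = globally_scaled([baseline_pair, stim_pair])  # baseline panels become dark
```

      With a shared scale, the baseline panels in this toy example never exceed about 0.11 of the color range, which corresponds to the mostly dark images mentioned above.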

      (13) By starting with 1 Hz, are the experimenters inducing LTD in the circuit? What would happen if you stop stimming after the first epoch? Would the behavioral effect continue? What does the heat map for the 1 Hz stim look like?

      Relatedly, it is a lot of consistent stimulation over time and you likely would get glutamate depletion without a break in the stim for that long.

      Thank you for the opportunity to add clarity around this point regarding the trials in RTPP testing. Importantly, the trials were not carried out in order of increasing frequency of stimulation, as plotted. Rather, the order of trials was, to the extent possible with the number of mice, counterbalanced across the five conditions. Thus, possible carry-over effects of one trial on the next were minimized by varying the order of the trials.
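
      One common way to counterbalance trial order is a cyclic Latin square, sketched below. This is an assumption for illustration: the exact assignment scheme used in the study is not described here, and the condition labels and mouse identifiers are placeholders.

```python
def latin_square_orders(conditions):
    """Cyclic Latin square: each condition appears once in every position."""
    n = len(conditions)
    return [[conditions[(start + i) % n] for i in range(n)] for start in range(n)]


def assign_orders(mouse_ids, conditions):
    """Cycle through the Latin-square rows so orders are spread evenly across mice."""
    orders = latin_square_orders(conditions)
    return {mouse: orders[i % len(orders)] for i, mouse in enumerate(mouse_ids)}


# Placeholder labels for the five RTPP conditions and for seven mice.
conditions = ["A", "B", "C", "D", "E"]
mice = ["m1", "m2", "m3", "m4", "m5", "m6", "m7"]

orders = latin_square_orders(conditions)
assignment = assign_orders(mice, conditions)
```

      When the number of mice is not a multiple of the number of conditions, as in this toy example, the balance is only approximate, which matches the "to the extent possible" caveat above.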

      We have added a heat map for the 1 Hz condition to Figure 5B.

      For experiments on RTPP, the average stimulation time at 10 Hz was less than 10 seconds per event. As a result, the data are unlikely to be affected by possible depletion of synaptic glutamate. For experiments using sustained stimulation (open field or light-dark choice assays), where 10 Hz stimulation was applied for the entire trial, we have no clear data to address whether this might be a factor.

      (14) In Fig 6, the authors show that the Cre- mice just don't do the task, so it is unclear what the utility of the rest of the figure is (such as the PR part). Relatedly, the pause is dependent on the activation, so isn't C just the same as D? In G and H, why is a subset of Cre+ mice shown?

      Why not all mice, including Cre- mice?

      Thank you for the opportunity to improve the clarity of this section. A central aspect of the experiments in Figure 6 is the aversiveness of SuMVGLUT2+::POA neuron photostimulation, as shown in Figure 5B-F. The aversion to photostimulation drives task performance in the negative reinforcer paradigm. The mice perform a task (active port activation) to terminate the negative reinforcer (photostimulation of SuMVGLUT2+::POA neurons). Accordingly, control mice are not expected to perform the task because SuMVGLUT2+::POA neurons are not activated and thus the mice are not motivated to perform the task.

      A central point we aim to convey in this figure is that while SuMVGLUT2+::POA neurons are being stimulated, mice perform the operant task. They selectively activated the active port (Supplemental Figure 7). As expected, control mice activate the active port at a low level in the process of exploring the arena. This diminishes on subsequent trials as mice habituate to the arena (Figure 6D). The data in Figures 6 C and D are related but can diverge. In an FR1 test, each pause in stimulation requires one port activation, but the number of port activations can exceed the number of pauses, which are 10 seconds long, if the animal continues to activate the port. Comparing the data in Figures 6 C and D reveals that mice generally activated the port two to three times for each pause earned, with a trend towards greater efficiency on day 4, with more rewards and fewer activations.
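
      The relationship between port activations and pauses earned can be sketched as follows. The timestamps are hypothetical, and the rule that activations during an ongoing pause do not earn a new pause is an assumption made for this illustration; only the 10-second pause length comes from the description above.

```python
PAUSE_S = 10.0  # pause length in seconds, as stated in the response


def count_pauses(activation_times, pause_s=PAUSE_S):
    """Count pauses earned under FR1.

    Assumes an activation made while stimulation is already paused
    is recorded but does not earn an additional pause.
    """
    pauses = 0
    pause_end = float("-inf")
    for t in sorted(activation_times):
        if t >= pause_end:
            pauses += 1
            pause_end = t + pause_s
    return pauses


# Hypothetical activation timestamps in seconds; not data from the study.
activation_times = [5.0, 7.0, 9.0, 20.0, 21.0, 40.0]
pauses_earned = count_pauses(activation_times)
activations_per_pause = len(activation_times) / pauses_earned
```

      In this toy session, six activations earn three pauses, so the two measures differ by a factor of two even though every pause was triggered by an activation.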

      The purpose of the progressive ratio test is to examine whether photostimulation of SuMVGLUT2+::POA neurons continues to drive behavior as the effort required to terminate the negative stimulus increases. As seen in Figures 6 G and H, the stimulation of SuMVGLUT2+::POA neurons remains highly motivating. In the 20-minute trial we did not find a break point even as the number of port activations required to pause the stimulation exceeded 50. We do not show the Cre- mice in Figures 6 G and H because they did not perform the task, as seen in Figure 6F. For technical reasons in early trials, we have fully time-stamped data for rewards and port activations from only a subset of the Cre+ mice. Of note, this subset contains both the highest and lowest performing mice from the entire data set.

      Taken together, we interpret the results of the operant behavioral testing as demonstrating that SuMVGLUT2+::POA neuron activation is aversive, can drive performance of an operant task (as opposed to fixed escape behaviors), and is highly motivating.

      (15) In Fig 7, what does the GCaMP signal look like if aligned to the onset of immobility? It looks like since the hindpaw swimming is short and seems to precede immobility, and the increase in the signal is ramping up at the onset of hindpaw swimming, it may be that the calcium signal is aligned with the onset of immobility.

      What does it look like for swimming onset?

      In I, what is the temporal resolution for the decrease in immobility? Does it start prior to the termination of the stim, or does it require some elapsed time after the termination, etc?

      Thank you for the opportunity to address these points and improve the clarity of our interpretation of the data. Regarding aligning the Ca2+ signal from fiber photometry recordings to swimming onset and offset, it is important to note that the swimming bouts are not the same length. As a result, in the time prior to alignment to the offset of behaviors, animals will have been swimming for different lengths of time. In Figure 7 C, we use the behavioral heat map to convey the behavioral average. Below we show the Ca2+-dependent signal aligned at the offset of hindpaw swimming for an individual mouse (A) and for the total cohort (B). This alignment shows that the Ca2+-dependent signal declines in correspondence with the termination of hindpaw swimming. Because these bouts last less than the total window shown, the data are largely included in Figure 7 C and D, which are aligned to onset. Due to the nuance of the difference in the alignment and the partial redundancy, we elected to include the requested alignment to swimming offset in the reply rather than in the primary figure.

      Author response image 1.
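
      The offset alignment described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code used for the response image: the signal is a toy trace, event positions are given as sample indices, and the function names are hypothetical.

```python
def aligned_windows(signal, event_indices, pre, post):
    """Cut a window of `pre` samples before and `post` samples after each event."""
    windows = []
    for idx in event_indices:
        start, stop = idx - pre, idx + post
        if start < 0 or stop > len(signal):
            continue  # skip events too close to the edges of the recording
        windows.append(signal[start:stop])
    return windows


def average_trace(windows):
    """Sample-by-sample mean across windows; empty input gives an empty trace."""
    if not windows:
        return []
    n = len(windows)
    return [sum(column) / n for column in zip(*windows)]


# Toy trace: signal is high during two bouts of different length, low afterwards.
signal = [1.0] * 10 + [0.0] * 10 + [1.0] * 5 + [0.0] * 10
bout_offsets = [10, 25]  # hypothetical sample indices where each bout ends

windows = aligned_windows(signal, bout_offsets, pre=3, post=3)
mean_trace = average_trace(windows)
```

      Because the two bouts differ in length, the same samples would fall at different positions in an onset-aligned average, which is the nuance noted above.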

      Turning to the question regarding swimming onset, the animals started swimming immediately when placed in the water and maintained swimming and climbing behaviors until shifting behaviors as illustrated in Figure 7A and B. During this time the Ca2+-dependent signal was elevated but there is only one trial per animal. This question can perhaps be better addressed in the dunk assay presented in Figure 3C, F and G and Supplemental Figure 4 H and I. Here swimming started with each dunk and the Ca2+ signal increased.

      Regarding the question about Figure 7I, we scored entire periods (2 min) in aggregate. We noted in videos of the behavior test that there was an abrupt decrease in immobility tightly corresponding to the end of stimulation. In a few animals this shift occurred approximately 15-20 s before the end of stimulation. This may relate to the depletion of neurotransmitter, as suggested by the reviewer.

      Reviewer 3

      Major points

      (1) Results in Figure 1 suggested that SuM-Vglu2::POA neurons projected not only to the POA but also to diverse other brain regions. We can think of two models which account for this. One is that a homogeneous population of SuM-Vglu2::POA neurons has collaterals and innervates all the efferent targets shown in Figure 1. Another is that distinct subpopulations of neurons project to subsets of the efferent targets shown in Figure 1 as well as to the POA. It is suggested to address this by combining the approaches taken in the experiments for Figure 1 and Supplemental Figure 2.

      Thank you for raising this interesting point. We have attempted combining retroAAV injections into multiple areas that receive projections from SuMVGLUT2+::POA neurons. However, we have found the results unsatisfactory for separating the two models proposed. Using eYFP- and tdTomato-expressing retroAAVs, we saw some overlapping expression in SuM. We are not able to conclude whether this indicates separate populations or partial labeling of a homogeneous population. A third option seems possible as well. There could be a mix of neurons projecting to different combinations of downstream targets. This seems particularly difficult to address using fluorophores. We are preparing to apply additional methodologies to this question, but it extends beyond the scope of this manuscript.

      (2) Since the authors drew a hypothetical model in which the diverse brain regions mediate, at least in part, the effect of SuM-Vglu2::POA activation on behavioral alterations, examination of the concurrent activation of those brain regions upon photoactivation of SuM-Vglu2::POA neurons is suggested. This would help the readers to understand which neural circuits act upon the induction of active coping behavior under stress.

      Thank you for raising this important point. We agree that activating glutamatergic neurons should lead to activation of postsynaptic neurons in the target regions. Delineating this in vivo is less straightforward. Doing so requires much greater knowledge of the postsynaptic partners of SuMVGLUT2+::POA neurons. There are a number of issues that would need to be accounted for. Undertaking two-color photostimulation plus fiber photometry is possible but not technically trivial. Further, it is possible that we would measure Ca2+ signals in neurons that have no relevant input, or that local circuits in a region may shape the signal. We would also lack the temporal resolution to distinguish monosynaptic from polysynaptic connections. Thus, we would struggle to know if the change in signal was due to the excitatory input from SuM or from a second region. At present, we remain unclear on how to pursue this question experimentally in a manner that is likely to generate clearly interpretable results.

      (3) In Figure 4, "active coping behaviors" must be called "behaviors relevant to the active behaviors" or "active coping-like behaviors", since those behaviors were in the absence of stressors to cope with.

      Thank you for the suggestion on how to clarify our terminology. We have adopted the active coping-like term.

      (4) For the Dunk test, it is suggested to describe the results and methods in more detail, since the readers would be new to it. In particular, the mice could change their behavior between dunks under this test, although they still showed immobility across trials as in Supplemental Figure 4I. Since neural activity during the test was summarized across trials as in Figure 3, it is critical to examine whether the behavior changes over time.

      Thank you for identifying this opportunity to improve our manuscript. We have expanded and added a detailed description of the dunk test in the methods section.

      As for Supplemental Figure 4I, we apologize for the confusion because the purpose of this figure is to show that mice remained mobile for the entire 30-second dunk trial. This did not appreciably change over the 10 trials. We have revised this figure to plot both immobile and mobile time to achieve greater clarity on this point.

      Minor points


      In Figure 1, please add a serotype of AAVs to make it compatible with other figures and their legends.

      In the main text and Figure 2K, the authors used MHb/LHb and mHb/lHb in a mixed fashion. Please make them unified.

      In the figure legend of Figure 6, change "SuMVGLUT2+::POA neurons drive" to "SuMVGLUT2+::POA neurons " in the title.

      In line 86, please change "Retro-AAV2-Nuc-flox(mCherry)-eGFP" to "AAV5-Nuc-flox(mCherry)eGFP".

      In line 80, please change "Positive controls" to "As positive controls, ".

      Thank you for taking the time and making the effort to identify and call these out. We have corrected them.

    2. eLife assessment

      This important manuscript investigates the role of a subpopulation of glutamatergic neurons in the supramammillary nucleus that projects to the pre-optic hypothalamus area in active coping but not locomotor activity. They provide solid evidence from experiments using fibre photometry or photostimulation during threatening tasks that these neurons allow animals to produce flexible behaviours in response to stress. This work will be of interest to behavioural and systems neuroscientists.

    3. Joint Public Review:


      This important manuscript investigates a subpopulation of glutamatergic neurons in the supramammillary nucleus that projects to the pre-optic hypothalamus area (SuM-VGLUT2+::POA). First, they define the neural circuitry of these neurons, which contact many stress/threat-associated brain regions. Then they employ fibre photometry to measure the activity of these neurons during various threatening tasks and find the responses correlate well with threat stimuli. Finally, they stimulate these neurons and find multiple lines of evidence that mice find this aversive and will act to avoid receiving this stimulation. In sum, they provide solid evidence that this neuronal population represents a new node in stress response circuitry that allows the animal to produce flexible behaviours in response to stress, which will be of interest to neuroscientists across several sub-fields.


      Overall this is a solid manuscript tackling an important question. Coping with stress by an animal in danger is essential for survival. This manuscript identifies a novel population of neurons in the murine supramammillary nucleus (SuM) projecting to the pre-optic hypothalamus area among other regions that is involved in this important process. The evidence to support the conclusions is solid.

      Specific strengths:

      • The topic is novel.

      • The manuscript follows a logical structure and neatly moves through the central story. Several potential alternate interpretations are well-controlled for.

      • The manuscript employs an array of different tasks to provide converging evidence for their conclusions.

      • The authors provide excellent evidence of the specificity of the function of this neuronal population, both from anatomical studies and from behavioural studies (e.g. demonstrating that activity of gabaergic neurons in the same region does not correlate with behaviours in the same way).

      • The study is well-powered (sample sizes are good) and the effects are convincing.


      * Not all of the reviewer comments were addressed in the manuscript itself, although this was acknowledged in the authors' responses to reviewers. One key example is as follows:

      * The authors did not entirely address comments related to rigor but they at least acknowledged it. For example, in multiple places they argue that WT, purchased mice are probably not different in baseline behavior compared to Vglut2-IRES-Cre because it is unlikely that adding the IRES-Cre will change behavior. However, they do not acknowledge that transgenic lines are not from the exact same genetic background and generation number, and there is ample evidence in the literature that transgenic mice on a B6J background can differ in basal phenotypes from one another and B6J. In one place they show some basal behavior, at least in heat map form though not quantified. Had the authors decided to apply this more pervasively, it would have made the story even more compelling in terms of a stress/threat-induced phenotype.

      Comments on revised version from the Reviewing Editor:

      The authors have done a thorough job of answering the reviewer queries, and a good job of explaining why they have not answered a particular point. Indeed, there is so much additional information in response to the reviewers that I hope readers of the manuscript will read the reviews and responses as well! I think they add a lot.

    1. Reviewer #1 (Public Review):

      The manuscript investigates the role of the membrane-deforming cytoskeletal regulator protein Abba in cortical development and its potential implications for microcephaly. It is a valuable contribution to the understanding of Abba's role in cortical development. The strengths and weaknesses identified in the manuscript are outlined below:

      Clinical Relevance:

      The authors identified a patient with microcephaly and a patient with an intellectual disability harboring a mutation in the Abba variant (R671W) adding a clinically relevant dimension to the study.

      Mechanistic Insights:

      The study offers valuable mechanistic insights into the development of microcephaly by elucidating the role of Abba in radial glial cell proliferation, radial fiber organization, and the migration of neuronal progenitors. The identification of Abba's involvement in the cleavage furrow during cell division, along with its interaction with Nedd9 and positive influence on RhoA activity, adds depth to our understanding of the molecular processes governing cortical development. Though the reported results establish the novel interaction between Abba and Nedd9, the authors have not addressed whether the mutant protein loses this interaction and whether that results in the observed effects.

      In Vivo Validation:

      The overexpression of mutant Abba protein (R671W) resulting in phenotypic similarities to Abba knockdown effects supports the significance of Abba in cortical development.

    2. Reviewer #2 (Public Review):


      Carabalona and colleagues investigated the role of the membrane-deforming cytoskeletal regulator protein Abba (MTSS1L/MTSS2) in cortical development to better understand the mechanisms of abnormal neural stem cell mitosis. The authors used short hairpin RNA targeting Abba20 with a fluorescent reporter coupled with in-utero electroporation of E14 mice to show changes to neural progenitors. They performed flow cytometry for in-depth cell cycle analysis of Abba-shRNA impact on neural progenitors and determined an accumulation in the S phase. Using cultured rat glioma cells and live imaging from cortical organotypic slices from mice in utero electroporated with Abba-shRNA, the authors found Abba played a prominent role in cytokinesis. They then used a yeast-two-hybrid screen to identify three high-confidence interactors: Beta-Trcp2, Nedd9, and Otx2. They used immunoprecipitation experiments from E18 cortical tissue coupled with C6 cells to show Abba's requirement for Nedd9 localization to the cleavage furrow/cytokinetic bridge. The authors performed a shRNA knockdown of Nedd9 by in-utero electroporation of E14 mice and observed similar results as with the Abba-shRNA. They tested a human variant of Abba using in-utero electroporation of cDNA and found disorganized radial glial fibers and misplaced, multipolar neurons, but did not observe the impact on cell division seen in the shRNA-Abba model.


      A fundamental question in biology about the mechanics of neural stem cell division.

      Directly connecting effects in Abba protein to downstream regulation of RhoA via Nedd9.

      Incorporation of human mutation in ABBA gene.

      Use of novel technologies in neurodevelopment and imaging.


      Unexplored components of the pathway (such as what neurogenic populations are impacted by Abba mutation) and unleveraged aspects of their data (such as the live imaging) limit the scope of their findings and leave significant questions about the effect of ABBA on radial glia development.

      (1) The claim of disorganized radial glial fibers lacks quantifications.

      On page 11, the authors claim that knockdown of Abba leads to changes in radial glial morphology observed with vimentin staining. Here they claim misoriented apical processes, detached end feet, and decreased number of RGP cells in the VZ. However, they do not provide quantification of process orientation to better support their first claim. Measurements of radial glia fiber morphology (directionality, length) and angle of division would be metrics that can be applied to data. Some of these analyses could be done in their time-lapse microscopy images, such as to quantify the number of cell divisions during their period of analysis (though that is short: 15 hours).

      (2) It is unclear where the effect is:

      -In RG or neuroblasts? Is it in cell cleavage that results in the accumulation of cells at VZ (as sometimes indicated by their data like in Figure 2A or 4D)? Interrogation of cell death (such as by cleaved caspase 3) would also help. Given their time-lapse, can they identify what is happening to the RG fiber? The authors describe a change in "migration" but do not show evidence for this for either progenitor or neuroblast populations. Given they have nice time-lapse imaging data, could they visualize progenitor versus young neuron migration? Analysis of neuroblasts (such as with doublecortin expression in the tissue) would also help understand any issues in migration (of neurons v stem cells).

      -At cleavage furrow? In abscission? There is high-resolution data that highlights the cleavage furrow as the location of interest (Figure 3A), however, there is also data (Figure 3B) to suggest Abba is expressed elsewhere as well and there is an overall soma decrease. More detail of the localization of Abba during the division process would be helpful for example, could cleavage furrow proteins, such as Aurora B, co-localization (and potentially co-IP) help delineate subpopulations of Abba protein? Furthermore, the FRET imaging is a unique way to connect their mutation with function - could they measure/quantify differences at furrow compared to the rest of soma to further corroborate that the Abba-associated RhoA effect was furrow-enriched?

      -The data highlights nicely that a furrow doesn't clearly form when ABBA expression and subsequent RhoA activity are decreased (in Figure 3 or 5A). Does this lead to cells that can't divide because of poor abscission, especially since "rounding" still occurs? Or abnormal progenitors (with loss of fiber or inability to support neuroblast migration)? Or abnormal progression of progenitors to neuroblasts?

      (3) Limited to a singular time point of mouse cortical development

      On page 13, the authors outline the results of their Y2H screen with the identification of three high-confidence interactors. Notably, they used an E10.5-E12.5 mouse brain embryo library rather than one that includes E14, the age of their in-utero electroporation mice. Many of the authors' claims focus on in-utero electroporation of shRNA-Abba of E14 mice that are then evaluated at E16-18. Justification for the focus on this age range should be included to support that their findings can then be applied to all mouse corticogenesis.

      (4) Detail of the effect of the human variant of the ABBA mutation in mice is lacking.

      Their identification of the R671W mutation is interesting and the IUE model warrants more characterization, as they did with their original KD experiments.

      -Could they show that Abba protein levels are decreased (in either cell lines or electroporated tissue)?

      -While time-lapse morphology might not have been performed, more analysis on cell division phenotype (such as plane of division and radial glia morphology) would be helpful.

    1. Reviewer #1 (Public Review):


      This work proposes a new method, DyNetCP, for inferring dynamic functional connectivity between neurons from spike data. DyNetCP is based on a neural network model with a two-stage model architecture of static and dynamic functional connectivity.

      This work evaluates the accuracy of the synaptic connectivity inference and shows that DyNetCP can infer the excitatory synaptic connectivity more accurately than a state-of-the-art model (GLMCC) by analyzing the simulated spike trains. Furthermore, it is shown that the inference results obtained by DyNetCP from large-scale in-vivo recordings are similar to the results obtained by the existing methods (jitter-corrected CCG and JPSTH). Finally, this work investigates the dynamic connectivity in the primary visual area VISp and in the visual areas using DyNetCP.


      The strength of the paper is that it proposes a method to extract the dynamics of functional connectivity from spike trains of multiple neurons. The method is potentially useful for analyzing parallel spike trains in general, as there are only a few methods (e.g. Aertsen et al., J. Neurophysiol., 1989, Shimazaki et al., PLoS Comput Biol 2012) that infer the dynamic connectivity from spikes. Furthermore, the approach of DyNetCP is different from the existing methods: while the proposed method is based on the neural network, the previous methods are based on either the descriptive statistics (JPSTH) or the Ising model.


      Although the paper proposes a new method, DyNetCP, for inferring the dynamic functional connectivity, its strengths are neither clear nor directly demonstrated in this paper. That is, insufficient analyses are performed to support the usefulness of DyNetCP.

      First, this paper attempts to show the superiority of DyNetCP by comparing the performance of synaptic connectivity inference with GLMCC (Figure 2). However, the improvement in the synaptic connectivity inference does not seem to be convincing. While this paper compares the performance of DyNetCP with a state-of-the-art method (GLMCC), there are several problems with the comparison. For example:

      (1) This paper focused only on excitatory connections (i.e., ignoring inhibitory neurons).

      (2) This paper does not compare with existing neural network-based methods (e.g., CoNNECT: Endo et al. Sci. Rep. 2021; Deep learning: Donner et al. bioRxiv, 2024).

      (3) Only a population of neurons generated from the Hodgkin-Huxley model was evaluated.

      Thus, the results in this paper are not sufficient to conclude the superiority of DyNetCP in the estimation of synaptic connections. In addition, this paper compares the proposed method with standard statistical methods, namely jitter-corrected CCG (Figure 3) and JPSTH (Figure 4). Unfortunately, these results do not show the superiority of the proposed method. They only show that the results obtained by the proposed method are consistent with those obtained by the existing methods (CCG or JPSTH).

      In summary, although DyNetCP has the potential to infer synaptic connections more accurately than existing methods, the paper does not provide sufficient analysis to make this claim. It is also unclear whether the proposed method is superior to the existing methods for estimating functional connectivity, such as jitter-corrected CCG and JPSTH. Thus, the strength of DyNetCP is unclear.

    2. eLife assessment

      This study presents a useful method for using multi-electrode spike recordings to track the time-varying functional connectivity between neurons. However, the evidence is incomplete: a demonstration of the utility of the method relative to conventional approaches is needed. If such a demonstration is made, this could be a tool for gaining insight into circuit structure.

    3. Reviewer #2 (Public Review):


      Here the authors describe a model for tracking time-varying coupling between neurons from multi-electrode spike recordings. Their approach extends a GLM with static coupling between neurons to include dynamic weights, learned by a long short-term memory (LSTM) model. Each connection has a corresponding LSTM embedding and is read out by a multi-layer perceptron to predict the time-varying weight.


      This is an interesting approach to an open problem in neural data analysis. I think, in general, the method would be interesting to computational neuroscientists.


      It is somewhat difficult to interpret what the model is doing. I think it would be worthwhile to add some additional results that make it more clear what types of patterns are being described and how.

      Major Issues:

      Simulation for dynamic connectivity. It certainly seems doable to simulate a recurrent spiking network whose weights change over time, and I think this would be a worthwhile validation for this DyNetCP model. In particular, I think it would be valuable to understand how much the model overfits, and how accurately it can track known changes in coupling strength. If the only goal is "smoothing" time-varying CCGs, there are much easier statistical methods to do this (cf. McKenzie et al., Neuron, 2021; Ren, Wei, Ghanbari, Stevenson, J Neurosci, 2022), and simulations could be useful to illustrate what the model adds beyond smoothing.

      Stimulus vs noise correlations. For studying correlations between neurons in sensory systems that are strongly driven by stimuli, it's common to use shuffling over trials to distinguish between stimulus correlations and "noise" correlations or putative synaptic connections. This would be a valuable comparison for Figure 5 to show if these are dynamic stimulus correlations or noise correlations. I would also suggest just plotting the CCGs calculated with a moving window to better illustrate how (and if) the dynamic weights differ from the data.
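
      The trial-shuffling comparison suggested above can be sketched as a shift-predictor correction. This is a minimal illustration with toy binned spike trains, not a reimplementation of DyNetCP or of the jitter correction used in the manuscript; the function names and the choice of shifting trials by one are assumptions.

```python
def cross_correlogram(a, b, max_lag):
    """Coincidence counts between binned trains a and b for lags -max_lag..max_lag."""
    n = len(a)
    ccg = []
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                total += a[t] * b[u]
        ccg.append(total)
    return ccg


def trial_summed_ccg(trials_a, trials_b, max_lag, shift=0):
    """Sum CCGs over trials, pairing trial i of a with trial i + shift of b."""
    n_trials = len(trials_a)
    summed = [0] * (2 * max_lag + 1)
    for i in range(n_trials):
        ccg = cross_correlogram(trials_a[i], trials_b[(i + shift) % n_trials], max_lag)
        summed = [s + c for s, c in zip(summed, ccg)]
    return summed


def shuffle_corrected_ccg(trials_a, trials_b, max_lag):
    """Raw CCG minus the shift predictor, which keeps only stimulus-locked structure."""
    raw = trial_summed_ccg(trials_a, trials_b, max_lag, shift=0)
    predictor = trial_summed_ccg(trials_a, trials_b, max_lag, shift=1)
    return [r - p for r, p in zip(raw, predictor)]


# Toy case 1: both neurons fire at fixed, stimulus-locked times on every trial.
locked_a = [[0, 1, 0, 0, 1, 0]] * 3
locked_b = [[0, 0, 1, 0, 0, 1]] * 3
locked_result = shuffle_corrected_ccg(locked_a, locked_b, max_lag=1)

# Toy case 2: spike times vary across trials, but b always follows a by one bin.
coupled_a = [[1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 1, 0]]
coupled_b = [[0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1]]
coupled_result = shuffle_corrected_ccg(coupled_a, coupled_b, max_lag=1)
```

      In the first toy case the corrected CCG is flat, because the structure is fully explained by the stimulus; in the second, the peak at lag +1 survives the correction. Applying the same functions to successive time windows within a trial would give the moving-window CCG mentioned above.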

    1. eLife assessment

      In this valuable study, the authors use Staphylococcus aureus to understand how organic acids inhibit bacterial growth. They provide convincing evidence that acetic acid specifically inhibits the activity of the Ddl enzyme and that S. aureus maintains a high intracellular D-ala concentration to circumvent acetate-mediated growth inhibition. This work will be of interest to researchers studying bacteria and antimicrobials.

    2. Reviewer #2 (Public Review):


      In this manuscript, using Staphylococcus aureus as a model organism, Panda et al. aim to understand how organic acids inhibit bacterial growth. Through careful characterization and interdisciplinary collaboration, the authors present valuable evidence that acetic acid specifically inhibits the activity of the Ddl enzyme, which converts two D-alanine amino acids into the D-ala-D-ala dipeptide that is then used to generate the stem pentapeptide of peptidoglycan (PG) precursors in the cytoplasm. Thus, a high concentration of acetic acid weakens the cell wall by limiting PG-crosslinking (which requires a D-ala portion). However, S. aureus maintains a high intracellular D-ala concentration to circumvent acetate-mediated growth inhibition.


      The authors utilized a well-established transposon mutant library to screen for mutants that struggle to grow in the presence of acetic acid. This screen allowed the authors to identify that a strain lacking intact alr1, which encodes alanine racemase (converts L-ala to D-ala), is unable to grow well in the presence of acetic acid. This phenotype is rescued by the addition of external D-ala. Next, the authors rule out the contribution of other pathways that could lead to the production of D-ala in the cell. Finally, by analyzing D-ala and D-ala-D-ala concentrations, as well as muropeptide intermediates accumulation in different mutants, the authors pinpoint Ddl as the specific target of acetic acid. In fact, the synthetic overexpression of ddl alone overcomes the toxic effects of acetic acid. Using genetics, biochemistry, and structural biology, the authors show that Ddl activity is specifically inhibited by acetic acid and likely by other biologically relevant organic acids. Interestingly, this mechanism is different from what has been reported for other organisms such as Escherichia coli (where methionine synthesis is affected). It remains to be seen if this mechanism is conserved in other organisms that are more closely related to S. aureus, such as Clostridioides difficile and Enterococcus faecalis.


      Although the authors have conclusively shown that Ddl is the target of acetic acid, it appears that the acetic acid concentration used in the experiments may not truly reflect the concentration range S. aureus would experience in its environment. Moreover, Ddl is only significantly inhibited at a very high acetate concentration (>400 mM). Thus, additional experiments showing growth phenotypes at lower organic acid concentrations may be beneficial. Another aspect not adequately discussed is the presence of D-ala in the gut environment, which may be protective against acetate toxicity based on the model provided.

    3. Reviewer #1 (Public Review):


      The manuscript entitled "Staphylococcus aureus counters organic acid anion-mediated inhibition of peptidoglycan cross-linking through robust alanine racemase activity" by Panda, S et al. reports an extensive biochemical analysis of the result from a Tn screen that identified alr1 as being required for acetic acid tolerance. In the end, they demonstrate that reduced D-Ala pools in the ∆alr1 mutant lead to a drastic reduction in D-Ala-D-Ala dipeptide. They show that this is due to the ability of organic acid anions to limit the D-Ala-D-Ala ligase enzyme Ddl. They demonstrate that:

      (1) Acetate exposure in the ∆alr1 mutant results in reduced D-Ala-D-Ala dipeptide, but not the monomers.

      (2) Acetate can bind to purified Ddl in vitro.

      (3) This binding results in reduced enzyme activity.

      (4) Other organic acid anions such as lactate, propionate, and itaconate can also inhibit Ddl.

      The experiments are clearly described and logically laid out. I have only a few minor comments to add.


      The most significant strength is the exceptional experimental data that supports the authors' hypotheses.


      Only minor weaknesses were identified by this reviewer.

      (1) Which allele is alr1, the one upstream of MazEF or the one in the Lysine biosynthetic operon?

      (2) Figure 3B. Where does the C3N2 species come from in the WT and why is it absent in the mutants? It is about 25% of the total dipeptide pool.

      (3) Figure 3D could perhaps be omitted. I understand that the authors attained statistical significance in the fitness defect, but biologically this difference is very minor. One would have to look at the isotopomer distribution in the Dat overexpressing strain to make sure that increased flux actually occurred since there are other means of affecting activity (e.g. allosteric modulators).

      (4) In Figure 4A, why is the complete subunit UDP-NAM-AEKAA increasing in each strain upon acetate challenge if there was such a stark reduction in D-Ala-D-Ala, particularly in the ∆alr1 mutant? For that matter, why are the levels of UDP-NAM-AEKAA in the ∆alr1 mutant identical to those of WT with or without acetate?

      (5) Figure 4B. Is there no significant difference between ddl and murF transcripts between WT and ∆alr1 under acetate stress? This comparison was not labeled if the tests were done.

      (6) Although tricky, it is possible to measure intracellular acetate. It might be of interest to know where in the Ddl inhibition curve the cells actually are.

    1. Reviewer #2 (Public Review):

      In this manuscript, the authors analyze the shapes of cerebral cortices from several primate species, including subgroups of young and old humans, to characterize commonalities in patterns of gyrification, cortical thickness, and cortical surface area. The authors state that the observed scaling law shares properties with fractals, where shape properties are similar across several spatial scales. One way the authors assess this is to perform a "cortical melting" operation that they have devised on surface models obtained from several primate species. The authors also explore differences in shape properties between brains of young (~20 year old) and old (~80) humans. A challenge the authors acknowledge struggling with in reviewing the manuscript is merging "complex mathematical concepts and a perplexing biological phenomenon." This reviewer remains a bit skeptical about whether the complexity of the mathematical concepts being drawn on is justified by the advances made in our ability to infer new things about the shape of the cerebral cortex.

      (1) The series of operations to coarse-grain the cortex illustrated in Figure 1 produces image segmentations that do not resemble real brains. The process to assign voxels in downsampled images to cortex and white matter is biased towards the former, as only 4 corners of a given voxel are needed to intersect the original pial surface, but all 8 corners are needed to be assigned a white matter voxel. The reason for introducing this bias (and the extent to which it is present in the authors' implementation) is not provided. The authors provide an intuitive explanation of why thickness relates to folding characteristics, but ultimately an issue for this reviewer is, e.g., for the right-most panel in Figure 2b, the cortex consists of several 4.9-sided voxels and thus a >2 cm thick cortex. A structure with these morphological properties is not consistent with the anatomical organization of typical mammalian neocortex.

      (2) For the comparison between 20-year-old and 80-year-old brains, a well-documented difference is that the older age group possesses more cerebrospinal fluid due to tissue atrophy, and the distances between the walls of gyri become greater. This difference is borne out in the left column of Figure 4b. It seems this additional spacing between gyri in 80 year olds requires more extensive down-sampling (larger scale values in Figure 4a) to achieve a similar shape parameter K as for the 20 year olds. The authors assert that K provides a more sensitive measure (associated with a large effect size) than currently used ones for distinguishing brains of young vs. old people. A more explicit, or elaborate, interpretation of the numbers produced in this manuscript, in terms of brain shape, might make this analysis more appealing to researchers in the aging field.

      (3) In the Discussion, it is stated that self-similarity, operating on all length scales, should be used as a test for existing and future models of gyrification mechanisms. Given the lack of association between the abstract mathematical parameters described in this study and explicit properties of brain tissue and its constituents, it is difficult to envision how the coarse-graining operation can be used to guide development of "models of cortical gyrification."

      (4) There are several who advocate for analyzing cortical mid-thickness surfaces, as the pial surface over-represents gyral tips compared to the bottoms of sulci in the surface area. The authors indicate that analyses of mid-thickness representations will be taken on in future work, but this seems to be a relevant control for accepting the conclusions of this manuscript.
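
      The asymmetry raised in point (1) can be illustrated with a minimal sketch. It encodes only the rule as this reviewer describes it (at least 4 of 8 corners inside the pial surface for a voxel to count as tissue, all 8 corners inside white matter for a white matter voxel). The function name and the boolean-corner representation are hypothetical, and the authors' actual implementation may differ.

```python
def classify_voxel(corners_in_pial, corners_in_white):
    """Classify one coarse voxel from its 8 corner tests.

    corners_in_pial:  8 booleans, True if the corner lies inside the pial surface.
    corners_in_white: 8 booleans, True if the corner lies inside the white matter surface.
    Rule (as described in the review, not confirmed against the authors' code):
      white matter needs all 8 corners, cortex needs only 4.
    """
    if len(corners_in_pial) != 8 or len(corners_in_white) != 8:
        raise ValueError("expected 8 corner flags per surface")
    if all(corners_in_white):
        return "white"
    if sum(corners_in_pial) >= 4:
        return "cortex"
    return "outside"

# A voxel that is almost entirely white matter is still labelled cortex.
mostly_white = classify_voxel([True] * 8, [True] * 7 + [False])
# A voxel that is only half inside the pial surface is also labelled cortex.
half_inside = classify_voxel([True] * 4 + [False] * 4, [False] * 8)
```

      Under this reading, both boundary cases resolve to cortex, which is one way the apparent cortical thickness could grow as the voxel size increases.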

    2. Reviewer #3 (Public Review):

      Summary: Through a rigorous methodology, the authors demonstrated that within 11 different primates, the shape of the brain followed a universal scaling law with fractal properties. They enhanced the universality of this result by showing the concordance of their results with a previous study investigating 70 mammalian brains, and the discordance of their results with other folded objects that are not brains. They incidentally illustrated potential applications of this fractal property of the brain by observing a scale-dependent effect of aging on the human brain.

      Strengths:
      - New hierarchical way of expressing cortical shapes at different scales, derived from a previous report through implementation of a coarse-graining procedure
      - Investigation of 11 primate brains and contextualisation with other mammals based on prior literature
      - Proposition of a tool to analyse cortical morphology that requires no fine tuning and is computationally achievable
      - Positioning of results in comparison to previous works, reinforcing the validity of the observation
      - Illustration of the scale-dependence of the effects of brain aging in the human

      Weaknesses:
      - The notion of cortical shape, while being central to the article, is not really defined, leaving some interpretation to the reader
      - The organization of the manuscript is unconventional, leading to mixed contents in different sections (sections mixing introduction and method, methods and results, results and discussion...). As a result, the reader discovers the content of the article along the way, it is not obvious at what stages the methods are introduced, and the results are sometimes presented and argued in the same section, hindering objectivity.

      To improve the document, I would suggest a modification and restructuring of the article such that: 1) by the end of the introduction the reader understands clearly what question is addressed and the value it holds for the community, 2) by the end of the methods the reader understands clearly all the tools that will be used to answer that question (not just the new method), 3) by the end of the results the reader holds the objective results obtained by applying these tools on the available data (without subjective interpretations and justifications), and 4) by the end of the discussion the reader understands the interpretation and contextualisation of the study, and clearly grasps the potential of the method depicted for the better understanding of brain folding mechanisms and properties.

    1. eLife assessment

      In this important paper, the authors propose a computational model for understanding how the dynamics of neural representations may lead to specific patterns of errors as observed in working memory tasks. The paper provides solid evidence showing how a two-area model of sensory-memory interactions can account for the error patterns reported in orientation estimation tasks with delays. By integrating ideas from efficient coding and attractor networks, the resulting theoretical framework is appealing, and nicely captures some basic patterns of behavioral data and the distributed nature of memory representation as reported in prior neurophysiological studies. The paper can be strengthened if (i) further analyses are conducted to deepen our understanding of the circuit mechanisms underlying the behavioral effects; (ii) the necessity of the two-area network model is better justified; (iii) the nuanced aspects of the behavior that are not captured by the current model are discussed in more detail.

    2. Reviewer #1 (Public Review):


      Working memory is imperfect - memories accrue errors over time and are biased towards certain identities. For example, previous work has shown memory for orientation is more accurate near the cardinal directions (i.e., variance in responses is smaller for horizontal and vertical stimuli) while being biased towards diagonal orientations (i.e., there is a repulsive bias away from horizontal and vertical stimuli). The magnitude of errors and biases increases the longer an item is held in working memory and when more items are held in working memory (i.e., working memory load is higher). Previous work has argued that biases and errors could be explained by increased perceptual acuity at cardinal directions. However, these models are constrained to sensory perception and do not explain how biases and errors increase over time in memory. The current manuscript builds on this work to show how a two-layer neural network could integrate errors and biases over a memory delay. In brief, the model includes a 'sensory' layer with heterogeneous connections that lead to the repulsive bias and decreased error in the cardinal directions. This layer is then reciprocally connected with a classic ring attractor layer. Through their reciprocal interactions, the biases in the sensory layer are constantly integrated into the representation in memory. In this way, the model captures the distribution of biases and errors for different orientations that have been seen in behavior and their increasing magnitude with time. The authors compare the two-layer network to a simpler one-layer network, showing that the one-layer network is harder to tune and shows an attractive bias for memories that have lower error (which is incompatible with empirical results).


      The manuscript provides a nice review of the dynamics of items in working memory, showing how errors and biases differ across stimulus space. The two-layer neural network model is able to capture the behavioral effects as well as relate to neurophysiological observations that memory representations are distributed across the sensory cortex and prefrontal cortex.

      The authors use multiple approaches to understand how the network produces the observed results. For example, analyzing the dynamics of memories in the low-dimensional representational space of the networks provides the reader with an intuition for the observed effects.

      As a point of comparison with the two-layer network, the authors construct a heterogeneous one-layer network (analogous to a single memory network with embedded biases). They argue that such a network is incapable of capturing the observed behavioral effects but could potentially explain biases and noise levels in other sensory domains where attractive biases have lower errors (e.g., color).

      The authors show how changes in the strength of Hebbian learning of excitatory and inhibitory synapses can change network behavior. This argues for relatively stronger learning in inhibitory synapses, an interesting prediction.

      The manuscript is well-written. In particular, the figures are well done and nicely schematize the model and the results.


      Despite its strengths, the manuscript does have some weaknesses.

      First, as far as we can tell, behavioral data is only presented in schematic form. This means some of the nuances of the effects are lost. It also means that the model is not directly capturing behavioral effects. Therefore, while providing insight into the general phenomenon, the current manuscript may be missing some important aspects of the data.

      Relatedly, the models are not directly fit to behavioral data. This makes it hard for the authors to exclude the possibility that there is a single network model that could capture the behavioral effects. In other words, it is hard to support the authors' conclusion that "...these evolving errors...require network interaction between two distinct modules." (from the abstract, but similar comments are made throughout the manuscript). Such a strong claim needs stronger evidence than what is presented. Fitting to behavioral data could allow the authors to explore the full parameter space for both the one-layer and two-layer network architectures.

      In addition, directly comparing the ability of different model architectures to fit behavioral data would allow for quantitative comparison between models. Such quantitative comparisons are currently missing from the manuscript.

      To help broaden the impact of the paper, it would be helpful if the authors provided insight into how the observed behavioral biases and/or network structures influence cognition. For example, previous work has argued that biases may counteract noise, leading to decreased variance at certain locations. Is there a similar normative explanation for why the brain would have repulsive biases away from commonly occurring stimuli? Are they simply a consequence of improved memory accuracy? Why isn't this seen for all stimulus domains?

      Previous work has found both diffusive noise and biases increase with the number of items in working memory. It isn't clear how the current model would capture these effects. The authors do note this limitation in the Discussion, but it remains unclear how the current model can be generalized to a multi-item case.

      The role of the ring attractor memory network isn't completely clear. There is noise added in this stage, but how is this different from the noise added at the sensory stage? Shouldn't these be additive? Is the noise necessary? Similarly, it isn't clear whether the memory network is necessary - can it be replaced by autapses (self-connections) in the sensory network to stabilize its representation? In short, it would be helpful for the authors to provide an intuition for why the addition of the memory network facilitates the repulsive bias.


      Overall, the manuscript was successful in building a model that captured the biases and noise observed in working memory. This work complements previous studies that have viewed these effects through the lens of optimal coding, extending these models to explain the effects of time in memory. In addition, the two-layer network architecture extends previous work with similar architectures, adding further support to the distributed nature of working memory representations.

    3. Reviewer #2 (Public Review):

      In this manuscript, Yang et al. present a modeling framework to understand the pattern of response biases and variance observed in delayed-response orientation estimation tasks. They combine a series of modeling approaches to show that coupled sensory-memory networks are in a better position than single-area models to support experimentally observed delay-dependent response bias and variance in cardinal compared to oblique orientations. These errors can emerge from a population-code approach that implements efficient coding and Bayesian inference principles and is coupled to a memory module that introduces random maintenance errors. A biological implementation of such operation is found when coupling two neural network modules, a sensory module with connectivity inhomogeneities that reflect environment priors, and a memory module with strong homogeneous connectivity that sustains continuous ring attractor function. Comparison with single-network solutions that combine both connectivity inhomogeneities and memory attractors shows that two-area models can more easily reproduce the patterns of errors observed experimentally. This, the authors take as evidence that a sensory-memory network is necessary, but I am not convinced about the evidence in support of this "necessity" condition. A more in-depth understanding of the mechanisms operating in these models would be necessary to make this point clear.


      The model provides an integration of two modeling approaches to the computational bases of behavioral biases: one based on Bayesian and efficient coding principles, and one based on attractor dynamics. These two perspectives are not usually integrated consistently in existing studies, which this manuscript beautifully achieves. This is a conceptual advancement, especially because it brings together the perceptual and memory components of common laboratory tasks.

      The proposed two-area model provides a biologically plausible implementation of efficient coding and Bayesian inference principles, which interact seamlessly with a memory buffer to produce a complex pattern of delay-dependent response errors. No previous model had achieved this.


      The correspondence between the various computational models is not fully disclosed. It is not easy to see this correspondence because the network function is illustrated with different representations for different models and the correspondence between components of the various models is not specified. For instance, Figure 1 shows that a specific pattern of noise is required in the low-dimensional attractor model, but in the next model in Figure 2, the memory noise is uniform for all stimuli. How do these two models integrate? What element in the population-code model of Figure 2 plays the role of the inhomogeneous noise of Figure 1? Also, the Bayesian model of Figure 2 is illustrated with population responses for different stimuli and delays, while the attractor models of Figures 3 and 4 are illustrated with neuronal tuning curves but not population activity. In addition, error variance in the Bayesian model appears to be already higher for oblique orientations in the first iteration whereas it is only first shown one second into the delay for the attractor model in Figure 4. It is thus unclear whether variance inhomogeneities appear already at the perceptual stage in the attractor model, as it does in the population-code model. Of course, correspondences do not need to be perfect, but the reader does not know right now how far the correspondence between these models goes.

      The manuscript does not identify the mechanistic origin in the model of Figure 4 of the specific noise pattern that is required for appropriate network function (with higher noise variance at oblique orientations). This mechanism appears critical, so it would be important to know what it is and how it can be regulated. In particular, it would be interesting to know if the specific choice of Poisson noise in Equation (3) is important. Tuning curves in Figure 4 indicate that population activity for oblique stimuli will have higher rates than for cardinal stimuli and thus induce a larger variance of injected noise in oblique orientations, based on this Poisson-noise assumption. If this explanation holds, one wonders if network inhomogeneities could be included (for instance in neural excitability) to induce higher firing rates in the cardinal/oblique orientations so as to change noise inhomogeneities independently of the bias and thus control more closely the specific pattern of errors observed, possibly within a single memory network.
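
      The Poisson-noise argument above rests on the fact that a Poisson variable has variance equal to its mean, so higher firing rates imply larger injected noise variance. A minimal numerical check, assuming two hypothetical mean rates (the values 5 and 20 are illustrative only and are not taken from the manuscript or its Equation (3)):

```python
import numpy as np

# Hypothetical mean spike counts per time step for the two stimulus classes.
rates = {"cardinal": 5.0, "oblique": 20.0}

rng = np.random.default_rng(1)
n_samples = 100_000

# Empirical variance of Poisson samples at each rate; for a Poisson
# variable the variance equals the mean, so it should track the rate.
variances = {
    name: float(rng.poisson(lam, n_samples).var())
    for name, lam in rates.items()
}
variance_ratio = variances["oblique"] / variances["cardinal"]
```

      If the noise in the model were instead additive with fixed variance, this rate dependence would vanish, which is one way to test whether the Poisson assumption is what produces the larger variance at oblique orientations.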

      The main conclusion of the manuscript, that the observed patterns of errors "require network interaction between two distinct modules" is not convincingly shown. The analyses show that there is a quantitative but not a qualitative difference between the dynamics of the single memory area compared to the sensory-memory two-area network, for specific implementations of these models (Figure 7 - Figure Supplement 1). There is no principled reasoning that demonstrates that the required patterns of response errors cannot be obtained from a different memory model on its own. Also, since the necessity of the two-area configuration is highlighted as the main conclusion of the manuscript, it is inconvenient that the figure that carefully compares these conditions is in the Supplementary Material.

      The proposed model has stronger feedback than feedforward connections between the sensory and memory modules. This is not a common assumption when thinking about hierarchical processing in the brain, and it is not discussed in the manuscript.

    4. Reviewer #3 (Public Review):


      The present study proposes a neural circuit model consisting of coupled sensory and memory networks to explain the circuit mechanism of the cardinal effect in orientation perception which is characterized by the bias towards the oblique orientation and the largest variance at the oblique orientation.


      The authors have done numerical simulations and preliminary analysis of the neural circuit model to show that the model successfully reproduces the cardinal effect, and the paper is well written overall. As far as I know, most of the studies on the cardinal effect are at the level of statistical models, and the current study provides one possibility of how neural circuit models reproduce such an effect.


      There are no major weaknesses or flaws in the present study, although I suggest the authors conduct further analysis to deepen our understanding of the circuit mechanism of the cardinal effect. Please see my recommendations for concrete comments.

    1. eLife assessment

      This important study reports human single-neuron recordings in subcortical structures while participants performed a tactile detection task around the perceptual threshold. The study and the analyses are well conducted and provide solid evidence that the thalamus and the subthalamic nucleus contain neurons whose activity correlates with the task, with stimulus presentation, and even with whether the stimulation is consciously detected or not. The study will be relevant for researchers interested in the role of subcortical structures in tactile perception and the neural correlates of consciousness.

    2. Reviewer #1 (Public Review):


      A cortico-centric view is dominant in the study of the neural mechanisms of consciousness. This investigation represents the growing interest in understanding how subcortical regions are involved in conscious perception. To achieve this, the authors engaged in an ambitious and rare procedure in humans of directly recording from neurons in the subthalamic nucleus and thalamus. While participants were in surgery for the placement of deep brain stimulation devices for the treatment of essential tremor and Parkinson's disease, they were awakened and completed a perceptual-threshold tactile detection task. The authors identified individual neurons and analyzed single-unit activity corresponding with the task phases and tactile detection/perception. Among the neurons that were perception-responsive, the authors report changes in firing rate beginning ~150 milliseconds from the onset of the tactile stimulation. Curiously, the majority of the perception-responsive neurons had a higher firing rate for missed/not perceived trials. In summary, this investigation is a valuable addition to the growing literature on the role of subcortical regions in conscious perception.


      The authors achieved the challenging task of recording human single-unit activity while participants performed a tactile perception task. The methods and statistics are clearly explained and rigorous, particularly for managing false positives and non-normal distributions. The results offer new detail at the level of individual neurons in the emerging recognition of the role of subcortical regions in conscious perception.


      "Nonetheless, it remains unknown how the firing rate of subcortical neurons changes when a stimulus is consciously perceived." (lines 76-77) The authors could be more specific about what exactly single-unit recordings offer for interrogating the role of subcortical regions in conscious perception that is unique from alternative neural activity recordings (e.g., local field potential) or recordings that are used as proxies of neural activity (e.g., fMRI).

      Related comment for the following excerpts:

      "After a random delay ranging from 0.5 to 1 s, a "respond" cue was played, prompting participants to verbally report whether they felt a vibration or not. Therefore, none of the reported analyses are confounded by motor responses." (lines 97-99).

      "These results show that subthalamic and thalamic neurons are modulated by stimulus onset, irrespective of whether it was reported or not, even though no immediate motor response was required." (lines 188-190).

      "By imposing a delay between the end of the tactile stimulation window and the subjective report, we ensured that neuronal responses reflected stimulus detection and not mere motor responses." (lines 245-247).

      It is a valuable feature of the paradigm that the reporting period was initiated hundreds of milliseconds after the stimulus presentation so that the neural responses should not represent "mere motor responses". However, verbal report of having perceived or not perceived a stimulus is a motor response and because the participants anticipate having to make these reports before the onset of the response period, there may be motor preparatory activity from the time of the perceived stimulus that is absent for the not perceived stimulus. The authors show sensitivity to this issue by identifying task-selective neurons and their discussion of the results that refer to the confound of post-perceptual processing. Still, direct treatment of this possible confound would help the rigor of the interpretation of the results.

      "When analyzing tactile perception, we ensured that our results were not contaminated with spurious behavior (e.g. fluctuation of attention and arousal due to the surgical procedure)." (lines 118-117).

      Confidence in the results would be improved if the authors clarified exactly what behaviors were considered as contaminating the results (e.g., eye closure, saccades, and bodily movements) and how they were determined.

      The authors' discussion of the thalamic neurons could be more precise. The authors show that only certain areas of the thalamus were recorded (in or near the ventral lateral nucleus, according to Figure S3C). The ventral lateral nucleus has a unique relationship to tactile and motor systems, so do the authors hypothesize these same perception-selective neurons would be active in the same way for visual, auditory, olfactory, and taste perception? Moreover, the authors minimally interpret the location of the task, sensory, and perception-responsive neurons. Figure S3 suggests these neurons are overlapping. Did the authors expect this overlap and what does it mean for the functional organization of the ventral lateral nucleus and subthalamic nucleus in conscious perception?

      "We note that, 6 out of 8 neurons had higher firing rates for missed trials than hit trials, although this proportion was not significant (binomial test: p = 0.145)." (lines 215-216).

      It appears that in the three example neurons shown in Figure 4, 2 out of 3 (#001 and #068) show a change in firing rate predominantly for the missed stimulations. Meanwhile, #034 shows a clear hit response (although there is an early missed response - decreased firing rate - around 150 ms that is not statistically significant). This is a counterintuitive finding when compared to previous results from the thalamus (e.g., local field potentials and fMRI) that show the opposite response profile (i.e., missed/not perceived trials display no change or reduced response relative to hit/perceived trials). The discussion of the results should address this, including whether these seemingly competing findings can be reconciled.
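
      For readers who want to check the quoted binomial result (6 of 8 neurons), a minimal sketch follows. It assumes a chance level of 0.5 for "higher firing rate on missed trials"; the helper name is hypothetical, and which test the authors actually ran is not stated in the excerpt.

```python
from math import comb

def binomial_tail(k, n, p=0.5):
    """Upper-tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 6 of 8 perception-selective neurons fired more on missed trials.
p_one_sided = binomial_tail(6, 8)        # 37/256 = 0.14453125
p_two_sided = min(1.0, 2 * p_one_sided)  # symmetric null, so doubling is exact

print(p_one_sided, p_two_sided)  # prints 0.14453125 0.2890625
```

      Under this assumption, the reported p = 0.145 matches the one-sided tail, while a two-sided test would give roughly 0.29. Either way, 8 neurons is too small a sample to establish or rule out a preference for missed trials.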

      The authors report 8 perception-responsive neurons, but there are only 5 recording sites highlighted (i.e., filled-in squares and circles) in Figures S3C and 4D. Was this an omission or were three neurons removed from the perception-responsive analysis?

      Could the authors speak to the timing of the responses reported in Figure 4? The statistically significant intervals suggest both early (~160-200 ms) and late (~300 ms) responses. Some have hypothesized that subcortical regions are active early, ahead of the cortical activation that may be linked with conscious perception. Do these results say anything about this temporal model for when subcortical regions are active in conscious perception?

    3. Reviewer #2 (Public Review):

      The authors have studied subpopulations of individual neurons recorded in the thalamus and subthalamic nucleus (STN) of awake humans performing a simple cognitive task. They have carefully designed their task structure to eliminate motor components that could confound their analyses in these subcortical structures, given that the data were recorded in patients diagnosed with Parkinson's Disease (PD) or Essential Tremor (ET). The recorded data represent a promising addition to the field. The analyses that the authors have applied can serve as a strong starting point for exploring the kinds of complex signals that can emerge within a single neuron's activity. Pereira et al. conclude that their results from single neurons indicate that task-related activity occurs, purportedly separate from previously identified sensory signals. These conclusions are a promising and novel perspective for how the field thinks about the emergence of decisions and sensory perception across the entire brain as a unit.

      Despite the strength of the data that was obtained and the relevant nature of the conclusions that were drawn, there are certain limitations that must be taken into consideration:

      (1) The authors make several claims that their findings are direct representations of consciousness identifiable in subcortical structures. The current framing of consciousness does not sufficiently define how consciousness is related to the perceptual task.

      (2) The current work would benefit greatly from a description and clarification of what all the neurons that have been recorded are doing. The authors' criteria for selecting subpopulations with task-relevant activity are appropriate, but understanding the heterogeneity in a population of single neurons is important for broader considerations that are being studied within the field.

      (3) The authors have omitted a proper set of controls for comparison against the active trials, for example, where a response was not necessary. Please explain why this choice was made and what implications are necessary to consider.

    4. Reviewer #3 (Public Review):


      This important study relies on a rare dataset: intracranial recordings within the thalamus and the subthalamic nucleus in awake humans, while they were performing a tactile detection task. This procedure allowed the authors to identify a small but significant proportion of individual neurons, in both structures, whose activity correlated with the task (e.g. their firing rate changed following the audio cue signalling the start of a trial) and/or with the stimulus presentation (change in firing rate around 200 ms following tactile stimulation) and/or with participant's reported subjective perception of the stimulus (difference between hits and misses around 200 ms following tactile stimulation). Whereas most studies interested in the neural underpinnings of conscious perception focus on cortical areas, these results suggest that subcortical structures might also play a role in conscious perception, notably tactile detection.


      There are two highly valuable aspects of this study that make the evidence convincing and even compelling. First, these types of data are exceptional: the authors had access to subcortical recordings in awake and behaving humans during surgery. Additionally, the methods are solid. The behavioral study meets the best standards of the domain, with a careful calibration of the stimulation levels (staircase) to maintain them around the detection threshold, and an additional selection of time intervals where the behavior was stable. The authors also checked that stimulus intensity was the same on average for hits and misses within these selected periods, which ensures that the effects of detection that are observed here are not confounded by stimulus intensity. The neural data analysis is also very sound and well-conducted. The statistical approach complies with current best practices, although I found that, in some instances, it was not entirely clear which type of permutations had been performed, and I would advocate for more clarity in these instances. Globally the figures are nice, clear, and well presented. I appreciated the fact that the precise anatomical location of the neurons was directly shown in each figure.


      Some clarification is needed for interpreting Figure 3, top rows: in my understanding the black curve is already the result of a subtraction between stimulus present trials and catch trials, to remove potential drifts; if so, it does not make sense to compare it with the firing rate recorded for catch trials.

      I also think that the article could benefit from a more thorough presentation of the data and that this could help refine the interpretation which seems to be a bit incomplete in the current version. There are 8 stimulus-responsive neurons and 8 perception-selective neurons, with only one showing both effects, resulting in a total of 15 individual neurons being in either category or 13 neurons if we exclude those in which the behavior is not good enough for the hit versus miss analysis (Figure S4A). In my opinion, it should be feasible to show the data for all of them (either in a main figure, or at least in supplementary), but in the present version, we get to see the data for only 3 neurons for each analysis. This very small selection includes the only neuron that shows both effects (neuron #001; which is also cue selective), but this is not highlighted in the text. It would be interesting to see both the stimulus-response data and the hit versus miss data for all 13 neurons as it could help develop the interpretation of exactly how these neurons might be involved in stimulus processing and conscious perception. This should give rise to distinct interpretations for the three possible categories. Neurons that are stimulus-responsive but not perception-selective should show the same response for both hits and misses and hence carry out indifferently conscious and unconscious responses. The fact that some neurons show the opposite pattern is particularly intriguing and might give rise to a very specific interpretation: if the neuron really doesn't tend to respond to the stimulus when hits and misses are put together, it might be a neuron that does not directly respond to the stimulus, but whose spontaneous fluctuations across trials affect how the stimulus is perceived when they occur in a specific time window after the stimulus. Finally, neuron #001 responds with what looks like a real burst of evoked activity to stimulation and also shows a difference between hits and misses, but intriguingly, the response is strongest for misses. In the discussion, the interesting interpretation in terms of a specific gating of information by subcortical structures seems to apply well to this last example, but not necessarily to the other categories.

    1. Reviewer #3 (Public Review):


      Chang et al. investigated the mechanisms governing collagen fibrillogenesis, firstly demonstrating that cells within tail tendons are able to take up exogenous collagen and use this to synthesize new collagen-1 fibrils. Using an endocytic inhibitor, the authors next showed that endocytosis was required for collagen fibrillogenesis and that this process occurs in a circadian rhythmic manner. Using knockdown and overexpression assays, it was then demonstrated that collagen fibril formation is controlled by vacuolar protein sorting 33b (VPS33b), and this VPS33b-dependent fibrillogenesis is mediated via Integrin alpha-11 (ITGA11). Finally, the authors demonstrated increased expression of VPS33b and ITGA11 at the gene level in fibroblasts from patients with idiopathic pulmonary fibrosis (IPF), and greater expression of these proteins in both lung samples from IPF patients and in chronic skin wounds, indicating that endocytic recycling is disrupted in fibrotic diseases.


      The authors have performed a comprehensive functional analysis of the regulators of endocytic recycling of collagen, providing compelling evidence that VPS33b and ITGA11 are crucial regulators of this process.


      Throughout the study, several different cell types have been used (immortalised tail tendon fibroblasts, NIH3T3 cells, and HEK293T cells). In general, it is not clear which cells have been used for a particular experiment, and the rationale for using these different cell types is not explained. In addition, some experimental details are missing from the methods.

      There is also a lack of functional studies in patient-derived IPF fibroblasts which means the link between endocytic recycling of collagen and the role of VPS33b and ITGA11 cannot be fully established.

    2. Reviewer #2 (Public Review):


      In this manuscript, the authors describe a mechanism, by which fluorescently-labelled Collagen type I is taken up by cells via endocytosis and then incorporated into newly synthesized fibers via an ITGA11 and VPS33B-dependent mechanism. The authors claim the existence of this collagen recycling mechanism and link it to fibrotic diseases such as IPF and chronic wounds.


      The manuscript is well written and contains a broad variety of assays to support its conclusions. The authors also added data from IPF patient-derived fibroblasts, patient-derived lung samples, and patient-derived samples of chronic wounds that highlight a potential in vivo disease correlation of their findings.

      The authors also analyzed the membrane topology of VPS33B and uncovered a likely 'hairpin'-like conformation in the ER membrane.


      Experimental evidence is missing that supports the non-degradative endocytosis of the labeled collagen.

      The authors show and mention in the text that the endocytosis inhibitor Dyngo®4a shows an effect on collagen secretion. It is not clear to me how specific this readout is if the inhibitor affects more than endocytosis. This issue was unfortunately not further discussed. The authors use commercial rat tail collagen, but it is unclear to me which state the collagen is in when it is endocytosed. Is it fully assembled as collagen fiber, or is it present as single heterotrimers or homotrimers?

      The Cy-labeled collagen is clearly incorporated into new fibers, but I'm not sure whether the collagen is needed to be endocytosed to be incorporated into the fibers or if that is happening in the extracellular space mediated by the cells.

      In general for the collagen blots, due to the lack of molecular weight markers, what chain/form of collagen type I are you showing here?

      Besides the VPS33B siRNA transfected cells the authors also use CRISPR/Cas9-generated KO. The KO cells do not seem to be a clean system, as there is still a lot of mRNA produced. Were the clones sequenced to verify the KO on a genomic level? For the siRNA transfection, a control blot for efficiency would be great to estimate the effect size. To me it is not clear where the endocytosed collagen and VPS33B eventually meet in the cells and whether they interact. Or is ITGA11 required to mediate this process, in case VPS33B is not reaching the lumen?

      The authors show an upregulation of ITGA11 and VPS33B in IPF patient-derived fibroblasts, which can be correlated to an increased level of ColI uptake; however, it is not clear whether this increased uptake in those cells is due to the elevated levels of VPS33B and/or ITGA11.

    3. Reviewer #1 (Public Review):


      The authors describe that the endocytic pathway is crucial for ColI fibrillogenesis. ColI is endocytosed by fibroblasts, prior to exocytosis and formation of fibrils, which can include a mixture of endogenous/nascent ColI chains and exogenous ColI. ColI uptake and fibrillogenesis are regulated by circadian rhythm as described by the authors in 2020, thanks to the dependence of this pathway on circadian-clock-regulated protein VPS33B. Cells are capable of forming fibrils with recently endocytosed ColI when nascent chains are not available. Previously identified VPS33B is demonstrated not to have a role in endocytosis of ColI, but to play a role in fibril formation, which the authors demonstrate by showing the loss of fibril formation in VPS33B KO, and an excess of insoluble fibrils - alongside a decrease in soluble ColI secretion - in VPS33B overexpression conditions. A VPS33B binding protein VIPAS39 is also shown to be required for fibrillogenesis and to colocalise with ColI. The authors thus conclude that ColI is internalised into endosomal structures within the cell, and that ColI, VPS33B, and VIPAS39 are co-trafficked to the site of fibrillogenesis, where along with ITGA11, which by mass spectrometric analysis is shown to be regulated by VPS33B levels, ColI fibrils are formed. Interestingly, in involved human skin sections from idiopathic pulmonary fibrosis (IPF) patients, ITGA11 and VPS33B expression is increased compared to healthy tissue, while in patient-derived fibroblasts, uptake of fluorescently-labelled ColI is also increased. This suggests that there may be a significant contribution of endocytosis-dependent fibrillogenesis in the formation of fibrotic and chronic wound-healing diseases in humans.


      This is an interesting paper that contributes an exciting novel understanding of the formation of fibrotic disease, which despite its high occurrence, still has no robust therapeutic options. The precise mechanisms of fibrillogenesis are also not well understood, so a study devoted to this complex and key mechanism is well appreciated. The dependence of fibrillogenesis on VPS33B and VIPAS39 is convincing and robust, while the distinction between soluble ColI secretion and insoluble fibrillar ColI is interesting and informative.


      There are a number of limitations to this study in its current state. Inhibition of ColI uptake is performed using Dyngo4a, which although proposed as an inhibitor of Clathrin-dependent endocytosis is known to be quite nonspecific. This may not be a problem, however, as the endocytic mechanism for ColI also does not seem to be well defined in the literature; in fact, the principal mechanism described in the papers referred to by the authors is that of phagocytosis. It would be interesting to explore this important part of the mechanism further, especially in relation to the intracellular destination of ColI. The circadian regulation does not appear as robust as in the authors' last paper; however, there could be a larger lag between endocytosis of ColI and realisation of fibrils. The authors state that the endocytic pathway is the mechanism of trafficking and that they show ColI, VPS33B, and VIPAS39 are co-trafficked. However, the only link that is put forward to the endosomes is rather tenuously through VPS33B/VIPAS39. There is no direct demonstration of ColI localisation to endosomes (i.e., immunofluorescence), and this is overstated throughout the text. Demonstrating the intracellular trafficking and localisation of ColI, and its actual relationship to VPS33B and VIPAS39, followed by ITGA11, would broaden the relevance of this paper significantly to incorporate the field of protein trafficking. Finally, the "self-formation" of ColI fibrils is discussed in relation to the literature and the concentration of fluorescently-tagged ColI; however, as the key message of the paper is the fibrillogenesis from exocytosed ColI, I do not feel that this is demonstrated beyond doubt. Specific inhibition of intracellular trafficking steps, or following the progressive formation of ColI fibrils over time by immunofluorescence would demonstrate without any further doubt that ColI must be endocytosed first, to form fibrils as a secondary step, rather than externally-added ColI being incorporated directly to fibrils, independent of cellular uptake.

    4. eLife assessment

      This important work substantially advances our understanding of how collagen fibrils are built and maintained in a manner regulated by circadian rhythms in intracellular secretory trafficking pathways. The evidence supporting the data is solid, although further data regarding the molecular mechanisms regulating endocytic recycling of collagen would have strengthened the study. The work will be of considerable interest to those who study extracellular matrix assembly or collagen homeostasis.

    1. eLife assessment

      This elegant study presents important findings into how small molecules that were originally developed to inhibit the oncogenic kinase, BRAF, instead trigger activation of this kinase target. Compelling and comprehensive evidence supports a new allosteric model to explain the paradoxical activation. This rigorous work will be of great interest to biochemists, structural biologists, and those working on strategies to inhibit kinases in the context of human disease.

    2. Reviewer #1 (Public Review):


      The authors quantitatively describe the complex binding equilibria of BRAF and its inhibitors resulting in some cases in the paradoxical activation of BRAF dimer when bound to ATP competitive inhibitors. The authors use a biophysical tour de force involving FRET binding assays, NMR, kinase activity assays, and DEER spectroscopy.


      The strengths of the study are the beautifully conducted assays that allow for a thorough characterization of the allostery in this complex system. Additionally, the use of 19F-NMR and DEER spectroscopy provides important insights into the details of the process.

      The resulting model for binding of inhibitors and dimerization (Figure 4) is very helpful.


      This is a complex system and its communication is inherently challenging. It might be of interest to the broader readership to understand the implications of the model for drug development and therapy.

    3. Reviewer #2 (Public Review):


      This manuscript uses FRET, 19F-NMR, and DEER/EPR solution measurements to examine the allosteric effects of a panel of BRAF inhibitors (BRAFi). These include first-generation aC-out BRAFi, and more recent Type I and Type II aC-in inhibitors. Intermolecular FRET measurements quantify Kd for BRAF dimerization and inhibitor binding to the first and second subunits. Distinct patterns are found between aC-in BRAFi, where Type I BRAFi binds equally well to the first and second subunits within dimeric BRAF. In contrast, Type II BRAFi shows stronger affinity for the first subunit and weaker affinity for the second subunit, an effect named "allosteric asymmetry". Allosteric asymmetry has the potential for Type II inhibitors to promote dimerization while favoring occupancy of only one subunit (BBD form), leading to the enrichment of an active dimer.

      Measurements of in vitro BRAF kinase activity correlate amazingly well with the calculated amounts of the half-site-inhibited BBD forms with Type II inhibitors. This suggests that the allosteric asymmetry mechanism explains paradoxical activation by this class of inhibitors. DEER/EPR measurements further examine the positioning of helix aC. They show systematic outward movement of aC with Type II inhibitors, relative to the aC-in state with Type I inhibitors, and further show that helix aC adopts multiple states and is therefore dynamic in apo BRAF. This makes a strong case that negative cooperativity between sites in the BRAF dimer can account for paradoxical kinase activation by Type II inhibitors by creating a half-site-occupied homodimer, BBD. In contrast, Type I inhibitors and aC-out inhibitors do not fit this model, and are therefore proposed to be explained by previously proposed models involving negative allostery between subunits in BRAF-CRAF heterodimers, RAS priming, and transactivation.
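
      The allosteric asymmetry argument can be illustrated with a deliberately simplified two-site binding calculation. The sketch below is not the authors' model: it assumes a pre-formed BRAF dimer (ignoring the monomer-dimer equilibrium), macroscopic dissociation constants `kd1` and `kd2` for the first and second binding events, and the free inhibitor concentration as input. All names and numerical values are hypothetical.

```python
def dimer_species(inhibitor: float, kd1: float, kd2: float) -> dict:
    """Fractions of apo (BB), half-occupied (BBD) and fully occupied (DBBD) dimer.

    kd1 and kd2 are macroscopic dissociation constants for the first and
    second binding events; inhibitor is the free inhibitor concentration in
    the same units. The monomer-dimer equilibrium is ignored (assumption).
    """
    w_bbd = inhibitor / kd1                 # weight of BBD relative to BB
    w_dbbd = inhibitor**2 / (kd1 * kd2)     # weight of DBBD relative to BB
    z = 1.0 + w_bbd + w_dbbd                # sum of all weights
    return {"BB": 1.0 / z, "BBD": w_bbd / z, "DBBD": w_dbbd / z}


# Hypothetical constants in arbitrary concentration units.
no_asymmetry = dict(kd1=1.0, kd2=1.0)     # both binding events equally favourable
asymmetric = dict(kd1=1.0, kd2=100.0)     # second binding event 100-fold weaker

for conc in (0.1, 1.0, 10.0, 100.0, 1000.0):
    flat = dimer_species(conc, **no_asymmetry)["BBD"]
    skew = dimer_species(conc, **asymmetric)["BBD"]
    print(f"[I] = {conc:7.1f}  BBD (no asymmetry) = {flat:.2f}  BBD (asymmetric) = {skew:.2f}")
```

      In this toy model the half-occupied fraction peaks at an inhibitor concentration of sqrt(kd1 * kd2), with peak height r / (2 + r) where r = sqrt(kd2 / kd1). A 100-fold weaker second binding event therefore raises the maximum BBD fraction from one third to five sixths, which is the qualitative behaviour the allosteric asymmetry mechanism relies on.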


      This study integrates orthogonal spectroscopic and kinetic strategies to characterize BRAF dynamics and determine how it impacts inhibitor allostery. The unique combination of approaches presented in this study represents a road map for future work in the important area of protein kinase dynamics. The work represents a worthy contribution not only to the field of BRAF regulation but to protein kinases in general.


      Some questions remain regarding the proposed model for Type II inhibitors and its comparison to Type I and aC-out inhibitors that would be useful to clarify. Specifically, it would be helpful to address whether the activation of BRAF by Type II inhibitors, while strongly correlated with BBD model predictions in vitro, also depends on CRAF via BRAF-CRAF in cells and therefore overlaps with the mechanisms of paradoxical activation by Type I and aC-out inhibitors.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We are very grateful to both Reviewers, the Reviewing Editor and the Senior Editor for carefully reviewing our manuscript and for providing useful comments and suggestions that further improved the quality of our work. We appreciate that our work is perceived to substantially advance the understanding of osteoblast migration and that the experiments are found to be rigorous and to provide conclusive evidence. We also look forward to reaching a broad audience in the field. Below we provide a point-by-point response to each suggestion made by the reviewers and explain how we included their recommendations in the revised manuscript.

      Public Reviews

      Reviewer #1 (Public Review):


      The authors aimed to show that Tgif1 expression is regulated by ERK1/2 and PTH in a time-dependent manner, and that Tgif1 suppresses Pak3 to facilitate osteoblast adhesion. The authors further tried to show that Tgif1-Pak3 signaling plays a significant role in osteoblast migration to the site of bone repair and bone remodeling.


      • In a previous study, it was demonstrated that Tgif1 is a target gene of PTH, and that PTH treatment failed to increase bone mass in the absence of Tgif1 (Saito et al., Nat Commun., 2019). In this study, the authors found that Tgif1-Pak3 signaling promotes osteoblast migration through osteoblast adhesion to promote bone regeneration. This novel finding provides a better understanding of how Tgif1 expression in osteoblasts regulates adherence, spreading, and migration during bone healing and bone remodeling.

      • The authors demonstrated that ERK1/2 and PTH regulate Tgif1 expression in a time-dependent manner and its role in suppressing Pak3 through various experimental approaches such as luciferase assay, ChIP assay, and gene silencing. These results contribute to the overall strength of the article.

      We thank the reviewer for acknowledging the novelty of our findings as well as the strength of the manuscript.


      • The authors need to further justify why they focused on Pak3 in the introduction by mentioning its known function for cell adhesion.

      We thank the reviewer for this suggestion. We mention in the introduction that we further investigated Pak3 due to its implication in cell adhesion (page 6, lines 7-8).

      • Some results indicated statistically significant but small changes. The authors need to explain in the discussion part why they believe this is the major mechanism or why there may be some other possible mechanisms.

      We agree with this comment. We are confident that our work identified an important mechanism by which Tgif1 regulates cellular features of osteoblasts. However, it is certainly possible that other mechanisms may exist as well. We discuss this point in the revised manuscript (page 18, lines 16-17).

      • The study does not include enough in vivo data to claim that this mechanism is crucial for bone healing and bone remodeling in vivo.

      Re: We agree with this point and have modified the abstract accordingly by replacing “crucial” with “implicated in” as well as the text by changing “crucial” to “important” (page 2, line 9). Furthermore, we discuss this limitation in the revised manuscript (page 18, lines 9-14).

      Reviewer #2 (Public Review):


      Bolamperti S. et al. 2023 investigate whether the expression of TG-interacting factor (Tgif1) is essential for osteoblastic cellular activity regarding morphology, adherence, migration/recruitment, and repair. Towards this end, germ-line Tgif1 deletion (Tgif1-/-) mice or male mice lacking expression of Tgif1 in mature osteoblastic and osteocytic cells (Dmp1-Cre+; Tgif1fl/fl) and corresponding controls were studied in physiological, bone anabolic, and bone fracture-repair conditions. Both Tgif1-/- and Dmp1-Cre+; Tgif1fl/fl exhibited decreased osteoblasts on cancellous bone surfaces and adherent to collagen I-coated plates. Tgif1-/- mice exhibit impaired healing in the tibial midshaft fracture model, as indicated by decreased bone volume (BV/Cal.V), osteoid (OS/BS), and low osteoblasts (number and surface). Likewise, both Tgif1-/- and Dmp1-Cre+; Tgif1fl/fl show impaired PTH 1-34 (100 µg/kg, 5x/wk for 3 wks) osteoblast activation in vivo, as detected by increases in quiescent bone surfaces. Mechanistic in vitro studies then utilized primary osteoblasts isolated from Tgif1-/- mice and siRNA Tgif1 knockdown OCY454 cells to further investigate and identify the downstream Tgif1 target driving these osteoblastic impairments. In vitro, Tgif1-/- osteoblastic and Tgif1 knockdown OCY454 cells exhibit decreased migration, abnormal morphology, and decreased focal adhesions/cells. Unexpectedly, though, localization assays revealed Tgif1 to primarily concentrate in the nucleus and not to co-localize with focal adhesions (paxillin, talin). Also, the expression of major focal adhesion components (paxillin, talin, FAK, Src, etc.) or the Cdc42 family was not altered by loss of Tgif1 expression. In contrast, PAK3 expression is markedly upregulated by loss of Tgif1. In silico analysis followed by mechanistic molecular assays involving ChIP, siRNA (Tgif1, PAK3), and transfection (rat PAK3 promoter) techniques show that Tgif1 physically binds to a specific site in the PAK3 promoter region. Further, the knockdown of PAK3 rescues the Tgif1-deficient abnormal morphology in OCY454 cells. This is the first study to identify the novel transcriptional repression of PAK3 by Tgif1 as well as the specific Tgif1 binding site within the PAK3 promoter.


      This work has a plethora of strengths. The co-authors achieved their aim of elucidating the role of Tgif1 expression in osteoblastic cellular functions (morphology, spreading/attachment, migration).

      Further, this work is the first to depict the novel mechanism of Tgif1 transcriptional repression of PAK3 by a thorough usage of mechanistic molecular assays (in silico analysis, ChIP, siRNA, transfection etc.). The conclusions are well supported and justified by these findings, as the appropriate controls, sample sizes (statistical power), statistics, and assays were fully utilized. The claims and conclusions are justified by the data.

      Re: We are grateful to this reviewer for recognizing the novelty, strengths, and rigor of our study and for acknowledging that the data convincingly support the conclusions drawn.


      The discussion section could be expanded with a few sentences regarding limitations to the current study and potential future directions.

      Re: In the revised manuscript, we discuss limitations of the work and describe possible future directions (page 18, lines 9-14).

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The cell spreading and migration assay is quite artificial. Trypsinized osteoblasts and quiescent osteoblasts are totally different. The authors need to cite papers from other groups to justify whether the cell spreading and migration assay is appropriate to achieve the goals of this study.

      Re: The reviewer is right that in vitro assays are often artificial and do not necessarily fully reflect in vivo situations. We have taken this aspect into account and discuss it in the revised manuscript (page 18, lines 9-10). In addition, we have included references from other groups who have used similar assays to study cell spreading and migration (Dejaeger M et al., 2017 and Dang et al., 2018).

      (2) Page 13 Line 15: The statement "Osteoblasts are greatly impaired in the ability to migrate into the repair zone" is an overstatement. The experiments in Figure 5 do not necessarily reflect osteoblast migration activities. The authors need to rephrase the sentence or need to show observation of earlier time points (e.g., 1 week after fracture) in their bone healing experiments. The number of osteoblasts/surface in Tgif1+/+ and Tgif1-/- mice at different time points during bone healing should be a good indicator for the migration of osteoblasts to the repair site.

      Re: We understand the critique that a time course or lineage tracing experiments would provide better evidence for the statement of osteoblast migration into the repair zone. To avoid overinterpretations we have removed the sentence from the revised manuscript.

      (3) Page 14, Line 24: Regarding the sentence "The observation that Tgif1 is crucial for osteoblast adherence, spreading, and migration", the authors need to clearly mention this statement is based on the in vitro experiments. The animal studies are not enough to claim that the mechanism is crucial for adherence, spreading, and migration.

      Re: We thank the reviewer for pointing out this limitation. We have clarified that the finding that Tgif1 is crucial for osteoblast adherence, spreading and migration was made in vitro (page 14, line 22).

      (4) The authors need to demonstrate the suppression of Pak3 expression in PTH-treated mice in vivo, in addition to the in vitro culture system (Fig. 7C and 7D).

      Re: We agree with the reviewer that this experiment would be very insightful. However, this is beyond the scope of the current work. Nevertheless, to take this valid point into consideration, we mention it in the discussion as a potential future direction (page 18, lines 11-14).

      (5) The authors need to demonstrate that the pharmacologic suppression of Pak3 in Tgif1-/- mice reduces the % of quiescent surface/BS in vivo.

      Re: This point is also well taken, and we agree that a suppression of Pak3 in Tgif1-deficient mice would be very informative to support our in vitro findings. However, this may also be part of future investigations. This is emphasized in the discussion of the revised manuscript (page 18, lines 11-14).

      Figures (Minor)

      Fig. 1:

      Fig. 1A

      Arrows need to indicate a more precise position.

      Re: The position of the arrows has been optimized.

      Fig. 1DE

      What are blue/red bars (genotypes)?

      Re: The colors indicate the genotypes. A legend has been added to the revised figure.

      Fig. 1K

      Quantification data is needed.

      Re: Thank you for this suggestion. We added a quantification of the data (Fig. 1L, M; page 8, lines 3-4; page 21, lines 5-6)

      Fig. 2A

      Show the representative high-magnification image of round (non-spread) cells.

      Re: Representative high-magnification images (insets) are provided in the revised figure 2A.

      Fig. 5

      Red arrows need to indicate a more precise position.

      Re: The arrows have been repositioned.

      Fig. 6A, C

      Red arrows need to indicate a more precise position.

      Re: The arrows have been repositioned.

      Reviewer #2 (Recommendations For The Authors):

      (1) The microscopy images and analyses are excellent.

      Re: We thank the reviewer for acknowledging the quality of our microscopy studies.

      (2) Since the Tgif1-/- mouse has low osteoclast numbers, is it possible that this is a contributing factor to the delays/impairment in bone healing, given that resorption also has a role in fracture repair? Since the focus of these studies is on osteoblastic cells, this point is a little out of scope. However, would the authors consider exploring this further in the discussion section?

      Re: This point is well taken, and we agree that osteoclasts could certainly play a role in the impaired fracture healing. Following the reviewer's recommendation, we now discuss this aspect in the revised manuscript (page 16, lines 22-24).


      Would the authors consider slightly re-wording the title? Tgif1 suppresses PAK3 expression; however, Tgif1-deficiency leads to the unregulated elevation of PAK3 expression.

      Re: Thank you for pointing this out. We agree with the reviewer and adapted the title accordingly.


      (1) Is it possible that apoptosis and/or anoikis is being induced by Tgif1 deficiency in osteoblastic cells?

      Re: We do not have data addressing this question. Although Tgif1-deficient osteoblasts are overall viable and expand well, we cannot fully exclude this possibility.

      (2) For the fracture study, any differences in overall callus size? Would it be possible to perform micro-CT imaging with some of these samples?

      Re: There is no difference in non-mineralized callus size between Tgif1+/+ and Tgif1-/- mice. However, there is less mineralized bone per callus area in Tgif1-/- mice, confirming an impaired osteoblast phenotype. As suggested by the reviewer, we added representative micro-CT images and the respective information to the revised manuscript (Fig 5F; pages 19-20).

      (3) Fracture repair experiment-is PAK3 expression downregulated with fracture injury; and/or, is PAK3 upregulated by loss of Tgif1 expression?

      Re: Unfortunately, we do not have data to answer this very interesting question and it would need to be addressed in future studies. This is mentioned in the revised discussion (page 18, lines 12-14).

      (4) Fig 7F. within PTH treated cells, is the light blue SCR sphericity statistically different than the light green siTgif1 + siPAK3 ? While the statement of the "lack of both, Tgif1 and PAK3 prevented PTH-induced decrease in cell sphericity" is supported by the lack of differences between dark green vs. light green; is it also possible that this is due to the siPAK3 returning sphericity to control (scr) levels? (i.e. hitting a floor limit of detection).

      Re: We thank the reviewer for this thoughtful question. There is no statistically significant difference between light blue and light green. Silencing PAK3 restores the impaired capacity to spread that occurs in the absence of Tgif1 to the level of scr controls (significant difference between dark and light red vs. dark and light green and no difference between either dark or light blue vs. dark or light green). However, unlike in the (scr) controls, in the absence of both Tgif1 and PAK3, the cells do not respond to PTH (statistically significant difference between dark and light blue, no difference between dark and light green). Based on the data, cells can reach sphericity of less than 0.2 and thus it is unlikely that sphericity is “hitting the floor level of detection” in these groups.

    1. Author Response

      We would like to thank the reviewers for their positive comments and valuable suggestions for improvements to the manuscript. We intend to revisit the discussion to clarify our interpretation of how azithromycin resistance mutations impact the transmission potential of P. falciparum and expand on the differences between mouse and human malaria. Additionally, we intend to adjust the title to better align with the revised interpretation of the main findings. These changes will be reflected in the revised manuscript to be submitted as the eLife Version of Record.

    1. Author Response

      The following is the authors’ response to the original reviews.

      General remarks for the Editor and the Reviewers

      We would like to thank the Editor and the Reviewers for their feedback. Below we address their comments and present our point-by-point responses as well as the related changes in the manuscript.

      In addition to these changes, in a few cases we have found it necessary to move some texts and provide some additional explanations within the manuscript. We emphasize that these amendments have been made for only technical reasons, and do not alter the results and conclusions of the paper, but may help to render the text more coherent and understandable to readers with little knowledge of the subject.

      These minor corrections are:

      • We extended the Introduction section by a sentence (lines 40-42) that is intended to fit the proposed template directed, non-enzymatic replication mechanism into a more general prebiotic evolutionary context, thus emphasizing its biological relevance. This sentence includes an additional reference (Rosenberger et al., 2021).

      • Two very methodologically oriented and repeated descriptions of random sequence generation have been moved to the Methods section (lines 178-185) from the Results section (lines 336-339 and lines 351-354).

      • We complemented the Data availability statement with licensing information (lines 684-685).

      • Further minor changes (also indicated by red texts) have been implemented to remedy logical and grammatical glitches.

      Public Reviews:

      Reviewer #1 (Public Review):


      Szathmary and colleagues explore the parabolic growth regime of replicator evolution. Parabolic growth occurs when nucleic acid strand separation is the rate-limiting step of the replication process, which would have been the case for non-enzymatic replication of short oligonucleotides that could precede the emergence of ribozyme polymerases and helicases. The key result is that parabolic replication is conducive to the maintenance of genetic diversity, that is, the coexistence of numerous master sequences (the Gause principle does not apply). Another important finding is that there is no error threshold for parabolic replication except for the extreme case of zero fidelity.


      I find both the analytic and the numerical results to be quite convincing and well-described. The results of this work are potentially important because they reveal aspects of a realistic evolutionary scenario for the origin of replicators.


      There are no obvious technical weaknesses. It can be argued that the results represent an incremental advance because many aspects of parabolic replication have been explored previously (the relevant publications are properly cited). Obviously, the work is purely theoretical, and an experimental study of parabolic replication is due. In the opinion of this reviewer, though, these are understandable limitations that do not actually detract from the value of this work.

      We are grateful that this Reviewer appreciates our work. We completely agree that the ultimate validation must come from experiments. It is important to stress that in this field theory often preceded experimental work by decades, and the former often guided the latter. We hope that for the topic of the present paper experiments will follow considerably faster.

      Reviewer #2 (Public Review):


      A dominant hypothesis concerning the origin of life is that, before the appearance of the first enzymes, RNA replicated non-enzymatically by templating. However, this replication was probably not very efficient, due to the propensity of single strands to bind to each other, thus inhibiting template replication. This phenomenon, known as product inhibition, has been shown to lead to parabolic growth instead of exponential growth. Previous works have shown that this situation limits competition between alternative replicators and therefore promotes RNA population diversity. The present work examines this scenario in a model of RNA replication, taking into account finite population size, mutations, and differences in GC content. The main results are (1) confirmation that parabolic growth promotes diversity, but that when the population size is small enough, sequences least efficient at replicating may nevertheless go extinct; (2) the observation that fitness is not only controlled by the replicability of sequences, but also by their GC content; (3) the observation that parabolic growth attenuates the impact of mutations and, in particular, that the error threshold to which exponentially growing sequences are subject can be exceeded, enabling sequence identity to be maintained at higher mutation rates.


      The analyses are sound and the observations are intriguing. Indeed, while it has been noted previously that parabolic growth promotes coexistence, its role in mitigating the error threshold catastrophe - which is often presented as a major obstacle to our understanding of the origin of life - had not been examined before.


      Although all the conclusions are interesting, most are not very surprising for people familiar with the literature. As the authors point out, parabolic growth is well known to promote diversity (Szathmary-Gladkih 1989) and it has also been noted previously that a form of Darwinian selection can be found at small population sizes (Davis 2000).

      Given that under parabolic growth, no sequence is ever excluded for infinite populations, it is also not surprising to find that mutations have a less dramatic exclusionary impact.

      In the two articles cited (Szathmary-Gladkih 1989 and Davis 2000) the subexponentiality of the system was implemented in a mechanistic way, by introducing the exponent 0 < 𝑝 < 1. Although the behaviour of these models is more or less consistent with experimental findings (von Kiedrowski, 1986; Zielinski and Orgel, 1987), the divergence of per capita growth rates (𝑥̇/𝑥) at very low concentrations–which guarantees the ability to maintain unlimited diversity in the case of infinite population sizes–makes this formal approach partly unrealistic.
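
      The divergence mentioned above can be illustrated with a minimal sketch. It assumes the simple subexponential form dx/dt = k·x^p; the rate constant `k`, the default values, and the function name are illustrative assumptions and are not taken from the cited papers. The per capita growth rate is then k·x^(p−1), which grows without bound as x approaches zero whenever 0 < p < 1, whereas it stays constant for exponential growth (p = 1).

```python
def per_capita_growth_rate(x: float, p: float = 0.5, k: float = 1.0) -> float:
    """Per capita growth rate (dx/dt)/x for the assumed form dx/dt = k * x**p."""
    if x <= 0:
        raise ValueError("concentration x must be positive")
    return k * x ** (p - 1.0)


# Compare parabolic (p = 0.5) and exponential (p = 1) growth at falling concentrations.
for x in (1.0, 1e-2, 1e-4, 1e-6):
    parabolic = per_capita_growth_rate(x, p=0.5)
    exponential = per_capita_growth_rate(x, p=1.0)
    print(f"x = {x:g}: parabolic {parabolic:g}, exponential {exponential:g}")
```

      In this toy form a rare type always has a very large per capita growth rate, which is the formal reason why no type is ever excluded in an infinite population.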

      To avoid the possible artefacts of this mechanistic approach, and as there are no previous studies analysing the diversity maintaining ability of finite populations of parabolic replicators in an individual-based model context, we implemented a simplified template replication mechanism leading to parabolic growth and analysed the dynamics in an individual-based stochastic model context. The key point of our investigation is that considerable diversity can be maintained in the system even when the population size is quite small.

      Regarding the Reviewer’s comment on selection: Darwinian selection can only occur in a simple subexponential dynamics if the ratio of replicabilities diverges, cf. Eq. (8) and the preceding paragraph in Davis, 2000.

      Our results also show (Figs. 4B and 4C) that high mutation rates and the error threshold problem can still be considered as a major limiting factor for parabolically replicating systems in terms of their diversity-maintaining ability. In the light of the above, potential mechanisms to relax the error threshold in such systems, one of which is demonstrated in the present study, seem to be important steps to account for the sequence diversification and increase in molecular complexity during the early evolution of RNA replicators.

      A general weakness is the presentation of models and parameters, whose choices often appear arbitrary. Modeling choices that would deserve to be further discussed include the association of the monomers with the strands and the ensuing polymerization, which are combined into a single association/polymerization reaction (see also below), or the choice to restrict to oligomers of length L = 10. Other models, similar to the one employed here, have been proposed that do not make these assumptions, e.g. Rosenberger et al. Self-Assembly of Informational Polymers by Templated Ligation, PRX 2021. To understand how such assumptions affect the results, it would be helpful to present the model from the perspective of existing models.

      The assumption of one-step polymerization reactions that we used here is a common technique for modelling template replication of sequence-represented replicators [see, e.g., Fontana and Schuster, 1998 (10.1126/science.280.5368.1451), Könnyű et al., 2008 (10.1186/1471-2148-8-267), Vig-Milkovics et al., 2019 (10.1016/j.jtbi.2018.11.020) or Szilágyi et al., 2020 (10.1371/journal.pgen.1009155)]. This is because assuming base-to-base polymerisation of the copy would lead to a very large number of different types of intermediates, which a Gillespie-type stochastic simulation algorithm could not handle in reasonable computation times, even if the sequences were relatively short. For comparison, in our model, where polymerization is one-step, the characteristic time of a simulation for 𝐿 = 10, 𝑁 = 10^5 and 𝛿 = 0.01 was 552 hours.

      Note that in Rosenberger et al. (PRX 2021), in contrast to a pioneering work [Fernando et al., 2007 (10.1007/s00239-006-0218-4)], sequences of replicators are not represented, which makes this approach completely inapplicable to our case, in which sequence defines the fitness. In sum, we suggest that this valid criticism points to possible future work.
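
      To show why a one-step reaction keeps the state space small, the following is a toy Gillespie-type sketch. It is not the model of the manuscript: sequences, mutations and GC content are not represented, and all rate constants and names (`c_rep`, `c_assoc`, `c_diss`, `gillespie_toy`) are illustrative assumptions. Only two counts are tracked, free single strands and duplexes, because copying a template is treated as a single event rather than as a chain of partially polymerised intermediates.

```python
import random


def gillespie_toy(s0=10, d0=0, c_rep=1.0, c_assoc=0.01, c_diss=0.1,
                  max_events=1000, seed=0):
    """Toy Gillespie simulation with one-step template replication.

    State: s = free single strands, d = template-copy duplexes.
    Reactions (rate constants are illustrative assumptions):
      replication:  S  -> D   (a strand templates a full copy in one step)
      association:  2S -> D   (two free strands hybridise)
      dissociation: D  -> 2S  (a duplex separates)
    Returns (time, s, d, number_of_replications).
    """
    rng = random.Random(seed)
    t, s, d, n_rep = 0.0, s0, d0, 0
    for _ in range(max_events):
        propensities = [
            c_rep * s,                  # replication
            c_assoc * s * (s - 1) / 2,  # association of two distinct strands
            c_diss * d,                 # dissociation
        ]
        total = sum(propensities)
        if total == 0:
            break
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total)
        # Choose which reaction fires, proportionally to its propensity.
        r = rng.random() * total
        if r < propensities[0]:
            s, d, n_rep = s - 1, d + 1, n_rep + 1
        elif r < propensities[0] + propensities[1]:
            s, d = s - 2, d + 1
        else:
            s, d = s + 2, d - 1
    return t, s, d, n_rep


t, s, d, n_rep = gillespie_toy()
print(f"time={t:.3f}, single strands={s}, duplexes={d}, replications={n_rep}")
```

      With base-by-base polymerisation, every partially copied template of every sequence would be a separate species with its own propensity, which is the combinatorial growth the response refers to.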

      The values of the (many) parameters, often very specific, also very often lack justifications. For example, why is the "predefined error factor" ε = 0.2 and not lower or higher? How would that affect the results?

      A general remark: for the more important parameters, several values were used to test the behaviour of the model (see Table 1), but due to the considerable number of parameters, it is impossible to examine all possible combinations. 𝑐+ = 1 fixes the timescale, and 𝐿 is set to 10 to obtain reasonable running times (see above).

      𝜀 characterizes how replicability decreases as the number of mutations increases. In the manuscript we used the following default vector: 𝜀 = (0.05, 0.2, 1), in which the third element corresponds to the mutation-free sequence, so it must be 1. The first element determines the baseline replicability (see Methods), which we preferred not to change because it would fundamentally alter the ratio of replication propensities to association and dissociation propensities (as a substantial proportion of the complementary sequences of the master sequences are of baseline replicability) and thus would alter the reaction kinetics to an extent that it is not comparable with the original results. Therefore, only the second element can be adjusted. Accordingly, we have analysed the behaviour of the model in the cases of a steeper and a more gradual loss of replicability using the following two vectors, respectively: 𝜀′ = (0.05, 0.05, 1) and 𝜀″ = (0.05, 0.5, 1). The choice of 𝜀′ is chemically more plausible, since for very short oligomers the loss of chemical activity and replicability as a function of the number of mutations can be very sharp. We performed a series of simulations with all possible combinations of 𝛿 = 0.001, 0.005, 0.1 and 𝑁 = 10^3, 10^4, 10^5 for 𝜀′ and 𝜀″ in the constant population and chemostat model context (36 different runs). For other parameters, we took the default values, see Table 1. These values also correspond to the parameters we used in Figures 2 and 6. The results show that the steeper loss of replicability (𝜀′) slightly increases the diversity-maintaining ability of the system, whereas the more gradual loss of replicability (𝜀″) moderately decreases the diversity-maintaining ability of the system, and that these shifts are more pronounced in the constant population size model (Author response image 1) than in the chemostat model (Author response image 2). Altogether, these results confirm that the qualitative outcome of the model is robust in a wide range of loss of replicability (𝜀 vector) values.
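
      The count of 36 runs follows from a full factorial design: two model contexts, two alternative 𝜀 vectors, three 𝛿 values and three population sizes. A minimal sketch of how such a grid could be enumerated is given below; the parameter values are copied from the response above, while the dictionary keys and labels are illustrative assumptions.

```python
from itertools import product

# Values copied from the response text; the labels are illustrative only.
models = ("constant population", "chemostat")
epsilon_vectors = {
    "steeper": (0.05, 0.05, 1.0),
    "gradual": (0.05, 0.5, 1.0),
}
deltas = (0.001, 0.005, 0.1)
population_sizes = (10**3, 10**4, 10**5)

# One dictionary per simulation run in the full factorial design.
runs = [
    {"model": model, "epsilon": eps_label, "delta": delta, "N": n}
    for model, eps_label, delta, n in product(
        models, epsilon_vectors, deltas, population_sizes
    )
]

print(len(runs))  # 36
```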

      Author response image 1.

      Replicator coexistence in the constant population model with different loss of replicability (𝜀 vector) values. Within a given combination of 𝛿 and 𝑁 parameter values, the upper panel corresponds to the steeper loss of replicability (𝜀′), the middle panel to the default 𝜀 vector (Figure 2A), and the bottom panel to the more gradual loss of replicability vector (𝜀″). Within each 𝛿; 𝑁 parameter combination, the same master sequence set was used with the three different 𝜀 vectors for comparability.

      Author response image 2.

      Replicator coexistence in the chemostat model with different loss of replicability (𝜀 vector) values. Within a given combination of 𝛿 and 𝑁 parameter values, the upper panel corresponds to the steeper loss of replicability (𝜀′), the middle panel to the default 𝜀 vector (Figure 6A), and the bottom panel to the more gradual loss of replicability vector (𝜀″). Within each 𝛿; 𝑁 parameter combination, the same master sequence set was used with the three different 𝜀 vectors for comparability.

      Similarly, in equation (11), where does the factor 0.8 come from?

      This factor scales the decay rate of duplex sequences as a function of the binding energy (𝐸_b). The value of 0.8 is an arbitrary choice; the value should be in the interval (0,1) and is only relevant in the chemostat model. It is expected to have a similar effect on the dynamics as the duplex decay factor parameter 𝑓, which we have investigated in a wide range of different values (cf. Table 1, Fig. 6), although 𝑓 is independent of the binding energy (𝐸_b): increasing/decreasing the 0.8 factor is expected to decrease/increase the average total population size. We have investigated the diversity-maintaining ability of the system at smaller (0.6) and larger (0.9) parameter values at different population sizes (𝑁 ≈ 10^3, 10^4 and 10^5) and at different replicability distances (δ = 0.001, 0.005 and 0.01) as shown in Fig. 6. We have found that the number of coexisting master types changes very little in response to changes in this factor. Only two shifts could be detected (underlined): factor 0.9 combined with 𝑁 ≈ 10^4 and 𝛿 = 0.001 caused the number of surviving master types to decrease by one, while factor 0.9 combined with 𝑁 ≈ 10^3 and 𝛿 = 0.01 caused the number of surviving master types to increase by one (Author response table 1). Factor 0.6 produced the same number of surviving types as the default (Author response table 1). In summary, the model shows marked robustness to changes in the values of this parameter.

      Author response table 1.

      Number of coexisting master types in the chemostat model with different binding energy dependent duplex decay rates. Within each 𝛿; 𝑁 parameter combination, the same master sequence set was used with the three different factor values: 0.6, 0.8 (the original) and 0.9 for comparability.

      Why is the kinetic constant for duplex decay reaction 1.15 ⋅ 10^-8?

      Note that this value is the minimum of the duplex decay rate; Table 1 correctly shows the interval of this kinetic constant as [1.15 ⋅ 10^-8, 6.4 ⋅ 10^-5]. Both values are derived from the basic parameters of the system and can be computed according to Eq. (11). The minimum: as the parameter set corresponding to this value is: . The maximum: with .

      Are those values related to experiments, or are they chosen because specific behaviors can happen only then?

      See above.

      The choice of the model and parameters potentially impact the two main results, the attenuation of the error threshold and the role of GC content:

      Regarding the error threshold, it is also noted (lines 379-385) that it disappears when back mutations are taken into account. This suggests that overcoming the error threshold might not be as difficult as suggested, and can be achieved in several ways, which calls into question the importance of the particular role of parabolic growth. Besides, when the concentration of replicators is low, product inhibition may be negligible, such that a "parabolic replicator" is effectively growing exponentially and an error catastrophe may occur. Do the authors think that this consideration could affect their conclusion? Can simulations be performed?

      The assumption of back mutation only provides a theoretical solution to the error threshold problem: back mutation guarantees a positive (non-zero) concentration of a master type, but, since the probability of back mutation is generally very low, this equilibrium concentration may be extremely low, or negligible for typical system sizes. Consequently, back mutation alone does not solve the problem of the error catastrophe: in our system back mutation is present (the probability that a sequence with 𝑘 errors mutates back to a master sequence is 𝜇^k (1−𝜇)^(L−k)), and the diversity-maintaining ability is limited. The effect of back mutation decreases exponentially with increasing sequence length.
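
      As a worked illustration of how small this probability is, the sketch below evaluates the expression quoted above, with 𝑘 erroneous positions each reverting with probability 𝜇 and the remaining 𝐿 − 𝑘 positions copied without change. The value 𝜇 = 0.01 and the function name are illustrative assumptions; 𝐿 = 10 is the oligomer length used in the model.

```python
def back_mutation_probability(mu: float, k: int, L: int) -> float:
    """Probability that a sequence with k errors is copied back to the master.

    Each of the k erroneous positions must revert (probability mu each) and
    each of the other L - k positions must be copied faithfully (1 - mu each).
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    if not 0 <= k <= L:
        raise ValueError("k must lie between 0 and L")
    return mu ** k * (1.0 - mu) ** (L - k)


# Illustrative values: L = 10 as in the model, mu = 0.01 is an assumption.
for k in (1, 2, 3):
    print(k, back_mutation_probability(0.01, k, 10))
```

      Each additional error multiplies the probability by roughly 𝜇, so mutants that are several steps away from a master sequence contribute almost nothing to its replenishment.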

      Regarding the role of the GC content, GC-rich oligomers are found to perform the worst but no rationale is provided.

      For GC-rich oligonucleotides the dissociation probability of a template-copy complex is relatively low (cf. Eqs. (9, 10)); thus they have a relatively low number of offspring, cf. lines 557-561: “a relatively high dissociation probability and the consequential higher propensity of being in a simple stranded form provides an advantage for sequences with relatively low GC content in terms of their replication affinity, that is, the expected number of offspring in case of such variants will be relatively high.” Note that the simulation results shown in Fig. 3A demonstrate the realization of this effect with prepared sequences (along a GC content gradient).

      One may assume that it happens because GC-rich sequences take comparatively longer to release the product. However, it is also conceivable that higher GC content may help in the polymerization of the monomers as the monomers stay attached longer to the template (as described in Eq. (9)). This is an instance where the choice to combine the association and polymerization reactions into a single step independent of GC content may be critical.

      It would be important to show that the result arises from the actual physics and not from this modeling choice.

      Some more specific points that would deserve to be addressed:

      • Line 53: it is said that p "reflects how easily the template-reaction product complex dissociates". This statement is not correct. A reaction order p<1 reflects product inhibition, the propensity of templates to bind to each other, not slow product release. Product release can be limiting, yet a reaction order of 1 can be achieved if substrate concentrations are sufficiently high relative to oligomer concentrations (von Kiedrowski et al., 1991).

      We think the key reference is Von Kiedrowski (1993) in this case. Other things being equal, his Table 1 on p. 134 shows that a sufficient increase in 𝐾4, i.e., the stability of the duplex (template and copy) (association rate divided by dissociation rate) throws the system into the parabolic regime. This is what we had in mind. In order to clarify this, we modified the quoted sentence thus: “In this kinetics, the growth order is equal or close to 0.5 (i.e., the dynamics is sub-exponential) because increased stability of the template-copy complex (rate of association divided by dissociation) promotes parabolic growth (von Kiedrowski et al., 1991; von Kiedrowski & Szathmáry, 2001).”

      • Population size is a key parameter, and a comparison is made between small (10^3) and large (10^5) populations, but without explaining what determines the scale (small/large relative to what?).

      The “small” value (10^3) corresponds to the smallest meaningful population size; significantly smaller population sizes (e.g. 10^2) cannot maintain the 10 master types (or any subset of them) and are chemically unrealistic. The “large” value (10^5) is the largest population size for which simulation times are still acceptable; in the case of 10^6, the runtimes are on the order of months.

      • In the same vein, we might expect size not to be the only important parameter, but also concentration.

      With constant volume, population size and concentration are strictly coupled.

      • Lines 543-546: if understanding correctly, the quantitative result is that the error threshold rises from 0.1 in the exponential case to 0.196 in the parabolic. Are the authors suggesting that a factor of 2 is a significant difference?

      In this paragraph we compared the empirical error threshold of our system (which is close to 𝑝_mut = 0.15) with the error threshold of the well-known single peak fitness landscape (which can be approximated by ) as a reference case. To make the message even clearer we have extended the last sentence (lines 596-597) as follows: “but note that applying this approach to our system is a serious oversimplification”. The 0.196 is simply the probability of error-free replication of a sequence when , but we have removed this sentence (“corresponding to the replication accuracy of a master sequence”) from the manuscript as it seems to be confusing.
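
      The 0.196 figure can be checked with a short calculation. The sketch below assumes that each of the 𝐿 positions is copied independently with per-base error probability 𝑝_mut, so that the probability of an error-free copy is (1 − 𝑝_mut)^L. That the quoted value corresponds to 𝑝_mut = 0.15 and 𝐿 = 10 is an inference from the surrounding text, since the original expression is missing here; the function name is illustrative.

```python
def error_free_replication_probability(p_mut: float, L: int) -> float:
    """Probability that all L bases are copied without error."""
    if not 0.0 <= p_mut <= 1.0:
        raise ValueError("p_mut must lie in [0, 1]")
    return (1.0 - p_mut) ** L


# Assumed values: empirical threshold p_mut = 0.15 and oligomer length L = 10.
q = error_free_replication_probability(0.15, 10)
print(round(q, 4))
```

      Under these assumptions the result is approximately 0.197, which is consistent with the 0.196 quoted above being a replication accuracy rather than an error threshold.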

      • Figure 3C: this figure shows no statistically significant effect?

      Thank you for pointing this out. We statistically tested the hypothesis that the GC content differs between the surviving and the extinct master subsets. This analysis revealed that the differences between these two groups are statistically significant, which we have now included in the manuscript at lines 380-390: “A direct investigation of whether the sequence composition of the master types is associated with their survival outcome was conducted using the data from the constant population model simulation results (Figure 2). In these data, the average GC content was measured to be lower in the surviving master subpopulations than in the extinct subpopulations (Figure 3C). To determine whether this difference was statistically significant, nonparametric, two-sample Wilcoxon rank-sum tests (Hollander & Wolfe, 1999) were performed on the GC content of the extinct-surviving master subsets. The GC content was significantly different between these two groups in all nine investigated parameter combinations of population size (N) and replicability distance (δ) at p<0.05 level, indicating a selective advantage for a lower GC content in the constant population model context. The exact p values obtained from this analysis are shown in Figure 3C.”
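
      For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the normal approximation with a tie correction. It is not the authors' analysis code: exact p values for small samples, as tabulated in standard references, may differ from this approximation, and the GC counts in the usage example are invented for illustration only.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (w, z, p), where w is the sum of the pooled ranks of sample x.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies (ranks start at 1).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j

    w = sum(rank_of[v] for v in x)
    mean_w = n1 * (n + 1) / 2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        return w, 0.0, 1.0  # all observations identical: no evidence of a difference
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, z, p


# Hypothetical GC counts (number of G or C bases per 10-mer), for illustration only.
surviving = [3, 4, 4, 5, 3, 4]
extinct = [6, 7, 5, 6, 7, 8]
print(rank_sum_test(surviving, extinct))
```

      Because GC content of short oligomers takes only a few discrete values, ties are common, which is why the variance in this sketch includes the tie correction.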

      • line 542: "phase transition-like species extinction (Figure 4B)": such a clear threshold is not apparent.

      Thank you for pointing out the incorrect phrasing. As there is no clear threshold in the number of coexisting types as a function of the mutation rate, we removed the “phase transition-like” expression: “However, when finite population sizes and stochastic effects are taken into account, at the largest investigated per-base mutation rate (𝑝_mut = 0.15), the summed relative steady-state master frequencies approach zero (Figure 4C) with accelerating species extinction (Figure 4B), indicating that this value is close to the system's empirical error threshold.” (lines 589-594).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      On the whole, the work is well done and presented, there are no major recommendations. It seems a good idea to cite and briefly discuss this recent paper: https://pubmed.ncbi.nlm.nih.gov/36996101/ which develops a symbiotic scenario of the coevolution of primordial replicators and reproducers that appears to be fully compatible with the results of the current work.

      Thank you for bringing this article to our attention. We have inserted the following sentence at lines 621-624: “The demonstrated diversity-maintaining mechanism of finite parabolic populations can be used as a plug-in model to investigate the coevolution of naked and encapsulated molecular replicators (e.g., Babajanyan et al., 2023).”

      The manuscript is well written, but there are some minor glitches that merit attention. For example:

      l. 5 "carriers presents a problem, because product formation and mutual hybridization" - "mutual" is superfluous here, delete

      l. 13 "amplification. In addition, sequence effects (GC content) and the strength of resource" - hardly "effects" - should be 'features' or 'properties'

      l. 41 "If enzyme-free replication of oligomer modules with a high degree of sequence" - "modules" here is only confusing - simply, "oligomers"

      l. 44 "under ecological competition conditions with which distinct replicator types with different" - delete "with" etc, there are many such minor glitches that are best corrected.

      Thank you for pointing these out; we have corrected them. Other drafting errors, glitches and superfluous sentences have also been corrected.

      Reviewer #2 (Recommendations For The Authors):


      Editor (Recommendations For The Authors):

      In the manuscript, it appears that coexistence is assessed at a given point in time, while figures seem to show that it remains time-dependent. It would be great if the authors could clarify this and/or discuss this.

      We appreciate you bringing this to our attention, as we had indeed neglected to elaborate on this important point. The steady-state character of the coexistence is assessed in our model in the following way: the relative frequency of each master sequence is tested for the condition of ≥ 100- (cut-off relative frequency for survival) in every 2,000th replication step in the interval between 10,000 replication steps before termination and actual termination (10= replication steps). If the above condition is true more than once, we consider the master type in question as having survived (we have included this explanation in the Methods section: lines 258-268). Although this relatively narrow time interval can still be regarded as a snapshot of the state of the system, in our numerical experience the resulting measure is a reliable quantitative indicator of the apparent stability of species coexistence in the parabolic dynamics.
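
      The survival criterion described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the cut-off value is left as a parameter because it is not legible in this text, the checkpoints are assumed to include both ends of the final 10,000-step window, and the function and argument names are hypothetical.

```python
def survived(freq_at_step, final_step, cutoff, window=10_000, stride=2_000):
    """Return True if a master type counts as surviving.

    freq_at_step: mapping from replication step to relative frequency.
    The frequency is checked at every `stride`-th step in the last `window`
    steps; the type survives if it reaches the cut-off more than once.
    """
    checkpoints = range(final_step - window, final_step + 1, stride)
    hits = sum(1 for step in checkpoints if freq_at_step[step] >= cutoff)
    return hits > 1


# Hypothetical trajectory and cut-off, for illustration only.
trajectory = {
    90_000: 0.002, 92_000: 0.0, 94_000: 0.0,
    96_000: 0.0015, 98_000: 0.0, 100_000: 0.0,
}
print(survived(trajectory, final_step=100_000, cutoff=0.001))
```

      Requiring more than one hit guards against counting a type that only fluctuates above the cut-off at a single checkpoint.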

    2. Reviewer #1 (Public Review):


      Szathmary and colleagues explore the parabolic growth regime of replicator evolution. Parabolic growth occurs when nucleic acid strand separation is the rate-limiting step of the replication process, which would have been the case for non-enzymatic replication of short oligonucleotides that could precede the emergence of ribozyme polymerases and helicases. The key result is that parabolic replication is conducive to the maintenance of genetic diversity, that is, coexistence of numerous master sequences (the Gause principle does not apply). Another important finding is that there is no error threshold for parabolic replication except for the extreme case of zero fidelity.


      I find both the analytic and the numerical results to be quite convincing and well described. The results of this work are potentially important because they reveal aspects of a realistic evolutionary scenario for the origin of replicators.


      There are no obvious technical weaknesses. It can be argued that the results represent an incremental advance because many aspects of parabolic replication have been explored previously (the relevant publications are properly cited). Obviously, the work is purely theoretical, and an experimental study of parabolic replication is due. In the opinion of this reviewer, though, these are understandable limitations that do not actually detract from the value of this work.

    3. eLife assessment

      This study provides a valuable theoretical exploration of non-enzymatic sustained replication of RNA systems, in the parabolic growth regime of the evolution of putative primordial replicators. It provides convincing evidence that parabolic growth mitigates the error threshold catastrophe, thus demonstrating another way in which this regime contributes to the maintenance of genetic diversity. The findings shed light on relevant evolutionary regimes of primordial replicators, with potential applicability to our understanding of the origin of life.

    4. Reviewer #2 (Public Review):


      A dominant hypothesis concerning the origin of life is that, before the appearance of the first enzymes, RNA replicated non-enzymatically by templating. However, this replication was probably not very efficient, due to the propensity of single strands to bind to each other, thus inhibiting template replication. This phenomenon, known as product inhibition, has been shown to lead to parabolic growth instead of exponential growth. Previous works have shown that this situation limits competition between alternative replicators and therefore promotes RNA population diversity. The present work examines this scenario in an agent-based model of RNA replication, taking into account finite population size, mutations and differences in GC content. The main results are (1) confirmation that parabolic growth promotes diversity, but that when the population size is small enough, sequences least efficient at replicating may nevertheless go extinct; (2) the observation that fitness is not only controlled by the replicability of sequences, but also by their GC content; (3) the observation that parabolic growth attenuates the impact of mutations and, in particular, that the error threshold to which exponentially growing sequences are subject can be exceeded, enabling sequence identity to be maintained at higher mutation rates.
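
      For readers less familiar with the terminology, the contrast between parabolic and exponential growth can be sketched with the idealized square-root rate law. The symbols $x$ (replicator concentration), $k$ (rate constant) and $x_0$ (initial concentration) are illustrative and not taken from the manuscript, whose agent-based model is considerably richer.

```latex
\begin{align}
  \text{parabolic:}\quad   \frac{dx}{dt} &= k\,x^{1/2}
    \;\Longrightarrow\; x(t) = \left(\sqrt{x_0} + \tfrac{k}{2}\,t\right)^{2}, \\
  \text{exponential:}\quad \frac{dx}{dt} &= k\,x
    \;\Longrightarrow\; x(t) = x_0\,e^{kt}.
\end{align}
```

      In the parabolic case the concentration grows only quadratically in time, which is the sub-exponential behaviour the review refers to.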


      The analyses are sound and the observations intriguing. Indeed, while it has been noted previously that parabolic growth promotes coexistence, this is the first work to show that it can also mitigate the error threshold catastrophe, which is often presented as a major obstacle to our understanding of the origin of life.


      A general weakness, which can however be seen as inherent in an agent-based model that aims to be more realistic than earlier, more phenomenological models, is the proliferation of parameters. The choice and values of these parameters are generally justified and, in many cases, several values are tested to assess the robustness of the results, but it can be difficult for the reader to distinguish the modeling choices that are truly critical from those that are less so.

    1. Author Response

      eLife assessment

      In this study, the authors offer a theoretical explanation for the emergence of nematic bundles in the actin cortex, carrying implications for the assembly of actomyosin stress fibers. As such, the study is a valuable contribution to the field of actomyosin organization in the actin cortex. While the theoretical work is solid, experimental evidence in support of the model assumptions remains incomplete. The presentation could be improved to enhance accessibility for readers without a strong background in hydrodynamic and nematic theories.

      To address the weaknesses identified in this assessment, we plan to expand the description of the theoretical model to make it more accessible to a broader spectrum of readers. We will discuss in more detail the relation between the different mathematical terms and physical processes at the molecular scale, as well as the experimental evidence supporting the model assumptions. We will also discuss more explicitly how our results are relevant to different systems exhibiting actomyosin nematic bundles beyond stress fibers.

      Public Reviews:

      Reviewer #1 (Public Review):


      In this article, Mirza et al developed a continuum active gel model of the actomyosin cytoskeleton that accounts for nematic order and density variations in actomyosin. Using this model, they identify the requirements for the formation of dense nematic structures. In particular, they show that self-organization into nematic bundles requires both flow-induced alignment and active tension anisotropy in the system. By varying model parameters that control active tension and nematic alignment, the authors show that their model reproduces a rich variety of actomyosin structures, including tactoids, fibres, asters as well as crystalline networks. Additionally, discrete simulations are employed to calculate the activity parameters in the continuum model, providing a microscopic perspective on the conditions driving the formation of fibrillar patterns.


      The strength of the work lies in its delineation of the parameter ranges that generate distinct types of nematic organization within actomyosin networks. The authors pinpoint the physical mechanisms behind the formation of fibrillar patterns, which may offer valuable insights into stress fiber assembly. Another strength of the work is connecting activity parameters in the continuum theory with microscopic simulations.

      We thank the referee for these comments.


      This paper is a very difficult read for nonspecialists, especially if you are not well-versed in continuum hydrodynamic theories. Efforts should be made to connect various elements of theory with biological mechanisms, which is mostly lacking in this paper. The comparison with experiments is predominantly qualitative.

      We agree with the referee that the manuscript will benefit from a better description of the theoretical model and the results in relation with specific molecular and cellular mechanisms. We will further emphasize how a number of experimental observations in the literature support our model assumptions and can be explained by our results. A quantitative comparison is difficult for several reasons. First, many of the parameters in our theory have not been measured, and in fact estimates in the literature often rely on comparison with hydrodynamic models such as ours. Second, the effective physical properties of actomyosin gels can vary wildly between cells, which may explain the diversity of forms, dynamics and functions. For these reasons, we chose to delineate regimes leading to qualitatively different emerging architectures and dynamics. In the revised manuscript, we will make this point clearer and will further study the literature to seek quantitative comparison.

      It is unclear if the theory is suited for in vitro or in vivo actomyosin systems. The justification for various model assumptions, especially concerning their applicability to actomyosin networks, requires a more thorough examination.

      We thank the referee for this comment. Our theory is applicable to actomyosin in living cells. To our knowledge, reconstituted actomyosin gels currently lack the ability to sustain the dynamical steady-states involved in the proposed self-organization mechanism, which balance actin flows with turnover. In addition to actomyosin gels in living cells, in vitro systems based on encapsulated cell extracts can also sustain such dynamical steady states [e.g. https://doi.org/10.1038/s41567-018-0413-4], and therefore our theory may be applicable to these systems as well. Of course, with advancements in the field of reconstituted systems, this may change in the near future. We will explicitly discuss this point in the revised manuscript.

      The classification of different structures demands further justification. For example, the rationale behind categorizing structures as sarcomeric remains unclear when nematic order is perpendicular to the axis of the bands. Sarcomeres traditionally exhibit a specific ordering of actin filaments with alternating polarity patterns.

      We agree and will avoid the term “sarcomeric”.

      Similarly, the criteria for distinguishing between contractile and extensile structures need clarification, as one would expect extensile structures to be under tension contrary to the authors' claim.

      We plan to clarify this point by representing in a main figure the stress profiles across dense nematic structures (currently in Supp Fig 2), along with a more detailed description. In short, depending on the parameter regime, the competition between active and viscous stresses in the actin gel determines whether the emergent structures are extensile or contractile. In our system tension is positive in all directions at all times. However, in “contractile” structures, tension is larger along the bundle, whereas in “extensile” structures, tension is larger perpendicular to the bundle. This is consistent with the common expression for active stress of incompressible nematic systems [see e.g. https://doi.org/10.1038/s41467-018-05666-8], which takes the form –zQ, where z is positive for an extensile system, showing that in this case active tension is negative along the nematic direction. This point, which was also raised by another referee, will be clarified and connected to existing literature.
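
      The sign convention in this response can be made explicit with a short worked example. It is only a sketch under stated assumptions: a two-dimensional nematic tensor normalized as $\mathbf{Q} = S(\mathbf{n}\otimes\mathbf{n} - \mathbf{I}/2)$, the coefficient written z above denoted $\zeta$, a unit vector $\mathbf{m}$ perpendicular to the director $\mathbf{n}$, and an added isotropic tension $\sigma_0$ introduced here purely for illustration (it is not the manuscript's Eq. 7).

```latex
\begin{align}
  \boldsymbol{\sigma}^{\mathrm{act}} &= \sigma_0\,\mathbf{I} - \zeta\,\mathbf{Q},
  \qquad
  \mathbf{Q} = S\left(\mathbf{n}\otimes\mathbf{n} - \tfrac{1}{2}\,\mathbf{I}\right), \\
  \mathbf{n}\cdot\boldsymbol{\sigma}^{\mathrm{act}}\cdot\mathbf{n} &= \sigma_0 - \tfrac{1}{2}\,\zeta S,
  \qquad
  \mathbf{m}\cdot\boldsymbol{\sigma}^{\mathrm{act}}\cdot\mathbf{m} = \sigma_0 + \tfrac{1}{2}\,\zeta S .
\end{align}
```

      With $\sigma_0 = 0$ and $\zeta > 0$ the tension along $\mathbf{n}$ is negative, as stated for the incompressible case. With $\sigma_0 > \tfrac{1}{2}|\zeta| S$ both components are positive, and the sign of $\zeta$ only decides whether tension is larger perpendicular to the director ($\zeta > 0$, extensile) or along it ($\zeta < 0$, contractile).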

      Additionally, it is unclear if the model's predictions for fiber dynamics align with observations in cells, as stress fibers exhibit a high degree of dynamism and tend to coalesce with neighboring fibers during their assembly phase.

      In the present work, we focus on the self-organization of a periodic patch of actomyosin gel. However, in adherent cells boundary conditions play an essential role, e.g. with inflow at the cell edge as a result of polymerization and exclusion at the nucleus. In ongoing work, we are using the present model to study the dynamics of assembly and reconfiguration of dense nematic structures in domains with boundary conditions mimicking those in adherent cells, as suggested by the referee. We would like to note, however, that the prominent stress fibers in cells adhered to stiff substrates, so abundantly reported in the literature, are not the only instance of dense nematic actin bundles, and may not be representative of physiologically relevant situations. In the present manuscript, we emphasize the relation of the predicted organizations with those found in different in vivo contexts not related to stress fibers, such as the aligned patterns of bundles in insects (trachea, scales in butterfly wings), in hydra, or in reproductive organs of C. elegans; the highly dynamical network of bundles observed in C. elegans early embryos; or the labyrinth patterns of micro-ridges in the apical surface of epidermal cells in fish. We will further emphasize these points in the revised manuscript.

      Finally, it seems that the microscopic model is unable to recapitulate the density patterns predicted by the continuum theory, raising questions about the suitability of the simulation model.

      We thank the referee for raising this question, which needs further clarification. The goal of the microscopic model is not to reproduce the self-organized patterns predicted by the active gel theory. The microscopic model lacks essential ingredients, notably a realistic description of hydrodynamics and turnover. Our goal with the agent-based simulations is to extract the relation between nematic order and active stresses for a small homogeneous sample of the network. This small domain is meant to represent the homogeneous active gel prior to pattern formation, and it allows us to substantiate key assumptions of the continuum model leading to pattern formation, notably the dependence of isotropic and deviatoric components of the active stress on density and nematic order (Eq. 7) and the active generalized stress promoting ordering.

      We should mention that reproducing the range of out-of-equilibrium mesoscale architectures predicted by our active gel model with agent-based simulations seems at present not possible, or at least significantly beyond the state-of-the-art. We note for instance that parameter regimes in which agent-based simulations of actin gels display extended contractile steady-states are non-generic, as these simulations often lead to irreversible clumping (as do many reconstituted contractile systems), see e.g. https://doi.org/10.1038/ncomms10323 or https://doi.org/10.1371/journal.pcbi.1005277. Very few references report sustained actin flows or the organization of a few bundles (https://doi.org/10.1371/journal.pcbi.1009506). While agent-based cytoskeletal simulations are very attractive because they directly connect with molecular mechanisms, active gel continuum models are better suited to describe out-of-equilibrium emergent hydrodynamics at a mesoscale. We believe that these two complementary modeling frameworks are rather disconnected in the literature, and for this reason, we have attempted to substantiate our continuum modeling with discrete simulations. In the revised manuscript, we will better frame the relationship between them.

      Reviewer #2 (Public Review):


      The article by Waleed et al discusses the self organization of actin cytoskeleton using the theory of active nematics. Linear stability analysis of the governing equations and computer simulations show that the system is unstable to density fluctuations and self organized structures can emerge. While the context is interesting, I am not sure whether the physics is new. Hence I have reservations about recommending this article.

      We thank the referee for these comments. In the revised manuscript, we will highlight the novelty of the paper in terms of the theoretical model, the mechanism of patterning of dense nematic structures, the nature and dynamics of the resulting architectures, their relation with the experimental record, and the connection with microscopic models.

      We will emphasize the fact that nematic architectures in the actin cytoskeleton are characterized by a co-localization of order and density (and strong variations in each of these fields), that recent work shows that isotropic and nematic organizations coexist and are part of a single heterogeneous network, that the emergence and maintenance of nematic order requires active contraction, and that the assembly and maintenance of dense nematic bundles involves convergent flows. None of these key features can be described by the common incompressible models of active nematics. To address this, we develop here a compressible and density dependent model for an active nematic gel. We will carefully justify that the proposed model is meaningful for actomyosin gels, and we will highlight the commonalities and differences with previous models of active nematics.


      (i) Analytical calculations complemented with simulations (ii) Theory for cytoskeletal network


      Not placed in the context or literature on active nematics.

      We agree with the referee that the manuscript requires a better contextualization of the work in relation with the very active field of active nematics. In the revised manuscript, we will clearly describe the relation of our model with existing ones.

      Reviewer #3 (Public Review):

      The manuscript "Theory of active self-organization of dense nematic structures in the actin cytoskeleton" analyses self-organized pattern formation within a two-dimensional nematic liquid crystal theory and uses microscopic simulations to test the plausibility of some of the conclusions drawn from that analysis. After performing an analytic linear stability analysis that indicates the possibility of patterning instabilities, the authors perform fully non-linear numerical simulations and identify the emergence of stripe-like patterning when anisotropic active stresses are present. Following a range of qualitative numerical observations on how parameter changes affect these patterns, the authors identify, besides isotropic and nematic stress, also active self-alignment as an important ingredient to form the observed patterns. Finally, microscopic simulations are used to test the plausibility of some of the conclusions drawn from continuum simulations.

      The paper is well written, figures are mostly clear and the theoretical analysis presented in both main text and supplement is rigorous. Mechano-chemical coupling has emerged in recent years as a crucial element of cell cortex and tissue organization and it is plausible to think that both isotropic and anisotropic active stresses are present within such effectively compressible structures. Even though not yet stated this way by the authors, I would argue that combining these two is one of the key ingredients that distinguishes this theoretical paper from similar ones. The diversity of patterning processes experimentally observed is nicely elaborated on in the introduction of the paper, though other closely related previous work could also have been included in these references (see below for examples).

      We thank the referee for these comments and for the suggestion to emphasize the interplay of isotropic and anisotropic active tension, which is possible only in a compressible gel. We also thank the referee for the suggestions to better connect with the existing literature.

      To introduce the continuum model, the authors exclusively cite their own, unpublished pre-print, even though the final equations take the same form as previously derived and used by other groups working in the field of active hydrodynamics (a certainly incomplete list: Marenduzzo et al (PRL, 2007), Salbreux et al (PRL, 2009, cited elsewhere in the paper), Jülicher et al (Rep Prog Phys, 2018), Giomi (PRX, 2015),...). To make better contact with the broad active liquid crystal community and to delineate the present work more compellingly from existing results, it would be helpful to include a more comprehensive discussion of the background of the existing theoretical understanding on active nematics. In fact, I found it often agrees nicely with the observations made in the present work, an opportunity to consolidate the results that is sometimes currently missed out on. For example, it is known that self-organised active isotropic fluids form in 2D hexagonal and pulsatory patterns (Kumar et al, PRL, 2014), as well as contractile patches (Mietke et al, PRL 2019), just as shown and discussed in Fig. 2. It is also known that extensile nematics, \kappa<0 here, draw in material laterally of the nematic axis and expel it along the nematic axis (the other way around for \kappa>0, see e.g. Doostmohammadi et al, Nat Comm, 2018 "Active Nematics" for a review that makes this point), consistent with all relative nematic director/flow orientations shown in Figs. 2 and 3 of the present work.

      We thank the referee for these suggestions. Indeed, in the original submission we had outsourced much of the justification of the model and the relevant literature to a related pre-print, but this is not reasonable. In the revised manuscript, we will discuss our model in the context of the state-of-the-art, emphasizing connections with existing results.

      The results of numerical simulations are well-presented. Large parts of the discussion of numerical observations - specifically around Fig. 3 - are qualitative and it is not clear why the analysis is restricted to \kappa<0. Some of the observations resonate with recent discussions in the field, for example the observation of effectively extensile dynamics in a contractile system is interesting and reminiscent of ambiguities about extensile/contractile properties discussed in recent preprints (https://arxiv.org/abs/2309.04224). It is convincingly concluded that, besides nematic stress on top of isotropic one, active self-alignment is a key ingredient to produce the observed patterns.

      We thank the referee for these comments. We will expand the description of the results around Figure 3. We are reluctant to extend the detailed analysis of emergent architectures and dynamics to the case \kappa > 0 as it leads to architectures not observed, to our knowledge, in actin networks. We will expand the characterization of emergent contractile/extensile networks by describing the distribution of the different components of the stress tensor across the bundles and will place our results in the context of related recent work.

      I compliment the authors for trying to gain further mechanistic insights into this conclusion with microscopic filament simulations that are diligently performed. It is rightfully stated that these simulations only provide plausibility tests and, within this scope, I would say the authors are successful. At the same time, it leaves open questions that could have been discussed more carefully. For example, I wonder what can be said microscopically about the regime \kappa>0 (which is dropped ad hoc from Fig. 3 onward), in which the continuum theory does also predict the formation of stripe patterns, besides the short comment at the very end? How does the spatially inhomogeneous organization the continuum theory predicts fit in the presented microscopic picture and vice versa?

      We thank the referee for this compliment. We think that the point raised by the referee is very interesting. It is reasonable to expect that the sign of \kappa will not be a constant but rather depend on S and \rho. Indeed, for a sparse network with low order, the progressive bundling by crosslinkers acting on nearby filaments is likely to produce a large active stress perpendicular to the nematic direction, whereas in a dense and highly ordered region, myosin motors are more likely to effectively contract along the nematic direction, and there is little room for additional lateral contraction by further bundling. In the revised manuscript, we plan to examine this issue further in two ways. First, we plan to perform additional agent-based simulations in a regime leading to \kappa > 0. Second, we will modify the active gel model such that \kappa < 0 for low density/order, so that a fibrillar pattern is assembled, and \kappa > 0 for high density/order, so that the emergent fibers are highly contractile.

      Overall, the paper represents a valuable contribution to the field of active matter and, if strengthened further, might provide a fruitful basis to develop new hypotheses about the dynamic self-organisation of dense filamentous bundles in biological systems.

    1. Author Response

      We would like to thank the editorial board and the reviewers for their assessment of our manuscript and their constructive feedback that we believe will make our manuscript stronger and clearer. Please find below our provisional response to the public reviews; these responses outline our plan to address the concerns of the reviewers for a planned resubmission. Our responses are written in red.

      Public Reviews:

      Reviewer #1 (Public Review):


      In this paper, Misic et al showed that white matter properties can be used to classify subacute back pain patients that will develop persisting pain.


      Compared to most previous papers studying associations between white matter properties and chronic pain, the strength of the method is to perform a prediction in unseen data. Another strength of the paper is the use of three different cohorts. This is an interesting paper that provides a valuable contribution to the field.

      We thank the reviewer for emphasizing the strength of our paper and the importance of validation on multiple unseen cohorts.


      The authors imply that their biomarker could outperform traditional questionnaires to predict pain: "While these models are of great value showing that few of these variables (e.g. work factors) might have significant prognostic power on the long-term outcome of back pain and provide easy-to-use brief questionnaires-based tools, (21, 25) parameters often explain no more than 30% of the variance (28-30) and their prognostic accuracy is limited.(31)". I don't think this is correct; questionnaire-based tools can achieve far greater prediction than their model in about half a million individuals from the UK Biobank (Tanguay-Sabourin et al., A prognostic risk score for the development and spread of chronic pain, Nature Medicine 2023).

      We agree with the reviewer that we might have under-estimated the prognostic accuracy of questionnaire-based tools, especially given the strong predictive accuracy shown by Tanguay-Sabourin et al. (2023). In the revised version, we will change both the introduction and the discussion to reflect the questionnaire-based prognostic accuracy reported in the seminal work by Tanguay-Sabourin et al. We do note here, however, that the latter paper, while very novel, is unique in showing the power of questionnaires. In addition, the questionnaires we have tested in our cohort did not show any baseline differences suggestive of prognostic accuracy.

      Moreover, the main weakness of this study is the sample size. It remains small despite having 3 cohorts. This is problematic because results are often overfitted in such a small sample size brain imaging study, especially when all the data are available to the authors at the time of training the model (Poldrack et al., Scanning the horizon: towards transparent and reproducible neuroimaging research, Nature Reviews in Neuroscience 2017). Thus, having access to all the data, the authors have a high degree of flexibility in data analysis, as they can retrain their model any number of times until it generalizes across all three cohorts. In this case, the testing set could easily become part of the training making it difficult to assess the real performance, especially for small sample size studies.

      The reviewer raises a very important point about the limited sample size and the methodology intrinsic to model development and testing. We acknowledge the small sample size in the “Limitations” section of the discussion. In the resubmission, we will acknowledge the degree of flexibility that is afforded by having access to all the data at once. However, we will also note that our SLF-FA based model is a simple cut-off approach that does not include any learning or hidden layers and that the data obtained from Open Pain were never part of the “training” set at any point at either the New Haven or the Mannheim site. Regarding our SVC approach, we follow standard procedures for machine learning where we never mix the training and testing sets. The models are trained on the training data with parameters selected based on cross-validation within the training data. Therefore, no models have ever seen the test data set. The model performances we reported reflect the prognostic accuracy of our model. Finally, as discussed by Spisak et al.1, the key determinant of the required sample size in predictive modeling is the “true effect size of the brain-phenotype relationship”, which we think is the determinant of the replication we observe in this study. As such the effect size in the New Haven and Mannheim data is Cohen’s d >1.
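
      The separation of training and test data, and the effect-size measure mentioned above, can be illustrated with a small Python sketch. It is not the authors' pipeline: `fit_cutoff`, `accuracy` and `cohens_d` are hypothetical helper names, the rule "predict recovery when the value is at or above the cut-off" is an assumption, and any numbers fed to these functions would be synthetic.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2) / (
        na + nb - 2
    )
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def accuracy(values, labels, cut):
    """Fraction classified correctly; label 1 = recovered, 0 = persistent."""
    correct = sum(1 for v, y in zip(values, labels) if (v >= cut) == bool(y))
    return correct / len(values)


def fit_cutoff(train_values, train_labels):
    """Choose the cut-off with the best accuracy using the training data only."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(train_values)):
        acc = accuracy(train_values, train_labels, cut)
        if acc > best_acc:  # ties keep the lowest cut-off
            best_cut, best_acc = cut, acc
    return best_cut
```

      The point of the design is that `fit_cutoff` never receives test data; the held-out cohort enters only through a later call to `accuracy` with the already fixed cut-off.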

      Even if the performance was properly assessed, their models show AUCs between 0.65-0.70, which is usually considered as poor, and most likely without potential clinical use. Despite this, their conclusion was: "This biomarker is easy to obtain (~10 min of scanning time) and opens the door for translation into clinical practice." One may ask who is really willing to use an MRI signature with a relatively poor performance that can be outperformed by self-report questionnaires?

      The reviewer is correct: the model performance is poor to fair, which limits its usefulness for clinical translation. We wanted to emphasize that diffusion images can be obtained in a short period of time and, hence, as the predictive accuracy of such models improves, clinical translation becomes closer to reality. In addition, our findings are based on old diffusion data and a limited sample size coming from different sites and different acquisition sequences. This by itself would limit the accuracy, especially since evidence shows that sample size also affects model performance (i.e. testing AUC)1. In the revision, we will re-word the sentence mentioned by the reviewer to reflect the points discussed here. This also motivates us to collect a more homogeneous and larger sample.
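
      For reference, the AUC discussed in this exchange can be read as the probability that a randomly chosen patient from one group receives a higher score than a randomly chosen patient from the other group. The sketch below computes it by direct pair counting; the function name `auc` and its two-list interface are assumptions made for illustration, not part of the authors' analysis.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve by pair counting (ties count as one half).

    scores_pos -- scores of the group expected to score higher
    scores_neg -- scores of the other group
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

      A value of 0.5 corresponds to chance and 1.0 to perfect separation, which places the reported 0.65-0.70 range in context.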

      Overall, these criticisms are more about the wording sometimes used and the inference they made. I think the strength of the evidence is incomplete to support the main claims of the paper.

      Despite these limitations, I still think this is a very relevant contribution to the field. Showing predictive performance through cross-validation and testing in multiple cohorts is not an easy task and this is a strong effort by the team. I strongly believe this approach is the right one and I believe the authors did a good job.

      We thank the reviewer for acknowledging that our effort and approach were the right ones.

      Minor points:


      I get the voxel-wise analysis, but I don't understand the methods for the structural connectivity analysis between the 88 ROIs. Have the authors run tractography or have they used a predetermined streamlined form of 'population-based connectome'? They report that models of AUC above 0.75 were considered and tested in the Chicago dataset, but we have no information about what the model actually learned (although this can be tricky for decision tree algorithms).

      We apologize for the lack of clarity; we did run tractography and we did not use a predetermined streamlined form of the connectome. We will clarify this point in the methods section.

      Finding which connections are important for the classification of SBPr and SBPp is difficult because of our choices during data preprocessing and SVC model development: (1) preprocessing steps which included TNPCA for dimensionality reduction, and regressing out the confounders (i.e., age, sex, and head motion); (2) the harmonization for effects of sites; and (3) the Support Vector Classifier which is a hard classification model2. Such models cannot tell us the features that are important in classifying the groups. Our model is considered a black-box predictive model like neural networks.


      What results are shown in Figure 7? It looks more descriptive than the actual results.

      The reviewer is correct; Figure 7 and supplementary Figure 4 are both qualitatively illustrating the shape of the SLF.

      Reviewer #2 (Public Review):

      The present study aims to investigate brain white matter predictors of back pain chronicity. To this end, a discovery cohort of 28 patients with subacute back pain (SBP) was studied using white matter diffusion imaging. The cohort was investigated at baseline and one-year follow-up when 16 patients had recovered (SBPr) and 12 had persistent back pain (SBPp). A comparison of baseline scans revealed that SBPr patients had higher fractional anisotropy values in the right superior longitudinal fasciculus (SLF) than SBPp patients and that FA values predicted changes in pain severity. Moreover, the FA values of SBPr patients were larger than those of healthy participants, suggesting a role of FA of the SLF in resilience to chronic pain. These findings were replicated in two other independent datasets. The authors conclude that the right SLF might be a robust predictive biomarker of CBP development with the potential for clinical translation.

      Developing predictive biomarkers for pain chronicity is an interesting, timely, and potentially clinically relevant topic. The paradigm and the analysis are sound, the results are convincing, and the interpretation is adequate. A particular strength of the study is the discovery-replication approach with replications of the findings in two independent datasets.

      We thank reviewer 2 for pointing to the strength of our study.

      The following revisions might help to improve the manuscript further.

      Definition of recovery. In the New Haven and Chicago datasets, SBPr and SBPp patients are distinguished by reductions of >30% in pain intensity. In contrast, in the Mannheim dataset, both groups are distinguished by reductions of >20%. This should be harmonized. Moreover, as there is no established definition of recovery (reference 79 does not provide a clear criterion), it would be interesting to know whether the results hold for different definitions of recovery. Control analyses for different thresholds could strengthen the robustness of the findings.

      The reviewer raises an important point regarding the definition of recovery. To address the reviewer's concern, we will add a supplementary figure showing the results in the Mannheim dataset if a 30% reduction is used as the recovery criterion. We would like to emphasize several points that support the use of different recovery thresholds between New Haven and Mannheim. The New Haven primary pain ratings relied on a visual analogue scale (VAS), while the Mannheim data relied on the German version of the West-Haven-Yale Multidimensional Pain Inventory. In addition, the Mannheim data was pre-registered with a definition of recovery at 20% and is part of a larger sub-acute to chronic pain study with prior publications from this cohort using the 20% cut-off (3). Finally, a more recent consensus publication (4) from IMMPACT indicates that a change of at least 30% is needed for a moderate improvement in pain on the 0-10 Numerical Rating Scale but that this percentage depends on baseline pain levels.
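
      To make the threshold question concrete, the following minimal sketch (not part of the authors' analysis) shows how a patient's SBPr/SBPp label can flip between a 20% and a 30% cut-off. The ratings, patient IDs, and function names are hypothetical and chosen only for illustration.

```python
def percent_reduction(baseline, follow_up):
    """Percentage drop in pain intensity from baseline to follow-up."""
    return (baseline - follow_up) / baseline * 100.0


def classify(baseline, follow_up, threshold):
    """Label 'SBPr' if pain fell by more than `threshold` percent, else 'SBPp'."""
    if percent_reduction(baseline, follow_up) > threshold:
        return "SBPr"
    return "SBPp"


# Hypothetical (baseline, follow-up) pain ratings; illustrative only, not study data.
patients = {"P1": (60, 45), "P2": (50, 30), "P3": (70, 63)}

labels_20 = {pid: classify(b, f, 20) for pid, (b, f) in patients.items()}
labels_30 = {pid: classify(b, f, 30) for pid, (b, f) in patients.items()}

# Patients whose group membership depends on the chosen cut-off.
changed = [pid for pid in patients if labels_20[pid] != labels_30[pid]]
print(labels_20, labels_30, changed)
```

      In this toy example only the patient with a 25% reduction changes group, which is the kind of sensitivity a control analysis across thresholds would quantify.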

      Analysis of the Chicago dataset. The manuscript includes results on FA values and their association with pain severity for the New Haven and Mannheim datasets but not for the Chicago dataset. It would be straightforward to show figures like Figures 1 - 4 for the Chicago dataset, as well.

      We welcome the reviewer’s suggestion; we will therefore add these analyses to the results section of our manuscript upon resubmission.

      Data sharing. The discovery-replication approach of the present study distinguishes the present from previous approaches. This approach enhances the belief in the robustness of the findings. This belief would be further enhanced by making the data openly available. It would be extremely valuable for the community if other researchers could reproduce and replicate the findings without restrictions. It is not clear why the fact that the studies are ongoing prevents the unrestricted sharing of the data used in the present study.

      Reviewer #3 (Public Review):


      Authors suggest a new biomarker of chronic back pain with the option to predict the result of treatment. The authors found a significant difference in a fractional anisotropy measure in superior longitudinal fasciculus for recovered patients with chronic back pain.


      The results were reproduced in three different groups at different studies/sites.


      The number of participants is still low.

      We have discussed this point in our replies to reviewer number 1.

      An explanation of microstructure changes was not given.

      The reviewer points to an important gap in our discussion. While we cannot do a direct study of actual tissue micro-structure, we will explore further the changes observed in the SLF by calculating diffusivity measures and discuss possible explanations of these changes.

      Some technical drawbacks are presented.

      We are uncertain if the reviewer is suggesting that we have acknowledged certain technical drawbacks and expects further elaboration on our part. We kindly request that the reviewer specify what particular issues they would like us to address so that we can respond appropriately.

      (1) Spisak T, Bingel U, Wager TD. Multivariate BWAS can be replicable with moderate sample sizes. Nature 2023;615:E4-E7.

      (2) Liu Y, Zhang HH, Wu Y. Hard or Soft Classification? Large-margin Unified Machines. J Am Stat Assoc 2011;106:166-177.

      (3) Loffler M, Levine SM, Usai K, et al. Corticostriatal circuits in the transition to chronic back pain: The predictive role of reward learning. Cell Rep Med 2022;3:100677.

      (4) Smith SM, Dworkin RH, Turk DC, et al. Interpretation of chronic pain clinical trial outcomes: IMMPACT recommended considerations. Pain 2020;161:2446-2461.

    2. eLife assessment

      This valuable study provides incomplete evidence that white matter diffusion imaging of the right superior longitudinal fasciculus might help to develop a predictive biomarker of chronic back pain chronicity. The results are based on a discovery-replication approach with different cohorts, but the sample size is limited, and the clinical relevance is overstated. The findings will interest researchers interested in the brain mechanisms of chronic pain and in developing brain-based biomarkers of chronic pain.

    3. Reviewer #1 (Public Review):


      In this paper, Misic et al showed that white matter properties can be used to classify subacute back pain patients that will develop persisting pain.


      Compared to most previous papers studying associations between white matter properties and chronic pain, the strength of the method is to perform a prediction in unseen data. Another strength of the paper is the use of three different cohorts. This is an interesting paper that provides a valuable contribution to the field.


      The authors imply that their biomarker could outperform traditional questionnaires to predict pain: "While these models are of great value showing that few of these variables (e.g. work factors) might have significant prognostic power on the long-term outcome of back pain and provide easy-to-use brief questionnaires-based tools, (21, 25) parameters often explain no more than 30% of the variance (28-30) and their prognostic accuracy is limited.(31)". I don't think this is correct; questionnaire-based tools can actually achieve far greater prediction than their model in about half a million individuals from the UK Biobank (Tanguay-Sabourin et al., A prognostic risk score for the development and spread of chronic pain, Nature Medicine 2023).

      Moreover, the main weakness of this study is the sample size. It remains small despite having 3 cohorts. This is problematic because results are often overfitted in such a small sample size brain imaging study, especially when all the data are available to the authors at the time of training the model (Poldrack et al., Scanning the horizon: towards transparent and reproducible neuroimaging research, Nature Reviews in Neuroscience 2017). Thus, having access to all the data, the authors have a high degree of flexibility in data analysis, as they can retrain their model any number of times until it generalizes across all three cohorts. In this case, the testing set could easily become part of the training making it difficult to assess the real performance, especially for small sample size studies.

      Even if the performance was properly assessed, their models show AUCs between 0.65-0.70, which is usually considered as poor, and most likely without potential clinical use. Despite this, their conclusion was: "This biomarker is easy to obtain (~10 min of scanning time) and opens the door for translation into clinical practice." One may ask who is really willing to use an MRI signature with a relatively poor performance that can be outperformed by self-report questionnaires?

      Overall, these criticisms are more about the wording sometimes used and the inference they made. I think the strength of the evidence is incomplete to support the main claims of the paper.

      Despite these limitations, I still think this is a very relevant contribution to the field. Showing predictive performance through cross-validation and testing in multiple cohorts is not an easy task and this is a strong effort by the team. I strongly believe this approach is the right one and I believe the authors did a good job.

      Minor points:


      I get the voxel-wise analysis, but I don't understand the methods for the structural connectivity analysis between the 88 ROIs. Have the authors run tractography or have they used a predetermined streamlined form of 'population-based connectome'? They report that models of AUC above 0.75 were considered and tested in the Chicago dataset, but we have no information about what the model actually learned (although this can be tricky for decision tree algorithms).

      Minor: What results are shown in Figure 7? It looks more descriptive than the actual results.

    4. Reviewer #2 (Public Review):

      The present study aims to investigate brain white matter predictors of back pain chronicity. To this end, a discovery cohort of 28 patients with subacute back pain (SBP) was studied using white matter diffusion imaging. The cohort was investigated at baseline and one-year follow-up when 16 patients had recovered (SBPr) and 12 had persistent back pain (SBPp). A comparison of baseline scans revealed that SBPr patients had higher fractional anisotropy values in the right superior longitudinal fasciculus (SLF) than SBPp patients and that FA values predicted changes in pain severity. Moreover, the FA values of SBPr patients were larger than those of healthy participants, suggesting a role of FA of the SLF in resilience to chronic pain. These findings were replicated in two other independent datasets. The authors conclude that the right SLF might be a robust predictive biomarker of CBP development with the potential for clinical translation.

      Developing predictive biomarkers for pain chronicity is an interesting, timely, and potentially clinically relevant topic. The paradigm and the analysis are sound, the results are convincing, and the interpretation is adequate. A particular strength of the study is the discovery-replication approach with replications of the findings in two independent datasets.

      The following revisions might help to improve the manuscript further.

      - Definition of recovery. In the New Haven and Chicago datasets, SBPr and SBPp patients are distinguished by reductions of >30% in pain intensity. In contrast, in the Mannheim dataset, both groups are distinguished by reductions of >20%. This should be harmonized. Moreover, as there is no established definition of recovery (reference 79 does not provide a clear criterion), it would be interesting to know whether the results hold for different definitions of recovery. Control analyses for different thresholds could strengthen the robustness of the findings.

      - Analysis of the Chicago dataset. The manuscript includes results on FA values and their association with pain severity for the New Haven and Mannheim datasets but not for the Chicago dataset. It would be straightforward to show figures like Figures 1 - 4 for the Chicago dataset, as well.

      - Data sharing. The discovery-replication approach of the present study distinguishes the present from previous approaches. This approach enhances the belief in the robustness of the findings. This belief would be further enhanced by making the data openly available. It would be extremely valuable for the community if other researchers could reproduce and replicate the findings without restrictions. It is not clear why the fact that the studies are ongoing prevents the unrestricted sharing of the data used in the present study.

    5. Reviewer #3 (Public Review):


      Authors suggest a new biomarker of chronic back pain with the option to predict the result of treatment. The authors found a significant difference in a fractional anisotropy measure in superior longitudinal fasciculus for recovered patients with chronic back pain.

      Strengths: The results were reproduced in three different groups at different studies/sites.

      Weaknesses:

      - The number of participants is still low.

      - An explanation of microstructure changes was not given.

      - Some technical drawbacks are presented.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      One important question is needed to further clarify the mechanisms of aberrant Ca2+ microwaves as described below.

      Synapsin promoter labels both excitatory pyramidal neurons and inhibitory neurons. To avoid aberrant Ca2+ microwave, a combination of Flex virus and CaMKII-Cre or Thy-1-GCaMP6s and 6f mice were tested. However, all these approaches limit the number of infected pyramidal neurons. While the comprehensive display of these results is appreciated, a crucial question remains unanswered. To distinguish whether the microwave of Ca2+ is caused selectively via the abnormality of interneurons, or just a matter of pyramidal neuron density, testing Flex-GCaMP6 in interneuron specific mouse lines such as PV-Cre and SOM-Cre will be critical.

      We agree that unravelling the role of interneurons is important to the understanding of the cellular mechanisms. However, the primary goal of this preprint was to alert the field and those embarking on in vivo Ca2+ imaging to AAV transduction induced artefacts mediated by one of the most widely used viral constructs for Ca2+ imaging in the field. It was important to us to distribute this finding among the community in a timely manner to avoid the unnecessary waste of resources.

      We consider a thorough understanding of cell-type specific mechanisms interesting. However, the biological relevance of the Ca2+ waves is as yet unclear, and disentangling exactly which cellular and subcellular factors drive the aberrant phenomenon will require a large systematic effort that goes beyond our resources. In particular, it will not be technically trivial to separate biologically relevant contributions from technical differences. For instance, the absence of Ca2+ waves under the principal neuron promoter CaMKII may suggest the involvement of interneurons. However, alternative possibilities are a reduced density of expression across principal neurons or a difference in expression levels between the 2 promoters.

      The important take-home message of the preprint, in our opinion, is that users should carefully check their viral protocols, adjust the protocols for their specific scientific question, and report any issues. We now emphasise the fact that although Ca2+ waves were not observed following conditional expression of syn.GCaMP with CaMKII.cre, this may not be due to a requirement for interneuronal expression but simply reflect differences in final GCaMP expression density and levels between the two transduction procedures (P12, L298-303).

      Reviewer #2 (Public Review):


      Whether micro-waves are associated with the age of mice was not quantified. This would be good to know and the authors do have this data.

      We plotted the animal age at the time of injection for all injections of Syn.GCaMP6 into CA1/CA3 and found no correlation of age with either the occurrence or the frequency of Ca2+ waves over the age range of 5-79 wks (see reviewer Fig1; linear regression fit to the Ca2+ wave frequency against age was not significant: intercept = 1.37, slope = -0.007, p=0.62, n = 14; and generalized linear model relating Ca2+ wave ~ age was not significant: z score = 0.19, deviance above null = 0.04, p = 0.85, n=24). We have now added a statement on this to the revised manuscript (P14 L354-359) and for the reviewers we have added the plots below.

      Author response image 1.

      Plot of Ca2+ micro-wave frequency (left: number of Ca2+ waves/min) or occurrence (right: yes/no) against the animal age at the time of viral injection. Blue line is linear (left) or logistic (right) fit to the data with 95% confidence level.

      The effect of micro-waves on single cell function was not analyzed. It would be useful, for example, if we knew the influence of micro-waves on place fields. Can a place cell still express a place field in a hippocampus that produces micro-waves? What effect might a microwave passing over a cell have on its place field? Mice were not trained in these experiments, so the authors do not have the data.

      We agree that these are interesting questions; however, the preprint is focused on describing the GECI expression conditions prone to generating these artefacts. Studying the effects of Ca2+ micro-waves on the circuitry is a separate scientific question and would require an experimental framework for testing the effect of the aberrant activity on a specific physiological function, e.g. place activity or specific oscillations (e.g. sharp-wave activity). Ca2+ micro-waves such as the ones described here have not been reported under physiological or pathophysiological conditions, and studying the effects of such artefactual waves on the circuit was not our intention.

      With respect to place cell activity, specifically, it is intuitive that during the Ca2+ micro-wave the participating cell’s place field activity would be obscured by the artefactual activity. Cell activity appears to return immediately following the wave suggesting that the cells could exhibit place activity outside their participation in the Ca2+ micro-waves. However, we do not know if the Ca2+ micro-wave activity disrupts the generation or maintenance of place fields. We have now added a brief reference to possible effects on place coding to the paper (P12, L315-317).

      The CaMKII-Cre approach for flexed-syn-GCaMP expression shows no micro-waves and is convincing, but it is only from 2 animals, even though both had no micro-waves.

      In light of the reviewer’s comment, we have added a further 3 animals with conditional expression of GCaMP6m from the DZNE to complement the current dataset with conditional expression of GCaMP6s from UoB (P10, L236 & 239 and revised table 1). Although Ca2+ waves were not observed in any of the 5 animals in total, we still do not know with certainty whether this approach is completely safe. Time will tell whether researchers still encounter the phenotype under certain conditions when using this conditional approach.

      The authors state in their Discussion that even without observable microwaves, a syn-Ca2+-indicator transduction strategy could still be problematic. This may be true, but they do not check this in their analysis, so it remains unknown.

      We agree with the reviewer and have now made this point clearer in the revised discussion (P11, L257-258).

      Reviewer #3 (Public Review):


      I believe that the weaknesses of the manuscript are appropriately highlighted by the authors themselves in the discussion. I would, however, like to emphasize several additional points.

      As the authors state, the exact conditions that lead to Ca2+ micro-waves are unclear from this manuscript. It is also unclear if Ca2+ micro-waves are specific to GECI expression or if high-titer viral transduction of other proteins such as genetically encoded voltage indicators, static fluorescent proteins, recombinases, etc could also cause Ca2+ micro-waves.

      The high expression of other proteins has been shown to result in artefactual phenomena such as toxicity or fluorescent puncta (for GFP see Hechler et al. 2006; Katayama et al. 2008; for GEVI see Rühl et al. 2021), but we are not aware of reports of micro-waves. Although it is certainly possible that high expression levels of other proteins could lead to waves, we suspect the Ca2+ micro-waves observed in this preprint result from a dysregulation of Ca2+ homeostasis. This is not to suggest that voltage indicators could not result in micro-waves (e.g. Ca2+ homeostasis may be indirectly affected).

      The authors almost exclusively tested high titer (>5x10^12 vg/mL) large volume (500-1000 nL) injections using the synapsin promoter and AAV1 serotypes. It is possible that Ca2+ micro-waves are dramatically less frequent when titers are lowered further but still kept high enough to be useful for in vivo imaging (e.g. 1x10^12 vg/mL) or smaller injection volumes are used. It is also possible that Ca2+ micro-waves occur with high titer injections using other viral promoter sequences such as EF1α or CaMKIIα. There may additionally be effects of viral serotype on micro-wave occurrence.

      We agree with all points raised by the reviewer. Notably, we used viral transduction protocols with titers and volumes within the range of those previously used for viral transduction of GCaMP under the synapsin promoter (see P11 L269-275) and we observed Ca2+ micro-waves. As the reviewer suggested, we did find that lowering the titer is an important factor in reducing these Ca2+ micro-waves and there is likely a wide range of approaches that avoid the phenomenon. With regards to viral serotype, we show that micro-waves occurred across AAV1 and AAV9, but it is possible that other serotypes may avoid the phenomenon.

      We reiterate in the abstract of the revised manuscript that expression level is a crucial factor (P2, L40 and P2, L44-45) and now mention that other promoters and induction protocols that result in high Ca2+ indicator expression may result in Ca2+ micro-waves (P12, L291-294).

      The number of animals in any particular condition are fairly low (Table 1) with the exception of V1 imaging and thy1-GCaMP6 imaging. This prohibits rigorous comparison of the frequency of pathological calcium activity across conditions.

      We have now added 3 more animals with conditional GCaMP6 expression. In total, the study contains 34 animals with viral injection into the hippocampus from different laboratories and under different conditions resulting in multiple groups. As such we are cognizant of the resulting limitations for statistical evaluation.

      However, in light of the reviewer’s comment, we have now employed a generalized linear model tested on all the data to examine the relationship between the Ca2+ micro-wave incidence and the different factors. The multivariate GLM did find a significant relationship between Ca2+ micro-wave incidence and both viral dilution and weeks post injection (see below and revised manuscript P8, L189-193).

      For injections into CA1 in the hippocampus (n=28), a GLM found no relationship between Ca2+ micro-waves and each of the individual variables x (Ca-wave ~ x): viral dilution: z score = 1.14, deviance above null = 1.31, p = 0.254; post injection weeks: z score = 1.18, deviance above null = 1.44, p = 0.239; injection volume: z score = -0.76, deviance above null = 0.59, p = 0.45; construct: z score = 1.18, difference in deviance above null = 1.44, p = 0.239.

      However, a multivariable logistic GLM relating dilution and post injection weeks (Ca-wave ~ dilution + p.i_wks) showed that together both variables were significantly related to Ca2+ micro-waves (Deviance above null = 7.5; Dilution: z score = 2.18, p < 0.05; p.i_wks: z score = 2.22, p < 0.05).
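
      For readers unfamiliar with the "deviance above null" statistic quoted above, the following is a minimal sketch of how it can be computed for a single-predictor logistic model (Ca-wave ~ x). It is an illustration only: the data are hypothetical, the Newton-Raphson fit is a plain stdlib stand-in for whatever statistics package was actually used, and the variable names are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, iterations=25):
    """Fit y ~ x by Newton-Raphson; returns (intercept, slope)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def deviance(y, probs):
    """Binomial deviance: -2 times the log-likelihood for 0/1 outcomes."""
    return -2.0 * sum(
        yi * math.log(p) + (1 - yi) * math.log(1.0 - p) for yi, p in zip(y, probs)
    )


# Hypothetical data: weeks post injection and whether micro-waves were seen (1/0).
weeks = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
wave = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(weeks, wave)
null_p = sum(wave) / len(wave)  # intercept-only model predicts the overall rate
null_dev = deviance(wave, [null_p] * len(wave))
resid_dev = deviance(wave, [sigmoid(b0 + b1 * w) for w in weeks])
deviance_above_null = null_dev - resid_dev
print(round(deviance_above_null, 3))
```

      The difference between the null and residual deviance measures how much the predictor improves the fit over an intercept-only model; a multivariable model extends the same idea to several predictors.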

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      Results are straightforward and convincing. While a couple of ways to reduce the aberrant microwaves of calcium responses were demonstrated, delving into the functions of interneurons is crucial for a more comprehensive understanding of cellular causality.

      As mentioned in the public response, disentangling cellular mechanism from technical requirements will need a large and systematic study. To determine the contribution from interneurons, the use of specific interneuron promoters would be required, and viral titers systematically varied to result in similar cellular GCaMP expression levels as seen under the synapsin promoter condition.

      Reviewer #2 (Recommendations For The Authors):

      Do the authors think the cells are firing when they participate in a micro-wave, or do they think the calcium influx is due to something else? A discussion point on this would be good.

      This is an excellent point raised by the reviewer. We do not know if the elevated cellular Ca2+ during the artifactual Ca2+ micro-wave reflects action potential firing or an increase of Ca2+ from intracellular stores. As already described in the text of the preprint, their optical spatiotemporal profile neither fits with known microseizure progression patterns, nor with spreading depolarization/depression. We have adopted the reviewer’s suggestion and added the following point to the discussion section in the revised preprint (P12, L308-315):

      In a limited dataset, we attempted to detect the Ca2+ micro-waves by hippocampal LFP recordings (using a conventional insulated Tungsten wire, diameter ~110µm). We could not identify a specific signature, e.g. ictal activity or LFP depression, which may correspond to these Ca2+ micro-waves. The crucial shortcoming of this experiment, of course, is that with these LFP recordings we could not simultaneously perform hippocampal 2-photon microscopy. Thus, it is uncertain if the Ca2+ micro-waves indeed occurred in proximity to our electrode.

      The results seem to suggest that micro-waves may involve interneurons as their CaMKII-Cre strategy avoids waves - possibly due to a lack of expression of GECIs in interneurons. It would be great to hear the author's thoughts on this and add a brief discussion point.

      As mentioned in the public response to Reviewer 1, it is difficult to disentangle cellular mechanisms from technical requirements, and the exact requirements for the Ca2+ micro-waves to occur are still not fully clear. The absence of Ca2+ micro-waves in our CaMKII-Cre dataset may indeed reflect the requirement of interneurons. However, it could just as well be due to a sparse labelling of principal cells or simply reflect differences in the expression levels of GCaMP under the different promoters.

      All in all, a more complete understanding of the requirements of such Ca2+ micro-waves will require a community effort. Therefore, it is important that each group check the safety profile of their GECI and report problems to the community.

      We have added these points to the revised preprint (P12, L291 and P12, L298).

      Plotting the incidence of micro-waves as a function of the age of mice would be a nice addition (the authors have the data).

      There was no relationship of Ca2+ micro-wave occurrence or frequency with age over the range of 5-79 wks (see public response) and this has been added to the preprint (P14, L354).

      Reviewer #3 (Recommendations For The Authors):

      I appreciate the authors raising the awareness of this issue. I had personally observed micro-waves in my own data as well. In agreement with their findings, I found that the occurrence of micro-waves was dramatically lower when I reduced the viral titer. Anecdotally, I also observed voltage micro-waves when virally transducing genetically encoded voltage indicators at similar titers. For that reason, I am skeptical that this issue is exclusive to GECIs.

      We find it interesting that the reviewer has also seen artefactual micro-waves following viral transduction of genetically encoded voltage indicators. Without seeing the voltage waves the referee is referring to or the conditions, it is of course difficult to compare with the Ca2+ micro-waves we report. However, this comment again raises the question of mechanism. We believe that in the GECI framework, Ca2+ homeostatic aspects are important. Voltage indicators are based on different sensor mechanisms, and expressed in the cell membrane, but it may very well be that there are overlapping factors between Ca2+ and voltage indicators that could trigger a similar, or even the same phenomenon in the end.

      Minor comments:

      (1) Line 131-132: I believe the authors only tested for micro-waves in V1. This should be made clear in the results. It could be that micro-waves could occur in other parts of cortex with the same viral titers.

      Both V1 and somatosensory cortex were tested as described in the methods (P15, L395-397); we have made this clearer in the revised preprint (P6, L138).

      (2) There are no statistics associated with the data from Fig 1e.

      We have now added statistics (P5, L126).

      (3) The authors may be able to make a stronger claim about the pathological nature of the micro-waves if there are differences in the histology between the injected and non-injected hemispheres. For example, is there evidence of widespread cell death in the injected hemisphere (e.g. lower cell count, smaller hippocampal volume, caspase staining, etc).

      We found no evidence of gross morphological changes to the hippocampus following viral transduction with no changes in CA1 pyramidal cell layer thickness or CA1 thickness (pyramidal cell layer thickness: 49 ± 12.5 µm ipsilateral and 50.3 ± 11.1 µm contralateral, n=4, Student’s t-test p=0.89; CA1 thickness: 553.3 ± 14 µm ipsilateral and 555.8 ± 62 µm contralateral, n = 4, Student’s t-test p=0.94; 48 ± 13 weeks post injection at time of perfusion).

      We have added this to the preprint (P5, L117-122).

      (4) The broader micro-waves in the stratum oriens versus the stratum pyramidale are likely due to the spread of the basal dendrites of pyramidal cells. If the typical size of the basal dendritic arbor of CA1 pyramidal neurons is taken into account, does this explain the wider calcium waves in this layer?

      Absolutely, great point; we completely agree. It is likely that the active neuropil (including the dendritic arbour) is contributing to the apparent broader diameter. In addition, as evident in video 5, cell somata in the stratum oriens (possibly interneurons) are active and their processes also contribute.

      We have now mentioned these points in the revised preprint (P5, L132).

      (5) Lines 179-181: Is the difference in the prevalence of micro-waves between viral titers statistically significant?

      Although we have a large number of animals in total (n=34) with viral injection into the hippocampus, the number of animals in each condition, given the many factors, is low. We therefore used a generalized linear model to test the relationship between the Ca2+ micro-waves and the variables.

      We have now added this analysis to the revised preprint (P8, L189-193).

      (6) Lines 200-203: The CA3 micro-waves were only observed at one institution. The current wording is slightly misleading.

      We agree and have changed this to be clearer (P9 L216).

    2. Reviewer #4 (Public Review):


      Masala N et al showed interesting aberrant calcium microwaves in the hippocampus when synapsin promoter driven GCaMPs were expressed for a long period of time. These aberrant hippocampal Ca2+ micro-waves depend on the viral titre of the GECI. The microwave of Ca2+ was not observed when GECI was expressed only in a sparse set of neurons.


      These findings are important to the wider neuroscience community, especially considering that a great number of investigators are using similar approaches. Results look convincing and are consistent across several laboratories.


      Synapsin promoter labels both excitatory pyramidal neurons and inhibitory neurons. To avoid aberrant Ca2+ microwave, a combination of Flex virus and CaMKII-Cre or Thy-1-GCaMP6s and 6f mice were tested. However, all these approaches limit the number of infected pyramidal neurons. While the comprehensive display of these results is appreciated, one additional important test would be more informative. To distinguish whether the microwave of Ca2+ is sufficiently caused via the expression of GCaMP in interneurons, or just a matter of pyramidal neuron density, testing Flex-GCaMP6 in interneuron specific mouse lines such as PV-Cre and SOM-Cre will provide further clarifications.

    3. eLife assessment

      This important study provides convincing evidence of artifactual calcium micro-waves during calcium imaging of populations of neurons in the hippocampus using methods that are common in the field. The work raises awareness of these artifacts so that any research labs planning to do calcium imaging in the hippocampus can avoid them by using alternative strategies that the authors propose.

    4. Reviewer #2 (Public Review):


      The authors describe and quantify a phenomenon in the CA1 and CA3 of the hippocampus that they call aberrant Ca2+ micro-waves. Micro-waves are sometimes seen during 2-photon calcium imaging of populations of neurons under certain conditions. They are spatially confined slow calcium events that start in a few cells and slowly spread to neighboring groups of cells. This phenomenon has been discussed among researchers in the field at conferences, but no one has taken the time to carefully capture and quantify micro-waves and pin down the causes. The authors show that micro-waves are dependent on the viral titre of the genetically encoded calcium indicators (GECIs), the genetic promoter (synapsin), the neuronal subtype (granule cells in the dentate gyrus do not produce micro-waves and they are not seen in the neocortex), and the density of GECI expression. The authors should be commended for their work and for raising awareness among all labs doing any form of calcium imaging in populations of neurons. The authors also come up with alternative approaches to avoid artifactual micro-waves, such as reducing the transduction titre (1:2 dilution of virus) and a transduction method employing sparser and Cre-dependent GECI expression in principal cells using a CaMKII promoter.


      The micro-waves reported in the paper were robustly observed across 4 laboratories and 3 different countries with various experimenters and calcium imaging set-ups. This adds significant strength to the work.

      The age of mice used covered a broad range (from 6 to 43 weeks). This is a strength because it covers most ages that are used in labs that regularly do calcium imaging.

      Another strength is they used different GCaMP variants (GCaMP6m, GCaMP6s, GCaMP7f), as well as a red indicator: RCaMP. This shows the micro-waves are not an issue with any particular GECI, as the authors suggest.

      The authors include many movies of micro-waves. This is extremely useful for researchers in the field to view them in real-time so they can identify them in their own data.

      They provide a useful table with specific details of the virus injected, titre, dilution, and other information along with the incidence of micro-waves. This is a nice look-up table for researchers to see if their viral strategy is associated with a high or low incidence of micro-waves.


      The effect of micro-waves on single-cell function was not analyzed. It would be useful, for example, if we knew the influence of micro-waves on place fields. Can a place cell still express a place field in a hippocampus that produces micro-waves? What effect might a micro-wave passing over a cell have on its place field? Mice were not trained in these experiments, so the authors do not have the data. However, they do briefly discuss these ideas.

    5. Reviewer #3 (Public Review):


      The work by Masala and colleagues highlights a striking artifact that can result from a particular viral method for expressing genetically encoded calcium indicators (GECIs) in neurons. In a cross-institutional collaboration, the authors find that viral transduction of GECIs in the hippocampus can result in aberrant slow-traveling calcium (Ca2+) micro-waves. These Ca2+ micro-waves are distinct from previously described ictal activity but nevertheless are likely a pathological consequence of overexpression of virally transduced proteins. Ca2+ micro-waves will most likely obscure the physiology that most researchers are interested in studying with GECIs, and their presence indicates that the neural circuit is in an unintended pathological state. Interestingly, this pathology was not observed using the same viral transduction methods in other brain regions. The authors recommend several approaches that may help other experimenters avoid this confound in their own data, such as reducing the titer of viral injections or using recombinase-dependent expression. The intent of this manuscript is to raise awareness of the potential unintended consequences of viral overexpression, particularly for GECIs. A rigorous investigation into the exact causes of Ca2+ micro-waves or the mechanisms supporting them is beyond the authors' intended scope.


      The authors clearly demonstrate that Ca2+ micro-waves occur in the CA1 and CA3 regions of the hippocampus following large volume, high titer injections of adeno-associated viruses (AAV1 and AAV9) encoding GECIs. The supplementary videos provide undeniable proof of their existence.

      By forming an inter-institutional collaboration, the authors demonstrate that this phenomenon is robust to changes in surgical techniques or imaging conditions.


      I believe that the weaknesses of the manuscript are appropriately highlighted by the authors themselves in the discussion. The manuscript does not attempt to exhaustively characterize the conditions under which calcium micro-waves occur. Rather, the authors raise awareness of this problem.

    1. Author Response

      eLife assessment

      The authors used electrophysiology in brain slices and computer modeling and suggest that layer 2/3 pyramidal neurons of the mouse cortex express functional HCN channels, despite little evidence in the past that they are present. The study is useful at the present time, but results are incomplete because the methods, data, and analyses do not always support the conclusions.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript by Olah et al. uses in vitro electrophysiology and compartmental modeling (via NEURON) to investigate the expression and function of HCN channels in mouse L2/3 pyramidal neurons. The authors conclude that L2/3 neurons have developmentally regulated HCN channels, the activation of which can be observed when subjected to large hyperpolarizations. They further conclude via blockade experiments that HCN channels in L2/3 neurons influence cellular excitability and pathway-specific EPSP kinetics, which can be neuromodulated. While the authors perform a wide range of slice physiology experiments, concrete evidence that L2/3 cells express functionally relevant HCN channels is limited. There are serious experimental design caveats and confounds that make drawing strong conclusions from the data difficult. Furthermore, the significance of the findings is generally unclear, given modest effect sizes and a lack of any functional relevance, either directly via in vivo experiments or indirectly via strong HCN-mediated changes in known operations/computations/functions of L2/3 neurons.

      Specific points:

      (1) The interpretability and impact of this manuscript are limited due to numerous methodological issues in experimental design, data collection, and analysis. The authors have not followed best practices in the field, and as such, much of the data is ambiguous and/or weak and does not support their interpretations (detailed below). Additionally, the authors fail to appropriately explain their rationale for many of their choices, making it difficult to understand why they did what they did. Furthermore, many important references appear to be missing, both in terms of contextualizing the work and in terms of approach/method. For example, the authors do not cite Kalmbach et al 2018, which performed a directly comparable set of experiments on HCN channels in L2/3 neurons of both humans and mice. This is an unacceptable omission. Additionally, the authors fail to cite prior literature regarding the specificity or lack thereof of Cs+ in blocking HCN. In describing a result, the authors state "In line with previous reports, we found that L2/3 PCs exhibited an unremarkable amount of sag at 'typical' current commands" but they then fail to cite the previous reports.

      We thank the reviewer for the thorough examination of our manuscript; however, we strongly disagree with many of the raised concerns for several reasons, as detailed in an initial response below:

      To address the lack of certain citations, we would like to emphasize that in the introduction section, we did focus on a decades-long line of investigation into the HCN channel content of layer 2/3 pyramidal cells (L2/3 PCs), where there has undoubtedly been some controversy as to their functional contribution. We did not explicitly cite papers that claimed to find little or no HCN channels or sag, although this would be a significant list of publications from some excellent senior investigators, as we wanted to avoid shining a negative light on otherwise excellent publications. However, we plan to address this more clearly in the upcoming revision.

      Just to take an example: in the publication mentioned by the reviewer (Kalmbach et al. 2018), the investigators did not carry out voltage clamp recordings. Furthermore, the input resistance values reported in that paper were far above other reports in mice (Routh et al. 2022, Brandalise et al. 2022, Hedrick et al. 2012; which were similar to our findings here), suggesting that the recordings in Kalmbach et al. were carried out at membrane potentials where HCN activation is less available (Routh, Brager and Johnston 2022).

      Another reason for some mixed findings in the field is undoubtedly the small or nonexistent sag in L2/3 current clamp recordings in mice. We also found a small sag, which we have shown to be explained by the following: the ‘sag’ potential is a biphasic voltage response emerging from a relatively fast passive membrane response and a slower Ih activation. In L2/3 PCs, hyperpolarization-activated currents are apparently faster than previously described and are located proximally (our findings here). Therefore, their recruitment in mouse L2/3 PCs is on a similar timescale as the passive membrane response, resulting in a more monophasic response. Again, we plan to include a full set of citations in the updated introduction section, to highlight the importance of HCN channels in L2/3 PCs in mice and other species. The justification for using cesium (i.e., ‘best practices’) is detailed in the next paragraph.

      (2) A critical experimental concern in the manuscript is the reliance on cesium, a nonspecific blocker, to evaluate HCN channel function. Cesium blocks HCN channels but also acts at potassium channels (and possibly other channels as well). The authors do not acknowledge this or attempt to justify their use of Cs+ and do not cite prior work on this subject. They do not show control experiments demonstrating that the application of Cs+ in their preparation only affects Ih. Additionally, the authors write 1 mM cesium in the text but appear to use 2 mM in the figures. In later experiments, the authors switch to ZD7288, a more commonly used and generally accepted more specific blocker of HCN channels. However, they use a very high concentration, which is also known to produce off-target effects (see Chevaleyre and Castillo, 2002). To make robust conclusions, the authors should have used both blockers (at accepted/conservative concentrations) for all (or at least most) experiments. Using one blocker for some experiments and then another for different experiments is fraught with potential confounds.

      To address the concerns regarding the use of cesium to block HCN channels, we would like to state that neither cesium nor ZD-7288 is without off-target effects; however, in our case the potential off-target effects of external cesium were deemed less impactful, especially concerning AP firing output experiments. Extracellular cesium has been widely accepted as a blocker of HCN channels (Lau et al. 2010, Wickenden et al. 2009, Rateau and Ropert 2005, Hemond et al. 2009, Yang et al. 2015, Matt et al. 2010). It is known to act on potassium channels as well, although this has mostly been demonstrated with intracellular application (Puil et al. 1981, Fleidervish et al. 2008, Williams et al. 1991, 2008). Nevertheless, we acknowledge the off-target effects and will better cite the appropriate literature in the revised manuscript.

      Although we performed internal control experiments during the recordings, these were not included in the manuscript, which we plan to correct in the revision. These are detailed as follows: during our recordings, cesium had no significant effect on action potential halfwidth, ruling out substantial blocking of potassium channels, nor did it affect any other aspects of suprathreshold activity. Furthermore, we observed similar effects on passive properties (resting membrane potential, input resistance) following ZD-7288 as with cesium, which we will also update in our figures. We did acknowledge that ZD-7288 is a widely accepted blocker of HCN, and for this reason we carried out some of our experiments using this pharmacological agent instead of cesium. However, these experiments were always supported by complementary findings using external cesium. For example, the effect of ZD-7288 on EPSPs was confirmed by similar synaptic stimulation experiments using cesium. This is important, as synaptic inputs of L2/3 PCs are modulated by both dendritic sodium (Ferrarese et al. 2018) and calcium channels (Landau 2022), so the effect of ZD-7288 alone may have been difficult to interpret in isolation.

      On the other hand, ZD-7288 suffers from its own side effects, such as a substantial effect on sodium channels (Wu et al. 2012) and calcium channels (Sánchez-Alonso et al. 2008, Felix et al. 2003). As our aim was to provide functional evidence for the importance of HCN channels, we deemed these effects unacceptable in experiments where AP firing output (e.g., in cell-attached experiments) was measured.

      (3) A stronger case could be made that HCN is expressed in the somatic compartment of L2/3 cells if the authors had directly measured HCN-isolated currents with outside-out or nucleated patch recording (with appropriate leak subtraction and pharmacology). Whole-cell voltage-clamp in neurons with axons and/or dendrites does not work. It has been shown to produce erroneous results over and over again in the field due to well-known space clamp problems (see Rall, Spruston, Williams, etc.). The authors could have also included negative controls, such as recordings in neurons that do not express HCN or in HCN-knockout animals. Without these experiments, the authors draw a false equivalency between the effects of cesium and HCN channels, when the outcomes they describe could be driven simply by multiple other cesium-sensitive currents. Distortions are common in these preparations when attempting to study channels (see Williams and Womzy, J Neuro, 2011). In Fig 2h, cesium-sensitive currents look too large and fast to be from HCN currents alone given what the authors have shown in their earlier current clamp data. Furthermore, serious errors in leak subtraction appear to be visible in Supplementary Figure 1c. To claim that these conductances are solely from HCN may be misleading.

      We disagree with the argument that “Whole-cell voltage-clamp in neurons with axons and/or dendrites does not work”. Although this method is not without its confounds (i.e. space clamp), it is still a useful initial measure, as demonstrated countless times in the literature. However, the reviewer is correct that the best approach to establish the somatodendritic distribution of ion channels is by direct somatic and dendritic outside-out patches. Due to the small diameter of L2/3 PC dendrites, to our knowledge these experiments have not yet been carried out for any other ion channel either. Mapping this distribution may be outside the scope of the current manuscript, but it was hard for us to ignore the sheer size of the Cs+-sensitive hyperpolarization-activated currents in whole-cell recordings. Thus, we will opt to report these data.

      Also, we should point out that space clamp-related errors manifest in the overestimation of frequency-dependent features, such as activation kinetics, and the underestimation of steady-state current amplitudes. The activation time constant of our measured currents is somewhat faster than previously reported, reducing major concerns regarding space clamp errors. Furthermore, we simply do not understand what “too large… to be from HCN currents” means. We would like to ask the reviewer to point out what the “serious errors in leak subtraction” are, as the measured currents are similar in shape and correction artifacts to previously reported HCN currents (Meng et al. 2011, Li 2011, Zhao et al. 2019, Yu et al. 2004, Zhang et al. 2008, Spinelli et al. 2018, Craven et al. 2006, Ying et al. 2012, Biel et al. 2009).

      Furthermore, we would be grateful if the reviewer would mention the other possible ion channels that are activated at hyperpolarized voltages, have the same voltage dependence as HCN currents, do not show inactivation, influence both input resistance and resting membrane potential, and are blocked by low concentration extracellular cesium.

      (4) The authors present current-clamp traces with some sag, a primary indicator of HCN conductance, in Figure 2. However, they do not show example traces with cesium or ZD7288 blockade. Additionally, the normalization of current injected by cellular capacitance and the lack of reporting of input resistance or estimated cellular size makes it difficult to determine how much current is actually needed to observe the sag, which is important for assessing the functional relevance of these channels. The sag ratio in controls also varies significantly without explanation (Figure 6 vs Figure 7). Could this variability be a result of genetically defined subgroups within L2/3? For example, in humans, HCN expression in L2/3 varies from superficial and deep neurons. The authors do not make an effort to investigate this. Regardless of inconsistencies in either current injection or cell type, the sag ratio appears to be rather modest and similar to what has already been reported previously in other papers.

      We thank the reviewer for pointing out that our explanation for the modest sag ratio might not have been sufficient to properly understand why this measurement cannot be applied to layer 2/3 pyramidal cells. We will clarify this in the Results section. Briefly: the sag potential emerges from a relatively (compared to Ih) fast passive membrane response and a slower HCN recruitment. The opposing polarity and different timescales of these two mechanisms result in a biphasic response called the “sag” potential. However, if the timescales of these two mechanisms are similar, the voltage response is not predicted to be biphasic. We have shown that hyperpolarization-activated currents in our preparations are fast and proximal, therefore they are recruited during the passive response (see Figure 2g). This means that although a substantial amount of HCN current is activated during hyperpolarization, its activation will not result in substantial sag. Therefore, the sag ratio measurement is not necessarily applicable to approximate the HCN content of L2/3 PCs. We would like to emphasize that sag ratio measurements are valid for other cell types, and our aim is not to discredit the method, but rather to show that it cannot be applied to mouse L2/3 PCs.
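The timescale argument above can be illustrated with a minimal single-compartment sketch containing only a leak conductance and an Ih-like conductance. This is not the model used in the manuscript: the function name `sag_ratio` and every parameter value (capacitance, conductances, reversal potentials, half-activation voltage, slope, step size) are hypothetical and chosen only so that the passive time constant is 10 ms. The sketch compares an Ih activation time constant much slower than the membrane (200 ms) with one on a similar timescale (5 ms).

```python
import math

def sag_ratio(tau_h_ms, i_step_na=-0.1, dt_ms=0.05):
    """Sag ratio of a leak + Ih single compartment for a hyperpolarizing step.

    Units: mV, ms, nF, uS, nA. All parameter values are illustrative assumptions.
    """
    c_m = 0.1                    # nF (100 pF)
    g_leak, e_leak = 0.01, -80.0 # uS, mV -> passive time constant 10 ms
    g_h, e_h = 0.01, -30.0       # uS, mV
    v_half, slope = -90.0, 8.0   # mV; Ih activates with hyperpolarization

    def m_inf(v):
        return 1.0 / (1.0 + math.exp((v - v_half) / slope))

    def run(v, m, i_inj_na, duration_ms):
        """Forward-Euler integration; returns final v, final m, and minimum v."""
        v_min = v
        for _ in range(round(duration_ms / dt_ms)):
            i_ion = g_leak * (v - e_leak) + g_h * m * (v - e_h)
            v += dt_ms * (i_inj_na - i_ion) / c_m
            m += dt_ms * (m_inf(v) - m) / tau_h_ms
            v_min = min(v_min, v)
        return v, m, v_min

    # Settle to rest without injected current, then apply the step.
    v_rest, m_rest, _ = run(-75.0, m_inf(-75.0), 0.0, 2000.0)
    v_end, _, v_peak = run(v_rest, m_rest, i_step_na, 1000.0)
    return (v_end - v_peak) / (v_rest - v_peak)

print("slow Ih (tau_h = 200 ms): sag ratio =", round(sag_ratio(200.0), 3))
print("fast Ih (tau_h = 5 ms):   sag ratio =", round(sag_ratio(5.0), 3))
```

Under these assumptions the same maximal Ih conductance produces a pronounced sag when its activation is slow and a much smaller one when its activation is about as fast as the membrane charging, which is the qualitative point made in the response.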

      Our own measurements, similar to others in the literature, show that L2/3 PCs exhibit modest sag ratios; however, this does not mean that HCN is not relevant. Ih activation in L2/3 PCs does not manifest in a large sag potential but rather in a continuous distortion of steady-state responses (Figure 2b). The reviewer is correct that L2/3 PCs are not homogeneous, therefore we sampled along the entire L2/3 axis. This yielded some variability in our results (i.e., passive properties); yet we did not observe any cells where hyperpolarization-activated/Cs+-sensitive currents could not be resolved. As structural variability of L2/3 cells does result in variability in cellular capacitance, we compensated for this variability by injecting cellular capacitance-normalized currents. Our measured cellular capacitances were in accordance with previously published values, in the range of 50-120 pF. Therefore, the injected currents were not outside frequently used values. Together, we would like to state that whether substantial sag potential is present or not, initial estimates of the HCN content for each L2/3 PC should be treated with caution.
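To make the capacitance normalization concrete, the short sketch below converts a current density into an absolute injected current for the two ends of the reported 50-120 pF capacitance range. The current density of -2 pA/pF and the function name `injected_current_pa` are hypothetical and used only for illustration; the actual densities are those given in the manuscript's methods.

```python
def injected_current_pa(density_pa_per_pf, capacitance_pf):
    """Absolute current (pA) for a capacitance-normalized command (pA/pF)."""
    return density_pa_per_pf * capacitance_pf

density = -2.0  # pA/pF, hypothetical value for illustration only
for c_m in (50.0, 120.0):  # pF, ends of the capacitance range quoted above
    print(f"C_m = {c_m:.0f} pF -> I = {injected_current_pa(density, c_m):.0f} pA")
```

With this normalization, a smaller cell receives proportionally less absolute current than a larger one for the same nominal command.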

      (5) In the later experiments with ZD7288, the authors measured EPSP half-width at greater distances from the soma. However, they use minimal stimulation to evoke EPSPs at increasingly far distances from the soma. Without controlling for amplitude, the authors cannot easily distinguish between attenuation and spread from dendritic filtering and additional activation and spread from HCN blockade. At a minimum, the authors should share the variability of EPSP amplitude versus the change in EPSP half-width and/or stimulation amplitudes by distance. In general, this kind of experiment yields much clearer results if a more precise local activation of synapses is used, such as dendritic current injection, glutamate uncaging, sucrose puff, or glutamate iontophoresis. There are recording quality concerns here as well: the cell pictured in Figure 3a does not have visible dendritic spines, and a substantial amount of membrane is visible in the recording pipette. These concerns also apply to the similar developmental experiment in 6f-h, where EPSP amplitude is not controlled, and therefore, attenuation and spread by distance cannot be effectively measured. The outcome, that L2/3 cells have dendritic properties that violate cable theory, seems implausible and is more likely a result of variable amplitude by proximity.

      To resolve this issue, we will make a supplementary figure showing elicited amplitudes, which showed no significant distance dependence and minimal variability. We thank the reviewer for suggesting an amplitude-halfwidth comparison control. To address the issue of the non-visible spines, we would like to note that these images are of lower magnification. The presence of dendritic spines was confirmed in every recorded pyramidal cell observed using 2P microscopy.

      We would like to emphasize that although our recordings “seemingly” violated cable theory, this is only true if we assume a completely passive condition. As shown in our manuscript, cable theory was not violated, as the presence of NMDA receptor boosting explained the observed ‘non-Rallian’ phenomenon. We plan to clarify this in the fully revised manuscript.

      (6) Minimal stimulation used for experiments in Figures 3d-i and Figures 4g-h does not resolve the half-width measurement's sensitivity to dendritic filtering, nor does cesium blockade preclude only HCN channel involvement. Example traces should be shown for all conditions in 3h; the example traces shown here do not appear to even be from the same cell. These experiments should be paired (with and without cesium/ZD). The same problem appears in Figure 4, where it is not clear that the authors performed controls and drug conditions on the same cells. 4g also lacks a scale bar, so readers cannot determine how much these measurements are affected by filtering and evoked amplitude variability. Finally, if we are to believe that minimal stimulation is used to evoke responses of single axons with 50% fail rates, NMDA receptor activation should be minimal to begin with. If the authors wish to make this claim, they need to do more precise activation of NMDA-mediated EPSPs and examine the effects of ZD7288 on these responses in the same cell. As the data is presented, it is not possible to draw the conclusion that HCN boosts NMDA-mediated responses in L2/3 neurons.

      As stated in the figure legends, the control and drug application traces are from the same cell, both in Figure 3 and Figure 4, and the scale bar is not included as the amplitudes were normalized for clarity. We have addressed the effects of dendritic filtering above in answer (5), and cesium blockade above in answer (2). To reiterate, dendritic filtering alone cannot explain our observations, and cesium is often a better choice for blocking HCN channels compared to ZD-7288, which blocks sodium channels as well. When an excitatory synaptic signal arrives onto a pyramidal cell in typical conditions, neurotransmitter-sensitive receptors transmit a synaptic current to the dendritic spine. This dendritic spine is electrically isolated by the high resistance of the spine neck, and due to the small membrane surface of the spine, the synaptic current elicits remarkably large voltage changes. These voltage changes can be large enough to depolarize the spine close to zero millivolts upon even single small inputs (Jayant et al. 2016). Therefore, to state that single inputs arriving at dendritic spines cannot be large enough to recruit NMDA receptor activation is incorrect. This is further exemplified by the substantial literature showing ‘miniature’ NMDA recruitment via stochastic vesicle release alone.

      (7) The quality of recordings included in the dataset has concerning variability: for example, resting membrane potentials vary by >15-20 mV and the AP threshold varies by 20 mV in controls. This is indicative of either a very wide range of genetically distinct cell types that the authors are ignoring or the inclusion of cells that are either unhealthy or have bad seals.

      Although we are aware of the diversity of L2/3 PCs, resolving further layer depth differences is outside the scope of our current manuscript. However, as shown in Kalmbach et al., resting membrane potential can vary greatly (>15-20 mV) in L2/3 PCs depending on distance from the pia. We acknowledge that the variance in AP threshold is large and could be due to genetically distinct cell types. Therefore, we plan to present AP peak/width information in the revision, which showed significantly smaller variability, thereby validating our recording conditions.

      (8) The authors make no mention of blocking GABAergic signaling, so it must be assumed that it is intact for all experiments. Electrical stimulation can therefore evoke a mixture of excitatory and inhibitory responses, which may well synapse at very different locations, adding to interpretability and variability concerns.

      We thank the reviewer for pointing out our lack of detail regarding the GABAergic signaling blocker SR 95531. We did include this drug in our recordings of signal summation, so GABAergic responses did not contaminate our recordings. We plan to clarify this in the revision.

      (9) The investigation of serotonergic interaction with HCN channels produces modest effect sizes and suffers the same problems as described above.

      We do not agree with the reviewer that a 50% drop in neuronal AP firing responses (Figure 7b) is a modest effect size. Thus, we plan to keep these data in the manuscript.

      (10) The computational modeling is not well described and is not biologically plausible. Persistent and transient K channels are missing. Values for other parameters are not listed. The model does not seem to follow cable theory, which, as described above, is not only implausible but is also not supported by the experimental findings.

      The model was downloaded from the Cell Type Database from the Allen Institute, with only minor modifications including the addition of dendritic HCN channels and NMDA receptors, which were varied along a wide parameter space to find a ‘best fit’ to our observations. These additions were necessary to recapitulate our experimental findings. We agree the model likely does not fully recapitulate all aspects of the dendrites, which, as we hope to convey in this manuscript, are not fully resolved in mouse L2/3 PCs. This is a published neuronal model and, despite its potential shortcomings, is one among a handful of open-source neuronal models of fully reconstructed L2/3 PCs. We are open to suggestions for improvement.

      Reviewer #2 (Public Review):


      This paper by Olah et al. uncovers a previously unknown role of HCN channels in shaping synaptic inputs to L2/3 cortical neurons. The authors demonstrate using slice electrophysiology and computational modeling that, unlike layer 5 pyramidal neurons, L2/3 neurons have an enrichment of HCN channels in the proximal dendrites. This location provides a locus of neuromodulation for inputs onto the proximal dendrites from L4 without an influence on distal inputs from L1. The authors use pharmacology to demonstrate the effect of HCN channels on NMDA-mediated synaptic inputs from L4. The authors further demonstrate the developmental time course of HCN function in L2/3 pyramidal neurons. Taken together, this is a well-constructed investigation of HCN channel function and the consequences of these channels on synaptic integration in L2/3 pyramidal neurons.


      The authors use careful, well-constrained experiments using multiple pharmacological agents to assess HCN channel contributions to synaptic integration. The authors also use a voltage clamp to directly measure the current through HCN channels across developmental ages. The authors also provide supplemental data showing that their observation is consistent across multiple areas of the cerebral cortex.


      The gradient of the HCN channel function is based almost exclusively on changes in EPSP width measured at the soma. While providing strong evidence for the presence of HCN current in L2/3 neurons, there are space clamp issues related to the use of somatic whole-cell voltage clamps that should be considered in the discussion.

      We thank the reviewer for pointing out our careful and well-constrained experiments and for making suggestions. The potential effects of space clamp errors will be detailed in the discussion section (see extended explanations under Reviewer 1).

      Reviewer #3 (Public Review):


      The authors study the function of HCN channels in L2/3 pyramidal neurons, employing somatic whole-cell recordings in acute slices of visual cortex in adult mice and a bevy of technically challenging techniques. Their primary claim is a non-uniform HCN distribution across the dendritic arbor with a greater density closer to the soma (roughly opposite of the gradient found in L5 PT-type neurons). The second major claim is that multiple sources of long-range excitatory input (cortical and thalamic) are differentially affected by the HCN distribution. They further describe an interesting interplay of NMDAR and HCN, serotonergic modulation of HCN, and compare HCN-related properties at 1, 2 and 6 weeks of age. Several results are supported by biophysical simulations.


      The authors collected data from both male and female mice, at an age (6-10 weeks) that permits comparison with in vivo studies, in sufficient numbers for each condition, and they collected a good number of data points for almost all figure panels. This is all the more positive, considering the demanding nature of multi-electrode recording configurations and pipette-perfusion. The main strength of the study is the question and focus.


      Unfortunately, in its present form, the main claims are not adequately supported by the experimental evidence: primarily because the evidence is indirect and circumstantial, but also because multiple unusual experimental choices (along with poor presentation of results) undermine the reader's confidence. Additionally, the authors overstate the novelty of certain results and fail to cite important related publications. Some of these weaknesses can be addressed by improved analysis and statistics, resolving inconsistent data across figures, reorganizing/improving figure panels, more complete methods, improved citations, and proofreading. In particular, given the emphasis on EPSPs, the primary data (for example EPSPs, overlaid conditions) should be shown much more.

      However, on the experimental side, addressing the reviewer's concerns would require a very substantial additional effort: direct measurement of HCN density at different points in the dendritic arbor and soma; the internal solution chosen here (K-gluconate) is reported to inhibit HCN; bath-applied cesium at the concentrations used blocks multiple potassium channels, i.e. is not selective for HCN (the fact that the more selective blocker ZD7288 was used in a subset of experiments makes the choice of Cs+ as the primary blocker all the more curious); pathway-specific synaptic stimulation, for example via optogenetic activation of specific long-range inputs, to complement / support / verify the layer-specific electrical stimulation.

      We thank the reviewer for their very careful examination of our manuscript and helpful suggestions. We will address the concerns raised in the review and present substantially more raw traces in our figures. Although direct dendritic HCN mapping measurements are likely outside the scope of the current manuscript due to the morphological constraints presented by L2/3 PCs (which explains why no other full dendritic nonlinearity distribution has been described in L2/3 PCs with this method), we will nonetheless supplement our manuscript with additional suggested experiments. For example, we plan to include the excellent suggestion of pathway-specific optogenetic stimulation to further validate the disparate effect of HCN channels for distal and proximal inputs. We will also include control measurements using different internal solutions. We agree that ZD-7288 is a widely accepted blocker of HCN channels. However, the off-target effects on sodium channels may have significantly confounded our measurements of AP output using extracellular stimulation. Therefore, we chose cesium as the primary blocker for those experiments, but did validate several other Cs+-based results with ZD-7288. These controls will also be presented more clearly in a new supplementary figure.

    2. eLife assessment

      The authors used electrophysiology in brain slices and computer modeling and suggest that layer 2/3 pyramidal neurons of the mouse cortex express functional HCN channels, despite little evidence in the past that they are present. The study is useful at the present time, but results are incomplete because the methods, data, and analyses do not always support the conclusions.

    3. Reviewer #1 (Public Review):

      The manuscript by Olah et al. uses in vitro electrophysiology and compartmental modeling (via NEURON) to investigate the expression and function of HCN channels in mouse L2/3 pyramidal neurons. The authors conclude that L2/3 neurons have developmentally regulated HCN channels, the activation of which can be observed when subjected to large hyperpolarizations. They further conclude via blockade experiments that HCN channels in L2/3 neurons influence cellular excitability and pathway-specific EPSP kinetics, which can be neuromodulated. While the authors perform a wide range of slice physiology experiments, concrete evidence that L2/3 cells express functionally relevant HCN channels is limited. There are serious experimental design caveats and confounds that make drawing strong conclusions from the data difficult. Furthermore, the significance of the findings is generally unclear, given modest effect sizes and a lack of any functional relevance, either directly via in vivo experiments or indirectly via strong HCN-mediated changes in known operations/computations/functions of L2/3 neurons.

      Specific points:

      (1) The interpretability and impact of this manuscript are limited due to numerous methodological issues in experimental design, data collection, and analysis. The authors have not followed best practices in the field, and as such, much of the data is ambiguous and/or weak and does not support their interpretations (detailed below). Additionally, the authors fail to appropriately explain their rationale for many of their choices, making it difficult to understand why they did what they did. Furthermore, many important references appear to be missing, both in terms of contextualizing the work and in terms of approach/method. For example, the authors do not cite Kalmbach et al 2018, which performed a directly comparable set of experiments on HCN channels in L2/3 neurons of both humans and mice. This is an unacceptable omission. Additionally, the authors fail to cite prior literature regarding the specificity or lack thereof of Cs+ in blocking HCN. In describing a result, the authors state "In line with previous reports, we found that L2/3 PCs exhibited an unremarkable amount of sag at 'typical' current commands" but they then fail to cite the previous reports.

      (2) A critical experimental concern in the manuscript is the reliance on cesium, a nonspecific blocker, to evaluate HCN channel function. Cesium blocks HCN channels but also acts at potassium channels (and possibly other channels as well). The authors do not acknowledge this or attempt to justify their use of Cs+ and do not cite prior work on this subject. They do not show control experiments demonstrating that the application of Cs+ in their preparation only affects Ih. Additionally, the authors write 1 mM cesium in the text but appear to use 2 mM in the figures. In later experiments, the authors switch to ZD7288, a more commonly used and generally accepted more specific blocker of HCN channels. However, they use a very high concentration, which is also known to produce off-target effects (see Chevaleyre and Castillo, 2002). To make robust conclusions, the authors should have used both blockers (at accepted/conservative concentrations) for all (or at least most) experiments. Using one blocker for some experiments and then another for different experiments is fraught with potential confounds.

      (3) A stronger case could be made that HCN is expressed in the somatic compartment of L2/3 cells if the authors had directly measured HCN-isolated currents with outside-out or nucleated patch recording (with appropriate leak subtraction and pharmacology). Whole-cell voltage-clamp in neurons with axons and/or dendrites does not work. It has been shown to produce erroneous results over and over again in the field due to well-known space clamp problems (see Rall, Spruston, Williams, etc.). The authors could have also included negative controls, such as recordings in neurons that do not express HCN or in HCN-knockout animals. Without these experiments, the authors draw a false equivalency between the effects of cesium and HCN channels, when the outcomes they describe could be driven simply by multiple other cesium-sensitive currents. Distortions are common in these preparations when attempting to study channels (see Williams and Wozny, J Neuro, 2011). In Fig 2h, cesium-sensitive currents look too large and fast to be from HCN currents alone given what the authors have shown in their earlier current clamp data. Furthermore, serious errors in leak subtraction appear to be visible in Supplementary Figure 1c. To claim that these conductances are solely from HCN may be misleading.

      (4) The authors present current-clamp traces with some sag, a primary indicator of HCN conductance, in Figure 2. However, they do not show example traces with cesium or ZD7288 blockade. Additionally, the normalization of current injected by cellular capacitance and the lack of reporting of input resistance or estimated cellular size makes it difficult to determine how much current is actually needed to observe the sag, which is important for assessing the functional relevance of these channels. The sag ratio in controls also varies significantly without explanation (Figure 6 vs Figure 7). Could this variability be a result of genetically defined subgroups within L2/3? For example, in humans, HCN expression in L2/3 varies from superficial and deep neurons. The authors do not make an effort to investigate this. Regardless of inconsistencies in either current injection or cell type, the sag ratio appears to be rather modest and similar to what has already been reported previously in other papers.

      (5) In the later experiments with ZD7288, the authors measured EPSP half-width at greater distances from the soma. However, they use minimal stimulation to evoke EPSPs at increasingly far distances from the soma. Without controlling for amplitude, the authors cannot easily distinguish between attenuation and spread from dendritic filtering and additional activation and spread from HCN blockade. At a minimum, the authors should share the variability of EPSP amplitude versus the change in EPSP half-width and/or stimulation amplitudes by distance. In general, this kind of experiment yields much clearer results if a more precise local activation of synapses is used, such as dendritic current injection, glutamate uncaging, sucrose puff, or glutamate iontophoresis. There are recording quality concerns here as well: the cell pictured in Figure 3a does not have visible dendritic spines, and a substantial amount of membrane is visible in the recording pipette. These concerns also apply to the similar developmental experiment in 6f-h, where EPSP amplitude is not controlled, and therefore, attenuation and spread by distance cannot be effectively measured. The outcome, that L2/3 cells have dendritic properties that violate cable theory, seems implausible and is more likely a result of variable amplitude by proximity.

      (6) Minimal stimulation used for experiments in Figures 3d-i and Figures 4g-h does not resolve the half-width measurement's sensitivity to dendritic filtering, nor does cesium blockade preclude only HCN channel involvement. Example traces should be shown for all conditions in 3h; the example traces shown here do not appear to even be from the same cell. These experiments should be paired (with and without cesium/ZD). The same problem appears in Figure 4, where it is not clear that the authors performed controls and drug conditions on the same cells. 4g also lacks a scale bar, so readers cannot determine how much these measurements are affected by filtering and evoked amplitude variability. Finally, if we are to believe that minimal stimulation is used to evoke responses of single axons with 50% fail rates, NMDA receptor activation should be minimal to begin with. If the authors wish to make this claim, they need to do more precise activation of NMDA-mediated EPSPs and examine the effects of ZD7288 on these responses in the same cell. As the data is presented, it is not possible to draw the conclusion that HCN boosts NMDA-mediated responses in L2/3 neurons.

      (7) The quality of recordings included in the dataset has concerning variability: for example, resting membrane potentials vary by >15-20 mV and the AP threshold varies by 20 mV in controls. This is indicative of either a very wide range of genetically distinct cell types that the authors are ignoring or the inclusion of cells that are either unhealthy or have bad seals.

      (8) The authors make no mention of blocking GABAergic signaling, so it must be assumed that it is intact for all experiments. Electrical stimulation can therefore evoke a mixture of excitatory and inhibitory responses, which may well synapse at very different locations, adding to interpretability and variability concerns.

      (9) The investigation of serotonergic interaction with HCN channels produces modest effect sizes and suffers the same problems as described above.

      (10) The computational modeling is not well described and is not biologically plausible. Persistent and transient K channels are missing. Values for other parameters are not listed. The model does not seem to follow cable theory, which, as described above, is not only implausible but is also not supported by the experimental findings.

      Taken together, there are serious methodological and analytical concerns that need to be addressed before the authors' claims can be supported. Combined with the small effect sizes and high data variability throughout the paper, this makes it hard to see how the manuscript could make a strong contribution to advancing our understanding of L2/3 cortical pyramidal neuron function.

    4. Reviewer #2 (Public Review):


      This paper by Olah et al. uncovers a previously unknown role of HCN channels in shaping synaptic inputs to L2/3 cortical neurons. The authors demonstrate using slice electrophysiology and computational modeling that, unlike layer 5 pyramidal neurons, L2/3 neurons have an enrichment of HCN channels in the proximal dendrites. This location provides a locus of neuromodulation for inputs onto the proximal dendrites from L4 without an influence on distal inputs from L1. The authors use pharmacology to demonstrate the effect of HCN channels on NMDA-mediated synaptic inputs from L4. The authors further demonstrate the developmental time course of HCN function in L2/3 pyramidal neurons. Taken together, this is a well-constructed investigation of HCN channel function and the consequences of these channels on synaptic integration in L2/3 pyramidal neurons.


      The authors use careful, well-constrained experiments using multiple pharmacological agents to assess HCN channel contributions to synaptic integration. The authors also use a voltage clamp to directly measure the current through HCN channels across developmental ages. The authors also provide supplemental data showing that their observation is consistent across multiple areas of the cerebral cortex.


      The gradient of the HCN channel function is based almost exclusively on changes in EPSP width measured at the soma. While providing strong evidence for the presence of HCN current in L2/3 neurons, there are space clamp issues related to the use of somatic whole-cell voltage clamps that should be considered in the discussion.

    5. Reviewer #3 (Public Review):


      The authors study the function of HCN channels in L2/3 pyramidal neurons, employing somatic whole-cell recordings in acute slices of visual cortex in adult mice and a bevy of technically challenging techniques. Their primary claim is a non-uniform HCN distribution across the dendritic arbor with a greater density closer to the soma (roughly opposite of the gradient found in L5 PT-type neurons). The second major claim is that multiple sources of long-range excitatory input (cortical and thalamic) are differentially affected by the HCN distribution. They further describe an interesting interplay of NMDAR and HCN, serotonergic modulation of HCN, and compare HCN-related properties at 1, 2 and 6 weeks of age. Several results are supported by biophysical simulations.


      The authors collected data from both male and female mice, at an age (6-10 weeks) that permits comparison with in vivo studies, in sufficient numbers for each condition, and they collected a good number of data points for almost all figure panels. This is all the more positive, considering the demanding nature of multi-electrode recording configurations and pipette-perfusion. The main strength of the study is the question and focus.


      Unfortunately, in its present form, the main claims are not adequately supported by the experimental evidence: primarily because the evidence is indirect and circumstantial, but also because multiple unusual experimental choices (along with poor presentation of results) undermine the reader's confidence. Additionally, the authors overstate the novelty of certain results and fail to cite important related publications. Some of these weaknesses can be addressed by improved analysis and statistics, resolving inconsistent data across figures, reorganizing/improving figure panels, more complete methods, improved citations, and proofreading. In particular, given the emphasis on EPSPs, the primary data (for example EPSPs, overlaid conditions) should be shown much more.

      However, on the experimental side, addressing the reviewer's concerns would require a very substantial additional effort: direct measurement of HCN density at different points in the dendritic arbor and soma; the internal solution chosen here (K-gluconate) is reported to inhibit HCN; bath-applied cesium at the concentrations used blocks multiple potassium channels, i.e. is not selective for HCN (the fact that the more selective blocker ZD7288 was used in a subset of experiments makes the choice of Cs+ as the primary blocker all the more curious); pathway-specific synaptic stimulation, for example via optogenetic activation of specific long-range inputs, to complement / support / verify the layer-specific electrical stimulation.

    1. Author Response

      We thank all the reviewers for their comments and insight. We plan to address the comments and recommendations in the revised version of the manuscript. Provisional responses to key points are given below.

      Reviewer #1 (Public Review):


      In this manuscript, Chowdhury and co-workers provide interesting data to support the role of G4-structures in promoting chromatin looping and long-range DNA interactions. The authors achieve this by artificially inserting a G4-containing sequence in an isolated region of the genome using CRISPR-Cas9 and comparing it to a control sequence that does not contain G4 structures. Based on the data provided, the authors can conclude that G4-insertion promotes long-range interactions (measured by Hi-C) and affects gene expression (measured by qPCR) as well as chromatin remodelling (measured by ChIP of specific histone markers).

      Whilst the data presented is promising and partially supports the authors' conclusion, this reviewer feels that some key controls are missing to fully support the narrative. Specifically, validation of actual G4-formation in chromatin by ChIP-qPCR (at least) is essential to support the association between G4-formation and looping. Moreover, this study is limited to a genomic location and an individual G4-sequence used, so the findings reported cannot yet be considered to reflect a general mechanism/effect of G4-formation in chromatin looping.


      This is the first attempt to connect genomics datasets of G4s and HiC with gene expression. The use of Cas9 to artificially insert a G4 is also very elegant.


      Lack of controls, especially to validate G4-formation after insertion with Cas9. The work is limited to a single G4-sequence and a single G4-site, which limits the generalisation of the findings.

      In an earlier study, we reported intracellular G4 formation in the hTERT promoter region in human cells (Sharma et al., Cell Reports, 2021). Exactly the same stretch of DNA was taken for insertion here. This is mentioned in the current manuscript as: “The array of G4-forming sequences used for insertion was previously reported to form stable G4s in human cells.” under the paragraph titled “Insertion of an array of G4s in an isolated locus” in the Results section. As the reviewer points out, we understand that intracellular G4 formation needs to be confirmed upon insertion at the non-native location. These experiments/results will be included in the revised version.

      To directly address the second point, we are attempting insertion of the same G4-sequence at another locus. If the insertion is successful, experiments/results on how it affects chromatin organization and nearby gene expression will be included in the revised manuscript.

      Reviewer #2 (Public Review):


      Roy et al. investigated the role of non-canonical DNA structures called G-quadruplexes (G4s) in long-range chromatin interactions and gene regulation. Introducing a G4 array into chromatin significantly increased the number of long-range interactions, both within the same chromosome (cis) and between different chromosomes (trans). G4s functioned as enhancer elements, recruiting p300 and boosting gene expression even 5 megabases away. The study proposes a mechanism where G4s directly influence 3D chromatin organization, facilitating communication between regulatory elements and genes.


      The findings are valuable for understanding the role of G4-DNA in 3D genome organization and gene transcription.


      The study would benefit from more robust and comprehensive data, which would add depth and clarity.

      (1) Lack of G4 Structure Confirmation: The absence of direct evidence for G4 formation within cells undermines the study's foundation. Relying solely on in vitro data and successful gene insertion is insufficient.

      As pointed out in response to the above comment, direct evidence of G4 formation by the stretch of DNA was published by us earlier (Sharma et al., Cell Reports, 2021). We understand here it is important to check/confirm this at the insertion site. These experiments are being initiated.

      (2) Alternative Explanations: The study does not sufficiently address alternative explanations for the observed results. The inserted sequences may not form G4s or other factors like G4-RNA hybrids may be involved.

      G4 formation at the insertion site will be checked to confirm. It has been reported that G4 structures associate with R-loops to strengthen CTCF binding and enhance chromatin looping (Wulfridge et al., 2023). This can be discussed further for readers.

      (3) Limited Data Depth and Clarity: ChIP-qPCR offers limited scope and considerable variation in some data makes conclusions difficult.

      We have noted variation with one of the primers in a few ChIP-qPCR experiments (in Figures 2 and 3D). The change, however, was statistically significant and consistent with the overall trend across experiments (Figures 2, 3 and 4). Enhancer function, in addition to ChIP, was confirmed using other assays like 3C and RNA expression.

      (4) Statistical Significance and Interpretation: The study could be more careful in evaluating the statistical significance and magnitude of the effects to avoid overinterpreting the results.

      As pointed out, the manuscript will be revised to ensure we are not overinterpreting any results.

      Reviewer #3 (Public Review):


      This paper aims to demonstrate the role of G-quadruplex DNA structures in the establishment of chromosome loops. The authors introduced an array of G4s spanning 275 bp, naturally found within a very well-characterized region of the hTERT promoter, in an ectopic region devoid of G-quadruplexes and annotated genes. As a negative control, they used a mutant version of the same sequence in which G4 folding is impaired. Due to the complexity of the region, 3 G4s on the same strand and one on the opposite strand, 12 point mutations were made simultaneously (G to T and C to A). Analysis of the 3D genome organization shows that the WT array establishes more contact within the TAD and throughout the genome than the control array. Additionally, a slight enrichment of H3K4me1 and p300, both enhancer markers, was observed locally near the insertion site. The authors tested whether the expression of genes located either nearby or up to 5 Mb away was up-regulated based on this observation. They found that four genes were up-regulated from 1.5 to 3-fold. An increased interaction between the G4 array compared to the mutant was confirmed by the 3C assay. For in-depth analysis of the long-range changes, they also performed Hi-C experiments and showed a genome-wide increase in interactions of the WT array versus the mutated form.


      The experiments were well-executed and the results indicate a statistical difference between the G4 array inserted cell line and the mutated modified cell line.


      The control non-G4 sequence contains 12 point mutations, making it difficult to draw clear conclusions. These mutations not only alter the formation of G4, but also affect at least three Sp1 binding sites that have been shown to be essential for the function of the hTERT promoter, from which the sequence is derived. The strong intermingling of G4 and Sp1 binding sites makes it impossible to determine whether all the observations made are dependent on G4 or Sp1 binding. As a control, the authors used Locked Nucleic Acid probes to prevent the formation of G4. As for mutations, these probes also interfere with two Sp1 binding sites. Therefore, using this alternative method has the same drawback as point mutations. This major issue should be discussed in the paper. It is also possible that other unidentified transcription factor binding sites are affected in the presented point mutants.

      Since the sequence we used to test the effects of G4 structure formation is highly G-rich, we had to introduce at least 12 mutations to be sure that a stable G4 structure would not form in the mutated control sequence. Sp1 has been reported to bind to G4 structures (Raiber et al., 2012). So, Sp1 binding could also be associated with the G4-dependent enhancer functions observed here. We also appreciate that apart from Sp1, other unidentified transcription factor binding sites might be affected by the mutations we introduced. We will discuss these possibilities in the revised version of the manuscript.

    2. Reviewer #2 (Public Review):


      Roy et al. investigated the role of non-canonical DNA structures called G-quadruplexes (G4s) in long-range chromatin interactions and gene regulation. Introducing a G4 array into chromatin significantly increased the number of long-range interactions, both within the same chromosome (cis) and between different chromosomes (trans). G4s functioned as enhancer elements, recruiting p300 and boosting gene expression even 5 megabases away. The study proposes a mechanism where G4s directly influence 3D chromatin organization, facilitating communication between regulatory elements and genes.


      The findings are valuable for understanding the role of G4-DNA in 3D genome organization and gene transcription.


      The study would benefit from more robust and comprehensive data, which would add depth and clarity.

      (1) Lack of G4 Structure Confirmation: The absence of direct evidence for G4 formation within cells undermines the study's foundation. Relying solely on in vitro data and successful gene insertion is insufficient.

      (2) Alternative Explanations: The study does not sufficiently address alternative explanations for the observed results. The inserted sequences may not form G4s or other factors like G4-RNA hybrids may be involved.

      (3) Limited Data Depth and Clarity: ChIP-qPCR offers limited scope and considerable variation in some data makes conclusions difficult.

      (4) Statistical Significance and Interpretation: The study could be more careful in evaluating the statistical significance and magnitude of the effects to avoid overinterpreting the results.

    3. Reviewer #3 (Public Review):


      This paper aims to demonstrate the role of G-quadruplex DNA structures in the establishment of chromosome loops. The authors introduced an array of G4s spanning 275 bp, naturally found within a very well-characterized region of the hTERT promoter, in an ectopic region devoid of G-quadruplexes and annotated genes. As a negative control, they used a mutant version of the same sequence in which G4 folding is impaired. Due to the complexity of the region, 3 G4s on the same strand and one on the opposite strand, 12 point mutations were made simultaneously (G to T and C to A). Analysis of the 3D genome organization shows that the WT array establishes more contact within the TAD and throughout the genome than the control array. Additionally, a slight enrichment of H3K4me1 and p300, both enhancer markers, was observed locally near the insertion site. The authors tested whether the expression of genes located either nearby or up to 5 Mb away was up-regulated based on this observation. They found that four genes were up-regulated from 1.5 to 3-fold. An increased interaction between the G4 array compared to the mutant was confirmed by the 3C assay. For in-depth analysis of the long-range changes, they also performed Hi-C experiments and showed a genome-wide increase in interactions of the WT array versus the mutated form.


      The experiments were well-executed and the results indicate a statistical difference between the G4 array inserted cell line and the mutated modified cell line.


      The control non-G4 sequence contains 12 point mutations, making it difficult to draw clear conclusions. These mutations not only alter the formation of G4, but also affect at least three Sp1 binding sites that have been shown to be essential for the function of the hTERT promoter, from which the sequence is derived. The strong intermingling of G4 and Sp1 binding sites makes it impossible to determine whether all the observations made are dependent on G4 or Sp1 binding. As a control, the authors used Locked Nucleic Acid probes to prevent the formation of G4. As for mutations, these probes also interfere with two Sp1 binding sites. Therefore, using this alternative method has the same drawback as point mutations. This major issue should be discussed in the paper. It is also possible that other unidentified transcription factor binding sites are affected in the presented point mutants.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      The authors establish a recombinant insect cell expression and purification scheme for the antiviral Dicer complex of C. elegans. In addition to Dicer-1, the complex harbors two additional proteins, the RIG-I-like helicase DRH-1, and the dsRNA-binding protein RDE-4. The authors show that the complex prefers blunt-end dsRNA over dsRNAs that contain overhangs. Furthermore, whereas ATP-dependent dsRNA cleavage only exacerbates regular dsRNA cleavage activity, the presence of RDE-4 is essential to ATP-dependent and ATP-independent dsRNA cleavage. Single-particle cryo-EM studies of the ternary C. elegans Dicer complex reveal that the N-terminal domain of DRH-1 interacts with the helicase domain of DCR-1, thereby relieving its autoinhibitory state. Lastly, the authors show that the ternary complex is able to processively cleave long dsRNA, an activity primarily relying on the helicase activity of DRH-1.


      First thorough biochemical characterization of the antiviral activity of C. elegans Dicer in complex with the RIG-I-like helicase DRH-1 and the dsRNA-binding protein RDE-4.

      Discovery that RDE-4 is essential to dsRNA processing, whereas ATP hydrolysis is not.

      Discovery of an autoinhibitory role of DRH-1's N-terminal domain (in analogy to the CARD domains of RIG-I).

      First structural insights into the ternary complex DCR-1:DRH-1:RDE-4 by cryo-EM to medium resolution.

      Trap experiments reveal that the ternary DCR-1 complex cleaves blunt-ended dsRNA processively. Likely, the helicase domain of DRH-1 is responsible for this processive cleavage.

      We thank the reviewer for this accurate and thoughtful summary of the strengths of our study. We note that although ATP hydrolysis is not essential for dsRNA processing, it is essential for promoting an alternative, and dramatically more efficient, cleavage mechanism that is well suited for processing viral dsRNA.


      Cryo-EM structure of the ternary Dicer-1:DRH-1:RDE-4 complex to only medium resolution.

      We agree with the reviewer that our structures are only of modest resolution. We continue to work towards a higher resolution structure of this conformationally heterogeneous complex. We do want to emphasize that despite our modest resolution, our structures provide novel insights into how the factors in the antiviral complex interact with each other, and also allow us to compare our findings to other Dicer systems. For example, the dsRNA binding protein RDE-4 binds the Hel2i subdomain, and this is similar to accessory dsRNA binding proteins of other Dicers, including human and Drosophila. Most importantly, for the first time, we uncover the interaction of DRH-1 with C. elegans Dicer; our structures show DRH-1's N-terminal domain interacting with Dicer's helicase domain. This observation spurred our experiments that showed the N-terminal domain of DRH-1, like the analogous domain of RIG-I, enables an autoinhibited conformation. While RIG-I autoinhibition is relieved by dsRNA binding, we do not observe this with C. elegans DRH-1 and speculate that instead it is the interaction with Dicer's helicase domain that relieves autoinhibition.

      High-resolution structure of the C-terminal domain of DRH-1 bound to dsRNA does not reveal the mechanism by which blunt-ended and overhang-containing dsRNAs are discriminated.

      The cryo-EM structure of DCR-1:DRH-1:RDE-4 in the presence of ATP only reveals the helicase and CTD domains of DRH-1 bound to dsRNA. No information on dsRNA termini recognition is presented. The paragraph seems detached from the general flow of the manuscript.

      We agree with the reviewer that our paper would be improved with a high-resolution structure of DRH-1 bound to the dsRNA terminus to better understand terminus discrimination. Since we did not obtain a high-resolution structure of DRH-1 bound to the dsRNA terminus, we could not comment on how DRH-1 discriminates termini. However, our structure of DRH-1’s helicase and CTD bound to the middle of the dsRNA does provide important evidence that DRH-1 translocates along dsRNA, which is crucial for our interpretation of DRH-1’s ATPase function in the antiviral complex. Furthermore, our analysis of the DRH-1:dsRNA contacts reveals just how well conserved DRH-1 is with mammalian RLRs.

      The antiviral DCR-1:DRH-1:RDE-4 complex shows activities and regulation largely homologous to those of Drosophila Dicer-2.

      It is unclear to us why this is a weakness. In our Discussion in the section “Relationship to previously characterized Dicer activities,” we compare and contrast the C. elegans antiviral complex and the most well characterized antiviral Dicer: Drosophila Dcr2. While it might not be surprising that two invertebrate activities that both must target viral dsRNA have similar enzymatic properties, we find this remarkable given that Dcr2 orchestrates cleavage with a single protein, while two helicases and a dsRNA binding protein cooperate in the C. elegans reaction. Our careful biochemical analyses reveal how the three proteins cooperate. In vivo, C. elegans Dicer must function to cleave pre-miRNAs, endogenous siRNAs as well as viral dsRNA, and we speculate that the use of diverse accessory factors allows C. elegans Dicer to carry out these distinct tasks.

      Reviewer #2 (Public Review):


      To investigate the evolutionary relationship between the RNAi pathway and innate immunity, this study uses biochemistry and structural biology to investigate the trimeric complex of Dicer1, DRH-1 (a RIG-I homologue), and RDE-4, which exists in C. elegans. The three subunits were co-expressed to promote stable purification of the complex. This complex promoted ATP-dependent cleavage of blunt-ended dsRNAs. A detailed kinetic analysis was also carried out to determine the role of each subunit of the trimeric complex in both the specificity and efficiency of cleavage. These studies indicate that RDE-4 is critical for cleavage while DCR-1 is primarily involved in the specificity of the reaction, and DRH-1 promotes ATP hydrolysis. Finally, a moderate density (6-7 angstrom) cryo-EM structure is presented with attempts to position each of the components.


      (1) Newly described methods for studying the C. elegans DICER complex.

      (2) New structure, albeit only moderate resolution.

      (3) Kinetic study of the complex in the presence and absence of individual subunits and mutations, provides detailed insight into the contribution of each subunit.


      (1) Limited insight due to limited structural resolution.

      (2) No attempts to extend findings to other Dicer or RLR systems.

      Overall, we agree with the assessment of this reviewer, and we thank them for their efforts in evaluating our manuscript. Whenever possible we have discussed the similarities and differences of the C. elegans Dicer to other Dicers and RLR systems. We are unsure how we could have expanded upon this further (as suggested in point 2).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor recommendations to the authors:

      Page 10: To assess the role of ATP hydrolysis for dsRNA binding, please refrain from using the term "fuzzy band" as a qualitative measure of RNA binding to the ternary complexes.

      We searched our entire manuscript and did not find the term “fuzzy band.” We did describe some of the bands in the gel shift assays as “diffuse.” This is an accurate description of the bands we see in our gels and distinguishes them from other more well-defined bands.

      Page 13: "positioned internally" - please explain "internally" better here.

      We agree with the reviewer that “positioned internally” is confusing. In our revised manuscript we have changed this sentence to (Page 13, line 1):

      “Under these conditions, we obtained a 2.9 Å reconstruction of the helicase and CTD domains of DRH-1 bound to the middle region of the dsRNA, rather than its terminus (Figures 4C and S9), suggesting that DRH-1 hydrolyzes ATP to translocate along dsRNA.”

      Page 13: Please re-consider the detailed description of the dsRNA:DRH-1 contacts.

      We feel it is very important to illustrate and describe these contacts, which will be of interest to those who study mammalian RLRs.

      Figure 1C/D: Please write "minus/+ ATP" on top of the gels to make this distinction more clearly visible.

      In our original manuscript the gels are labeled with “minus ATP” (panel C) or “5mM ATP” (panel D) on the left to indicate both gels in panel C and both gels in panel D have the same conditions. This is also stated in the figure legend. We have not made revisions in response to this comment because we think it is already clear.

      Figure 2: Please explain R = RDE-4 in a clearly visible legend.

      We agree with the reviewer that the illustration above the gels was not explained clearly. In our revised manuscript we have added the sentence below to the beginning of Figure legend 2A. “Cartoons indicate complexes and variants, with mutations in DCR-1 (green) and DRH-1 (blue) indicated with the amino acid change, and the presence of RDE-4 (R) represented with a purple circle.”

      Figure 4A: Please label the DRH-1 helicase domain and the C-terminal domain.

      We agree with the reviewer that we could more clearly define our labeled domains. In the revised manuscript we have added a sentence to the legend of Figure 4A: “The domains of DCR-1, DRH-1, and RDE-4 are color coded the same as in Fig 1A. For simplicity, only domains discussed in the text are labeled.”

      Reviewer #2 (Recommendations For The Authors):

      This study is complete in that all necessary controls and data are included and the authors are careful in their interpretation so as to not overstate the data or conclusions. The only suggestion is that further extension of the study to address the weaknesses above would increase the breadth of impact of this work.

      We thank the reviewer for their thoughtful comments. Weaknesses are addressed above in public reviews, and we will add again that we agree that a higher resolution structure would provide additional insight. In ongoing research, we are working towards this goal.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their careful comments. We sincerely agree with the comments from both reviewers, and noticed that the phrase “cell transplantation”, used throughout the manuscript including the title, was confusing. We revised the manuscript to clarify the aim of the study and to express the conclusion more straightforwardly.

      Response to the reviewers:

      We interpret the data of the present study to indicate that the color of each RPE cell is a temporal condition that does not necessarily represent the quality (e.g. for cell transplantation) of the cells. We consider that this may apply not only in vitro but also in vivo, although we do not know whether RPE shows heterogeneous levels of pigmentation in vivo.

      As our concern for iPSC-RPE is always about their quality for cell transplantation, we may not have fairly evaluated the scientific significance of the findings of the present study.

      We also noticed that, although we used the term “cell transplantation” to explain what we meant by the “quality” of the cells, this was confusing. The aim of the study was not to show how the pigmentation level of transplant-RPE affects the result of cell transplantation, but to show the heterogeneous gene expression of iPSC-derived RPE cells and the weak correlation of this heterogeneity with pigmentation level. We went through the manuscript, including the title, to lead more straightforwardly to this conclusion: the degree of pigmentation had a weak correlation with the expression levels of functional genes, and the correlation may be weak because the color is a temporal condition (as we interpreted from the data) that differs from more stable characteristics of the cells.

      We agree that “cell transplantation” in the title (and other parts) was misleading. We therefore changed the title and removed phrasing that implied the aim of the study was to show something about cell transplantation or in vivo results.

      Also, to give appropriate weight to the scientifically significant results of the present study, we expanded the discussion of the correlation between pigmentation level and some functional genes, and presented this as one of the conclusions of the manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Methods, please state the sex of the mice.

      This has now been added to the methods section:

      “Three to nine month old Thy1-GCaMP6S mice (Strain GP4.3, Jax Labs), N=16 stroke (average age: 5.4 months; 13 male, 3 female), and 5 sham (average age: 6 months; 3 male, 2 female), were used in this study.”

      (2) The analysis in Fig 3B-D, 4B-C, and 6A, B highlights the loss of limb function, firing rate, or connections at 1 week but this phenomenon is clearly persisting longer in some datasets (Fig. 3 and 6). Was there not a statistical difference at weeks 2,3,4,8 relative to "Pre-stroke" or were comparisons only made to equivalent time points in the sham group? Personally, I think it is useful to compare to "pre-stroke" which should be more reflective of that sample of animals than comparing to a different set of animals in the Sham group. A 1 sample t-test could be used in Fig 4 and 6 normalized data.

      On further analysis of our datasets, normalization throughout the manuscript was unnecessary for proper depiction of results, and all normalized datasets have been replaced with non-normalized datasets. All within-group statistics are now indicated within the manuscript.

      (3) Fig 4A shows a very striking change in activity that doesn't seem to be borne out with group comparisons. Since many neurons are quiet or show very little activity, did the authors ever consider subgrouping their analysis based on cells that show high activity levels (top 20 or 30% of cells) vs those that are inactive most of the time? Recent research has shown that the effects of stroke can have a disproportionate impact on these highly active cells versus the minimally active ones.

      A qualitative analysis supports a loss of cells with high activity at the 1-week post-stroke timepoint, and examination of average firing rates at 1-week shows reductions in the animals with the highest average rates. However, we have not tracked responses within individual neurons or quantitatively analyzed the data by subdividing cells into groups based on their pre-stroke activity levels. We have amended the discussion of the manuscript with the following to highlight the previous data as it relates to our study:

      “Recent research also indicates that stroke causes distinct patterns of disruption to the network topology of excitatory and inhibitory cells [73], and that stroke can disproportionately disrupt the function of high activity compared to low activity neurons in specific neuron sub-types [61]. Mouse models with genetically labelled neuronal sub-types (including different classes of inhibitory interneurons) could be used to track the function of those populations over time in awake animals.”

      (4) Fig 4 shows normalized firing rates when moving and at rest but it would be interesting to know what the true difference in activity was in these 2 states. My assumption is that stroke reduces movement therefore one normalizes the data. The authors could consider putting non-normalized data in a Supp figure, or at least provide a rationale for not showing this, such as stating that movement output was significantly suppressed, hence the need for normalization.

      On further analysis of our datasets, normalization throughout the manuscript was unnecessary for proper depiction of results, and all normalized datasets have been replaced with non-normalized datasets.

      (5) One thought for the discussion. The fact that the authors did not find any changes in "distant" cortex may be specific to the region they chose to sample (caudal FL cortex). It is possible that examining different "distant" regions could yield a different outcome. For example, one could argue that there may have been no reason for this area to "change" since it was responsive to FL stimuli before stroke. Further, since it was posterior to the stroke, thalamocortical projections should have been minimally disturbed.

      We would like to thank the reviewer for this comment. We have amended the discussion with the following:

      “Our results suggest a limited spatial distance over which the peri-infarct somatosensory cortex displays significant network functional deficits during movement and rest. Our results are consistent with a spatial gradient of plasticity mediating factors that are generally enhanced with closer proximity to the infarct core [84,88,90,91]. However, our analysis outside peri-infarct cortex is limited to a single distal area caudal to the pre-stroke cFL representation. Although somatosensory maps in the present study were defined by a statistical criterion for delineating highly responsive cortical regions from those with weak responses, the distal area in this study may have been a site of activity that did not meet the statistical criterion for inclusion in the baseline map. The lack of detectable changes in population correlations, functional connectivity, assembly architecture and assembly activations in the distal region may reflect minimal pressure for plastic change as networks in regions below the threshold for regional map inclusion prior to stroke may still be functional in the distal cortex. Thus, threshold-based assessment of remapping may further overestimate the neuroplasticity underlying functional reorganization suggested by anaesthetized preparations with strong stimulation. Future studies could examine distal areas medial and anterior to the cFL somatosensory area, such as the motor and pre-motor cortex, to further define the effect of FL targeted stroke on neuroplasticity within other functionally relevant regions. Moreover, the restriction of these network changes to peri-infarct cortex could also reflect the small penumbra associated with photothrombotic stroke, and future studies could make use of stroke models with larger penumbral regions, such as the middle cerebral artery occlusion model. Larger injuries induce more sustained sensorimotor impairment, and the relationship between neuronal firing, connectivity, and neuronal assemblies could be further probed relative to recovery or sustained impairment in these models.”

      Minor comments:

      Line 129, I don't necessarily think the infarct shows "hyper-fluorescence", it just absorbs less white light (or reflects more light) than blood-rich neighbouring regions.

      Sentence in the manuscript has been changed to:

      “Resulting infarcts lesioned this region, and borders could be defined by a region of decreased light absorption 1 week post-stroke (Fig 1D, Top).”

      Line 130-132: the authors refer to Fig 1D to show cellular changes but these cannot be seen from the images presented. Perhaps a supplementary zoomed-in image would be helpful.

      As changes to the morphology of neurons are not one of the primary objectives of this study, and sampled resolution was not sufficiently high to clearly delineate the processes of neurons necessary for morphological assessment, we have amended the text as follows:

      “Within the peri-infarct imaging region, cellular dysmorphia and swelling were visually apparent in some cells during two photon imaging 1-week after stroke, but recovered over the 2 month post-stroke imaging timeframe (data not shown). These gross morphological changes were not visually apparent in the more distal imaging region lateral to the cHL.”

      Lines 541-543, was there a rationale for defining movement as >30mm/s? Based on a statistical estimate of noise?

      Text has been altered as follows:

      “Animal movement within the homecage during each Ca2+ imaging session was tracked to determine animal speed and position. Movement periods were manually annotated on a subset of timeseries by co-recording animal movement using both the Mobile Homecage tracker, as well as a webcam (Logitech C270) with infrared filter removed. Movement tracking data was low-pass filtered to remove spurious movement artifacts lasting below 6 recording frames (240ms). Based on annotated times of animal movement from the webcam recordings and Homecage tracking, frames with speeds above a threshold of 30mm/s in the tracking data were classified as animal movement, whereas speeds below 30mm/s were taken as periods of rest.”
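      As an illustration only, the movement/rest classification described in the passage above could be sketched as follows. This is not the analysis code used in the study: the function name `classify_movement` is hypothetical, and the low-pass filter, whose exact form is not given, is replaced here by a simple run-length rule that discards supra-threshold runs shorter than 6 frames. How a speed of exactly 30 mm/s is handled is also an assumption.

```python
from itertools import groupby

SPEED_THRESHOLD_MM_S = 30.0  # movement threshold stated in the text
MIN_RUN_FRAMES = 6           # 6 frames = 240 ms, as stated in the text


def classify_movement(speeds_mm_s, threshold=SPEED_THRESHOLD_MM_S,
                      min_run=MIN_RUN_FRAMES):
    """Return one boolean per frame: True = movement, False = rest.

    Hypothetical helper. A run-length rule stands in for the low-pass
    filter described in the text (assumption).
    """
    # Strictly above threshold counts as movement; exactly 30 mm/s is
    # treated as rest (assumption, following the reviewer's ">30mm/s").
    moving = [speed > threshold for speed in speeds_mm_s]

    cleaned = []
    for is_moving, run in groupby(moving):
        run_length = len(list(run))
        # Supra-threshold runs shorter than min_run are treated as
        # spurious artifacts and relabelled as rest.
        keep = is_moving and run_length >= min_run
        cleaned.extend([keep] * run_length)
    return cleaned
```

      Applied to a per-frame speed trace, the function returns a mask of equal length that could be used to split firing-rate estimates into moving and resting periods.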

      Lines 191-195: Note that although the finding of reduced neural activity is in disagreement with a multi-unit recording study, it is consistent with other very recent single-cell Ca++ imaging data after stroke (PMID: 34172735, 34671051).

      Text has been altered as follows:

      “These results indicate decreased neuronal spiking 1-week after stroke in regions immediately adjacent to the infarct, but not in distal regions, that is strongly related to sensorimotor impairment. This finding runs contrary to a previous report of increased spontaneous multi-unit activity as early as 3-7 days after focal photothrombotic stroke in the peri-infarct cortex [1], but is in agreement with recent single-cell calcium imaging data demonstrating reduced sensory-evoked activity in neurons within the peri-infarct cortex after stroke [60,61].”

      Fig 7. I don't understand what the color code represents. Are these neurons belonging to the same assembly (or membership?).

      That is correct, neurons with identical color code belong to the same assembly. The legend of Fig 7 has been modified as follows to make this more explicit:

      “Fig 7. Color coded neural assembly plots depict altered neural assembly architecture after stroke in the peri-infarct region. (A) Representative cellular Ca2+ fluorescence images with neural assemblies color coded and overlaid for each timepoint. Neurons belonging to the same assembly have been pseudocolored with identical color. A loss in the number of neural assemblies after stroke in the peri-infarct region is visually apparent, along with a concurrent increase in the number of neurons for each remaining assembly. (B) Representative sham animal displays no visible change in the number of assemblies or number of neurons per assembly.”

      Reviewer #2 (Recommendations For The Authors):

      Materials and methods

      Identification of forelimb and hindlimb somatosensory cortex representations [...] Cortical response areas are calculated using a threshold of 95% peak activity within the trial. The threshold is presumably used to discriminate between the sensory-evoked response and collateral activation / less "relevant" response (noise). Since the peak intensity is lower after stroke, the "response" area is larger - lower main signal results in less noise exclusion. Predictably, areas that show a higher response before stroke than after are excluded from the response area before stroke and included after. While it is expected that the remapped areas will exhibit a lower response than the original and considering the absence of neuronal activity, assembly architecture, or functional connectivity in the "remapped" regions, a minimal criterion for remapping should be to exhibit higher activation than before stroke. Please use a different criterion to map the cortical response area after stroke.

      We would like to thank the reviewer for this comment. We agree with the reviewer’s assessment of 95% of peak as an arbitrary criterion of mapped areas. To exclude noise from the analysis of mapped regions, a new statistical criterion of 5X the standard deviation of the baseline period was used to determine the threshold to use to define each response map. These maps were used to determine the peak intensity of the forelimb response. We also measured a separate ROI specifically overlapping the distal region, lateral to the hindlimb map, to determine specific changes to widefield Ca2+ responses within this distal region. We have amended the text as follows and have altered Figure 2 with new data generated from our new criterion for cortical mapping.

      “The trials for each limb were averaged in ImageJ software (NIH). 10 imaging frames (1s) after stimulus onset were averaged and divided by the 10 baseline frames 1s before stimulus onset to generate a response map for each limb. Response maps were thresholded at 5 times the standard deviation of the baseline period deltaFoF to determine limb associated response maps. These were merged and overlaid on an image of surface vasculature to delineate the cFL and cHL somatosensory representations and were also used to determine peak Ca2+ response amplitude from the timeseries recordings. For cFL stimulation trials, an additional ROI was placed over the region lateral to the cHL representation (denoted as “distal region” in Fig 2E) to measure the distal region cFL evoked Ca2+ response amplitude pre- and post-stroke. The dimensions and position of the distal ROI were held consistent relative to surface vasculature for each animal from pre- to post-stroke.”
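      For readers who want to see the thresholding criterion concretely, a minimal per-pixel sketch is given below. It is illustrative only and is not the ImageJ workflow used in the study. The function names `pixel_response` and `response_map` are hypothetical, and we assume here that the standard deviation is taken over the deltaF/F values of the 10 baseline frames of each pixel; the passage above does not specify whether the baseline noise was estimated per pixel or over the whole field.

```python
from statistics import mean, pstdev

N_FRAMES = 10        # 10 frames = 1 s before and after stimulus onset (from the text)
SD_MULTIPLIER = 5.0  # threshold of 5x baseline standard deviation (from the text)


def pixel_response(trace, stim_onset, n_frames=N_FRAMES, k=SD_MULTIPLIER):
    """Return (dff_response, is_suprathreshold) for one trial-averaged pixel trace.

    Hypothetical helper; per-pixel noise estimation is an assumption.
    """
    if stim_onset < n_frames or stim_onset + n_frames > len(trace):
        raise ValueError("trace too short for the requested windows")

    baseline = trace[stim_onset - n_frames:stim_onset]
    evoked = trace[stim_onset:stim_onset + n_frames]

    f0 = mean(baseline)
    # deltaF/F of each baseline frame, used to estimate baseline noise.
    baseline_dff = [(f - f0) / f0 for f in baseline]
    # Mean evoked fluorescence divided by mean baseline, expressed as deltaF/F.
    dff_response = (mean(evoked) - f0) / f0

    threshold = k * pstdev(baseline_dff)
    return dff_response, dff_response > threshold


def response_map(traces, stim_onset):
    """Return the set of pixel ids whose evoked deltaF/F exceeds threshold.

    `traces` maps a pixel id to its trial-averaged fluorescence trace.
    """
    return {px for px, tr in traces.items()
            if pixel_response(tr, stim_onset)[1]}
```

      Because the threshold scales with baseline noise rather than with the peak response, a weaker post-stroke peak does not by itself enlarge the supra-threshold area, which is the concern raised about the 95%-of-peak criterion.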


      Mice used have an age that goes from 3 to 9 months. This is a big difference given that literature on healthy aging reports changes in neurovascular coupling starting from 8-9 months old mice. Consider adding age as a covariate in the analysis.

      We do not have sufficient numbers of animals within this study to examine the effect of age on the results observed herein. We have amended the discussion with the following to address this point:

      “A potential limitation of our data is the undefined effect of age and sex on cortical dynamics in this cohort of mice (with ages ranging from 3-9 months) after stroke. Aging can impair neurovascular coupling [102–107] and reduce ischemic tolerance [108–111], and greater investigation of cortical activity changes after stroke in aged animals would more effectively model stroke in humans. Future research could replicate this study with mice in middle-age and aged mice (e.g. 9 months and 18+ months of age), and with sufficient quantities of both sexes, to better examine age and sex effects on measures of cortical function.”


      Please describe the "normalization" that was applied to the firing rate. Since a mixed-effects model was used, why wasn't baseline simply added as a covariate? With this type of data, normalization is useful for visualization purposes.

      On further analysis of our datasets, normalization throughout the manuscript was unnecessary for the visualization of results, and all normalized datasets have been replaced with non-normalized datasets. All within-group comparisons are now indicated throughout the manuscript and in the figures.


      Line 93 awake, freely behaving but head-fixed. That's not freely. Should just say behaving.

      Sentence has been edited as follows:

      “We used awake, behaving but head-fixed mice in a mobile homecage to longitudinally measure cortical activity, then used computational methods to assess functional connectivity and neural assembly architecture at baseline and each week for 2 months following stroke.”

      110 - 112 The last part of this sentence is unjustified because these areas have been incorrectly identified as locations of representational remapping.

      We agree with the reviewer and have amended the manuscript as follows after re-analyzing the dataset on widefield Ca2+ imaging of sensory-evoked responses: “Surprisingly, we also show that significant alterations in neuronal activity (firing rate), functional connectivity, and neural assembly architecture are absent within more distal regions of cortex as little as 750 µm from the stroke border, even in areas identified by regional functional imaging (under anaesthesia) as ‘remapped’ locations of sensory-evoked FL activity 8-weeks post-stroke.”


      149-152 There is no observed increase in the evoked response area. There is an observed change in the criteria for what is considered a response.

      We agree with the reviewer. Text has been amended as follows:

      “Fig 2A shows representative montages from a stroke animal illustrating the cortical cFL and cHL Ca2+ responses to 1s, 100Hz limb stimulation of the contralateral limbs at the pre-stroke and 8-week post-stroke timepoints. The location and magnitude of the cortical responses change drastically between timepoints, with substantial loss of supra-threshold activity within the pre-stroke cFL representation located anterior to the cHL map, and an apparent shift of the remapped representation into regions lateral to the cHL representation at 8-weeks post-stroke. A significant decrease in the cFL evoked Ca2+ response amplitude was observed in the stroke group at 8-weeks post-stroke relative to pre-stroke (Fig 2B). This is in agreement with past studies [19–25], and suggests that cFL targeted stroke reduces forelimb evoked activity across the cFL somatosensory cortex in anaesthetized animals even after 2 months of recovery. There was no statistical change in the average size of cFL evoked representation 8-weeks after stroke (Fig 2C), but a significant posterior shift of the supra-threshold cFL map was detected (Fig 2D). Unmasking of previously sub-threshold cFL responsive cortex in areas posterior to the original cFL map at 8-weeks post-stroke could contribute to this apparent remapping. However, the amplitude of the cFL evoked widefield Ca2+ response in this distal region at 8-weeks post-stroke remains reduced relative to pre-stroke activation (Fig 2E). Previous studies suggest strong inhibition of cFL evoked activity during the first weeks after photothrombosis [25]. Without longitudinal measurement in this study to quantify this reduced activation prior to 8-weeks post-stroke, we cannot differentiate potential remapping due to unmasking of the cFL representation that enhances the cFL-evoked widefield Ca2+ response from apparent remapping that simply reflects changes in the signal-to-noise ratio used to define the functional representations. There were no group differences between stroke and sham groups in cHL evoked intensity, area, or map position (data not shown).”

      A lot of the nonsignificant results are reported as "statistical trends towards..." While the term "trend" is problematic, it remains common in its use. However, assigning directionality to the trend, as if it is actively approaching a main effect, should be avoided. The results aren't moving towards or away from significance. Consider rewording the way in which these results are reported.

      We have amended the text to remove directionality from our mention of statistical trends.

      R squared and p values for significant results are reported in the "impaired performance on tapered beam..." and "firing rate of neurons in the peri-infarct cortex..." subsections of the results, but not the other sections. Please report the results in a consistent manner.

      R-squared and p-values have been removed from the results section and are now reported in figure captions consistently.


      288 Remapping is defined as "new sensory-evoked spiking". This should be the main criterion for remapping, but it is not operationalized correctly by the threshold method.

      With our new criterion for determining limb maps using a statistical threshold of 5X the standard deviation of baseline fluorescence, we have edited text throughout the manuscript to better emphasize that we may not be measuring new sensory-evoked spiking with the mesoscale mapping that was done. We have edited the discussion as follows:

      “Here, we used longitudinal two photon calcium imaging of awake, head-fixed mice in a mobile homecage to examine how focal photothrombotic stroke to the forelimb sensorimotor cortex alters the activity and connectivity of neurons adjacent and distal to the infarct. Consistent with previous studies using intrinsic optical signal imaging, mesoscale imaging of regional calcium responses (reflecting bulk neuronal spiking in that region) showed that targeted stroke to the cFL somatosensory area disrupts the sensory-evoked forelimb representation in the infarcted region. Consistent with previous studies, this functional representation exhibited a posterior shift 8-weeks after injury, with activation in a region lateral to the cHL representation. Notably, sensory-evoked cFL representations exhibited reduced amplitudes of activity relative to pre-stroke activation measured in the cFL representation and in the region lateral to the cHL representation. Longitudinal two-photon calcium imaging in awake animals was used to probe single neuron and local network changes adjacent to the infarct and in a distal region that corresponded to the shifted region of cFL activation. This imaging revealed a decrease in firing rate at 1-week post-stroke in the peri-infarct region that was significantly negatively correlated with the number of errors made with the stroke-affected limbs on the tapered beam task. Peri-infarct cortical networks also exhibited a reduction in the number of functional connections per neuron and a sustained disruption in neural assembly structure, including a reduction in the number of assemblies and an increased recruitment of neurons into functional assemblies. Elevated correlation between assemblies within the peri-infarct region peaked 1-week after stroke and was sustained throughout recovery. Surprisingly, distal networks, even in the region associated with the shifted cFL functional map in anaesthetized preparations, were largely undisturbed.”

      “Cortical plasticity after stroke

      Plasticity within and between cortical regions contributes to partial recovery of function and is proportional to both the extent of damage, as well as the form and quantity of rehabilitative therapy post-stroke [80,81]. A critical period of highest plasticity begins shortly after the onset of stroke, is greatest during the first few weeks, and progressively diminishes over the weeks to months after stroke [19,82–86]. Functional recovery after stroke is thought to depend largely on the adaptive plasticity of surviving neurons that reinforce existing connections and/or replace the function of lost networks [25,52,87–89]. This neuronal plasticity is believed to lead to topographical shifts in somatosensory functional maps to adjacent areas of the cortex. The driver for this process has largely been ascribed to a complex cascade of intra- and extracellular signaling that ultimately leads to plastic re-organization of the microarchitecture and function of surviving peri-infarct tissue [52,80,84,88,90–92]. Likewise, structural and functional remodeling has previously been found to be dependent on the distance from the stroke core, with closer tissue undergoing greater re-organization than more distant tissue (for review, see [52]).”

      “Previous research examining the region at the border between the cFL and cHL somatosensory maps has shown this region to be a primary site for functional remapping after cFL directed photothrombotic stroke, resulting in a region of cFL and cHL map functional overlap [25]. Within this overlapping area, neurons have been shown to lose limb selectivity 1-month post-stroke [25]. This is followed by the acquisition of more selective responses 2-months post-stroke and is associated with reduced regional overlap between cFL and cHL functional maps [25]. Notably, this functional plasticity at the cellular level was assessed using strong vibrotactile stimulation of the limbs in anaesthetized animals. Our findings using longitudinal imaging in awake animals show an initial reduction in firing rate at 1-week post-stroke within the peri-infarct region that was predictive of functional impairment in the tapered beam task. This transient reduction may be associated with reduced or dysfunctional thalamic connectivity [93–95] and reduced transmission of signals from hypo-excitable thalamo-cortical projections [96]. Importantly, the strong negative correlation we observed between firing rate of the neural population within the peri-infarct cortex and the number of errors on the affected side, as well as the rapid recovery of firing rate and tapered beam performance, suggests that neuronal activity within the peri-infarct region contributes to the impairment and recovery. The common timescale of neuronal and functional recovery also coincides with angiogenesis and re-establishment of vascular support for peri-infarct tissue [83,97–100].”

      “Consistent with previous research using mechanical limb stimulation under anaesthesia [25], we show that at the 8-week timepoint after cFL photothrombotic stroke the cFL representation is shifted posterior from its pre-stroke location into the area lateral to the cHL map. Notably, our distal region for awake imaging was directly within this 8-week post-stroke cFL representation. Despite our prediction that this distal area would be a hotspot for plastic changes, there was no detectable alteration to the level of population correlation, functional connectivity, assembly architecture or assembly activations after stroke. Moreover, we found little change in the firing rate in either moving or resting states in this region. Contrary to our results, somatosensory-evoked activity assessed by two-photon calcium imaging in anesthetized animals has demonstrated an increase in cFL responsive neurons within a region lateral to the cHL representation 1-2 months after focal cFL stroke [25]. Notably, this previous study measured sensory-evoked single cell activity using strong vibrotactile (1 s, 100 Hz) limb stimulation under anaesthesia [25]. This frequency of limb stimulation has been shown to elicit near maximal neuronal responses within the limb-associated somatosensory cortex under anesthesia [101]. Thus, strong stimulation and anaesthesia may have unmasked non-physiological activity in neurons in the distal region that is not apparent during more naturalistic activation during awake locomotion or rest. Regional mapping defined using strong stimulation in anesthetized animals may therefore overestimate plasticity at the cellular level.”

      “Our results suggest a limited spatial distance over which the peri-infarct somatosensory cortex displays significant network functional deficits during movement and rest. Our results are consistent with a spatial gradient of plasticity mediating factors that are generally enhanced with closer proximity to the infarct core [84,88,90,91]. However, our analysis outside peri-infarct cortex is limited to a single distal area caudal to the pre-stroke cFL representation. Although somatosensory maps in the present study were defined by a statistical criterion for delineating highly responsive cortical regions from those with weak responses, the distal area in this study may have been a site of activity that did not meet the statistical criterion for inclusion in the baseline map. The lack of detectable changes in population correlations, functional connectivity, assembly architecture and assembly activations in the distal region may reflect minimal pressure for plastic change as networks in regions below the threshold for regional map inclusion prior to stroke may still be functional in the distal cortex. Thus, threshold-based assessment of remapping may further overestimate the neuroplasticity underlying functional reorganization suggested by anaesthetized preparations with strong stimulation. Future studies could examine distal areas medial and anterior to the cFL somatosensory area, such as the motor and pre-motor cortex, to further define the effect of FL targeted stroke on neuroplasticity within other functionally relevant regions. Moreover, the restriction of these network changes to peri-infarct cortex could also reflect the small penumbra associated with photothrombotic stroke, and future studies could make use of stroke models with larger penumbral regions, such as the middle cerebral artery occlusion model. 
Larger injuries induce more sustained sensorimotor impairment, and the relationship between neuronal firing, connectivity, and neuronal assemblies could be further probed relative to recovery or sustained impairment in these models. Recent research also indicates that stroke causes distinct patterns of disruption to the network topology of excitatory and inhibitory cells [73], and that stroke can disproportionately disrupt the function of high activity compared to low activity neurons in specific neuron sub-types [61]. Mouse models with genetically labelled neuronal sub-types (including different classes of inhibitory interneurons) could be used to track the function of those populations over time in awake animals. A potential limitation of our data is the undefined effect of age and sex on cortical dynamics in this cohort of mice (with ages ranging from 3-9 months) after stroke. Aging can impair neurovascular coupling [102–107] and reduce ischemic tolerance [108–111], and greater investigation of cortical activity changes after stroke in aged animals would more effectively model stroke in humans. Future research could replicate this study with mice in middle-age and aged mice (e.g. 9 months and 18+ months of age), and with sufficient quantities of both sexes, to better examine age and sex effects on measures of cortical function.”

      315 - 317 Remodelling is dependent on the distance from the stroke core, with closer tissue undergoing greater reorganization than more distant tissue. There is no evidence that the more distant tissue undergoes any reorganization at all.

      We agree with the reviewer that no remodelling is apparent in our distal area. We have removed reference to our study showing remodeling in the distal area, and have amended the text as follows:

      “Likewise, structural and functional remodeling has previously been found to be dependent on the distance from the stroke core, with closer tissue undergoing greater re-organization than more distant tissue (for review, see [52]).”

      412-414 The authors speculate that a strong stimulation under anaesthesia may unmask connectivity in distal regions. However, the motivation for this paper is that anaesthesia is a confounding factor. It appears to me that, given the results of this study, the authors should argue that the functional connectivity observed under anaesthesia may be spurious.

      The incorrect word was used here. We have corrected the paragraph of the discussion and amended it as follows:

      “Consistent with previous research using mechanical limb stimulation under anaesthesia [25], we show that at the 8-week timepoint after cFL photothrombotic stroke the cFL representation is shifted posterior from its pre-stroke location into the area lateral to the cHL map. Notably, our distal region for awake imaging was directly within this 8-week post-stroke cFL representation. Despite our prediction that this distal area would be a hotspot for plastic changes, there was no detectable alteration to the level of population correlation, functional connectivity, assembly architecture or assembly activations after stroke. Moreover, we found little change in the firing rate in either moving or resting states in this region. Contrary to our results, somatosensory-evoked activity assessed by two-photon calcium imaging in anesthetized animals has demonstrated an increase in cFL responsive neurons within a region lateral to the cHL representation 1-2 months after focal cFL stroke [25]. Notably, this previous study measured sensory-evoked single cell activity using strong vibrotactile (1 s, 100 Hz) limb stimulation under anaesthesia [25]. This frequency of limb stimulation has been shown to elicit near maximal neuronal responses within the limb-associated somatosensory cortex under anesthesia [101]. Thus, strong stimulation and anaesthesia may have unmasked non-physiological activity in neurons in the distal region that is not apparent during more naturalistic activation during awake locomotion or rest. Regional mapping defined using strong stimulation in anesthetized animals may therefore overestimate plasticity at the cellular level.”


      Figure 1 and 2: Scale bar missing.

      Scale bars added to both figures.

      Figure 2: The representative image shows a drastic reduction of the forelimb response area, contrary to the general description of the findings. It would also be beneficial to see a graph with lines connecting the pre-stroke and 8-week datapoints.

      The data for Figure 2 have been re-analyzed using a new criterion of 5X the standard deviation of the baseline period for determining the threshold for limb mapping. Figure 2 and the relevant manuscript and figure legend text have been amended. In agreement with the reviewer's observation, there is no increase in forelimb response area, but instead a non-significant decrease in the average forelimb area.
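
      The thresholding criterion described above can be sketched as follows. This is only an illustration, not the authors' analysis code: it assumes the threshold is the baseline mean plus 5 times the baseline standard deviation, and the function names and trace values are hypothetical.

```python
from statistics import mean, pstdev


def response_threshold(baseline, k=5.0):
    """Assumed criterion: baseline mean + k * baseline SD (k = 5 here)."""
    return mean(baseline) + k * pstdev(baseline)


def is_responsive(baseline, evoked_peak, k=5.0):
    """A pixel (or region) counts as part of the map if its evoked peak
    exceeds the threshold derived from its own baseline period."""
    return evoked_peak > response_threshold(baseline, k)


# Hypothetical baseline trace for one pixel (arbitrary units).
baseline_trace = [0.0, 1.0, 0.0, 1.0]
print(response_threshold(baseline_trace))
print(is_responsive(baseline_trace, evoked_peak=3.5))
```

      A stricter multiplier shrinks the mapped area, which is why the choice of criterion can change the apparent size of a representation.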

    1. Author Response

      We would like to thank the reviewers for providing constructive feedback on the manuscript. To address the weaknesses identified, we are performing additional experiments and generating additional data, to be added to the updated manuscript.

      (1) The utility of a pipeline depends on the generalization properties.

      While the proposed pipeline seems to work for the data the authors acquired, it is unclear if this pipeline will actually generalize to novel data sets possibly recorded by a different microscope (e.g. different brand), or different imaging conditions (e.g. illumination or different imaging artifacts) or even to different brain regions or animal species, etc.

      The authors provide a 'black-box' approach that might work well for their particular data sets and image acquisition settings but it is left unclear how this pipeline is actually widely applicable to other conditions as such data is not provided.

      In my experience, without well-defined image pre-processing steps and without training on a wide range of image conditions, pipelines typically require significant retraining, which in turn requires generating sufficient amounts of training data, partly defeating the purpose of the pipeline. It is unclear from the manuscript how well this pipeline will perform on novel data possibly recorded by a different lab or with a different microscope.

      To address generalizability, we are performing several validation experiments with data from different 1) channels, 2) species (rat), and 3) microscopes, to highlight the robustness of our deep learning (DL) segmentation model to out-of-distribution data with different characteristics and acquisition protocols. We first used our model to segment three images (507x507 x&y, 250-170 um z) from three C57BL/6 mice acquired on the same two-photon fluorescence microscope following the same imaging protocol. The vasculature was labelled with Texas Red dextran, as in the current experiment. In place of the EYFP signal from pyramidal neurons (2nd channel), Gaussian noise was generated with a mean and standard deviation identical to those of the acquired vascular channel. A second set of two images (507x507 x&y, 300-400 um z) was obtained from two Fischer rats with Alexa680-dextran label in the plasma; these rats were imaged on the same two-photon fluorescence microscope, but with galvano scanners (instead of resonant scanners). A second channel of random Gaussian noise was also added here. Finally, an image of vasculature from an ex vivo cleared mouse brain (1665x1205x780 um) imaged on a light sheet fluorescence microscope (Miltenyi UltraMicroscope Blaze) was also segmented with our model. Lectin-DyLight 649 was used to label the vasculature in this cohort. The Dice Score, Precision, Recall, Hausdorff 95%, and Mean surface distance will be reported for all of these additional image segmentations, upon generation of ground truth images. Finally, examples of the generated segmentation masks are presented in Author response image 1 for visual comparison. Of final note, should the segmentation results on a new data set be unsatisfactory, the methods downstream from segmentation are still applicable and the model can be further fine-tuned on other out-of-distribution data.
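
      For readers unfamiliar with the overlap metrics named above, the sketch below computes the Dice score, precision, and recall from a predicted and a ground-truth binary mask. It is a minimal illustration with hypothetical names and toy masks (flattened to 1-D lists), not the authors' evaluation code, and it omits the distance-based metrics (Hausdorff 95% and mean surface distance).

```python
def overlap_metrics(pred, truth):
    """Dice, precision and recall for two equally sized binary masks.

    `pred` and `truth` are flat sequences of 0/1 voxel labels.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # false negatives
    denom = 2 * tp + fp + fn
    return {
        # Two empty masks agree perfectly, so Dice is defined as 1 there.
        "dice": 2 * tp / denom if denom else 1.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy example: 5 voxels, 2 agreed vessel voxels, 1 false positive, 1 miss.
print(overlap_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```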

      Author response image 1.

      Examples of the deep learning model output on out-of-distribution data from a different mouse strain, from a different species (Fischer rat), and on a different microscope using a different imaging modality.

      (2) Some of the chosen analysis results seem to not fully match the shown data, or the visualization of the data is hard to interpret in the current form.

      We are updating the visualizations to make them more accessible and we will ensure matching between tables and figures.

      (3) Additionally, some measures seem not fully adapted to the current situation (e.g. the efficiency measure does not consider possible sources or sinks). Thus, some additional analysis work might be required to account for this.

      Thank you for your comment. The efficiency metric was selected because it does not consider sources or sinks. We do agree that accounting for vessel subtypes in the analysis (thus classifying larger vessels as either supplying or draining) would be uniquely useful; it is, however, extremely laborious. We are therefore leveraging machine learning in a parallel project to afford vessel classification by subtype. The source/sink analysis is also confounded by the small field-of-view of in situ 2PFM. Future work will investigate network remodelling across the whole brain with ex vivo light sheet fluorescence microscopy.
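
      To illustrate why an efficiency measure needs no source or sink labels, the sketch below computes the standard global efficiency of an unweighted graph: the mean of inverse shortest-path lengths over all node pairs. This is an assumption about the general form of the metric; the exact definition and edge weighting used in the manuscript are not stated in this response, and the toy graph is hypothetical.

```python
from collections import deque


def global_efficiency(adjacency):
    """Mean of 1/d(i, j) over all ordered node pairs (i != j).

    `adjacency` maps each node to an iterable of its neighbours
    (undirected, unweighted). Unreachable pairs contribute 0.
    """
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        # Breadth-first search gives hop-count shortest paths from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))


# Hypothetical three-junction vessel path: a - b - c
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(global_efficiency(path))
```

      Only pairwise path lengths enter the calculation, so no vessel has to be classified as supplying or draining.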

      (4) The authors apply their method to in vivo data. However, there are some weaknesses in the design that make it hard to accept many of the conclusions and even to see that the method could yield much useful data with this type of application. Primarily, the acquisition of a large volume of tissue is very slow. In order to obtain a network of vascular activity, large volumes are imaged with high resolution. However, the volumes are scanned once every 42 seconds following stimulation. Most vascular responses to neuronal activation have come and gone in 42 seconds so each vessel segment is only being sampled at a single time point in the vascular response. So all of the data on diameter changes are impossible to compare since some vessels are sampled during the initial phase of the vascular response, some during the decay, and many probably after it has already returned to baseline. The authors attempt to overcome this by alternating the direction of the scan (from surface to deep and vice versa). But this only provides two sample points along the vascular response curve and so the problem still remains.

      We thank the Reviewer for bringing up this important point.

      Although vessels can show relatively rapid responses to perturbation, vascular responses to photostimulation of ChannelRhodopsin-2 in neighbouring neurons are typically long lasting: they do not come and go in 42 seconds. To demonstrate this point, we acquired higher temporal-resolution images of smaller volumes of tissue over 5 minutes preceding and following the 5-s photoactivation with the original parameters. The imaging protocol differed in that we used a piezoelectric motor, a smaller field of view, and only 3x frame averaging, resulting in a temporal resolution of 1.57-2.63 seconds. This acquisition was repeated at 4 different cortical depths (325 um, 250 um, 150 um, and 40 um) in a single mouse. The vascular radii were estimated using our presented pipeline. Raw data and LOESS fits are shown in Author response image 2 (below). Vessels shorter than 20 um in length were excluded from the analysis. A video of one of the acquisitions is shown along with the timecourses of select vessels’ caliber changes in Author response image 3. The vascular caliber changes following photostimulation persisted for several minutes, consistent with earlier observations by us and others (1–4). These higher temporal-resolution scans of smaller tissue volumes will be repeated in two more mice; we will then assess the repeatability of individual vessel responses to repeated stimulations.

      Author response image 2.

      A. The vascular radii of multiple vessels were imaged at 4 different cortical depths, each within a 507 x (75-150) x (30-45) um tissue volume. Baseline scanning lasted for 5 minutes, followed by 5 seconds of blue or green light stimulation at 4.3 mW/mm^2, and culminating in 5 minutes of post-stimulation scanning. B. LOESS fits of the vessel radius estimates for each vessel segment identified.
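
      As a rough guide to what a LOESS fit of a radius time course involves, the sketch below implements a basic locally weighted linear regression with a tricube kernel. It is a simplified stand-in under stated assumptions: the span (`frac`), the absence of robustness iterations, and the example data are all hypothetical, and the authors' actual LOESS settings are not given in this response.

```python
import math


def loess_fit(xs, ys, frac=0.8):
    """Locally weighted linear regression evaluated at each x in `xs`."""
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours used per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest sample (guard against zero).
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        # Tricube weights fall to zero at the bandwidth edge.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        denom = sw * swxx - swx ** 2
        if abs(denom) < 1e-12:
            # Degenerate neighbourhood: fall back to the weighted mean.
            fitted.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            fitted.append((swy - slope * swx) / sw + slope * x0)
    return fitted


# Hypothetical radius samples (um) at six time points (s).
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
radii = [2.0, 2.1, 2.4, 2.3, 2.6, 2.5]
print(loess_fit(times, radii))
```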

      Author response image 3.

      Estimated vascular radius at each timepoint for select vessels from the imaging stack shown in the following video: https://flip.com/s/kB1eTwYzwMJE

      (5) A second problem is the use of optogenetic stimulation to activate the tissue. First, it has been shown that blue light itself can increase blood flow (Rungta et al 2017). The authors note the concern about temperature increases but that is not the same issue. The discussion mentions that non-transgenic mice were used to control for this with "data not shown". This is very important data given these earlier reports that have found such effects and so should be included.

      We will update the manuscript to incorporate the data on volumetric scanning in nontransgenic C57BL/6 mice undergoing blue light stimulation, with parameters identical to those used in Thy1-ChR2 mice. As before, responders were identified as vessels that, following blue light stimulation, show a radius change greater than twice their baseline radius standard deviation; their estimated radius changes are shown in Author response image 4 below. There was no statistically significant difference between the radius distributions of any of the photostimulation conditions and the pre-photostimulation baseline. A comparison of these data with those from the transgenic Thy1-ChR2-EYFP mice will be included in manuscript updates.
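
      The responder criterion can be sketched as below. This is an illustration only: it assumes the change is measured from the mean baseline radius and that the sample standard deviation is used, and the function name and radius values are hypothetical.

```python
from statistics import mean, stdev


def classify_vessel(baseline_radii, post_radius, k=2.0):
    """Label a vessel by comparing its post-stimulation radius change
    with k times the standard deviation of its baseline radii."""
    change = post_radius - mean(baseline_radii)
    threshold = k * stdev(baseline_radii)
    if change > threshold:
        return "dilator"
    if change < -threshold:
        return "constrictor"
    return "non-responder"


# Hypothetical baseline radii (um) from repeated pre-stimulation scans.
baseline = [2.0, 2.2, 1.8, 2.0]
print(classify_vessel(baseline, 2.5))
print(classify_vessel(baseline, 1.6))
print(classify_vessel(baseline, 2.1))
```

      Because the threshold scales with each vessel's own baseline variability, a vessel with a noisy baseline needs a larger absolute change to count as a responder.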

      Author response image 4.

      Radius change measurements for responding vessels from the Thy1-ChR2 mice described in the manuscript (top row) vs. 4 wild-type C57BL/6J mice (bottom row). Response to photostimulation was defined as a radius change above twice the vessel's baseline standard deviation. Light at 458 nm was applied at 1.1 mW/mm^2 and 4.3 mW/mm^2, while light at 552 nm was applied at 4.3 mW/mm^2. No statistically significant differences were observed between the radii distributions in any condition (Wilcoxon test, Bonferroni correction).
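
      For reference, the Bonferroni correction named in this caption multiplies each raw p-value by the number of comparisons and caps the result at 1. The sketch below shows only that adjustment step; the p-values are made up for illustration and are not results from this study.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Illustrative (made-up) raw p-values for three comparisons.
raw = [0.01, 0.04, 0.5]
print(bonferroni(raw))
```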

      (6) Secondly, there doesn't seem to be any monitoring of neural activity following the photo-stimulation. The authors repeatedly mention "activated" neurons and claim that vessel properties change based on distance from "activated" neurons. But I can't find anything to suggest that they know which neurons were active versus just labeled. Third, the stimulation laser is focused at a single depth plane. Since it is single-photon excitation, there is likely a large volume of activated neurons. But there is no way of knowing the spatial arrangement of neural activity and so again, including this as a factor in the analysis of vascular responses seems unjustified.

      Given the high fidelity of Channelrhodopsin-2 activation with blue light, we assume that all labeled neurons within the volume of photostimulation are being activated. Depending on their respective connectivities, their postsynaptic neurons (whether or not they are labelled) are also activated. We indeed agree with the reviewer that the spatial distribution of neuronal activation is not well defined. We will revise the manuscript to update the terminology from "activated" to "labeled" neurons and stress in the Discussion that the motivation for assessing the distance to the closest labelled neuron as one of our metrics is purely to demonstrate the possibility of linking vascular responses to activations in some of their neighbouring neurons and of including morphological metrics in the computational pipeline. Of final note, the depth-dependence of the distance between labelled neurons and responding vessels can also readily be assessed using our computational pipeline.

      (7) The study could also benefit from a clearer illustration of the quality of the model's output. It is hard to tell from static images of 3-D volumes how accurate the vessel segmentation is. Perhaps some videos going through the volume with the masks overlaid would provide some clarity. Also, a comparison to commercial vessel segmentation programs would be useful in addition to benchmarking to the ground truth manual data.

      We generated a video demonstrating the deep-learning model outputs and have made the video available here: https://flip.com/s/_XBs4yVxisNs Additional videos will be uploaded.

      (8) Another useful metric for the model's success would be the reproducibility of the vessel responses. Seeing such a large number of vessels showing constrictions raises some flags and so showing that the model pulled out the same response from the same vessels across multiple repetitions would make such data easier to accept.

      We have generated a figure demonstrating the repeatability of the vascular responses following photoactivation in a volume, and presented them next to the corresponding raw acquisitions for visual inspection. It is important to note that there is significant biological variability in vessels’ responses to repeated stimulation, as described previously (2,5). Constrictions have been reported in the literature by our group and others (1,3,4,6,7), though their prevalence has not been systematically studied to date. Concerning the reproducibility of our analysis, we will demonstrate model reproducibility (as a metric of its success) in the updated manuscript.

      Author response image 5.

      Registered acquisitions of the vasculature before and after optogenetic stimulation for 5 scan pairs over 3 different stimulation conditions. The estimated radii along vessel segments are presented.

      Author response image 6.

      Sample capillary constrictions from maximum intensity projections at repeated timepoints following optogenetic stimulation. The baseline (pre-stimulation) image is shown on the left and the post-stimulation image on the right, with the estimated radius changes listed to the left.

      (9) A number of findings are questionable, at least in part due to these design properties. There are unrealistically large dilations and constrictions indicated. These are likely due to artifacts of the automated platform. Inspection of these results by eye would help understand what is going on.

      Some of the dilations were indeed large in magnitude. We present select examples of large dilations and constrictions ranging in magnitude from 2.08 to 10.80 um for visual inspection (for reference, the average magnitude of radius changes across vessels and stimuli was 0.32 +/- 0.54 um). Diameter changes above 5 um were visually inspected.

      Author response image 7.

      Additional views of diameter changes in maximum intensity projections ranging in magnitude from 2.08 um to 10.80 um.

      (10) In Figure 6, there doesn't seem to be much correlation between vessels with large baseline level changes and vessels with large stimulus-evoked changes. It would be expected that large arteries would have a lot of variability in both conditions and veins much less. There is also not much within-vessel consistency. For instance, the third row shows what looks like a surface vessel constricting to stimulation but a branch coming off of it dilating - this seems biologically unrealistic.

      We now plot photostimulation-elicited vesselwise radius changes vs. their corresponding baseline radius standard deviations (Author response image 8 below). The Pearson correlation between the baseline standard deviation and the radius change was 0.08 (p<1e-5) for 552nm 4.3 mW/mm^2 stimulation, -0.08 (p<1e-5) for 458nm 1.1 mW/mm^2 stimulation, and -0.04 (p<1e-5) for 458nm 4.3 mW/mm^2 stimulation. For non-control (i.e. blue) photostimulation conditions, the change in the radius is thus negatively correlated with the vessel’s baseline radius standard deviation. The within-vessel consistency is explicitly evaluated in Figure 8 of the manuscript. As for the instance of a surface vessel constricting while a downstream vessel dilates, it is important to remember that the 2PFM FOV restricts us to imaging a very small portion of the cortical microvascular network: one (among many) daughter vessels showing changes in the opposite direction to the parent vessel does not violate conservation of mass.

      Author response image 8.

      A plot of the vessel radius change elicited by photostimulation vs. baseline radius standard deviation.

      (11) As mentioned, the large proportion of constricting capillaries is not something found in the literature. Do these happen at a certain time point following the stimulation? Did the same vessel segments show dilation at times and constriction at other times? In fact, the overall proportion of dilators and constrictors is not given. Are they spatially clustered? The assortativity result implies that there is some clustering, and the theory of blood stealing by active tissue from inactive tissue is cited. However, this theory would imply a region where virtually all vessels are dilating and another region away from the active tissue with constrictions. Was anything that dramatic seen?

      The kinetics of the vascular responses are not accessible via the current imaging protocol and acquired data; however, this computational pipeline can readily be adapted to test hypotheses surrounding the temporal evolution of the vascular responses, as shown in Author response image 2 (with higher temporal-resolution data). Some vessels dilate at some time points and constrict at others as shown in Author response image 2. As listed in Table 2, 4.4% of all vessels constrict and 7.5% dilate for 452nm stimulation at 4.3 mW/mm^2. There was no obvious spatial clustering of dilators or constrictors: we expect such spatial patterns to more likely result from different modes of stimulation and/or in the presence of a pathology. The assortativity peaked at 0.4 (i.e. is quite far from 1 where each vessel’s response exactly matches that of its neighbour).
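
      To make the assortativity figure easier to interpret, the sketch below computes a scalar assortativity coefficient: the Pearson correlation between the response values at the two ends of every edge, with each undirected edge counted in both directions. This is the standard Newman-style definition and is assumed here for illustration; the manuscript's exact computation is not restated in this response, and the toy network and values are hypothetical.

```python
def scalar_assortativity(edges, value):
    """Pearson correlation of a node attribute across edge endpoints.

    `edges` is a list of (u, v) pairs; `value` maps node -> number
    (for example a radius change). Returns NaN when undefined.
    """
    xs, ys = [], []
    for u, v in edges:
        # Count each undirected edge in both directions (symmetric mixing).
        xs += [value[u], value[v]]
        ys += [value[v], value[u]]
    if not xs:
        return float("nan")
    mean_x = sum(xs) / len(xs)  # equals the mean of ys by symmetry
    cov = sum((x - mean_x) * (y - mean_x) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var if var else float("nan")


# Hypothetical radius changes (um): neighbours respond alike, so r = 1.
edges = [("a", "b"), ("c", "d")]
alike = {"a": 1.0, "b": 1.0, "c": -1.0, "d": -1.0}
print(scalar_assortativity(edges, alike))
```

      A value of 1 means every vessel's response matches its neighbour's, 0 means no relationship, and negative values mean neighbours tend to respond in opposite directions.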

      (12) Why were nearly all vessels > 5um diameter not responding >2SD above baseline? Did they have highly variable baselines or small responses? Usually, bigger vessels respond strongly to local neural activity.

      In Author response image 9, we now present the stimulation-induced radius changes vs. baseline radius variability across vessels with a radius greater than 5 um. The Pearson correlation between the radius change and the baseline radius standard deviation was 0.04 (p=0.5) for 552nm 4.3 mW/mm^2 stimulation, -0.26 (p<1e-5) for 458nm 1.1 mW/mm^2 stimulation, and -0.24 (p<1e-5) for 458nm 4.3 mW/mm^2 stimulation. We will incorporate an additional analysis to address this issue by identifying responding vessels as those showing supra-threshold percent change in their radius (instead of SD).

      Author response image 9.

      A plot of the vessel radius change elicited by photostimulation vs. baseline radius standard deviation in vessels with a baseline radius greater than 5 um.


      (1) Alarcon-Martinez L, Villafranca-Baughman D, Quintero H, et al. Interpericyte tunnelling nanotubes regulate neurovascular coupling. Nature. 2020;585(7823):91-95. doi:10.1038/s41586-020-2589-x

      (2) Mester JR, Bazzigaluppi P, Weisspapir I, et al. In vivo neurovascular response to focused photoactivation of Channelrhodopsin-2. NeuroImage. 2019;192:135-144. doi:10.1016/j.neuroimage.2019.01.036

      (3) O’Herron PJ, Hartmann DA, Xie K, Kara P, Shih AY. 3D optogenetic control of arteriole diameter in vivo. Nelson MT, Calabrese RL, Nelson MT, Devor A, Rungta R, eds. eLife. 2022;11:e72802. doi:10.7554/eLife.72802

      (4) Hartmann DA, Berthiaume AA, Grant RI, et al. Brain capillary pericytes exert a substantial but slow influence on blood flow. Nat Neurosci. Published online February 18, 2021:1-13. doi:10.1038/s41593-020-00793-2

      (5) Mester JR, Bazzigaluppi P, Dorr A, et al. Attenuation of tonic inhibition prevents chronic neurovascular impairments in a Thy1-ChR2 mouse model of repeated, mild traumatic brain injury. Theranostics. 2021;11(16):7685-7699. doi:10.7150/thno.60190

      (6) Mester JR, Rozak MW, Dorr A, Goubran M, Sled JG, Stefanovic B. Network response of brain microvasculature to neuronal stimulation. NeuroImage. 2024;287:120512. doi:10.1016/j.neuroimage.2024.120512

      (7) Hall CN, Reynell C, Gesslein B, et al. Capillary pericytes regulate cerebral blood flow in health and disease. Nature. 2014;508(7494):55-60. doi:10.1038/nature13165

    1. eLife assessment

      This valuable study advances our understanding of the brain nuclei involved in rapid-eye movement (REM) sleep regulation. Using a combination of imaging, electrophysiology, and optogenetic tools, the study provides convincing evidence that inhibitory neurons in the preoptic area of the hypothalamus influence REM sleep. This work will be of interest to neurobiologists working on the brain circuits of sleep.

    2. Reviewer #1 (Public Review):

      This paper identifies GABA cells in the preoptic hypothalamus and others in the posterior hypothalamus which are involved in REM sleep rebound (the increase in REM sleep) after selective REM sleep deprivation. By calcium photometry, these preoptic cells are most active during REM, and show more calcium signals during REM deprivation, suggesting they respond to "REM pressure". Inhibiting these cells optogenetically diminishes REM sleep. The optogenetic and photometry work is carried out to a high standard, the paper is well written, and the findings are interesting and enhance our understanding of REM sleep regulation. The new findings make it clear that, as for the circuitry that regulates NREM sleep, REM sleep circuitry is also quite distributed in the brain. It is unclear if there is a true "REM center". The study of mechanisms of catching up on lost sleep (sleep homeostasis) has previously focused on NREM sleep, where various circuits have been identified. That there is a special mechanism that also tracks time awake and compensates with REM sleep is intriguing.

      In a broader context, the existence of REM rebound suggests that REM sleep must have a function, otherwise why catch up on it. There is a lot of literature that suggests REM contributes to emotional processing, for example. The new findings deepen our appreciation of REM regulation. As REM sleep is often disturbed in stress (e.g. post-traumatic stress disorder) and in depression, understanding more about REM regulation could ultimately aid treatments for people living with these conditions.

    3. Reviewer #2 (Public Review):

      Maurer et al investigated the contribution of GAD2+ neurons in the preoptic area (POA), projecting to the tuberomammillary nucleus (TMN), to REM sleep regulation. They applied an elegant design to monitor and manipulate activity of this specific group of neurons: a GAD2-Cre mouse, injected with retrograde AAV constructs in the TMN, thereby presumably only targeting GAD2+ cells projecting to the TMN. Using this set-up in combination with technically challenging techniques including EEG with photometry and REM sleep deprivation, the authors found that the cell type studied becomes active shortly (≈40 s) prior to entering REM sleep and remains active during REM sleep. Moreover, optogenetic inhibition of GAD2+ cells reduces REM sleep by a third, and also impairs the rebound in REM sleep in the following hour. Thus, the data make a convincing case for a role of GAD2+ neurons in the POA projecting to the TMN in REM sleep regulation.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We would like to extend our gratitude to the reviewers for their meticulous analysis and constructive feedback on our manuscript. We have revised our paper based on the suggestions regarding supporting literature and the theory behind CAPs along with detailed insights regarding our methods. Their suggestions have been extremely useful in strengthening the clarity and rigor of our manuscript.

      Reviewer #1 (Recommendations For The Authors):

      (1) There are no obvious problems with this paper and it is relatively straightforward. There are some challenges that I would like to suggest. These variants have multiple mutations, so it would be interesting if you could drill down to find out which mutation is the most important for the collective changes reported here. I would like to see a sequence alignment of these variants, perhaps in the supplemental material, just to get some indication of the extent of mutations involved.

      Finding the most important mutation within a set is a tricky question, as each mutation changes the way future mutations will affect function due to epistasis. Indeed, this is what we aim to explore in this work. To illustrate this point, we included a new supplementary figure S5A. Three critical mutations that emerged quickly, and were frequently observed in other dominant variants, were S477N, T478K, and N501Y. Thus, we computed the EpiScore values of these three mutations, with several critical residues contributing to hACE2 binding. The EpiScore distribution indicates that residues 477, 478, and 501 have strong epistatic (i.e., non-additive) interactions, as indicated by EpiScore values above 2.0.

      To further investigate these epistatic interactions, we first conducted MD simulations and computed the DFI profile of these three single mutants. We analyzed how different the DFI scores of the hACE2 binding interface residues of the RBD are, across three single mutants with Omicron, Delta, and Omicron XBB variants (Fig S5B). Fig S5B shows how mutations at these particular sites affect the binding interface DFI in various backgrounds, as the three mutations are also observed in the Omicron, XBB, and XBB 1.5 variants. If the difference in the DFI profile of the mutant and the given variant is close to 0, then we could safely state that this mutation affected the variant the most. However, what we observe is quite the opposite: the DFI profile of the mutation is significantly different in different variant backgrounds. While these mutations may change overall behavior, their individual contributions to overall function are more difficult to pin down because overall function is dependent on the non-additive interactions between many different residues.

      Author response image 1.

      (A) Three critical mutations that emerged quickly, and were frequently observed in other dominant variants, were S477N, T478K, and N501Y. EpiScores of sites 477, 478, and 501 with one another are shown with k = the binding interface of the open chain. These residues are highly epistatic, producing higher responses than expected when perturbed together. (B) The difference in the dynamic flexibility profiles between the single mutants and the most common variants for the hACE2 binding residues of the RBD. DFI profiles exhibit significant variation from zero, and also show different flexibility in each background variant, highlighting the critical non-additive interactions of the other mutation in the given background variant. Thus, these three critical mutations, impacting binding affinity, do not solely contribute to the binding. There are epistatic interactions with the other mutations in VOCs that shape the dynamics of the binding interface to modulate binding affinity with hACE2.

      As we discussed above, while the epistatic interactions are crucial and the collective impact of the mutations shapes the mutational landscape of the spike protein, we would like to note that mutation S486P is one of the critical mutations we identify, modulating both antibody and hACE2 binding, and our analysis reveals its strong non-additive interactions with the other mutational sites. This mutational site appears in both XBB1.5 and earlier Omicron strains, which highlights its importance in the functional evolution of the spike protein. CAPs 346R, 486F, and 498Q also may be important, as they have a high EpiScore, indicating critical epistatic interaction with many mutation sites.

      Regarding the suggestion about presenting the alignment of the different variants, we have attached a mutation table, highlighting the mutated residues for each strain compared to the reference sequence, as supplemental Figure S1 along with the full alignment file.

      (2) Also, I am wondering if it would be possible to insert some of these flexibilities and their correlations directly into the elastic network models to enable a simpler interpretation of these results. I realize this is beyond the scope of the present work, but such an effort might help in understanding these relatively complex effects.

      This is a great suggestion. A similar analysis has been performed for different proteins by Mcleash (see doi: 10.1016/j.bpj.2015.08.009) by modulating the spring constants of specific positions to alter specific flexibility and evaluating the change in elastic free energy to identify critical mutation (in particular, allosteric mutation) sites. We will be happy to pursue this as future work.


      (3) 1 typo on line 443 - should be binding instead of biding.

      Fixed, thanks for spotting that.

      (4) The two shades of blue in Fig. 4B were not distinguishable in my version.

      To fix this, we have changed the overlapping residues between Delta and Omicron to a higher contrast shade of blue.

      (5) Compensatory is often used in an entirely different way - additional mutations that help to recover native function in the presence of a deleterious mutation.

      Although our previous study (Ose et al. 2022, Biophysical Journal) shows that compensatory mutations were generally additive, the two ideas are not one and the same. We thank the reviewer for pointing this out. Therefore, to clarify, we have now described our results in terms of dynamic additivity, rather than compensation.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors note that the identified CAPs overlap with those of others (Cagliani et al. 2020; Singh and Yi 2021; Starr, Zepeda, et al. 2022). In itself, this merits a deeper discussion and explicit indication of which positions are not identified. However, there is one point that I believe may represent a fundamental flaw in this study in that the calculation of EP from the alignment of S proteins ignores entirely the differences in the interacting interface with which S for different coronaviruses in the alignment interact in the different receptors in each host species. This may be the reason why so many "CAPs" are in the RBD. The authors should at the very least make a convincing case of why they are not simply detecting constraints imposed by the different interacting partners, at least in the case of positions within the RBD interface with ACE2. Another point that the authors should discuss is that ACE2 is not the only receptor that facilitates infection, TMPRSS2 and possibly others have been identified as well. The results should be discussed in light of this.

      To begin with, we have now explicitly noted (on line 135) that “sites 478, 486, 498, and 681 have already been implicated in SARS-CoV-2 evolution, leaving the remaining 11 CAPs as undiscovered candidate sites for adaptation.” Evolutionary analyses are done using orthologous protein sequences, so there is no way to integrate information on different receptors in each host species in the calculation of EPs. However, we appreciate that the preponderance of CAPs in the RBD is likely due to different binding environments. We have added the following text (on line 83) to clarify our point: “Adaptation in this case means a virus which can successfully infect human hosts. As CAPs are unexpected polymorphisms under neutral theory, their existence implies a non-neutral effect. This can come in the form of functional changes (Liu et al. 2016) or compensation for functional changes (Ose et al. 2022). Therefore, we suspect that these CAPs, being unexpected changes from coronaviruses across other host species with different binding substrates, may be partially responsible for the functional change of allowing human infection.” This hypothesis is supported by the overlap of CAPs we identified with the positions identified in other studies (e.g., 478, 486, 498, and 681). Binding to TMPRSS2 and other substrates is also covered by this analysis, as it is a measure of overall evolutionary fitness rather than binding to any specific substrate. Our paper does focus on discussing hACE2 binding and mentions furin cleavage, but indeed lacks discussion on the role of TMPRSS2. We have added the following text to line 157: “Another host cell protease, TMPRSS2, facilitates viral attachment to the surface of target cells upon binding either to sites Arg815/Ser816, or Arg685/Ser686 which overlaps with the furin cleavage site 676-689, further emphasizing the importance of this area (Hoffmann et al. 2020b; Fraser et al. 2022).”

      (2) Turning now to the computational methods utilized to study dynamics, I have serious reservations about the novelty of the results as well as the validity of the methodology. First of all, the authors mention the work of Teruel et al. (PLOS Comp Bio 2021) in an extremely superficial fashion and do not mention at all a second manuscript by Teruel et al. (Biorxiv 2021.12.14.472622 (2021)). However, the work by Teruel et al. identifies positions and specific mutations that affect the dynamics of S and the evolution of the SARS-CoV-2 virus in light of immune escape, ACE2 binding, and open and closed state dynamics. The specific differences in approach should be noted but the results specifically should be compared. This omission is evident throughout the manuscript. Several other groups have also published on the use of normal-mode analysis methods to understand the Spike protein, among them Verkhivker et al., Zhou et al., Majumder et al., etc.

      Thank you for your suggestions. Upon further examination of the listed papers, we have added citations to other groups employing similar methods. However, it's worth noting that the results of Teruel et al.'s studies are generally not directly comparable to our own. Particularly, they examine specific individual mutations and overall dynamical signatures associated with them, whereas our results are always considered in the context of epistasis and joint effects with CAPs, and all mutations belong to the common variants. Although important mutations may be highlighted in both cases, it is for very different reasons. Nevertheless, we provide a more detailed mention of the results of both studies. See lines 178, 255, and 393.

      (3) The last concern that I have is with respect to the methodology. The dynamic couplings and the derived index (DCI) are entirely based on the use of the elastic network model presented which is strictly sequence-agnostic. Only C-alpha positions are taken into consideration and no information about the side-chain is considered in any manner. Of course, the specific sequence of a protein will affect the unique placement of C-alpha atoms (i.e., mutations affect structure), therefore even ANM or ENM can to some extent predict the effect of mutations in as much as these have an effect on the structure, either experimentally determined or correctly and even incorrectly modelled. However, such an approach needs to be discussed in far deeper detail when it comes to positions on the surface of a protein such that the reader can gauge if the observed effects are the result of modelling errors.

      We would like to clarify that most of our results do not involve simulations of different variants, but rather how characteristic mutation sites for those variants contribute to overall dynamics. For the full spike, we operate on only two simulations: open and closed. When we do analyze different variants, starting on line 438, the observed difference does not come from the structure, but from the covariance matrix obtained from molecular dynamics (MD) simulations, which are sensitive to single amino acid changes.

      Reviewer #3 (Recommendations For The Authors):

      (1) On line 99 there is a misspelling, 'withing'.

      It has been fixed. Thanks for spotting that.

      (2) Some graphical suggestions to make the figures easier to read:

      In Figure 1C, a labeled circle around the important sites, the receptor binding domain, and the Furin cleavage site, would help the reader orient themselves. Moreover, it would make clear which CAPs are NOT in the noteworthy sites described in the text.

      Good idea. We have added transparent spheres and labels to show hACE2 binding sites and Furin cleavage sites.

      In Figure 2C the colors are a bit low contrast; moreover, there are multiple text sizes on the same figure which should perhaps be avoided to ensure legibility.

      We have made yellow brighter and standardized font sizes.

      Figure 3 is a bit dry, perhaps indicating in which bins the 'interesting' sites could be informative.

      Thank you for the suggestion, but the overall goal of Figure 3 is to illustrate that the mutational landscape is governed by the equilibrium dynamics in which flexible sites undergo more mutations during the evolution of the CoV2 spike protein. Therefore, adding additional positional information may complicate our message.

      Figure 4, the previous suggestions about readability apply.

      We have ensured uniform text size and higher-contrast colors.

      Figure 5B, the residue labels are too small.

      We increased the font size of the residue labels.

      In Figure 8 maybe adding Delta to let the reader orient themselves would be helpful to the discussion.

      Unfortunately, there is no single work that has experimentally quantified binding affinities towards hACE2 for all the variants. When we conducted the same analysis for the Delta variant in Figure 8, the experimental values were obtained from a different source (doi: 10.1016/j.cell.2022.01.001) and the values were significantly different from the experimental work we used for Omicron (Yue et al. 2023). When we adjusted for the difference in the experimentally measured binding affinity values of the original Wuhan strain in these two separate studies, we observed a similar correlation, as seen below. However, we think this might not be a proper representation. Therefore, we chose to keep the original figure.

      Author response image 2.

      The %DFI calculations for variants Delta, Omicron, XBB, and XBB 1.5. (A) %DFI profile of the variants are plotted in the same panel. The grey shaded areas and dashed lines indicate the ACE2 binding regions, whereas the red dashed lines show the antibody binding residues. (B) The sum of %DFI values of RBD-hACE2 interface residues. The trend of total %DFI with the log of Kd values overlaps with the one seen with the experiments. (C) The RBD antibody binding residues are used to calculate the sum of %DFI. The ranking captured with the total %DFI agrees with the susceptibility fold reduction values from the experiments.
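
      The comparison described in panels (B) and (C) can be illustrated with a minimal sketch. It assumes that %DFI is the percentile rank of each residue's DFI value within its own profile and that agreement is judged by comparing rank orders; both are assumptions made here for illustration. All variant names, DFI profiles, interface indices, and Kd values below are invented and are not data from the study, and no direction of the real relationship is implied.

```python
import math


def percentile_rank(values):
    """Map each value to the fraction of values less than or equal to it."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]


def interface_sum(dfi, interface_idx):
    """Sum the percentile-ranked DFI over the chosen interface residues."""
    pct = percentile_rank(dfi)
    return sum(pct[i] for i in interface_idx)


def ranking(scores):
    """Names ordered from lowest to highest score."""
    return sorted(scores, key=scores.get)


# Hypothetical DFI profiles over six residues (not data from the study).
dfi_profiles = {
    "variant_a": [0.10, 0.20, 0.90, 0.80, 0.30, 0.40],
    "variant_b": [0.10, 0.60, 0.20, 0.30, 0.80, 0.90],
    "variant_c": [0.50, 0.90, 0.10, 0.20, 0.80, 0.30],
}
interface = [2, 3]  # hypothetical interface residue indices
kd_nM = {"variant_a": 20.0, "variant_b": 5.0, "variant_c": 2.0}  # hypothetical

sums = {name: interface_sum(p, interface) for name, p in dfi_profiles.items()}
log_kd = {name: math.log10(kd) for name, kd in kd_nM.items()}

# Compare the two orderings; the toy values were chosen so that they agree.
same_order = ranking(sums) == ranking(log_kd)
```

      Comparing rank orders rather than raw values keeps the check independent of the units used for Kd.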

      (3) Replicas of the MD simulations would make the conclusions stronger in my opinion.

      We ran a 1 µs simulation and performed convergence analysis for the MD simulations following prior work (Sawle and Ghosh 2016). More importantly, we also evaluated the statistical significance of the computed DFI values, as explained in detail below (please see the answer to question 3 of Reviewer #3 (Public Review)).

      Reviewer #3 (Public Review):

      (1) A longer discussion of how the 19 orthologous coronavirus sequences were chosen would be helpful, as the rest of the paper hinges on this initial choice.

      The following explanation has been added on line 114: EP scores of the amino acid variants of the S protein were obtained using a Maximum Likelihood phylogeny (Kumar et al. 2018) built from 19 orthologous coronavirus sequences. Sequences were selected by examining available non-human sequences with a sequence identity of 70% or above to the human SARS CoV-2’s S protein sequence. This cutoff allows for divergence over evolutionary history such that each amino acid position had ample time to experience purifying selection, whilst limiting ourselves to closely related coronaviruses (Figure 1A).
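
      As an illustration of the 70% identity cutoff, the following minimal sketch filters candidate sequences by their identity to a reference. It assumes the sequences are already aligned to equal length and that identity is counted over columns without gaps; the response does not state how identity was computed, so this convention is an assumption. The function names and the toy sequences are hypothetical and are not real spike sequences.

```python
def percent_identity(ref: str, other: str) -> float:
    """Fraction of aligned, gap-free columns where both sequences agree."""
    if len(ref) != len(other):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(ref, other):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def select_orthologs(ref, candidates, cutoff=0.70):
    """Keep candidates whose identity to the reference meets the cutoff."""
    return {
        name: seq
        for name, seq in candidates.items()
        if percent_identity(ref, seq) >= cutoff
    }


# Toy aligned fragments (hypothetical, for illustration only).
reference = "MFVFLVLLPL"
candidates = {
    "identical": "MFVFLVLLPL",
    "borderline": "MFIFLLFLPL",
    "distant": "MKAA-VTTGL",
}
selected = select_orthologs(reference, candidates)
```

      In this toy input the sequence labeled "distant" falls below the cutoff and is dropped, while the other two are retained.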

      (2) The 'reasonable similarity' with previously published data is not well defined, nor was there any comment about some of the residues analyzed (namely 417-484).

      We removed the line about reasonable similarity as it was vague, added a line about residues 417-484, and revised the text accordingly, starting on line 354.

      (3) There seem to be no replicas of the MD simulations, nor a discussion of the convergence of these simulations. A more detailed description of the equilibration and production schemes used in MD would be helpful. Moreover, there is no discussion of how the equilibration procedure is evaluated, in particular for non-experts this would be helpful in judging the reliability of the procedure.

      We opted for a single, extended equilibrium simulation to comprehensively explore the long-term behavior of the system. Given the specific nature of our investigation and resource constraints, a well-converged, prolonged simulation was deemed a practical and scientifically valid approach, providing a thorough understanding of the system's dynamics (doi: 10.33011/livecoms.1.1.5957; https://doi.org/10.1146/annurev-biophys-042910-155255).

      We updated our methods section starting on line 605 with extended information about the MD simulations and the convergence criteria for the equilibrium simulations. We also added a section that explains our analysis to check the statistical significance of the obtained DFI values.

    2. Reviewer #3 (Public Review):


      The manuscript uses a combination of evolutionary approaches and structural/dynamics observations to provide mechanistic insights in the adaptation of the Spike protein during the evolution of SARS-COV-2 variants. The conclusion that CAP sites should be taken in particular account when considering the impact of the emergence of new strains and mutations is warranted.


      The results presented in this work are very well outlined with well-written text, pleasant and well-described pictures, and a didactical and clear description of the methods, e.g. the discussion of how the MD equilibration procedure is applied and evaluated is clear and well argued. The citation of relevant similar results with different approaches strengthens the reasoning; in particular, comparing the calculated scores with previous experimentally obtained data is one of the strongest points of the manuscript.


      There are no replicas of the molecular dynamics (MD) simulations, understandable since it's not a MD-focused paper. However, the comparison of multiple replicas could enhance the reliability of the findings.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We greatly thank you and the reviewers for your expert comments and valuable suggestions on our manuscript. After reading these comments, we realized that the previous version of the manuscript contained some weak points. Surely, the issues raised by the six reviewers were of great help in the revision of our manuscript.

      According to the comments, we have now fully revised the manuscript to address most of the questions and suggestions. In addition, we reworded some parts of the Introduction, Results and Discussion, Figures, Figure legends and Experimental Methods to increase the rigor of our conclusions.

      Overall, you will see that we have paid serious attention to all the concerns and criticisms expressed by reviewers. Addressing these various issues has most certainly allowed us to prepare a much-improved manuscript and for this we offer our hearty thanks.

      Reviewer #1 (Public Review):


      The organization of cell surface receptors in membrane nanodomains is important for signaling, but how this is regulated is poorly understood. In this study, the authors employ TIRFM single-molecule tracking combined with multiple analyses to show that ligand exposure increases the diffusion of the immune receptor FLS2 in the plasma membrane and its co-localization with remorin REM1.3 in a manner dependent on the phosphosite S938. They additionally show that ligand increases the dwell time of FLS2, and this is linked to FLS2 endocytosis, also in a manner dependent on S938 phosphorylation. The study uncovers a regulatory mechanism of FLS2 localization in the nanodomain crucial for signaling.


      TIRFM single-molecule tracking, FRAP, FRET, and endocytosis experiments were nicely done. The role of S938 phosphorylation is convincing.


      Question 1: The model suggests that S938 is phosphorylated upon flg22 treatment. This is actually not known.

      Reply: Thank you for your expert comments. Although the phosphorylation of Ser-938 upon flg22 treatment is not known, the model presented in the manuscript is based on previous studies that have shown the importance of Ser-938 phosphorylation for the function of FLS2 (Cao et al, 2013). When it is mutated to the phosphorylation-mimicking residues aspartate or glutamate, immune responses remain normal. These findings suggest that the phosphorylation of Ser-938 plays a critical role in activating defense mechanisms upon flagellin detection (Cao et al, 2013). We have now added the results of Cao et al. (2013) to the introduction of the revised manuscript to strengthen this point.

      Question 2: In addition, the S938D mutant does not show constitutively increased diffusion and co-localization with remorin. It is necessary to soften the tone in the conclusion.

      Reply: We appreciate the valuable suggestions from the reviewer. Based on our findings, we observed that the phosphorylation of Ser-938 significantly impacts the dynamics of flg22-induced FLS2. However, it does not alter the diffusion coefficient of FLS2 itself. In the revised manuscript, we have carefully adjusted the conclusion by softening the tone to reflect these findings.

      Question 3: The introduction (only two paragraphs) and discussion are not properly written in the context of the current understanding of plant receptors in nanodomains. The authors basically just cited a few publications of their own, and this is not acceptable.

      Reply: We accept the criticism. We have reworded the introduction and discussion sections to improve clarity. Furthermore, we have incorporated several new reports on plant receptors in nanodomains into the revised manuscript. In addition, we removed some publications from our own group and cited the latest references on plant receptors and nanodomains.

      Reviewer #2 (Public Review):


      The research conducted by Yaning Cui and colleagues delves into understanding FLS2-mediated immunity. This is achieved by comparing the spatiotemporal dynamics of an FLS2-S938A mutant and FLS2-WT, especially in relation to their association with the remorin protein. To delineate the differences between the FLS2-S938A mutant and FLS2-WT, they utilized a plethora of advanced fluorescent imaging techniques. By analyzing surface dynamics and interactions involving the receptor signal co-receptor BAK1 and remorin proteins, the authors propose a model of how FLS2 and BAK1 are assembled and positioned within a remorin-specific nano-environment during FLS2 ligand-induced immune responses.


      These techniques offer direct visualizations of molecular dynamics and interactions, helping us understand their spatial relationships and interactions during innate immune responses. Advanced cell biology imaging techniques are crucial for obtaining high-resolution insights into the intracellular dynamics of biomolecules. The demonstrated imaging systems are excellent examples to be used in studying plant immunity by integrating other functional assays.

      Weaknesses:

      It's essential to acknowledge that every fluorescence-based method, just like biochemical assays, comes with its unique limitations. These often pertain to spatial and temporal resolutions, as well as the sensitivity of the cameras employed in each setup. Meticulous interpretation is pivotal to guarantee an accurate depiction and to steer clear of potential misunderstandings when employing specific imaging systems to analyze molecular attributes. Moreover, a discerning interpretation and accurate image analysis can offer invaluable guidance for future studies on plant signaling molecules using these nice cell imaging techniques. For instance, although single-particle analysis couldn't conclusively link FLS2 and remorin, FLIM-FRET effectively highlighted their ligand-triggered association and the disengagement brought on by mutations. While these methodologies seemed to present differing outcomes, they were described in the manuscript as harmonious. In reality, these differences could highlight distinct protein populations active in immune responses, each accentuated differently by the respective imaging techniques due to their individual spatial and temporal limitations. Addressing these variations is imperative, especially when designing future imaging explorations of immune complexes.

      Reply: Thank you for your insightful comments and suggestions. We appreciate your expertise in fluorescence-based methods and the importance of careful interpretation and accurate image analysis. We agree with you that different imaging techniques may have their limitations and can highlight distinct aspects of protein dynamics and interactions.

      In our study, we used single-particle analysis and FLIM-FRET to investigate the spatiotemporal dynamics of FLS2 and its association with remorin. While single-particle analysis did not conclusively link FLS2 and remorin, FLIM-FRET effectively highlighted their ligand-triggered association and the disengagement caused by mutations. We acknowledge that these techniques may have different spatial and temporal resolutions, leading to the discrepancy in their results. However, after the normalized treatment, we can provide very similar conclusions. Accordingly, we have revised the manuscript.

      Reviewer #3 (Public Review):


      Receptor kinases (RKs) perceive extracellular signals to regulate many processes in plants. FLS2 is an RK that acts as a pattern-recognition receptor (PRR) to recognize bacterial flagellin and activate pattern-triggered immunity (PTI). PRRs such as FLS2 have been previously shown to reside within PM nanodomains, which can regulate downstream PTI signaling. In the current manuscript, Cui et al use single particle tracking to characterize the effect of previously-described phosphosite mutants (FLS2-S938A/D) on the PM organization, endocytosis, and signaling functions of FLS2. The authors confirm that FLS2-S938D but not -S938A is functional for flg22-induced responses, while also demonstrating that phosphodead mutation at this site (S938A) prevents flg22-induced sorting into nanodomains and endocytosis. These results are consistent with S938 being an important phosphorylation site for FLS2 function, however, they fall short of demonstrating that membrane disorganization of FLS2-S938A is responsible for downstream signaling defects.


      The authors' experiments (single particle tracking, co-localization, etc) do a good job of demonstrating how a non-functional version of FLS2 (S938A) does not alter its spatio-temporal dynamics, nanodomain organization, and endocytosis in response to flg22, suggesting that these require a functional receptor and are regulated by intracellular signaling components.


      Question 1: The authors do not provide direct evidence that S938 phosphorylation specifically affects membrane organization, rather than FLS2 signaling more generally. All evidence is consistent with S938A being a non-functional version of FLS2, wherein an activated/functional receptor is required for all downstream events including membrane re-organization, downstream signalling, internalization, etc. Furthermore, the authors never demonstrate that this site is phosphorylated in planta in the basal or flg22-elicited state.

      Reply: We apologize that this was not described clearly in the original manuscript. In fact, we found in our study that the phosphorylation of the Ser-938 site influences the efficient sorting of FLS2 into AtRem1.3-associated microdomains rather than membrane organization, as depicted in Figure 2. Furthermore, we found that the immune responses are disrupted when Ser-938 is mutated to alanine, which is consistent with previously reported results (Cao et al, 2013). However, they remain normal when Ser-938 is mutated to the phosphorylation-mimicking residues aspartate or glutamate. These results suggest that the phosphorylation of Ser-938 is crucial for activating defense mechanisms upon flagellin detection. Although the phosphorylation of Ser-938 in planta in the basal or flg22-elicited state is not known, the model presented in the manuscript is based on the results of our current investigation together with those of the previous study that showed the importance of Ser-938 phosphorylation for FLS2 function (Cao et al, 2013).

      Question 2: As written, the manuscript also has numerous scientific issues, including a misleading/incomplete description of plant immune signaling, lack of context from previous work, and extensive use of inappropriate references.

      Reply: We accept the criticism here. After reading the comments, we realized the problem. Now we have revised the misleading or incomplete description of plant immune signaling, added the context of previous works and deleted inappropriate references in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Question 1: The description of the data has no details. How many biological repeats were done? How were statistical analyses done? What is the concentration of flg22? How was the calcium flux done (Fig. 4A)? The method also lacks details and relevant references.

      Reply: We apologize for the lack of detail in presenting the data. Following your suggestion, we added comprehensive figure legends that provide clear explanations for each figure. Additionally, we included supplementary information on the measurement methods and references pertaining to calcium flux in the revised manuscript.

      Question 2: Data in Fig. 4 basically repeated the 2013 PLoS Pathog paper. Why were these experiments even performed? Were GFP-tagged FLS2 lines used in these experiments? If this is the case, the data just verified that the GFP-tagged FLS2 functions as expected and should be moved to supporting data.

      Reply: Thanks for the expert suggestions. In our study, we utilized GFP-tagged FLS2 lines to generate FLS2-S938 mutants and conducted experiments to investigate the flg22-induced immune response. Although some experiments in Figure 4 are similar to those reported (Cao et al, 2013), we provided a more detailed analysis of the immune response. This analysis covered both early and late immune responses, e.g., the activation of a calcium burst, mitogen-activated protein kinases (MAPKs), the induction of immune-responsive genes and callose deposition, ultimately resulting in the inhibition of plant growth. As some results are analogous to those of the previous paper, we have moved some of the experiments, including the analysis of MAPKs and callose deposition, to the supporting data section of the revised manuscript, as suggested.

      Question 3: Flg22-induced FLS2-BAK1 association does not require S938, this is consistent with prior study that flg22 acts as a molecular glue for the ectodomains of FLS2 and BAK1 (Sun et al., 2013 Science). This needs to be cited.

      Reply: Yes, we agree with the comment. We have now added a sentence to the revised manuscript: “This aligns with the previous finding that flg22 acts as a molecular glue for FLS2 and BAK1 ectodomains (Sun et al., 2013).”

      Question 4: Line 50, the references cited do not match what they say here.

      Reply: We are sorry for the mistake in citing inappropriate references. In the revised manuscript, we deleted this sentence as well as the incorrect reference.

      Question 5: Line 105, "flg22 can act as a ligand-like factor". It is a ligand!

      Reply: Sorry for the mistake. Now, the sentence was corrected in the revised manuscript by deleting the word “like”.

      Question 6: Line 107, FLS2/BAK1 heterodimerization, not heteroologomerization.

      Reply: Now we used “heterodimerization” to replace “heteroologomerization” in the revised manuscript.

      Question 7: Line 114, are these really the best references to cite here?

      Reply: After reading the comment, we found the references were not suitable here. Now we changed references by citing “(Martinière et al., 2021)” in the revised manuscript.

      Question 8: Lines 123-124, the sentence is incomplete.

      Reply: In the revised manuscript, we reworded the sentence to make it complete. It now reads: “In a previous investigation, we demonstrated that flg22 induces FLS2 translocation from AtFlot1-negative to AtFlot1-positive nanodomains in the plasma membrane, implying a connection between FLS2 phosphorylation and membrane nanodomain distribution (Cui et al., 2018). To validate this, we assessed the association of FLS2/FLS2S938D/FLS2S938A with membrane microdomains, using AtRem1.3-associated microdomains as representatives (Huang et al., 2019).”

      Question 9: Lines 169-170, Why is this "most important"?

      Reply: Sorry for the unsuitable description. As we have dramatically changed the manuscript, this sentence was deleted from the new version.

      Reviewer #2 (Recommendations For The Authors):

      Here are some specific areas of ambiguity in the study to be improved.

      Question 1: Clarity in statistical analysis is necessary. Many figure legends omit details such as the sample size "n", and the nature of the measurements, like ROIs, images, and dots, the size of the seedlings, etc.

      Reply: We appreciate this suggestion, which was also raised by Reviewer #1. We have now provided the details for each figure, including the sample size and the nature of the measurements, in the revised manuscript.

      Question 2: Additional background about the choice of FLS2-S938 mutant would be beneficial, given that this mutant doesn't affect the BAK1 interaction but nullifies several PTI responses.

      Reply: Yes, we agreed that some additional background is required for the FLS2-S938 mutant. Therefore, we added a sentence here: “FLS2 Ser-938 mutations impact flg22-induced signaling, while BAK1 binding remains unaffected, thereby suggesting Ser-938 regulates other aspects of FLS2 activity (Cao et al., 2013).” in the revised manuscript.

      Question 3: A specific segment "... Using CLSM, Fluorescence Correlation Spectroscopy (FCS) and Western blotting, we found that the endocytic vesicles of FLS2S938D increased significantly after flg22 treatment (Figure 3B-3E)..." is not easy to follow. The author may want to differentiate these methods and highlight them by indicating them as endocytic vesicle counting, receptor density on PM measurement by FCS, and WB-based protein degradation characterization to understand such mixed descriptions better. By the way, "Number of Endocytosis" should be "number of endocytic vesicles". Endocytosis is a process and uncountable.

      Reply: We thank the reviewer for kindly reminding us to differentiate the experimental methods. We have therefore changed the sentences in the revised manuscript to: “Employing confocal laser-scanning microscopy (CLSM) during 10 μM flg22 treatment, we tracked FLS2 endocytosis and quantified vesicle numbers over time (Figure 3B). It is evident that both FLS2 and FLS2S938D vesicles appeared 15 min after flg22 treatment, significantly increasing thereafter (Figure 3C). Notably, only a few vesicles were detected in FLS2S938A-GFP, indicating Ser-938 phosphorylation's impact on flg22-induced FLS2 endocytosis. Additionally, fluorescence correlation spectroscopy (FCS) (Chen et al., 2009) monitored molecular density changes at the PM before and after flg22 treatment (Figure S3F). Figure 3D shows that both FLS2-GFP and FLS2S938D-GFP densities significantly decreased after flg22 treatment, while FLS2S938A-GFP exhibited minimal changes, indicating Ser-938 phosphorylation affects FLS2 internalization. Western blotting confirmed that Ser-938 phosphorylation influences FLS2 degradation after flg22 treatment (Figure 3E), consistent with single-molecule analysis findings.” In addition, we changed “number of endocytosis” to “the number of endocytic vesicles” in Figure 3C as suggested.

      Question 4: In Figure 1 E, a discrepancy exists where the total percentages in the red and black columns don't sum up to 100%, while other groups look right. This needs clarification.

      Reply: We apologize for the incomplete data. We have now thoroughly supplemented, collated, and rechecked the data in Figure 1E. Due to an oversight during the production of the figure, some data were inadvertently omitted, resulting in the red column not reaching 100%. We also rechecked the data in the black column and confirmed that the total percentage adds up to 100%.

      Question 5: Although Figure 1F uses UMAP analysis to differentiate between FLS2WT and A mutants, only data pertaining to the "D" mutant is shown.

      Reply: Thank you for the expert comments. Because Figure 1 already contains several images, we originally showed only the data for the “D” mutant as a representative. As suggested, we have added all the UMAP images to the revised supplementary Figure S1F.

      Question 6: There are apparent inconsistencies in the FRAP results, particularly regarding the initial recovery points post-bleaching. A detailed statistical analysis, supplemented with FRAP images over time, should be included for clarity. Were they bleached to a similar ground level before monitoring their recovery? The data points from "before" and "after" bleaching were not shown. I found the red and blue curves showed similar recovery slopes, which suggests no long-distance movement changes for all three FLS2 versions, with or without flg22. This is opposite from the conclusions made by the author.

      Reply: Thank you for the expert comments. After reading the comments, we recognized this serious problem and therefore carried out a new FRAP experiment. The new results showed that, following complete bleaching of the three FLS2 samples to ground level, the recovery rates of FLS2 and FLS2S938D under flg22 treatment were significantly higher than those of the control group (Fig. 1G). In contrast, the recovery rate of FLS2S938A-GFP after flg22 treatment remained similar to that before treatment (Fig. 1G), indicating that the Ser-938 phosphorylation site indeed affects the flg22-induced lateral diffusion of FLS2 at the PM. The new results are broadly consistent with the motion range obtained from the single-molecule analysis and do not contradict changes in long-distance movement. Accordingly, we incorporated the new time-lapse FRAP images into Figure 1G and S1B.

      Question 7: There's a potential typo in Figure 1B regarding the bar size. It could neither possibly be 200 um nor 200 nm. Figure 1A also needs a scale bar.

      Reply: Apologies for the mistake. We have now corrected “200 μm” to “2 μm”. We have also included a scale bar in Figure 1A in the revised manuscript.

      Question 8: Because long-duration tracking in Imaris is unreliable, the authors analyzed tracks within 10 s and quantified very short-lived particles under 4 s. Such a 4 s surface retention for a receptor does not seem to match the functional endocytic internalization time for cargo. Even after the endocytic adaptor module recruitment, it would take at least more than 10 s to finish the internalization. In the field of endocytosis, these events are often described as abortive endocytic events. However, the disappearance of cargoes, FLS2 in this case, indicates internalization into the cytoplasm, which is interesting. May the author discuss further how these short events enhance our understanding of the functional behavior of FLS2?

      Reply: We greatly appreciate the valuable comments provided by the reviewer. After thorough consideration, we acknowledge that in our original manuscript we failed to distinguish the short-lived from the long-lived particles and vaguely grouped them together as internalized particles. We realize that it is inappropriate to categorize all particles as internalized. Therefore, we added the sentence “Additionally, numerous FLS2 exhibited short-lived dwell times, indicating abortive endocytic events associated with the endocytic pathway and signal transduction (Bertot et al., 2018)” in the revised manuscript.

      Question 9: Figure 2D should be comprehensive, presenting data for the WT, A, and D versions.

      Reply: Yes, we agreed with the suggestions. Now, we added several representative images for the WT, A, and D versions in the revised manuscript.

      Question 10: In Figure 2D, TIRM-SIM should be a typo and rectified to TIRF-SIM. Also, a detailed explanation of the TIRF-SIM setup and its specifics would be important. The imaging approach of SIM, especially the time duration for finishing all frames before reconstruction, is essential to rationalize its use in capturing and measuring an appropriate speed range of particle movement. May the author elaborate on the technique details and the use of TIRF-SIM for colocalization analysis? To clarify these, the author may provide additional TIRF-only movies of FLS2 (WT, A, D) and AtRem1.3 for comparison with TIRF-SIM still images.

      Reply: We apologize for the mistake. In the revised manuscript, we have corrected “TIRM-SIM” to “TIRF-SIM”. To rationalize its use in capturing and measuring an appropriate speed range of particle movement, we included a more detailed description of the imaging approach and the colocalization analysis of TIRF-SIM in the Materials and Methods section as follows: “The SIM images were taken by a 60 × NA 1.49 objective on a structured illumination microscopy (SIM) platform (DeltaVision OMX SR) with a sCMOS camera (Camera pixel size, 6.5 μm). The light source for TIRF-SIM included diode laser at 488 nm and 568 nm with pixel sizes (μm) of 0.0794 and 0.0794 (Barbieri et al., 2021). For the dual-color imaging, FLS2/FLS2S938A/FLS2S938D-GFP (488 nm/30.0%) and AtRem1.3-mCherry (561 nm/30.0%) were excited sequentially. The exposure time of the camera was set at 50 ms throughout single-particle imaging. The time interval for time-lapse imaging was 100 ms, the total time was 2 s, and the total number of time points was 21. The Imaris intensity correlation analysis plugin was used to calculate the co-localization ratio.” Furthermore, we provided additional TIRF-SIM movies of FLS2 (WT, A, D) and AtRem1.3.
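      As a quick consistency check on the acquisition settings quoted above, the stated 100 ms interval and 2 s total duration imply 21 time points if both the first frame (t = 0) and the last frame are captured. The short sketch below only illustrates that arithmetic; the function name `time_points` is hypothetical and is not part of the authors' analysis pipeline.

```python
def time_points(total_ms: int, interval_ms: int) -> int:
    """Count frames in a time-lapse, assuming frames at t = 0 and t = total_ms
    are both acquired (an assumption, not stated explicitly in the reply)."""
    if interval_ms <= 0 or total_ms < 0:
        raise ValueError("interval must be positive and total time non-negative")
    return total_ms // interval_ms + 1


# Values quoted in the reply: 100 ms interval, 2 s (2000 ms) total time.
n_frames = time_points(2000, 100)
timestamps_ms = [i * 100 for i in range(n_frames)]

print(n_frames)           # number of time points
print(timestamps_ms[-1])  # timestamp of the final frame in ms
```

      Under this assumption the 50 ms exposure fits within each 100 ms interval, so consecutive frames do not overlap.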

      Question 11: The colocalization displayed in Figure 2D is hard to tell. A colocalization ratio of FLS2-AtRem1.3 is shown as ~0.8%, which has only ~0.2% difference from the flg22-treated condition. "n" of Figure 2F should be specified in the legend, such as a line with a specific length, or an ROI with a specific area size.

      Reply: Thank you for the expert comments. Although the increase in colocalization after flg22 treatment is not large, the change is statistically significant as compared with the wild type. We agree that every fluorescence-based method, like biochemical analysis, has its own limitations, a point that was also raised by Reviewer #2 (Public Review). To provide stronger evidence, we also carried out a FLIM-FRET experiment as a supplement, which can effectively detect ligand-triggered association or dissociation. From Figure 2G and H, we found that the co-localization of FLS2/FLS2S938D-GFP with AtRem1.3-mCherry significantly increased in response to flg22 treatment (FLS2-GFP control: 2.45 ± 0.019 ns; FLS2-GFP flg22-treated: 2.39 ± 0.016 ns; FLS2S938D-GFP control: 2.42 ± 0.010 ns; FLS2S938D-GFP flg22-treated: 2.35 ± 0.028 ns). In contrast, FLS2S938A-GFP shows no significant changes (control: 2.53 ± 0.011 ns; flg22-treated: 2.56 ± 0.013 ns), indicating that Ser-938 phosphorylation influences efficient sorting of FLS2 into AtRem1.3-associated microdomains. Following the suggestion of the reviewer, we have now rearranged the order of 2E and 2F, in which N represents the entire image region used for analysis rather than a specific region of interest.
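      To make the direction of the reported lifetime changes easier to read, the sketch below tabulates the shift in mean donor lifetime between control and flg22-treated samples, using only the mean values quoted in the reply and taking all of them to be in nanoseconds (an assumption for the FLS2-GFP pair). A shorter donor lifetime is generally consistent with more FRET. This is illustrative arithmetic only: it does not reproduce the authors' statistics, and the helper name `lifetime_shift` is hypothetical.

```python
# Mean donor lifetimes (control, flg22-treated) as quoted in the reply,
# all assumed to be in nanoseconds.
lifetimes_ns = {
    "FLS2-GFP": (2.45, 2.39),
    "FLS2S938D-GFP": (2.42, 2.35),
    "FLS2S938A-GFP": (2.53, 2.56),
}


def lifetime_shift(control_ns: float, treated_ns: float) -> float:
    """Return treated minus control lifetime in ns, rounded to 3 decimals."""
    return round(treated_ns - control_ns, 3)


shifts = {name: lifetime_shift(*pair) for name, pair in lifetimes_ns.items()}

for name, shift in shifts.items():
    direction = "shorter" if shift < 0 else "longer"
    print(f"{name}: {shift:+.2f} ns ({direction} after flg22)")
```

      The sign of each shift matches the reply's description: the wild-type and S938D lifetimes shorten after flg22 treatment, whereas the S938A lifetime does not. Whether any shift is significant depends on the reported errors and sample sizes, which this sketch does not assess.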

      Question 12: I appreciate the nice results of the FLIM-FRET results for FLS2-Rem1.3. Figure 2H should be supplemented with additional representative images of all FLS2 variants including WT and mutants.

      Reply: Thanks for your warm encouragement. As suggested, we added all the representative images in the revised manuscript.

      Question 13: The unit of the X-axis of Figure 2E cannot be pixel. Should it be μm? In the method, the author could specify the camera model and magnification for TIRF-SIM to understand the pixel size of the image better.

      Reply: We apologize for the mistake. Indeed, the unit of the X-axis in Figure 2E should be μm. We have now corrected this in Figure 2E in the revised manuscript. In addition, we included a detailed description of the TIRF-SIM imaging approach in the Materials and Methods section as follows: “The SIM images were taken by a 60 × NA 1.49 objective on a structured illumination microscopy (SIM) platform (DeltaVision OMX SR) with a sCMOS camera (Camera pixel size, 6.5 μm)”.

      Question 14: "... as shown in A..." in Figure Legend 2E should be "... as shown in D..."

      Reply: Thanks for pointing out this mistake. In the revised manuscript, we used “as shown in D” to replace “as shown in A”.

      Question 15: I recommend that the authors exercise caution when drawing conclusions based on the Rem1.3 data and when representing the "microdomain" concept in their final model. While Rem1.3 punctate is a nanometer-sized protein cluster specific to its identity, its shape can be categorized as a nanodomain. Conceptually, however, it neither universally represents all nanodomains nor microdomains, as depicted in Figure 4. We should exercise caution to prevent providing misleading information to the field.

      Reply: We thank the reviewer for expert comments. To avoid misleading conclusions, we changed “nanodomains” to “AtRem1.3-associated microdomains” in the revised manuscript. Besides, we have also made modifications to Figure 4.

      Reviewer #3 (Recommendations For The Authors):

      Question 1: The manuscript needs to be extensively re-written and has severe issues as-is. Many references are either not quite appropriate or are completely unrelated to the use in the text. In general, the current state-of-the-art of PTI and RK signaling is not correctly described or incorporated.

      Reply: We accepted the criticisms here. As suggested, we thoroughly rewrote the manuscript to address the concerns raised. Furthermore, we have thoroughly checked and revised the manuscript by removing 21 irrelevant references and adding 30 relevant references. We also incorporated the most up-to-date descriptions of the PTI and RK signaling pathways.

      Question 2: Receptor-like kinase (RLK) should generally be receptor kinase (RK) as receptor functions are now well established.

      Reply: Yes, we agreed with your expert comment here. Now, we changed “Receptor-like kinase (RLK)” into “receptor kinase (RK)” in the revised manuscript.

      Question 3: Line 20 - is this really true?

      Reply: Sorry for the mistake. In the revised manuscript, we changed “However, the mechanisms underlying the regulation of FLS2 phosphorylation activity at the plasma membrane in response to flg22 remain largely enigmatic.” to “However, the dynamic FLS2 phosphorylation regulation at the plasma membrane in response to flg22 needs further elucidation.”

      Question 4: S938D sorts better in response to flg22; S938A is unaffected - suggests phosphorylation of S938 is not dynamic in response to flg22 but is required for pre-elicitation sorting. Overall, there is a chicken-and-egg problem in this paper: which comes first, immune/signalling functionality or nanodomain sorting? And which is explaining the defects of S938A?

      Reply: We thank the reviewer for expert suggestions. In fact, the previous studies showed that membrane microdomains serve as signaling platforms that mediate cargo protein sorting and protein-protein interactions in a variety of contexts (Goldfinger et al. 2017). Since our previous research showed that the disruption of membrane microdomains affected flg22-induced immune signaling (Cui et al. 2018), we speculate that the immune signal occurred after entering the membrane microdomains.

      As shown in Figures 1 and 2, ligand exposure leads to an increase in diffusion coefficient and enhanced co-localization with REM1.3, both of which are dependent on the phosphorylation of the Ser-938 site. From these results, we inferred that the defects in S938A resulted largely from its failure to sort into membrane microdomains. The phosphorylation of the Ser-938 site can regulate the sorting of FLS2 into functional AtRem1.3-associated microdomains, thereby affecting flg22-induced plant immunity.

      Question 5: Line 37 conserved, not conservative (though not technically true - the domain organization is conserved but the ECDs are not conserved).

      Reply: Thank you for pointing this mistake out. In the revised manuscript, we used “conserved” to replace “conservative”.

      Question 6: Lines 40-42 - not all phosphorylation sites are within the kinase domain, for example, sites are well-described on the JM and/or C-tail regions outside of the kinase domain.

      Reply: We accepted the criticisms here. We have corrected the sentence to “with phosphorylation sites mainly located in PKC” in the revised manuscript.

      Question 7: Line 42 - what is BIK1? Intro to relevant topics is severely lacking.

      Reply: We apologize for the incomplete introduction here. We introduced BIK1 by adding the following sentence: “Upon recognizing flg22, FLS2 interacts with the co-receptor Brassinosteroid-Insensitive 1-associated Kinase 1 (BAK1), initiating phosphorylation events through the activation of receptor-like cytoplasmic kinases (RLCKs) such as BOTRYTIS-INDUCED KINASE 1 (BIK1) to elicit downstream immune responses (Chinchilla et al., 2006; Li et al., 2016b; Majhi et al., 2021).” in the revised manuscript.

      Question 8: Lines 42-44 - not sure this sequence of events is being properly described (e.g. BIK1 release is unlikely to precede activation by BAK1/SERKs).

      Reply: We apologize for not expressing this sentence clearly. Now, we reworded the sentence: “Upon recognizing flg22, FLS2 interacts with the co-receptor Brassinosteroid-Insensitive 1-associated Kinase 1 (BAK1), initiating phosphorylation events through the activation of receptor-like cytoplasmic kinases (RLCKs) such as BOTRYTIS-INDUCED KINASE 1 (BIK1) to elicit downstream immune responses (Chinchilla et al., 2006; Li et al., 2016b; Majhi et al., 2021).” in the revised manuscript.

      Question 9: Line 61 - S938 was identified by Cao et al (2013) based on in vitro MS, but was functionally validated using genetic assays, not based on MS.

      Reply: Thank you for your comments. Now, we changed the sentence: “In vitro mass spectrometry (MS) identified multiple phosphorylation sites in FLS2. Genetic analysis further identified Ser-938 as a functionally important site for FLS2 in vivo (Cao et al., 2013).” in the revised manuscript.

      Question 10: Line 68-69 - phospho-dead and phospho-mimic, not phosphorylated/non-phosphorylated.

      Reply: We thank the reviewer for expert suggestions. In the revised manuscript, we changed the sentence by replacing “phosphorylated/non-phosphorylated” with “phospho-mimic” and “phospho-dead”.

      Question 11: Lines 104-106 - this is wildly misleading. Flg22 is more than a ligand-like factor, as it is a bona fide ligand, and the heterodimerization with BAK1/SERKs is extremely well-established (and relevant foundational papers should be cited here in place of the authors' previous work).

      Reply: We apologize for the incorrect expression here. After reading the comments, we recognized the problem, which was also raised by Reviewer #1. We have now changed “ligand-like factor” to “ligand”. In addition, we cited the new reference “(Orosa et al., 2018)” in place of the references from our group in the revised manuscript.

      Question 12: Lines 107-112 - again, this is confusing. There is a decade of (uncited, undiscussed) work previously establishing that heterodimerization of RK-co-receptor complexes is mediated by extracellular ligand binding and independent of intracellular phosphorylation.

      Reply: We thank the reviewer for expert suggestions. Now, we added several sentences in the revised manuscript: “Therefore, we further investigated if Ser-938 phosphorylation affects FLS2/BAK1 heterodimerization. Tesseler segmentation, FRET-FLIM, and smPPI analyses revealed no impact of Ser-938 phosphorylation on FLS2/BAK1 heterodimerization (Figure 2A-C and S2). This aligns with the previous finding that flg22 acts as a molecular glue for FLS2 and BAK1 ectodomains (Sun et al., 2013), confirming the independence of FLS2/BAK1 heterodimerization from phosphorylation, with these events occurring sequentially.”

      Question 13: Line 119 - this is the wrong citation - Yu et al 2020 is a review and does not cover RALFs; correct citation is Gronnier et al 2022 eLife.

      Reply: In the revised manuscript, we updated the reference from “(Yu et al., 2020)” to “(Gronnier et al., 2022)”.

      Question 14: Lines 123-124 - this sentence is incomplete.

      Reply: Sorry for the incomplete sentence. Now we reworded the sentence to “In a previous investigation, we demonstrated that flg22 induces FLS2 translocation from AtFlot1-negative to AtFlot1-positive nanodomains in the plasma membrane, implying a connection between FLS2 phosphorylation and membrane nanodomain distribution (Cui et al., 2018). To validate this, we assessed the association of FLS2/FLS2S938D/FLS2S938A with membrane microdomains, using AtRem1.3-associated microdomains as representatives (Huang et al., 2019).” in the revised manuscript.

      Question 15: Line 126 - this requires a reference.

      Reply: Yes, we added a new reference: “(Huang et al., 2019)” in the revised manuscript.

      Question 16: Lines 125-128 - should clarify that the authors are not looking at direct interaction between FLS2 and REM1.3.

      Reply: Sorry for the inappropriate expressions here. In the revised manuscript, we reworded the sentence as follows: “To validate this, we assessed the association of FLS2/FLS2S938D/FLS2S938A with membrane microdomains, using AtRem1.3-associated microdomains as representatives (Huang et al., 2019)”.

      Question 17: Line 138 - these are odd references to use for such a broad statement.

      Reply: Now the inappropriate references cited here have been deleted.

      Question 18: Line 161 - incorrect reference, again.

      Reply: Sorry for this mistake. In the revised manuscript, we reworded the sentence and changed the reference.

      Question 19: Lines 160-165 - this is very confusing and misleading. I would suggest just having a short section introducing PTI earlier on (with appropriate references).

      Reply: As suggested, we reworded the text and added a section in the revised manuscript as follows: “PTI plays a pivotal role in host defense against pathogenic infections (Lorrai et al., 2021; Ma et al., 2022). Previous studies demonstrated that FLS2 perception of flg22 initiates a complex signaling network with multiple parallel branches, including calcium burst, mitogen-activated protein kinases (MAPKs) activation, callose deposition, and seedling growth inhibition (Baral et al., 2015; Marcec et al., 2021; Huang et al., 2023). Our focus was to investigate the significance of Ser-938 phosphorylation in flg22-induced plant immunity. Figure 4A-F illustrates diverse immune responses in FLS2 and FLS2S938D plants following flg22 treatment. These responses encompass calcium burst activation, MAPKs cascade reaction, callose deposition, hypocotyl growth inhibition, and activation of immune-responsive genes. In contrast, FLS2S938A (Figure S4A-D) exhibited limited immune responses, underscoring the importance of Ser-938 phosphorylation for FLS2-mediated PTI responses”.

      Question 20: Line 166 - these are not appropriate references, again.

      Reply: Thank you for the suggestion. In the revised manuscript, we removed the inappropriate references. Besides, we added new references by citing: “(Baral et al., 2015; Marcec et al., 2021)”.

      Question 21: Lines 169-173 - this is not relevant, the inhibition of growth by elicitors is extremely well-documented (though not by the refs cited here).

      Reply: We reworded the sentence and deleted the inappropriate reference in the revised manuscript.

      Question 22: Lines 174-175 - I don't see why this is unexpected, as nanodomain organization of PRRs has been previously described.

      Reply: Sorry for the inappropriate expressions here. As we have dramatically changed the manuscript, this sentence was deleted from the new version.

      References we added into the revised manuscript

      Baral A, Irani NG, Fujimoto M, Nakano A, Mayor S, Mathew MK. 2015. Salt-induced remodeling of spatially restricted clathrin-independent endocytic pathways in Arabidopsis root. Plant Cell 27:1297-315. DOI: 10.1105/tpc.15.00154, PMID: 25901088

      Barbieri L, Colin-York H, Korobchevskaya K, Li D, Wolfson DL, Karedla N, Schneider F, Ahluwalia BS, Seternes T, Dalmo RA, Dustin ML, Li D, Fritzsche M. 2021. Two-dimensional TIRF-SIM-traction force microscopy (2D TIRF-SIM-TFM). Nature Communications 12:2169. DOI: 10.1038/s41467-021-22377-9, PMID: 33846317

      Bertot L, Grassart A, Lagache T, Nardi G, Basquin C, Olivo-Marin J, Sauvonnet N. 2018. Quantitative and statistical study of the dynamics of clathrin-dependent and -independent endocytosis reveal a differential role of endophilinA2. Cell Reports 22:1574–1588. DOI: https://doi.org/10.1016/j.celrep.2018.01.039, PMID: 29425511

      Bücherl CA, Jarsch IK, Schudoma C, Segonzac C, Mbengue M, Robatzek S, MacLean D, Ott T, Zipfel C. 2017. Plant immune and growth receptors share common signalling components but localise to distinct plasma membrane nanodomains. eLife 6:e25114. DOI: https://doi.org/10.7554/eLife.25114, PMID: 28262094

      Chen Y, Munteanu AC, Huang YF, Phillips J, Zhu Z, Mavros M, Tan W. 2009. Mapping receptor density on live cells by using fluorescence correlation spectroscopy. Chemistry 15:5327-36. DOI: https://doi.org/10.1002/chem.200802305, PMID: 19360825

      Chinchilla D, Bauer Z, Regenass M, Boller T, Felix G. 2006. The Arabidopsis receptor kinase FLS2 binds flg22 and determines the specificity of flagellin perception. Plant Cell 18:465-476. DOI: 10.1105/tpc.105.036574, PMID: 16377758

      Gada KD, Kawano T, Plant LD, Logothetis DE. 2022. An optogenetic tool to recruit individual PKC isozymes to the cell surface and promote specific phosphorylation of membrane proteins. The Journal of Biological Chemistry 298:101893. DOI: https://doi.org/10.1016/j.jbc.2022.101893, PMID: 35367414

      Gronnier J, Franck CM, Stegmann M, DeFalco TA, Abarca A, von Arx M, Dünser K, Lin W, Yang Z, Kleine-Vehn J, Ringli C, Zipfel C. 2022. Regulation of immune receptor kinase plasma membrane nanoscale organization by a plant peptide hormone and its receptors. eLife 11:e74162. DOI: https://doi.org/10.7554/eLife.74162, PMID: 34989334

      Hohmann U, Lau K, Hothorn M. 2017. The structural basis of ligand perception and signal activation by receptor kinases. Annual Review of Plant Biology 68:109–137. DOI: https://doi.org/10.1146/annurev-arplant-042916-040957, PMID: 28125280

      Huang D, Sun Y, Ma Z, Ke M, Cui Y, Chen Z, Chen C, Ji C, Tran TM, Yang L, Lam SM, Han Y, Shu G, Friml J, Miao Y, Jiang L, Chen X. 2019. Salicylic acid-mediated plasmodesmal closure via Remorin-dependent lipid organization. Proceedings of the National Academy of Sciences 116:21274–21284. DOI: https://doi.org/10.1073/pnas.1911892116, PMID: 31575745

      Huang Y, Cui J, Li M, Yang R, Hu Y, Yu X, Chen Y, Wu Q, Yao H, Yu G, Guo J, Zhang H, Wu S, Cai Y. 2023. Conservation and divergence of flg22, pep1 and nlp20 in activation of immune response and inhibition of root development. Plant Science 331:111686. DOI: https://doi.org/10.1016/j.plantsci.2023.111686, PMID: 36963637

      Jiao C, Gong J, Guo Z, Li S, Zuo Y, Shen Y. 2022. Linalool activates oxidative and calcium burst and CAM3-ACA8 participates in calcium recovery in Arabidopsis leaves. International Journal of Molecular Sciences 23:5357. DOI: https://doi.org/10.3390/ijms23105357, PMID: 35628166

      Kim TJ, Lei L, Seong J, Suh JS, Jang YK, Jung SH, Sun J, Kim DH, Wang Y. 2018. Matrix rigidity-dependent regulation of Ca2+ at plasma membrane microdomains by FAK visualized by fluorescence resonance energy transfer. Advanced Science 6:1801290. DOI: https://doi.org/10.1002/advs.201801290, PMID: 30828523

      Kontaxi C, Kim N, Cousin MA. 2023. The phospho-regulated amphiphysin/endophilin interaction is required for synaptic vesicle endocytosis. Journal of Neurochemistry 166:248–264. DOI: https://doi.org/10.1111/jnc.15848, PMID: 37243578

      Lee Y, Phelps C, Huang T, Mostofian B, Wu L, Zhang Y, Tao K, Chang YH, Stork PJ, Gray JW, Zuckerman DM, Nan X. 2019. High-throughput, single-particle tracking reveals nested membrane domains that dictate KRasG12D diffusion and trafficking. eLife 8:e46393. DOI: https://doi.org/10.7554/eLife.46393, PMID: 31674905

      Li B, Meng X, Shan L, He P. 2016a. Transcriptional regulation of pattern-triggered immunity in plants. Cell Host Microbe 19:641-50. DOI: 10.1016/j.chom.2016.04.011, PMID: 27173932

      Li L, Kim P, Yu L, Cai G, Chen S, Alfano JR, Zhou JM. 2016b. Activation-dependent destruction of a co-receptor by a Pseudomonas syringae effector dampens plant immunity. Cell Host Microbe 20:504-514. DOI: https://doi.org/10.1016/j.chom.2016.09.007, PMID: 27736646

      Lorrai R, Ferrari S. 2021. Host cell wall damage during pathogen infection: mechanisms of perception and role in plant-pathogen interactions. Plants (Basel) 10:399. DOI: https://doi.org/10.3390/plants10020399, PMID: 33669710

      Marcec MJ, Tanaka K. 2021. Crosstalk between Calcium and ROS signaling during flg22-triggered immune response in Arabidopsis leaves. Plants 11:14. DOI: 10.3390/plants11010014, PMID: 35009017

      Ma M, Wang W, Fei Y, Cheng HY, Song B, Zhou Z, Zhao Y, Zhang X, Li L, Chen S, Wang J, Liang X, Zhou JM. 2022. A surface-receptor-coupled G protein regulates plant immunity through nuclear protein kinases. Cell Host Microbe 30:1602-1614. DOI: 10.1016/j.chom.2022.09.012, PMID: 36240763

      Martinière A, Zelazny E. 2021. Membrane nanodomains and transport functions in plant. Plant Physiology 187:1839–1855. DOI: https://doi.org/10.1093/plphys/kiab312, PMID: 35235669

      Majhi BB, Sobol G, Gachie S, Sreeramulu S, Sessa G. 2021. BRASSINOSTEROID-SIGNALLING KINASES 7 and 8 associate with the FLS2 immune receptor and are required for flg22-induced PTI responses. Molecular Plant Pathology 22:786-799. DOI: https://doi.org/10.1111/mpp.13062, PMID: 33955635

      Mitra SK, Chen R, Dhandaydham M, Wang X, Blackburn RK, Kota U, Goshe MB, Schwartz D, Huber SC, Clouse SD. 2015. An autophosphorylation site database for leucine-rich repeat receptor-like kinases in Arabidopsis thaliana. The Plant Journal 82:1042–1060. DOI: https://doi.org/10.1111/tpj.12863, PMID: 25912465

      Orosa B, Yates G, Verma V, Srivastava AK, Srivastava M, Campanaro A, De Vega D, Fernandes A, Zhang C, Lee J, Bennett MJ, Sadanandom A. 2018. SUMO conjugation to the pattern recognition receptor FLS2 triggers intracellular signalling in plant innate immunity. Nature Communications 9:5185. DOI: https://doi.org/10.1038/s41467-018-07696-8, PMID: 30518761

      Sun Y, Li L, Macho AP, Han Z, Hu Z, Zipfel C, Zhou JM, Chai J. 2013. Structural basis for flg22-induced activation of the Arabidopsis FLS2-BAK1 immune complex. Science 342:624-628. DOI: https://doi.org/10.1126/science.1243825, PMID: 24114786

      Vitrac H, Mallampalli VKPS, Dowhan W. 2019. Importance of phosphorylation/dephosphorylation cycles on lipid-dependent modulation of membrane protein topology by posttranslational phosphorylation. The Journal of Biological Chemistry 294:18853–18862. DOI: https://doi.org/10.1074/jbc.RA119.010785, PMID: 31645436

      Xue Y, Xing J, Wan Y, Lv X, Fan L, Zhang Y, Song K, Wang L, Wang X, Deng X, Baluška F, Christie JM, Lin J. 2018. Arabidopsis blue light receptor phototropin 1 undergoes blue light-induced activation in membrane microdomains. Molecular Plant 11:846-859. DOI: 10.1016/j.molp.2018.04.003, PMID: 29689384

      Xing J, Ji D, Duan Z, Chen T, Luo X. 2022. Spatiotemporal dynamics of FERONIA reveal alternative endocytic pathways in response to flg22 elicitor stimuli. New Phytologist 235: 518-532. DOI: 10.1111/nph.18127, PMID: 35358335

      Zhai K, Liang D, Li H, Jiao F, Yan B, Liu J, Lei Z, Huang L, Gong X, Wang X, Miao J, Wang Y, Liu JY, Zhang L, Wang E, Deng Y, Wen CK, Guo H, Han B, He Z. 2021. NLRs guard metabolism to coordinate pattern- and effector-triggered immunity. Nature 601:245-251. DOI: https://doi.org/10.1038/s41586-021-04219-2, PMID: 34912119

      Zhong YH, Guo ZJ, Wei MY, Wang JC, Song SW, Chi BJ, Zhang YC, Liu JW, Li J, Zhu XY, Tang HC, Song LY, Xu CQ, Zheng HL. 2023. Hydrogen sulfide upregulates the alternative respiratory pathway in mangrove plant Avicennia marina to attenuate waterlogging-induced oxidative stress and mitochondrial damage in a calcium-dependent manner. Plant Cell and Environment 46:1521-1539. DOI: https://doi.org/10.1111/pce.14546, PMID: 36658747

      Inappropriate references we deleted from the revised manuscript

      Schulze S, Yu L, Hua C, Zhang L, Kolb D, Weber H, Ehinger A, Saile SC, Stahl M, Franz-Wachtel M, Li L, El Kasmi F, Nürnberger T, Cevik V, Kemmerling B. 2022. The Arabidopsis TIR-NBS-LRR protein CSA1 guards BAK1-BIR3 homeostasis and mediates convergence of pattern- and effector-induced immune responses. Cell Host Microbe 30:1717-1731.e6. DOI: 10.1016/j.chom.2022.11.001, PMID: 36446350

      Wang Q, Zhao Y, Luo W, Li R, He Q, Fang X, Michele RD, Ast C, von Wirén N, Lin J. 2013. Single-particle analysis reveals shutoff control of the Arabidopsis ammonium transporter AMT1;3 by clustering and internalization. Proceedings of the National Academy of Sciences of the United States of America 110:13204-9. DOI: 10.1073/pnas.1301160110, PMID: 23882074

      Eichel K, Jullié D, von Zastrow M. β-Arrestin drives MAP kinase signalling from clathrin-coated structures after GPCR dissociation. Nature Cell Biology 18:303-10. DOI: 10.1038/ncb3307, PMID: 26829388

      Van Itallie CM, Anderson JM. Phosphorylation of tight junction transmembrane proteins: Many sites, much to do. Tissue Barriers 6:e1382671. DOI: 10.1080/21688370.2017.1382671, PMID: 29083946

      Monje-Galvan V, Warburton L, Klauda JB. Setting up all-atom molecular dynamics simulations to study the interactions of peripheral membrane proteins with model lipid bilayers. Methods in Molecular Biology 1949:325-339. DOI: 10.1007/978-1-4939-9136-5_22, PMID: 30790265.

      Trotta A, Bajwa AA, Mancini I, Paakkarinen V, Pribil M, Aro EM. The role of phosphorylation dynamics of CURVATURE THYLAKOID 1B in plant thylakoid membranes. Plant Physiology 181:1615-1631. DOI: 10.1104/pp.19.00942, PMID: 31615849

      Dorrity MW, Saunders LM, Queitsch C, Fields S, Trapnell C. Dimensionality reduction by UMAP to visualize physical and genetic interactions. Nature Communications 11:1537. DOI: 10.1038/s41467-020-15351-4, PMID: 32210240

      Sato KI, Tokmakov AA. Membrane microdomains as platform to study membrane-associated events during Oogenesis, Meiotic Maturation, and Fertilization in Xenopus laevis. Methods in Molecular Biology 920:59-73. DOI: 10.1007/978-1-4939-9009-2_5, PMID: 30737686.

      Ozolina NV, Kapustina IS, Gurina VV, Bobkova VA, Nurminsky VN. Role of plasmalemma microdomains (Rafts) in protection of the plant cell under Osmotic stress. Journal of Membrane Biology 254:429-439. DOI: 10.1007/s00232-021-00194-x, PMID: 34302495

      Boutté Y, Moreau P. Plasma membrane partitioning: from macro-domains to new views on plasmodesmata. Frontiers in Plant Science 5:128. DOI: 10.3389/fpls.2014.00128, PMID: 24772114

      Yu M, Cui Y, Zhang X, Li R, Lin J. Organization and dynamics of functional plant membrane microdomains. Cellular and Molecular Life Sciences 77:275-287. DOI: 10.1007/s00018-019-03270-7, PMID: 31422442

      Zhao Z, Li M, Zhang H, Yu Y, Ma L, Wang W, Fan Y, Huang N, Wang X, Liu K, Dong S, Tang H, Wang J, Zhang H, Bao Y. Comparative proteomic analysis of plasma membrane proteins in rice leaves reveals a vesicle trafficking network in plant immunity that is provoked by Blast Fungi. Frontiers in Plant Science 13:853195. DOI: 10.3389/fpls.2022.853195, PMID: 35548300

      Hilgemann DW, Dai G, Collins A, Lariccia V, Magi S, Deisl C, Fine M. Lipid signaling to membrane proteins: From second messengers to membrane domains and adapter-free endocytosis. Journal of General Physiology 150:211-224. DOI: 10.1085/jgp.201711875, PMID: 29326133

      Joshi R, Paul M, Kumar A, Pandey D. Role of calreticulin in biotic and abiotic stress signalling and tolerance mechanisms in plants. Gene 714:144004. DOI: 10.1016/j.gene.2019.144004, PMID: 31351124

      Chen Y, Cao C, Guo Z, Zhang Q, Li S, Zhang X, Gong J, Shen Y. Herbivore exposure alters ion fluxes and improves salt tolerance in a desert shrub. Plant Cell and Environment 43:400-419. DOI: 10.1111/pce.13662, PMID: 31674033

      Chi Y, Wang C, Wang M, Wan D, Huang F, Jiang Z, Crawford BM, Vo-Dinh T, Yuan F, Wu F, Pei ZM. Flg22-induced Ca2+ increases undergo desensitization and resensitization. Plant Cell and Environment 44:3563-3575. DOI: 10.1111/pce.14186, PMID: 34536020

      Zhang M, Su J, Zhang Y, Xu J, Zhang S. Conveying endogenous and exogenous signals: MAPK cascades in plant growth and defense. Current Opinion in Plant Biology 45:1-10. DOI: 10.1016/j.pbi.2018.04.012, PMID: 29753266

      Arnaud D, Deeks MJ, Smirnoff N. RBOHF activates stomatal immunity by modulating both reactive oxygen species and apoplastic pH dynamics in Arabidopsis. Plant Journal 116:404-415. DOI: 10.1111/tpj.16380, PMID: 37421599

      Zou Y, Wang S, Zhou Y, Bai J, Huang G, Liu X, Zhang Y, Tang D, Lu D. Transcriptional regulation of the immune receptor FLS2 controls the ontogeny of plant innate immunity. Plant Cell 30:2779-2794. DOI: 10.1105/tpc.18.00297, PMID: 30337428

      Ngou BPM, Jones JDG, Ding P. Plant immune networks. Trends in Plant Science 27:255-273. DOI: 10.1016/j.tplants.2021.08.012, PMID: 34548213.

      Yu M, Liu H, Dong Z, Xiao J, Su B, Fan L, Komis G, Šamaj J, Lin J, Li R. 2017. The dynamics and endocytosis of Flot1 protein in response to flg22 in Arabidopsis. Journal of Plant Physiology 215:73–84. DOI: https://doi.org/10.1016/j.jplph.2017.05.010, PMID: 28582732

    2. Reviewer #1 (Public Review):


      The organization of cell surface receptors in membrane nanodomains is important for signaling, but how this organization is regulated is poorly understood. In this study, the authors employ TIRFM single-molecule tracking combined with multiple analyses to show that ligand exposure increases the diffusion of the immune receptor FLS2 in the plasma membrane and its co-localization with the remorin REM1.3 in a manner dependent on the phosphosite S938. They additionally show that the ligand increases the dwell time of FLS2, and that this is linked to FLS2 endocytosis, also in a manner dependent on S938 phosphorylation. The study uncovers a regulatory mechanism of FLS2 localization in the nanodomain crucial for signaling.


      The TIRFM single-molecule tracking, FRAP, FRET and endocytosis experiments were nicely done. The evidence for a role of S938 phosphorylation is convincing.


      In the previous submission, reviewers pointed out multiple issues that they believed the authors could address in the revision. The revised version is improved to some extent but still contains many issues in terms of data analysis and writing.

    1. Author Response

      The following is the authors’ response to the current reviews.

      Response to Reviewer 1:

      • We agree with the reviewer’s overall assessment of this manuscript.

      • Because multiple secreted proteins are changed between the control and experimental groups, some of them could be causal and others correlative in the context of enhancing compensatory glucose production in response to elevated glycosuria. In future studies, we will determine the causal factors that trigger the increase in glucose production.

      • Yes, we will correct the typographical errors in a revised version of this manuscript.

      Response to Reviewer 2:

      • We agree with the reviewer’s comment about potential sex differences that we may have missed in this study. Therefore, we will include this limitation in the discussion section of a revised manuscript.

      • The reviewer’s statement ‘The methods of that publication indicate that all experiments were completed within 14 days of inducing the Glut2 knockout’ is incorrect. In the referred publication, we explicitly mentioned in the methods that ‘All of the experiments, except those using a diet-induced obesity mouse model or noted otherwise, were completed within 14 days of inducing the Glut2 deficiency.’ Please see figures 5h-l and 6 in that previous publication, which demonstrate that not all the experiments were completed within 14 days of inducing renal Glut2 deficiency. Per the reviewer’s advice, in the present manuscript we will include the timeline of the experiments (which in some cases is 4 months beyond inducing glycosuria) in all the figure legends. In addition, for a separate project (which is unpublished) we have measured glycosuria up to 1 year after inducing renal Glut2 deficiency. Therefore, the glycosuria observed in the renal Glut2 KO mice is not temporary.

      • In our previous response to the reviewer, we had already mentioned which control group was used in this study. Please see our response to the second reviewer’s point 3. As mentioned to the reviewer, we had used Glut2-loxp/loxp mice as the control group, which is also described multiple times in the figure legends of our previous paper that reported the phenotype of renal Glut2 KO mice and is cited in this manuscript so we don’t have to repeat the same information. Per the reviewer’s advice, we will also include the information in a revised version of this manuscript.

      • We ask the reviewer to look at figure 1, showing an increase in glucose production in renal Glut2 KO mice, and figure 3, which demonstrates that afferent renal denervation reduces blood glucose levels by 50%. The afferent renal denervation (ablation of afferent renal nerves) does reduce blood glucose levels in renal Glut2 KO mice. Therefore, the use of the word ‘promote’ in the title is accurate and appropriate to reflect the role of the afferent renal nerves in contributing to an approximately 50% increase in blood glucose levels in renal Glut2 KO mice. Regarding the reviewer's comment on changes in Crh gene expression, please look at figure 3. Ablation of renal afferent nerves decreases hypothalamic Crh gene expression and other mediators of the HPA axis by 50%. Therefore, the afferent renal nerves do contribute to regulating blood glucose levels, at least in part, via the HPA axis (which is widely known to change blood glucose levels). The use of words such as ‘required’ or ‘necessary’ in the title may have indicated a causal role or could have been misleading here; therefore, we have purposely used ‘promote’ in the title to accurately reflect the findings of this study.

      • Because we observed an increase in hepatic glucose production in renal Glut2 KO mice (Fig. 1), which was reduced by 50% after selective afferent renal denervation (Fig. 3), in the graphical abstract we are suggesting a neural connection between the kidney, brain, and liver, or an endocrine factor(s), to account for these changes in blood glucose levels, as also described in the discussion section. We can include a question mark ‘?’ in the graphical abstract to show that further studies are needed to validate these proposed mechanisms; however, we cannot just remove the arrow as advised by the reviewer.

      • Per the reviewer’s advice, in the methods we will include the dilutions used for each assay.

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      It would be helpful to the reader to specify in Figure 1a-c whether data were directly measured or calculated.

      We have now clarified this in the methods section of the revised manuscript. Glucose production was directly measured, and the fractional contribution of the tissues was then calculated from those data. We have also included a reference to a research paper to further clarify the method.

      The methods section would be strengthened by clarifying the order in which experiments were performed, the age of the mice at each time point, and whether different cohorts were used for different techniques.

      We have included additional details in the method section with proper citations. For in-depth protocols we have cited our previous publications.

      It would be helpful to explain or provide a reference for how the post-mortem background activity measurement was performed.

      We have included this explanation in the revised manuscript.

      Similarly, details regarding the collection of blood for ACTH and corticosterone measurement are needed for the reader to evaluate whether the results are confounded by stress at the time of collection.

      We have added these details in the method section.

      I recommend stating, if accurate, that you used mixed-sex groups because your previous study found no sex differences in the phenotype of renal Glut2 KO mice.

      Yes, we have included these details in the revised manuscript.

      Sentence 239 is difficult to follow. Also, line 287 contains a contraction.

      We have revised the sentence per the reviewer’s advice.

      A graphical abstract would be helpful, bearing in mind conclusive vs suggestive findings.

      Yes, we have included the graphical abstract with the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Minor Comments to the Authors

      (1) The Methods also need to specify more of the critical details of the ELISAs, including the dilution factors used, and whether the values reported are dilution-corrected. Also, there is no description of how insulin was measured.

      We have included these details in the methods section. The assay dilutions were performed per the manufacturers’ instructions.

      (2) The Methods do not sufficiently describe how Crh mRNA was quantified in the hypothalamus. Presumably, they examined only the paraventricular nucleus? How many sections were used for in situ hybridization? How were the brains processed? What thickness of section was used? When were the brains collected?

      We have included these details in the method section and cited our previous publications for in-depth protocols. Some of the information is also available in the figure legends.

      (3) The number of mice that were used for plasma proteomics is not indicated.

      The number of mice is indicated using individual symbols or points presented on the bar graphs.