  1. Dec 2024
    1. Reviewer #1 (Public review):

      Summary:

      This work uses a novel, ethologically relevant behavioral task to explore decision-making paradigms in C. elegans foraging behavior. By rigorously quantifying multiple features of animal behavior as they navigate in a patch food environment, the authors provide strong evidence that worms exhibit one of three qualitatively distinct behavioral responses upon encountering a patch:

      (1) "search", in which the encountered patch is below the detection threshold;

      (2) "sample", in which animals detect a patch encounter and reduce their motor speed, but do not stay to exploit the resource and are therefore considered to have "rejected" it; and

      (3) "exploit", in which animals "accept" the patch and exploit the resource for tens of minutes.

      Interestingly, the probability of these outcomes varies with the density of the patch as well as the prior experience of the animal. Together, these experiments provide an interesting new framework for understanding the ability of the C. elegans nervous system to use sensory information and internal state to implement behavioral state decisions.

      Strengths:

      (1) The work uses a novel, neuroethologically-inspired approach to studying foraging behavior.

      (2) The studies are carried out with an exceptional level of quantitative rigor and attention to detail.

      (3) Powerful quantitative modeling approaches including GLMs are used to study the behavioral states that worms enter upon encountering food, and the parameters that govern the decision about which state to enter.

      (4) The work provides strong evidence that C. elegans can make 'accept-reject' decisions upon encountering a food resource.

      (5) Accept-reject decisions depend on the quality of the food resource encountered as well as on internally represented features that provide measurements of multiple dimensions of internal state, including feeding status and time.

      Weaknesses:

      (1) The authors repeatedly assert that an individual's behavior in the foraging assay depends on its prior history (particularly cultivation conditions). While this seems like a reasonable expectation, it is not fully fleshed out. The work would benefit from studies in which animals are raised on more or less abundant food before the behavioral task.

      (2) The authors convincingly show that the probability of particular behavioral outcomes occurring upon patch encounter depends on time-associated parameters (time since last patch encounter, time since last patch exploitation). There are two concerns here. First, it is not clear how these values are initialized - i.e., what values are used for the first occurrence of each behavioral state? More importantly, the authors don't seem to consider the simplest time parameter, the time since the start of the assay (or time since worm transfer). Transferring animals to a new environment can be associated with significant mechanical stimulus, and it seems quite possible that transferring animals causes them to enter a state of arousal. This arousal, which certainly could alter sensory function or decision-making, would likely decay with time. It would be interesting to know how well the model performs using time since assay starts as the only time-dependent parameter.

      (3) Similarly, Figures 2L and M clearly show that the probability of a search event occurring upon a patch encounter decreases markedly with time. Because search events are interpreted as a failure to detect a patch, this implies that the detection of (dilute) patches becomes more efficient with time. It would be useful for the authors to consider this possibility as well as potential explanations, which might be related to the point above.

      (4) Based on their results with mec-4 and osm-6 mutants, the authors assert that chemosensation, rather than mechanosensation, likely accounts for animals' ability to measure patch density. This argument is not well-supported: mec-4 is required only for the function of the six non-ciliated light-touch neurons (AVM, PVM, ALML/R, PLML/R). In contrast, osm-6 is expected to disrupt the function of the ciliated dopaminergic mechanosensory neurons CEP, ADE, and PDE, which have previously been shown to detect the presence of bacteria (Sawin et al 2000). Thus, the paper's results are entirely consistent with an important role of mechanosensation in detecting bacterial abundance. Along these lines, it would be useful for the authors to speculate on why osm-6 mutants are more, rather than less, likely to "accept" when encountering a patch.

      (5) While the evidence for the accept-reject framework is strong, it would be useful for the authors to provide a bit more discussion about the null hypothesis and associated expectations. In other words, what would worm behavior in this assay look like if animals were not able to make accept-reject decisions, relying only on exploit-explore decisions that depend on modulation of food-leaving probability?

    2. Reviewer #2 (Public review):

      This study provides an experimental and computational framework for behavioral biology that helps examine and understand how C. elegans make decisions while foraging in environments with patches of food. The authors show that worms actively reject or accept food patches depending on a number of internal and external factors.

      The key novelty and strength of this paper is the explicit demonstration of behavior analysis and quantitative modeling to elucidate the decision-making process. Of particular note are the description of the exploring vs. exploiting phases and the sensing vs. non-sensing categories of C. elegans foraging behavior, based on the clustering of behavioral states defined in a multi-dimensional behavior-metrics space, and the implementation of a generalized linear model (GLM) whose parameters can provide quantitative biological interpretations.

      While the concept is interesting, there are many flaws in the experiments, analysis, and models that weaken what one can conclude from the work.

    3. Reviewer #3 (Public review):

      Summary:

      In this study by Haley et al, the authors investigated explore-exploit foraging using C. elegans as a model system. Through an elegant set of patchy environment assays, the authors built a GLM based on past experience that predicts whether an animal will decide to stay on a patch to feed and exploit that resource, instead of choosing to leave and explore other patches.

      Strengths:

      I really enjoyed reading this paper. The experiments are simple and elegant, and address fundamental questions of foraging theory in a well-defined system. The experimental design is thoroughly vetted, and the authors provide a considerable volume of data to prove their points. My only criticisms have to do with the data interpretation, which I think is easily addressable.

      Weaknesses:

      (1) Sensing vs. non-sensing

      The authors claim that when animals encounter dilute food patches, they do not sense them, as evidenced by the shallow deceleration that occurs when animals encounter these patches. This seems ethologically inaccurate. There is a critical difference between not sensing a stimulus, and not reacting to it. Animals sense numerous stimuli from their environment, but often only behaviorally respond to a fraction of them, depending on their attention and arousal state. With regard to C. elegans, it is well-established that their amphid chemosensory neurons are capable of detecting very dilute concentrations of odors. In addition, the authors provide evidence that osm-6 animals have altered exploit behaviors, further supporting the importance of amphid chemosensory neurons in this behavior.

      (2) Search vs. sample & sensing vs. non-sensing

      In Figures 2H and 2I, the authors claim that there are three behavioral states based on quantifying average velocity, encounter duration, and acceleration, but I only see two. Based on density distributions alone, there really only seem to be 2 distributions, not 3. The authors claim there are three, but to come to this conclusion, they used a QDA, which inherently is based on the authors training the model to detect three states based on prior annotations. Did the authors perform a model test, such as the Bayesian Information Criterion, to confirm whether a fit with 3 Gaussians is statistically preferred over one with 2? It seems like the authors are trying to impose three states on a phenomenon with a broad distribution. This seems very similar to the results observed for roaming vs. dwelling experiments, which again, are essentially two behavioral states.
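
      The kind of model comparison asked for here can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (two well-separated Gaussians), the mixture is one-dimensional rather than the multi-feature space used in the paper, and the function names `gmm_log_likelihood` and `bic` are hypothetical. BIC is a comparison criterion, not a significance test; the component count with the lowest BIC is preferred.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_log_likelihood(data, k, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture by EM; return its log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: means at evenly spaced quantiles, pooled variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    pooled_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [pooled_var] * k
    weights = [1.0 / k] * k

    def mixture_terms(x):
        return [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            terms = mixture_terms(x)
            total = max(sum(terms), 1e-300)
            resp.append([t / total for t in terms])
        # M-step: update weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids collapsed components

    return sum(math.log(max(sum(mixture_terms(x)), 1e-300)) for x in xs)


def bic(data, k):
    """BIC = p * ln(n) - 2 * ln(L), with p = 3k - 1 free parameters."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 independent weights
    return n_params * math.log(len(data)) - 2.0 * gmm_log_likelihood(data, k)


# Synthetic, purely illustrative data: two clusters, NOT the paper's measurements.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(6.0, 1.0) for _ in range(200)]

for k in (1, 2, 3):
    print(k, round(bic(data, k), 1))
```

      On real encounter data, the same comparison would be run on the measured features, ideally with several random restarts per component count, since EM can settle in local optima.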

      (4) History-dependence of the GLM

      The logistic GLM seems like a logical way to model a binary choice, and I think the parameters you chose are certainly important. However, the framing of them seems odd to me. I do not doubt the animals are assessing the current state of the patch with an assessment of past experience; that makes perfect logical sense. However, it seems odd to reduce past experience to the categories of recently exploited patch, recently encountered patch, and time since last exploitation. This implies the animals have some way of discriminating these past patch experiences and committing them to memory. Also, it seems logical that the time on these patches, not just their density, should also matter, just as the time without food matters. Time is inherent to memory. This model also imposes a prior categorization in trying to distinguish between sensed vs. not-sensed patches, which I criticized earlier. Only "sensed" patches are used in the model, but it is questionable whether worms genuinely do not "sense" these patches.

      (5) osm-6

      The osm-6 results are interesting. This seems to indicate that the worms are still sensing the food, but are unable to assess quality, therefore the default response is to exploit. How do you think the worms are sensing the food? Clearly, they sense it, but without the amphid sensory neurons and not through mechanosensation. Perhaps feeding is important? Could you speculate on this?

      (7) Impact:

      I think this work will have a solid impact on the field, as it provides tangible variables to test how animals assess their environment and decide to exploit resources. I think this research could be strengthened by a reassessment of the model that would both simplify it and provide testable timescales of satiety/starvation memory.

    4. Author response:

      We thank the reviewers for their thoughtful comments. We are working to revise our manuscript and address each of the reviewers’ comments. A summary of our planned revisions and responses to some of the reviewers’ major concerns is included below.

      Cultivation Density: Reviewers #1 and #2 suggested that additional studies testing the effects of varying bacterial density during animal development (cultivation) would strengthen our findings. While we agree with the reviewers that this is a very interesting experiment, it is not feasible. Indeed, we attempted this experiment but found it nontrivial to maintain stable bacterial density conditions over long timescales as this requires matching the rate of bacterial growth with the rate of bacterial consumption. Despite our best efforts, we have not been able to identify conditions that satisfy these requirements. We will focus our revised manuscript to include only assertions about the effects of recent experiences.

      Transfer Method: Reviewers #1 and #2 expressed concern that the stress of transferring animals to a new plate may have resulted in an increased arousal state and thus a greater probability of rejecting patches. We thank the reviewers for this thoughtful remark and plan to conduct additional analyses to address this hypothesis. We did, however, anticipate this possibility and, to mitigate the stress of moving, we used an agar plug method where animals were transferred using the flat surface of small cylinders of agar. Importantly, the use of agar as a medium to transfer animals provides minimal disruption to their environment as all physical properties (e.g. temperature, humidity, surface tension) are maintained. Qualitatively, we observe no marked change in behavior from before to after transfer with the agar plug method, especially as compared to the often drastic changes observed when using a metal or eyelash pick.

      Time Parameter: Related to the transfer method, Reviewer #1 expressed concern that the simplest time parameter (time since start of the assay) might better predict animal behavior. We thank the reviewer for pointing out the need to specifically test whether the time-dependent change in explore-exploit decision-making corresponds better with satiety (time off patch) or arousal (time since transfer/start of assay) state. We will conduct additional analyses to address these alternative hypotheses.

      Parameter Initialization: Reviewer #1 pointed out an oversight in our methods section regarding the model parameter values used for the first encounter. We plan to clarify the initialization of parameters in the manuscript. In short, for the first patch encounter where k = 1:

      ρk is the relative density of the first patch.

      τs is the duration of time spent off food since the beginning of the recorded experiment. For the first patch, this is equivalent to the total time elapsed.

      ρh is the approximated relative density of the bacterial patch on the acclimation plates (see Assay preparation and recording in Methods). Acclimation plates contained one large 200 µL patch seeded with OD600 = 1 and grown for a total of ~48 hours. As with all patches, the relative density was estimated from experiments using fluorescent bacteria OP50-GFP as described in Bacterial patch density estimation in Methods.

      ρe is equivalent to ρh.
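
      The initialization rules above can be sketched in code. This is a minimal illustration, not the authors' implementation: the logistic form follows the reviewers' description of a logistic GLM, the names `Covariates`, `first_encounter_covariates`, and `p_accept` are hypothetical, and the coefficient values in `BETA` are arbitrary placeholders rather than fitted estimates.

```python
import math
from dataclasses import dataclass


@dataclass
class Covariates:
    rho_k: float  # relative density of the current (k-th) patch
    tau_s: float  # time spent off food
    rho_h: float  # density term initialized from the acclimation plate
    rho_e: float  # density term set equal to rho_h at the first encounter


def first_encounter_covariates(first_patch_density, elapsed_time, acclimation_density):
    """Covariates for k = 1, following the initialization described above."""
    return Covariates(
        rho_k=first_patch_density,
        tau_s=elapsed_time,          # equals total time elapsed for the first patch
        rho_h=acclimation_density,   # acclimation-plate patch density
        rho_e=acclimation_density,   # equivalent to rho_h at k = 1
    )


# Placeholder coefficients (assumption): chosen only so the example runs.
BETA = {"intercept": -1.0, "rho_k": 1.0, "tau_s": 0.01, "rho_h": -0.5, "rho_e": -0.2}


def p_accept(cov, beta):
    """Logistic link: probability of accepting (exploiting) the patch."""
    z = (
        beta["intercept"]
        + beta["rho_k"] * cov.rho_k
        + beta["tau_s"] * cov.tau_s
        + beta["rho_h"] * cov.rho_h
        + beta["rho_e"] * cov.rho_e
    )
    return 1.0 / (1.0 + math.exp(-z))


cov = first_encounter_covariates(first_patch_density=0.5, elapsed_time=120.0, acclimation_density=2.0)
print(cov, p_accept(cov, BETA))
```

      Keeping the initialization in one function makes it easy to swap in an alternative time covariate, such as time since transfer, when testing the arousal hypothesis raised by Reviewer #1.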

      Sensing vs. non-sensing: Reviewer #3 suggested that the term “non-sensing” may not be ethologically accurate. We thank the reviewer for their comment and agree that we do not know for certain whether the animals sensed these patches or were merely non-responsive to them. We are, however, confident that these encounters lack evidence of sensing. Specifically, we note that our analyses used to classify events as sensing or non-sensing examined whether an animal’s slow-down upon patch entry could be distinguished from either that of events where animals exploited or that of encounters with patches lacking bacteria. We found that  “non-sensing” encounters are indeed indistinguishable from encounters with bacteria-free patches where there are no bacteria to be sensed (see Figure 2 - Supplement 7C-D and Patch encounter classification as sensing or non-sensing in Methods). Regardless, we agree with the reviewer that all that can be asserted for certain about these events is that animals do not respond to the bacterial patch in any way that we measured. Therefore, we will replace the term “non-sensing” with “non-responding” to better indicate the ethological interpretation of these events.

      Time-dependent changes in sensing vs. non-sensing: Reviewer #1 remarked that the sensation of dilute patches increases with time. We agree with the reviewer that we observe increased responsiveness to dilute patches with time. Although this is interesting, our primary focus was on what decision an animal made given that they clearly sensed the presence of the bacterial patch. Nonetheless, we will add this observation to the discussion as an area of future work to investigate the sensory mechanisms behind this effect.

      Classification of sensing vs. non-sensing: Reviewers #2 and #3 expressed concerns about the validity of the two clusters identified using the semi-supervised QDA approach described. We are grateful to the reviewers for pointing out the difficulty in visualizing the clusters and the need for additional clarity in explaining the supervised labeling. We will use additional visualizations and methods to validate the clusters we have discovered. Specifically, we aim to provide additional evidence that the sensing vs. non-sensing data is bimodal (i.e. a two-cluster classification method fits best). Further, it seems that there may be some confusion as to how we arrived at 3 encounter types (i.e. search, sample, exploit) that we plan to clarify in the manuscript. Specifically, it is important to note that two methods were used on two different (albeit related) sets of parameters. We first used a two-cluster GMM to classify encounters as explore or exploit. We then used a two-cluster semi-supervised QDA to classify encounters as sensing or non-sensing (to be changed to “non-responding”, see above response) using a different set of parameters. We thus separated the explore cluster into two (sensing and non-sensing exploratory events) resulting in three total encounter types: exploit, sample (explore/sensing), and search (explore/non-sensing). We will clarify this in the text. Additionally, we will clarify the labeling used for “supervising” QDA. Specifically, we made two simple assumptions: 1) animals must have sensed the patch if they exploited it and 2) animals must not have sensed the patch if there were no bacteria to sense. Thus, we labeled encounters as sensing if they were found to be exploitatory, as we assume that sensation is a prerequisite to exploitation, and we labeled encounters as non-sensing for events where animals encountered patches lacking bacteria (OD600 = 0). All other points were unlabeled prior to learning the model. In this way, our labels were based on the experimental design and results of the GMM, an unsupervised method, rather than any expectations we had about what sensing should look like. The semi-supervised QDA method then used these initial labels to iteratively fit a paraboloid that best separated these clusters, by minimizing the posterior variance of classification.
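
      The way the two binary classifications combine into three encounter types, and the rule used to seed the semi-supervised labels, can be sketched as below. The function names are hypothetical, and giving the exploit label precedence when an exploited patch has OD600 = 0 is an assumption, since the response does not address that case.

```python
from typing import Optional


def encounter_type(is_exploit: bool, is_sensing: bool) -> str:
    """Combine the GMM (explore/exploit) and QDA (sensing/non-sensing) outputs."""
    if is_exploit:
        return "exploit"
    # Exploratory encounters are split by the sensing classification.
    return "sample" if is_sensing else "search"


def initial_qda_label(is_exploit: bool, patch_od600: float) -> Optional[str]:
    """Seed label for the semi-supervised QDA; None means unlabeled."""
    if is_exploit:
        return "sensing"       # assumption 1: exploitation implies sensing
    if patch_od600 == 0:
        return "non-sensing"   # assumption 2: nothing to sense on bacteria-free patches
    return None                # all other encounters start unlabeled


print(encounter_type(False, True), initial_qda_label(False, 0))
```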

      Accept-reject vs. stay-switch: Reviewers #1 and #2 ask for additional discussion on how the accept-reject decision-making framework differs from the stay-switch framework. We thank the reviewers for alerting us to this gap in our discussion. We intend to clarify that these frameworks ask two different types of questions (i.e. “Do you want to eat it?” versus “If so, how long do you want to eat it for?”). These concepts are well described in canonical foraging theory literature (see Pyke, Pulliam & Charnov 1977 for a review on the subject) and are easily distinguishable for animals that forage using the following framework: 1) search for prey, 2) encounter prey from a distance, 3) identify prey type, 4) decide to pursue (accept-reject decision), 5) pursue and capture the prey, 6) exploit prey, and 7) decide to stop exploiting and start searching again (stay-switch decision). In this case, it is easy to see the distinction between accept-reject and stay-switch decisions. However, in some scenarios, animals must physically encounter prey prior to identification and then must make an accept-reject decision. In these cases where pursuit and capture are not visualized, it is harder to distinguish between accept-reject and stay-switch decisions. In our experiments, we find significant bimodality in encounter duration (see Figure 2H) where short duration (exploratory) encounters appear to represent a lower bound where animals spend the minimum amount of time possible on a patch (less than 2 minutes), which we interpret as a rejection of the patch. On the other hand, exploitatory encounters span a large range of durations from 2 to 60+ minutes which we interpret as an initial acceptance of the patch followed by a series of stay-switch decisions which determine the overall duration of the encounter. 
      While one could certainly model our data using only stay-switch decision-making, we maintain that an encounter of minimal duration is better interpreted ethologically as a rejection than as an immediate switch decision. We will revise the text to elaborate on our point of view on this somewhat philosophical distinction and what it predicts about C. elegans behavior.

      Sensory mutant behavior: Reviewers #1 and #3 ask for further speculation on the observed behavior of osm-6 and mec-4 animals. We will further elaborate on our findings, how they relate to previous studies, and what they suggest about the mechanisms behind these foraging decisions.

      Model design: Reviewer #3 suggested several alterations to the behavioral model. While the proposed model seems entirely reasonable and could aid in elucidating the time component of how prior experience affects decision-making, we chose the present model based on our experience with model selection using these data. Indeed, as the reviewer suggested, we did a great number of analyses involving model selection including model selection criteria (AIC, BIC) and optimization with regularization techniques (LASSO and elastic nets). We found that the problem of model selection was compounded by the enormous array of highly correlated variables we had to choose from. Additionally, we found that both interaction terms and non-linear terms of our task variables could be predictive of accept-reject decisions but that the precise set of terms selected depended sensitively on which model selection technique was used and generally made rather small contributions to prediction. The diverse array of results and combinatorial number of predictors to possibly include failed to add anything of interpretable value. We therefore chose to take a different approach to this problem. Rather than trying to determine what the “best” model was we instead asked whether a minimal model could be used to answer a set of core questions. Indeed, our goal was not maximal predictive performance but rather to distinguish between the effects of different influences enough to determine if encounter history had a significant, independent effect on decision making. We thus chose to only include task variables that spanned the most basic components of behavioral mechanisms to ask very specific questions. For example, we selected a time variable that we thought best encapsulated satiety. While we could have included many additional terms, or made different choices about which terms to include, based on our analyses these choices would not have qualitatively changed our results. 
      Further, we sought to validate the parameters we chose with additional studies (i.e. food-deprived and sensory mutant animals). We regard our study as an initial foray into demonstrating accept-reject decision-making in nematodes. The exact mechanisms and, consequently, the best model design are therefore beyond the scope of this study. Lastly, Reviewer #3 criticized the use of only sensed patches in the model. While we acknowledge that we are not certain as to whether the “non-sensing” encounters are truly not sensed, we find qualitatively similar results when including all exploratory patches in our analyses. In fact, when all encounters are used, we find stronger correlations between our task variables and the accept-reject decision. However, we take the position that sensation is necessary for decision-making and thus believe that while our model’s predictive performance may be better using all encounters, the interpretation of our findings is stronger when we only include sensing events.

    1. eLife Assessment

      This study provides important insights into the role of the Mid1 gene in hippocampal development and its implications in Opitz G/BBB syndrome (OS), with evidence supporting its impact on synaptic plasticity, neural rhythms, and cognitive functions. The methods, data, and analyses are solid and support the claims despite several minor weaknesses, and they establish Mid1 as a potential therapeutic target for neurological deficits associated with OS. The conclusions are largely supported by the results, but additional data are needed.

    2. Reviewer #1 (Public review):

      Summary:

      The authors demonstrated that a mouse model of Opitz syndrome induced by Mid1 gene knockout exhibited a significant decrease in α rhythm in HPC and abnormal synchronization of γ rhythm in the prefrontal cortex and hippocampus, showing decreased synaptic plasticity and learning and memory dysfunction. All these effects were attributed to the inhibition of p-Creb by PP2Ac.

      Strengths:

      The authors used Mid1 gene knockout mice as a mouse model of Opitz syndrome. They carried out RNA-seq analysis and found that the cAMP signaling pathway, the calcium signaling pathway, and 100 other pathways changed significantly.

      Weaknesses:

      (1) A Mid1 supplementation experiment in Mid1 knockout mice was lacking in this study.

      (2) Enzymes that regulate Creb phosphorylation include not only phosphatases such as PP2A, but also kinases such as CaMKII, PKA, and ERK1/2. These protein kinases should be examined, especially CaMKII, since the authors' bioinformatics data show that calcium signaling pathways changed significantly.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript investigates the role of the Mid1 gene in hippocampal (HPC) development and its contribution to Opitz G/BBB syndrome (OS), which is characterized by neurological deficits and structural abnormalities. The authors use a knockout mouse model (Mid1-/y) to elucidate the underlying molecular mechanisms that contribute to learning and memory impairments. They demonstrate that Mid1 gene deletion leads to reduced synaptic plasticity, abnormal neural rhythms, and decreased cognitive functions, providing a mechanistic explanation for the neurological deficits seen in OS patients. This study addresses an important gap in understanding the neural mechanisms underlying Opitz G/BBB syndrome and provides substantial evidence that the Mid1 gene plays a critical role in hippocampal function and cognition.

      Strengths:

      Understanding the role of Mid1 in HPC development could have broader implications for neurodevelopmental disorders beyond OS, particularly in conditions associated with synaptic dysfunction or memory impairments. The study's focus on the impact of Mid1 on the cAMP signaling pathway, BDNF expression, and synaptic plasticity offers novel mechanisms relevant to both neurodevelopment and neurodegeneration. Moreover, the combination of RNA-seq, electrophysiological measurements, and histological staining provides a multidimensional approach to understanding how Mid1 influences neuronal function and structure.

      Weaknesses:

      (1) The introduction is insufficient, and the number of references is too low. With only nine references, there isn't enough context to adequately explain the background and previous evidence.

      (2) The specificity of behavioral deficits is lacking. The authors indicate learning and memory dysfunction, yet the Y-maze and Morris water maze primarily assess spatial memory. Additional behavioral tests, such as the novel object recognition test for recognition memory or fear conditioning for associative learning, should be included to provide a more comprehensive assessment.

      (3) The manuscript mentions decreased synaptic plasticity but lacks thorough investigation; a more detailed analysis of long-term potentiation (LTP) or depression (LTD) would strengthen the claims. Additionally, while spine morphology is analyzed, incorporating electrophysiological measurements of synaptic strength would better correlate structural changes with functional outcomes.

      (4) The authors performed H&E staining to count the number of hippocampal pyramidal neurons; however, H&E lacks specificity for identifying pyramidal neurons. Neuronal-specific IHC staining would be more appropriate for this quantification. Additionally, the manuscript does not mention the counting method used, which should be clarified.

      (5) Information on the knockout mice used in the study is missing from the Methods section. Additionally, the sex of the mice should be specified, as exploring potential sex-specific differences in the impact of Mid1 deletion could significantly enhance the study's findings.

    4. Reviewer #3 (Public review):

      Summary:

      The authors tried to characterize the neuronal deficiency in Mid1 knockout mice. They performed behavioral, neuroelectrophysiological, and pathological experiments to show that Mid1 knockout mice have impaired cognitive function, impaired synaptic plasticity, and changes in gene expression.

      Strengths:

      The evidence provides insight into the mechanisms of cognitive impairments in Opitz syndrome. Overall, the manuscript is well-organized.

      Weaknesses:

      (1) The major weakness is that the proposed molecular mechanism is not fully supported by the current data. The data presented here only show that changes in gene expression levels, cognitive impairments, and electrophysiological impairments are correlated with each other, but do not support causality.

      (2) The main conclusion is that "The main reason is that the deletion of Mid1 gene will increase the accumulation of Pp2ac protein, inhibit the activity of p-Creb, affect the downstream cAMP pathway, lead to the decrease of synaptic density and plasticity, and ultimately affect the learning and memory ability". This should be toned down, since causality is not supported here.

      (3) The description of the results should be improved. Only one figure is presented in the manuscript. Some key information in the supplementary figures should be moved to the main figures. This is very strange since four display items are allowed even for a short report.

    5. Author response:

      First of all, I'd like to express my heartfelt thanks to you for your meticulous and professional review comments. Your feedback is very important to our work. It not only helps us identify the shortcomings in the paper, but also provides valuable guidance for improving the quality of the paper.

      We carefully read every suggestion you made and were deeply inspired. Please rest assured that we will carefully consider and revise each opinion to ensure that our research work is more rigorous and clear. We promise to revise the manuscript accordingly to meet the standards of the journal and enhance the credibility and influence of the research.

      The main modifications include: a Mid1 supplementation experiment in Mid1 knockout mice; detection of kinases such as CaMKII, PKA, and ERK1/2; additional references; an additional novel object recognition behavioral experiment; additional electrophysiological measurements of LTP; additional neuron-specific immunohistochemical staining; added information on the knockout mice used in the study; and revisions to the language of the article and to the small number of figures.

      Thank you again for your valuable time and professional advice. We look forward to submitting the revised manuscript to you for further review.

    1. eLife Assessment

      This study makes a valuable advance in our understanding of defensive symbionts in insects. It uses a meta-analysis to quantify the magnitude of change in host fitness components when symbionts are present in hosts exposed to natural enemies. The evidence supporting the study conclusions is solid, with analyses confirming common assumptions that symbionts generally provide defence at low cost to hosts.

    2. Reviewer #1 (Public review):

      Summary:

      Cesar, Santos & Cogni use a meta-analysis to report on the direction and magnitude of three fundamental fitness components in defensive symbioses. Specifically, the work focuses on interactions between three arthropod host families (Aphididae, Culicidae, Drosophilidae, and others) and common bacterial endosymbionts (Wolbachia, Serratia, Hamiltonella, Spiroplasma, Rickettsia, Regiella, X-type and Arsenophonus). The results of the overall analysis confirm common assumptions and previous work on such fitness components, showing that defensive symbionts provide strong protection to hosts and cause detectable costs to both hosts and the enemy. The analysis provides insight into the extent of the cost/benefit tradeoff for hosts, reporting that the cost is six times lower than the protective effect. The confirmation that natural enemies attacking hosts infected with symbionts have a reduction in their fitness is also an interesting one, as this shows that the majority of defensive symbionts provide protection by resisting enemy infection, as opposed to tolerating it. This finding has important consequences for evolutionary counter-responses in the enemy species. Of course, this result has less relevance for certain types of enemies (such as parasitoids) where successful infection is dependent upon host killing.

      Interesting results also emerge from the subgroup analysis. For the full dataset, both natural and introduced symbionts were similarly effective in positively influencing the fitness of hosts. However, in the Wolbachia-specific analysis, the artificially introduced symbionts caused costs to the hosts where the natural strain did not. These findings have potentially important ramifications for schemes that use endosymbionts for biocontrol or vector competence, suggesting that (in some cases) natural strains may be the more stable choice for deploying (as they are associated with lower costs).

      The analysis draws from an impressively large dataset, but the interpretation of the full impact of the results would be helped by greater detail on the species/strain level systems included, the data extraction approach, and inclusion criteria. Accounting for phylogenetic nonindependence and alternative coding of one of the moderator variables could also strengthen the biological relevance of the models. Suggestions and thoughts are outlined below.

      Strengths & Potential Improvements:

      An impressively large number of effect sizes (3000) from only 226 studies is collected, robustly confirming common assumptions on the magnitude of fundamental fitness components. However the paper would benefit from a clear breakdown in the main text of the specificities of each system included (e.g. a table at the host species/symbiont strain level, where it is possible). Currently, there is not enough detail for those who want a deep dive to understand what data was extracted for the analysis from these 226 studies, or those who want to understand the underlying diversity in the dataset.

      Currently, when the 'natural enemy group' is tested as a moderator it is coded broadly by type of organism (e.g. virus, bacterium, fungi, parasitoid). But this doesn't adequately capture the mode of killing/fitness reduction by the enemy, which would be the much more biologically relevant categorisation for your questions. For example, parasitoid infection is dependent upon host death (thus host fecundity is not relevant, because the host either survived or did not). Among bacterial and viral pathogens, there is scope for both fecundity and survival to be affected. This in turn may be a very influential factor for the outcome. You could consider recoding this enemy moderator.

      The analysis is restricted to arthropod hosts and defensive symbionts that are also classed as endosymbionts. This focus should be made clear early on in the paper, as there are many systems (that are classed by many as defensive symbioses) that are not part of the analysis.

      There is fairly minimalistic testing of moderators/sub-groups (which probably has its statistical strengths) but perhaps there are also some missed opportunities for testing other ecological contributors to variance, including coinfection (although perhaps limited by power) and other approaches to coding enemy group (as detailed above).

      Looking at the overview of systems included, there's likely a high degree of phylogenetic non-independence in the dataset. Where it is possible, using phylogenetically controlled models could strengthen this analysis.

      Looking at your included systems (Table S5), you might be able to test the effect of coinfection on the 3 variables of interest. For example, it would be particularly important to see if the effects of two symbionts are additive or not.

      No code for the analysis is provided for review at this stage and full details of the dataset are also not available. This slightly limits the ability to assess the full scope and robustness of the study. It would be helpful to have an extensive table in the supplementary detailing (minimum) the reference, study, experiment, host species, symbiont strain, and a description of the exact data extraction source (e.g. table/figure/in text), and method of extraction.

    3. Reviewer #2 (Public review):

      Summary:

      In this exciting study, Cesar and co-authors perform a meta-analysis on the influence of arthropod symbionts on the fitness of their hosts when they are exposed or not to natural enemies. These so-called defensive symbionts are increasingly recognized as key elements in arthropod survival against natural enemies, with effects that ripple through entire terrestrial ecosystems. The topic is timely, the approach is sound, and the manuscript is well-written. I believe this manuscript will attract the attention of entomologists and of microbiologists interested in symbiosis. This study builds on a previous meta-analysis that I was involved in, which was based on phloem-feeding insects. This novel data set is much larger and includes flies (including the model system Drosophila) and mosquitoes (a group of high medical interest). While the previous meta-analysis considered only parasitoids as natural enemies, this study also includes fungi, bacteria, and viruses.

      Strengths:

      The authors compile a very large dataset and provide a broad quantitative overview of the effects of defensive symbionts in insects. By measuring symbiont effects in the presence and absence of natural enemies, the authors are able to infer whether a trade-off between defense and the costs of mutualism in the absence of enemy pressure exists. Defensive symbioses are an important research topic that had its initial "momentum" a decade ago, so the timing for such a systematic review is very appropriate.

      Weaknesses:

      I think the manuscript could be improved by clarifying several sections, particularly the introduction and methods. The introduction section is too specific and heavily reliant on particular examples. In my view, the theoretical background of the study could be made clearer, and the knowledge gap identified more explicitly. A focus on how widespread defensive symbioses are, along with a brief, up-to-date review of the groups possessing such symbionts, would help. This lack of focus is also observed in the methods section, where more details are needed in many instances to better understand how data was collected and analyzed. Regarding the analyses, the multi-level analysis contains many moderators, but it's unclear why these moderators were included. While this may seem a minor issue, it highlights a disconnection between the analyses, the conceptual background, and the hypotheses tested. Another important weakness is that the analyses are too general, and much-hidden information is not immediately apparent. For instance, readers cannot easily identify which species of symbionts are studied (and the effects they have), or which natural enemies are involved. Although this information is found in the supplementary material, including it in the main body would significantly improve the manuscript.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      Cesar, Santos & Cogni use a meta-analysis to report on the direction and magnitude of three fundamental fitness components in defensive symbioses. Specifically, the work focuses on interactions between three arthropod host families (Aphididae, Culicidae, Drosophilidae, and others) and common bacterial endosymbionts (Wolbachia, Serratia, Hamiltonella, Spiroplasma, Rickettsia, Regiella X-type and Arsenophonus). The results of the overall analysis confirm common assumptions and previous work on such fitness components, showing that defensive symbionts provide strong protection to hosts and cause detectable costs to both hosts and the enemy. The analysis provides insight into the extent of the cost/benefit tradeoff for hosts, reporting that the cost is six times lower than the protective effect. The confirmation that natural enemies attacking hosts infected with symbionts have a reduction in their fitness is also an interesting one, as this shows that the majority of defensive symbionts provide protection by resisting enemy infection, as opposed to tolerating it. This finding has important consequences for evolutionary counter-responses in the enemy species. Of course, this result has less relevance for certain types of enemies (such as parasitoids) where successful infection is dependent upon host killing.

      Interesting results also emerge from the subgroup analysis. For the full dataset, both natural and introduced symbionts were similarly effective in positively influencing the fitness of hosts. However, in the Wolbachia-specific analysis, the artificially introduced symbionts caused costs to the hosts where the natural strain did not. These findings have potentially important ramifications for schemes that use endosymbionts for biocontrol or vector competence, suggesting that (in some cases) natural strains may be the more stable choice for deploying (as they are associated with lower costs).

      The analysis draws from an impressively large dataset, but the interpretation of the full impact of the results would be helped by greater detail on the species/strain level systems included, the data extraction approach, and inclusion criteria. Accounting for phylogenetic nonindependence and alternative coding of one of the moderator variables could also strengthen the biological relevance of the models. Suggestions and thoughts are outlined below.

      We sincerely thank Reviewer #1 for the time and effort dedicated to reviewing our manuscript. The suggestions provided are highly constructive and will greatly assist us in improving both our analyses and the manuscript overall.

      Strengths & Potential Improvements:

      An impressively large number of effect sizes (3000) from only 226 studies is collected, robustly confirming common assumptions on the magnitude of fundamental fitness components. However the paper would benefit from a clear breakdown in the main text of the specificities of each system included (e.g. a table at the host species/symbiont strain level, where it is possible). Currently, there is not enough detail for those who want a deep dive to understand what data was extracted for the analysis from these 226 studies, or those who want to understand the underlying diversity in the dataset.

      We thank the reviewer for the suggestion, and we will add this information to our revised manuscript.

      Currently, when the 'natural enemy group' is tested as a moderator it is coded broadly by type of organism (e.g. virus, bacterium, fungi, parasitoid). But this doesn't adequately capture the mode of killing/fitness reduction by the enemy, which would be the much more biologically relevant categorisation for your questions. For example, parasitoid infection is dependent upon host death (thus host fecundity is not relevant, because the host either survived or did not). Among bacterial and viral pathogens, there is scope for both fecundity and survival to be affected. This in turn may be a very influential factor for the outcome. You could consider recoding this enemy moderator.

      We agree, and we will implement this in the analysis for our revised manuscript.

      The analysis is restricted to arthropod hosts and defensive symbionts that are also classed as endosymbionts. This focus should be made clear early on in the paper, as there are many systems (that are classed by many as defensive symbioses) that are not part of the analysis.

      We agree, and we will implement this in our revised manuscript.

      There is fairly minimalistic testing of moderators/sub-groups (which probably has its statistical strengths) but perhaps there are also some missed opportunities for testing other ecological contributors to variance, including coinfection (although perhaps limited by power) and other approaches to coding enemy group (as detailed above).

      We agree, and we will implement this in the analysis for our revised manuscript.

      Looking at the overview of systems included, there's likely a high degree of phylogenetic non-independence in the dataset. Where it is possible, using phylogenetically controlled models could strengthen this analysis.

      We thank the reviewer for the suggestion. We will explore the possibility of using phylogenetically controlled models in our analyses, although we recognize the challenges associated with their implementation, particularly in the case of the natural enemies, given the great diversity of distantly related groups included in our study: viruses, bacteria, fungi, protozoans, nematodes and parasitoid wasps.

      Looking at your included systems (Table S5), you might be able to test the effect of coinfection on the 3 variables of interest. For example, it would be particularly important to see if the effects of two symbionts are additive or not.

      We agree, and we will implement this in the analysis for our revised manuscript.

      No code for the analysis is provided for review at this stage and full details of the dataset are also not available. This slightly limits the ability to assess the full scope and robustness of the study. It would be helpful to have an extensive table in the supplementary detailing (minimum) the reference, study, experiment, host species, symbiont strain, and a description of the exact data extraction source (e.g. table/figure/in text), and method of extraction.

      The code for the analysis and the full raw data with the suggested information are available at https://github.com/cassiasqr/MetaSymbiont (The link is available at the end of the manuscript).

      Reviewer #2 (Public review):

      Summary:

      In this exciting study, Cesar and co-authors perform a meta-analysis on the influence of arthropod symbionts on the fitness of their hosts when they are exposed or not to natural enemies. These so-called defensive symbionts are increasingly recognized as key elements in arthropod survival against natural enemies, with effects that ripple through entire terrestrial ecosystems. The topic is timely, the approach is sound, and the manuscript is well-written. I believe this manuscript will attract the attention of entomologists and of microbiologists interested in symbiosis. This study builds on a previous meta-analysis that I was involved in, which was based on phloem-feeding insects. This novel data set is much larger and includes flies (including the model system Drosophila) and mosquitoes (a group of high medical interest). While the previous meta-analysis considered only parasitoids as natural enemies, this study also includes fungi, bacteria, and viruses.

      Strengths:

      The authors compile a very large dataset and provide a broad quantitative overview of the effects of defensive symbionts in insects. By measuring symbiont effects in the presence and absence of natural enemies, the authors are able to infer whether a trade-off between defense and the costs of mutualism in the absence of enemy pressure exists. Defensive symbioses are an important research topic that had its initial "momentum" a decade ago, so the timing for such a systematic review is very appropriate.

      We sincerely thank Reviewer #2 for dedicating their time and effort to reviewing our manuscript. The suggestions are very insightful and will significantly contribute to improving our manuscript.

      Weaknesses:

      I think the manuscript could be improved by clarifying several sections, particularly the introduction and methods. The introduction section is too specific and heavily reliant on particular examples. In my view, the theoretical background of the study could be made clearer, and the knowledge gap identified more explicitly. A focus on how widespread defensive symbioses are, along with a brief, up-to-date review of the groups possessing such symbionts, would help. This lack of focus is also observed in the methods section, where more details are needed in many instances to better understand how data was collected and analyzed. Regarding the analyses, the multi-level analysis contains many moderators, but it's unclear why these moderators were included. While this may seem a minor issue, it highlights a disconnection between the analyses, the conceptual background, and the hypotheses tested. 

      We thank the reviewer for the suggestions, and we will try to make the introduction and the methods section clearer. 

      Another important weakness is that the analyses are too general, and much-hidden information is not immediately apparent. For instance, readers cannot easily identify which species of symbionts are studied (and the effects they have), or which natural enemies are involved. Although this information is found in the supplementary material, including it in the main body would significantly improve the manuscript.

      We agree, and we will implement this in our revised manuscript.

    1. eLife Assessment

      This valuable study provides new insights into the role of the conserved protein FLWR-1/Flower in synaptic transmission using C. elegans. The authors employ a range of techniques, including calcium imaging, ultrastructural analysis, and electrophysiology, providing evidence that challenges previous assumptions about FLWR-1 function. While some findings are solid, several conclusions remain incomplete and require further study to substantiate the proposed mechanisms.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Seidenthal et al. investigated the role of the C. elegans Flower protein, FLWR-1, in synaptic transmission, vesicle recycling, and neuronal excitability. They confirmed that FLWR-1 localizes to synaptic vesicles and the plasma membrane and facilitates synaptic vesicle recycling at neuromuscular junctions, albeit in an unexpected manner. The authors observed that hyperstimulation results in endosome accumulation in flwr-1 mutant synapses, suggesting that FLWR-1 facilitates the breakdown of endocytic endosomes, which differs from earlier studies in flies that suggested the Flower protein promotes the formation of bulk endosomes. This is a valuable finding. Using tissue-specific rescue experiments, the authors showed that expressing FLWR-1 in GABAergic neurons restored the aldicarb-resistant phenotype seen in flwr-1 mutants to wild-type levels. In contrast, FLWR-1 expression in cholinergic neurons in flwr-1 mutants did not restore aldicarb sensitivity, yet muscle expression of FLWR-1 partially but significantly recovered the aldicarb-resistant defects. The study also revealed that removing FLWR-1 leads to increased Ca2+ signaling in motor neurons upon photo-stimulation. Further, the authors conclude that FLWR-1 contributes to the maintenance of the excitation/inhibition (E/I) balance by preferentially regulating the excitability of GABAergic neurons. Finally, SNG-1::pHluorin data imply that FLWR-1 removal enhances synaptic transmission, however, the electrophysiological recordings do not corroborate this finding.

      Strengths:

      This study by Seidenthal et al. offers valuable insights into the role of the Flower protein, FLWR-1, in C. elegans. Their findings suggest that FLWR-1 facilitates the breakdown of endocytic endosomes, which marks a departure from its previously suggested role in forming endosomes through bulk endocytosis. This observation could be important for understanding how Flower proteins function across species. In addition, the study proposes that FLWR-1 plays a role in maintaining the excitation/inhibition balance, which has potential impacts on neuronal activity.

      Weaknesses:

      One issue is the lack of follow-up tests regarding the relative contributions of muscle and GABAergic FLWR-1 to aldicarb sensitivity. The findings that muscle expression of FLWR-1 can significantly rescue aldicarb sensitivity are intriguing and may influence both experimental design and data interpretation. Have the authors examined aldicarb sensitivity when FLWR-1 is expressed in both muscles and GABAergic neurons, or possibly in muscles and cholinergic neurons? Given that muscles could influence neuronal activity through retrograde signaling, a thorough examination of FLWR-1's role in muscle is necessary, in my opinion.

      Would the results from electrophysiological recordings and GCaMP measurements be altered with muscle expression of FLWR-1? Most experiments presented in the manuscript compare wild-type and flwr-1 mutant animals. However, without tissue-specific knockout, knockdown, or rescue experiments, it is difficult to separate cell-autonomous roles from non-cell-autonomous effects, in particular in the context of aldicarb assay results. Also, relying solely on levamisole paralysis experiments is not sufficient to rule out changes in muscle AChRs, particularly due to the presence of levamisole-resistant receptors.

      This issue regarding the muscle role of FLWR-1 also complicates the interpretation of results from coelomocyte uptake experiments, where GFP secreted from muscles and coelomocyte fluorescence were used to estimate endocytosis levels. A decrease in coelomocyte GFP could result from either reduced endocytosis in coelomocytes or decreased secretion from muscles. Therefore, coelomocyte-specific rescue experiments seem necessary to distinguish between these possibilities.

      The manuscript states that GCaMP was used to estimate Ca2+ levels at presynaptic sites. However, due to the rapid diffusion of both Ca2+ and GCaMP, it is unclear how this assay distinguishes Ca2+ levels specifically at presynaptic sites versus those in axons. What are the relative contributions of VGCCs and ER calcium stores here? This raises a question about whether the authors are measuring the local impact of FLWR-1 specifically at presynaptic sites or more general changes in cytoplasmic calcium levels.

      The experiments showing FLWR-1's presynaptic localization need clarification/improvement. For example, in the data shown in Fig. 3B, GFP::FLWR-1 is expressed under its own promoter, and TagRFP::ELKS-1 is expressed exclusively in GABAergic neurons. Given that pflwr-1 drives expression in both cholinergic and GABAergic neurons, and that cholinergic synapses outnumber GABAergic ones in the nerve cord, it would be expected that many green FLWR-1 puncta do not associate with TagRFP::ELKS-1. However, several images in Figure 3B suggest an almost perfect correlation between FLWR-1 and ELKS-1 puncta. It would be helpful for the readers to understand the exact location in the nerve cord where these images were collected to avoid confusion.

      The SNG-1::pHluorin data in Figure 5C is significant, as they suggest increased synaptic transmission at flwr-1 mutant synapses. However, to draw conclusions, it is necessary to verify whether the total amount of SNG-1::pHluorin present on synaptic vesicles remains the same between flwr-1 mutant and wild-type synapses. Without this comparison, a conclusion on levels of synaptic vesicle release based on changes in fluorescence might be premature, in particular given the results of electrophysiological recordings.

      Finally, the interpretation of the E74Q mutation results needs reconsideration. Figure 8B indicates that the E74Q variant of FLWR-1 partially loses its rescuing ability, which suggests that the E74Q mutation adversely affects the function of FLWR-1. Why did the authors expect that the role of FLWR-1 should have been completely abolished by E74Q? Given that FLWR-1 appears to work in multiple tissues, might FLWR-1's function in neurons require its calcium channel activity, whereas its role in muscles might be independent of this feature? While I understand there is ongoing debate about whether FLWR-1 is a calcium channel, the experiments in this study do not definitively resolve local Ca2+ dynamics at synapses. Thus, in my opinion, it may be premature to draw firm conclusions about calcium influx through FLWR-1.

      Also, the aldicarb data presented in Figures 8B and 8D show notable inconsistencies that require clarification. While Figure 8B indicates that the 50% paralysis time for flwr-1 mutant worms occurs at 3.5-4 hours, Figure 8D shows that 50% paralysis takes approximately 2.5 hours for the same flwr-1 mutants. This discrepancy should be addressed. In addition, the manuscript mentions that the E74Q mutation impairs FLWR-1 folding, which could significantly affect its function. Can the authors show empirical data supporting this claim?

    3. Reviewer #2 (Public review):

      Summary:

      The Flower protein is expressed in various cell types, including neurons. Previous studies in flies have proposed that Flower plays a role in neuronal endocytosis by functioning as a Ca2+ channel. However, its precise physiological roles and molecular mechanisms in neurons remain largely unclear. This study employs C. elegans as a model to explore the function and mechanism of FLWR-1, the C. elegans homolog of Flower. This study offers intriguing observations that could potentially challenge or expand our current understanding of the Flower protein. Nevertheless, further clarification or additional experiments are required to substantiate the study's conclusions.

      Strengths:

      A range of approaches was employed, including the use of a flwr-1 knockout strain, assessment of cholinergic synaptic activity via analyzing aldicarb (a cholinesterase inhibitor) sensitivity, imaging Ca2+ dynamics with GCaMP3, analyzing pHluorin fluorescence, examination of presynaptic ultrastructure by EM, and recording postsynaptic currents at the neuromuscular junction. The findings include notable observations on the effects of flwr-1 knockout, such as increased Ca2+ levels in motor neurons, changes in endosome numbers in motor neurons, altered aldicarb sensitivity, and potential involvement of a Ca2+-ATPase and PIP2 binding in FLWR-1's function.

      Weaknesses:

      (1) The observation that flwr-1 knockout increases Ca2+ levels in motor neurons is notable, especially as it contrasts with prior findings in flies. The authors propose that elevated Ca2+ levels in flwr-1 knockout motor neurons may stem from "deregulation of MCA-3" (a Ca2+ ATPase in the plasma membrane) due to FLWR-1 loss. However, this conclusion relies on limited and somewhat inconclusive data (Figure 7). Additional experiments could clarify FLWR-1's role in MCA-3 regulation. For instance, it would be informative to investigate whether mutations in other genes that cause elevated cytosolic Ca2+ produce similar effects, whether MCA-3 physically interacts with FLWR-1, and whether MCA-3 expression is reduced in the flwr-1 knockout.

      (2) In silico analysis identified residues R27 and K31 as potential PIP2 binding sites in FLWR-1. The authors observed that FLWR-1(R27A/K31A) was less effective than wild-type FLWR-1 in rescuing the aldicarb sensitivity phenotype of the flwr-1 knockout, suggesting that FLWR-1 function may depend on PIP2 binding at these two residues. Given that mutations in various residues can impair protein function non-specifically, additional studies may be needed to confirm the significance of these residues for PIP2 binding and FLWR-1 function. In addition, the authors might consider explicitly discussing how this finding aligns or contrasts with the results of a previous study in flies, where alanine substitutions at K29 and R33 impaired a Flower-related function (Li et al., eLife 2020).

      (3) A primary conclusion from the EM data was that FLWR-1 participates in the breakdown, rather than the formation, of bulk endosomes (lines 20-22). However, the reasoning behind this conclusion is somewhat unclear. Adding more explicit explanations in the Results section would help clarify and strengthen this interpretation.

      (4) The aldicarb assay results in Figure 3 are intriguing, indicating that reduced GABAergic neuron activity alone accounts for the flwr-1 mutant's hyposensitivity to aldicarb. Given that cholinergic motor neurons also showed increased activity in the flwr-1 mutant, one might expect the flwr-1 mutant to display hypersensitivity to aldicarb in the unc-47 knockout background. However, this was not observed. The authors might consider validating their conclusion with an alternative approach or, at the minimum, providing a plausible explanation for the unexpected result. Since aldicarb-induced paralysis can be influenced by factors beyond acetylcholine release from cholinergic motor neurons, interpreting aldicarb assay results with caution may be advisable. This is especially relevant here, as FLWR-1 function in muscle cells also impacts aldicarb sensitivity (Figure S3B). Previous electrophysiological studies have suggested that aldicarb sensitivity assays may sometimes yield misleading conclusions regarding protein roles in acetylcholine release.

      (5) Previous studies have suggested that the Flower protein functions as a Ca2+ channel, with a conserved glutamate residue at the putative selectivity filter being essential for this role. However, mutating this conserved residue (E74Q) in C. elegans FLWR-1 altered aldicarb sensitivity in a direction opposite to what would be expected for a Ca2+ channel function. Moreover, the authors observed that E74 of FLWR-1 is not located near a potential conduction pathway in the FLWR-1 tetramer, as predicted by AlphaFold3. These findings raise the possibility that Flower may not function as a Ca2+ channel. While this is a potentially significant discovery, further experiments are needed to confirm and expand upon these results.

      (6) Phrases like "increased excitability" and "increased Ca2+ influx" are used throughout the manuscript. However, there is no direct evidence that motor neurons exhibit increased excitability or Ca2+ influx. The authors appear to interpret the elevated Ca2+ signal in motor neurons as indicative of both increased excitability and Ca2+ influx. However, this elevated Ca2+ signal in the flwr-1 mutant could occur independently of changes in excitability or Ca2+ influx, such as in cases of reduced MCA-3 activity. The authors may wish to consider alternative terminology that more accurately reflects their findings.

    4. Reviewer #3 (Public review):

      Summary:

      Seidenthal et al. investigated the role of the Flower protein, FLWR-1, in C. elegans and confirmed its involvement in endocytosis within both synaptic and non-neuronal cells, possibly by contributing to the fission of bulk endosomes. They also uncovered that FLWR-1 has a novel inhibitory effect on neuronal excitability at GABAergic and cholinergic synapses in neuromuscular junctions.

      Strengths:

      This study not only reinforces the conserved role of the Flower protein in endocytosis across species but also provides valuable ultrastructural data to support its function in the bulk endosome fission process. Additionally, the discovery of FLWR-1's role in modulating neuronal excitability broadens our understanding of its functions and opens new avenues for research into synaptic regulation.

      Weaknesses:

      The study does not address the ongoing debate about the Flower protein's proposed Ca2+ channel activity, leaving an important aspect of its function unexplored. Furthermore, the evidence supporting the mechanism by which FLWR-1 inhibits neuronal excitability is limited. The suggested involvement of MCA-3 as a mediator of this inhibition lacks conclusive evidence, and a more detailed exploration of this pathway would strengthen the findings.

    1. eLife Assessment

      This important study introduces rationally designed, genetically encoded tools for the selective and reversible ablation of excitatory and inhibitory synapses. The evidence is convincing, supported by robust experiments and clear results that validate the effectiveness of each tool. This work will be of particular interest to researchers exploring the roles of specific synapses within neural circuitry.

    2. Reviewer #1 (Public review):

      Summary:

      This work is a continuation of a previous paper from the Arnold group, in which they engineered GFE3, a tool that specifically ablates inhibitory synapses. Here, the authors generate 3 different actuators:

      (1) An excitatory synapse ablator.

      (2) A photoactivatable inhibitory synapse ablator.

      (3) A chemically inducible inhibitory synapse ablator.

      Following initial engineering, the authors present characterization and optimization data to showcase that these new tools allow one to specifically ablate synapses, without toxicity and with specificity. Furthermore, they showcase that these manipulations are reversible.

      Altogether, these new tools would be important for the neuroscience community.

      Strengths:

      The authors convincingly demonstrate the engineering, optimization, and characterization of these new probes. The main novelty here is the new excitatory synapse ablator, which has not been shown yet and thus could be a valuable tool for neuroscientists.

      Weaknesses:

      There are a few specific issues with regard to these probes that are unclear to me, which require some explanation or potentially new analysis and experiments.

      The biggest concern in this regard is that almost all the characterization is performed in cultured dissociated neurons. I wonder if, for the typical neuroscience user, it would be trivial to characterize how well these tools express and operate in vivo. This could be substantially different and present some limitations as to the utility of these tools.

    3. Reviewer #2 (Public review):

      Summary:

      This study introduces a set of genetically encoded tools for the selective and reversible ablation of excitatory and inhibitory synapses. Previously, the authors developed GFE3, a tool that efficiently ablates inhibitory synapses by targeting an E3 ligase to the inhibitory scaffolding protein Gephyrin via GPHN.FingR, a recombinant, antibody-like protein (Gross et al., 2016). Building on this work, they now present three new ablation tools: PFE3, which targets excitatory synapses, and two new versions of GFE3, paGFE3 and chGFE3, that are photoactivatable and chemically inducible, respectively. These tools enable selective and efficient synapse ablation in specific cell types, providing valuable methods for disrupting neural circuits. This approach holds broad potential for investigating the roles of specific synaptic input onto genetically determined cells.

      Strengths:

      The primary strength of this study lies in the rational design and robust validation of each tool's effectiveness, building on previous work by the authors' group (Gross et al., 2016). Each tool serves distinct research needs: PFE3 enables efficient degradation of PSD-95 at excitatory synapses, while paGFE3 and chGFE3 allow for targeted degradation of Gephyrin, offering spatiotemporal control over inhibitory synapses via light or chemical activation. These tools are efficiently validated through robust experiments demonstrating reductions in synaptic markers (PSD-95 and Gephyrin) and confirming reversibility, which is crucial for transient ablation. By providing tools with both optogenetic and chemical control options, this study broadens the applicability of synapse manipulation across varied experimental conditions, enhancing the utility of E3 ligase-based approaches for synapse ablation.

      Weaknesses:

      While this study provides valuable tools and addresses many critical points for validation, examining potential issues with specificity and background effects in further detail could strengthen the paper. For instance, PFE3 results in reductions in both PSD-95 and GluA1. In previous work, GFE3 selectively reduced Gephyrin without affecting major Gephyrin interactors or other PSD proteins. Clarifying whether PFE3 affects additional PSD proteins beyond GluA1 would be important for accurately interpreting results in experiments using PFE3. Additionally, further insight into PFE3's impact on inhibitory synapses would be valuable.

      For paGFE3 and chGFE3, the E3 ligase (RING domain of Mdm2) is overexpressed throughout cells as a separate construct. Although the authors show that Gephyrin is not significantly reduced without light or chemical activation, it remains possible that other proteins could be ubiquitinated due to the overexpressed E3 domain. Addressing these points would clarify the strengths and limitations of tools, providing users with valuable information.

    1. eLife Assessment

      This paper is an important overview of the currently published literature on low-intensity focussed ultrasound stimulation (TUS) in humans, with a meta-analysis of this literature that explores which stimulation parameters might predict the directionality of the physiological stimulation effects. Whilst currently incomplete, the database proposed by the paper has the potential to become a key community resource if carefully curated and developed.

    2. Reviewer #1 (Public review):

      Summary:

      This paper is a relevant overview of the currently published literature on low-intensity focussed ultrasound stimulation (TUS) in humans, with a meta-analysis of this literature that explores which stimulation parameters might predict the directionality of the physiological stimulation effects.

      The pool of papers to draw from is small, which is not surprising given the nascent technology. It seems nevertheless relevant to summarize the current field in the way done here, not least to mitigate and prevent some of the mistakes that other non-invasive brain stimulation techniques have suffered from, most notably the theory- and data-free permutation of the parameter space.

      The meta-analysis concludes that there are, at best, weak trends toward specific parameters predicting the direction of the stimulation effects. The data have been incorporated into an open database that will ideally continue to be populated by the community and thereby become a helpful resource as the field moves forward.

      Strengths:

      The current state of human TUS is concisely and well summarized. The methods of the meta-analysis are appropriate. The database is a valuable resource.

      Weaknesses:

      These are not so much weaknesses but rather comments and suggestions that the authors may want to consider.

      (1) I may have missed this, but how will the database be curated going forward? The resource will only be as useful as the quality of data entry, which, given the complexity of TUS can easily be done incorrectly.

      (2) It would be helpful to report the full statistics and effect sizes for all analyses. At times, only p-values are given. The meta-analysis only provides weak evidence (judged by the p-values) for two parameters having a predictive effect on the direction of neuromodulation. This reviewer thinks a stronger statement is warranted that there is currently no good evidence for duty cycle or sonication direction predicting outcome (though I caveat this given the full stats aren't reported). The concern here is that some readers may gallop away with the impression that the evidence is compelling because the p-value is on the correct side of 0.05.

      (3) This reviewer thinks the issue of (independent) replication should be more forcefully discussed and highlighted. The overall motivation for the present paper is clearly and thoughtfully articulated, but perhaps the authors agree that the role that replication has to play in a nascent field such as TUS is worth considering.

      (4) A related point is that many of the results come from the same groups (the so-called theta-TUS protocol being a clear example). The analysis could factor this in, but it may be helpful to either signpost independent replications, which studies come from the same groups, or both.

      (5) The recent study by Bao et al 2024 J Phys might be worth including, not least because it fails to replicate the results on theta TUS that had been limited to the same group so far (by reporting, in essence, the opposite result).

      (6) The summary of TUS effects is useful and concise. Two aspects may warrant highlighting, if anything to safeguard against overly simplistic heuristics for the application of TUS from less experienced users. First, could the effects of sonication (enhancing vs suppressing) depend on the targeted structure? Across the cortex, this may be similar, but for subcortical structures such as the basal ganglia, thalamus, etc, the idiosyncratic anatomy, connectivity, and composition of neurons may well lead to different net outcomes. Do the models mentioned in this paper account for that or allow for exploring this? And is it worth highlighting that simple heuristics that assume the effects of a given TUS protocol are uniform across the entire brain risk oversimplification or could be plain wrong? Second, and related, there seems to be the implicit assumption (not necessarily made by the authors) that the effects of a given protocol in a healthy population transfer like for like to a patient population (if TUS protocol X is enhancing in healthy subjects, I can use it for enhancement in patient group Y). This reviewer does not know to which degree this is valid or not, but it seems simplistic or risky. Many neurological and psychiatric disorders alter neurotransmission, and/or lead to morphological and structural changes that would seem capable of influencing the impact of TUS. If the authors agree, this issue might be worth highlighting.

    3. Reviewer #2 (Public review):

      Summary:

      This paper describes a number of aspects of transcranial ultrasound stimulation (TUS) including a generic review of what TUS might be used for; a meta-analysis of human studies to identify ultrasound parameters that affect directionality; a comparison between one postulated mechanistic model and results in humans; and a description of a database for collecting information on studies.

      Strengths:

      The main strength was a meta-analysis of human studies to identify which ultrasonic parameters might result in enhancement or suppression of modulation effects. The meta-analysis suggests that none of the US parameters correlate significantly with effects. This is a useful result for researchers in the field in trying to determine how the parameter space should be further investigated to identify whether it is possible to indeed enhance or suppress brain activity with ultrasound.

      The database is a good idea in principle but would be best done in collaboration with ITRUSST, an international consortium, and perhaps should be its own paper.

      Weaknesses:

      The paper tries to cover too many topics and some of the technical descriptions are a bit loose. The review section does not add to the current literature. The comparison with a mechanistic model is limited to comparing data with a single model at a time when there is no general agreement in the field as to how ultrasound might produce a neuromodulation effect. The comparison is therefore of limited value.

    1. eLife Assessment

      This important study includes convincing evidence to show that behavioral measures and hippocampal representations of cognitive control are not dependent upon the medial prefrontal cortex. Whilst overall the study is of importance, it is possible that the conceptual framework used to interpret and discuss the findings could be strengthened in a revised version. The results are expected to be of interest to those studying neural mechanisms of cognitive control and functions of associational brain regions.

    2. Reviewer #1 (Public review):

      Summary:

      The authors examine the role of the medial prefrontal cortex (mPFC) in cognitive control, i.e. the ability to use task-relevant information and ignore irrelevant information, in the rat. According to the central-computation hypothesis, cognitive control in the brain is centralized in the mPFC and according to the local hypothesis, cognitive control is performed in task-related local neural circuits. Using the place avoidance task which involves cognitive control, it is predicted that if mPFC lesions affect learning, this would support the central computation hypothesis whereas no effect of lesions would rather support the local hypothesis. The authors thus examine the effect of mPFC lesions in learning and retention of the place avoidance task. They also look at functional interconnectivity within a large network of areas that could be activated during the task by using cytochrome oxidase, a metabolic marker. In addition, electrophysiological unit recordings of CA1 hippocampal cells are made in a subset of (lesioned or intact) animals to evaluate overdispersion, a firing property that reflects cognitive control in the hippocampus. The results indicate that mPFC lesions do not impair place avoidance learning and retention (though flexibility is altered during conflict training) and do not affect cognitive control seen in hippocampal place cell activity (alternation of frame-specific firing), although they reduce overdispersion, a measure of location-specific firing variability, in pretraining. The lesions nevertheless have some effect on functional interconnections. The results overall support the local hypothesis.

      Strengths:

      (1) Straightforward hypothesis: clarification of the involvement of the mPFC in the brain is expected and achieved. Appropriate use of fully mastered methods (behavioral task, electrophysiological recordings, measure of metabolic marker cytochrome oxidase) and rigorous analysis of the data. The conclusion is strongly supported by the data.

      Weaknesses:

      No notable weaknesses in the conception, making of the study, and data analysis. The introduction does not mention important aspects of the work, i.e. cytochrome oxidase measure and electrophysiological recordings. The study is actually richer than expected from the introduction.

    3. Reviewer #2 (Public review):

      Park et al. set out to test two competing hypotheses about the role of the medial prefrontal cortex (PFC) in cognitive control, the ability to use task-relevant cues and ignore task-irrelevant cues to guide behavior. The "central computation" hypothesis assumes that cognitive control relies on computations performed by the PFC, which then interacts with other brain regions to accomplish the task. Alternatively, the "local computation" hypothesis suggests that computations necessary for cognitive control are carried out by other brain regions that have been shown to be essential for cognitive control tasks, such as the dorsal hippocampus and the thalamus. If the central computation hypothesis is correct, PFC lesions should disrupt cognitive control. Alternatively, if the local computation hypothesis is correct, cognitive control would be spared after PFC lesions. The task used to assess cognitive control is the active place avoidance task in which rats must avoid a section of a rotating arena using the stationary room cues and ignoring the local olfactory cues on the rotating platform. Performance on this task has previously been shown to be disrupted by hippocampal lesions and hippocampal ensembles dynamically represent the room and arena depending on the animal's proximity to the shock zone. They found no group (lesion vs. sham) differences in the three behavioral parameters tested: distance traveled, latency to enter the shock zone, and number of shock zone entries for both the standard task and the "conflict" task in which the shock zone was rotated by 180 degrees. The only significant difference was the savings index; the lesion group entered the new shock zone more often than the sham group during the first 5 minutes of the second conflict session. This deficit was interpreted as a cognitive flexibility deficit rather than a cognitive control failure. 
      Next, the authors compared cytochrome oxidase activity between sham and lesion groups in 14 brain regions and found that only the amygdala showed significant elevation in the lesion vs. sham group. Pairwise correlation analysis revealed a striking difference between groups, with many correlations between regions lost in the lesion group (between reuniens and hippocampus, and between reuniens and amygdala) and a correlation between dorsal CA1 and central amygdala that appeared in the lesion group and was absent in the sham group. Finally, the authors assessed dorsal hippocampal representations of the spatial frame (arena vs. room) and found no differences between lesion and sham groups. The only difference in hippocampal activity was reduced overdispersion in the lesion group compared to the sham group on the pretraining session only, and this difference disappeared after the task began. Collectively, the authors interpret their findings as supporting the local computation hypothesis; computations necessary for cognitive control occur in brain regions other than the PFC.

      Strengths:

      (1) The data were collected in a rigorous way with experimental blinding and appropriate statistical analyses.

      (2) Multiple approaches were used to assess differences between lesion and sham groups, including behavior, metabolic activity in multiple brain regions, and hippocampal single-unit recording.

      Weaknesses:

      (1) Only male rats were used with no justification provided for excluding females from the sample.

      (2) The conceptual framework used to interpret the findings was to present two competing hypotheses with mutually exclusive predictions about the impact of PFC lesions on cognitive control. The authors then use mainly null findings as evidence in support of the local computation hypothesis. They acknowledge that some people may question the notion that the active place avoidance task indeed requires cognitive control, but then call the argument "circular" because PFC has to be involved in cognitive control. This assertion does not address the possibility that the active place avoidance task simply does not require cognitive control.

      (3) The authors did not link the CO activity with the behavioral parameters, even though the CO imaging was done on a subset of the animals that ran the behavioral task, nor did they make any attempt to interpret these findings in light of the two competing hypotheses posed in the introduction. Moreover, the discussion lacks any mechanistic interpretations of the findings. For example, there are no attempts to explain why amygdala activity and its correlation with dCA1 activity might be higher in the PFC lesioned group.

      (4) Publishing null results is important to avoid wasting animals, time, and money. This study's results will have a significant impact on how the field views the role of the PFC in cognitive control. Whether or not some people reject the notion that the active place avoidance task measures cognitive control, the findings are solid and can serve as a starting point for generating hypotheses about how brain networks change when deprived of PFC input.

    4. Reviewer #3 (Public review):

      Summary:

      This study by Park and colleagues investigated how the medial prefrontal cortex (mPFC) influences behavior and hippocampal place cell activity during a two-frame active place avoidance task in rats. Rats learned to avoid the location of mild shock within a rotating arena, with the shock zone being defined relative to distal cues in the room. Permanent chemical lesions of the mPFC did not impair the ability to avoid the shock zone by using distal cues and ignoring proximal cues in the arena. In parallel, hippocampal place cells alternated between two spatial tuning patterns, one anchored to the distal cues and the other to the proximal cues, and this alternation was not affected by the mPFC lesion. Based on these findings, the authors argue that the mPFC is not essential for differentiating between task-relevant and irrelevant information.

      Strengths:

      This study was built on substantial work by the Fenton lab that validated their two-frame active place avoidance task and provided sound theoretical and analytical foundations. Additionally, the effectiveness of mPFC lesions was validated by several measures, enabling the authors to base their argument on the lack of lesion effects on behavior and place cell dynamics.

      Weaknesses:

      The authors define cognitive control as "the ability to judiciously use task-relevant information while ignoring salient concurrent information that is currently irrelevant for the task." (Lines 77-78). This definition is much simpler than the one by Miller and Cohen: "the ability to orchestrate thought and action in accordance with internal goals (Ref. 1)" and by Robbins: "processes necessary for optimal scheduling of complex sequence of behaviour." (Dalley et al., 2004, PMID: 15555683). Differentiating between task-relevant and irrelevant information is required in various behavioral tasks, such as differential learning, reversal learning, and set-shifting tasks. Previous rodent behavioral studies have shown that the integrity of the mPFC is necessary for set-shifting but not for differential or reversal learning (e.g., Enomoto et al., 2011, PMID: 21146155; Cho et al., 2015, PMID: 25754826). In the present task design, the initial training is a form of differential learning between proximal and distal cues, and the conflict training is akin to reversal learning. Therefore, the lack of lesion effects is somewhat expected. It would be interesting to test whether mPFC lesions impair set-shifting in their paradigm (e.g., the shock zone initially defined by distal cues and later by proximal cues). If the mPFC lesions do not impair this ability and associated hippocampal place dynamics, it will provide strong support for the authors' local-computation hypothesis.

    1. eLife Assessment

      This manuscript represents a fundamental contribution demonstrating that fentanyl-induced respiratory depression can be reversed with a peripherally-restricted mu opioid receptor antagonist. The paper reports compelling and rigorous physiological, pharmacokinetic, and behavioral evidence supporting this major claim, and furthers mechanistic understanding of how peripheral opioid receptors contribute to respiratory depression. These findings reshape our understanding of opioid-related effects on respiration and have significant therapeutic implications given that medications currently used to reverse opioid overdose (such as naloxone) produce severe aversive and withdrawal effects via actions within the central nervous system.

    2. Reviewer #1 (Public review):

      Summary:

      This paper shows that the synthetic opioid fentanyl induces respiratory depression in rodents. This effect is reversed by the opioid receptor antagonist naloxone, as expected. Unexpectedly, the peripherally restricted opioid receptor antagonist naloxone methiodide also blocks fentanyl-induced respiratory depression.

      Strengths:

      The paper reports compelling physiology data supporting the induction of respiratory distress in fentanyl-treated animals. Evidence suggesting that naloxone methiodide reverses this respiratory depression is compelling. This is further supported by pharmacokinetic data suggesting that naloxone methiodide does not penetrate into the brain, nor is it metabolized into brain-penetrant naloxone.

      Weaknesses:

      A weakness of the study is the fact that the functional significance of opioid-induced changes in neural activity in the nTS (as measured by cFos and GCaMP/photometry) is not established. Does the nTS regulate fentanyl-induced respiratory depression, and are changes in nTS activity induced by naloxone and naloxone methiodide relevant to their ability to reverse respiratory depression?

    3. Reviewer #2 (Public review):

      Summary:

      In this article, Ruyle and colleagues assessed the contribution of central and peripheral mu opioid receptors in mediating fentanyl-induced respiratory depression using both naloxone and naloxone methiodide, which does not cross the blood-brain barrier. Both compounds prevented and reversed fentanyl-induced respiratory depression to a comparable degree. The advantage of peripheral treatments is that they circumvent the withdrawal-like effects of naloxone. Moreover, neurons located in the nucleus of the solitary tract are no longer activated by fentanyl when naloxone methiodide is administered, suggesting that these responses are mediated by peripheral mu opioid receptors. The results delineate a role for peripheral mu opioid receptors in fentanyl-derived respiratory depression and identify a potentially advantageous approach to treating overdoses without inflicting withdrawal on the patients.

      Strengths:

      The strengths of the article include the intravenous delivery of all compounds, which increases the translational value of the article. The authors address both the prevention and reversal of fentanyl-derived respiratory depression. The experimental design and data interpretation are rigorous and appropriate controls were used in the study. Multiple doses were screened in the study and the approaches were multipronged. The authors demonstrated the activation of NTS cells using multiple techniques and the study links peripheral activation of mu opioid receptors to central activation of NTS cells. Both males and females were used in the experiments. The authors demonstrate the peripheral restriction of naloxone methiodide.

      Weaknesses:

      Naloxone is already broadly used to prevent overdoses from opioids so in some respects, the effects reported here are somewhat incremental.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript outlines a series of very exciting and game-changing experiments examining the role of peripheral MORs in OIRD. The authors outline experiments that demonstrate a peripherally restricted MOR antagonist (NLX Methiodide) can rescue fentanyl-induced respiratory depression and this effect coincides with a lack of conditioned place aversion. This approach would be a massive boon to the OUD community, as there are a multitude of clinical reports showing that naloxone rescue post fentanyl over-intoxication is more aversive than the potential loss-of-life to the individuals involved. This important study reframes our understanding of successful overdose rescue with potential for reduced aversive withdrawal effects.

      Strengths:

      Strengths include the plethora of approaches arriving at the same general conclusion, the inclusion of both sexes and the result that a peripheral approach for OIRD rescue may side-step severe negative withdrawal symptoms of traditional NLX rescue.

      Weaknesses:

      The major weakness of this version relates to how the data analysis assessed sex-specific contributors to the results.

    1. eLife Assessment

      Aging reduces tissue regeneration capacity, posing challenges for an aging population. In this fundamental study, Reeves et al. show that by combining Wnt-mediated osteoprogenitor expansion (using a special bandage) with intermittent fasting, bone healing can be restored in aged animals. By employing rigorous histological, transcriptomic, and imaging analyses in a clinically relevant model, the authors provide compelling evidence supporting the conclusions. The therapeutic approach presented in this study shows promise for rejuvenating tissue repair, not only in bones but potentially across other tissues.

    2. Reviewer #1 (Public review):

      Summary:

      Aging reduces tissue regeneration capacity, posing challenges for an aging population. In this study, the authors investigate impaired bone healing in aging, focusing on calvarial bones, and introduce a two-part rejuvenation strategy. Aging depletes osteoprogenitor cells and reduces their function, which hinders bone repair. Simply increasing the number of these cells does not restore their regenerative capacity in aged mice, highlighting intrinsic cellular deficits. The authors' strategy combines Wnt-mediated osteoprogenitor expansion with intermittent fasting, which remarkably restores bone healing. Intermittent fasting enhances osteoprogenitor function by targeting NAD+ pathways and gut microbiota, addressing mitochondrial dysfunction - an essential factor in aging. This approach shows promise for rejuvenating tissue repair, not only in bones but potentially across other tissues.

      Strengths:

      This study is exciting, impressive, and novel. The data presented is robust and supports the findings well.

      Weaknesses:

      As mentioned above the data is robust and supports the findings well. I have minor comments only.

    3. Reviewer #2 (Public review):

      Summary:

      Reeves et al explore a model of bone healing in the context of aging. They show that intermittent fasting can improve bone healing, even in aged animals. Their study combines a 'bone bandage' which delivers a canonical Wnt signal with intermittent fasting and shows impacts on the CD90 progenitor cell population and the healing of a critical-sized defect in the calvarium. They also explore potential regulators of this process and identify mitochondrial dysfunction in the age-related decline of stem cells. In this context, by modulating NAD+ pathways or the gut microbiota, they can also enhance healing, hinting at an effect mediated by complex impacts on multiple pathways associated with cellular metabolism.

      Strengths:

      The study shows a remarkable finding: that age-related decreases in bone healing can be restored by intermittent fasting. There is ample evidence that intermittent fasting can delay aging, but here the authors provide evidence that in an already-aged animal, intermittent fasting can restore healing to levels seen in younger animals. This is an important finding as it may hint at the potential benefits of intermittent fasting in tissue repair.

      Weaknesses:

      The authors explore potential mechanisms by which the intermittent fasting protocol might impact bone healing. However, they do not identify a magic bullet here that controls this effect. Indeed, the fact that their results with intermittent fasting can be replicated by changing the gut microbiota or modulating fundamental pathways associated with NAD, suggests that there is no single mechanism that drives this effect, but rather an overall complex impact on metabolic processes, which may be very difficult to untangle.

    4. Reviewer #3 (Public review):

      Summary:

      This study aims to address the significant challenge of age-related decline in bone healing by developing a dual therapeutic strategy that rejuvenates osteogenic function in aged calvarial bone tissue. Specifically, the authors investigate the efficacy of combining local Wnt3a-mediated osteoprogenitor stimulation with systemic intermittent fasting (IF) to restore bone repair capacity in aged mice. The highlights are:

      (1) Novel Approach with Aged Models: This pioneering study is among the first to demonstrate the rejuvenation of osteoblasts in significantly aged animals through intermittent fasting, showcasing a new avenue for regenerative therapies.

      (2) Rejuvenation Potential in Aged Tissues: The findings reveal that even aged tissues retain the capacity for rejuvenation, highlighting the potential for targeted interventions to restore youthful cellular function.

      (3) Enhanced Vascular Health: The study also shows that vascular structure and function can be significantly improved in aged tissues, further supporting tissue regeneration and overall health.

      Through this innovative approach, the authors seek to overcome intrinsic cellular deficits and environmental changes within aged osteogenic compartments, ultimately achieving bone healing levels comparable to those seen in young mice.

      Strengths:

      The study is a strong example of translational research, employing robust methodologies across molecular, cellular, and tissue-level analyses. The authors leverage a clinically relevant, immunocompetent mouse model and apply advanced histological, transcriptomic, and functional assays to characterise age-related changes in bone structure and function. Major strengths include the use of single-cell RNA sequencing (scRNA-seq) to profile osteoprogenitor populations within the calvarial periosteum and suture mesenchyme, as well as quantitative assessments of mitochondrial health, vascular density, and osteogenic function. Another important point is the use of very old animals (up to 88 weeks, almost 2 years), modelling human bone aging, which usually starts after 65 years of age. This comprehensive approach enables the authors to identify critical age-related deficits in osteoprogenitor number, function, and microenvironment, thereby justifying the combined Wnt3a and IF intervention.

      Weaknesses:

      One limitation is the use of female subjects only and the limited exploration of immune cell involvement in bone healing. Given the known role of the immune system in tissue repair, future studies including a deeper examination of immune cell dynamics within aged osteogenic compartments could provide further insights into the mechanisms of action of IF.

    1. eLife Assessment

      The findings of this study are valuable, as they address a critical methodological gap in decision-making research by demonstrating how heuristic strategies can confound interpretations of uncertainty-driven behaviour and provide a clearer framework for distinguishing between uncertainty-seeking and heuristic-driven exploration. While the evidence is solid, with strong methodological rigour in task design and computational modelling, some claims, such as the stability of uncertainty parameters and correlations with psychopathology measures, require refinement. Overall, the data broadly support the study's claims, but interpretational ambiguities limit the impact of certain findings.

    2. Reviewer #1 (Public review):

      Summary:

      The study investigates how uncertainty and heuristic strategies influence reward-based decision-making, using a novel two-armed bandit task combined with computational modeling. It aims to disentangle uncertainty-driven behavior from heuristic strategies such as repetition bias and win-stay-lose-shift tendencies, while also exploring individual differences in these processes.

      Strengths:

      The paper is methodologically sound, and the inclusion of subjective reports enhances the validity of the model testing. The findings on the use of heuristics under specific uncertainty conditions are particularly intriguing.

      Weaknesses:

      (1) Unclear how the findings significantly diverge from previous work:

      At the start of the introduction, the authors propose a working hypothesis of "heterogeneity in the uncertainty effects." However, this concept is already well-established in the field. Foundational work by Yu and Dayan (2005) and more recent studies by Gershman and colleagues on total and relative uncertainty have provided substantial evidence supporting this idea. Additionally, the notion that such heterogeneity could explain mixed findings has been discussed in studies like Wilson (2014). What specific problem are the authors addressing here, and how does their work significantly differ from previous research?

      Later on, however, it seems that the authors' hypothesis is to test the role of multiple factors in driving participants' decisions in the context considered by the authors. First, why is it important to solve such a puzzle? Second, this too has been investigated previously; see, for example, Dubois (2022), eLife. Therefore, what novel things is this paper bringing to the table? I do see that the task is novel - mostly combining different experimental strategies previously adopted - and that the model includes both heuristics and uncertainty-based strategies, which can account for their shared variance ... but are the authors really answering a novel question? Also, it is not very clear which question the authors are answering (see point (3) below).

      (2) The sample size appears to be quite small, and the results would be more convincing if supported by a replication study.

      (3) The results section can be somewhat unclear at times, as it introduces novel aspects (e.g., the fMRI session) or questions that were not previously explained within the framework outlined in the introduction. While the findings related to psychopathology are interesting, their relevance to the research question posed in the introduction is not immediately clear. If these findings have significant added value, it would be helpful for the authors to highlight this earlier in the manuscript. Similarly, the results on individual differences in uncertainty (Section 3.6), though intriguing, appear tangential to the primary research question regarding the role of multiple factors in driving participants' decisions. Overall, it would strengthen the manuscript to clarify the main research question and ensure the results are more directly aligned with it.

    3. Reviewer #2 (Public review):

      Summary:

      This paper addresses mixed findings regarding levels of uncertainty-seeking/avoidance in past reinforcement learning studies. Using computational modelling and a novel variant of a bandit task performed across two sessions, the authors investigate the extent to which uncertainty-driven behaviour can be distinguished from heuristic-like behaviours (e.g., repetition, win-stay/lose-switch). They demonstrate that heuristics account for a significant and stable portion of the variance in choice behaviour, which might otherwise be misattributed to uncertainty-driven parameters. Additionally, they find that relative uncertainty explains additional variance and provides some evidence of stability across sessions.

      Strengths:

      The task is well-designed to tease apart multiple different factors contributing to choice during a bandit task, including separating those tied to uncertainty per se versus other policies. They validate a Bayesian model to account for learning and choice behaviour, as well as subjective estimates of learned value and confidence in these values. The work employs comprehensive model comparison to characterise behaviour in this task, and points to important risks within research on uncertainty preferences using bandit-like tasks when failing to fully account for heuristic-like drivers of such behaviour.

      Weaknesses:

      Part of this work seeks to relate individual differences in various choice parameters across sessions and to relate those to self-report scales. The estimates of cross-session reliability are valuable, particularly when comparing across the different parameters (e.g., heuristic ones being most robust), but the uncertainty-related parameters are interpreted too liberally (i.e., as being stable across sessions when both were weak and one was not significant). Moreover, the correlations with external scales are very hard to interpret given the number of comparisons that were run without correction. The findings overall will have value to people interested in modelling uncertainty preferences in learning tasks -- some of whom have considered heuristic factors less than others -- but perhaps be of more moderate impact beyond this group.

    4. Reviewer #3 (Public review):

      Summary:

      This work investigated how uncertainty, repetition bias, and win-stay-lose-shift processes influence reward-based decision-making. Using a modified two-armed bandit task and computational models, the authors provide evidence for individual variation in the integration of uncertainty on choice behaviour that remains somewhat stable across two experiment sessions. The authors also find a number of interesting results due to their ability to disentangle components of this decision-making process using their novel task and models. Specifically, they find that higher total uncertainty leads people to use more heuristic-based strategies like making repetitive choices or engaging in win-stay-lose-shift behaviour. However, they also find that there are individual differences in how people use uncertainty to guide their choices, and that these differences are consistent within individuals across multiple experiment sessions. This finding can help explain prior inconsistencies in the literature, where researchers have found evidence for both uncertainty-seeking and uncertainty-avoidance tendencies. Overall, this research adds to our understanding of the mechanisms of uncertainty-modulated learning and decision-making.

      Strengths:

      One of the primary strengths of this research is that it helps provide support for the idea that mixed and null results in the prior literature could be due to individual differences in uncertainty preferences and that this individual variation is somewhat stable within subjects across multiple experiment sessions. The authors cleverly disentangle expected reward and uncertainty by interleaving free and forced choice trials in their behavioural task, illuminating the novel impact of reward and uncertainty on this particular decision process. However, it should be noted that this behavioural decorrelation does not persist beyond the first few trials after a forced choice period, so whether or not the decorrelation is truly robust remains unclear.

      The authors also use computational modelling to further probe the influence of uncertainty on reward-based choices. Specifically, they compare a Bayesian ideal observer learning model and a variation on a standard Rescorla-Wagner model, finding that a version of the Bayesian model fits the participants' behaviour best. The model descriptions and analyses are clearly explained and methodologically rigorous.

      Interestingly, the authors find that both repetition bias and model parameters that capture a win-stay-lose-shift strategy (signed and unsigned previous prediction error) significantly improve their model fits. They also make an important point that if win-stay-lose-shift behaviour is not controlled for, then switch behaviour (for example, switching to a lower expected reward option after receiving a large loss) may appear to be uncertainty-seeking when it is not. This idea speaks to a larger point that future research should be careful to not conflate "exploration" with "uncertainty-seeking."

      Weaknesses:

      This research has some weaknesses regarding the correlations between the psychopathology measures and the computational model parameters. First, the choice of self-report measures is not well supported by any specific hypotheses. Relatedly, the authors do not include sufficient rationale for their choice to include only results from the anxiety and impulsivity measures in the main text while leaving out significant findings for a number of correlations between other measures and parameter coefficients. It is also not clear how the model parameters are being derived for use in each of these correlational analyses. In sum, the manuscript as-is contains inconsistent and/or confusing reporting of correlation results that require further clarification.

    1. eLife Assessment

      This valuable study investigates the mechanisms that contribute to nerve-injury-induced allodynia by studying the role of the estrogen receptor GPR30 in a population of CCK+ neurons in the dorsal horn of the spinal cord that receive direct inputs from primary somatosensory cortex and modulate nociceptive sensitivity. The authors provide convincing evidence, using a variety of complementary approaches ranging from the cellular to the physiological level; however, conclusions that descending corticospinal projections modulate nociceptive behaviors through GPR30 are incompletely supported. With some additional analyses, the findings will be better positioned within the context of spinal circuitry literature.

    2. Reviewer #1 (Public review):

      In this manuscript, Chen et al. investigate the role of the membrane estrogen receptor GPR30 in spinal mechanisms of neuropathic pain. Using a wide variety of techniques, they first provide convincing evidence that GPR30 expression is restricted to neurons within the spinal cord, and that GPR30 neurons are well-positioned to receive descending input from the primary sensory cortex (S1). In addition, the authors put their findings in the context of the previous knowledge in the field, presenting evidence demonstrating that GPR30 is expressed in the majority of CCK-expressing spinal neurons. Overall, this manuscript furthers our understanding of the neural circuitry that underlies neuropathic pain and will be of broad interest to neuroscientists, especially those interested in somatosensation. Nevertheless, the manuscript would be strengthened by additional analyses and clarification of data that is currently presented.

      Strengths:

      The authors present convincing evidence for the expression of GPR30 in the spinal cord that is specific to spinal neurons. Similarly, complementary approaches including pharmacological inhibition and knockdown of GPR30 are used to demonstrate the role of the receptor in driving nerve injury-induced pain in rodent models.

      Weaknesses:

      Although steps were taken to put their data into the broader context of what is already known about the spinal circuitry of pain, more considerations and analyses would help the authors better achieve their goal. For instance, to determine whether GPR30 is expressed in excitatory or inhibitory neurons, more selective markers for these subtypes should be used over CamK2. Moreover, quantitative analysis of the extent of overlap between GPR30+ and CCK+ spinal neurons is needed to understand the potential heterogeneity of the GPR30 spinal neuron population, and to interpret experiments characterizing descending S1 inputs onto GPR30 and CCK spinal neurons. Filling these gaps in knowledge would make their findings more solid.

    3. Reviewer #2 (Public review):

      Using a variety of experimental manipulations, the authors show that the membrane estrogen receptor G protein-coupled estrogen receptor (GPER/GPR30) expressed in CCK+ excitatory spinal interneurons plays a major role in the pain symptoms observed in the chronic constriction injury (CCI) model of neuropathic pain. Intrathecal application of the selective GPR30 agonist G-1 induced mechanical allodynia and thermal hyperalgesia in male and female mice. Downregulation of GPR30 in CCK+ interneurons prevented the development of mechanical and thermal hypersensitivity during CCI. They also show the upregulation of AMPA receptor expression by GPR30.

      Generally, the conclusions are supported by the experimental results. I also would like to see significant improvements in the writing and the description of results.

      Methodological details for some of the techniques are rather sparse. For example, when examining the co-localization of various markers, the authors do not indicate the number of animals/sections examined. Similarly, when examining the effect of shGper1, it is unclear how many cells/sections/animals were counted and analyzed.

      In other sections, there is no description of the concentration of drugs used (for example, Figure 4H). In Figures 4C-E, there is no indication of the duration of the recordings, the ionic conditions, the effect of glutamate receptor blockers, etc.

      Some results appear anecdotal in the way they are described. For example, in Figure 5, it is unclear how many times this experiment was repeated.

    4. Reviewer #3 (Public review):

      Summary:

      The authors convincingly demonstrate that a population of CCK+ spinal neurons in the deep dorsal horn express the G protein-coupled estrogen receptor GPR30 to modulate pain sensitivity in the chronic constriction injury (CCI) model of neuropathic pain in mice. Using complementary pharmacological and genetic knockdown experiments they convincingly show that GPR30 inhibition or knockdown reverses mechanical, tactile, and thermal hypersensitivity, conditioned place aversion, and c-fos staining in the spinal dorsal horn after CCI. They propose that GPR30 mediates an increase in postsynaptic AMPA receptors after CCI using slice electrophysiology which may underlie the increased behavioral sensitivity. They then use anterograde tracing approaches to show that CCK and GPR30 positive neurons in the deep dorsal horn may receive direct connections from the primary somatosensory cortex. Chemogenetic activation of these dorsal horn neurons proposed to be connected to S1 increased nociceptive sensitivity in a GPR30-dependent manner. Overall, the data are very convincing and the experiments are well conducted and adequately controlled. However, the proposed model of descending corticospinal facilitation of nociceptive sensitivity through GPR30 in a population of CCK+ neurons in the dorsal horn is not fully supported.

      Strengths:

      The experiments are very well executed and adequately controlled throughout the manuscript. The data are nicely presented and supportive of a role for GPR30 signaling in the spinal dorsal horn influencing nociceptive sensitivity following CCI. The authors also did an excellent job of using complementary approaches to rigorously test their hypothesis.

      Weaknesses:

      The primary weakness in this manuscript involves overextending the interpretations of the data to propose a direct link between corticospinal projections signaling through GPR30 on this CCK+ population of spinal dorsal horn neurons. For example, even in the cropped images presented, GPR30 is present in many other CCK-negative neurons. Only about a quarter of the cells labeled by the anterograde viral tracing experiment from S1 are CCK+. Since no direct evidence is provided for S1 signaling through GPR30, this conclusion should be revised.

    1. eLife Assessment

      This valuable study by Xu and colleagues investigates brainstem circuits mediating evoked respiratory reflexes that they define as cough-like in a freely behaving mouse model. They have applied multiple circuit mapping and manipulation approaches to suggest that the caudal spinal trigeminal nucleus (SP5C) can play a novel role in generating a reflex cough-like behavior in mice. The authors give incomplete evidence that the reflex behavior produced in their mouse model is definitively cough, limiting functional interpretation of the putative circuit identified and requiring more thorough experimental interrogation of the behavior studied.

    2. Reviewer #1 (Public review):

      Summary:

      Xu and colleagues provide a useful investigation of brainstem circuits involved in evoked respiratory reflexes that they define to be cough or cough-like in nature. The study is conducted in mice, which has the benefit of allowing for the use of modern transgenic tools, although many of the experiments end up using viral vector-based approaches that could be deployed in any species. The disadvantage of the mouse model is the difficulty of establishing the true identity of the respiratory event that is defined as cough. This limitation requires careful interrogation in order to understand the biology of the circuit under investigation. In this respect, the authors provide an incomplete description of a putative cough pathway linking the caudal spinal trigeminal nucleus with the ventral respiratory group. Neurons assigned as CaMKII+ with putative inputs from the paratrigeminal nucleus are central to this circuit, although the evidence for each of these claims is relatively weak or non-existent. Overall, the study employs interesting methods, but limitations in the methods and in the methodological details provided limit interpretation of the study outcomes.

      Strengths:

      The use of modern methods to investigate brainstem circuits involved in an essential respiratory reflex.

      Weaknesses:

      (1) The most significant issue that needs careful consideration is the exact respiratory response, which is called a cough. The authors show a trace from their plethysmography recordings and superimpose the 3 phases of cough (inspiration, compression, expiration) with confidence, yet the parameters used to delineate these phases are unclear. Of more concern, an identical respiratory trace was reported recently as a sneeze in Jiang et al Cell 2024 (PMID 39243765). Comparing Figure 1 in the Xu study with Figure 5 in the Jiang study, it is impossible to see any difference in the respiratory trace that would allow the assignment of one as cough and the other as sneeze. The audio signals also look remarkably similar and the purported cough signal in the Jiang study is quite different. Gannot et al Nat Neurosci 2024 (PMID 38977887) seems to agree with Xu in the identity of a cough signal, but Li et al Cell 2021 (PMID 34133943) again labels these as sneezes. One of the older studies that tried to classify respiratory signals in mice (Chen et al PLoS ONE 2013) labeled the Jiang cough trace as a deep inspiration, while sneeze looks different again. To add further confusion, Zhang et al AJP 2017 (PMID 28228416) provide yet another respiratory plethysmography trace that they define as a cough, and label responses discussed above as expiration reflexes. This raises the question - who, if anyone, is correct? Interpreting the circuits underlying these peculiar mouse responses depends on accuracy in defining the response in the first instance.

      (2) The involvement of the caudal nSp5 in cough is an unexpected finding. Some understanding of if and how vagal afferent inputs reach this location would help strengthen the manuscript. The authors claim in the discussion that the nucleus of the solitary tract is not the source of inputs, but rather they may arise from the paratrigeminal nucleus (although no data is presented to support this claim). This could fit with the known jugular vagal afferent pathway, which is embryologically distinct and terminates in trigeminal regions, rather than the NTS. But if this is correct, what does this finding then say about the purported involvement of NTS neurons in cough in mice, for example, the recent study by Gannot et al Nat Neurosci where Tac1-expressing NTS neurons were integral for what they call cough in mice? Xu and colleagues are encouraged to resolve their input circuitry so that we can better understand the pathway under investigation and how it relates to the NTS pathway. Related to this, and the issues differentiating cough-like responses from sneeze, the authors will need to consider how to differentiate their cough-like circuitry from the sneeze pathway from the caudal nSp5 to the cVRG as reported by Li et al Cell 2021. It seems highly possible that the two groups are studying the same circuitry, yet the interpretation is confounded by an inability to agree on the identity of the evoked response.

      (3) Injection volumes and titres for AAV transductions are not stated anywhere. The methods (line 484) indicate that different volumes were used for different purposes, but nowhere is this information stated properly. Looking at representative images suggests that volumes were very large, with most of the brainstem often transduced. As single slices are only ever shown it becomes a concern as to how extensive transductions truly are. The authors need to provide complete maps of viral transduction so that readers can understand exactly what regions could contribute to responses, thereby confounding interpretation.

      (4) The authors do not provide any data to explore the impacts of manipulations on basal breathing. This is important as impacts on the respiratory patterning will likely have profound effects on evoked responses that are not related to the specific pathway under investigation. For example, in Figure 2b, breathing looks to be severely compromised in the TKO animals and disrupted in the M4 DREADD animals. Figure 3 also shows the effects of optical stimulation on breathing patterns, which appear like apnea with several breakthrough augmented breaths (some labeled as cough?), although this is hard to see properly in the traces provided. In Figure 5, one would expect VRG inhibition to have impacts on breathing, and the traces supplied appear to suggest this is the case. Please include data showing breathing effects and consider how these may confound your study interpretation.

    3. Reviewer #2 (Public review):

      Summary:

      This study employs a combination of state-of-the-art experimental approaches in mice to identify components of the brainstem circuits involved in the cough reflex in a freely behaving mouse model. The cough reflex is an important respiratory airway defense mechanism, and there has been longstanding interest in defining the neural circuits involved in the mammalian brainstem. Consistent with other recent studies, the present results provide multiple lines of evidence indicating that mice are a suitable model for studying neural mechanisms generating cough behavior. The main novel finding of this study is the authors' results indicating that the caudal spinal trigeminal nucleus (SP5C) plays a role in generating cough-like behaviors in response to inhaled tussigen. The supporting data presented for this role includes the authors' findings that: (1) neural activity in the SP5C is strongly correlated with tussigen-evoked cough-like behaviors, (2) impairing synaptic outputs or chemogenetic inhibition of SP5C neurons effectively abolished these cough-like reflexes, (3) optogenetically activating a specific subpopulation of excitatory neurons in the SP5C triggers cough-like behaviors, (4) SP5C neurons project monosynaptically to ventral medullary regions containing respiratory circuits that exhibit cough-related neural activity, and (5) specific activation of the SP5C-ventral respiratory circuitry induces robust cough-like behavior without tussive stimuli. This study will be valuable to respiratory neurobiologists studying mechanosensory control of breathing in mammals.

      Strengths:

      (1) The authors developed an experimental paradigm in mice that combines whole-body plethysmography (WBP), audio, and video tracking to assess breathing and putative cough-like behaviors in conscious animals.

      (2) The mouse model enables optogenetic, chemogenetic, virus-based circuit tracing and manipulation, and in vivo fiber photometry to analyze neural activity and define circuitry in the medulla producing cough-like behavior.

      (3) Multiple lines of evidence from these experimental approaches support the conclusion that the SP5C nucleus plays a role in the respiratory reflex behaviors studied in mice, but there is uncertainty that these behaviors are definitively cough.

      Weaknesses:

      (1) This paper lacks essential quantitative details: the number of animals studied is not stated explicitly for many of the experimental paradigms and statistical analyses presented, and the information needed to verify replication of the neuroanatomical data is missing.

      (2) The authors' evidence is incomplete that the reflex behavior produced in their mouse model is definitively cough, limiting functional interpretation of the putative circuit identified and requiring more thorough experimental interrogation of the behavior studied.

      (3) The medullary circuit described conveys afferent sensorimotor signals to downstream respiratory circuits to coordinate cough-like motor behavior, but how the circuits that typically mediate the cough reflex, which involve airway-related vagal sensory neurons, operate in conjunction or parallel with the SP5C circuit described has not been determined, which is a significant gap in understanding how the present results fit into the neural control of the cough reflex.

    4. Reviewer #3 (Public review):

      Summary:

      The authors have submitted a comprehensive manuscript on the production and central pathways that they propose mediate cough-like behaviors in a TRAP2 transgenic mouse model. While the central pathway data are good, there is significant uncertainty regarding the identity of the presumptive cough-like behavior that has been produced in their model which reduces enthusiasm for the manuscript.

      Strengths:

      (1) The use of the TRAP2 model in the investigation of coughing is strong.

      (2) The implication of SP5 in the production of coughing in response to ammonia inhalation is a novel finding. Further, the finding that this area can be activated by AAV-CaMKII to induce coughing in the absence of coincident afferent activation is an important observation.

      Weaknesses:

      (1) A fundamental aspect of this investigation is the unequivocal identification of the behavior that has been evoked. In this case, the authors have not established that they are actually studying cough. The evidence that they present (especially Figure 1 - Supplement 1) clearly shows that the citric acid (2nd example), capsaicin (2nd example), and ammonia (2nd example) box flows lack a large inspiratory component, which is a requirement of cough. The referenced behaviors appear to be expulsion/inspiration, which is not cough. The only way these behaviors could be cough is if the conventional polarity of presentation of the flow signals is reversed. However, inspection of the flows during breathing strongly indicates that inspiration is down in your records. Again, this makes these behaviors expulsion/inspiration.

      An additional issue is that there are compression phases marked when the flow is occurring. The compression phase is a period of no flow so this is not accurate. There also is no evidence that the mouse has a compression phase at all. In cough flow records in humans, the compression phase can clearly be seen when it happens but not all coughs have one. You must show that a compression phase happens according to the actual description of what this segment of cough actually is.

      It may be that you are evoking behaviors that primarily occur in the mouse. As such, they would be novel airway protective behaviors that are worthy of description and study. Ironically, another manuscript in the journal Cell (Jiang et al, 2024, Cell 187:5981-5997) shows similar box flow polarities as your own and clear cough airflows (Fig. 5B). However, they also show other airflow patterns that resemble what you call cough (Figure 5A) but they call them sneeze. Those airflows are expulsion/inspiration and are clearly not sneezing, as the expulsion in this behavior also is preceded, not followed, by inspiration.

      The definitive manuscript on cough in the mouse is Zhang et al Am J Physiol Reg Integr Comp 312:R718-R726, 2017. In this manuscript, Figure 2 clearly shows both box pressures and intrapleural pressures during airway protective behaviors in the awake mouse. Note that both cough and a behavior known as expiration reflex were recorded. The key element here is that the cough elicited a tri-phasic box flow. The last excursion was associated with a sound. When compared to the pressure it is clear that this last flow excursion is mechanical chest wall recoil from residual volume. The fact that this segment of the flow record was associated with sound strongly suggests that the vocal folds were adducting at the time to "brake" the chest wall recoil. In other words, the airway resistance went up to slow inspiratory airflow as the chest returned to its resting position. As such, this observation suggests that the chest wall mechanics of cough in the mouse are different than that of larger animals.

      (2) Roger Shannon and coworkers have published a number of papers on the detailed brainstem circuits that are responsible for coughing. I recommend that the authors assimilate this knowledge in the context of their results.

    1. eLife Assessment

      The authors present a biologically plausible framework for action selection and learning in the striatum that is a fundamental advance in our understanding of possible neural implementations of reinforcement learning in the basal ganglia. They provide compelling evidence that their model can reconcile realistic neural plasticity rules with the distinct functional roles of the direct and indirect spiny projection neurons of the striatum, recapitulating experimental findings regarding the activity profiles of these distinct neural populations and explaining a key aspect of striatal function.

    2. Reviewer #1 (Public review):

      Summary:

      The authors propose a new model of biologically realistic reinforcement learning in the direct and indirect pathway spiny projection neurons in the striatum. These pathways are widely considered to provide a neural substrate for reinforcement learning in the brain. However, we do not yet have a full understanding of the mechanistic learning rules that would allow successful reinforcement-learning-like computations in these circuits. The authors outline some key limitations of current models and propose an interesting solution by leveraging learning with efferent inputs of selected actions. They show that the model simulations are able to recapitulate experimental findings about the activity profile of these populations in mice during spontaneous behavior. They also show how their model is able to implement off-policy reinforcement learning.

      Strengths:

      The manuscript is very clearly written and the results are presented in a readily digestible manner. The limitations of existing models, which motivate the presented work, are clearly laid out and the proposed solution seems very interesting. The novel contribution of the proposed model is the idea that different patterns of activity drive current action selection and learning. Not only does this allow the model to implement reinforcement learning computations well, but this suggestion may also have interesting implications regarding why some processes selectively affect ongoing behavior and others affect learning. The model is able to recapitulate some interesting experimental findings about various activity characteristics of dSPN and iSPN pathway neuronal populations in spontaneously behaving mice. The authors also show that their proposed model can implement off-policy reinforcement learning algorithms with biologically realistic learning rules. This is interesting since off-policy learning provides some unique computational benefits and it is very likely that learning in neural circuits may, at least to some extent, implement such computations.

      Weaknesses:

      A weakness in this work is that it isn't clear how a key component in the model - an efferent copy of selected actions - would be accessible to these striatal populations. The authors propose several plausible candidates, but future work may clarify the feasibility of this proposal.

    3. Reviewer #2 (Public review):

      Summary:

      The basal ganglia is often understood within a reinforcement learning (RL) framework, where dopamine neurons convey a reward prediction error that modulates cortico-striatal connections onto spiny projection neurons (SPNs) in the striatum. However, current models of plasticity rules are inconsistent with learning in a reinforcement learning framework.

      This paper proposes a new model that describes how distinct learning rules in direct and indirect pathway striatal neurons allow them to implement reinforcement learning models. It proposes that two distinct components of striatal activity affect action selection and learning. The authors show that the proposed implementation allows learning in simple tasks and is consistent with calcium imaging data from direct and indirect SPNs in freely moving mice.

      Strengths:

      Despite the success of reward prediction errors at characterizing the responses of dopamine neurons as the temporal difference error within an RL framework, the implementation of RL algorithms in the rest of the basal ganglia has been unclear. A key missing piece has been an RL implementation that is consistent with the distinction between direct and indirect SPNs. This paper proposes a new model that is able to learn successfully in simple RL tasks and explains recent experimental results.

      The authors show that their proposed model, unlike previous implementations, can perform well in RL tasks. The new model allows them to make experimental predictions. They test some of these predictions and show that the dynamics of dSPNs and iSPNs correspond to model predictions.

      More generally, this new model can be used to understand striatal dynamics across direct and indirect SPNs in future experiments.

      Weaknesses:

      The authors could better characterize the reliability of their experimental predictions and better describe the parameters of some of the simulations.

      The authors propose some ideas about how the specificity of the striatal efferent inputs could arise, but they should highlight more clearly that this is a key feature of the model whose anatomical implementation has yet to be resolved.

    4. Reviewer #3 (Public review):

      Summary:

      This paper points out an inconsistency between the role of the striatal spiny neurons projecting to the indirect pathway (iSPNs) and the synaptic plasticity rule of those neurons, which express dopamine D2 receptors. It proposes a novel, intriguing mechanism in which iSPNs are activated by the efference copy of the chosen action that they are supposed to inhibit.

      The proposed model was supported by simulations and analysis of the neural recording data during spontaneous behaviors.

      Strengths:

      Previous models suggested that the striatal neurons learn action-value functions, but how the information about the chosen action is fed back to the striatum for learning was not clear. The authors point out that this is a fundamental problem for iSPNs, which are supposed to inhibit specific actions and whose synaptic inputs are potentiated by dopamine dips.

      The authors propose a novel hypothesis that iSPNs are activated by efference copy of the selected action which they are supposed to inhibit during action selection. Even though intriguing and seemingly unnatural, the authors demonstrated that the model based on the hypothesis can circumvent the problem of iSPNs learning to disinhibit the actions associated with negative reward errors. They further showed by analyzing the cell-type specific neural recording data by Markowitz et al. (2018) that iSPN activities tend to be anti-correlated before and after action selection.

      Weaknesses:

      (1) It is not correct to call action value learning that uses an externally selected action "off-policy." Both the off-policy algorithm Q-learning and the on-policy algorithm SARSA update the action value of the chosen action, which can differ from the greedy action implied by the present action values. In standard reinforcement learning terminology, the on-policy/off-policy distinction concerns the action in the subsequent state: whether the update uses the next action value of the action that is (to be) chosen, or that of the greedy choice as in equation (7).

      It is worth noting that the following paper suggested that dopamine neurons encode on-policy TD errors: Morris G, Nevet A, Arkadir D, Vaadia E, Bergman H (2006). Midbrain dopamine neurons encode decisions for future action. Nat Neurosci, 9, 1057-63. https://doi.org/10.1038/nn1743

      (2) It is also confusing to contrast TD learning and Q-learning, as the latter is considered one type of TD learning. The TD error signal based on the state value function (6) depends on the chosen action a_{t-1} implicitly, through r_t and s_t, via the reward and state transition functions.

      (3) It is not clear why interference between the activities for action selection and learning can be avoided, especially when actions are taken at short intervals or even overlap in time. How can the efference copy activation for the previous action be dissociated from the sensory-cued activation for the next action selection?

      (4) Although it may be difficult to single out the neural pathway that carries the efference copy signal to the striatum, it would be desirable to consider its requirements and the different possibilities. A major issue is that the time delay from actions to reward feedback can be highly variable.

      An interesting candidate is the long-latency neurons in the CM thalamus projecting to striatal cholinergic interneurons, which are activated following low-reward actions: Minamimoto T, Hori Y, Kimura M (2005). Complementary process to response bias in the centromedian nucleus of the thalamus. Science, 308, 1798-801. https://doi.org/10.1126/science.1109154

      (5) In the paragraph before Eq. (3), Eq. (1) should be Eq. (2) for the iSPN.
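      The terminology raised in points (1) and (2) can be illustrated with a minimal tabular sketch. The function names, the learning rate `alpha`, the discount factor `gamma`, and the toy tables below are illustrative assumptions and are not taken from the manuscript; the forms are the standard textbook ones and may differ in indexing from the manuscript's Eqs. (6) and (7). The sketch shows that SARSA and Q-learning both update the value of the action actually taken and differ only in the bootstrap term for the next state, and that the state-value TD error depends on the chosen action only through the reward and the next state.

```python
def td_error_state_value(V, s_prev, r, s_next, gamma=0.9):
    """TD error from a state-value table V.

    The chosen action does not appear explicitly; it enters only through
    the reward r and the next state s_next that it produced.
    """
    return r + gamma * V[s_next] - V[s_prev]


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    td_error = r + gamma * Q[s_next][a_next] - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    td_error = r + gamma * max(Q[s_next]) - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


def fresh_q():
    # Hypothetical 2-state, 2-action table: rows are states, columns are actions.
    return [[0.0, 0.0], [1.0, 2.0]]


# Same transition (s=0, a=0, r=1, s_next=1) under both rules. The next action
# chosen by the behavior policy (a_next=0) is not the greedy one (action 1).
Q_sarsa, Q_qlearn = fresh_q(), fresh_q()
delta_sarsa = sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0)
delta_q = q_learning_update(Q_qlearn, s=0, a=0, r=1.0, s_next=1)
delta_v = td_error_state_value([0.0, 1.5], s_prev=0, r=1.0, s_next=1)
print(delta_sarsa, delta_q, delta_v)
```

      In both updates the entry that changes is `Q[s][a]` for the action that was taken, whether or not that action was greedy; only the target differs.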

    1. eLife Assessment

      This work presents a valuable study of early brain development using advanced MRI methods. In particular, the study investigates the relationship between the maturation of diffusion MRI tissue properties and brain folding, and suggests that these tissue changes may precede and guide the emergence of brain folding patterns. The data are solid; however, the evidence supporting the precedence of tissue changes over brain folding appears incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript describes the analysis of fetal MRI and diffusion-weighted images of the fetal brain in utero, which reveals correlations between spatial and temporal patterns in diffusion behavior (associated with tissue microstructure) and the local geometry of the brain surface (describing cortical folding). The authors use advanced imaging and image analysis pipelines, notably high angular resolution multi-shell diffusion imaging (HARDI) and multi-shell, multi-tissue constrained spherical deconvolution (MSMT-CSD) analysis of the resulting data. The key metric of tissue microstructure is "tissue fraction", which describes the relative contribution of organized anisotropic diffusion to overall diffusion, and the key geometry parameter is sulcal depth.

      The major observation is that tissue fraction, which generally increases with gestational age, is lower in sulcal fundi, and importantly that the relative difference in tissue fraction emerges *before* folding occurs. The relatively low values of tissue fraction in regions of incipient sulci may be important to the physical mechanism of cortical folding.

      Strengths:

      Strengths of the manuscript include the application of advanced, highly technical imaging and image analysis methods to extract high-resolution data on both surface geometry and diffusion from a unique fetal cohort. The comparison of local features of surface and microstructure in both age-matched and age-mismatched analyses reveals a clear negative correlation between tissue fraction and sulcal depth.

      Weaknesses:

      The authors could improve the manuscript by (i) expanding their effort to place their current findings in the context of mechanistic models of folding and (ii) explaining more clearly how the diffusion measurements reflect tissue fraction. The relationship between the tissue fraction metric, the diffusion measurements, and the tissue microstructure is quite opaque.

    3. Reviewer #2 (Public review):

      Summary:

      The authors analyze parameters related to anisotropy and gyrification in the developing human brain and describe an increase in tissue fraction (TF) across development. They correlate TF and sulcal depth in the CP and SP across local neighborhoods, describing a negative correlation. Also, they perform age-mismatched correlation of tissue fraction at early stages with sulcal depth at later ones and show correlation inside sulci, which they interpret as indicating the presence of minor structural changes in the brain that precede the development of sulci.

      Strengths:

      The study compiles a large cohort of cases through different developmental ages and performs sophisticated data analysis. Overall, the work is interesting.

      Weaknesses:

      I have some questions. What is the potential meaning of TF? It seems to be an estimator of anisotropy highly related to fractional anisotropy (FA), but it behaves in a complementary manner, increasing along gestation, in sharp contrast with the decrease observed in FA in this study (suppl. fig 3) and by others. Please clarify how it is calculated, what is the potential biological meaning of TF and how it differs from FA.

      The correlations between TF and sulcal depth do not seem to provide much novelty, since, as mentioned by the authors, previous evidence has pointed in that direction. The other concept in the paper relates to detecting structural changes in prospective sulcal areas in the cortex, which the authors analyze through the age-mismatched correlation of TF and subsequent sulcation. However, the results do not show a robust correlation, as detailed below, and do not seem particularly useful, as they require the inclusion of post-hoc information in the model, limiting the strength of the relationship and the predictive value. My main point of criticism is that if TF is a good marker of the structural modifications that will favor the development of sulci later in development, TF should show a map predictive of those sulci (e.g. at GW 25), which, however, is not the case. It is not necessary to correlate with future sulcal depth, as we know exactly where the primary sulci will develop. Conversely, it seems that TF decreases along the gyrification process, and it might just be a measure of the structural changes accompanying it.

      Figure 2 illustrates the increase in TF across GA, but no R values or significance values are provided. Please add them so that the robustness of the correlation can be evaluated.

      In previous work of the authors, the subplate is not clearly distinguished from the subcortical white matter after 31 GW, as it starts to disintegrate (Kostovic et al., 2002; Calixto et al., 2024). However, in this manuscript, the SP is differentiated at those later ages. The methods section describes a 2 mm thick compartment below the cortical plate. However, if that is the case, it seems quite arbitrary (to coincide with the resolution of the diffusion imaging) and risks analyzing a compartment that is no longer present. Please explain the criteria followed for such a distinction and, more importantly, how such a distinction can be reliable considering the low detectability described in previous studies. In this regard, the discussion describes that a rapid increase in TF was only seen in the SP after 30 GW, but maybe this increase reflects the dissipation of the SP and the transformation of that space into subcortical white matter, with a much more expected anisotropy. The authors should review this.

      The analysis describes a negative correlation between tissue fraction and sulcal depth when gyrification proceeds and the authors find that an age-mismatched correlation between tissue fraction in young embryos and sulcal depth in older embryos also shows a negative correlation in future sites of sulcation. However, for the correlation to exist, the tissue fraction in young individuals should already show low values in the prospective sulci, but no clear changes can be seen at GW 25 or 27 in lissencephalic areas that will bear sulci later on, as is the case of the central sulcus at GW 25 or the STS at GW 27, the latter showing very high tissue fraction (instead of the expected low).

      Another question refers to Figures 3b and c. The graphs represent specific neighborhoods in the central sulcus at 27 and 35 GW. It can be argued that those neighborhoods might not be representative of the brain or of the whole sulcus. Please show the graph with all neighborhoods, which will provide more definitive evidence of the existence of the correlation. In this regard, the average graphs represented in Figure 3F seem to show a clear correlation at 27 GW in the subplate, but the correlation seems to fade at later stages (in both SP and CP), with both sulci and gyri exhibiting a negative correlation while other sulcal areas do not exhibit correlation. I think all points should be included in the correlation to better support the hypothesis.

      Figure 4 shows the age-mismatched correlations, but they do not seem convincing, particularly where they should be most useful, at 25 GW. Indeed, as seen in both panels A and C, the central sulcus shows a negative correlation only in a few spots on one hemisphere, while (in C) most of the prospective sulcus shows a positive correlation, contrary to the hypothesis.

      Lastly, the authors performed an age-mismatched correlation between TF at different ages and the sulcal depth at 35 GW, when it is maximal. This maximal depth might be "pushing" the correlation into significant territory. The authors should also provide the correlation with the sulcal depth at other GAs, such as P29, P31, or P33, and analyze how the correlations hold.

    1. eLife Assessment

      This study presents valuable quantitative insights into the prevalence of functionally clustered synaptic inputs on neuronal dendrites. The simple analytical calculations and computer simulations provide solid support for the main arguments. The findings can lead to a more detailed understanding of how dendrites contribute to the computation of neuronal networks.

    2. Joint Public Review:

      Summary:

      If synaptic input is functionally clustered on dendrites, nonlinear integration could increase the computational power of neural networks. But this requires the right synapses to be located in the right places. This paper aims to address the question of whether such synaptic arrangements could arise by chance (i.e. without special rules for axon guidance or structural plasticity), and could therefore be exploited even in randomly connected networks. This is important, particularly for the dendrites and biological computation communities, where there is a pressing need to integrate decades of work at the single-neuron level with contemporary ideas about network function.

      Using an abstract model where ensembles of neurons project randomly to a postsynaptic population, back-of-envelope calculations are presented that predict the probability of finding clustered synapses and spatiotemporal sequences. Using data-constrained parameters, the authors conclude that clustering and sequences are indeed likely to occur by chance (for large enough ensembles), but require strong dendritic nonlinearities and low background noise to be useful.

      Strengths:

      - The back-of-envelope reasoning presented can provide fast and valuable intuition. The authors have also made the effort to connect the model parameters with measured values. Even an approximate understanding of cluster probability can direct theory and experiments towards promising directions, or away from lost causes.

      - I found the general approach to be refreshingly transparent and objective. Assumptions are stated clearly about the model and statistics of different circuits. Along with some positive results, many of the computed cluster probabilities are vanishingly small, and noise is found to be quite detrimental in several cases. This is important to know, and I was happy to see the authors take a balanced look at conditions that help/hinder clustering, rather than just focus on a particular regime that works.

      - This paper is also a timely reminder that synaptic clusters and sequences can exist on multiple spatial and temporal scales. The authors present results pertaining to the standard 'electrical' regime (~50-100 µm, <50 ms), as well as two modes of chemical signaling (~10 µm, 100-1000 ms). The senior author is indeed an authority on the latter, and the simulations in Figure 5, extending those from Bhalla (2017), are unique in this area. In my view, the role of chemical signaling in neural computation is understudied theoretically, but research will be increasingly important as experimental technologies continue to develop.

      (Editors' note: the paper has been through two rounds of revisions and the authors are encouraged to finalise this as the Version of Record. The earlier reviews are here: https://elifesciences.org/reviewed-preprints/100664v2/reviews#tab-content)

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this revision, the authors significantly improved the manuscript. They now address some of my concerns. Specifically, they show the contribution of end-effects on spreading the inputs between dendrites. This analysis reveals greater applicability of their findings to cortical cells, with long, unbranching dendrites than other neuronal types, such as Purkinje cells in the cerebellum.

      They now explain better the interactions between calcium and voltage signals, which I believe improve the take-away message of their manuscript. They modified and added new figures that helped to provide more information about their simulations.

      However, some of my points remain valid. Figure 6 shows depolarization of ~5 mV from -75 mV. This weak depolarization would not effectively recruit nonlinear activation of NMDARs. In their paper, Branco and Hausser (2010) showed depolarizations of ~10-15 mV.

      More importantly, the signature of NMDAR activation is the prolonged plateau potential and activation at more depolarized resting membrane potentials (their Figure 4). Thus, despite including NMDARs in the simulation, the authors do not model functional recruitment of these channels. Their simulation is thus equivalent to AMPA only drive, which can indeed summate somewhat nonlinearly.

      In the current study, we used short sequences of 5 inputs, since the convergence of longer sequences is extremely unlikely in the network configurations we have examined. This resulted in smaller EPSP amplitudes of ~5 mV (Figure 6 - Supplement 2A, B). Longer sequences containing 9 inputs resulted in larger somatic depolarizations of ~10 mV (Figure 6 - Supplement 2E, F). Although we had modified the model of Branco, Clark, and Häusser (2010) to remove the jitter in the timing of arrival of inputs and made slight modifications to the location of stimulus delivery on the dendrite, we saw similar amplitudes when we tested a 9-length sequence using the published code of Branco, Clark, and Häusser (2010) (Figure 6 - Supplement 2I, J). In all the cases we tested (5-input sequence, 9-input sequence, and 9-input sequence with the Branco, Clark, and Häusser (2010) code repository), removal of NMDA synapses lowered both the somatic EPSPs (Figure 6 - Supplement 2C, D, G, H, K, L) and the selectivity (measured as the difference between the EPSPs generated for inward and outward stimulus delivery) (Figure 6 - Supplement 2M, N, O). Further, monitoring the voltage along the dendrite for a sequence of 5 inputs showed dendritic EPSPs in the range of 20-45 mV (Figure 6 - Supplement 2P, Q), which dropped notably (to 10-25 mV) when NMDA synapses were abolished (Figure 6 - Supplement 2R, S). Thus, even sequences containing as few as 5 inputs were capable of engaging the NMDA-mediated nonlinearity to show sequence selectivity, although the selectivity was not as strong as in the case of 9 inputs.

      Reviewer #1 (Recommendations for the authors):

      Minor points:

      Figure 8, what does the scale in A represent? I assume it is voltage, but there are no units. Figure 8, C, E, G, these are unconventional units for synaptic weights, usually, these are given in nS / per input.

      We have corrected these. The scalebar in 8A represents membrane potential in mV. The units of 8C,E,G are now in nS.

      Reviewer #2 (Public Review):

      Summary:

      If synaptic input is functionally clustered on dendrites, nonlinear integration could increase the computational power of neural networks. But this requires the right synapses to be located in the right places. This paper aims to address the question of whether such synaptic arrangements could arise by chance (i.e. without special rules for axon guidance or structural plasticity), and could therefore be exploited even in randomly connected networks. This is important, particularly for the dendrites and biological computation communities, where there is a pressing need to integrate decades of work at the single-neuron level with contemporary ideas about network function.

      Using an abstract model where ensembles of neurons project randomly to a postsynaptic population, back-of-envelope calculations are presented that predict the probability of finding clustered synapses and spatiotemporal sequences. Using data-constrained parameters, the authors conclude that clustering and sequences are indeed likely to occur by chance (for large enough ensembles), but require strong dendritic nonlinearities and low background noise to be useful.

      Strengths:

      (1) The back-of-envelope reasoning presented can provide fast and valuable intuition. The authors have also made the effort to connect the model parameters with measured values. Even an approximate understanding of cluster probability can direct theory and experiments towards promising directions, or away from lost causes.

      (2) I found the general approach to be refreshingly transparent and objective. Assumptions are stated clearly about the model and statistics of different circuits. Along with some positive results, many of the computed cluster probabilities are vanishingly small, and noise is found to be quite detrimental in several cases. This is important to know, and I was happy to see the authors take a balanced look at conditions that help/hinder clustering, rather than to just focus on a particular regime that works.

      (3) This paper is also a timely reminder that synaptic clusters and sequences can exist on multiple spatial and temporal scales. The authors present results pertaining to the standard 'electrical' regime (~50-100 µm, <50 ms), as well as two modes of chemical signaling (~10 µm, 100-1000 ms). The senior author is indeed an authority on the latter, and the simulations in Figure 5, extending those from Bhalla (2017), are unique in this area. In my view, the role of chemical signaling in neural computation is understudied theoretically, but research will be increasingly important as experimental technologies continue to develop.

      Weaknesses:

      (1) The paper is mostly let down by the presentation. In the current form, some patience is needed to grasp the main questions and results, and it is hard to keep track of the many abbreviations and definitions. A paper like this can be impactful, but the writing needs to be crisp, and the logic of the derivation accessible to non-experts. See, for instance, Stepanyants, Hof & Chklovskii (2002) for a relevant example.

      It would be good to see a restructure that communicates the main points clearly and concisely, perhaps leaving other observations to an optional appendix. For the interested but time-pressed reader, I recommend starting with the last paragraph of the introduction, working through the main derivation on page 7, and writing out the full expression with key parameters exposed. Next, look at Table 1 and Figure 2J to see where different circuits and mechanisms fit in this scheme. Beyond this, the sequence derivation on page 15 and biophysical simulations in Figures 5 and 6 are also highlights.

      We appreciate the reviewers' suggestions. We have tightened the flow of the introduction. We understand that the abbreviations and definitions are challenging and have therefore provided intuitions and summaries of the equations discussed in the main text.

      Clusters calculations

      Our approach is to ask how likely it is that a given set of inputs lands on a short segment of dendrite, and then scale it up to all segments on the entire dendritic length of the cell.

      Thus, the probability of occurrence of groups that receive connections from each of the M ensembles (PcFMG) is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative zone-length with respect to the total dendritic arbor (Z/L) and the number of ensembles (M).
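      The dependence on p, N, Z/L and M described above can be sketched numerically. This is a back-of-envelope sketch under stated assumptions, not necessarily the manuscript's exact expression: synapses from each ensemble are assumed to land independently and uniformly along the dendrite (Poisson approximation), the arbor is split into L/Z non-overlapping zones, and the function names and parameter values are hypothetical placeholders.

```python
import math


def p_group_in_zone(p, N, Z, L, M):
    """Probability that one zone of length Z gets >= 1 synapse from each of M ensembles.

    Assumes the number of synapses an ensemble places in the zone is Poisson
    with mean p * N * Z / L.
    """
    mean_per_ensemble = p * N * Z / L
    p_at_least_one = -math.expm1(-mean_per_ensemble)  # 1 - exp(-mean)
    return p_at_least_one ** M


def p_group_on_neuron(p, N, Z, L, M):
    """Probability that at least one of the L/Z zones on a neuron holds such a group."""
    q = p_group_in_zone(p, N, Z, L, M)
    if q >= 1.0:
        return 1.0
    n_zones = L / Z
    # 1 - (1 - q) ** n_zones, written to stay accurate for very small q
    return -math.expm1(n_zones * math.log1p(-q))


# Illustrative placeholder values only (lengths in metres), not the paper's parameters.
example = p_group_on_neuron(p=0.05, N=100, Z=10e-6, L=10e-3, M=3)
print(example)
```

      Under these assumptions the probability rises with p, N and Z/L and falls steeply with the number of ensembles M that must converge on the same zone.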

      Sequence calculations

      Here we estimate the likelihood of the first ensemble input arriving anywhere on the dendrite, and ask how likely it is that succeeding inputs of the sequence would arrive within a set spacing.

      Thus, the probability of occurrence of sequences that receive sequential connections (PcPOSS) from each of the M ensembles is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative window size with respect to the total dendritic arbor (Δ/L) and the number of ensembles (M).
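      The sequence estimate can be sketched in the same spirit. Again, this is a simplified sketch under stated assumptions rather than the manuscript's exact derivation: synapses from each ensemble are treated as an independent Poisson process along the dendrite, each first-ensemble synapse is a candidate starting point, each of the remaining M - 1 ensembles must place at least one synapse in its own window of length Δ, and candidate sequences are treated as independent. Function and argument names are hypothetical.

```python
import math


def expected_sequences(p, N, delta, L, M):
    """Expected number of ordered sequences on one neuron.

    p * N is the expected number of first-ensemble synapses (sequence starts);
    each later ensemble must hit its window of length delta, which happens
    with probability 1 - exp(-p * N * delta / L).
    """
    expected_starts = p * N
    p_window_hit = -math.expm1(-p * N * delta / L)
    return expected_starts * p_window_hit ** (M - 1)


def p_sequence_on_neuron(p, N, delta, L, M):
    """Probability of at least one such sequence, via a Poisson approximation."""
    return -math.expm1(-expected_sequences(p, N, delta, L, M))


# Illustrative placeholder values only (lengths in metres), not the paper's parameters.
for M in (3, 4, 5):
    print(M, p_sequence_on_neuron(p=0.05, N=1000, delta=10e-6, L=10e-3, M=M))
```

      As with clusters, the estimate depends on p, N, Δ/L and M, and each additional ensemble in the sequence multiplies the expected count by the per-window hit probability.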

      (2) I wonder if the authors are being overly conservative at times. The result highlighted in the abstract is that 10/100000 postsynaptic neurons are expected to exhibit synaptic clustering. This seems like a very small number, especially if circuits are to rely on such a mechanism. However, this figure assumes the convergence of 3-5 distinct ensembles. Convergence of inputs from just 2 ensembles would be much more prevalent, but still advantageous computationally. There has been excitement in the field about experiments showing the clustering of synapses encoding even a single feature.

      We agree that short clusters of two inputs would be far more likely. We focused our analysis on clusters with three or more ensembles for the following reasons:

      (1) The signal-to-noise ratio in two-ensemble clusters is very poor, as the likelihood of noise clusters is high.

      (2) It is difficult to trigger nonlinearities with very few synaptic inputs.

      (3) At the ensemble sizes we considered (100 for clusters, 1000 for sequences), clusters arising from just two ensembles would result in a high probability of occurrence on all neurons in a network (~50% in cortex; see p_CMFG in the figures below). Such dense neural representations are difficult for downstream networks to decode (Foldiak 2003).

      However, in the presence of ensembles containing fewer neurons or when the connection probability between the layers is low, short clusters can result in sparse representations (Figure 2 - Supplement 2). Arguments 1 and 2 hold for short sequences as well.

      (3) The analysis supporting the claim that strong nonlinearities are needed for cluster/sequence detection is unconvincing. In the analysis, different synapse distributions on a single long dendrite are convolved with a sigmoid function and then the sum is taken to reflect the somatic response. In reality, dendritic nonlinearities influence the soma in a complex and dynamic manner. It may be that the abstract approach the authors use captures some of this, but it needs to be validated with simulations to be trusted (in line with previous work, e.g. Poirazi, Brannon & Mel, (2003)).

      We agree that multiple factors might affect the influence of nonlinearities on the soma. The key goal of our study was to understand the role played by random connectivity in giving rise to clustered computation. Since simulating a wide range of connectivity and activity patterns in a detailed biophysical model was computationally expensive, we analyzed the exemplar detailed models for nonlinearity separately (Figures 5, 6, and new Figure 8), and then used our abstract models as a proxy for understanding population dynamics. A complete analysis of the role played by morphology, channel kinetics and the effect of branching requires an in-depth study of its own, and some of these questions have already been tackled by (Poirazi, Brannon, and Mel 2003; Branco, Clark, and Häusser 2010; Bhalla 2017). However, in the revision, we have implemented a single model which incorporates the range of ion-channel, synaptic and biochemical signaling nonlinearities which we discuss in the paper (Figure 8, and Figure 8 - Supplement 1, 2, 3). We use this to demonstrate all three forms of sequence and grouped computation we use in the study, where the only difference is in the stimulus pattern and the separation of time-scales inherent in the stimuli.

      (4) It is unclear whether some of the conclusions would hold in the presence of learning. In the signal-to-noise analysis, all synaptic strengths are assumed equal. But if synapses involved in salient clusters or sequences were potentiated, presumably detection would become easier? Similarly, if presynaptic tuning and/or timing were reorganized through learning, the conditions for synaptic arrangements to be useful could be relaxed. Answering these questions is beyond the scope of the study, but there is a caveat there nonetheless.

      We agree with the reviewer. If synapses receiving connectivity from ensembles had stronger weights, this would make detection easier. Dendritic spikes arising from clustered inputs have been implicated in local cooperative plasticity (Golding, Staff, and Spruston 2002; Losonczy, Makara, and Magee 2008). Further, plasticity related proteins synthesized at a synapse undergoing L-LTP can diffuse to neighboring weakly co-active synapses, and thereby mediate cooperative plasticity (Harvey et al. 2008; Govindarajan, Kelleher, and Tonegawa 2006; Govindarajan et al. 2011). Thus if clusters of synapses were likely to be co-active, they could further engage these local plasticity mechanisms which could potentiate them while not potentiating synapses that are activated by background activity. This would depend on the activity correlation between synapses receiving ensemble inputs within a cluster vs those activated by background activity. We have mentioned some of these ideas in a published opinion paper (Pulikkottil, Somashekar, and Bhalla 2021). In the current study, we wanted to understand whether even in the absence of specialized connection rules, interesting computations could still emerge. Thus, we focused on asking whether clustered or sequential convergence could arise even in a purely randomly connected network, with the most basic set of assumptions. We agree that an analysis of how selectivity evolves with learning would be an interesting topic for further work.

      References

      Bhalla, Upinder S. 2017. “Synaptic Input Sequence Discrimination on Behavioral Timescales Mediated by Reaction-Diffusion Chemistry in Dendrites.” Edited by Frances K Skinner. eLife 6 (April):e25827. https://doi.org/10.7554/eLife.25827.

      Branco, Tiago, Beverley A. Clark, and Michael Häusser. 2010. “Dendritic Discrimination of Temporal Input Sequences in Cortical Neurons.” Science (New York, N.Y.) 329 (5999): 1671–75. https://doi.org/10.1126/science.1189664.

      Foldiak, Peter. 2003. “Sparse Coding in the Primate Cortex.” The Handbook of Brain Theory and Neural Networks. https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/2994/FoldiakSparse HBTNN2e02.pdf?sequence=1.

      Golding, Nace L., Nathan P. Staff, and Nelson Spruston. 2002. “Dendritic Spikes as a Mechanism for Cooperative Long-Term Potentiation.” Nature 418 (6895): 326–31. https://doi.org/10.1038/nature00854.

      Govindarajan, Arvind, Inbal Israely, Shu-Ying Huang, and Susumu Tonegawa. 2011. “The Dendritic Branch Is the Preferred Integrative Unit for Protein Synthesis-Dependent LTP.” Neuron 69 (1): 132–46. https://doi.org/10.1016/j.neuron.2010.12.008.

      Govindarajan, Arvind, Raymond J. Kelleher, and Susumu Tonegawa. 2006. “A Clustered Plasticity Model of Long-Term Memory Engrams.” Nature Reviews Neuroscience 7 (7): 575–83. https://doi.org/10.1038/nrn1937.

      Harvey, Christopher D., Ryohei Yasuda, Haining Zhong, and Karel Svoboda. 2008. “The Spread of Ras Activity Triggered by Activation of a Single Dendritic Spine.” Science (New York, N.Y.) 321 (5885): 136–40. https://doi.org/10.1126/science.1159675.

      Losonczy, Attila, Judit K. Makara, and Jeffrey C. Magee. 2008. “Compartmentalized Dendritic Plasticity and Input Feature Storage in Neurons.” Nature 452 (7186): 436–41. https://doi.org/10.1038/nature06725.

      Poirazi, Panayiota, Terrence Brannon, and Bartlett W. Mel. 2003. “Pyramidal Neuron as Two-Layer Neural Network.” Neuron 37 (6): 989–99. https://doi.org/10.1016/S0896-6273(03)00149-1.

      Pulikkottil, Vinu Varghese, Bhanu Priya Somashekar, and Upinder S. Bhalla. 2021. “Computation, Wiring, and Plasticity in Synaptic Clusters.” Current Opinion in Neurobiology, Computational Neuroscience, 70 (October):101–12. https://doi.org/10.1016/j.conb.2021.08.001.

    1. eLife Assessment

      The study presents important findings that reveal SEPHS2 and VPS37C as new potential drug targets for dasatinib and hydroxychloroquine respectively in addition to confirming known targets of these drugs. The evidence provided is compelling as observed in the methods, data and analyses. This article will be of great interest to chemical biologists, biochemists, and scientists in drug discovery and diagnostics.

    2. Reviewer #2 (Public review):

      Summary:

      The study by Sun et al. introduces a useful system utilizing the proteasomal accessory factor A (PafA) and HaloTag for investigating drug-protein interactions in both in vitro (cell culture) and in vivo (zebrafish) settings. The authors presented the development and optimization of the system, as well as examples of its application and the identification of potential novel drug targets. However, the manuscript requires considerable improvements, particularly in writing and justification of experimental design. There are several inaccuracies in data description and a lack of statistics in some figures, undermining the conclusions drawn in the manuscript. Additionally, the authors introduced variants of the ligands and its cognate substrates, yet their use in different experiments appears random and lacks justification. It is challenging for readers to remember and track the specific properties of each variant, further complicating the interpretation of the results.

      The conclusions of this paper are mostly backed by data, but certain aspects of data analysis and description require further clarification and expansion.

      Comments on revisions:

      We would like to thank the authors for submitting this revised version. We appreciate their inclusion of additional experiments, which convincingly demonstrate the absence of significant toxicity for in vivo applications. All our concerns and questions have been fully addressed. The clarity and quality of the writing have been substantially improved. We believe this innovative proximity labeling tool will be inspiring and valuable for the field.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript introduces POST-IT (Pup-On-target for Small molecule Target Identification Technology), a novel non-diffusive proximity tagging system for identifying target proteins in live cells and organisms. This technology preserves cellular context essential for capturing specific drug-protein interactions, including transient complexes and membrane-associated proteins. Using an engineered fusion of proteasomal accessory factor A (PafA) and HaloTag, POST-IT specifically labels proximal proteins upon binding to a small molecule, with extensive optimization to enhance specificity and efficiency.

      Strengths:

      The study successfully identifies known targets and discovers new binders, such as SEPHS2 for dasatinib and VPS37C for hydroxychloroquine, advancing our understanding of their mechanisms. Additionally, its application in live zebrafish embryos demonstrates POST-IT's potential for widespread use in biological research and drug development.

      Comments on revisions:

      The authors have addressed most of the issues I raised in my review. I have no further comments.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      (1) The technology requires a halo-tagged derivative of the active compound, and the linkage position will have a huge impact on the potential "target hits" of the molecules. Given that most active molecules lack structure-activity relationship information, it is very challenging to identify the optimal position for the halo tag linkage.

      We appreciate your insightful comment. While finding the optimal position to attach a chemical linker to a small molecule of interest is indeed a challenging but necessary step, this is a common difficulty across all target-ID methods, except for those that are modification-free, as we described in Discussion. However, modification-free approaches such as DARTS, CETSA, and TPP have their own limitations, such as low sensitivity and a high false-positive rate. Additionally, DARTS and SPROX are limited to use with cell lysates. Please refer to the introduction in our manuscript for more details on these approaches. On the other hand, synthesizing HTL derivatives is relatively straightforward compared to other modifications, and we provide helpful guidelines for chemical linker design, provided the optimal chemical moiety has been identified, which is crucial for target identification. We selected dasatinib and HCQ/CQ as model compounds because previous studies offered insights into their derivative synthesis. Our data also show that DH5 retains strong kinase inhibitory activity (Figure 4—figure supplement 2), and DC661-H1 demonstrates potent inhibition of autophagy (Figure 6—figure supplement 1). For novel compounds, conducting a thorough structure-activity relationship (SAR) study is essential to determine the optimal position for HTL derivative synthesis.

      (2) Although POST-IT works in zebrafish embryos, there is still a long way to go for the broad application of the technology in other animal models.

      Thank you for your constructive comment. Yes, there is still a long way to go in developing the POST-IT system for broader applications in other animal models, especially in mice. However, we hope that our study provides valuable insights and inspiration to scientists and experts for applying the POST-IT system in various models. We are also committed to further improving its applicability.

      (3) The authors identified SEPHS2 as a new potential target of dasatinib and further validated the direct binding of dasatinib with this protein. However, considering the super strong activity of dasatinib against c-Src (sub nanomolar IC50 value), it is hard to conclude the contribution of SEPHS2 binding (micromolar potency) to its antitumor activity.

      Thank you for your insightful comment. We agree that the anticancer activity of dasatinib primarily results from inhibiting tyrosine kinases such as SRC and ABL. However, SEPHS2 contains an “opal” termination codon, UGA, at the 60th amino acid residue, which codes for selenocysteine. Due to the technical challenge of expressing selenoproteins in E. coli, we mutated it to cysteine for expression in E. coli to avoid premature translation termination, as described in the Materials and Methods section. Although the purified recombinant SEPHS2 shows a Kd of about 10 µM for dasatinib, the binding affinity to endogenous SEPHS2 may be higher since selenocysteine is larger and more electronegative than cysteine. This presents an interesting area for future investigation. Furthermore, our study of dasatinib’s binding to SEPHS2 could help facilitate the development of new SEPHS2 inhibitors, potentially targeting the active site of SEPHS2.
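
      To make the potency gap in this exchange concrete, the sketch below computes fractional target occupancy from a dissociation constant. It assumes simple 1:1 equilibrium binding with free ligand in excess, which may not hold in cells. The function name `fraction_bound` and the ligand concentrations are hypothetical and chosen only for illustration; the only value taken from the response is the Kd of about 10 µM for recombinant SEPHS2.

```python
def fraction_bound(ligand_uM: float, kd_uM: float) -> float:
    """Fractional occupancy for 1:1 binding: [L] / ([L] + Kd).

    Assumes free ligand is approximately equal to total ligand.
    """
    if ligand_uM < 0 or kd_uM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_uM / (ligand_uM + kd_uM)

KD_SEPHS2_UM = 10.0  # approximate Kd reported for recombinant SEPHS2

# Hypothetical ligand concentrations, for illustration only.
for ligand_uM in (0.1, 1.0, 10.0, 100.0):
    occupancy = fraction_bound(ligand_uM, KD_SEPHS2_UM)
    print(f"[L] = {ligand_uM:6.1f} uM -> fraction bound = {occupancy:.3f}")
```

      Under these assumptions, half of the target is occupied only when the ligand concentration equals the Kd, so a micromolar-affinity target is sparsely occupied at concentrations that saturate a subnanomolar one.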

      Reviewer #3 (Public review):

      (1) Target Specificity: It is crucial for the authors to differentiate between the primary targets of the POST-IT system and those identified as side effects. This distinction is essential for assessing the specificity and utility of the technology.

      Thank you for your insightful comment. Drugs inevitably bind to various proteins with differing affinities, which can contribute to both side effects and beneficial outcomes. Typically, the primary targets exhibit high affinities. In this manuscript, we ranked the identified protein targets of DH5 based on affinity from mass spectrometry and p-values (Fig. 5A), and for DC661-H1, we used the SILAC ratio (Fig. 6A). We also individually assessed many drug-protein binding affinities using the MST assay, as well as in vitro and in cellulo assays, demonstrating their specificity. Moreover, we believe it is essential to identify as many protein targets as possible at physiological drug concentrations to better understand the drug’s side effects. Of course, further investigation is required to assess the roles and effects of these target proteins.

      (2) In Vivo Target Identification: The manuscript lacks detailed clarity on which specific targets were successfully identified in the in vivo experiments. Expanding on this information would provide a clearer view of the system's effectiveness and scope in complex biological settings.

      Thank you for your insightful comment regarding in vivo target identification. In this manuscript, we utilized a cell line as the primary method for in vivo target identification and validation after optimizing our system in test tubes. We successfully validated many of the targets identified using our POST-IT system (Figure 6—figure supplement 3). To demonstrate the proof of principle for in vivo application, we employed zebrafish embryos as an in vivo model, showing that endogenous SRC can be effectively pulled down by DH5 treatment (Fig. 7). While we could have explored the entire proteome to identify endogenous target proteins in zebrafish that bind to DH5 or dasatinib, we felt this would extend beyond our original scope, given that we have already demonstrated POST-IT’s ability to identify target proteins for dasatinib. Specific target identification and validation are crucial when using zebrafish for drug discovery. Additionally, we acknowledge that drugs likely interact with a range of protein targets in living organisms and may undergo metabolism and interactions within the circulatory system, which we address in our discussion.

      (3) Reproducibility and Scalability: Discussion on the reproducibility of the POST-IT system across various experimental setups and biological models, as well as its scalability for larger-scale drug discovery programs, would be beneficial.

      Thank you for the suggestion. While our system has shown high reproducibility in our experiments, further improving both reproducibility and scalability would be advantageous. One potential approach to address this is through the generation of stable-expressing cell lines and transgenic zebrafish lines, which we have discussed in the revised manuscript. Establishing stable cell lines with robust POST-IT expression could enhance scalability for drug discovery applications.

      (4) Quantitative Analysis: A more detailed quantitative analysis of the protein interactions identified by POST-IT, including statistical significance and comparative data against other technologies, would enhance the manuscript.

      Thank you for your suggestion. In our assessment of drug-protein affinity, we included Kd values as quantitative measures using MST assays. The protein targets of dasatinib identified through mass spectrometry are also accompanied by p-values for quantitative analysis (Fig. 5A), and the detailed procedures are described in the Materials and Methods section. While it is challenging to provide direct comparative data against other technologies, our system successfully identified many known target proteins for dasatinib, as well as SEPHS2 and VPS37C as new targets for dasatinib and for HCQ/CQ, respectively, which were not detected by other methods.

      (5) Technological Limitations: The authors should discuss any limitations or potential pitfalls of the POST-IT system, which would be crucial for future users and for guiding subsequent improvements.

      Thank you for your insightful suggestion. We agree that clearly defining the technological limitations is important. Therefore, we have expanded our original discussion on the limitations of our POST-IT system (Discussion section, paragraph 6).

      (6) Long-Term Stability and Activity: Information on the long-term stability and activity of the POST-IT components in different biological environments would ensure the reliability of the system in prolonged experiments.

      Yes, this is an important question. We did not notice any stability or toxicity issues with Halo-PafA and Pup substrates in HEK293T cells or zebrafish, which is an important factor for stable cell lines and transgenic zebrafish lines. However, HTL derivatives of the drug could be toxic or unstable due to the nature of the drug or its metabolism, which needs to be taken into account when designing experiments, and we have included this in the Discussion.

      (7) Comparison with Existing Technologies: A detailed comparison with existing proximity tagging and target identification technologies would help position POST-IT within the current landscape, highlighting its unique advantages and potential drawbacks.

      We appreciate your valuable feedback and agree that such comparisons are crucial. We have included a detailed overview and comparison of existing proximity-tagging systems and their related target identification technologies in the Introduction (lines 78-100) and Discussion (lines 391-412), highlighting their respective pros and cons. Additionally, we have expanded the discussion to further compare these technologies with our POST-IT system, addressing its advantages and limitations (lines 378-390, lines 448-467). We hope this provides sufficient context and information to effectively position POST-IT among the landscape of proximity-tagging target identification technologies.

      (8) Concerns Regarding Overexposed Bands: Several figures in the manuscript, specifically Figure 3A, 3B, 3C, 3F, 3G, Figure 4D, and the second panels in Figure 7C as well as some figures in the supplementary file, exhibit overexposed bands.

      We appreciate your astute observation regarding the overexposed bands and apologize for any confusion. The “overexposed” bands represent the unpupylated proteins, while the bands above them correspond to the pupylated proteins. We intended to clearly show both pupylated and unpupylated bands, although the latter are generally much weaker. We are currently working on further improving our POST-IT system to enhance pupylation efficiency.

      (9) Innovation Concern: There is a previous paper describing a similar approach: Liu Q, Zheng J, Sun W, Huo Y, Zhang L, Hao P, Wang H, Zhuang M. A proximity-tagging system to identify membrane protein-protein interactions. Nat Methods. 2018 Sep;15(9):715-722. doi: 10.1038/s41592-018-0100-5. Epub 2018 Aug 13. PMID: 30104635. It is crucial to explicitly address the novel aspects of POST-IT in contrast to this earlier work.

      Thank you for bringing this to our attention. Proximity-tagging systems like BioID, TurboID, NEDDylator, and PafA (Liu Q et al., Nat Methods 2018) were initially developed to study protein-protein interactions or identify protein interactomes, as these applications are of broader interest and generally easier to implement. However, applying proximity-tagging systems for small molecule target identification requires significant optimization. As described in the introduction (lines 78-100), target protein identification systems have since been developed using TurboID and NEDDylator (Tao AJ et al., Nat Commun 2023; Hill ZB et al., J Am Chem Soc 2016). It is conceivable that a PafA-based proximity-tagging system could also be adapted for target-ID, and other groups may pursue this approach in the future. Although the PafA-Pup system shows great promise for target-ID applications, extensive optimization was needed to enable its use for this purpose. Finally, we demonstrate that POST-IT offers distinct advantages over other proximity-tagging-based target-ID systems. For more details, please refer to the introduction and discussion sections.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Figure 1- Figure Supplement 1A: The Pup substrate "HB-Pup" is mentioned, but the main text or figure legend provides no introduction or description.

      We appreciate your astute observation. We have added a description in the main text and figure legend as follows: “…and used HB-Pup as a control, which contains 6×His and BCCP at the N terminus of Pup” in the main text (line 142) and “HB, TS, and SBP refer to 6×His and BCCP, twin-STII (Strep-tag II), and streptavidin binding peptide, respectively.” in the legend of Figure 1-figure supplement 1A.

      (2) Figure 1 - Figure Supplement 3B: The authors used TS-sPupK61R as a substrate but did not explain why. The main text mentions that mutating sPup alone did not affect polypupylation, raising the question of why TS-sPupK61R was used in this figure. Furthermore, while the authors state that polypupylation becomes evident after 1 hour of incubation (more pronounced after 2 or 3 hours), the reactions here were conducted for only 30 minutes.

      Thank you for your question. The experiment in Figure 1 - Figure Supplement 3B was conducted to test self-pupylation levels in the different Halo-PafA derivatives. For this purpose, we could have used any Pup substrate, such as SBP-sPup and SBPK4R-sPupK61R, instead of TS-sPup and TS-sPupK61R, as they do not show any differences in pupylation activity. We chose TS-sPup and TS-sPupK61R simply because any Pup substrates could be used for this purpose. Similarly, we did not need to incubate the reaction for a longer time to detect polypupylation, as our intention was to test “self-pupylation”. We demonstrated in Figure 1 – figure supplement 2 that polypupylation is dependent on the number or position of lysine residues in the Pup substrate or tags. The results clearly showed that self-pupylation was almost completely abolished by the Halo8KR mutation. To clarify this, we added the following description in lines 168-169: “TS-sPup and TS-sPupK61R were chosen as sPup substrates for this experiment, although any Pup substrates could have been used. The levels of self-pupylation were assessed.”

      (3) Line 156: The statement that "the TS-tag completely abolished polypupylation in TS-sPup" is inaccurate. Using TSK8R-sPupK61R as the substrate, several bands appear, which likely represent Halo-PafA with varying degrees of polypupylation. Some bands also appear to correspond to those seen when using TS-sPup as a substrate. The authors should clarify how they distinguish between multipupylation and polypupylation in this case.

      We sincerely appreciate your insight into clarifying the distinction between multipupylation and polypupylation. Polypupylation refers to the addition of a new Pup onto a previously linked Pup on the target protein, akin to polyubiquitination. In contrast, multipupylation involves multiple single pupylations at different positions on the target proteins. Since pupylation occurs exclusively at lysine residues in tag-Pup substrates, mutating all lysine residues to arginine, as in TSK8R-sPupK61R, prevents the mutant tag-Pup from linking to another Pup. This means that only single pupylation can proceed with this type of mutant Pup substrate. If multiple pupylated bands are observed with this mutant substrate, it indicates “multipupylation” rather than “polypupylation”, as shown in Figure 1-figure supplement 2D. The same applies to the pupylation bands in Figure 1-figure supplement 2E and F, as sSBP-sPupK61R and SBPK4R-sPupK61R lack lysine residues. By comparing these multipupylation bands, it is also possible to distinguish them from polypupylation bands, which are marked by yellow arrows. However, after 2-3 pupylation bands, higher-order bands become increasingly difficult to distinguish.

      To clarify the mutation in the TS-tag, we revised the sentence in line 156 from “However, further mutations within the TS-tag completely abolished polypupylation in TS-sPup” to “However, further mutations of two lysine residues within the TS-tag, creating TSK8R-sPupK61R, completely abolished polypupylation in TS-sPup”. Additionally, we have inserted sentences in line 152 to define polypupylation and multipupylation, as described here.

      (4) Line 160: Similar to the above concern about line 156, the claim that SBPK4R and sSBP completely prevented polypupylation is unconvincing and requires more supporting evidence.

      Thank you for raising this concern. As mentioned above, both SBPK4R and sSBP lack lysine residues required for pupylation. As a result, these mutants can only undergo multiple single pupylations on the lysine residues of the target protein, which leads to “multipupylation”. In Figure 1-figure supplement 2E and F, pupylation bands by sSBP-sPupK61R or SBPK4R-sPupK61R do not display doublet bands (one from multipupylation and the other from polypupylation), as seen with SBP-sPup, marked by yellow arrows. Notably, Halo-PafA containing polypupylated branches migrates more slowly than one with an equal number of multipupylation events. To clarify this point, we have added the phrase “as shown in sSBP-sPupK61R and SBPK4R-sPupK61R” at the end of the sentence in line 160.

      (5) Lines 176-177: The authors claim that PafAS126A exhibited reduced polypupylation compared to PafA, but given that PafAS126A may reduce depupylase activity, how could it reduce polypupylation levels? Moreover, it is hard to find any data supporting this conclusion in Figure 1 - Figure Supplement 3B.

      We appreciate your insightful comment. At this point, we do not fully understand how the mutation that reduces depupylase activity also decreases polypupylation. It is possible that PafAS126A has a lower preference for pupylated Pup as a prey, which is required for polypupylation, since depupylase activity depends on recognizing pupylated Pup as a prey to remove it. Nonetheless, Halo-PafAS126A shows reduced levels of higher molecular weight bands compared to Halo-PafA, as shown in Figure 1-figure supplement 3B, while exhibiting increased pupylation in lower molecular weight bands, which represent either multipupylation or low-degree polypupylation. Since higher molecular weight bands (> 150 kD) are likely due to polypupylation, this result suggests reduced polypupylation and increased multipupylation in Halo-PafAS126A. To clarify this in the main text, we have added the following description in line 177: “as evidenced by the decreased levels of high molecular weight bands and an increase in low molecular weight bands”

      (6) POST-IT system in cellulo validation: The system was developed using the Halo-tag, yet the in-cell validation uses FRB and FKBP instead, without explaining this switch. This inconsistency makes the logic of the experiment unclear.

      We appreciate your insightful comment. The interaction between rapamycin and FRB or FKBP is known to be highly specific and robust, making this system useful in various biological contexts. Due to this property, rapamycin can induce interaction between two proteins when one is fused with FRB and the other with FKBP. Before testing or optimizing the POST-IT system in cells, we hypothesized that using the rapamycin-induced interaction between FRB and FKBP could introduce pupylation of the target protein, provided that PafA is fused with FRB or FKBP and the target protein is fused with the other. The results demonstrate that PafA can introduce pupylation of the target protein in a proximity-dependent manner via this chemically induced interaction. To further clarify this in the main text, we modified the original sentence in lines 214-216 as follows: “To mimic drug-target interaction-induced pupylation in live cells and assess the potential of PafA as a proximity-tagging system for target-ID, we incorporated the rapamycin-induced interaction between FRB and FKBP into our PL system, as this interaction between a small molecule and a protein is known to be highly specific and robust (Figure 3—figure supplement 1A).”

      (7) Line 209: The authors decided to use the SBP-tag for further studies due to better performance, but in Figure 3 - Figure supplement 1, they still used the unintroduced HB-Pup as the substrate, which is confusing and lacks explanation.

      Thank you for raising your question. The SBP-tag is not superior to the TS-tag in terms of pupylation activity. However, the TSK8R mutant cannot bind to Strep-Tactin beads, while the SBP mutants, SBPK4R and sSBP, can bind to streptavidin. Therefore, we chose the SBP-tag instead of the TS-tag for further studies as a Pup substrate in the POST-IT system, as we needed to pull down the target proteins. HB-Pup is consistently used as a control throughout various experiments, as it is the original Pup substrate. In Figure 3-figure supplement 1B and C, HB-Pup was used to test chemically induced pupylation by PafA. In these cases, the choice of Pup substrate was not critical. Furthermore, we compared HB-Pup and different SBP-sPup substrates in Figure 3-figure supplement 1D, where HB-Pup was used as a control or for comparison. Although pupylation bands with HB-Pup appear more robust, this substrate contains multiple lysine residues, leading to high levels of polypupylation. To make this clear, we modified the sentence in line 209 to “Therefore, we decided to use the SBP-tag as a Pup substrate in the POST-IT system for further studies.”

      (8) Line 220: Both SBP-sPup and SBPK4R-sPupK61R are described as exhibiting efficient pupylation, but the data show mostly self-pupylation and little to no pupylation of the target protein.

      Thank you for your concern. However, pupylation of the target protein is actually quite substantial, as the intensities of the free form and pupylated proteins are relatively similar, as shown in the upper panel of Figure 3-figure supplement 1D. Self-pupylation is always much higher than target pupylation, because PafA constantly pupylates itself, whereas pupylation of the target protein occurs only through interaction. Furthermore, V5-FRB-mKate2-PafA contains many lysine residues, which increases the levels of self-pupylation.

      (9) Lines 222-224: The authors chose SBPK4R-sPupK61R to avoid polypupylation, although SBP-sPup did not cause detectable polypupylation. Neither substrate caused pupylation of the target protein, so the rationale behind this choice is unclear.

      Thank you for raising your question. As with the above comment (#8), please refer to the pupylation bands of the target protein, as shown in the upper panel of Figure 3-figure supplement 1D. The pupylation band of the target protein is quite remarkable, as the intensities of the free form and pupylated proteins are comparable. Additionally, there are no multiple pupylation bands in either case, except for one additional weak multipupylation band, indicating no polypupylation by SBP-sPup, which does not have K-to-R mutations. Of course, SBPK4R-sPupK61R can only undergo single pupylation, as it does not contain lysine residues. Although we did not observe polypupylation by SBP-sPup in this experimental condition, it is possible that SBP-sPup may cause polypupylation under different experimental conditions or with other target proteins. Since SBPK4R-sPupK61R exhibits pupylation of the target protein comparable to that of SBP-sPup, at least in this experimental setting, we selected SBPK4R-sPupK61R as the Pup substrate for the POST-IT system to avoid any potential polypupylation that could be caused by SBP-sPup in other cases. We believe that polypupylation can introduce bias into the analysis and hinder the comprehensive discovery of additional target proteins for small molecules.

      (10) Line 224: The authors conclude that rapamycin greatly reduced self-pupylation, but the supporting data are unclear.

      Thank you for your constructive comments on our manuscript. Please refer to the lower panel of Figure 3-figure supplement 1D. When using either SBPK4R-sPupK61R or SBP-sPup, rapamycin treatment results in reduced levels of self-pupylation compared to the no-treatment control. However, we did not observe this reduction with HB-Pup and do not know the reason. To clarify this in the main text, we added the following description to the end of the sentence: “when using either SBPK4R-sPupK61R or SBP-sPup, as shown in the lower panel of Figure 3—figure supplement 1D”

      (11) Line 234: The authors selected an 18-amino acid linker, but given that linkers longer than 10 amino acids enhance labeling, this choice should be explained.

      Thank you for raising your question. In fact, a linker of 10 amino acids (aa) or longer is likely to behave similarly. We chose an 18 aa linker instead of a 40 aa linker primarily for the convenience of cloning and to reduce the potential for DNA sequence recombination associated with longer repeats. Additionally, a longer, flexible linker may behave like an intrinsically disordered protein (Harmon et al., 2017), which can lead to unwanted protein-protein interactions or phase separation. To elaborate on this, we added the following sentences after the sentence in line 233-235: “We chose the 18-amino acid linker instead of the 40-amino acid linker for easier cloning and to lower the risk of DNA recombination from longer repeats. Additionally, a longer, flexible linker may behave like an intrinsically disordered protein (Harmon et al., 2017), an unwanted feature for target-ID.”

      (12) S126A and K172R mutations: The authors claim that these mutations additively enhanced pupylation under cellular conditions, but in Figure 3B, the band intensities appear similar for the wild-type and mutant versions.

      Thank you for raising your concern. Although a single pupylation band appears similar among the three different Halo-PafA proteins, multipupylation bands are slightly but noticeably increased by the S126A and K172R mutations compared to Halo8KR-PafA. Since we used SBPK4R-sPupK61R as a Pup substrate, all higher molecular weight bands result from multipupylation rather than polypupylation. This illustrates why it is preferable to use SBPK4R-sPupK61R over SBP-sPup, as the pupylation bands with SBP-sPup are mixtures of poly- and multipupylation, making it difficult to assess levels of target labeling. To clarify this in the main text, we added the following description after the sentence in line 236: “as the higher molecular weight multipupylation bands are slightly but noticeably increased with these mutations compared to Halo8KR-PafA”

      (13) Line 263: The authors selected DH5 for further experiments due to its efficiency, but the data suggest that the performance of DH1 to DH5 is similar.

      We appreciate your question about the different dasatinib HTL derivatives. However, our data clearly show that DH2-5 derivatives bind significantly more effectively to Halo-PafA in vitro and in live cells compared to DH1 (Figure 4A and B). Additionally, the DH2-5 derivatives result in dramatically increased pupylation of the target protein in vitro and noticeable enhancement in live cells (Figure 4C and D). Among DH2 to DH5, there is no obvious difference in binding to Halo-PafA or pupylation of the target protein. Therefore, we chose DH5, as we believe that the longer linker in DH5 may facilitate the binding of a more diverse range of target proteins to dasatinib, enabling the discovery of additional target proteins.

      (14) Line 309: The authors introduce HCQ and CQ as important drugs but then investigate the mechanism using DC661 without introducing or justifying the choice of this compound.

      Thank you for your point. We explained the reason for choosing DC661, a dimer form of CQ, instead of CQ for the synthesis of an HTL derivative in line 310: “assuming that a dimer would enhance binding affinity as previously described.” Dimer forms of drugs or small molecules, such as testosterone dimers, estrogen dimers, and numerous anticancer drug dimers, have often been developed to enhance drug effects (Paquin A et al., Molecules 2021). Similarly, dimer forms of HCQ/CQ have been introduced and shown to be more potent (Hrycyna CA et al., ACS Chem Biol 2014; Rebecca VW et al., Cancer Discovery 2019). We expected that using a dimer form might offer a higher probability of identifying target proteins for HCQ/CQ.

      (15) The authors suggest that multipupylation levels were enhanced but do not explain whether this might benefit the system or introduce other issues. Clarifying this point would provide valuable insight for potential users of this system.

      Thank you for your thoughtful suggestion. Polypupylation likely leads to biased enrichment of a limited set of target proteins, and its levels may not correlate with the binding affinity of target proteins to the small molecule of interest, features that can negatively impact target-ID. In contrast, multipupylation may be correlated with binding affinity or interaction frequency, as we observed increased levels of multipupylation with higher Pup concentrations and longer incubation times. This suggests that target proteins with multiple lysines in proximity to PafA can be sequentially pupylated, starting with the most accessible lysine. However, if a target protein has only one accessible lysine, pupylation will occur only once, regardless of the protein’s affinity to the small molecule. In summary, while polypupylation may be a drawback for target-ID, multipupylation could be useful for both target-ID and understanding binding mode. To elaborate on this, we added the following additional explanation after the sentence in line 152: “, whereas multipupylation is more likely correlated with binding affinity or interaction frequency.”

      (16) The author should address whether the Halotag ligand modification of the drug alters the binding properties between the drug and targets. That may be causing artifact binding of the drug and other proteins.

      Thank you for your insightful comment. Yes, it is true that chemical modifications of the small molecule of interest, such as linker derivatization (e.g., HTL) or photo-affinity labeling, generally lead to reduced activity or affinity compared to the original molecule. Synthesizing a derivative is a common challenge across all target-ID methods, except for modification-free approaches, as we mentioned in the Discussion. However, modification-free methods like DARTS, CETSA, and TPP have their own limitations, including low sensitivity or high false positive rates. Identifying the optimal position for chemical modification on the small molecule of interest is critical. We chose dasatinib and HCQ/CQ as model compounds, because previous studies provided insights into their derivative synthesis. In addition, our data show that DH5 retains robust kinase inhibitory activity (Figure 4-figure supplement 2), and DC661-H1 exhibits potent autophagy inhibition (Figure 6-figure supplement 1). For novel compounds, a thorough structure-activity relationship study is essential to identify the optimal position for HTL derivative synthesis.

      (17) The author stated there is no observable toxicity in zebrafish without providing a detailed analysis or enough data. Further analysis of the expression of Halo-PafA and its substrate sPup influence on toxicity or side effects to the living cells or animals would be needed. It is important for in vivo applications.

      Thank you for your constructive suggestion. We have now included additional experimental data in Figure 7-figure supplement 1, showing no toxicity in zebrafish embryos expressing the POST-IT system. We assessed toxicity in two ways: by injecting the POST-IT DNA plasmid into one-cell-stage embryos for acute expression, and by using embryos from transgenic zebrafish expressing POST-IT under a heat-shock inducible promoter. Neither the injection nor the heat-shock activation of POST-IT expression resulted in any noticeable toxicity.

    1. eLife Assessment

      This important work presents two studies on predictive processing in subjects with and without tinnitus, matched for age, sex and hearing loss. These studies together provide compelling evidence for an enhanced predictability of upcoming sounds in regular sequences in MEG data recorded from tinnitus subjects. This work will be of interest to researchers, especially neuroscientists, in the tinnitus field and beyond.

    2. Reviewer #1 (Public Review):

      This work presents a replicable difference in predictive processing between subjects with and without tinnitus. In two independent MEG studies and using a passive listening paradigm, the authors identify an enhanced prediction score in tinnitus subjects compared to control subjects. In the second study, individuals with and without tinnitus were carefully matched for hearing levels (next to age and sex), increasing the probability that the identified differences could truly be attributed to the presence of tinnitus. Results from the first study could successfully be replicated in the second, although the effect size was notably smaller.

      Throughout the manuscript, the authors provide a thoughtful interpretation of their key findings and offer several interesting directions for future studies. Their conclusions are fully supported by their findings. Moreover, the authors are sufficiently aware of the inherent limitations of cross-sectional studies.

      Strengths:

      The robustness of the identified differences in prediction scores between individuals with and without tinnitus is remarkable, especially as successful replication studies are rare in the tinnitus field. Moreover, the authors provide several plausible explanations for the decline of the effect size observed in the second study.

      The rigorous matching for hearing loss, in addition to age and sex, in the second study is an important strength. This ensures that the identified differences cannot be attributed to differences in hearing levels between the groups.

      The used methodology is explained clearly and in detail, ensuring that the used paradigms may be employed by other researchers in future studies. Moreover, the registering of the data collection and analysis methods for Study 2 as a Registered Report should be commended, as the authors have clearly adhered to the methods as registered.

    3. Reviewer #2 (Public review):

      Summary:

      This study aimed to test experimentally a theoretical framework that aims to explain the perception of tinnitus, i.e., the perception of a phantom sound in the absence of external stimuli, through differences in auditory predictive coding patterns. To this aim, the researchers compared the neural activity preceding and following the perception of a sound using MEG in two different studies. The sounds could be highly predictable or random, depending on the experimental condition. They revealed that individuals with tinnitus and controls had different anticipatory predictions. This finding is a major step in characterizing the top-down mechanisms underlying sound perception in individuals with tinnitus.

      Strengths:

      This article uses an elegant, well-constructed paradigm to assess the neural dynamics underlying auditory prediction. The findings presented in the first experiment were partially replicated in the second experiment, which included 80 participants. This large number of participants for an MEG study ensures very good statistical power and a strong level of evidence. The authors used advanced analysis techniques - Multivariate Pattern Analysis (MVPA) and classifier weights projection - to determine the neural patterns underlying the anticipation and perception of a sound for individuals with or without tinnitus. The authors evidenced different auditory prediction patterns associated with tinnitus. Overall, the conclusions of this paper are well supported, and the limitations of the study are clearly addressed and discussed.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      This important work presents two studies on predictive processes in subjects with and without tinnitus. The evidence supporting the authors' claims is compelling, as their second study serves as an independent replication of the first. Rigorous matching between study groups was performed, especially in the second study, increasing the probability that the identified differences in predictive processing can truly be attributed to the presence of tinnitus. This work will be of interest to researchers, especially neuroscientists, in the tinnitus field.

      We thank the editors at eLife very much for their favorable assessment of our manuscript. Based upon the comments of the reviewer, we aimed to further improve our manuscript to be a valuable addition to the tinnitus research field.

      Public Reviews:

      Reviewer #2 (Public review):

      Summary:

      This study aimed to test experimentally a theoretical framework that aims to explain the perception of tinnitus, i.e., the perception of a phantom sound in the absence of external stimuli, through differences in auditory predictive coding patterns. To this aim, the researchers compared the neural activity preceding and following the perception of a sound using MEG in two different studies. The sounds could be highly predictable or random, depending on the experimental condition. They revealed that individuals with tinnitus and controls had different anticipatory predictions. This finding is a major step in characterizing the top-down mechanisms underlying sound perception in individuals with tinnitus.

      Strengths:

      This article uses an elegant, well-constructed paradigm to assess the neural dynamics underlying auditory prediction. The findings presented in the first experiment were partially replicated in the second experiment, which included 80 participants. This large number of participants for an MEG study ensures very good statistical power and a strong level of evidence. The authors used advanced analysis techniques - Multivariate Pattern Analysis (MVPA) and classifier weights projection - to determine the neural patterns underlying the anticipation and perception of a sound for individuals with or without tinnitus. The authors evidenced different auditory prediction patterns associated with tinnitus. Overall, the conclusions of this paper are well supported, and the limitations of the study are clearly addressed and discussed.

      Weaknesses:

      Even though the authors took care of matching the participants in age and sex, the control could be more precise. Tinnitus is associated with various comorbidities, such as hearing loss, anxiety, depression, or sleep disorders. The authors assessed individuals' hearing thresholds with a pure tone audiogram, but they did not take into account the high frequencies (6 kHz to 16 kHz) in the patient/control matching. Moreover, other hearing dysfunctions, such as speech-in-noise deficits or hyperacusis, could have been taken into account to reinforce their claim that the observed predictive pattern was not linked to hearing deficits. Mental health and sleep disorders could also have been considered more precisely, as they were accounted for only indirectly with the score of the 10-item mini-TQ questionnaire evaluating tinnitus distress. Lastly, testing the links between the individuals' scores in auditory prediction and tinnitus characteristics, such as pitch, loudness, duration, and occurrence (how often it is perceived during the day), would have been highly informative.

      Thank you very much for your careful evaluation of our manuscript. We agree with you that our study design has some limitations such as the assessment of higher frequencies, comorbidities, and tinnitus characteristics. In our discussion, we aimed to acknowledge these issues for future research to improve this study design and gain more insights into neural tinnitus processes.

      See e.g.:

      Line 946-949:

      “Additionally, we rigorously controlled for hearing loss in Study 2, however, pure-tone audiometric testing was solely performed up to 8kHz and we were therefore not able to draw conclusions regarding hearing impairments in higher frequencies and their influence on the effects.”

      Line 949-954:

      “Moreover, we did not screen our participants for hyperacusis. This hypersensitivity to mild sounds is widely correlated with the sensation of tinnitus and underlying neural mechanisms are potentially intertwined with tinnitus processes (Schilling et al., 2023; Yukhnovich et al., 2023; Zheng, 2020). Screening for hyperacusis in future work can therefore reveal more details on participant characteristics influencing predictive processing.”

      Line 955-958:

      “In both studies, tinnitus distress was not correlated with the reported prediction effects. Nevertheless, tinnitus can also be characterized by other features such as its loudness, pitch or duration which were not included in the experimental assessment.”

      Line 958-963:

      “Additionally, we solely used a short version of the Mini-TQ (Goebel and Hiller, 1992) in Study 2, which did not allow us to relate prediction scores to subscales like sleep disturbances which potentially influence cognitive functioning and thus predictive processing. Next to sleeping disorders and distress, tinnitus is often also accompanied by psychological comorbidities such as depression or anxiety (Langguth, 2011) which are potential confounds of the results.”

      Comments on revisions:

      Thank you for your responses. There are a few remaining points that, if addressed, could further enhance the manuscript:

      - While the manuscript acknowledges the limitation of not matching groups on hearing thresholds in Study 1, a deeper analysis of participants' hearing abilities and their impact on MEG results, similar to that conducted in Study 2, would be valuable. Specifically, including a linear model that considers all frequencies, group membership, and their interactions could highlight differences across groups. Additionally, examining the effect of high-frequency hearing loss on prediction scores, as performed in Study 2, would strengthen the analysis, particularly given the trend noted (line 719). Such an addition could make a significant contribution to the literature by exploring how hearing abilities may influence prediction patterns.

      We appreciate your feedback and agree with you that it is a crucial question how hearing abilities influence prediction patterns in tinnitus. However, as hearing status was not assessed in the control group in study 1, we are unfortunately not able to include linear models to investigate differences across groups in this sample. This led us to the implementation of study 2 with a comprehensive hearing assessment to investigate group differences. We highlighted this issue in our methods section.

      Line 170-172:

      “As pure-tone audiometric testing was not included for the control subjects, group comparisons between hearing thresholds were not feasible.”

      - The connection with the hippocampal regions (line 864) remains somewhat unclear. While the inclusion of the Paquette reference appropriately links temporal region activity with tinnitus, it does not fully support the statement: "An increased focus on hippocampal regions, e.g., in fMRI, patient, or animal studies, could be a worthwhile complement to our MEG work, given the outstanding relevance of medial temporal areas in the formation of associations in statistical learning paradigms"

      Thank you for your constructive input. This section is purely speculative, and we do not aim to provide strong claims or expected results but solely point out potential future research directions.

      - Authors should add a comparison of participants mini-TQ scores on both studies

      We appreciate your input and added a comparison of mini-TQ scores between samples. For study 1, all subscales were included; however, we computed the comparison solely based on the items of the mini-TQ to increase comparability. The results were not significant, i.e., tinnitus distress values did not differ between studies.

      Line 629-632:

      “We additionally compared tinnitus distress values assessed by the mini-TQ (Goebel and Hiller, 1992) between study 1 and study 2 to detect potential differences between the samples, however, results of the Welch’s t-test were not significant with t(30.7) = 1.27, p = .214.”
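
      The non-integer degrees of freedom in the quoted statistic, t(30.7), follow from the Welch-Satterthwaite approximation used by Welch's t-test. The sketch below is only an illustration of that formula: the function name `welch_t_test` and the toy input values are hypothetical and are not taken from the study data or the authors' analysis code. It returns the t statistic and the approximate degrees of freedom; computing a p-value would additionally require a t-distribution, which is omitted here.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Hypothetical helper for illustration; it does not assume equal
    variances or equal sample sizes in the two groups.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t_stat = mean_diff / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Toy values only (not study data).
t_stat, df = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t_stat, 3), round(df, 2))
```

      Because the degrees of freedom depend on both group variances and both sample sizes, they fall between the smaller group's n - 1 and the pooled n_a + n_b - 2.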

      - Authors should add significant level on Fig 6.B as in Fig 3.C, and a n.s on Fig 6.D

      Thank you very much for your input. We added significance levels and an n.s. label to Figures 6B and 6D.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary of the work: In this work, Fruchard et al. study the enzyme Tgt and how it modifies guanine in tRNAs to queuosine (Q), essential for Vibrio cholerae's growth under aminoglycoside stress. Q's role in codon decoding efficiency and its proteomic effects during antibiotic exposure is examined, revealing Q modification impacts tyrosine codon decoding and influences RsxA translation, affecting the SoxR oxidative stress response. The research proposes Q modification's regulation under environmental cues reprograms the translation of genes with tyrosine codon bias, including DNA repair factors, crucial for bacterial antibiotic response.

      The experiments are well-designed and conducted and the conclusions, for the most part, are well supported by the data. However, a few clarifications will significantly strengthen the manuscript.

      Thank you.

      Major:

      Figure S4 A-D. These growth curves are important data and should be presented in the main figures. Moreover, given that it is not possible to make a rsxA mutant, I wonder if it would be possible to connect rsx and tgt using the following experiment: expression of tgt results in resistance to TOB (in B), while expression of only rsx lowers resistance to TOB (in D). Then simultaneous overexpression of both tgt/rsx in the WT strain should have either no effect on TOB resistance or increased resistance, relative to the WT. Perhaps the authors have done this, and if so, the data should be included as it will significantly strengthen their model.

      We thank the reviewer for this suggestion. We have tried to overexpress both tgt and rsxA simultaneously. However, this appears to be toxic, as cells form small colonies and cannot grow well in liquid. We think that the presence of 2 plasmids and the corresponding selection antibiotics amplifies the toxicity of overexpressing rsxA, and even tgt. In fact, it can be seen that tgt overexpression in WT is already slightly deleterious in the absence of tobramycin (Figure 1B).

      Figure S4 - Is there a rationale for why it is possible to make rsx mutants in E. coli, but not in V. cholerae? For example, does E. coli have a second gene/protein that is redundant in function to rsxA, while V. cholerae does not? I think your data hint at this, since in the right panel growth data, your double mutant does not fully rescue back to rsx single mutant levels, suggesting another factor in tgt mutant also acts to lower resistance to TOB. If so, perhaps a line or two in text will be helpful for readers.

      This point raised by the referee is an interesting one that we have also asked ourselves on multiple occasions. In fact, the Rsx operon is linked with oxidative stress and respiration. Vibrio cholerae and E. coli differ in the genes involved in these pathways. V. cholerae lacks the cyo/nuo respiratory complex genes, and does not encode a Suf operon. Moreover, deletion of the anaerobic respiration Frd pathway leads to a strong decrease of V. cholerae growth even in aerobic conditions (10.1128/spectrum.01730-23). We have also previously seen general differences between the 2 species in their response to stress (10.1128/AAC.01549-10) and in the way they deal with ROS (10.1371/journal.pgen.1003421). Therefore, we think that the fact that rsx is essential in V. cholerae and not in E. coli could be due either to the presence of an additional redundant pathway in E. coli, as suggested by the referee, or to more general differences in respiration and treatment of ROS. We thank the referee for highlighting this and we have now included a comment about this in the manuscript.

      - For growth curves in Figure 2 and relative comparisons like in Figure 5D and Figure S4 (and others in the paper), statistics and error bars, along with replicate information should be provided.

      We had mentioned this in the methods section; we have now also added the specific information to the figure legends.

      - Figure 6A - Is the transcript fold change in linear or log? If linear, then tgt expression should not be classified as being upregulated in TOB. It is barely up by ~2-fold with TOB- 0.6....which is a mild phenotype, at best.

      We think that 2-fold change of tgt expression can be sufficient to lead to changes in tRNA modification levels. We agree that this is a mild induction, we have thus changed “increase” to “mildly increase” in the results.  

      - Line 779- 780: "This indicates that sub-MIC TOB possibly induces tgt expression through the stringent response activation." To me, the data presented in this figure, do not support this statement. The experiment is indirect.

      We agree, and we rephrased: “Tobramycin may induce tgt expression through stringent response activation or through an independent pathway.”

      - Figure 3B and D. - These samples only have tobramycin, correct? The legend says both carbenicillin and tobramycin.

      The legend is correct, samples also have carbenicillin because we are testing here the growth with 2 synonymous beta-lactamase genes in presence of beta-lactams.

      - Figure 5. The color schemes in bars do not match up with the color scheme in cartoons below panels B and C. That makes it confusing to read. Please fix.

      Fixed.

      - A lot of abbreviations have been used. This makes reading a bit cumbersome. Ideally, fewer abbreviations would be used.

      Fixed

      Reviewer #2 (Public Review):

      Fruchard et al. investigate the role of the queuosine (Q) modification of the tRNA (Q-tRNA) in the human pathogen Vibrio cholerae. First, the authors state that the absence of Q-modified tRNAs (tgt mutant) increases the translation of TAT codons and proteins with a high TAT codon bias. Second, the absence of Q increases rsxA translation, because rsxA gene has a high TAT codon bias. Third, increased RsxA in the absence of Q inhibits SoxR response, reducing resistance towards the antibiotic tobramycin (TOB). Authors also predict in silico which genes harbor a higher TAT bias and found that among them are some involved in DNA repair, experimentally observing that a tgt mutant is more resistant to UV than the wt strain. It is worth noting that authors employ a wide variety of techniques, both experimental and bioinformatic. However, some aspects of the work need to be clarified or reevaluated.

      (1) The statement that the absence of Q increases the translation of TAT codons and proteins encoded by TAT-enriched genes presents the following problems that should be addressed:

      (1.1) The increase in TAT codon translation in the absence of Q is not supported by proteomics, since there was no detected statistical difference for TAT codon usage in proteins differentially expressed. Furthermore, there are some problems regarding the statistics of proteomics. Some proteins shown in Table S1 have adjusted p-values higher than their p-values, which makes no sense. Maybe there is a mistake in the adjusted p-value calculation.

      We appreciate the reviewer’s thorough examination of our findings. In our study, we employed an adaptive Benjamini-Hochberg (BH) procedure to control the false discovery rate in our list of selected proteins, as explained in the Data Analysis part of the Proteomics MS and analysis section of our Materials and Methods. The classical BH procedure (10.1111/j.2517-6161.1995.tb02031.x) calculates the adjusted p-value for the i-th ranked p-value as $\min_{j \ge i} \left( m \times p_{(j)} / j \right)$, where $p_{(j)}$ is the j-th ranked p-value and $m$ is the number of tests (e.g. number of proteins) (see 10.1021/acs.jproteome.7b00170 for details). Since $m/j \ge 1$ and $p_{(j)} \ge p_{(i)}$ for $j \ge i$, it follows that $m \times p_{(j)} / j \ge p_{(i)}$ for $j \ge i$, resulting in adjusted p-values being higher than or equal to the original p-values. Therefore, contrary to the reviewer's comment, it is a mathematical property that the adjusted p-value is greater than or equal to the original p-value when using the classical Benjamini-Hochberg procedure.

      However, we want to underline that we used an “adaptive” BH procedure, which calculates the adjusted p-value for the i-th ranked p-value as $\min_{j \ge i} \left( \pi_0 \times m \times p_{(j)} / j \right)$, where $\pi_0$ is an estimate of the proportion of true null hypotheses (see 10.1021/acs.jproteome.7b00170 for details). Indeed, the classical BH procedure makes the assumption that $\pi_0 = 1$, which is a strong assumption in the MS-based proteomics context. Consequently, the mathematical property that the adjusted p-value is greater than or equal to the original p-value does not always hold true in our approach (it also depends on the $\pi_0$ estimate).
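
      The difference between the two procedures can be illustrated with a short sketch. It is written under stated assumptions: `bh_adjust` is a hypothetical helper, the proportion of true nulls `pi0` is passed in as a given number rather than estimated from the data (the estimation method used in the study is not described here), and the p-values are toy values. With `pi0 = 1.0` the function reduces to the classical BH adjustment.

```python
def bh_adjust(pvalues, pi0=1.0):
    """Benjamini-Hochberg adjusted p-values, optionally scaled by pi0.

    Hypothetical helper for illustration. pi0 is an assumed, externally
    supplied estimate of the proportion of true null hypotheses;
    pi0 = 1.0 gives the classical procedure.
    """
    m = len(pvalues)
    # Indices of the p-values sorted in ascending order (rank 1 = smallest).
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest rank down so running_min is the minimum over j >= i.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pi0 * m * pvalues[idx] / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values only (not data from the study).
pvals = [0.01, 0.02, 0.03, 0.04, 0.05]
classical = bh_adjust(pvals)          # never below the raw p-values
adaptive = bh_adjust(pvals, pi0=0.5)  # may fall below the raw p-values
print(classical)
print(adaptive)
```

      In this toy case every classical adjusted value is at least as large as its raw p-value, whereas with an assumed `pi0` of 0.5 the largest raw p-values end up with smaller adjusted values, which is the situation the reviewer noticed in Table S1.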

      In addition, it is not common to assume that proteins that are quantitatively present in one condition and absent in another are differentially abundant proteins. Proteomics data software typically addresses this issue and applies some corrections. It would be advisable to review that.

      We thank the reviewer for highlighting this point. Indeed, some software imputes a random small value to replace missing values and then produces statistics based on these imputed data (10.1038/nmeth.3901). However, the validity and relevance of generating statistics in the absence of actual data are questionable.

      There are no universally accepted guidelines for handling this situation, and we believe it is more logical to set these values aside as potential interesting proteins. It is well-established that intensity values are often missing due to the detection limits of the spectrometer, suggesting that the missing values observed in several replicates of a condition are actually due to low values (see 10.1093/bioinformatics/btp362 and 10.1093/bioinformatics/bts193 for instance). It is thus logical to consider the associated proteins as potentially differentially abundant when comparing their complete absence in all replicates of one condition to their presence in several replicates of another condition.

      (1.2) Problems with the interpretation of Ribo-seq data (Figure 4D). On the one hand, the Ribo-seq data should be corrected (normalized) with the RNA-seq data in each of the conditions to obtain ribosome profiling data, since some genes could have more transcription in some of the conditions studied. In other articles in which this technique is used (such as in Tuorto et al., EMBO J. 2018; doi: 10.15252/embj.201899777), it is interpreted that those positions at which the ribosome moves most slowly (and which are therefore less efficiently translated) are the most abundant. Assuming this interpretation, according to the hypothesis proposed in this work, the fragments enriched in TAT codons should have been less abundant in the absence of Q-tRNA (tgt mutant) in the Ribo-seq experiment. However, what is observed is that TAT-enriched fragments are more abundant in the tgt mutant, and yet the Ribo-seq results are interpreted as RNA-seq, stating that this is because the genes corresponding to those sequences have greater expression in the absence of Q.

      As recommended by the reviewer, we normalized the RiboSeq data with the RNAseq data to account for potential RNA variations. The updated Figure 4 demonstrates that this normalization does not alter our findings, confirming that variations at the RNAseq level do not contradict changes at the translational level. 
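
      One common way to express such a normalization is as a change in translation efficiency, i.e. the ratio of ribosome footprint abundance to mRNA abundance, compared between conditions. The sketch below is a generic illustration and not necessarily the computation used for the updated Figure 4: the function name, the pseudocount, and the input values are all assumptions, and the inputs are presumed to be already library-size-normalized abundances.

```python
import math


def translation_efficiency_log2fc(ribo_mut, rna_mut, ribo_wt, rna_wt,
                                  pseudocount=1.0):
    """log2 fold change in translation efficiency (mutant vs. wild type).

    Hypothetical helper for illustration. Inputs are assumed to be
    library-size-normalized abundances for one gene; the pseudocount
    (an assumption) avoids division by zero for undetected genes.
    """
    te_mut = (ribo_mut + pseudocount) / (rna_mut + pseudocount)
    te_wt = (ribo_wt + pseudocount) / (rna_wt + pseudocount)
    return math.log2(te_mut) - math.log2(te_wt)


# Toy values only: footprints and mRNA both double, so efficiency is unchanged.
print(translation_efficiency_log2fc(200, 200, 100, 100))
# Toy values only: footprints double while mRNA stays flat.
print(translation_efficiency_log2fc(200, 100, 100, 100))
```

      A gene whose footprint signal rises only because its mRNA level rises yields a value near zero, so a positive value indicates a change at the translational level rather than at the transcript level.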

      The reviewer's observation that pauses at TAT codons would lead to ribosome accumulation and subsequent categorization as "up" genes is accurate. We must emphasize, however, that this category of “up genes” is probably quite diverse. The effect of ribosome stalling at TAT codons on total mRNA ribosome occupancy is likely highly variable, depending on the location of the TAT codon(s) within the CDS and the gene's expression level. We therefore think that genes in the "Up" category mainly correspond to genes that are more translated because the impact of pausing at TAT codons is probably not strong enough. Note that unlike what is usually done in bacterial riboseq experiments, we did not use any antibiotics to artificially freeze the ribosomes.

      On the other hand, it would be interesting to calculate the mean of the protein levels encoded by the transcripts with high and low ribosome profiling data.

      While this is a common request, we believe that comparing RiboSeq and proteomics data is not particularly informative. RiboSeq data directly measure translation, while proteomics provides information about protein abundance at steady state, reflecting the balance between protein synthesis and degradation. Furthermore, the number of proteins detectable by mass spectrometry is significantly smaller than the number of genes quantified by RiboSeq. Given these factors, there is often a low correlation between translation and protein abundance, making a direct comparison less relevant.

      (1.3) This statement is contrary to most previously reported studies on this topic in eukaryotes and bacteria, in which ribosome profiling experiments, among others, indicate that translation of TAT codons is slower than (or no different from) translation of TAC codons, and the same phenomenon is observed for the rest of the NAC/T codons. This is completely opposed to the results shown in Figure 4. However, the results of these studies are either not mentioned or not discussed in this work. Some examples of articles that should be discussed in this work:

      - "Queuosine-modified tRNAs confer nutritional control of protein translation" (Tuorto et al., 2018; 10.15252/embj.201899777)

      - "Preferential import of queuosine-modified tRNAs into Trypanosoma brucei mitochondrion is critical for organellar protein synthesis" (Kulkarni et al., 2021; doi: 10.1093/nar/gkab567)

      - "Queuosine-tRNA promotes sex-dependent learning and memory formation by maintaining codon-biased translation elongation speed" (Cirzi et al., 2023; 10.15252/embj.2022112507)

      - "Glycosylated queuosines in tRNAs optimize translational rate and post-embryonic growth" (Zhao et al., 2023; 10.1016/j.cell.2023.10.026)

      - "tRNA queuosine modification is involved in biofilm formation and virulence in bacteria" (Diaz-Rullo and Gonzalez-Pastor, 2023; doi: 10.1093/nar/gkad667). In this work, the authors indicate that Q-tRNA increases NAT codon translation in most bacterial species. Could the regulation of TAT codon-enriched proteins by Q-tRNAs in V. cholerae be an exception? In addition, the authors use a bioinformatic method, similar to the one used in this work, to identify genes enriched in NAT codons and to find the biological processes involving the genes whose expression is affected by Q-tRNAs (as discussed for the phenotype of UV resistance). It would be worth discussing all of this.

      Thank you for the detailed suggestions. We agree that this discussion was missing, and this comment gives us a chance to address that in the revised version of the manuscript.

      Of the references suggested by the referee, four were not mentioned in our manuscript. These were published while our manuscript was in review, and we realize we had not cited them in the latest version. We thank the referee for highlighting this. We have now included a discussion of these studies.

      We included the following in the discussion:

      “However, the opposite codon preference was shown in E. coli {Diaz-Rullo, 2023 #1888}. In eukaryotes also, several recent studies indicate slower translation of U-ending codons in the absence of Q34 {Cirzi, 2023 #1887;Kulkarni, 2021 #1886;Tuorto, 2018 #1268}. It’s important to note here, that in V. cholerae ∆tgt, increased decoding of U-ending codons is observed only with tyrosine, and not with the other three NAC/U codons (Histidine, Aspartate, Asparagine). This is interesting because it suggests that what we observe with tyrosine may not adhere to a general rule about the decoding efficiency of U- or C-ending codons, but instead seems to be specific to Tyr tRNAs, at least in the context of V. cholerae. Exceptions may also exist in other organisms. For example, in human cells, queuosine increases efficiency of decoding for U- ending codons and slows decoding of C- ending codons except for AAC {Zhao, 2023 #1889}. In this case, the exception is for tRNA Asparagine. Moreover, in mammalian cells {Tuorto, 2018 #1268}, ribosome pausing at U-ending codons is strongly seen for Asp, His and Asn, but less with Tyr. In Trypanosoma {Kulkarni, 2021 #1886}, reporters with a combination of the 4 NAC/NAU codons for Asp, Asn, Tyr, His have been tested, showing slow translation at U- ending version of the reporter in the absence of Q, but the effect on individual codons (e.g. Tyr only) is not tested. In mice {Cirzi, 2023 #1887}, ribosome slowdown is seen for the Asn, Asp, His U-ending codons but not for the Tyr U-ending codon. In summary, Q generally increases efficiency of U- ending codons in multiple organisms, but there appears to be additional unknown parameters which affect tyrosine UAU decoding, at least in V. cholerae. Additional factors such as mRNA secondary structures or mistranslation may also contribute to the better translation of UAU versions of tested genes. Mistranslation could be an important factor. 
If codon decoding fidelity impacts decoding speed, then mistranslation could also contribute to decoding efficiency of Tyr UAU/UAC codons and proteome composition.”

      (1.4) It is proposed that the stress produced by the TOB antibiotic causes greater translation of genes enriched in TAT codons. 

      Actually, it is the opposite: in the presence of TOB, tgt would be induced in the WT, leading to more Q on tRNA-Tyr and less translation of TAT.

      On the one hand, it is shown that the GFP-TAT version (gene enriched in TAT codons) and the RsxA-TAT-GFP protein (native gene naturally enriched in TAT) are expressed more, compared to their versions enriched in TAC, in a tgt mutant than in a wt, in the presence of TOB (Fig. 5C).

      Figure 5C shows relative fluorescence, i.e., changes in fluorescence in delta-tgt compared to WT. So the TAT versions are not necessarily more expressed in absolute terms, but "more increased".
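      The distinction between relative and absolute fluorescence can be illustrated with a small sketch. The numbers below are invented for illustration and are not taken from Figure 5C; the function and variable names are likewise hypothetical.

```python
def relative_fluorescence(mutant, wt):
    """Ratio of mutant to WT fluorescence for one reporter construct."""
    if wt <= 0:
        raise ValueError("WT fluorescence must be positive")
    return mutant / wt


# Hypothetical mean fluorescence values (arbitrary units).
fluorescence = {
    "TAT": {"WT": 100.0, "dtgt": 150.0},
    "TAC": {"WT": 300.0, "dtgt": 300.0},
}

relative = {
    version: relative_fluorescence(values["dtgt"], values["WT"])
    for version, values in fluorescence.items()
}

for version, ratio in relative.items():
    print(f"{version}: delta-tgt / WT = {ratio:.2f}")
```

      In this invented example the TAT version has the higher ratio even though its absolute fluorescence in delta-tgt is lower than that of the TAC version, which is why a relative plot alone does not show which construct is expressed more.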

      However, in the absence of TOB, and in a wt context, although the two versions of GFP have a similar expression level (Fig. 3SD), the same does not occur with RsxA, whose RsxA-TAT form (the native one) is expressed significantly more than the RsxA-TAC version (Fig. 3SA). How can it be explained that in a wt context, in which tRNA Q-modification is also present, a gene naturally enriched in TAT is translated better than the same gene enriched in TAC?

      We thank the referee for this question based on careful assessment of our data. We agree, there appears to be significantly more RsxA-TAT in WT than RsxA-TAC. This could be due to other effects such as secondary structure formation on mRNA when the wt RsxA is recoded with TAC codons. This does not hinder the conclusion that the translation of the TAT version is increased in delta-tgt compared to WT.  

      It would be expected that in the presence of Q-tRNAs the two versions would be translated equally (as happens with GFP) or even that the TAT version would be less translated. On the other hand, in the presence of TOB the fluorescence of WT GFP(TAT) is higher than the fluorescence of WT GFP(TAC) (Figure S3E) (mean fluorescence data for the RsxA-GFP version in the presence of TOB is not shown). These results may indicate that the apparent better translation of TAT versions could be due to indirect effects rather than to TAT codon translation.

      This is now mentioned in the manuscript:

      “We cannot exclude, however, that additional factors such as mRNA secondary structures also contribute to the better translation of UAU versions of tested genes.”

      (2) Another problem is related to the already known role of Q in prevention of stop codon readthrough, which is not discussed at all in the work. In the absence of Q, stop codon readthrough is increased. In addition, it is known that aminoglycosides (such as tobramycin) also increase stop codon readthrough ("Stop codon context influences genome-wide stimulation of termination codon readthrough by aminoglycosides"; Wangen and Green, 2020; 10.7554/eLife.52611). Absence of Q and presence of aminoglycosides can be synergistic, producing devastating increases in stop codon readthrough and a large alteration of global gene expression. All of this needs to be discussed in the work. Moreover, it is known that stop codon readthrough can alter gene expression and that the mRNA sequence context influences the likelihood of stop codon readthrough. Thus, this process could also affect the expression of the recoded GFP and RsxA versions.

      We included the following in the revised version of the manuscript (results):

      “Q modification impacts decoding fidelity in V. cholerae.

      To test whether a defect in Q34 modification influences the fidelity of translation in the presence and absence of tobramycin, previously developed reporter tools (Fabret & Namy, 2021) were used to measure stop codon readthrough in V. cholerae ∆tgt and wild-type strains. The system consists of vectors containing readthrough-promoting signals inserted between the lacZ and luc sequences, encoding β-galactosidase and luciferase, respectively. Luciferase activity reflects the readthrough efficiency, while β-galactosidase activity serves as an internal control of expression level, integrating a number of possible sources of variability (plasmid copy number, transcriptional activity, mRNA stability, and translation rate). We found increased readthrough at stop codons UAA and to a lesser extent at UAG for ∆tgt, and this increase was amplified for UAG in the presence of tobramycin (Fig. S2, stop readthrough). In the case of UAA, tobramycin appears to decrease readthrough; this may be artefactual, due to the toxic effect of tobramycin on ∆tgt.

      Mistranslation at specific codons can also impact protein synthesis. To further investigate mistranslation levels by tRNATyr in WT and ∆tgt, we designed a set of gfp mutants where the codon for the catalytic tyrosine required for fluorescence (TAT at position 66) was substituted by near-cognate codons (Fig. S2). Results suggest that in this sequence context, particularly in the presence of tobramycin, non-modified tRNATyr mistakenly decodes Asp GAC, His CAC and also Ser UCC, Ala GCU, Gly GGU, Leu CUU and Val GUC codons, suggesting that Q34 increases the fidelity of tRNATyr.

      In parallel, we replaced Tyr103 of the β-lactamase described above with Asp codons GAT or GAC. The expression of the resulting mutant β-lactamase is expected to yield a carbenicillin-sensitive phenotype. In this system, increased tyrosine misincorporation (more mistakes) by tRNATyr at the mutated Asp codon will lead to increased synthesis of active β-lactamase, which can be evaluated by carbenicillin tolerance tests. As such, amino-acid misincorporation leads here to phenotypic (transient) tolerance, while genetic reversion mutations result in resistance (growth on carbenicillin). The rationale is summarized in Fig. 3C. When the Tyr103 codon was replaced with either Asp codon, we observe increased β-lactamase tolerance (Fig. 3D, left), suggesting increased misincorporation of tyrosine by tRNATyr at Asp codons in the absence of Q, again suggesting that Q34 prevents misdecoding of Asp codons by tRNATyr.

      In order to test any effect on an additional tRNA modified by Tgt, namely tRNAAsp, we mutated the Asp129 (GAT) codon of the β-lactamase. When Asp129 was mutated to Tyr TAT (Fig. 3D, right), we observe reduced tolerance in ∆tgt, but not when it was mutated to Tyr TAC, suggesting less misincorporation of aspartate by tRNAAsp at the Tyr UAU codon in the absence of Q. In summary, absence of Q34 increases misdecoding by tRNATyr at Asp codons, but decreases misdecoding by tRNAAsp at Tyr UAU. 

      This supports the fact that tRNA Q34 modification is involved in translation fidelity during antibiotic stress, and that the effects can be different on different tRNAs, e.g. tRNATyr and tRNAAsp tested here.”

      Added figures: Figure S2, Figure 3CD
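      The dual-reporter logic described in the added text (luciferase as the readthrough signal, β-galactosidase as the internal expression control) can be sketched as follows. This is only an illustration under stated assumptions: the activity values are invented, the function names are hypothetical, and any further normalization to an in-frame control construct is not shown because it is not described here.

```python
def readthrough_ratio(luc_activity, lacz_activity):
    """Luciferase activity normalized to the beta-galactosidase control."""
    if lacz_activity <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luc_activity / lacz_activity


def fold_change(mutant_ratio, wt_ratio):
    """Readthrough in the mutant relative to the wild type."""
    return mutant_ratio / wt_ratio


# Hypothetical raw activities (arbitrary units) for one stop-codon reporter.
wt = readthrough_ratio(luc_activity=50.0, lacz_activity=1000.0)
dtgt = readthrough_ratio(luc_activity=120.0, lacz_activity=800.0)

change = fold_change(dtgt, wt)
print(f"WT ratio: {wt:.3f}, delta-tgt ratio: {dtgt:.3f}, fold change: {change:.1f}")
```

      Dividing by the β-galactosidase signal means that a strain that simply expresses less of the whole reporter (for instance because of plasmid copy number or growth effects) is not scored as having lower readthrough.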

      (3) The statement about that the TOB resistance depends on RsxA translation, which is related to the presence of Q, also presents some problems:

      (3.1) It is observed that the absence of tgt produces a growth defect in V. cholerae when exposed to TOB (Figure 1A), and it is stated that this is mediated by an increase in the translation of RsxA, because its gene is TAT enriched. However, in Figure S4F, it is shown that the same phenotype is observed in E. coli, but its rsxA gene is not enriched in TAT codons. Therefore, the growth defect observed in the tgt mutant in the presence of TOB may not be due to the increase in the translation of TAT codons of the rsxA gene in the absence of Q. This phenotype is very interesting, but it may be related to another molecular process regulated by Q. Maybe the role of Q in preventing stop codon readthrough is important in this process, reducing cellular stress in the presence of TOB and growing better.

      FigS4F (now figure 5D) shows that rsxA can be toxic during growth in presence of tobramycin, but it does not show that rsxA translation is increased in E. coli in delta-tgt. However, we agree with the referee that there are probably additional processes regulated by Q which are also involved in the response to TOB stress. We already had mentioned this briefly in the discussion (“Note that, our results do not exclude the involvement of additional Q-regulated MoTTs in the response to sub-MIC TOB, since Q modification leads to reprogramming of the whole proteome. “), we further discussed it as follows:

      “As a consequence, transcripts with tyrosine codon usage bias are differentially translated. One such transcript codes for RsxA, an anti-SoxR factor. SoxR controls a regulon involved in oxidative stress response and sub-MIC aminoglycosides trigger oxidative stress in V. cholerae{Baharoglu, 2013 #720}, pointing to an involvement of oxidative stress response in the response to sub-MIC tobramycin stress.

      A link between Q34 and oxidative stress has also been previously found in eukaryotic organisms {Nagaraja, 2021 #1466}. Note that our results do not exclude the involvement of additional Q-regulated translation of other transcripts in the response to tobramycin. Q34 modification leads to reprogramming of the whole proteome, not only for other transcripts with codon usage bias, but also through an impact on the levels of stop codon readthrough and mistranslation at specific codons, as supported by our data.”
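      As a simple illustration of what "tyrosine codon usage bias" means for a single transcript, the sketch below counts in-frame TAT and TAC codons in a coding sequence. It is not the bioinformatic method used in the manuscript; the function name and the toy sequence are hypothetical.

```python
def tyrosine_codon_usage(cds):
    """Count in-frame TAT and TAC codons and return the TAT fraction.

    Accepts DNA or RNA letters; the sequence is assumed to start in frame.
    Returns (tat_count, tac_count, tat_fraction), where tat_fraction is
    None if the sequence contains no tyrosine codons.
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    tat = codons.count("TAT")
    tac = codons.count("TAC")
    total = tat + tac
    fraction = tat / total if total else None
    return tat, tac, fraction


# Toy sequence: ATG TAT TAC TAT GAA TAA
tat_count, tac_count, tat_fraction = tyrosine_codon_usage("ATGTATTACTATGAATAA")
print(tat_count, tac_count, tat_fraction)
```

      Splitting the sequence into codons before counting matters, because a plain substring search would also pick up TAT or TAC triplets that span two adjacent codons.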

      (3.2) All experiments related to the effect of Q on the translation of TAT codons have been performed with the tgt mutant strain. Considering that the authors have a pSEVA-tgt plasmid to overexpress this gene, they would have to show whether tgt overexpression in a wt strain produces a decrease in the translation of proteins encoded by TAT-enriched genes such as RsxA. This experiment would allow them to conclude that Q reduces RsxA levels, increasing resistance to TOB.

      We agree that this would be interesting to test. However, as can be seen in Figure 1B, delta-tgt pSEVA-tgt (complemented strain) grows better than WT pSEVA-tgt (tgt overexpression). In fact, overexpression of tgt negatively impacts cell growth and yields smaller colonies, especially when cells carry a second plasmid (e.g., with gfp constructs). We have also seen this with overexpression of other RNA modification genes in the lab (unpublished). We believe that the expression of tgt is tuned, and since overexpression affects fitness, it is generally difficult to conduct experiments with overexpression plasmids for RNA modification genes. Nevertheless, we have done the experiment (with slow-growing bacteria), and when we normalize expression of gfp in the presence of the tgt-overexpressing plasmid to the condition with no plasmid, we see little (1.5-fold) or no effect of tgt overexpression on fluorescence (see graph below). This is probably due to a toxic effect of overexpression, and we do not believe these results are biologically relevant.

      Author response image 1.

      (3.3) On the other hand, Fig. 1B shows that when the wt and tgt strains compete, both overexpressing tgt, the tgt mutant strain grows better in the presence of TOB. This result is not very well understood, since according to the hypothesis proposed, the absence of modification by Q of the tRNA would increase the translation of genes enriched in TAT, therefore, a strain with a higher proportion of Q-modified tRNAs as in the case of the wt strain overexpressing tgt would express the rsxA gene less than the tgt strain overexpressing tgt and would therefore grow better in the presence of TOB. For all these reasons, it would be necessary to evaluate the effect of tgt overexpression on the translation of RsxA.

      See our answer above about negative effect of tgt overexpression.

      (3.4) According to Figure 1I, the overexpression of tRNA-Tyr(GUA) caused better growth of the tgt mutant in comparison to WT. If the growth defect observed in the tgt mutant in the presence of TOB is due to better translation of the TAT codons of the rsxA gene, the overexpression of tRNA-Tyr(GUA) in the tgt mutant should have resulted in even better RsxA translation and worse growth, not the opposite result.

      We agree; we think that rsxA is not the only factor responsible for the growth defect of tgt in the presence of TOB (as now further discussed in the discussion). Overexpression of tRNA-Tyr possibly changes the equilibrium between the decoding of TAC vs TAT and may restore translation of TAC-enriched genes. As also suggested by Reviewer 3, we have measured decoding reporters for TAT/TAC while overexpressing tRNA-Tyr. This is now added to the results in fig S2C and the following:

      “We also tested decoding reporters for TAT/TAC in WT and ∆tgt overexpressing tRNATyr in trans (Fig. S1C). The presence of the plasmid (empty p0) amplified differences between the two strains with decreased decoding of TAC (and increased TAT, as expected) in ∆tgt compared to WT. Overexpression of tRNATyrGUA did not significantly impact decoding of TAT and increased decoding of TAC, as expected. Since overexpression of tRNATyrGUA rescues ∆tgt in tobramycin (Fig. 1I) and facilitates TAC decoding, this suggests that issues with TAC codon decoding contribute to the fitness defect observed in ∆tgt upon growth with tobramycin. Overexpression of tRNATyrAUA increased decoding of TAT in WT but did not change it in ∆tgt where it is already high. Unexpectedly, overexpression of tRNATyrAUA also increased decoding of TAC in WT. Thus, overexpression of tRNATyrAUA possibly changes the equilibrium between the decoding of TAC vs TAT and may restore translation of TAC enriched transcripts.” 

      Added figure: figure S1C

      (4) It cannot be stated that DNA repair is more efficient in the tgt mutant of V. cholerae, as indicated in the text of the article and in Fig 7. The authors only observe that the tgt mutant is more resistant to UV radiation and it is suggested that the reason may be TAT bias of DNA repair genes. To validate the hypothesis that UV resistance is increased because DNA repair genes are TAT biased, it would be necessary to check if DNA repair is affected by Q. UV not only produces DNA damage, but also oxidative stress. Therefore, maybe this phenotype is due to the increase in proteins related to oxidative stress controlled by RsxA, such as the superoxide dismutase encoded by sodA. It is also stated that these repair genes were found up for the tgt mutant in the Ribo-seq data, with unchanged transcription levels. Again, it is necessary to clarify this interpretation of the Ribo-seq data, since the fact that they are more represented in a tgt mutant perhaps means that translation is slower in those transcripts. Has it been observed in proteomics (wt vs tgt in the absence of TOB) whether these proteins involved in repair are more expressed in a tgt mutant?

      We agree that our results do not directly show that DNA repair is more efficient, but rather that delta-tgt responds better to UV. This has been modified in the manuscript. Regarding oxidative stress, we did not see a better or worse response of delta-tgt to H2O2. Moreover, since we see a better response of delta-tgt to UV only in V. cholerae and not in E. coli, we did not favor the hypothesis of a response to oxidative stress. In proteomics, we do not detect changes for DNA repair genes except for RuvA, which is more abundant in delta-tgt. We have toned down the statement about DNA repair in the paper.

      (5) The authors demonstrate that in E. coli the tgt mutant does not show greater resistance to UV radiation (Fig. 7D), unlike what happens in V. cholerae. It should be discussed that in previous works it has been observed that overexpression in E. coli of the tgt gene or the queF gene (Q biosynthesis) is involved in greater resistance to UV radiation (Morgante et al., Environ Microbiol, 2015 doi: 10.1111/1462-2920.12505; and Díaz-Rullo et al., Front Microbiol. 2021 doi: 10.3389/fmicb.2021.723874). As an explanation, it was proposed (Diaz-Rullo and Gonzalez-Pastor, NAR 2023 doi: 10.1093/nar/gkad667) that the observed increase in the capacity to form biofilms in strains that overexpress genes related to Q modification of tRNA would be related to this greater resistance to UV radiation.

      We now mention the previous observations suggesting a link between tgt and UV. We thank the referee for the reference, which we had overlooked. Note that in the case of our experiments, all cultures are in planktonic form and are not allowed to form biofilms. We thus prefer not to discuss biofilm-linked processes in this study.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript the authors begin with the interesting phenotype of sub-inhibitory concentrations of the aminoglycoside tobramycin proving toxic to a knockout of the tRNA-guanine transglycosylase (Tgt) of the important human pathogen, Vibrio cholerae. Tgt is important for incorporating queuosine (Q) in place of guanosine at the wobble position of GUN anticodons. The authors go on to define a mechanism of action where environmental stressors control expression of tgt to control translational decoding of tyrosine codons in particular, skewing the balance from TAC towards TAT decoding in the absence of the enzyme. The authors use advanced proteomics and ribosome profiling to reveal that the loss of tgt results in increased translation of proteins like RsxA and a cohort of DNA repair factors, whose genes harbor an excess of TAT codons in many cases. These findings are bolstered by a series of molecular reporters, mass spectrometry, and tRNA overexpression strains to provide support for a model where Tgt serves as a molecular pivot point to reprogram translational output in response to stress.

      Strengths:

      The manuscript has many strengths. The authors use a variety of strains, assays, and advanced techniques to discover a mechanism of action for Tgt in mediating tolerance to sub-inhibitory concentrations of tobramycin. They observe a clear phenotype for a tRNA modification in facilitating reprogramming of the translational response, and the manuscript certainly has value in defining how microbes tolerate antibiotics.

      We thank the referee for their time and comments. 

      Weaknesses:

      The conclusions of the manuscript are mostly very well-supported by the data, but in some places control experiments or peripheral findings cloud precise conclusions. Some additional clarification, discussion, or even experimental extension could be useful in strengthening these areas.

      (1) The authors have created and used a variety of relevant molecular tools. In some cases, using these tools in additional assays as controls would be helpful. For example, testing for compensation of the observed phenotypes by overexpression of the Tyrosine tRNA(GUA) in Figure 2A with the 6xTAT strain, Figure 5C with the rsxA-GFP fusion, and/or Figure 7B with UV stress would provide additional information on the ability of tRNA overexpression to compensate for the defect in these situations.

      Thank you for the suggestions. Since overexpression of tRNA tyr is not expected to decrease decoding of TAT, we do not necessarily expect any effect for UV and rsxA expression. Overexpression of tRNA_GUA restores fitness of delta-tgt in TOB, but this is probably independent of RsxA. As ref2 also suggested above, we included in the discussion that the effect seen in delta-tgt with TOB is not only due to RsxA expression but also additional processes. However, these suggestions are interesting and we performed the following experiments in order to have an answer for these questions: 

      - “testing for compensation of the observed phenotypes by overexpression of the Tyrosine tRNA(GUA) in Figure 2A with the 6xTAT strain”: 

      This is now included in figure S2C and results as follows: 

      “We also tested decoding reporters for TAT/TAC in WT and ∆tgt overexpressing tRNA-Tyr in trans (Fig. S1C). The presence of the plasmid amplified differences between the two strains with decreased decoding of TAC (and increased TAT, as expected) in ∆tgt with empty plasmid compared to WT. Overexpression of tRNA_TyrGUA did not significantly impact decoding of TAT and increased decoding of TAC as expected. Since overexpression of tRNA_TyrGUA rescues ∆tgt in tobramycin (Fig. 1I) and facilitates TAC decoding, this suggests that issues with TAC codon decoding contribute to the fitness defect observed in ∆tgt upon growth with tobramycin. Overexpression of tRNA_TyrAUA increased decoding of TAT in WT but did not change it in ∆tgt where it is already high. Interestingly, overexpression of TyrAUA also increased decoding of TAC in WT. Thus, overexpression of tRNA_TyrAUA possibly changes the equilibrium between the decoding of TAC vs TAT and may restore translation of TAC enriched transcripts.”

      - Figure 5C with the rsxA-GFP fusion:

      When we overexpress tRNA_GUA, rsxA fluorescence is 2-fold higher in delta-tgt compared to wt. However, the fluorescence is highly decreased compared to the condition with no tRNA overexpression. While we are not sure whether this apparent decrease is a technical issue or not (e.g. due to the presence of additional plasmid), we prefer not to further explore this in this manuscript. Note that we could not obtain delta-tgt strain carrying both plasmids expressing tRNA_GUA and rsxA, suggesting toxic overproduction of rsxA in this context.

      Author response image 2.

      - Figure 7B with UV stress: 

      Here again, delta-tgt overexpressing tRNA_GUA is still more UV resistant than WT overexpressing tRNA_GUA.

      Author response image 3.

      (2) The authors present a clear story with a reprogramming towards TAT codons in the knockout strain, particularly regarding tobramycin treatment. The control experiments often hint at other codons also contributing to the observed phenotypes (e.g., His or Asp), yet these effects are mostly ignored in the discussion. It would be helpful to discuss these findings at a minimum in the discussion section, or possibly experimentally address the role of His or Asp by overexpression of these tRNAs together with Tyrosine tRNA(GUA) in an experiment like that of Figure 1I to see if a more "wild type" phenotype would present. In fact, the synergy of Tyr, His, and/or Asp codons likely helps to explain the effects observed with the DNA repair genes in later experiments.

      We thank the referee for the suggestion. We agree that there could be synergies between these codons, and that is probably why the proteomics data do not clearly reflect tyrosine codon usage bias. This is now further discussed in the Ideas and Speculation section.

      Moreover, we have added Figure S3G and the following result:

      “Since not all TAT biased proteins are found to be enriched in ∆tgt proteomics data, the sequence context surrounding TAT codons could affect their decoding. To illustrate this, we inserted after the gfp start codon, various tyrosine containing sequences displayed by rsxA (Fig. S3G). The native tyrosines were all TAT codons, our synthetic constructs were either TAT or TAC, while keeping the remaining sequence unchanged. We observe that the production of GFP carrying the TEYTATLLL sequence from RsxA is increased in Δtgt compared to WT, while it is unchanged with TEYTACLLL. However, production of the GFP with the sequences LYTATRLL/LYTACRLL and EYTATLR/EYTACLR was unaffected (or even decreased for the latter) by the absence of tgt. Overall, our results demonstrate that RsxA is upregulated in the ∆tgt strain at the translational level, and that proteins with a codon usage bias towards tyrosine TAT are prone to be more efficiently translated in the absence of Q modification, but this is also dependent on the sequence context.”

      (3) Regarding Figure 6D, the APB northern blot feels like an afterthought. It was loaded with different amounts of RNA as input and some samples are repeated three times, but Δcrp only once. Collectively, it makes this experiment very difficult to assess.

      A different amount of RNA was used only for ∆tgt in which we have only one band because of the absence of modification. For all the other conditions, the same amount of RNA was used (0.9 µg). Additional replicates of crp were in an additional gel but only a representative gel was shown in the manuscript. This is now specified in the legend.

      We also attach below the picture of the gel with total RNA (SYBR Gold labelling of total RNA), where it can be seen that the lanes contain an equivalent quantity of RNA, except for ∆tgt.

      Author response image 4.

      Minor Points:

      (3) Fig S2B, do the authors have a hypothesis why the Asp and Phe tRNAs lead to a growth decrease in the untreated samples? It appears like Phe(GAA) partially compensates for the defect.

      Yes, we agree. Unfortunately, at this stage we do not have a satisfactory answer. This would be interesting to study further, but it is beyond the scope of the present study.

      (5) Lines 655 to 660 seem more appropriate as speculation in the discussion rather than as a conclusion in the results, where no direct experiments are performed. The authors might take advantage of the "Ideas and Speculation" section that eLife allows.

      Thank you very much for this suggestion, we added this section to the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor.

      - Figure 6 - Fonts on several mutants is different size/type. fixed

      - What is the Pm promoter. Please expand and give enough details so reader can follow. Especially as it is less used in V. cholerae (typical being pBAD or pTAC promoters). done

      - Spacing where references are inserted should be checked. done

      - Line 860-863 - "V. cholerae's response to sub-MIC antibiotic stress is transposable to other Gram-negative pathogens". This reads awkward. Consider rephrasing. done

      - Figure 7 - Text in A and C is very small and is very hard to read. Font for tgt is different.

      Fixed. Tgt is in italics.

      Reviewer #2 (Recommendations For The Authors):

      As specified in the public review, more evidence would be necessary to affirm that tRNAs not modified by Q have a greater preference for translating TAT codons, since there are several previous studies in which it is shown that Q-tRNAs have a greater preference for NAT codons (including TAT). For example, it is suggested to explore what happens with other recoded genes (enriched in TAT or TAC) if there is a high level of Q-tRNAs (overexpression of tgt in a wt context). It is also necessary to clarify how to interpret the Ribo-seq results, which apparently is different from how they have been interpreted in other studies.

      Please see above our responses and changes made to the manuscript.

      Minor corrections

      In Figure 8, replace "Epitranscriptomic adapation to stress" with "Epitranscriptomic adaptation to stress".

      Fixed, thank you for noticing!

      Reviewer #3 (Recommendations For The Authors):

      (1) Lines 48-50, and 110 to 112, the authors have a nice mechanism and story, yet the lines mentioned feel very qualified (e.g., "possibly", "plausibly") and lead to the abstract hiding the value and major conclusions of the study. The authors could consider to revise or even remove these lines to focus on the take-home message in the abstract and end of introduction/discussion. 

      Thank you for this comment, we modified the text.  

      (2) Additional description for the samples in the results section for Figure 1 would be helpful to the reader.

      Done

      (3) Figure S1, the line of experiments with rluF is interesting, but in the end the choice seems a little random. Have the authors assessed knockouts of other modifications on the ASL for effects? Since the modification is not well characterized in V. cholerae according to the authors, it might make sense to save this for a future paper.

      We removed S1, as we agree that this experiment does not really add something to the paper.

      (4) Line 334 and 353 are redundant.

      Fixed

      (5) It is likely beyond the scope of the study, but it would strengthen the paper to repeat Figure 3 with His and/or Asp based on the findings of 2C and 4E to better understand the contribution of His and Asp to Q biology.

      We repeated Figure 3 with Asp. Based on Fig 2C (less efficient decoding of GAC in delta-tgt in TOB) and 4E (positive GAT codon bias in proteins up in riboseq in delta-tgt TOB), we would expect that beta-lactamase with Asp GAC would be less efficiently decoded than GAT in delta-tgt.

      This was added to the manuscript

      “Like Tyr103, Asp129 was shown to be important for resistance to β-lactams (Doucet et al., 2004; Escobar et al., 1994; Jacob et al., 1990). When we replaced the native Asp129 GAT with the synonymous codon Asp129 GAC, the GAC version did not appear to produce functional β-lactamase in ∆tgt (Fig. 3B), suggesting increased mistranslation or inefficient decoding of the GAC codon by tRNAAsp in the absence of Q. Decoding of GAT codon was also affected in ∆tgt in the presence of tobramycin.”

      Added figure: Figure 3B

      (6) The authors could consider replacing 5D with S4A-D, which is easier to understand in our opinion.

      Done

    2. eLife Assessment

      This study investigates the role of queuosine (Q) tRNA modification in aminoglycoside tolerance in Vibrio cholerae and presents convincing evidence to conclude that Q is essential for the efficient translation of TAT codons, although this depends on the context. The absence of Q reduces aminoglycoside tolerance potentially by reprogramming the translation of an oxidative stress response gene, rsxA. Overall, the findings point to an important mechanism whereby changes in Q modification levels control the decoding of mRNAs enriched in TAT codons under antibiotic stress.

    3. Reviewer #1 (Public review):

      Summary of the work:

      In this work, Fruchard et al. study the enzyme Tgt and how it modifies guanine in tRNAs to queuosine (Q), essential for Vibrio cholerae's growth under aminoglycoside stress. Q's role in codon decoding efficiency and its proteomic effects during antibiotic exposure is examined, revealing Q modification impacts tyrosine codon decoding and influences RsxA translation, affecting the SoxR oxidative stress response. The research proposes Q modification's regulation under environmental cues reprograms the translation of genes with tyrosine codon bias, including DNA repair factors, crucial for bacterial antibiotic response.

      The experiments are well-designed and conducted and the conclusions, for the most part, are well-supported by the data.

      Comments on revisions:

      The authors have answered my queries

    4. Reviewer #2 (Public review):

      Fruchard et al. investigate the role of the queuosine (Q) modification of the tRNA (Q-tRNA) in the human pathogen Vibrio cholerae. First, the authors state that the absence of Q-modified tRNAs (tgt mutant) increases the translation of TAT codons and proteins with a high TAT codon bias. Second, the absence of Q increases rsxA translation, because rsxA gene has a high TAT codon bias. Third, increased RsxA in the absence of Q inhibits SoxR response, reducing resistance towards the antibiotic tobramycin (TOB). Authors also predict in silico which genes harbor a higher TAT bias and found that among them are some involved in DNA repair, experimentally observing that a tgt mutant is more resistant to UV than the wt strain. It is worth noting that authors employ a wide variety of techniques, both experimental and bioinformatics.
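
      To make the notion of TAT codon bias concrete, the following minimal Python sketch counts the two tyrosine codons in a coding sequence and reports the fraction that are TAT. It is an illustration only, not the authors' in silico pipeline; the function names `codon_counts` and `tat_fraction` and the toy sequence are hypothetical.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons in a coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def tat_fraction(cds):
    """Fraction of tyrosine codons that are TAT; None if no Tyr codon."""
    counts = codon_counts(cds)
    tyr_total = counts["TAT"] + counts["TAC"]
    if tyr_total == 0:
        return None
    return counts["TAT"] / tyr_total


# Toy sequence (hypothetical): ATG TAT TAT TAC TAA
print(tat_fraction("ATGTATTATTACTAA"))
```

      A gene would be called TAT-biased only relative to a genome-wide reference fraction, which this sketch does not compute.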

      The authors have satisfactorily responded to most of the comments that needed clarification. Particularly interesting was the addition of the new results section "Q modification impacts decoding fidelity in V. cholerae", after the suggestion to explore the role of Q in prevention of stop codon readthrough. Although it is not a major problem, since the article is very complete and interesting, the interpretation of the results of RiboSeq data carried out in this work remains controversial. This technique, at least when it has been used in eukaryotes to investigate whether there is a bias in the translation of certain codons affected by Q (Tuorto et al., EMBO J. 2018; doi: 10.15252/embj.201899777), has been interpreted as meaning that ribosomes spend less time in the optimal codons and therefore there is an increase in occupancy at codons where translation slows down. On the other hand, it has been observed that "in ribosome profiling experiments conducted without cycloheximide pretreatment, there is a clear inverse relationship between tRNA abundance and ribosome occupancy, showing that ribosomes spend less time at optimal codons and specifically this has been observed in experiments in which a translation inhibitor such as cycloheximide is not used (see review: Hanson G & Coller J. Nat Rev Mol Cell Biol. doi: 10.1038/nrm.2017.91, and experiments in yeast: Hussmann JA et al. PLoS Genet. doi: 10.1371/journal.pgen.1005732). On the other hand, we believe that the comparison between RiboSeq and proteomic data would be interesting to check whether this interpretation of the RiboSeq data is correct. It should not be a problem that the proteomics data could be incomplete, it would just be a more limited study. If the correct interpretation of the RiboSeq results is as proposed by the authors, a correlation should be observed between the abundance of TAT-enriched RNA fragments and the most abundant proteins. 
Therefore, it would be interesting to perform this comparison and see if significant results are obtained that help to understand the correct interpretation of the RiboSeq experiments.

    5. Reviewer #3 (Public review):

      Summary:

      In this manuscript the authors begin with the interesting phenotype of sub inhibitory concentrations of the aminoglycoside tobramycin proving toxic to a knockout of the tRNA-guanine transglycosylase (Tgt) of the important human pathogen, Vibrio cholerae. Tgt is important for incorporating queuosine (Q) in place of guanosine at the wobble position of tRNAs with GUN anticodons. The authors go on to define a mechanism of action where environmental stressors control expression of tgt to control translational decoding of particularly tyrosine codons, skewing the balance from TAC towards TAT decoding in the absence of the enzyme. The authors use advanced proteomics and ribosome profiling to reveal that the loss of tgt results in increased translation of proteins like RsxA and a cohort of DNA repair factors, whose genes harbor an excess of TAT codons in many cases. These findings are bolstered by a series of molecular reporters, mass spectrometry, and tRNA overexpression strains to provide support for a model where Tgt serves as a molecular pivot point to reprogram translational output in response to stress. The manuscript therefore improves our understanding of the phenotype of focus and will prove useful for the field in our understanding of Modification Tunable Transcripts.

      Strengths:

      The manuscript has many strengths. The authors use a variety of strains, assays, and advanced techniques to discover a mechanism of action for Tgt in mediating tolerance to sub inhibitory concentrations of tobramycin. They observe a clear phenotype for a tRNA modification in facilitating reprogramming of the translational response, and the manuscript certainly has value in defining how microbes tolerate antibiotics.

      Weaknesses:

      The conclusions of the manuscript are mostly very well-supported by the data, but a few experimental directions remain inconclusive. The finding linking Tgt and UV damage susceptibility is one example where the phenotype is striking, but the mechanism remains somewhat unclear. Future work in this direction will likely be required to fully understand how Tgt influences the repair of DNA after UV.

    1. eLife Assessment

      The study highlights adhesion G-protein-coupled receptor A3 (ADGRA3) as a potential target for activating adaptive thermogenesis in both white and brown adipose tissue. This finding offers valuable insights for researchers in the field of adipose tissue biology and metabolism. The authors have presented additional evidence to address the reviewers' comments, including experiments conducted on primary stromal vascular fractions from adipose tissues. However, the revised manuscript fails to address several reviewer concerns, such as the measurement of whole-body energy expenditure through indirect calorimetry and the assessment of food intake. Furthermore, the nanoparticle-mediated knockdown of Adgra3 did not adequately address the tissue selectivity of ADGRA in mice. As a result, the primary claims of the study are only partially supported by the available data, leading to the conclusion that the research is deemed incomplete.

    2. Joint Public Review:

      Based on bioinformatics and expression analysis using mouse and human samples, the authors claim that the adhesion G-protein coupled receptor ADGRA3 may be a valuable target for increasing thermogenic activity and metabolic health. Genetic approaches to deplete ADGRA3 expression in vitro resulted in reduced expression of thermogenic genes including Ucp1, reduced basal respiration and metabolic activity as reflected by reduced glucose uptake and triglyceride accumulation. In line with this, nanoparticle delivery of shAdgra3 constructs is associated with increased body weight, reduced thermogenic gene expression in white and brown adipose tissue (WAT, BAT), and impaired glucose and insulin tolerance. On the other hand, ADGRA3 overexpression is associated with an improved metabolic profile in vitro and in vivo, which can be explained by increasing the activity of the well-established Gs-PKA-CREB axis. Notably, a computational screen suggested that ADGRA3 is activated by hesperetin. This metabolite is a derivative of the major citrus flavonoid hesperidin and has been described to promote metabolic health. Using appropriate in vitro and in vivo studies, the authors show that hesperetin supplementation is associated with increased thermogenesis, UCP1 levels in WAT and BAT, and improved glucose tolerance, an effect that was attenuated in the absence of ADGRA3 expression.

      The revised manuscript fails to address several reviewer concerns, such as the measurement of whole-body energy expenditure through indirect calorimetry and the assessment of food intake.

      The previous reviews are here: https://elifesciences.org/reviewed-preprints/100205v2/reviews#tab-content

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This article identifies ADGRA3 as a candidate GPCR for mediating beige fat development. The authors use human expression data from the Human Protein Atlas and GTEx databases and combine this with experiments performed in mice and a murine cell line. They refer to a GPCR bioactivity screening tool, PRESTO-Salsa, with which it was found that hesperetin activates ADGRA3. From their experiments, the authors conclude that hesperetin activates ADGRA3, inducing a Gs-PKA-CREB axis resulting in adipose thermogenesis.

      Strengths:

      The authors analyze human data from public databases and perform functional studies in mouse models. They identify a new GPCR with a role in thermogenic activation of adipocytes.

      Considerations:

      Selection of ADGRA3 as a candidate GPCR relevant for mediating beiging in humans:

      The authors identify GPCRs that are expressed more highly in murine iBAT compared to iWAT in response to cold and assess which of these GPCRs are expressed in human subcutaneous or visceral adipocytes. Although this strategy will identify GPCRs that are expressed at higher levels in brown fat compared to beige and thus possibly more active in thermogenic function, the relevance in choosing GPCRs that also are expressed in unstimulated human white adipocytes should be considered. Thermogenic activity is not normally present in human white adipocytes. It would have strengthened the GPCR selection if the authors instead had assessed the intersection with human brown adipocytes that were activated with norepinephrine.

      We appreciate your constructive feedback and believe that by adopting this refined strategy, we will strengthen our selection of GPCRs related to adipose thermogenesis in other ongoing studies. We look forward to continuing our research in this area and contributing to the understanding of adipose thermogenesis and its potential therapeutic applications. Thank you once again for your valuable input. 

      Strategy to investigate the role of ADGRA3 in WAT beiging:

      Having identified ADGRA3 as their candidate receptor, the authors investigated the receptor in mouse models, the murine inguinal adipocyte cell line 3T3 and in human subcutaneous adipose progenitors (HAdsc) differentiated in vitro. Calling the human cells "beige" is a stretch as these cells are derived from a white adipose depot. The authors do observe regulation in UCP1 and abundance of mitochondria following modification of ADGRA3 in the cells. However, in future studies, it should be considered if the receptor rather plays a role in differentiation per se, and perhaps not specifically in thermogenic differentiation/activity.

      Regarding the reviewer's suggestion to consider whether ADGRA3 plays a role in differentiation per se, rather than specifically in thermogenic differentiation/activity, we acknowledge that this is an important consideration. Our current studies have focused on the role of ADGRA3 in regulating UCP1 expression and mitochondrial abundance, which are hallmarks of adipose thermogenic activity. However, we recognize that ADGRA3 may also have broader roles in adipocyte differentiation and function that are not limited to thermogenesis.

      To address this point, in future studies, we plan to conduct additional experiments to investigate the potential role of ADGRA3 in adipocyte differentiation, including its effects on the expression of markers of adipocyte differentiation and its impact on adipocyte metabolism and function. These studies will provide further insights into the mechanisms by which ADGRA3 regulates adipocyte biology.

      According to the Human Protein Atlas and Gtex databases, ADGRA3 is not only expressed in adipocytes, but also in other tissues and cell types. The authors address this by measuring the expression in a panel of these tissues, demonstrating a knockdown not only in the adipose tissue, but also in the liver and less pronounced in the muscle (Figure S2). It should thus be emphasized that the decreased TG levels in serum and liver in the mice might in fact depend on Adgra3 overexpression in the liver. Even though this might not have been the purpose of the experiment, it is important to highlight this as it could serve as hypothesis building for future studies of the function of this receptor.

      Thank you for your thoughtful comments and feedback. We appreciate the insight provided by the Human Protein Atlas and Gtex databases regarding the tissue distribution of ADGRA3. We fully acknowledge that the decreased TG levels observed in both the serum and liver of the mice might be linked to the overexpression of Adgra3 in the liver.

      Although this was not the primary objective of our experiment, we agree that this observation is worth highlighting as it could serve as a basis for future hypothesis-driven research on the functional role of ADGRA3 in different tissues. In light of your comments, we emphasized this potential link between Adgra3 overexpression in the liver and reduced TG levels in discussion, as follows.

      “…the precise mechanisms underlying the influence of on adipose thermogenesis. Furthermore, it is crucial to highlight that the observed decrease in TG levels in both serum and liver (Figure 4-figure supplement 2C-D) might be attributed to the significant increase in Adgra3 expression in the liver, which is a consequence of the nanoparticle-mediated overexpression of Adgra3. While the exact mechanism remains to be fully elucidated, this correlation suggests a potential link between Adgra3 overexpression in the liver and reduced TG levels in the serum. We will employ more sophisticated models in subsequent studies to further…”

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Zhao et al. explored the function of adhesion G protein-coupled receptor A3 (ADGRA3) in thermogenic fat biology.

      Strengths:

      Through both in vivo and in vitro studies, the authors found that the gain function of ADGRA3 leads to browning of white fat and ameliorates insulin resistance.

      Weaknesses:

      There are several lines of weak methodologies such as using 3T3-L1 adipocytes and intraperitoneal (i.p.) injection of virus. Moreover, as the authors stated that ADGRA3 is constitutively active, how could the authors then identify a chemical ligand?

      Comments on revised version:

      The revised manuscript by Zhao et al. has limited improvement. The authors refused to perform revised experiments using primary cultures even though two reviewers pointed out the same weakness (3T3-L1 adipocytes are unsuitable). Using infrared thermography to measure body temperature is also problematic.

      Thanks for your comments. We regret that human adipocytes induced from human adipose-derived stem cells (hADSCs) were not recognized as primary cultures by multiple reviewers. Therefore, we have included relevant experimental results of mouse primary adipocytes induced from stromal vascular fraction (SVF) in Figure 8E-H as a supplement. The thermal imaging device was used to measure the temperature of BAT, while the body temperature was measured at 9:00 using a rectal probe connected to a digital thermometer.

    1. eLife Assessment

      This important study presents a data processing pipeline to discover causal interactions from time-lapse imaging data, and convincingly illustrates it on a challenging application for the analysis of tumor-on-chip ecosystem data. The authors describe the raw data they used (imaging data), go through a step-by-step description of how to extract the features they are interested in from the raw data, and how to perform the causal discovery process. This paper tackles the problem of learning causal interactions from temporal data, which is applicable to many biological applications.

    2. Reviewer #1 (Public review):

      Summary:

      This paper presents a data processing pipeline to discover causal interactions from time-lapse imaging data and convincingly illustrates it on a challenging application for the analysis of tumor-on-chip ecosystem data.

      The core of the discovery module is the original tMIIC method of the authors, which is shown in supplementary material to compare favourably to two state-of-the-art methods on synthetic temporal data on a 15 nodes network.

      Strengths:

      This paper tackles the problem of learning causal interactions from temporal data which is an open problem in presence of latent variables.

      The core of the method tMIIC of the authors is nicely presented in connection to Granger-Schreiber causality and to the novel graphical conditions used to infer latent variables and based on a theorem about transfer entropy.

      tMIIC compares favourably to PC and PCMCI+ methods using different kernels on synthetic datasets generated from a network of 15 nodes.

      A full application to tumor-on-chip cellular ecosystems data including cancer cells, immune cells, cancer-associated fibroblasts, endothelial cells and anti cancer drugs, with convincing inference results with respect to both known and novel effects between those components and their contact.

      The code and dataset are available online for the reproducibility of the results.

      Weaknesses:

      The references to "state-of-the-art methods" concerning the inference of causal networks should be more precise by giving citations in the main text, and better discussed in general terms, both in the first section and in the section of presentation of CausalXtract. It is only in the legend of the figures of the supplementary material that we get information.

      Of course, comparison on our own synthetic datasets can always be criticized but this is rather due to the absence of a common benchmark in this domain compared to other domains. I recommend the authors to explicitly propose their datasets made accessible in supplementary material as benchmark for the community.

      Comments on revisions:

      This is a very nice paper.

    3. Reviewer #2 (Public review):

      Summary:

      The authors propose a methodology to perform causal (temporal) discovery. The approach appears to be robust and is tested in two different scenarios: one related to live-cell imaging data, and another one using synthetic (mathematically defined) time series data. They compare the performance of their findings against another well-known method by using metrics like F-score, precision and recall.
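
      For readers unfamiliar with these metrics in the context of network inference, the sketch below shows one common way to score an inferred set of directed edges against a known ground-truth graph. It is a generic illustration, not the benchmarking code of the paper; the function name `edge_scores` and the toy edge sets are hypothetical, and orientation is assumed to matter (an edge inferred in the wrong direction counts as a false positive).

```python
def edge_scores(true_edges, inferred_edges):
    """Return (precision, recall, F-score) for directed edge recovery."""
    true_edges = set(true_edges)
    inferred_edges = set(inferred_edges)
    true_positives = len(true_edges & inferred_edges)
    precision = true_positives / len(inferred_edges) if inferred_edges else 0.0
    recall = true_positives / len(true_edges) if true_edges else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example (hypothetical): three true edges, two inferred,
# one of which has the wrong orientation.
truth = {("X", "Y"), ("Y", "Z"), ("X", "Z")}
inferred = {("X", "Y"), ("Z", "Y")}
print(edge_scores(truth, inferred))
```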

      Strengths:

      Performance, robustness, the text is clear and concise, the authors provide the code to review.

      Comments on revisions:

      The authors have addressed my concerns properly providing the needed explanations.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This paper presents a data processing pipeline to discover causal interactions from time-lapse imaging data, and convincingly illustrates it on a challenging application for the analysis of tumor-on-chip ecosystem data. The core of the discovery module is the original tMIIC method of the authors, which is shown in supplementary material to compare favourably to two state-of-the-art methods on synthetic temporal data on a 15-node network.

      Strengths:

      This paper tackles the problem of learning causal interactions from temporal data which is an open problem in presence of latent variables. The core of the method tMIIC of the authors is nicely presented in connection to Granger-Schreiber causality and to the novel graphical conditions used to infer latent variables and based on a theorem about transfer entropy. tMIIC compares favourably to PC and PCMCI+ methods using different kernels on synthetic datasets generated from a network of 15 nodes. A full application to tumor-on-chip cellular ecosystems data including cancer cells, immune cells, cancer-associated fibroblasts, endothelial cells and anti cancer drugs, with convincing inference results with respect to both known and novel effects between those components and their contact.

      The code and dataset are available online for the reproducibility of the results.

      We thank Reviewer #1 for highlighting the main results and strengths of our paper, as well as, for his/her recommendations below to further improve the manuscript.

      Weaknesses:

      The references to “state-of-the-art methods” concerning the inference of causal networks should be more precise by giving citations in the main text, and better discussed in general terms, both in the first section and in the section of presentation of CausalXtract. It is only in the legend of the figures of the supplementary material that we get information. Of course, comparison on our own synthetic datasets can always be criticized but this is rather due to the absence of common benchmark and I would recommend the authors to explicitly propose their datasets as benchmark to the community.

      Following Reviewer #1’s suggestion, we now compare tMIIC’s performance to other state-of-the-art causal discovery methods for time series data in the main text and in a new Figure 2. This Figure 2 also highlights the relation between graph-based causal discovery methods for time series data and Granger-Schreiber temporal causality, as discussed in more details in Methods (Theorem 1).

      We also agree about the importance of sharing benchmark datasets with the community. This is the reason why we provide the dynamical equations of the 15-node benchmarks in Supplementary Tables 1 & 2, so that anyone can generate equivalent time series datasets of any desired length.
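
      As a hedged illustration of how time series of any desired length can be generated from a set of dynamical equations, the Python sketch below simulates a toy two-variable lagged system in which X drives Y. The equations, coefficients and the function name `simulate_coupled_pair` are hypothetical and are not the 15-node benchmark equations of Supplementary Tables 1 & 2.

```python
import random


def simulate_coupled_pair(n_steps, coupling=0.8, noise=0.1, seed=0):
    """Simulate x[t] = 0.5*x[t-1] + e and y[t] = 0.3*y[t-1] + coupling*x[t-1] + e.

    Hypothetical toy system: X causes Y with a lag of one time step.
    """
    rng = random.Random(seed)  # seeded for reproducible benchmark data
    x, y = [0.0], [0.0]
    for t in range(1, n_steps):
        x.append(0.5 * x[t - 1] + rng.gauss(0, noise))
        y.append(0.3 * y[t - 1] + coupling * x[t - 1] + rng.gauss(0, noise))
    return x, y


x, y = simulate_coupled_pair(1000, seed=42)
print(len(x), len(y))
```

      Fixing the seed makes a generated dataset reproducible, while changing `n_steps` yields an equivalent dataset of a different length.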

      Reviewer #2 (Public review):

      Summary:

      The authors propose a methodology to perform causal (temporal) discovery. The approach appears to be robust and is tested in two different scenarios: one related to live-cell imaging data, and another one using synthetic (mathematically defined) time series data. They compare the performance of their findings against another well-known method by using metrics like F-score, precision and recall.

      Strengths:

      Performance, robustness, the text is clear and concise, the authors provide the code to review.

      We thank Reviewer #2 for his/her positive assessment of our work and the suggestions below to improve the manuscript.

      Weaknesses:

      One concern could be the applicability of the method in other areas like climate, economy. For those areas, public data are available and might be interesting to test how the method performs with this kind of data.

      While our main expertise concerns the analysis of biological and biomedical data, we agree that tMIIC (which is included in the MIIC R package) could in principle be applied to other areas, such as climate and economics.

      We have not included benchmarks on such diverse types of datasets in the present manuscript, which focuses on CausalXtract’s pipeline for the analysis and causal interpretation of live-cell time-lapse imaging data from complex cellular systems.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      We thank Reviewer 1 for their helpful comments and hope that the changes made to the revised manuscript have addressed their points.

      This study presents a novel application of the inverted encoding (i.e., decoding) approach to detect the correlates of crossmodal integration in the human EEG (electrophysiological) signal. The method is successfully applied to data from a group of 41 participants, performing a spatial localization task on auditory, visual, and audiovisual events. The analyses clearly show a behavioural superiority for audio-visual localization. Like previous studies, the results when using traditional univariate ERP analyses were inconclusive, showing once more the need for alternative, more sophisticated approaches. Instead, the principal approach of this study, harnessing the multivariate nature of the signal, captured clear signs of super-additive responses, considered by many as the hallmark of multisensory integration. Unfortunately, the manuscript lacks many important details in the descriptions of the methodology and analytical pipeline. Although some of these details can eventually be retrieved from the scripts that accompany this paper, the main text should be self-contained and sufficient to gain a clear understanding of what was done. (A list of some of these is included in the comments to the authors). Nevertheless, I believe the main weakness of this work is that the positive results obtained and reported in the results section are conditioned upon eye movements. When artifacts due to eye movements are removed, then the outcomes are no longer significant. 

      Therefore, whether the authors finally achieved the aims and showed that this method of analysis is truly a reliable way to assess crossmodal integration, does not stand on firm ground. The worst-case scenario is that the results are entirely accounted for by patterns of eye movements in the different conditions. In the best-case scenario, the method might truly work, but further experiments (and/or analyses) would be required to confirm the claims in a conclusive fashion.

      One first step toward this goal would be, perhaps, to facilitate the understanding of results in context by reporting both the uncorrected and corrected analyses in the main results section. Second, one could try to support the argument given in the discussion, pointing out the origin of the super-additive effects in posterior electrode sites, by also modelling frontal electrode clusters and showing they aren't informative as to the effect of interest.

      We performed several additional analyses to address concerns that our main result was caused by different eye movement patterns between conditions. We re-ran our key analyses using activity exclusively from frontal electrodes, which revealed poorer decoding performance than that from posterior electrodes. If eye movements were driving the non-linear enhancement in the audiovisual condition, we would expect stronger decoding using sensors closer to the source, i.e., the extraocular muscles. We also computed the correlations between average eye position and stimulus position for each condition to evaluate whether participants made larger eye movements in the audiovisual condition, which might have contributed to better decoding results. Though we did find evidence for eye movements toward stimuli, the degree of movement did not significantly differ between conditions.

      Furthermore, we note that the analysis using a stricter eye movement criterion, acknowledged in the Discussion section of the original manuscript, resulted in very similar results to the original analysis. There was significantly better decoding in the AV condition (as measured by d') than the MLE prediction, but this difference did not survive cluster correction. The most likely explanation for this is that the strict eye movement criterion combined with our conservative measure of (mass-based) cluster correction led to reduced power to detect true differences between conditions. Taken together with the additional analyses described in the revised manuscript and supplementary materials, the results show that eye movements are unlikely to account for differences between the multisensory and unisensory conditions. Instead, our decoding results likely reflect nonlinear neural integration between audio and visual sensory information.
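
      For orientation, one common formulation of the maximum-likelihood estimation (MLE) benchmark for sensitivity predicts the bimodal d' as the quadratic sum of the two unimodal d' values; a measured audiovisual d' above that prediction is then called super-additive. The Python sketch below illustrates this under the assumption that this formulation applies; whether it matches the exact computation in the manuscript is not confirmed here, and the function names and input values are hypothetical.

```python
import math


def mle_predicted_dprime(d_auditory, d_visual):
    """Optimal-combination prediction: sqrt(d_A^2 + d_V^2) (assumed form)."""
    return math.hypot(d_auditory, d_visual)


def is_super_additive(d_audiovisual, d_auditory, d_visual):
    """True if the measured bimodal d' exceeds the MLE prediction."""
    return d_audiovisual > mle_predicted_dprime(d_auditory, d_visual)


# Hypothetical values, for illustration only.
print(mle_predicted_dprime(0.3, 0.4))
print(is_super_additive(0.18, 0.10, 0.10))
```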

      “Any experimental design that varies stimulus location needs to consider the potential contribution of eye movements. We computed correlations between participants’ average eye position and each stimulus position between the three sensory conditions (auditory, visual and audiovisual; Figure S1) and found evidence that participants made eye movements toward stimuli. A re-analysis of the data with a very strict eye-movement criterion (i.e., removing trials with eye movements >1.875º) revealed that the super-additive enhancement in decoding accuracy no longer survived cluster correction, suggesting that our results may be impacted by the consistent motor activity of saccades towards presented stimuli. Further investigation, however, suggests this is unlikely. Though the correlations were significantly different from 0, they were not significantly different from each other. If consistent saccades to audiovisual stimuli were responsible for the nonlinear multisensory benefit we observed, we would expect to find a higher positive correlation between horizontal eye position and stimulus location in the audiovisual condition than in the auditory or visual conditions. Interestingly, eye movements corresponded more to stimulus location in the auditory and audiovisual conditions than in the visual condition, indicating that it was the presence of a sound, rather than a visual stimulus, that drove small eye movements. This could indicate that participants inadvertently moved their eyes when localising the origin of sounds. We also re-ran our analyses using the activity measured from the frontal electrodes alone (Figure S2). If the source of the nonlinear decoding accuracy in the audiovisual condition was due to muscular activity produced by eye movements, there should be better decoding accuracy from sensors closer to the source. 
Instead, we found that decoding accuracy of stimulus location from the frontal electrodes (peak d' = 0.08) was less than half that of decoding accuracy from the more posterior electrodes (peak d' = 0.18). These results suggest that the source of neural activity containing information about stimulus position was located over occipito-parietal areas, consistent with our topographical analyses (inset of Figure 3).” 

      The univariate ERP analyses use an outdated contrast, AV <> A + V, to capture multisensory integration. A number of authors have pointed out the potential problem of double baseline subtraction when using this contrast, and have recommended a number of solutions, experimental and analytical. See for example: [1] and [2]. 

      (1) Teder-Salejarvi, W. A., McDonald, J. J., Di Russo, F., & Hillyard, S. A. (2002). Cognitive Brain Research, 14, 106-114. 

      (2) Talsma, D., & Woldorff, M. G. (2005). Journal of cognitive neuroscience, 17(7), 1098-1114.

      We thank the reviewer for raising this point. Comparing ERPs across different sensory conditions requires careful analytic choices to discern genuine sensory interactions within the signal. The AV <> (A + V) contrast has often been used to detect multisensory integration, though any non-signal related activity (i.e. anticipatory waves; Talsma & Woldorff, 2005) or pre-processing manipulation (e.g. baseline subtraction; Teder-Sälejärvi et al., 2002) will be doubled in (A + V) but not in AV. Critically, we did not apply a baseline correction during preprocessing and thus our results are not at risk of double-baseline subtraction in (A + V). Additionally, we temporally jittered the presentation of our stimuli to mitigate the potential influence of consistent overlapping ERP waves (Talsma & Woldorff, 2005). 

      The results section should provide the neurometric curve/s used to extract the slopes of the sensitivity plot (Figure 2B). 

      We thank the reviewer for raising this point of clarification. The sensitivity plots for Figures 2B and 2C were extracted from the behavioural performance of the behavioural and EEG tasks, respectively. The sensitivity plot for Figure 2B was extracted from individual psychometric curves, whereas the d’ values for Figure 2C were calculated from the behavioural data for the EEG task. This information has been clarified in the manuscript.

      “Figure 1. Behavioural performance is improved for audiovisual stimuli. A) Average accuracy of responses across participants in the behavioural session at each stimulus location for each stimulus condition, fitted to a psychometric curve. Steeper curves indicate greater sensitivity in identifying stimulus location. B) Average sensitivity across participants in the behavioural task, estimated from psychometric curves, for each stimulus condition. The red cross indicates estimated performance assuming optimal (MLE) integration of unisensory cues. C) Average behavioural sensitivity across participants in the EEG session for each stimulus condition. Error bars indicate ±1 SEM.”

      The encoding model was fitted for each electrode individually; I wonder if important information contained as combinations of (individually non-significant) electrodes was then lost in this process and if the authors consider that this is relevant. 

      Although the encoding model was fitted for each electrode individually for the topographic maps (Figure 4B), in all other analyses the encoding model was fitted across a selection of electrodes (see final inset of Figure 3). As this electrode set was used for all other neural analyses, our model would allow for the detection of important information contained in the neural patterns across electrodes. This information has been clarified in the manuscript.

      “Thus, for all subsequent analyses we only included signals from the central-temporal, parietal-occipital, occipital and inion sensors for computing the inverse model (see final inset of Figure 2). As the model was fitted for multiple electrodes, subtle patterns of neural information contained within combinations of sensors could be detected.”

      Neurobehavioral correlations could benefit from outlier rejection and the use of robust correlation statistics. 

      We thank the reviewer for raising this issue. Note, however, that the correlations we report are resistant to the influence of outliers because we used Spearman’s rho (1), as opposed to Pearson’s. This information has been communicated in the manuscript.

      (1) Wilcox, R.R. (2016), Comparing dependent robust correlations. British Journal of Mathematical & Statistical Psychology, 69(3), 215-224. https://doi.org/10.1111/bmsp.12069

      “Neurobehavioural correlations. As behavioural and neural data violated assumptions of normality, we calculated rank-order correlations (Spearman’s rho) between the average decoding sensitivity for each participant from 150-250 ms poststimulus onset and behavioural performance on the EEG task. As Spearman’s rho is resistant to outliers (Wilcox, 2016), we did not perform outlier rejection.”

      “Wilcox, R.R. (2016), Comparing dependent robust correlations. British Journal of Mathematical & Statistical Psychology, 69(3), 215-224. https://doi.org/10.1111/bmsp.12069”
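      The rank-based reasoning in this response can be illustrated with a minimal sketch. The data below are hypothetical and are not taken from the study. The helper names (`average_ranks`, `pearson`, `spearman_rho`) are introduced here for illustration only. The sketch computes Spearman's rho as a Pearson correlation on ranks, which is why a single extreme value changes it far less than it changes Pearson's r.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-participant values; the last behavioural score is an extreme outlier.
behaviour = [0.2, 0.4, 0.5, 0.7, 0.9, 5.0]
decoding = [0.05, 0.08, 0.10, 0.12, 0.15, 0.16]

rho = spearman_rho(behaviour, decoding)
r = pearson(behaviour, decoding)
```

      In this toy example the ordering of the two variables is identical, so rho is unaffected by how extreme the outlier is, whereas Pearson's r is pulled away from 1 by it.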

      Many details that are important for the reader to evaluate the evidence and to understand the methods and analyses aren't given; this is a non-exhaustive list:  

      We thank the reviewer for highlighting these missing details. We have updated the manuscript where necessary to ensure the methods and analyses are fully detailed and replicable.

      - specific parameters of the stimuli and performance levels. Just saying "similarly difficult" or "marginally higher volume" is not enough to understand exactly what was done.  

      “The perceived source location of auditory stimuli was manipulated via changes to interaural level and timing (Whitworth & Jeffress, 1961; Wightman & Kistler, 1992). The precise timing of when each speaker delivered an auditory stimulus was calculated from the following formula:

      where x and z are the horizontal and forward distances in metres between the ears and the source of the sound on the display, respectively, r is the head radius, and s is the speed of sound. We used a constant approximate head radius of 8 cm for all participants. r was added to x for the left speaker and subtracted for the right speaker to produce the interaural time difference. For ±15° source locations, interaural timing difference was 1.7 ms. To simulate the decrease in sound intensity as a function of distance, we calculated interaural level differences for the left and right speakers by dividing the sounds by the left and right distance vectors. Finally, we resampled the sound using linear interpolation based on the calculations of the interaural level and timing differences. This process was used to calculate the soundwaves played by the left and right speakers for each of the possible stimulus locations on the display. The maximum interaural level difference between speakers was 0.14 A for ±15° auditory locations, and 0.07 A for ±7.5°.”

      - where are stimulus parameters adjusted individually or as a group? Which method was followed?  

      To clarify, stimulus parameters (frequency, size, luminance, volume, location, etc.) were manipulated throughout pilot testing only. Parameters were adjusted to achieve similar pilot behavioural results between the auditory and visual conditions. For the experiment proper, parameters remained constant for both tasks and were the same for all participants.

      “During pilot testing, stimulus features (size, luminance, volume, frequency etc.) were manipulated to make visual and auditory stimuli similarly difficult to spatially localize. These values were held constant in the main experiment.”

      - specify which response buttons were used.

      “Participants were presented with two consecutive stimuli and tasked with indicating, via button press, whether the first (‘1’ number-pad key) or second (‘2’ number-pad key) interval contained the more leftward stimulus.”

      “At the end of each sequence, participants were tasked with indicating, via button press, whether more presentations appeared on the right (‘right’ arrow key) or the left (‘left’ arrow key) of the display.”

      - no information is given as to how many trials per condition remained on average, for analysis.  

      The average number of remaining trials per condition after eye-movement analysis is now included in the Methods section of the revised manuscript.

      “We removed trials with substantial eye movements (>3.75 away from fixation) from the analyses. After the removal of eye movements, on average 2365 (SD = 56.94), 2346 (SD = 152.87) and 2350 (SD = 132.47) trials remained for auditory, visual and audiovisual conditions, respectively, from the original 2400 per condition.”

      - no information is given on the specifics of participant exclusion criteria. (even if the attrition rate was surprisingly high, for such an easy task).  

      The behavioural session also served as a screening task. Although the task instructions were straightforward, perceptual discrimination was not easy due to the ambiguity of the stimuli. Auditory localization is not very precise, and the visual stimuli were brief, dim, and diffuse. The behavioural results reflect the difficulty of the task. Attrition rate was high as participants who scored below 60% correct in any condition were deemed unable to accurately perform the task, were not invited to complete the subsequent EEG session, and omitted from the analyses. We have included the specific criteria in the manuscript.

      “Participants were first required to complete a behavioural session with above 60% accuracy in all conditions to qualify for the EEG session (see Behavioural session for details).”

      - EEG pre-processing: what filter was used? How was artifact rejection done? (no parameters are reported); How were bad channels interpolated?  

      We used a 0.25 Hz high-pass filter to remove baseline drifts, but no low-pass filter. In line with recent studies on the undesirable influence of EEG preprocessing on ERPs (1), we opted to avoid channel interpolation and artifact rejection. This was erroneously reported in the manuscript and has now been clarified. For the sake of clarity, here we demonstrate that a reanalysis of data using channel interpolation and artifact rejection returned the same pattern of results. 

      (1) Delorme, A. (2023). EEG is better left alone. Scientific Reports, 13, 2372. https://doi.org/10.1038/s41598-023-27528-0

      - specific electrode locations must be given or shown in a plot (just "primarily represented in posterior electrodes" is not sufficiently informative).  

      A diagram of the electrodes used in all analyses is included within Figure 3, and we have drawn readers’ attention to this in the revised manuscript.

      “Thus, for all subsequent analyses we only included signals from the central-temporal, parietal-occipital, occipital and inion sensors for computing the inverse model (see final inset of Figure 2).” 

      - ERP analysis: which channels were used? What is the specific cluster correction method?

      We used a conservative mass-based cluster correction from Pernet et al. (2015) - this information has been clarified in the manuscript.

      “A conservative mass-based cluster correction was applied to account for spurious differences across time (Pernet et al., 2015).” 

      “Pernet, C. R., Latinus, M., Nichols, T. E., & Rousselet, G. A. (2015). Cluster-based computational methods for mass univariate analyses of event-related brain potentials/fields: A simulation study. Journal of Neuroscience Methods, 250, 85-93. https://doi.org/10.1016/j.jneumeth.2014.08.003” 

      - results: descriptive stats on performance must be given (instead of saying "participants performed well").  

      The mean and standard deviation of participants’ performance for each condition in the behavioural and EEG experiments are now explicitly mentioned in the manuscript.

      “A quantification of the behavioural sensitivity (i.e., steepness of the curves) revealed significantly higher sensitivity for the audiovisual stimuli (M = .04, SD = .02) than for the auditory stimuli alone (M = .03, SD = .01; Z = -3.09, p = .002), and than for the visual stimuli alone (M = .02, SD = .01; Z = -5.28, p = 1.288e-7; Figure 1B). Sensitivity for auditory stimuli was also significantly higher than sensitivity for visual stimuli (Z = 2.02, p = .044).” 

      “We found a similar pattern of results to those in the behavioural session; sensitivity for audiovisual stimuli (M = .85, SD = .33) was significantly higher than for auditory (M = .69, SD = .41; Z = -2.27, p = .023) and visual stimuli alone (M = .61, SD = .29; Z = -3.52, p = 4.345e-4), but not significantly different from the MLE prediction (Z = -1.07, p = .285).” 

      - sensitivity in the behavioural and EEG sessions is said to be different, but no comparison is given. It is not even the same stimulus set across the two tasks...  

      This relationship was noted as a potential explanation for the higher sensitivities obtained in the EEG task, and was not intended to stand up to statistical scrutiny. We agree it makes little sense to compare statistically between the EEG and behavioural results as they were obtained from different tasks. We would like to clarify, however, that the stimuli used in the two tasks were the same, with the exception that in the EEG task the stimuli were presented from 5 locations versus 8 in the behavioural task. To avoid potential confusion, we have removed the offending sentence from the manuscript:

      Reviewer 2:

      Their measure of neural responses is derived from the decoder responses, and this takes account of the reliability of the sensory representations - the d' statistics - which is an excellent thing. It also means if I understand their analysis correctly (it could bear clarifying - see below), that they can generate from it a prediction of the performance expected if an optimal decision is made combining the neural signals from the individual modalities. I believe this is the familiar root sum of squares d' calculation (or very similar). Their decoding of the audiovisual responses comfortably exceeds this prediction and forms part of the evidence for their claims. 

      Yet, superadditivity - including that in evidence in the principle of inverse effectiveness - more typically quantifies the excess over the sum of proportions correct in each modality. Their MLE d' statistic can already predict this form of superadditivity. Therefore, the superadditivity they report here is not the same form of superadditivity that is usually referred to in behavioural studies. It is in fact a stiffer definition. What their analysis tests is that decoding performance exceeds what would be expected from an optimally weighted linear integration of the unisensory information. As this is not the common definition it is difficult to relate to behavioral superadditivity reported in much literature (of percentage correct). This distinction is not at all clear from the manuscript. 

      But the real puzzle is here: The behavioural data for this task do not exceed the optimal statistical decision predicted by signal detection theory (the MLE d'). Yet, the EEG data would suggest that the neural processing is exceeding it. So why, if the neural processing is there to yield better performance, is it not reflected in the behaviour? I cannot explain this, but it strikes me that the behaviour and neural signals are for some reason not reflecting the same processing. 

      Be explicit and discuss this mismatch they observe between behaviour and neural responses. 

      Thank you, we agree that it is worth expanding on the observed disconnect between MSI in behaviour and neural signals. We have included an additional paragraph in the Discussion of the revised manuscript. Despite the mismatch, we believe the behavioural and neural responses still reflect the same underlying processing, but at different levels of sensitivity. The behavioural result likely reflects a coarse down-sampling of the precision in location representation, and thus less likely to reflect subtle MSI enhancements.

      “An interesting aspect of our results is the apparent mismatch between the behavioural and neural responses. While the behavioural results meet the optimal statistical threshold predicted by MLE, the decoding analyses suggest that the neural response exceeds it. Though non-linear neural responses and statistically optimal behavioural responses are reliable phenomena in multisensory integration (Alais & Burr, 2004; Ernst & Banks, 2002; Stanford & Stein, 2007), the question remains – if neural super-additivity exists to improve behavioural performance, why is it not reflected in behavioural responses? A possible explanation for this neurobehavioural discrepancy is the large difference in timing between sensory processing and behavioural responses. A motor response would typically occur some time after the neural response to a sensory stimulus (e.g., 70-200 ms), with subsequent neural processes between perception and action that introduce noise (Heekeren et al., 2008) and may obscure super-additive perceptual sensitivity. In the current experiment, participants reported either the distribution of 20 serially presented stimuli (EEG session) or compared the positions of two stimuli (behavioural session), whereas the decoder attempts to recover the location of every presented stimulus. While stimulus location could be represented with higher fidelity in multisensory relative to unisensory conditions, this would not necessarily result in better performance on a binary behavioural task in which multiple temporally separated stimuli are compared. One must also consider the inherent differences in how super-additivity is measured at the neural and behavioural levels. Neural super-additivity should manifest in responses to each individual stimulus. In contrast, behavioural super-additivity is often reported as proportion correct, which can only emerge between conditions after being averaged across multiple trials. 
The former is a biological phenomenon, while the latter is an analytical construct. In our experiment, we recorded neural responses for every presentation of a stimulus, but behavioural responses were only obtained after multiple stimulus presentations. Thus, the failure to find super-additivity in behavioural responses might be due to their operationalisation, with between-condition comparisons lacking sufficient sensitivity to detect super-additive sensory improvements. Future work should focus on experimental designs that can reveal super-additive responses in behaviour.”

      Re-work the introduction to explain more clearly the relationship between the behavioural superadditivities they review, the MLE model, and the superadditivity it actually tests. 

      We agree it is worth discussing how super-additivity is operationalised across neural and behavioural measures. However, we do not believe the behavioural studies we reviewed claimed super-additive behavioural enhancements. While MLE is often used as a behavioural marker of successful integration, it is not necessarily used as evidence for super-additivity within the behavioural response, as it relies on linear operations. 

      “It is important to consider the differences in how super-additivity is classified between neural and behavioural measures. At the level of single neurons, superadditivity is defined as a non-linear response enhancement, with the multisensory response exceeding the sum of the unisensory responses. In behaviour, meanwhile, it has been observed that the performance improvement from combining two senses is close to what is expected from optimal integration of information across the senses (Alais & Burr, 2004; Stanford & Stein, 2007). Critically, behavioural enhancement of this kind does not require non-linearity in the neural response, but can arise from a reliability-weighted average of sensory information. In short, behavioural performance that conforms to MLE is not necessarily indicative of neural super-additivity, and the MLE model can be considered a linear baseline for multisensory integration.”

      Regarding the auditory stimulus, this reviewer notes that interaural time differences are unlikely to survive free field presentation.

      Despite the free field presentation, in both the pilot test and the study proper participants were able to localize auditory stimuli significantly above chance. 

      "However, other studies have found super-additive enhancements to the amplitude of sensory event-related potentials (ERPs) for audiovisual stimuli (Molholm et al., 2002; Talsma et al., 2007), especially when considering the influence of stimulus intensity (Senkowski et al., 2011)." - this makes it obvious that there are some studies which show superadditivity. It would have been good to provide a little more depth here - as to what distinguished those studies that reported positive effects from those that did not.

      We have provided further detail on how super-additivity appears to manifest in neural measures.

      “In EEG, meanwhile, the evoked response to an audiovisual stimulus typically conforms to a sub-additive principle (Cappe et al., 2010; Fort et al., 2002; Giard & Peronnet, 1999; Murray et al., 2016; Puce et al., 2007; Stekelenburg & Vroomen, 2007; Teder-Sälejärvi et al., 2002; Vroomen & Stekelenburg, 2010). However, when the principle of inverse effectiveness is considered and relatively weak stimuli are presented together, there has been some evidence for super-additive responses (Senkowski et al., 2011).”

      “While behavioural outcomes for multisensory stimuli can be predicted by MLE, and single neuron responses follow the principles of inverse effectiveness and super-additivity, among others (Rideaux et al., 2021), how audiovisual super-additivity manifests within populations of neurons is comparatively unclear given the mixed findings from relevant fMRI and EEG studies. This uncertainty may be due to biophysical limitations of human neuroimaging techniques, but it may also be related to the analytic approaches used to study these recordings. For instance, superadditive responses to audiovisual stimuli in EEG studies are often reported from very small electrode clusters (Molholm et al., 2002; Senkowski et al., 2011; Talsma et al., 2007), suggesting that neural super-additivity in humans may be highly specific. However, information encoded by the brain can be represented as increased activity in some areas, accompanied by decreased activity in others, so simplifying complex neural responses to the average rise and fall of activity in specific sensors may obscure relevant multivariate patterns of activity evoked by a stimulus.”

      P9. "(25-75 W, 6 Ω)." This is not important, but it is a strange way to cite the power handling of a loudspeaker. 

      “The loudspeakers had a power handling capacity of 25-75 W and a nominal impedance of 6 Ω.” 

      I am struggling to understand the auditory stimulus: 

      "Auditory stimuli were 100 ms clicks". Is this a 100-ms long train of clicks? A single pulse which is 100ms long would not sound like a click, but two clicks once filtered by the loudspeaker. Perhaps they mean 100us. 

      "..with a flat 850 Hz tone embedded within a decay envelope". Does this mean the tone is gated - i.e. turns on and off slowly? Or is it constant?

      We thank the reviewer for catching this. ‘Click’ may not be the most apt way of defining the auditory stimulus. It was a 100 ms square wave tone with decay, i.e., with an onset at maximal volume before fading gradually. Given that the length of the stimulus was 100 ms, the decay occurs quickly and provides a more ‘click-like’ percept than a pure tone. We have provided a representation of the sound below for further clarification. This represents the amplitude from the L and R speakers for maximally-left and maximally-right stimuli. We have added this clarification in the revised manuscript. 

      Author response image 1.

      “Auditory stimuli were 100 ms, 850 Hz tones with a decay function (sample rate = 44,100 Hz; volume = 60 dBA SPL, as measured at the ears).”
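      To make the stimulus description concrete, the following is a minimal sketch of a 100 ms, 850 Hz square-wave tone with a decay envelope sampled at 44,100 Hz. The exponential form of the envelope and the time constant `DECAY_TAU_S` are assumptions made for illustration, because the response does not specify the decay function. It is not the authors' stimulus code.

```python
import math

SAMPLE_RATE_HZ = 44_100   # from the revised manuscript text
DURATION_S = 0.1          # 100 ms stimulus
FREQUENCY_HZ = 850        # tone frequency
DECAY_TAU_S = 0.02        # hypothetical time constant; not stated in the source


def decaying_square_tone():
    """Return samples of a square-wave tone that starts at full amplitude and fades."""
    n_samples = round(SAMPLE_RATE_HZ * DURATION_S)
    samples = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE_HZ
        # Square wave: +1 for the first half of each cycle, -1 for the second half.
        phase = (FREQUENCY_HZ * t) % 1.0
        square = 1.0 if phase < 0.5 else -1.0
        # Assumed exponential envelope: maximal at onset, decaying gradually.
        samples.append(square * math.exp(-t / DECAY_TAU_S))
    return samples


tone = decaying_square_tone()
```

      With these settings the tone contains 4,410 samples, begins at maximal amplitude, and has faded to under 1% of that amplitude by the final sample.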

      P10. "Stimulus modality was either auditory, visual, or audiovisual. Trials were blocked with short (~2 min) breaks between conditions".

      Presumably the blocks were randomised across participants.

      Condition order was not randomised across participants, but counterbalanced. This has been clarified in the manuscript.

      “Stimulus modality was auditory, visual or audiovisual, presented in separate blocks with short breaks (~2 min) between conditions (see Figure 6A for an example trial). The order of conditions was counterbalanced across participants.” 
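      One common way to counterbalance three blocked conditions is to cycle participants through all six possible block orders. The sketch below shows that scheme only as an illustration. The response does not state which counterbalancing scheme was used, and `block_order` is a hypothetical helper.

```python
from itertools import permutations

CONDITIONS = ("auditory", "visual", "audiovisual")

# All 3! = 6 possible orders of the three condition blocks.
ORDERS = list(permutations(CONDITIONS))


def block_order(participant_index):
    """Assign block orders by cycling through every permutation in turn."""
    return ORDERS[participant_index % len(ORDERS)]


first_six = [block_order(i) for i in range(6)]
```

      Under this assumed scheme, every group of six consecutive participants covers each order exactly once, so each condition appears equally often in each block position.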

      P15. Feels like there is a step not described here: "The d' of the auditory and visual conditions can be used to estimate the predicted 'optimal' sensitivity of audiovisual signals as calculated through MLE." Do they mean sqrt[ (d'A)^2 + (d'V)^2] ? If it is so simple then it may as well be made explicit here. A quick calculation from eyeballing Figures 2B and 2C suggests this is the case.

      We thank the reviewer for raising this point of clarification. Yes, the ‘optimal’ audiovisual sensitivity was calculated as the hypotenuse of the auditory and visual sensitivities. This calculation has been made explicit in the revised manuscript.

      The d’ from the auditory and visual conditions can be used to estimate the predicted ‘optimal’ sensitivity to audiovisual signals as calculated through the following formula:

      $d'_{AV} = \sqrt{(d'_{A})^2 + (d'_{V})^2}$
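      The root-sum-of-squares calculation confirmed in the response above can be sketched in a few lines. The function name `mle_dprime` is introduced here for illustration. The inputs are the group-mean behavioural sensitivities quoted for the EEG session (auditory M = .69, visual M = .61). Applying the formula to group means is a simplification and need not equal the average of per-participant predictions.

```python
import math


def mle_dprime(d_auditory, d_visual):
    """Predicted audiovisual d' under optimal (MLE) combination of two cues."""
    # Hypotenuse of the two unisensory sensitivities: sqrt(dA^2 + dV^2).
    return math.hypot(d_auditory, d_visual)


# Group means reported for the EEG-session behavioural task (illustrative use only).
predicted_av = mle_dprime(0.69, 0.61)
```

      For these inputs the prediction is approximately 0.92, which is above either unisensory value, as expected for a combination rule that can only add sensitivity.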

      "The perceived source location of auditory stimuli was manipulated via changes to interaural intensity and timing (Whitworth & Jeffress, 1961; Wightman & Kistler, 1992)." The stimuli were delivered by a pair of loudspeakers, and the incident sound at each ear would be a product of both speakers. And - if there were a time delay between the two speakers, then both ears could potentially receive separate pulses one after the other at different delays. Did they record this audio stimulus with manikin? If not, it would be very difficult to know what it was at the ears. I don't doubt that if they altered the relative volume of the loudspeakers then some directionality would be perceived but I cannot see how the interaural level and timing differences could be matched - as if the sound were from a single source. I doubt that this invalidates their results, but to present this as if it provided matched spatial and timing cues is wrong, and I cannot work out how they can attribute an azimuthal location to the sound. For replication purposes, it would be useful to know how far apart the loudspeakers were and what the timing and level differences actually were.

      The behavioural tasks each had evenly distributed ‘source locations’ on the horizontal azimuth of the computer display (8 for the behavioural session, 5 for the EEG session). We manipulated the perceived location of auditory stimuli through interaural time delays and interaural level differences. By first measuring the forward (z) and horizontal (x) distance of each source location to each ear, the method worked by calculating what the time-course of a sound wave should be at the location of the ear given the sound wave at the source. Then, for each source location, we can calculate the time delay between speakers given the vectors of x and z, the speed of sound and the width of the head.  As the intensity of sound drops inversely with the square of the distance, we can divide the sound wave by the distance for each source location to provide the interaural level difference. Though we did not record the auditory stimulus with a manikin, our behavioural analyses show that participants were able to detect the directions of auditory stimuli from our manipulations, even to a degree that significantly exceeded the localisation accuracy for visual stimuli (for the behavioural session task). This information has been clarified in the manuscript.

      “Auditory stimuli were played through two loudspeakers placed either side of the display (80 cm apart for the behavioural session, 58 cm apart for the EEG session).” 

      “The perceived source location of auditory stimuli was manipulated via changes to interaural level and timing (Whitworth & Jeffress, 1961; Wightman & Kistler, 1992). The precise timing of when each speaker delivered an auditory stimulus was calculated from the following formula:

      where x and z are the horizontal and forward distances in metres between the ears and the source of the sound on the display, respectively, r is the head radius, and s is the speed of sound. We used a constant approximate head radius of 8 cm for all participants. r was added to x for the left speaker and subtracted for the right speaker to produce the interaural time difference. For ±15° source locations, interaural timing difference was 1.7 ms. To simulate the decrease in sound intensity as a function of distance, we calculated interaural level differences for the left and right speakers by dividing the sounds by the left and right distance vectors. Finally, we resampled the sound using linear interpolation based on the calculations of the interaural level and timing differences. This process was used to calculate the soundwaves played by the left and right speakers for each of the possible stimulus locations on the display. The maximum interaural level difference between speakers was 0.14 A for ±15° auditory locations, and 0.07 A for ±7.5°.”

      I am confused about this statement: "A quantification of the behavioural sensitivity (i.e., steepness of the curves) revealed significantly greater sensitivity for the audiovisual stimuli than for the auditory stimuli alone (Z = -3.09, p = .002)," It is not clear from the methods how they attributed sound source angle to the sounds. Conceivably they know the angle of the loudspeakers, and this would provide an outer bound on the perceived location of the sound for extreme interaural level differences (although free field interaural timing cues can create a wider sound field). 

      Our analysis of behavioural sensitivity was dependent on the set ‘source locations’ that were used to calculate the position of auditory and audiovisual stimuli.  In the behavioural task, participants judged the position of the target stimulus relative to a central stimulus. Thus, for each source location, we recorded how often participants correctly discriminated between presentations. The quoted analysis acknowledges that participants were more sensitive to audiovisual stimuli than auditory stimuli in the context of this task. A full explanation of how source location was implemented for auditory stimuli has been clarified in the manuscript. 

      It would be very nice to see some of the "channel" activity - to get a feel for the representation used by the decoder. 

      We have included responses for the five channels as a Supplemental Figure.

      Figure 6 appears to show that there is some agreement between behaviour and neural responses - for the audiovisual case alone. The positive correlation of behavioural and decoding sensitivity appears to be driven by one outlier - who could not perform the audiovisual task (and indeed presumably any of them). Furthermore, if we were to simply Bonferroni-correct for the three comparisons, this would become non-significant. It is also puzzling why the unisensory behaviour and EEG do not correlate - which seems to again suggest a poor correspondence between them. Opposite to the claim made.

      We understand the reviewer’s concern here. We would like to note, however, that each correlation used unique data sets – that is, the behavioural and neural data for each separate condition. In this case, we believe a Bonferroni correction for multiple comparisons is too conservative, as no data set was compared more than once. Neither the behavioural nor the neural data were normally distributed, and both contained outliers. Rather than reduce power through outlier rejection, we opted to test correlations using Spearman’s rho, which is resistant to outliers (1). It is also worth noting that, without outlier rejection, the audiovisual correlation (p = .003) would survive a Bonferroni correction for 3 comparisons. The nonsignificant correlation in the auditory and visual conditions might be due to the weaker responses elicited by unisensory stimuli, with the reduced signal-to-noise ratio obscuring potential correlations. Audiovisual stimuli elicited more precise responses both behaviourally and neurally, increasing the power to detect a correlation.

      (1) Wilcox, R.R. (2016), Comparing dependent robust correlations. British Journal of Mathematical & Statistical Psychology, 69(3), 215-224. https://doi.org/10.1111/bmsp.12069
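      The arithmetic behind the Bonferroni statement in the response above can be checked with a minimal sketch. It assumes the conventional significance threshold of .05, which the response does not state explicitly; the function name `bonferroni_adjust` is hypothetical and is not taken from the authors' analysis code.

```python
def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Return the Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


ALPHA = 0.05           # assumed conventional threshold (not stated in the response)
p_audiovisual = 0.003  # audiovisual correlation p-value quoted in the response
n_tests = 3            # auditory, visual, and audiovisual correlations

adjusted = bonferroni_adjust(p_audiovisual, n_tests)
survives = adjusted < ALPHA
print(f"adjusted p = {adjusted:.3f}, survives correction: {survives}")
```

      Equivalently, the unadjusted p-value can be compared against .05 / 3 ≈ .0167; either way, under the assumed threshold the quoted value remains below the corrected criterion.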

      “We also found a significant positive correlation between participants’ behavioural judgements in the EEG session and decoding sensitivity for audiovisual stimuli. This result suggests that participants who were better at identifying stimulus location also had more reliably distinct patterns of neural activity. The lack of neurobehavioural correlation in the unisensory conditions might suggest a poor correspondence between the different tasks, perhaps indicative of the differences between behavioural and neural measures explained previously. However, multisensory stimuli have consistently been found to elicit stronger neural responses than unisensory stimuli (Meredith & Stein, 1983; Puce et al., 2007; Senkowski et al., 2011; Vroomen & Stekelenburg, 2010), which has been associated with behavioural performance (Frens & Van Opstal, 1998; Wang et al., 2008). Thus, the weaker signal-to-noise ratio in unisensory conditions may prevent correlations from being detected.”

      Further changes:

      (1)   To improve clarity, we shifted the Methods section to after the Discussion. This change included updating the figure numbers to match the new order (Figure 1 becomes Figure 6, Figure 2 becomes Figure 1, and so on).

      (2)   We also resolved an error on Figure 2 (previously Figure 3). The final graph (Difference between AV and A + V) displayed incorrect values on the Y axis.

      This has now been remedied.

    2. eLife Assessment

      Despite the well-established facilitatory effects of multisensory integration on behavioural measures, standard neuroimaging approaches have yet to reliably and precisely identify the corresponding neural correlates. In this valuable paper, Buhmann et al. leverage EEG decoding methods, moving beyond traditional univariate analyses, to capture these correlates. They present solid evidence that this approach can effectively estimate multisensory integration in humans across a broad range of contexts.

    3. Reviewer #1 (Public review):

      This study presents a novel application of inverted encoding (i.e., decoding) to detect non-linear correlates of crossmodal integration in human neural activity, using EEG (electroencephalography). The method is successfully applied to data from a group of 41 participants, performing a spatial localization task on auditory, visual, and audio-visual events. The analyses clearly show a behavioural superiority for audio-visual localization. Like previous studies, the results when using traditional univariate ERP analyses were inconclusive, showing once more the need for alternative, more sophisticated approaches. The inverted encoding approach of this study, harnessing the multivariate nature of the signal, captured clear signs of super-additive responses, considered by many as the hallmark of multisensory integration. Although the removal of eye-movement artefacts from the signal eliminated the significant decoding effect, the authors' control analyses showed that decoding is more effective from parietal than from frontal electrodes, thereby ruling out ocular contamination as the sole origin of the relevant signal.

      This significant finding could enable important advances in the many fields where multisensory integration has been shown to play an important role, by providing a way to bring much needed coherence across levels of analysis, from behaviour to single-cell electrophysiology. To achieve this, it would be ideal to test whether the pattern of super-additive effects also holds in other scenarios where clear behavioural signs of multisensory integration are observed. One could also try to further support the posterior origin of the super-additive effects by source localization.

      Comments on revised version:

      All my previous concerns have been addressed. I congratulate the authors on a very nice paper.

    4. Reviewer #2 (Public review):

      Summary:

      This manuscript seeks to reconcile observations in multisensory perception - from behavior and from neural responses. It is intuitively obvious that perceiving a stimulus via two senses results in better performance than one alone. However, the nature of this interaction is complicated and relating different measures (behavioural, neural) is challenging.

      It is not uncommon to observe that for a perceptual task the percentage of correct responses seen with two senses is higher than the sum of the percentage correct obtained with each modality individually, i.e. the gains are "superadditive". The gains of adding a second sense are typically larger when the performance with the first sense is relatively poor - this effect is often called the principle of inverse effectiveness. More generally, what this tells us is that performance in a multisensory perceptual task is a non-linear sum of performance for each sensory modality alone. In invasive recordings from single neurons, a wide range of non-linear interactions is observed - some superadditive, and some sub-additive.

      Despite this abundant evidence of non-linearity in some measures of multisensory integration, evoked responses (EEG) to such sensory stimuli often show little evidence of it - and this is the problem this manuscript tackles. The key assertion made is that a univariate analysis of the EEG signal is likely to average out non-linear effects of integration. This is a reasonable assertion, and their analysis does indeed provide evidence that a multivariate approach can reveal non-linear interactions in the evoked responses.

      Strengths:

      It is of great value to understand how the process of multisensory integration occurs, and despite a wealth of observations of the benefits of perceiving the world with multiple senses, we still lack a reasonable understanding of how the brain integrates information. For example - what underlies the large individual differences in the benefits of two senses over one? One way to tackle this is via brain imaging, but this is problematic if important features of the processing - such as non-linear interactions - are obscured by the lack of specificity of the measurements. The approach they take to analysis of the EEG data allows the authors to look in more detail at the variation in activity across EEG electrodes, which averaging across electrodes cannot.

      This version of the manuscript is well written and for the most part clear and the report of non-linear summation of neural responses is convincing. A particular strength of the paper is their use of a statistical model of multisensory integration as their "null" model of neural responses, and the "inverted-encoder" which infers an internal representation of the stimulus which can explain the EEG responses. This encoder generates a prediction of decoding performance, which can be used to generate predictions of multisensory decoding from unisensory decoding, or from a sum of the unisensory internal representations.

      In behavioural performance, it is frequently observed that the performance increase from two senses is close to what is expected from the optimal integration of information across the senses, in a statistical sense. It can be plausibly explained by assuming that people are able to weight sensory inputs according to their reliability - and somewhat optimally. Critically the apparent "superadditive" effect on performance described above does not require any non-linearity in the sum of information across the senses, but can arise from correctly weighting the information according to reliability.
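      The reliability-weighting idea described above can be illustrated with a minimal sketch. It assumes the standard maximum-likelihood cue-combination model with independent Gaussian noise in each sense; this is a textbook formulation chosen here for illustration, not necessarily the exact model fitted in the manuscript. The function name `combine_cues` and the example noise values are hypothetical.

```python
import math


def combine_cues(sigma_a: float, sigma_v: float) -> tuple[float, float, float]:
    """Reliability-weighted combination of two independent Gaussian cues.

    sigma_a, sigma_v: standard deviations of the auditory and visual
    location estimates. Returns (w_a, w_v, sigma_av), where the weights
    are proportional to each cue's reliability (inverse variance).
    """
    rel_a = 1.0 / sigma_a ** 2       # auditory reliability
    rel_v = 1.0 / sigma_v ** 2       # visual reliability
    total = rel_a + rel_v
    w_a = rel_a / total
    w_v = rel_v / total
    sigma_av = math.sqrt(1.0 / total)  # combined estimate is never noisier than the best cue
    return w_a, w_v, sigma_av


# Hypothetical example: the auditory estimate is twice as noisy as the visual one.
w_a, w_v, sigma_av = combine_cues(sigma_a=2.0, sigma_v=1.0)
print(f"w_a = {w_a:.2f}, w_v = {w_v:.2f}, sigma_av = {sigma_av:.3f}")
```

      In this sketch the combined estimate is a purely linear weighted sum of the two unisensory estimates, yet its precision exceeds that of either cue alone, which is the point the reviewer makes about apparent "superadditive" gains not requiring a non-linear interaction.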

      The authors apply a similar model to predict the neural responses expected to audiovisual stimuli from the neural responses to audio and visual stimuli alone, assuming optimal statistical integration of information. The neural responses to audiovisual stimuli exceed the predictions of this model and this is the main evidence supporting their conclusion, and it is convincing.

      Weaknesses:

      The main weakness of the manuscript is that their behavioural data show no evidence of performance that exceeds the predictions of these statistical models. In fact, the models predict multisensory performance from unisensory performance pretty well. This makes it hard to interpret their results, as surely if these nonlinear neural interactions underlie the behaviour, then we should be able to see evidence of it in the behaviour. I cannot offer an easy explanation for this.

      Overall, therefore, I applaud the motivation and the sophistication of the analysis method and think it shows great promise for tackling these problems.

    1. eLife Assessment

      This valuable research contributes to our understanding of marine plankton diversity and gene expression by employing robust methodologies for sample collection and analysis. However, it lacks a comprehensive comparison with existing single-cell transcriptomics techniques in microbial ecology, and some terminology requires clarification for consistency with field standards. The downstream data analysis therefore provides only incomplete support for the claims made by the authors.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aim to elucidate the diversity and gene expression patterns of marine plankton using innovative collection and sequencing methodologies. Their work investigates the taxonomic and functional profiles of planktonic communities, providing insights into their ecological roles and responses to environmental changes.

      Strengths:

      The methodology utilized in this study, particularly the combination of single-cell sequencing and advanced bioinformatics techniques, represents a significant advancement in the field of plankton research. The application of the Smart-seq2 protocol for cDNA synthesis, followed by rigorous quality control measures, ensures high-quality data generation. This comprehensive approach not only enhances the resolution of the obtained genetic information but also allows for a more detailed exploration of the diversity and functional potential of the phytoplankton community.

      One of the major strengths of this study is the rigorous methodological approach, including precise sampling techniques and robust data analysis protocols, which enhance the reliability of the results. The use of advanced sequencing technologies allows for a comprehensive assessment of gene expression, significantly contributing to our understanding of plankton diversity and its implications for marine ecosystems.

      Weaknesses:

      While the evidence presented is solid, there are areas where the analysis could be expanded. The authors could further explore the ecological interactions within plankton communities, which would provide a more holistic view of their functional roles. Additionally, a broader discussion of the implications of their findings for marine conservation efforts could enhance the manuscript's impact.

      The choice of both the plankton net and filter pore size during the plankton collection process is critical, as these factors directly impact the types of phytoplankton collected. The use of a 25 μm filter paper, in particular, may result in the omission of many eukaryotic phytoplankton species. This limitation, combined with the characteristics of the plankton net, could affect the comprehensiveness and accuracy of the results, potentially influencing the study's conclusions regarding phytoplankton diversity.

      The timing of fixation is crucial, as it directly affects whether the measured transcriptome accurately represents the organisms' actual transcriptional state in their native water environment. If fixation occurred a significant time after sample collection, the transcriptomic data may not reflect their true in situ transcriptional activity, which greatly reduces the relevance of this method.

    3. Reviewer #2 (Public review):

      Summary:

      The paper introduces Ukiyo-e-Seq, a novel method integrating microscopy with single-cell transcriptomics to study individual, uncultured eukaryotic plankton cells. By combining microscopic imaging with transcriptomic analysis, the approach links plankton morphology to gene expression, enabling taxonomic identification and functional protein exploration. Ukiyo-e-Seq was tested on 66 microbial eukaryotic cells, revealing taxonomic diversity across four superkingdoms and allowing analysis of protein complexes and developmental genes in individual species. According to the authors, this method has the potential to advance single-cell marine biodiversity studies by addressing limitations in traditional taxonomy and metatranscriptomics, especially for rare or uncultured organisms.

      However, the study's conclusions are often weakly supported by data, particularly given that this is not the first study to combine microscopy and single-cell transcriptomics of eukaryotic plankton using Smart-seq2.

      Strengths:

      A notable strength is the authors' generation of several single-cell transcriptomes for the diatom Chaetoceros, which could benefit from greater focus rather than broadly addressing eukaryotic single cells.

      Weaknesses:

      The study lacks comparison with other single-cell transcriptomics studies and it was presented as the first study that combines imaging and single-cell transcriptomics (smart-seq2) of eukaryotic plankton while in fact it is not. The sampling methodology is not replicable as the authors used a tea strainer instead of standard plankton collection equipment to filter larger cells. Terminology throughout the paper is unconventional, such as "public and private contigs," "single-organism genomics," "highly expressed contigs," and "optical methods." Additionally, the authors did not specify which database was used for taxonomic assignments. These issues may stem from the authors' limited background in microbial ecology. Overall, the study has many drawbacks and it could benefit from complete rewriting and focusing mainly on single-cell transcriptomics of diatoms.

    4. Reviewer #3 (Public review):

      Gatt et al. present a novel take on single-cell RNA-sequencing from complex planktonic samples, introducing an approach they aptly named Ukiyo-e-Seq. This work combines environmental sampling with cell picking, microscopic imaging, and Smart-seq2 single-cell RNA sequencing to profile uncultured eukaryotic plankton. Developing single-cell approaches for such ecosystems is critical, given the poor representation of many planktonic species in cultures and reference databases. This work could help bridge existing technological gaps between morphological and molecular studies of aquatic microeukaryotes.

      The authors argue that microscopy does not provide information on the biochemistry of species under consideration. At best, it provides taxonomic labeling of species within a sample, yet imaging fails to assess their metabolic state or to disentangle cryptic species. In a standard metatranscriptomic setup, the sequence pool is described by aligning assembled contigs with reference databases to obtain functional and taxonomic information. This complex community-level data is impossible to parse at the single-organism level. Moreover, by relying on reference datasets, a lot of potential information can be missed. The aim of the approach is to combine the strengths of both methods, generating single-cell transcriptomic data linked to individual plankton images.

      Strengths:

      Ukiyo-e-Seq generated a valuable dataset by combining imaging and transcriptomics for individual planktonic organisms from environmental samples. This multimodal approach has the potential to improve taxonomic predictions and functional insights at the single-organism level. This manuscript demonstrates the technical feasibility of such an approach. Data of this type is rare and thus represents a valuable resource to further advance single-cell sequencing of planktonic species from environmental samples.

      Weaknesses:

      (1) The merge-split strategy, where single-cell reads are pooled prior to assembly, is counterintuitive. Pooling obscures the single-organism resolution that single-cell methods aim to achieve. The approach might be useful for assembling low-coverage contigs, but risks masking unique expression profiles for transcripts unique to a given well. As an alternative, the authors could assemble each well independently to obtain well-specific transcriptomic bins. Assemblies could then be clustered based on sequence similarity, thereby imposing strict clustering parameters to maintain resolution, to create a common reference for downstream analysis if needed. In my opinion, better results would be obtained by implementing a per-well assembly and read mapping.

      (2) The focus on the top five most expressed contigs throughout the manuscript's data analysis is a limiting choice, as it excludes most contigs. In the preprint, we are presented with a very narrow view of the data. Visualising the entire range of assembled contigs would provide a better picture of the transcriptomic composition and diversity per well. It would be interesting to assess if the full information could be used to preliminarily bin transcriptomic sequences from individual wells, for example, by gathering all 'private' contigs with high read coverage in a single well. Does such a set represent a single complete eukaryotic transcriptome?

      (3) I missed a verification with (broad-scale) taxonomic assessments based on the associated microscopic images. In their goals, the authors state that a joint approach has the potential to discover new taxonomic biodiversity. I agree, and to me, this is what is exciting about the preprint, yet I miss an example or the right bioinformatic implementation to drive home this claim. Are there organisms in wells where poor taxonomic annotations, based on alignment to a reference database or the LCA approach implemented in Kraken2, would usually result in ignoring the species in classic metatranscriptomics? Can you advance the taxonomic annotation by referring back to the organisms' picture? Can manual assessment of taxonomy advance the results from the LCA approach?

      (4) The current use of AlphaFold to predict protein structures does not convincingly add to the study's core objectives.

      Overall, Ukiyo-e-Seq presents a promising method for studying single-cell diversity in environmental samples, though the bioinformatic pipeline requires refinement to support some of the claims made by the authors. Additionally, the manuscript would benefit from clarity and additional details in its methods and a more consistent approach to presenting results and summary statistics across all assembled contigs and all sampled wells, rather than focusing on selected wells.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors aim to elucidate the diversity and gene expression patterns of marine plankton using innovative collection and sequencing methodologies. Their work investigates the taxonomic and functional profiles of planktonic communities, providing insights into their ecological roles and responses to environmental changes.

      Strengths:

      The methodology utilized in this study, particularly the combination of single-cell sequencing and advanced bioinformatics techniques, represents a significant advancement in the field of plankton research. The application of the Smart-seq2 protocol for cDNA synthesis, followed by rigorous quality control measures, ensures high-quality data generation. This comprehensive approach not only enhances the resolution of the obtained genetic information but also allows for a more detailed exploration of the diversity and functional potential of the phytoplankton community.

      One of the major strengths of this study is the rigorous methodological approach, including precise sampling techniques and robust data analysis protocols, which enhance the reliability of the results. The use of advanced sequencing technologies allows for a comprehensive assessment of gene expression, significantly contributing to our understanding of plankton diversity and its implications for marine ecosystems.

      Weaknesses:

      While the evidence presented is solid, there are areas where the analysis could be expanded. The authors could further explore the ecological interactions within plankton communities, which would provide a more holistic view of their functional roles. Additionally, a broader discussion of the implications of their findings for marine conservation efforts could enhance the manuscript's impact.

      The choice of both the plankton net and filter pore size during the plankton collection process is critical, as these factors directly impact the types of phytoplankton collected. The use of a 25 μm filter paper, in particular, may result in the omission of many eukaryotic phytoplankton species. This limitation, combined with the characteristics of the plankton net, could affect the comprehensiveness and accuracy of the results, potentially influencing the study's conclusions regarding phytoplankton diversity.

      The timing of fixation is crucial, as it directly affects whether the measured transcriptome accurately represents the organisms' actual transcriptional state in their native water environment. If fixation occurred a significant time after sample collection, the transcriptomic data may not reflect their true in situ transcriptional activity, which greatly reduces the relevance of this method.

      Thank you for your time, effort, and expertise.

      We agree that additional analyses could improve our understanding of the plankton communities sampled. We have conducted an array of alternative analyses that were not included in the current manuscript and plan to perform new analyses over the next few months as part of a deeper revision of the manuscript. We are especially interested in “providing a more holistic view of the functions” of individual plankton within the community.

      As for the protocol details, the pore size of the filter paper was chosen to focus on ~100 micron-sized organisms as a starting point: they are likely to contain more RNA than smaller organisms, making them well suited for an initial proof of concept of the methodology. That choice, however, is not particularly tightly constrained, so smaller plankton could also be captured. This is supported by the lack of correlation, in our data, between organismal size and number of detected sequencing reads.

      Timing to cell death/fixation is a common question we receive not just for this manuscript but for any RNA-Seq study of primary samples. In this case, plankton were seen swimming until picking, and after picking each organism was deposited within two seconds into a lysis buffer for fixation. Therefore, we do not have reason to believe that the transcriptional activity sampled in the sequencing reads differs in any major way from the one in living plankton. Nonetheless, a study specifically testing the effect of time between ocean sampling and reverse transcription would provide more quantitative information on this point.

      Reviewer #2 (Public review):

      Summary:

      The paper introduces Ukiyo-e-Seq, a novel method integrating microscopy with single-cell transcriptomics to study individual, uncultured eukaryotic plankton cells. By combining microscopic imaging with transcriptomic analysis, the approach links plankton morphology to gene expression, enabling taxonomic identification and functional protein exploration. Ukiyo-e-Seq was tested on 66 microbial eukaryotic cells, revealing taxonomic diversity across four superkingdoms and allowing analysis of protein complexes and developmental genes in individual species. According to the authors, this method has the potential to advance single-cell marine biodiversity studies by addressing limitations in traditional taxonomy and metatranscriptomics, especially for rare or uncultured organisms.

      However, the study's conclusions are often weakly supported by data, particularly given that this is not the first study to combine microscopy and single-cell transcriptomics of eukaryotic plankton using Smart-seq2.

      Strengths:

      A notable strength is the authors' generation of several single-cell transcriptomes for the diatom Chaetoceros, which could benefit from greater focus rather than broadly addressing eukaryotic single cells.

      Weaknesses:

      The study lacks comparison with other single-cell transcriptomics studies and it was presented as the first study that combines imaging and single-cell transcriptomics (smart-seq2) of eukaryotic plankton while in fact it is not. The sampling methodology is not replicable as the authors used a tea strainer instead of standard plankton collection equipment to filter larger cells. Terminology throughout the paper is unconventional, such as "public and private contigs," "single-organism genomics," "highly expressed contigs," and "optical methods." Additionally, the authors did not specify which database was used for taxonomic assignments. These issues may stem from the authors' limited background in microbial ecology. Overall, the study has many drawbacks and it could benefit from complete rewriting and focusing mainly on single-cell transcriptomics of diatoms.

      Thank you for your time, effort, and expertise.

      There might be a bit of confusion between single-cell and single-organism sequencing, likely due to lack of clarity in our initial submission. In particular, in this manuscript no effort was spent trying to dissociate oligocellular plankton into individual cells before sequencing. While probably feasible, we expect that to be technically much harder than single-organism sequencing as performed here. The reviewer does not reference a published paper where combined imaging and RNA-Seq of individual uncultured plankton has been achieved, and we were unable to find one in the scientific literature. As stated in the manuscript, others have already performed some work on cultured plankton and single-organism sequencing (without matching images) of uncultured environmental microorganisms.

      The suggestion to focus on a smaller biological niche such as diatoms and adopt language more familiar to that specific community is well received. Indeed, given that organisms as diverse as fish larvae and diatoms could be profiled with Ukiyo-e-Seq, future studies could use the same method to address specific questions with a deeper and more narrow scope. However, this manuscript is demonstrating the feasibility of Ukiyo-e-Seq and its ability to produce usable data for a broad spectrum of organisms: part of the scientific audience might not have a specific interest in diatoms.

      The tea strainer was used for coarse pre-filtering: the exact pore size, geometry and factory tolerance on those measurements are inconsequential because each organism is later chosen (or not) based on a high-resolution microscopy image (or multiple, if fluorescence is considered). This really is a strength of Ukiyo-e-Seq over FACS or droplet-based sorters, which can only collect coarse optical information from each organism for (typically) less than 1 millisecond. In Ukiyo-e-Seq, while the actual decision to pick an individual is currently manual (by the operator of the picker), it can be automated in principle. For instance, one could build a machine learning model of plankton taxonomy based on a large collection of labelled images and use predictions from such a model to automatically drive the picker (e.g. focussing on diatoms), increasing throughput. Even in that case, however, the initial filtering stages using tea strainers, plankton nets, filter paper etc. would not be critical for the final selection of individuals as long as they are not too restrictive.

      The database used for taxonomic assignment was the NCBI non-redundant nucleotide database, accessed through the reference library provided by Kraken2 (nt).

      Reviewer #3 (Public review):

      Gatt et al. present a novel take on single-cell RNA-sequencing from complex planktonic samples, introducing an approach they aptly named Ukiyo-e-Seq. This work combines environmental sampling with cell picking, microscopic imaging, and Smart-seq2 single-cell RNA sequencing to profile uncultured eukaryotic plankton. Developing single-cell approaches for such ecosystems is critical, given the poor representation of many planktonic species in cultures and reference databases. This work could help bridge existing technological gaps between morphological and molecular studies of aquatic microeukaryotes.

      The authors argue that microscopy does not provide information on the biochemistry of species under consideration. At best, it provides taxonomic labeling of species within a sample, yet imaging fails to assess their metabolic state or to disentangle cryptic species. In a standard metatranscriptomic setup, the sequence pool is described by aligning assembled contigs with reference databases to obtain functional and taxonomic information. This complex community-level data is impossible to parse at the single-organism level. Moreover, by relying on reference datasets, a lot of potential information can be missed. The aim of the approach is to combine the strengths of both methods, generating single-cell transcriptomic data linked to individual plankton images.

      Strengths:

      Ukiyo-e-Seq generated a valuable dataset by combining imaging and transcriptomics for individual planktonic organisms from environmental samples. This multimodal approach has the potential to improve taxonomic predictions and functional insights at the single-organism level. This manuscript demonstrates the technical feasibility of such an approach. Data of this type is rare and thus represents a valuable resource to further advance single-cell sequencing of planktonic species from environmental samples.

      Weaknesses:

      (1) The merge-split strategy, where single-cell reads are pooled prior to assembly, is counterintuitive. Pooling obscures the single-organism resolution that single-cell methods aim to achieve. The approach might be useful for assembling low-coverage contigs, but risks masking unique expression profiles for transcripts unique to a given well. As an alternative, the authors could assemble each well independently to obtain well-specific transcriptomic bins. Assemblies could then be clustered based on sequence similarity, thereby imposing strict clustering parameters to maintain resolution, to create a common reference for downstream analysis if needed. In my opinion, better results would be obtained by implementing a per-well assembly and read mapping.

      (2) The focus on the top five most expressed contigs throughout the manuscript's data analysis is a limiting choice, as it excludes most contigs. In the preprint, we are presented with a very narrow view of the data. Visualising the entire range of assembled contigs would provide a better picture of the transcriptomic composition and diversity per well. It would be interesting to assess if the full information could be used to preliminarily bin transcriptomic sequences from individual wells, for example, by gathering all 'private' contigs with high read coverage in a single well. Does such a set represent a single complete eukaryotic transcriptome?

      (3) I missed a verification with (broad-scale) taxonomic assessments based on the associated microscopic images. In their goals, the authors state that a joint approach has the potential to discover new taxonomic biodiversity. I agree, and to me, this is what is exciting about the preprint, yet I miss an example or the right bioinformatic implementation to drive home this claim. Are there organisms in wells where poor taxonomic annotations, based on alignment to a reference database or the LCA approach implemented in Kraken2, would usually result in ignoring the species in classic metatranscriptomics? Can you advance the taxonomic annotation by referring back to the organism's picture? Can manual assessment of taxonomy advance the results from the LCA approach?

      (4) The current use of AlphaFold to predict protein structures does not convincingly add to the study's core objectives.

      Overall, Ukiyo-e-Seq presents a promising method for studying single-cell diversity in environmental samples, though the bioinformatic pipeline requires refinement to support some of the claims made by the authors. Additionally, the manuscript would benefit from clarity and additional details in its methods and a more consistent approach to presenting results and summary statistics across all assembled contigs and all sampled wells, rather than focusing on selected wells.

      Thank you for your time and effort, and for your expertise on the matter.

      The suggestions to conduct additional bioinformatic analyses to explore more fully the criticality and potential of various design choices (e.g. meta-assembly) are well received. We have tried some of those ideas already (e.g. assembling individual wells) and we have considered but not yet conducted or polished others (e.g. a more thorough taxonomic verification). We will endeavour to carry out as many of those analyses as possible during the deeper revision process in the coming months.

      AlphaFold 3’s use was designed to demonstrate the ability to investigate protein-protein interactions from individual species. When two peptide sequences are detected within the same well, they are more likely to be potential interacting partners than in a metatranscriptomic study, because the compartmentalisation of reads into tens or hundreds of wells greatly reduces the search space of potential interaction partners (which has a baseline runtime complexity of n squared, where n is the number of peptide sequences identified).
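
      The search-space argument above can be made concrete with a small counting sketch. It is illustrative only: the well and peptide counts are hypothetical, and it assumes each peptide is observed in exactly one well, which the response does not state.

```python
from math import comb

def candidate_pairs(n_peptides: int) -> int:
    """Number of unordered peptide pairs that could be screened for interaction."""
    return comb(n_peptides, 2)

def pooled_vs_per_well(peptides_per_well: list[int]) -> tuple[int, int]:
    """Compare pair counts when all wells are pooled versus screened well by well."""
    pooled = candidate_pairs(sum(peptides_per_well))
    per_well = sum(candidate_pairs(n) for n in peptides_per_well)
    return pooled, per_well

# Hypothetical experiment: 100 wells with 50 detected peptides each.
pooled, per_well = pooled_vs_per_well([50] * 100)
print(pooled, per_well, round(pooled / per_well, 1))
```

      Under these assumptions, compartmentalising the peptides into wells cuts the number of candidate pairs by roughly the number of wells.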

      ----------

    1. eLife Assessment

      This important study explores the relationship between the sequence of prokaryotic promoter elements and their activity using mutagenesis to generate thousands of mutant sequences. The evidence supporting these findings is convincing. This work will appeal to those interested in bacterial genetics, genome evolution, and gene regulation.

    2. Reviewer #1 (Public review):

      Summary:

      This study by Fuqua et al. studies the emergence of sigma70 promoters in bacterial genomes. While there have been several studies to explore how mutations lead to promoter activity, this is the first to explore this phenomenon in a wide variety of backgrounds, which notably contain a diverse assortment of local sigma70 motifs in variable configurations. By exploring how mutations affect promoter activity in such diverse backgrounds, they are able to identify a variety of anecdotal examples of gain/loss of promoter activity and propose several mechanisms for how these mutations are interacting within the local motif landscape. Ultimately, they show how different sequences have different probabilities of gaining/losing promoter activity and may do so through a variety of mechanisms.

      Major strengths and weaknesses of the methods and results:

      This study uses Sort-Seq to characterize promoter activity, which has been adopted by multiple groups and shown to be robust. Furthermore, they use a slightly altered protocol which allows measurements of bi-directional promoter activity. This combined with their pooling strategy allows them to characterize expression of many different backgrounds in both directions in extremely high-throughput which is impressive! A second key approach this study relies on is the identification of promoter motifs using position weight matrices (PWMs). While these methods are prone to false positives, the authors implement a systematic approach which is standard in the field. However, drawing these types of binary definitions (is this a motif? yes/no) should always come with the caveat that gene expression is a quantitative trait that we oversimplify when drawing boundaries.

      Their approach to randomly mutagenize promoters allowed them to find many examples of different types of evolutions that may occur to increase or decrease promoter activity. They have supported these with validations in more controlled backgrounds which convincingly support their proposed mechanisms for promoter evolution.

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The authors express a key finding that the specific landscape of promoter motifs in a sequence affects the likelihood that local mutations create or destroy regulatory elements. The authors have described many examples, including several that are non-obvious, and show convincingly that different sequence backgrounds have different probabilities for gaining or losing promoter activity. This overarching conclusion is supported by trend and mechanistic data which show differences in probabilities of evolving promoters, as well as the mechanisms underlying these evolutions. Furthermore, these mutations are well described and presented, showing the strength of emergent promoter motifs and their specific spacings from existing motifs within the sequence.

      Impact of the work on the field, and the utility of the methods and data to the community:

      This study enhances our understanding of the diverse mechanisms by which promoters can evolve or devolve, potentially improving models that predict mutational outcomes. While this study reveals complex mutational patterns, modeling them could significantly advance our ability to predict bacterial evolutionary trajectories and interpret genomes, bringing us closer to that goal.

      Recent work in the field of bacterial gene regulation has raised interest in bidirectional promoter regions. While the authors do not discuss how mutations that raise expression in one direction may affect another, they have created an expansive dataset which may enable other groups to study this interesting phenomenon. Also, their variation of the Sort-Seq protocol will be a valuable example for other groups who may be interested in studying bidirectional expression. Lastly, this study may be of interest to groups studying eukaryotic regulation as it can inform how the evolution of transcription factor binding sites influences short-range interactions with local regulator elements.

      Any additional context to understand the significance of the work:

      Predicting whether a sequence drives promoter activity is a challenging task. By learning the types of mutations that create or destroy promoters, this study provides valuable insights for computational models aimed at predicting promoter activity.

      Comments on revised version:

      I am satisfied with the extensive changes made by the author. This manuscript is excellent.

      I very much like the change in figures to incorporate the sequence information. It is great to see clear representations of the emergent sigma70 motifs and their spacing relative to existing motifs. This addition significantly improves the clarity of the findings.

      The validation of mutations on a clean background is well-executed, and the results are convincing. I appreciate the effort put into validating their results. The additional analyses that include TGn and UP-element motifs are also well done and highly relevant, as these elements are known to compensate for weaker or absent -35 sequences.

      Most or all perceived inconsistencies from the previous version have been resolved. While I don't think the fluorescence threshold of 1.5 a.u. for promoter activity is justified, the authors do acknowledge this shortcoming, and even empirically-derived thresholds are still technically arbitrary.

      I particularly enjoyed Figure 1E, thank you for entertaining my analysis request! Also, the H-NS story is a nice addition showing how transcription factors influence this evolution.

      Overall, this revised manuscript is an excellent contribution to the field, and I have no further recommendations for improvement.

    3. Reviewer #2 (Public review):

      Summary:

      Fuqua et al investigated the relationship between prokaryotic box motifs and the activation of promoter activity using a mutagenesis sequencing approach. From generating thousands of mutant daughter sequences from both active and non-active promoter sequences they were able to produce a fantastic dataset to investigate potential mechanisms for promoter activation. From these large numbers of mutated sequences, they were able to generate mutual information with gene expression to identify key mutations relating to the activation of promoter island sequences.

      Strengths:

      The data generated from this paper is an important resource to address this question of promoter activation. Being able to link promoter-modulated gene expression to mutational changes in previously nonactive promoter regions is exciting. This approach allows future large-scale studies to investigate evolutionary processes relating to changes in gene regulation in a statistically robust manner. Here there is a focus on the -10 and -35 boxes, but other elements and interactions were explored, including H-NS binding, the UP-element, and TGn. Alongside this, the method of identifying key mutations using mutual information in this paper is well done and should be a standard in future studies for identifying regions of interest.

      Weaknesses:

      While the generation of the data is superb, as the authors have stated clearly themselves, there is a lot of scope for future studies to understand both causal relationships and utilise the data more effectively. The authors look at changes in regulatory expression based on a few observations that are treated independently but occur concurrently. While this study has backed up findings experimentally, this may not always be possible. Previously, this reviewer had suggested that complementary approaches would be more robust: for example, an analysis that identifies important motifs using something like a GLM lasso regression and then combines the significant motifs with mutational hotspot information. The authors tried to implement such an approach in response to the review, but its complexity put it beyond the scope of this study. I look forward to the development of such methods that allow more complete exploration of similar datasets.

      Comments on revised version:

      The authors addressed all my previous comments. I believe the study is much improved and thank them for the time and effort they put into addressing the comments.

    4. Reviewer #3 (Public review):

      This work brings a computational approach to the study of promoters and transcription. The paper is improved but there are still factual errors and implausible explanations. I am not convinced by the response from the authors, concerning the promoter -35 element, in their rebuttal.

      Comments on author rebuttal:

      - We respectfully but strongly disagree that our analysis has misrepresented the true nature of -35 boxes. First, accounting for more A's at position 5 in the PWM is not going to lead to a "critical error." This is because positions 4-6 of the motif barely have any information content (bits) compared to positions 1-3 (see Fig 1A).

      The analysis does misrepresent the consensus -35 element, which is, unequivocally, TTGACA. I agree that positions 4-6 of the element are less well-conserved.

      - This assertion is not just based on our own PWM, but based on ample precedent in the literature. In PMID 14529615, TTG is present in 38% of all -35 boxes, but ACA only in 8%.

      This does not mean that TTGACA is not the consensus, or that "ACA" is not important at promoters where it's present.

      - In PMID 29388765, with the -10 instance TATAAT, the -35 instance TTGCAA yields stronger promoters compared to the -35 instance TTGACA (See their Figure 3B).

      This is a known phenomenon and results from "perfect" promoters being limited at the point of RNA polymerase promoter escape (because the RNAP struggles to "let go" of perfect promoters). This does not mean the TTGACA is not the consensus. Indeed, and this is a key point, it is evident in the figure the authors refer to that TTGACA stimulates more transcription than alternative -35 sequences when -10 elements are not perfect.

      - In PMID 29745856 (Figure 2), the most information content lies in positions 1-3, with the A and C at position 5 both nearly equally represented, as in our PWM.

      The motif shown in this paper suffers from exactly the same issue as the paper under review; the variable spacing between the -35 hexamer and -10 element isn't taken into account by MEME.

      - In PMID 33958766 (Figure 1) an experimentally-derived -35 box is even reduced to a "partial" -35 box which only includes positions 1 and 2, with consensus: TTnnnn.

      This paper does not show an "experimentally-derived -35 box" in Figure 1 (or anywhere else, as far as I can see).

      - In addition, we did not derive the PWMs as the reviewer describes. The PWMs we use are based on computational predictions that are in excellent agreement with experimental results. Specifically, the PWMs we use are from PMID 29728462, which acquired 145 -10 and -35 box sequences from the top 3.3% of computationally predicted boxes from Regulon DB.

      The paper mentioned states "for the genomic RNAP logo, sequences were taken from computationally predicted RNAP binding sites on RegulonDB", so these are not experimentally defined promoters? It's not obvious from the paper, or RegulonDB, which sequences these are or how they were predicted.

      - Thank you for pointing out that our original submission was incomplete in this regard. We address these concerns by new analyses, including some new experiments. First, Rho dependent termination is associated with the RUT motif, which is very rich in Cytosines (PMID: 30845912). Given that our sequences confer between 65%-78% of AT-content, canonical rho dependent termination is unlikely. However, we computationally searched for rho-dependent terminators using the available code from PMID: 30845912, but the algorithm did not identify any putative RUTs. Because this analysis was not informative, we did not include it in the paper.

      I don't believe it is the case that Rho absolutely requires a RUT sequence. My understanding is that, if an RNA is not translated, Rho will intervene (e.g. see PMID: 18487194).

      - We respectfully disagree that the reviewer's point is pertinent because what the reviewer is referring to is the likelihood that the sequence is a promoter, which indeed increases with AT content, but we are focused on the likelihood that a sequence becomes a promoter through DNA mutation.

      I disagree that this distinction is relevant. An AT-rich sequence will much more closely resemble a promoter by chance than a GC rich sequence. As an extreme example, the sequence TTTTTT can be converted into a reasonable -10 element by one change (to TATTTT) but the sequence GGGGGG can't.
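
      The reviewer's example can be checked with a simple mismatch count against the -10 consensus TATAAT. This is only a rough sketch: counting mismatches is a crude stand-in for the PWM scores used in the manuscript, and the helper names are made up for illustration.

```python
MINUS_10_CONSENSUS = "TATAAT"  # canonical sigma70 -10 hexamer

def mismatches(hexamer: str, consensus: str = MINUS_10_CONSENSUS) -> int:
    """Count positions at which a hexamer differs from the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a != b for a, b in zip(hexamer, consensus))

def best_after_one_substitution(hexamer: str) -> int:
    """Fewest mismatches reachable with at most one point mutation."""
    best = mismatches(hexamer)
    for i, base in enumerate(hexamer):
        for alt in "ACGT":
            if alt != base:
                mutant = hexamer[:i] + alt + hexamer[i + 1:]
                best = min(best, mismatches(mutant))
    return best

for seq in ("TTTTTT", "GGGGGG"):
    print(seq, mismatches(seq), best_after_one_substitution(seq))
```

      By this measure the AT-rich hexamer starts three mismatches from the consensus and reaches two after a single substitution (for example TATTTT), whereas the GC-rich hexamer starts at six and is still five away after one change.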

    5. Author response:

      The following is the authors’ response to the original reviews.

      We performed multiple new experiments and analyses in response to the reviewers’ concerns, and incorporated the results of these analyses in the main text, and in multiple substantially revised or new figures. Before embarking on a point-by-point reply to the reviewers’ concerns, we here briefly summarize our most important revisions.

      First, we addressed a concern shared by Reviewers #1-3 about a lack of information about our DNA sequences. To this end, we redesigned multiple figures (Figures 3, 4, 5, S8, S9, S10, S11, and S12) to include the DNA sequences of each tested promoter, the specific mutations that occurred in it, the resulting changes in position-weight-matrix (PWM) scores, and the spacing between promoter motifs. Second, Reviewers #1 and #2 raised concerns about a lack of validation of our computational predictions and the resulting incompleteness of the manuscript. To address this issue, we engineered 27 reporter constructs harboring specific mutations, and experimentally validated our computational predictions with them. Third, we expanded our analysis to study how a more complete repertoire of other sigma 70 promoter motifs such as the UP-element and the extended -10 / TGn motif affects gene expression driven by the promoters we study. Fourth, we addressed concerns by Reviewer #3 about the role of the Histone-like nucleoid-structuring protein (H-NS) in promoter emergence and evolution. We did this by performing both experiments and computational analyses, which are now shown in the newly added Figure 5. Fifth, to satisfy Reviewer #3’s concerns about missing details in the Discussion, we have rewritten this section, adding additional details and references. 

      We next describe these and many other changes in a point-by-point reply to each reviewer’s comments. In addition, we append a detailed list of changes to each section and figure to the end of this document.

      Reviewer #1 (Public Review):

      Summary:

      This study by Fuqua et al. studies the emergence of sigma70 promoters in bacterial genomes. While there have been several studies to explore how mutations lead to promoter activity, this is the first to explore this phenomenon in a wide variety of backgrounds, which notably contain a diverse assortment of local sigma70 motifs in variable configurations. By exploring how mutations affect promoter activity in such diverse backgrounds, they are able to identify a variety of anecdotal examples of gain/loss of promoter activity and propose several mechanisms for how these mutations interact within the local motif landscape. Ultimately, they show how different sequences have different probabilities of gaining/losing promoter activity and may do so through a variety of mechanisms.

      We thank Reviewer #1 for taking the time to read and provide critical feedback on our manuscript. Their summary is fundamentally correct.

      Major strengths and weaknesses of the methods and results:

      This study uses Sort-Seq to characterize promoter activity, which has been adopted by multiple groups and shown to be robust. Furthermore, they use a slightly altered protocol that allows measurements of bi-directional promoter activity. This combined with their pooling strategy allows them to characterize expressions of many different backgrounds in both directions in extremely high throughput which is impressive! A second key approach this study relies on is the identification of promoter motifs using position weight matrices (PWMs). While these methods are prone to false positives, the authors implement a systematic approach which is standard in the field. However, drawing these types of binary definitions (is this a motif? yes/no) should always come with the caveat that gene expression is a quantitative trait that we oversimplify when drawing boundaries.

      The point is well-taken. To clarify this and other issues, we have added a section on the limitations of our work to the Discussion. Within this section we include the following sentences (lines 675-680):

      “Additionally, future studies will be necessary to address the limitations of our own work. First, we use binary thresholding to determine i) the presence or absence of a motif, ii) whether a sequence has promoter activity or not, and iii) whether a part of a sequence is a hotspot or not. While chosen systematically, the thresholds we use for these decisions may cause us to miss subtle but important aspects of promoter evolution and emergence.”

      Their approach to randomly mutagenizing promoters allowed them to find many anecdotal examples of different types of evolutions that may occur to increase or decrease promoter activity. However, the lack of validation of these phenomena in more controlled backgrounds may require us to further scrutinize their results. That is, their explanations for why certain mutations lead or obviate promoter activity may be due to interactions with other elements in the 'messy' backgrounds, rather than what is proposed.

      Thank you for raising this important point. To address it, we have conducted extensive new validation experiments for the newest version of this manuscript. For the “anecdotal” examples you described, we created 27 reporter constructs harboring the precise mutation that leads to the loss or gain of gene expression, and validated its ability to drive gene expression. The results from these experiments are in Figures 3, 4, 5, and Supplemental Figures S8-S11, and are labeled with a ′ (prime) symbol.

      These experiments not only confirm the increases and decreases in fluorescence that our analysis had predicted. They also demonstrate, with the exception of two (out of 27) false-positive discoveries, that background mutations do not confound our analysis. We mention these two exceptions (lines 364-367):

      “In two of these hotspots, our validation experiments revealed no substantial difference in gene expression as a result of the hotspot mutation (Fig S8F′ and Fig S8J′). In both of these false positives, new -10 boxes emerge in locations without an upstream -35 box.”

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The authors express a key finding that the specific landscape of promoter motifs in a sequence affects the likelihood that local mutations create or destroy regulatory elements. The authors have described many examples, including several that are non-obvious, and show convincingly that different sequence backgrounds have different probabilities for gaining or losing promoter activity. While this overarching conclusion is supported by the manuscript, the proposed mechanisms for explaining changes in promoter activity are not sufficiently validated to be taken for absolute truth. There is not sufficient description of the strength of emergent promoter motifs or their specific spacings from existing motifs within the sequence. Furthermore, they do not define a systematic process by which mutations are assigned to different categories (e.g. box shifting, tandem motifs, etc.) which may imply that the specific examples are assigned based on which is most convenient for the narrative.

      To summarize, Reviewer #1 criticizes the following three aspects of our work in this comment. 1) The mechanisms we proposed are not sufficiently validated. 2) Descriptions of motifs, spacing, and PWM scores are missing. 3) How mutations are classified into different categories (i.e. box-shifting, tandem motifs, etc.) is not systematically defined.

      These are all valid criticisms. In response, we performed an extensive set of follow-up experiments and analyses, and redesigned the majority of the figures. Here is a more detailed response to each criticism:

      (1) Proposed mechanisms for explaining changes in promoter activity are not sufficiently validated. We engineered 27 reporter constructs harboring the specific mutations in the parents that we had predicted to change promoter activity. For each, we compared their fluorescence levels with their wild-type counterpart. The results from these experiments are in Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, S11, and S12, and are labeled with a ′ (prime) symbol.

      (2) No sufficient description of the strength of emergent promoter motifs or their specific spacings. We redesigned the figures to include the DNA sequences of the parent sequences, as well as the degenerate consensus sequences for each mutation. We additionally now highlight the specific motif sequences, their respective PWM scores, and by how much the score changes upon mutation. Finally, we annotated the spacing of motifs. These changes are in Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, S11, and S12.

      We note that in many cases, high-scoring PWM hits for the same motif can overlap (i.e. two -10 motifs or two -35 motifs overlap). Additionally, the proximity of a -35 and -10 box does not guarantee that the two boxes are interacting. Together, these two facts can result in an ambiguity of the spacer size between two boxes. To avoid any reporting bias, we thus often report spacer sizes as a range (see Figure panels 4F, S8D, S8F-L, S9A, S9H, S10A, and S10E). The smallest spacer we annotate is in Figure 4F with 10 bp, and the largest is in Figure S8D with 26 bp. More “extreme” distances are not annotated, and it is left to the reader to decide whether an interaction is present or not.

      (3) No systematic process by which mutations are assigned to different categories such as box shifting, tandem motifs, etc. We opted to reformulate these categories completely, because the phenotypic effect of a previously mentioned “tandem motif” was actually a byproduct of H-NS repression (see the newly added Figure S12).

      We also agree that the categories were ambiguous. We now introduce two terms: homo-gain and hetero-gain of -10 and -35 boxes. The manuscript now clearly defines these terms, and the relevant passage now reads as follows (lines 430-435): 

      “We found that these mutations frequently create new boxes overlapping those we had identified as part of a promoter (Fig S9). This occurs when mutations create a -10 box overlapping a -10 box, a -35 box overlapping a -35 box, a -10 box overlapping a -35 box, or a -35 box overlapping a -10 box. We call the resulting event a “homo-gain” when the new box is of the same type as the one it overlaps, and otherwise a “hetero-gain”. In either case, the creation of the new box does not always destroy the original box.”
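
      The homo-gain / hetero-gain definition quoted above can be expressed as a small classification rule. The following is a minimal sketch, not the authors' code: the `Box` type, the half-open coordinate convention, and the example positions are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    kind: str   # "-10" or "-35"
    start: int  # 0-based, inclusive
    end: int    # exclusive

def overlaps(a: Box, b: Box) -> bool:
    """True if the two half-open intervals share at least one position."""
    return a.start < b.end and b.start < a.end

def classify_gain(new_box: Box, existing: list[Box]) -> set[str]:
    """Label a newly created box by the type of every existing box it overlaps."""
    labels = set()
    for old in existing:
        if overlaps(new_box, old):
            labels.add("homo-gain" if old.kind == new_box.kind else "hetero-gain")
    return labels

# Hypothetical promoter with one -35 box and one -10 box.
promoter = [Box("-35", 17, 23), Box("-10", 40, 46)]
print(classify_gain(Box("-10", 42, 48), promoter))  # overlaps the existing -10 box
print(classify_gain(Box("-35", 38, 44), promoter))  # overlaps the existing -10 box
print(classify_gain(Box("-10", 60, 66), promoter))  # overlaps nothing
```

      Returning a set lets a new box that overlaps boxes of both types carry both labels, and an empty set marks a gain that overlaps no existing box.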

      Impact of the work on the field, and the utility of the methods and data to the community:

      From this study, we are more aware of different types of ways promoters can evolve and devolve, but do not have a better ability to predict when mutations will lead to these effects. Recent work in the field of bacterial gene regulation has raised interest in bidirectional promoter regions. While the authors do not discuss how mutations that raise expression in one direction may affect another, they have created an expansive dataset that may enable other groups to study this interesting phenomenon. Also, their variation of the Sort-Seq protocol will be a valuable example for other groups who may be interested in studying bidirectional expression. Lastly, this study may be of interest to groups studying eukaryotic regulation as it can inform how the evolution of transcription factor binding sites influences short-range interactions with local regulator elements.

      Any additional context to understand the significance of the work:

      The task of computationally predicting whether a sequence drives promoter activity is difficult. By learning what types of mutations create or destroy promoters from this study, we are better equipped for this task.

      We thank Reviewer #1 again for their time and their thoughtful comments.

      Reviewer #2 (Public Review):

      Summary:

      Fuqua et al investigated the relationship between prokaryotic box motifs and the activation of promoter activity using a mutagenesis sequencing approach. From generating thousands of mutant daughter sequences from both active and non-active promoter sequences they were able to produce a fantastic dataset to investigate potential mechanisms for promoter activation. From these large numbers of mutated sequences, they were able to generate mutual information with gene expression to identify key mutations relating to the activation of promoter island sequences.

      We thank Reviewer #2 for reading and providing a thorough review of our manuscript. 

      Strengths:

      The data generated from this paper is an important resource to address this question of promoter activation. Being able to link the activation of gene expression to mutational changes in previously nonactive promoter regions is exciting and allows the potential to investigate evolutionary processes relating to gene regulation in a statistically robust manner. Alongside this, the method of identifying key mutations using mutual information in this paper is well done and should be standard in future studies for identifying regions of interest.

      Thank you for your kind words.

      Weaknesses:

      While the generation of the data is superb, the focus only on these mutational hotspots removes a lot of the information available to the authors to generate robust conclusions. For instance:

      (1) The linear regression in S5 used to demonstrate that the number of mutational hotspots correlates with the likelihood of a mutation causing promoter activation is driven by three extreme points.

      A fair criticism. In response, we have chosen to remove the analysis of this trend from the manuscript entirely. (Additionally, Pnew and mutual information calculations both relied on the fluorescence scores of daughter sequences, so the finding was circular in its logic.)

      (2) Many of the arguments also rely on the number of mutational hotspots being located near box motifs. The context-dependent likelihood of this occurring is not taken into account given that these sequences are inherently box motif rich. Something like an enrichment test, to identify how likely these hotspots are to form in or next to motifs, would help.

      Another good point. To address it, we carried out a computational analysis where we randomly scrambled the nucleotides of each parent sequence while maintaining the coordinates for each mutual information “hotspot.” This scrambling results in significantly less overlap between hotspots and boxes. This analysis is now depicted in Figure 2C and described in lines 272-296.
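
      The logic of such a scrambling control can be sketched as a permutation test. This is an illustrative sketch only, not the authors' pipeline: the mismatch-based `find_boxes` matcher stands in for their PWM scan, and the toy sequence and hotspot coordinates are hypothetical.

```python
import random

def find_boxes(seq, consensus="TATAAT", max_mismatches=1):
    """Return (start, end) windows within max_mismatches of the consensus (stand-in for a PWM scan)."""
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        mm = sum(a != b for a, b in zip(seq[i:i + k], consensus))
        if mm <= max_mismatches:
            hits.append((i, i + k))
    return hits

def hotspots_in_boxes(seq, hotspots):
    """Count hotspot coordinates that fall inside at least one box."""
    boxes = find_boxes(seq)
    return sum(any(s <= h < e for s, e in boxes) for h in hotspots)

def scramble_test(seq, hotspots, n_shuffles=1000, seed=0):
    """Shuffle the nucleotides, keep hotspot coordinates fixed, and compare overlaps."""
    rng = random.Random(seed)
    observed = hotspots_in_boxes(seq, hotspots)
    letters = list(seq)
    at_least_as_many = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if hotspots_in_boxes("".join(letters), hotspots) >= observed:
            at_least_as_many += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    p_value = (at_least_as_many + 1) / (n_shuffles + 1)
    return observed, p_value

# Toy parent: a single -10-like box at positions 20-25, with two hotspots inside it.
parent = "GCGC" * 5 + "TATAAT" + "GCGC" * 5
observed, p_value = scramble_test(parent, hotspots=[21, 23])
print(observed, p_value)
```

      Shuffling preserves the nucleotide composition of the parent, so the null distribution reflects how often hotspots would land in boxes by chance in an equally AT-rich sequence.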

      (3) The link between changes in expression and mutations in surrounding motifs is assessed with two-sided Mann-Whitney U tests. This method assumes that the sequence motifs are independent of one another, but the hotspots of interest occur in groups of 0, 3, 4, or 5 per sequence. There is therefore no sequence where these hotspots can be independent, and the correlation-causation argument for the effect of motif change on expression is weakened.

      This is a fair criticism and a limitation of the MWU test. To better support our reasoning, we engineered 27 reporter constructs harboring the specific mutations in the parents that we had predicted to change promoter activity. For each, we compared their fluorescence levels with their wild-type counterpart. The results from these experiments are in Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, S11, and S12 and are labeled with a ′ (prime) symbol.

      These experiments confirm the increases and decreases in fluorescence that our analysis had predicted. They also demonstrate, with the exception of two (out of 27) false-positive discoveries, that background mutations do not confound our analysis. We mention these two exceptions (lines 364-367):

      “In two of these hotspots, our validation experiments revealed no substantial difference in gene expression as a result of the hotspot mutation (Fig S8F′ and Fig S8J′). In both of these false positives, new -10 boxes emerge in locations without an upstream -35 box.”

      (4) The distance between -10 and -35 was mentioned briefly but not taken into account in the analysis.

      We have now included these spacer distances where appropriate. These changes are in Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, S11, and S12.

      We note that in many cases, high-scoring PWM hits for the same motif can overlap (i.e. two -10 motifs or two -35 motifs overlap). Additionally, the proximity of a -35 and -10 box does not guarantee that the two boxes are interacting. Together, these two facts can make the spacer size between two boxes ambiguous. To avoid any reporting bias, we thus often report spacer sizes as a range (see Figure panels 4F, S8D, S8F-L, S9A, S9H, S10A, and S10E). The smallest spacer we annotate is in Figure 4F with 10 bp, and the largest is in Figure S8D with 26 bp. More “extreme” distances are not annotated, and we leave it to the reader to decide whether an interaction is present.
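
      To make the ambiguity concrete, here is a minimal sketch of how a spacer range arises when several overlapping hits are candidates for the same box. It assumes 0-based start coordinates and 6 bp hexamers; the function names are hypothetical and not taken from our pipeline.

```python
def spacer_length(minus35_start, minus10_start, box_len=6):
    """Number of bases between the end of a -35 box and the start of a -10 box."""
    return minus10_start - (minus35_start + box_len)


def spacer_range(minus35_starts, minus10_starts, box_len=6):
    """Smallest and largest spacer over all non-overlapping -35/-10 pairings.

    Returns None when no -10 hit lies downstream of a -35 hit.
    """
    spacers = [
        spacer_length(a, b, box_len)
        for a in minus35_starts
        for b in minus10_starts
        if b >= a + box_len
    ]
    if not spacers:
        return None
    return min(spacers), max(spacers)
```

      With two overlapping -35 hits starting at positions 0 and 2 and a single -10 hit at position 23, the spacer is reported as the range 15-17 bp rather than as one value.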

      The authors propose mechanisms of promoter activation based on a few observations that are treated independently but occur concurrently. To address this using complementary approaches such as analysis focusing on identifying important motifs, using something like a glm lasso regression to identify significant motifs, and then combining with mutational hotspot information would be more robust.

      This is a great idea, and we pursued it as part of the revision. For each parent sequence, we mapped the locations of all -10 and -35 box motifs in the daughters, then reduced each sequence to a binary representation, either encoding or not encoding these motifs, also referred to as a “hot-encoded matrix.” We subsequently performed a Lasso regression between the hot-encoded matrices and the fluorescence scores of each daughter sequence. The regression assigns a “weight” to each of the motifs in the daughters. The larger a motif’s weight is, the more the motif influences promoter activity. Author response image 1 describes our workflow.

      Author response image 1.
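
      For readers unfamiliar with the method, the following is a minimal, dependency-free sketch of Lasso regression by coordinate descent. It is not the implementation we used. It assumes a binary matrix `X` (one row per daughter, one column per motif), a list `y` of fluorescence scores, and a penalty `lam` corresponding to λ; all names and the toy data in the usage note are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Minimize (1/2n) * sum((y - b - Xw)^2) + lam * sum(|w|).

    Returns the motif weights w and the unpenalized intercept b.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                continue  # motif absent from every daughter
            rho = 0.0
            for i in range(n):
                # Prediction with motif j left out (partial residual).
                pred = b + sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred)
            w[j] = soft_threshold(rho / n, lam) / z
        # Refit the intercept given the current weights.
        b = sum(
            y[i] - sum(X[i][k] * w[k] for k in range(p)) for i in range(n)
        ) / n
    return w, b
```

      On toy data in which only the first motif affects fluorescence, for example `X = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]` and `y = [3, 1, 3, 1, 3, 1]` with `lam = 0.1`, the second weight is driven to exactly zero while the first is shrunk below its unpenalized value of 2. Larger values of `lam` shrink more weights to zero, which is why the choice of λ matters for the robustness of the result.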

      We really wanted this analysis to work, but unfortunately, the computational model is not robust, even when we test multiple values of the hyperparameter lambda (λ), which controls the trade-off between model bias and variance.

      The regression assigns strong weights almost exclusively to -10 boxes, and assigns weak to even negative weights to -35 boxes. While initially exciting, these weights do not consistently align with the results from the 27 constructs with individual mutations that we tested experimentally. This ultimately suggests that the regression is overfitting the data.

      We do think a LASSO-regression approach can be applied to explore how individual motifs contribute to promoter activity. However, effectively implementing such a method would require a substantially more complex analysis. We respectfully believe that such an approach would distract from the current narrative, and would be more appropriate for a computational journal in a future study. 

      Because this analysis was inconclusive, we have not made it part of the revised manuscript. However, we hope that our 27 experimentally validated new constructs with individual mutations are sufficient to address the reviewer’s concerns regarding independent verification of our computational predictions.

      Other elements known to be involved in promoter activation including TGn or UP elements were not investigated or discussed.

      Thank you for highlighting this potentially important oversight. In response, we have performed two independent analyses to explore the role of TGn in promoter emergence in evolution. First, we computationally searched for -10 boxes with the bases TGn immediately upstream of them in the parent sequences, and found 18 of these “extended -10 boxes” in the parents (lines 143-145):

      “On average, each parent sequence contains ~5.32 -10 boxes and ~7.04 -35 boxes (Fig S1). 18 of these -10 boxes also include the TGn motif upstream of the hexamer.”

      However, only 20% of the parents carrying these boxes have promoter activity (lines 182-185):

      “We also note that 30% (15/50) of parents have the TGn motif upstream of a -10 box, but only 20% (3/15) of these parents have promoter activity (underlined with promoter activity: P4-RFP, P6-RFP, P8-RFP, P9-RFP, P10-RFP, P11-GFP, P12-GFP, P17-GFP, P18-GFP, P18-RFP, P19-RFP, P22-RFP, P24-GFP, P25-GFP, P25-RFP).”

      Second, we computationally searched through all of the daughter sequences to identify new -10 boxes with TGn immediately upstream. We found 114 -10 boxes with the bases TGn upstream. However, only 5 new -10 boxes (2 with TGn) were associated with increasing fluorescence (lines 338-345):

      “On average, 39.5 and 39.4 new -10 and -35 boxes emerged at unique positions within the daughter sequences of each mutagenized parent (Fig 3A,B), with 1’562 and 1’576 new locations for -10 boxes and -35 boxes, respectively. ~22% (684/3’138) of these new boxes are spaced 15-20 bp away from their cognate box, and ~7.3% (114/1’562) of the new -10 boxes have the TGn motif upstream of them. However, only a mere five of the new -10 boxes and four of the new -35 boxes are significantly associated with increasing fluorescence by more than +0.5 a.u. (Fig 3C,D).”

      In addition, we now study the role of UP elements. This analysis showed that the UP element plays a negligible role in promoter emergence within our dataset.  It is discussed in a new subsection of the results (lines 591-608).

      Collectively, these additional analyses suggest that the presence of TGn plus a -10 box is insufficient to create promoter activity, and that the UP element does not play a significant role in promoter emergence or evolution.
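
      The TGn search described above amounts to checking the three bases immediately upstream of each -10 hexamer. A minimal sketch follows, assuming that the 0-based start positions of the -10 boxes have already been obtained from the PWM scan; the function names are hypothetical.

```python
def has_tgn(seq, box_start):
    """True if the three bases upstream of the hexamer read T, G, then any base."""
    return box_start >= 3 and seq[box_start - 3:box_start - 1] == "TG"


def extended_minus10_boxes(seq, box_starts):
    """Keep only the -10 boxes that carry the TGn motif directly upstream."""
    return [start for start in box_starts if has_tgn(seq, start)]
```

      For the sequence `AATGCTATAATGG`, the hexamer TATAAT starts at position 5 and is preceded by TGC, so it counts as an extended -10 box. A hexamer at the very start of a sequence cannot be scored and is treated as lacking TGn.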

      Reviewer #3 (Public Review):

      Summary:

      Like many papers in the last 5-10 years, this work brings a computational approach to the study of promoters and transcription, but unfortunately disregards or misrepresents much of the existing literature and makes unwarranted claims of novelty. My main concerns with the current paper are outlined below although the problems are deeply embedded.

      We thank Reviewer #3 for taking the time to review this manuscript. We have made extensive changes to address their concerns about our work.

      Strengths:

      The data could be useful if interpreted properly, taking into account i) the role of translation ii) other promoter elements, and iii) the relevant literature.

      Weaknesses:

      (1) Incorrect assumptions and oversimplification of promoters.

      - There is a critical error on line 68 and Figure 1A. It is well established that the -35 element consensus is TTGACA but the authors state TTGAAA, which is also the sequence represented by the sequence logo shown and so presumably the PWM used. It is essential that the authors use the correct -35 motif/PWM/consensus. Likely, the authors have made this mistake because they have looked at DNA sequence logos generated from promoter alignments anchored by either the position of the -10 element or transcription start site (TSS), most likely the latter. The distance between the TSS and -10 varies. Fewer than half of E. coli promoters have the optimal 7 bp separation with distances of 8, 6, and 5 bp not being uncommon (PMID: 35241653). Furthermore, the distance between the -10 and -35 elements is also variable (16,17, and 18 bp spacings are all frequently found, PMID: 6310517). This means that alignments, used to generate sequence logos, have misaligned -35 hexamers. Consequently, the true consensus is not represented. If the alignment discrepancies are corrected, the true consensus emerges. This problem seems to permeate the whole study since this obviously incorrect consensus/motif has been used throughout to identify sequences that resemble -35 hexamers.

      We respectfully but strongly disagree that our analysis has misrepresented the true nature of -35 boxes. First, accounting for more A’s at position 5 in the PWM is not going to lead to a “critical error.” This is because positions 4-6 of the motif barely have any information content (bits) compared to positions 1-3 (see Fig 1A). This assertion is not just based on our own PWM, but based on ample precedent in the literature. In PMID 14529615, TTG is present in 38% of all -35 boxes, but ACA only in 8%. In PMID 29388765, with the -10 instance TATAAT, the -35 instance TTGCAA yields stronger promoters compared to the -35 instance TTGACA (See their Figure 3B).

      In PMID 29745856 (Figure 2), the most information content lies in positions 1-3, with the A and C at position 5 both nearly equally represented, as in our PWM. In PMID 33958766 (Figure 1) an experimentally-derived -35 box is even reduced to a “partial” -35 box which only includes positions 1 and 2, with consensus: TTnnnn.
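
      The argument about information content can be made explicit. For a DNA motif, the information at one position is 2 bits minus the Shannon entropy of the base frequencies at that position. The sketch below uses idealized frequencies chosen purely for illustration; they are not the values of our PWM.

```python
import math


def information_content(freqs):
    """Bits of information at one motif position.

    freqs maps each base to its frequency; the frequencies should sum to 1.
    """
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return 2.0 - entropy


# Idealized, hypothetical positions (not taken from any real PWM):
conserved = {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0}   # always T
a_or_c = {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0}      # A and C equally likely
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # no preference
```

      A fully conserved position carries 2 bits, a position split evenly between A and C carries 1 bit, and a position with no base preference carries 0 bits. A position where A and C are nearly equally represented therefore contributes far less to a PWM score than a strongly conserved one.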

      In addition, we did not derive the PWMs as the reviewer describes. The PWMs we use are based on computational predictions that are in excellent agreement with experimental results. Specifically, the PWMs we use are from PMID 29728462, which acquired 145 -10 and -35 box sequences from the top 3.3% of computationally predicted boxes from Regulon DB. See PMID 14529615 for the computational pipeline that was used to derive the PWMs, which independently aligns the -10 and -35 boxes to create the consensus sequences. The -35 PWM correlates significantly and strongly with an experimentally derived -35 box (see Figure S4 in the Supporting Information of Belliveau et al., PNAS 2017; Pearson correlation coefficient = 0.89). Within the 145 -35 boxes, the exact consensus sequence (TTGACA) that Reviewer #3 is concerned about is present 6 times in our matrix, and has a PWM score above the significance threshold. In other words, TTGACA is classified as a -35 box in our dataset.

      We now provide DNA sequences for each of the figures to improve accessibility and reproducibility. A reader can now use any PWM or method they wish to interpret the data.

      - An uninformed person reading this paper would be led to believe that prokaryotic promoters have only two sequence elements: the -10 and -35 hexamers. This is because the authors completely ignore the role of the TG motif, UP element, and spacer region sequence. All of these can compensate for the lack of a strong -35 hexamer and it's known that appending such elements to a lone -10 sequence can create an active promoter (e.g. PMIDs 15118087, 21398630, 12907708, 16626282, 32297955). Very likely, some of the mutations, classified as not corresponding to a -10 or -35 element in Figure 2, target some of these other promoter motifs.

      Thank you for bringing this oversight to our attention. We have performed two independent analyses to explore the role of TGn in promoter emergence in evolution. First, we computationally searched for -10 boxes with the bases TGn immediately upstream of them in the parent sequences, and found 18 of these “extended -10 boxes” in the parents (lines 143-145):

      “On average, each parent sequence contains ~5.32 -10 boxes and ~7.04 -35 boxes (Fig S1). 18 of these -10 boxes also include the TGn motif upstream of the hexamer.”

      However, only 20% of the parents carrying these boxes have promoter activity (lines 182-185):

      “We also note that 30% (15/50) of parents have the TGn motif upstream of a -10 box, but only 20% (3/15) of these parents have promoter activity (underlined with promoter activity: P4-RFP, P6-RFP, P8-RFP, P9-RFP, P10-RFP, P11-GFP, P12-GFP, P17-GFP, P18-GFP, P18-RFP, P19-RFP, P22-RFP, P24-GFP, P25-GFP, P25-RFP).”

      Second, we computationally searched through all of the daughter sequences to identify new -10 boxes with TGn immediately upstream. We found 114 -10 boxes with the bases TGn upstream. However, only 5 new -10 boxes (2 with TGn) were associated with increasing fluorescence (lines 338-345):

      “On average, 39.5 and 39.4 new -10 and -35 boxes emerged at unique positions within the daughter sequences of each mutagenized parent (Fig 3A,B), with 1’562 and 1’576 new locations for -10 boxes and -35 boxes, respectively. ~22% (684/3’138) of these new boxes are spaced 15-20 bp away from their cognate box, and ~7.3% (114/1’562) of the new -10 boxes have the TGn motif upstream of them. However, only a mere five of the new -10 boxes and four of the new -35 boxes are significantly associated with increasing fluorescence by more than +0.5 a.u. (Fig 3C,D).”

      In addition, we now study the role of UP elements. This analysis showed that the UP element plays a negligible role in promoter emergence within our dataset.  It is discussed in a new subsection of the results (lines 591-608) and in the newly added Figure S13.

      Collectively, these additional analyses suggest that the presence of TGn plus a -10 box is insufficient to create promoter activity, and that the UP element does not play a significant role in promoter emergence or evolution.

      - The model in Figure 4C is highly unlikely. There is no evidence in the literature that RNAP can hang on with one "arm" in this way. In particular, structural work has shown that sequence-specific interactions with the -10 element can only occur after the DNA has been unwound (PMID: 22136875). Further, -10 elements alone, even if a perfect match to the consensus, are non-functional for transcription. This is because RNAP needs to be directed to the -10 by other promoter elements, or transcription factors. Only once correctly positioned, can RNAP stabilise DNA opening and make sequence-specific contacts with the -10 hexamer. This makes the notion that RNAP may interact with the -10 alone, using only domain 2 of sigma, extremely unlikely.

      This is a valid criticism, and we thank the reviewer for catching this problem. In response, we have removed the model and pertinent figures throughout the entire manuscript.

      (2) Reinventing the language used to describe promoters and binding sites for regulators.

      - The authors needlessly complicate the narrative by using non-standard language. For example, on page 1 they define a motif as "a DNA sequence computationally predicted to be compatible with TF binding". They distinguish this from a binding site "because binding sites refer to a location where a TF binds the genome, rather than a DNA sequence". First, these definitions are needlessly complicated, why not just say "putative binding sites" and "known binding sites" respectively? Second, there is an obvious problem with the definitions; many "motifs" will also be "binding sites". In fact, by the time the authors state their definitions, they have already fallen foul of this conflation; in the prior paragraph they stated: "controlled by DNA sequences that encode motifs for TFs to bind". The same issue reappears throughout the paper.

      We agree that this was needlessly complicated. We now just refer to every sequence we study as a motif. A -10 box is a motif, a -35 box is a motif, a putative H-NS binding site is an H-NS motif, etc. The word “binding site” no longer occurs in the manuscript.

      - The authors also use the terms "regulatory" and "non-regulatory" DNA. These terms are not defined by the authors and make little sense. For instance, I assume the authors would describe promoter islands lacking transcriptional activity (itself an incorrect assumption, see below) as non-regulatory. However, as horizontally acquired sections of AT-rich DNA these will all be bound by H-NS and subject to gene silencing, both promoters for mRNA synthesis and spurious promoters inside genes that create untranslated RNAs. Hence, regulation is occurring.

      Another fair point. We have thus changed the terminology throughout to “promoter” and “non-promoter.”

      - Line 63: "In prokaryotes, the primary regulatory sequences are called promoters". Promoters are not generally considered regulatory. Rather, it is adjacent or overlapping sites for TFs that are regulatory. There is a good discussion of the topic here (PMID: 32665585). 

      We have rewritten this. The sentence now reads (lines 67-69):

      “A canonical prokaryotic promoter recruits the RNA polymerase subunit σ70 to transcribe downstream sequences (Burgess et al., 1969; Huerta and Collado-Vides, 2003; Paget and Helmann, 2003; van Hijum et al., 2009).”

      (3) The authors ignore the role of translation.

      - The authors' assay does not measure promoter activity alone, this can only be tested by measuring the amount of RNA produced. Rather, the assay used measures the combined outputs of transcription and translation. If the DNA fragments they have cloned contain promoters with no appropriately positioned Shine-Dalgarno sequence then the authors will not detect GFP or RFP production, even though the promoter could be making an RNA (likely to be prematurely terminated by Rho, due to a lack of translation). This is known for promoters in promoter islands (e.g. Figure 1 in PMID: 33958766).

      We agree that this is definitely a limitation of our study, which we had not discussed sufficiently. In response, we now discuss this limitation in a new section of the discussion (lines 680-686):

      “Second, we measure protein expression through fluorescence as a readout for promoter activity. This readout combines transcription and translation. This means that we cannot differentiate between transcriptional and post-transcriptional regulation, including phenomena such as premature RNA termination (Song et al., 2022; Uptain and Chamberlin, 1997), post-transcriptional modifications (Mohanty and Kushner, 2006), and RNA-folding from riboswitch-like sequences (Mandal and Breaker, 2004).”

      - In Figure S6 it appears that there is a strong bias for mutations resulting in RFP expression to be close to the 3' end of the fragment. Very likely, this occurs because this places the promoter closer to RFP and there are fewer opportunities for premature termination by Rho.

      The reviewer raises a very interesting possibility. To validate it, we have performed the following analysis. We took the RFP expression values from the 9’934 daughters with single mutations in all 25 parent sequences (P1-RFP, P2-RFP, … P25-RFP), and plotted the location of the single mutation (horizontal axis) against RFP expression (vertical axis) in Author response image 2. 

      Author response image 2.

      The distribution is uniform across the sequences, showing that distance from the RBS is not likely the reason for this observation. Since this analysis was uninformative with respect to distance from the RBS, we chose not to include it in the manuscript.

      (4) Ignoring or misrepresenting the literature.

      - As alluded to above, promoter islands are large sections of horizontally acquired, high AT-content, DNA. It is well known that such sequences are i) packed with promoters driving the expression of RNAs that aren't translated ii) silenced, albeit incompletely, by H-NS and iii) targeted by Rho which terminates untranslated RNA synthesis (PMIDs: 24449106, 28067866, 18487194). None of this is taken into account anywhere in the paper and it is highly likely that most, if not all, of the DNA sequences the authors have used contain promoters generating untranslated RNAs.

      Thank you for pointing out that our original submission was incomplete in this regard. We address these concerns with new analyses, including some new experiments. First, Rho-dependent termination is associated with the RUT motif, which is very rich in cytosines (PMID: 30845912). Given that our sequences have an AT-content between 65% and 78%, canonical Rho-dependent termination is unlikely. Nevertheless, we computationally searched for Rho-dependent terminators using the available code from PMID: 30845912, and the algorithm did not identify any putative RUTs. Because this analysis was not informative, we did not include it in the paper.

      We analyzed the role of H-NS on promoter emergence and evolution within our dataset using both experimental and computational approaches. These additional analyses are now shown in the newly-added Figure 5 and the newly-added Figure S12. We found that H-NS represses P22-GFP and P12-RFP and affects the bidirectionality of P20. More specifically, to analyze the effects of H-NS, we first compared the fluorescence levels of parent sequences in a Δhns background vs the wild-type (DH5α) background in Figure 5A. We found 6 candidate H-NS targets, with P22-GFP and P12-RFP exhibiting the largest changes in fluorescence (lines 496-506):

      “We plot the fluorescence changes in Fig 5A as distributions for the 50 parents, where positive and negative values correspond to an increase or decrease in fluorescence in the Δhns background, respectively. Based on the null hypothesis that the parents are not regulated by H-NS, we classified outliers in these distributions (1.5 × the interquartile range) as H-NS-target candidates. We refer to these outliers as “candidates” because the fluorescence changes could also result from indirect trans-effects from the knockout (Mattioli et al., 2020; Metzger et al., 2016). This approach identified 6 candidates for H-NS targets (P2-GFP, P19-GFP, P20-GFP, P22-GFP, P12-RFP, and P20-RFP). For GFP, the largest change occurs in P22-GFP, increasing fluorescence ~1.6-fold in the mutant background (two-tailed t-test, p=1.16×10⁻⁸) (Fig 5B). For RFP, the largest change occurs in P12-RFP, increasing fluorescence ~0.5-fold in the mutant background (two-tailed t-test, p=4.33×10⁻¹⁰) (Fig 5B).”
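
      The outlier rule quoted above can be sketched with the Python standard library. The sketch assumes the conventional fences of Q1 − 1.5×IQR and Q3 + 1.5×IQR and the default quartile method of `statistics.quantiles`; the exact quartile convention used for Fig 5A may differ, and the function name and the input mapping are hypothetical.

```python
import statistics


def iqr_outliers(changes, k=1.5):
    """Return the names whose fluorescence change lies outside the IQR fences.

    changes maps a parent name to its fluorescence change in the
    knockout background relative to wild type.
    """
    values = list(changes.values())
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [name for name, value in changes.items() if value < low or value > high]
```

      Parents flagged by this rule are only candidates, because a large change in the knockout background can also arise from indirect trans-effects.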

      We also observed that the Δhns background affected the bidirectionality of P20 (lines 507-511):

      “We note that for template P20, which is a bidirectional promoter, GFP expression increases ~2.6-fold in the Δhns background (two-tailed t-test, p=1.59×10⁻⁶). Simultaneously, RFP expression decreases ~0.42-fold in the Δhns background (two-tailed t-test, p=4.77×10⁻⁴) (Fig S12A). These findings suggest that H-NS also modulates the directionality of P20’s bidirectional promoter through either cis- or trans-effects.”

      We then searched for regions where losing H-NS motifs in hotspots significantly changed fluorescence. We identified 3 motifs in P12-RFP and P22-GFP (lines 522-528):

      “For P22-GFP, a H-NS motif lies 77 bp upstream of the mapped promoter. Mutations which destroy this motif significantly increase fluorescence by +0.52 a.u. (two-tailed MWU test, q=1.07×10⁻³) (Fig 5E). For P12-RFP, one H-NS motif lies upstream of the mapped promoter’s -35 box, and the other upstream of the mapped promoter’s -10 box. Mutations that destroy these H-NS motifs significantly increase fluorescence by +0.53 and +0.51 a.u., respectively (two-tailed MWU test, q=3.28×10⁻⁴⁰ and q=4.42×10⁻⁵⁰) (Fig 5F,G). Based on these findings, we conclude that these motifs are bound by H-NS.”

      We are grateful for the suggestion to look at the role of H-NS in our dataset. Our analysis revealed a more plausible explanation for what we formerly referred to as a “Tandem Motif” in the original submission. Previously, we had shown that in P12-RFP, expression decreases when a -35 box is created next to the promoter’s -35 box, or a -10 box next to the promoter’s -10 box. These new -10 and -35 boxes, however, also overlap with the two H-NS motifs in P12-RFP. We tested these exact point mutations in reporter plasmids and in the Δhns background, and found that the Δhns background rescues this loss in expression (see Figure S12). This analysis is in the newly added subsection: “The binding of H-NS changes when new -10 and -35 boxes are gained” and can be found at lines 529-563. We summarize the findings in a final paragraph of the section (lines 556-563):

      “To summarize, we present evidence that H-NS represses both P22-GFP and P12-RFP in cis. H-NS also modulates the bidirectionality of P20-GFP/RFP in cis or trans. In P22-GFP, the strongest H-NS motif lies upstream of the promoter. In P12-RFP, the strongest H-NS motifs lie upstream of the -10 and -35 boxes of the promoter. We note that there are 16 additional H-NS motifs surrounding the promoter in P12-RFP that may also regulate P12-RFP (Fig S12G). Mutations in two of these H-NS motifs can create additional -10 and -35 boxes that appear to lower expression. However, the effects of these mutations are insignificant in the absence of H-NS, suggesting that these mutations actually modulate H-NS binding.”

      We also agree that the majority of these sequences are likely driving the expression of many untranslated RNAs (see Purtov et al., 2014). We thus now define a promoter more carefully as follows (lines 113-119):

      “In this study, we define a promoter as a DNA sequence that drives the expression of a (fluorescent) protein whose expression level, measured by its fluorescence, is greater than a defined threshold. We use a threshold of 1.5 arbitrary units (a.u.) of fluorescence. This definition does not distinguish between transcription and translation. We chose it because protein expression is usually more important than RNA expression whenever natural selection acts on gene expression, because it is the primary phenotype visible to natural selection (Jiang et al., 2023).” 

      We also state this as a limitation of our study in the Discussion (lines 680-686):

      “Second, we measure protein expression through fluorescence as a readout for promoter activity. This readout combines transcription and translation. This means that we cannot differentiate between transcriptional and post-transcriptional regulation, including phenomena such as premature RNA termination (Song et al., 2022; Uptain and Chamberlin, 1997), post-transcriptional modifications (Mohanty and Kushner, 2006), and RNA-folding from riboswitch-like sequences (Mandal and Breaker, 2004).”

      - The authors state that GC content does not correlate with the emergence of new promoters. It is known that GC content does correlate to the emergence of new promoters because promoters are themselves AT-rich DNA sequences (e.g. see Figure 1 of PMID: 32297955). There are two reasons the authors see no correlation in this work. First, the DNA sequences they have used are already very AT-rich (between 65 % and 78 % AT-content). Second, they have only examined a small range of different AT-content DNA (i.e. between 65 % and 78 %). The effect of AT-content on promoter emerge is most clearly seen between AT-content of between around 40 % and 60 %. Above that level, the strong positive correlation plateaus.

      We respectfully disagree that this point applies to our analysis. The reviewer refers to the likelihood that a sequence is a promoter, which indeed increases with AT-content. We instead focus on the likelihood that a sequence becomes a promoter through DNA mutation. We note that a more AT-rich DNA sequence is more likely to contain -10 and -35 boxes, because their consensus sequences are also AT-rich. However, H-NS and other transcriptional repressors also bind to AT-rich sequences, which could explain the saturation observed above 60% AT-content in PMID 32297955. We hope to address this trend in future work.

      - Once these authors better include and connect their results to the previous literature, they can also add some discussion of how previous papers in recent years may have also missed some of this important context.

      We apologize for this oversight. We have rewritten the Discussion section to include the following points below. Many of the newly added references come from the group of David Grainger, who works on H-NS repression, bidirectional promoters, promoter emergence, promoter motifs, and spurious transcription in E. coli. More specifically:

      (1) The role of pervasive transcription and the likelihood of promoter emergence (lines 614-621):

      “Instead, we present evidence that promoter emergence is best predicted by the level of background transcription each non-promoter parent produces, a phenomenon also referred to as “pervasive transcription” (Kapranov et al., 2007).

      From an evolutionary perspective, this would suggest that sequences that produce such pervasive transcripts – including the promoter islands (Panyukov and Ozoline, 2013) and the antisense strand of existing promoters (Dornenburg et al., 2010; Warman et al., 2021), may have a proclivity for evolving de-novo promoters compared to other sequences (Kapranov et al., 2007; Wade and Grainger, 2014).”

      (2) How our results contradict the findings from Bykov et al., 2020 (lines 622-640):

      “A previous study randomly mutagenized the appY promoter island upstream of a GFP reporter, and isolated variants with increased and decreased GFP expression. The authors found that variants with higher GFP expression acquired mutations that 1) improve a -10 box to better match its consensus, and simultaneously 2) destroy other -10 and -35 boxes (Bykov et al., 2020). The authors concluded that additional -10 and -35 boxes repress expression driven by promoter islands. Our data challenge this conclusion in several ways. 

      First, we find that only ~13% of -10 and -35 boxes in promoter islands actually contribute to promoter activity. Extrapolating this percentage to the appY promoter island, ~87% (100% - 13%) of the motifs would not be contributing to its activity. Assuming the appY promoter island is not an outlier, this would insinuate that during random mutagenesis, these inert motifs might have accumulated mutations that do not change fluorescence. Indeed, Bykov et al. (Bykov et al., 2020) also found that a similar frequency of -10 and -35 boxes were destroyed in variants selected for lower GFP expression, which supports this argument. Second, we find no evidence that creating a -10 or -35 box lowers promoter activity in any of our 50 parent sequences. Third, we also find no evidence that destruction of a -10 or -35 box increases promoter activity without plausible alternative explanations, i.e. overlap of the destroyed box with a H-NS site, destruction of the promoter, or simultaneous creation of another motif as a result of the destruction. In sum, -10 and -35 boxes are not likely to repress promoter activity.”

      (3) How other sequence features besides the -10 and -35 boxes may influence promoter emergence and activity (lines 661-671):

      “These findings suggest that we are still underestimating the complexity of promoters. For instance, the -10 and -35 boxes, extended -10, and the UP-element may be one of many components underlying promoter architecture. Other components may include flanking sequences (Mitchell et al., 2003), which have been observed to play an important role in eukaryotic transcriptional regulation (Afek et al., 2014; Chiu et al., 2022; Farley et al., 2015; Gordân et al., 2013). Recent studies on E. coli promoters even characterize an AT-rich motif within the spacer sequence (Warman et al., 2020), and other studies use longer -10 and -35 box consensus sequences (Lagator et al., 2022). Another possibility is that there is much more transcriptional repression in the genome than anticipated (Singh et al., 2014). This would also coincide with the observed repression of H-NS in P22-GFP and P12-RFP, and accounts of H-NS repression in the full promoter island sequences (Purtov et al., 2014).”

      (4) The limits of our experimental methodology (lines 675-686):

      “Additionally, future studies will be necessary to address the limitations of our own work. First, we use binary thresholding to determine i) the presence or absence of a motif, ii) whether a sequence has promoter activity or not, and iii) whether a part of a sequence is a hotspot or not. While chosen systematically, the thresholds we use for these decisions may cause us to miss subtle but important aspects of promoter evolution and emergence. Second, we measure protein expression through fluorescence as a readout for promoter activity. This readout combines transcription and translation. This means that we cannot differentiate between transcriptional and post-transcriptional regulation, including phenomena such as premature RNA termination (Song et al., 2022; Uptain and Chamberlin, 1997), posttranscriptional modifications (Mohanty and Kushner, 2006), and RNA-folding from riboswitch-like sequences (Mandal and Breaker, 2004).”

      (5) An updated take-home message (lines 687-694):

      “Overall, our study demonstrates that -10 and -35 boxes neither prevent existing promoters from driving expression, nor do they prevent new promoters from emerging by mutation. It shows how mutations can create new -10 and -35 boxes near or on top of preexisting ones to modulate expression. However, randomly creating a new -10 or -35 box will rarely create a new promoter, even if the new box is appropriately spaced upstream or downstream of a cognate box. Ultimately our study demonstrates that promoter models need to be further scrutinized, and that using mutagenesis to create de-novo promoters can provide new insights into promoter regulatory logic.”

      (5) Lack of information about sequences used and mutations.

      - To properly assess the work any reader will need access to the sequences cloned at the start of the work, where known TSSs are within these sequences (ideally +/- H-NS, which will silence transcription in the chromosomal context but may not when the sequences are removed from their natural context and placed in a plasmid). Without this information, it is impossible to assess the validity of the authors' work.

      Thank you for raising this point. Please see Data S1 for the 25 template sequences (P1-P25) used in this study, and Data S2 for all of the daughter sequences.

      For brevity, we have addressed the reviewer’s request to look at the role of H-NS in their comment (4) “Ignoring or misrepresenting the literature.”

      We do not have information about the predicted transcription start sites (TSS) for the parent sequences because the program which identified them (Platprom) is no longer available. Regardless, having TSS coordinates would not validate or invalidate our findings, since we already know that the promoter islands produce short transcripts throughout their sequences, and we are primarily interested in promoters which can produce complete transcripts.

      - The authors do not account for the possibility that DNA sequences in the plasmid, on either side of the cloned DNA fragment, could resemble promoter elements. If this is the case, then mutations in the cloned DNA will create promoters by "pairing up" with the plasmid sequences. There is insufficient information about the DNA sequences cloned, the mutations identified, or the plasmid, to determine if this is the case. It is possible that this also accounts for mutational hotspots described in the paper.

      We agree that these are important points. To address the criticism that we provided insufficient information, we have now redesigned all our figures to provide this information. Specifically, the figures now include the DNA sequences, their PWM predictions, and the exact mutations that lead to promoter activity. The figures with these changes are Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, S11, and S12. We now also provide more details about pMR1 in a new section of the methods (lines 740-748):

      “Plasmid MR1 (pMR1)

      The plasmid MR1 (pMR1) is a variant of the plasmid RV2 (pRV2) in which the kan resistance gene has been swapped with the cm resistance gene (Guazzaroni and Silva-Rocha, 2014). Plasmid pMR1 encodes the BBa_J34801 ribosomal binding site (RBS, AAAGAGGAGAAA) 6 bp upstream of the start codon for GFP(LVA). The plasmid also encodes a putative RBS (AAGGGAGG) (Cazemier et al., 1999) 5 bp upstream of the start codon for mCherry on the opposite strand.

      The plasmid additionally contains the low-to-medium copy number origin of replication p15A (Westmann et al., 2018).

      A map of the plasmid is available on the Github repository: https://github.com/tfuqua95/promoter_islands.”

      The reviewer also makes a valid point about promoter elements of the plasmid itself. We addressed it with the following new analyses. First, we re-examined each of the examples where new -10 and -35 boxes are gained or lost, to see if any of these hotspots occur on the flanking ends of the parent sequences. We looked specifically at the ends because they could potentially interact with -10 and -35 box-like sequences on the plasmid to form a promoter.

      Only one of these hotspots (out of 27) occurred at the end of the cloned sequences, and is thus a candidate for the phenomenon the reviewer hypothesized. This hotspot occurs in P9-GFP, where gaining a -10 box at the left flank increases expression (see Figure S8E-F’). There is indeed a -35 box 22-23 bp upstream of this -10 box on the plasmid, which could potentially affect promoter activity. 

      We tested the GFP expression of a construct harboring the point mutation which creates this -10 box on the left flank of P9-GFP. However, there was no significant difference in fluorescence between this construct and the wild-type P9-GFP (see Figure S8E-F’). Thus, this -35 box on pMR1 is not likely creating a new promoter.

      (6) Overselling the conclusions.

      Line 420: The paper claims to have generated important new insights into promoters. At the same time, the main conclusion is that "Our study demonstrates that mutations to -10 and -35 boxes motifs are the primary paths to create new promoters and to modulate the activity of existing promoters". This isn't new or unexpected. People have been doing experiments showing this for decades. Of course, mutations that make or destroy promoter elements create and destroy promoters. How could it be any other way?

      In hindsight, we agree that the original conclusion was not very novel. Our new conclusion is that -10 and -35 boxes do not repress transcription, and that our current promoter models, even with the additional motifs like the UP-element and the extended -10, are insufficient to understand promoters (lines 687-694):

      “Overall, our study demonstrates that -10 and -35 boxes neither prevent existing promoters from driving expression, nor do they prevent new promoters from emerging by mutation. It shows how mutations can create new -10 and -35 boxes near or on top of preexisting ones to modulate expression. However, randomly creating a new -10 or -35 box will rarely create a new promoter, even if the new box is appropriately spaced upstream or downstream of a cognate box. Ultimately our study demonstrates that promoter models need to be further scrutinized, and that using mutagenesis to create de-novo promoters can provide new insights into promoter regulatory logic.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I would like to start by thanking the authors for presenting an interesting and well-written article for review. This paper is a welcome addition to the field, addressing modern questions in the longstanding area of bacterial gene regulation. It is both enlightening and inspiring. While I do have suggestions, I hope these are not perceived as a lack of optimism for the work.

      Thank you for your kind words and suggestions, and for providing an astute and constructive review. We feel that the manuscript has greatly improved with your suggested changes.

      ABSTRACT:

      Line 11: The sentence, "It is possible that these motifs influence..." Could be rewritten to be clearer as it is the most important point of the manuscript. It is not obvious that you're talking about how the local landscape of motifs affects the probability of promoters evolving/devolving in this location.

      We have changed the sentence to read, “Here, we ask whether the presence of such motifs in different genetic sequences influences promoter evolution and emergence.”

      INTRODUCTION:

      Line 68: Is the -35 consensus motif not TTGACA? Here it is listed as TTGAAA.

      Corrected from TTGAAA to TTGACA

      RESULTS:

      Line 92-94. In finding that the. The main takeaway from this work is that different sequences have different likelihoods of mutations creating promoters and so I believe this claim could be explored deeper with more quantitative information. Could the authors supplement this claim by including? Could you look at whether there is a correlation between the baseline expression of a parent sequence and Pnew? I expect even the inactive sequences to have some variability in measured expression.

      Thank you for this great idea. We followed up on it by plotting the baseline parent sequence fluorescence scores against Pnew. You are indeed correct, i.e., Pnew increases with baseline expression following a sigmoid function, and is now shown in Figure 1D. To report our new observations, we have added the following section to the Results (lines 219-232):

      “Although mutating each of the 40 non-promoter parent sequences could create promoter activity, the likelihood Pnew that a mutant has promoter activity varies dramatically among parents. For each non-promoter parent, Fig 1D shows the percentage of active daughter sequences. The median Pnew is 0.046 (std. ± 0.078), meaning that ~4.6% of all mutants have promoter activity. The lowest Pnew is 0.002 (P25-GFP) and the highest 0.41 (P8-RFP), a 205-fold difference.

      We hypothesized that these large differences in Pnew could be explained by minute differences in the fluorescence scores of each parent, particularly if its score was below 1.5 a.u. Plotting the fluorescence scores of each parent (N=50) and their respective Pnew values as a scatterplot (Fig 1E), we can fit these values to a sigmoid curve (see methods). This finding helps to explain why P8-RFP has a high Pnew (0.41) and P25-GFP a low Pnew (0.002), as their fluorescence scores are 1.380 and 1.009 a.u., respectively. The fact that the inflection point of the fitted curve is at 1.51 a.u. further justifies our use of 1.5 a.u. as a cutoff for promoter and non-promoter activity.”

      Another potentially interesting analysis would be to see if k-mer content is correlated with Pnew. That is, determine the abundance of all hexamers in the sequence and see if Pnew is correlated with the number of hexamers present that is one nucleotide distance away from the consensus motifs (such as TcGACA or TAcAAT).

      We performed the suggested analysis by searching for k-mers that correlate with Pnew and found that no k-mer significantly correlates with Pnew (lines 240-248):

      “We then asked whether any k-mers ranging from 1-6 bp correlated with the non-promoter Pnew values (5,460 possible k-mers). 718 of these 1-6 bp k-mers are present 3 or more times in at least one non-promoter parent. We calculated a linear regression between the frequency of these 718 k-mers and each Pnew value, and adjusted the p-values to respective q-values (Benjamini-Hochberg correction, FDR=0.05). This analysis revealed six k-mers: CTTC, GTTG, ACTTC, GTTGA, AACTTC, TAACTT which correlate with Pnew. However, these correlations are heavily influenced by an outlying Pnew value of 0.41 (P8-RFP) (Fig S5C-H), and upon removing P8-RFP from the analysis, no k-mer significantly correlates with Pnew (data not shown).”
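
      As a rough illustration of the bookkeeping behind this k-mer analysis, the Python sketch below counts the possible 1-6 bp k-mers, counts overlapping occurrences of a k-mer in a sequence, and converts p-values into Benjamini-Hochberg q-values. All function names are hypothetical, the regression step is omitted, and this is not the code used in the study.

```python
def n_possible_kmers(k_min=1, k_max=6, alphabet_size=4):
    """Number of distinct DNA k-mers with k_min <= k <= k_max."""
    return sum(alphabet_size ** k for k in range(k_min, k_max + 1))


def count_kmer(seq, kmer):
    """Count occurrences of kmer in seq, allowing overlapping matches."""
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as pvalues."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

      A k-mer would then be called significant when its q-value is at or below the chosen FDR (0.05 in the quoted passage).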

      Line 152-157: How did you define the thresholds for 'active' or 'inactive'? It is not clear in the methods how this distinction was made.

      We have more clearly defined these thresholds in the text. A sequence with promoter activity has a fluorescence score greater than or equal to 1.5 a.u. (lines 168-172):

      “We declared a daughter sequence to have promoter activity or to be a promoter if its score was greater than or equal to 1.5 a.u., as this score lies at the boundary between no fluorescence and weak fluorescence based on the sort-seq bins (methods). Otherwise, we refer to a daughter sequence as having no promoter activity or being a non-promoter.”
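
      The threshold rule above, together with the definition of Pnew as the fraction of a parent's daughter sequences with promoter activity, can be sketched in a few lines of Python. The function names and the example scores are hypothetical; only the 1.5 a.u. cutoff comes from the text.

```python
THRESHOLD_AU = 1.5  # cutoff stated in the text (arbitrary units of fluorescence)


def has_promoter_activity(score, threshold=THRESHOLD_AU):
    """True if a fluorescence score is at or above the promoter threshold."""
    return score >= threshold


def p_new(daughter_scores, threshold=THRESHOLD_AU):
    """Fraction of daughter sequences classified as promoters."""
    if not daughter_scores:
        raise ValueError("need at least one daughter score")
    n_active = sum(1 for s in daughter_scores if has_promoter_activity(s, threshold))
    return n_active / len(daughter_scores)


# Hypothetical scores for the daughters of one parent, for illustration only.
example_scores = [1.0, 1.5, 2.0, 1.49]
example_p_new = p_new(example_scores)
```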

      Lines 152-157: In trying to find the parent expression levels, no figure was available showing the distribution of parent expression levels. Furthermore, in looking at Data S2 & filtering out for sequences with distance 0 from the parent, I found the most active sequences did not match up with the sequences described as active in this section (e.g. p19 and p20 have a higher topstrand mean over P22, yet are not listed as active top strand sequences).

      We really appreciate you taking the time to examine the supplemental data. We previously listed the parents that had only GFP activity but no RFP activity (P22), and only RFP activity but no GFP activity (P6, P12, P13, P18, P21). We then said that P19 and P20 were bidirectional promoters, because they showed both GFP and RFP activity. In hindsight, we realize that our wording was confusing. We thus rewrote the affected paragraph, such that the bidirectional promoters are now in both lists of GFP/RFP active parents. We also now make the distinction between “templates” which comprise our 25 promoter island fragments, and “parents”, where we treat both strands separately (50 parents total). The paragraph in question now reads (lines 173-187):

      “Because some sequences in our library are unmutated parent sequences, we determined that 10/50 of the parent sequences already encode promoter activity before mutagenesis. Specifically, three parents drove expression on the top strand (P19-GFP, P20-GFP, P22-GFP), and seven did on the bottom strand (P6-RFP, P12-RFP, P13-RFP, P18-RFP, P19-RFP, P20-RFP, P21-RFP). Two parents harbor bidirectional promoters (P19 and P20). The remaining 40 parent sequences are non-promoters, with an average fluorescence score of 1.39 a.u. We note that some of these parents have a fluorescence score higher than 1.39 a.u., but less than 1.50 a.u. such as P8-RFP (1.38 a.u.), P16-RFP (1.39 a.u.), P9-GFP (1.49 a.u.), and P1-GFP (1.47 a.u.). Whether these are truly “promoters” or not, is based solely on our threshold value of 1.5 a.u. We also note that 30% (15/50) of parents have the TGn motif upstream of a -10 box, but only 20% (3/15) of these parents have promoter activity (underlined with promoter activity: P4-RFP, P6-RFP, P8-RFP, P9-RFP, P10-RFP, P11-GFP, P12-GFP, P17-GFP, P18-GFP, P18-RFP, P19-RFP, P22-RFP, P24-GFP, P25-GFP, P25-RFP). See Fig S4 for fluorescence score distributions for each parent and its daughters, and Data S2 for all daughter sequence fluorescence scores.”

      Please include a supplementary figure showing the different parent expression levels (GFP mean +/- sd). Also, please explain the discrepancy in the 'active sequences' compared to Data S2 or correct my misunderstanding.

      We have added this plot to Figure S4B. The discrepancy arose because we listed the parents that had only GFP activity but no RFP activity (P22), and only RFP activity but no GFP activity (P6, P12, P13, P18, P21). We then said that P19 and P20 were bidirectional promoters, because they showed both GFP and RFP activity. Please see our previous response regarding the ambiguity.

      Line 182: I do not see 'Fuqua and Wagner 2023' in the references (though I am familiar with the preprint).

      We have added Fuqua and Wagner, bioRxiv 2023 to the references.

      Lines 197 - 200: The distribution of hotspot locations should be compared to the distribution of mutations in the library. e.g. It is not notable that 17% of mutations are in -10 motifs if 17% of all mutations are in -10 motifs.

      Thank you for raising this point. To address it, we carried out a computational analysis where we randomly scrambled the nucleotides of each parent sequence while maintaining the coordinates for each mutual information “hotspot.” This scrambling results in significantly less overlap with hotspots and boxes. This analysis is now depicted in Figure 2C and written in lines 272-296.

      Lines 253-264: Examples 3B, 3D, and 3F should indicate the spacing between the new and existing motifs. Are these close to the 15-19 bp spacer lengths preferred by sigma70?

      Point well taken. We now annotate the spacing of motifs in Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, and S11. We note that in many cases, high-scoring PWM hits for the same motif can overlap (i.e. two -10 motifs or two -35 motifs overlap). Additionally, the proximity of a -35 and -10 box does not guarantee that the two boxes are interacting. Together, these two facts can result in an ambiguity of the spacer size between two boxes. To avoid any reporting bias, we thus often report spacer sizes as a range (see Figure panels 4F, S8D, S8F-L, S9A, S9H, S10A, and S10E). The smallest spacer we annotate is in Figure 4F with 10 bp, and the largest is in Figure S8D with 26 bp. Any more “extreme” distances are not annotated, and it is left for the reader to decide if an interaction is present or not.

      Line 255: While fun, I am concerned about the 'Shiko' analogy. My understanding is the prevailing theory is that -35 recognition occurs before -10 recognition (https://doi.org/10.1073/pnas.94.17.9022, 10.1101/sqb.1998.63.141). Given this, the 'Shiko -35' concept in 3H is a bit awkward as it suggests that sigma70 stops at -10 motifs before planting down on the -35. Considering the cited paper is still in the preprint stages (and did not observe these Shiko -35 emergences), I am concerned about how this particular example will be received by the community. Perhaps more care could be done to verify that this example is consistent with generally accepted mechanisms of promoter recognition or a short clarification could be added to clarify the extent of the analogy.

      Thank you for raising this point. We decided to remove the Shiko analogy, because several readers assumed that it relates to the physical binding of RNA polymerase, rather than being an evolutionary mechanism of mutations forming complementary motifs in a stepwise manner.

      Lines 323-326: It would be helpful to describe a more systematic approach to defining emergence events into different categories. A clear definition of each category in the methods or main text would help others consistently refer to these concepts in the future. This could be helped by showing the actual parent vs daughter sequences as a supplementary figure to figures 4B, 4D, & 4G.

      We agree this could have been more clearly communicated. We have addressed this by 1) simplifying the nomenclature of these categories, 2) clearly defining these categories, and 3) showing the actual parent vs daughter sequences in Figure 4, and Supplemental Figures S9, S10, S11, and S12. More specifically:

      (1) Simplifying the nomenclature. We highlight events where gaining new -10 and -35 boxes can modify the promoter activity of parent sequences with promoter activity. This occurs when a new -10 or -35 box appears that partially overlaps with the -10 or -35 box of the actual promoter. Thus, we rename two terms: hetero-gain and homo-gain, shown in Figure 4B:

      (2) We clearly define these categories (lines 430-435):

      “We found that these mutations frequently create new boxes overlapping those we had identified as part of a promoter (Fig S9). This occurs when mutations create a -10 box overlapping a -10 box, a -35 box overlapping a -35 box, a -10 box overlapping a -35 box, or a -35 box overlapping a -10 box. We call the resulting event a “homo-gain” when the new box is of the same type as the one it overlaps, and otherwise a “hetero-gain”. In either case, the creation of the new box does not always destroy the original box.”

      In the original manuscript, there was an additional third category, where gaining a -35 box upstream of the promoter’s -35 box, and gaining a -10 box upstream of the promoter’s -10 box decreased expression. We referred to this as a “tandem motif” and it can be found in Figure S12C,D. However, in response to comment “(4) Ignoring or misrepresenting the literature” from Reviewer #3, we carried out an analysis of the binding of H-NS (see Figure 5 and Figure S12). This analysis revealed that this “tandem motif” phenomenon was actually the result of changing the affinity of H-NS to these regions. Thus, the “tandem motif” is probably spurious.
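
      The homo-gain and hetero-gain definitions quoted above reduce to a simple rule, sketched here in Python. The interval convention (0-based, half-open coordinates) and all function names are assumptions made for illustration, not part of the manuscript.

```python
BOX_TYPES = {"-10", "-35"}


def overlaps(start_a, end_a, start_b, end_b):
    """True if half-open intervals [start_a, end_a) and [start_b, end_b) share a base."""
    return start_a < end_b and start_b < end_a


def classify_gain(new_type, new_span, old_type, old_span):
    """Classify a newly created box relative to an existing promoter box.

    Returns "homo-gain" if the new box overlaps a box of the same type,
    "hetero-gain" if it overlaps a box of the other type, and None if the
    two boxes do not overlap at all.
    """
    if new_type not in BOX_TYPES or old_type not in BOX_TYPES:
        raise ValueError("box type must be '-10' or '-35'")
    if not overlaps(*new_span, *old_span):
        return None
    return "homo-gain" if new_type == old_type else "hetero-gain"
```

      For example, a new -10 box at positions 12-18 that overlaps an existing -10 box at positions 10-16 would be classified as a homo-gain under this rule.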

      DISCUSSION:

      Line 378-379: Since hotspots are essentially areas where promoters appear, wouldn't it be obvious that having more hotspots (i.e. areas where more promoters appear) would equate to a higher probability of new promoters? It would be helpful to clarify why this isn't obvious. This could be resolved by adding more complexity to the statement, such as showing that the level of mutual information found in a hotspot or across all hotspots in a sequence is correlated with Pnew.

      A fair criticism. In response, we have chosen to remove the analysis of this trend from the manuscript entirely. (Additionally, Pnew and mutual information calculations both relied on the fluorescence scores of daughter sequences, so the finding was circular in its logic.)

      Line 394-396: This comparison of findings to Bykov et al should include a bit more justification for the proposed mechanism and how it specifically was observed in this paper. What did they observe and how do these findings relate?

      We gladly followed this suggestion, and added the following two paragraphs to the discussion (lines 622-640).

      “A previous study randomly mutagenized the appY promoter island upstream of a GFP reporter, and isolated variants with increased and decreased GFP expression. The authors found that variants with higher GFP expression acquired mutations that 1) improve a -10 box to better match its consensus, and simultaneously 2) destroy other -10 and -35 boxes (Bykov et al., 2020). The authors concluded that additional -10 and -35 boxes repress expression driven by promoter islands. Our data challenge this conclusion in several ways. 

      First, we find that only ~13% of -10 and -35 boxes in promoter islands actually contribute to promoter activity. Extrapolating this percentage to the appY promoter island, ~87% (100% - 13%) of the motifs would not be contributing to its activity. Assuming the appY promoter island is not an outlier, this would insinuate that during random mutagenesis, these inert motifs might have accumulated mutations that do not change fluorescence. Indeed, Bykov et al. (Bykov et al., 2020) also found that a similar frequency of -10 and -35 boxes were destroyed in variants selected for lower GFP expression, which supports this argument. Second, we find no evidence that creating a -10 or -35 box lowers promoter activity in any of our 50 parent sequences. Third, we also find no evidence that destruction of a -10 or -35 box increases promoter activity without plausible alternative explanations, i.e. overlap of the destroyed box with an H-NS site, destruction of the promoter, or simultaneous creation of another motif as a result of the destruction. In sum, -10 and -35 boxes are not likely to repress promoter activity.”

      METHODS:

      Line 500: Could you provide more details on PMR1 (e.g. size, copy number, RBS strength) or a reference? I could not find this easily.

      Thank you for pointing out this oversight. In response, we have added the following subsection to the methods (lines 740-748):

      “Plasmid MR1 (pMR1)

      The plasmid MR1 (pMR1) is a variant of the plasmid RV2 (pRV2) in which the kan resistance gene has been swapped with the cm resistance gene (Guazzaroni and Silva-Rocha, 2014). Plasmid pMR1 encodes the BBa_J34801 ribosomal binding site (RBS, AAAGAGGAGAAA) 6 bp upstream of the start codon for GFP(LVA). The plasmid also encodes a putative RBS (AAGGGAGG) (Cazemier et al., 1999) 5 bp upstream of the start codon for mCherry on the opposite strand.

      The plasmid additionally contains the low-to-medium copy number origin of replication p15A (Westmann et al., 2018).

      A map of the plasmid is available on the Github repository: https://github.com/tfuqua95/promoter_islands.”

      Line 581: What was the sequencing instrument &/or depth?

      We now report this information as follows (Methods, lines 918-922):

      “Illumina sequencing

      The amplicon pool was sequenced by Eurofins Genomics (Eurofins GmbH, Germany) using a NovaSeq 6000 (Illumina, USA) sequencer, with an S4 flow cell, and a PE150 (Paired-end 150 bp) run. In total, 282’843’000 reads and 84’852’900’000 bases were sequenced. Raw sequencing reads can be found here: https://www.ncbi.nlm.nih.gov/bioproject/1071572.”

      SUPPLEMENT:

      Supplementary Figure 2: Why does the GFP control produce a bimodal distribution?

      The GFP+ culture was inoculated directly from a glycerol stock. The bimodal distribution probably results from a subset of the bacteria having lost the GFP-coding insert, because the left-most peak coincides with the negative control.

      Reviewer #2 (Recommendations For The Authors):

      This paper would benefit from a clear definition of what constitutes an active promoter as this is only mentioned as justification for the use of arbitrary values for fluorescence.

      Good point. To clarify, we now include this new paragraph in the introduction (lines 112-119):

      “In this study, we define a promoter as a DNA sequence that drives the expression of a (fluorescent) protein whose expression level, measured by its fluorescence, is greater than a defined threshold. We use a threshold of 1.5 arbitrary units (a.u.) of fluorescence. This definition does not distinguish between transcription and translation. We chose it because protein expression is usually more important than RNA expression whenever natural selection acts on gene expression, because it is the primary phenotype visible to natural selection (Jiang et al., 2023).”

      There needs to be a clear distinction in the use of the word "sequences", as the text often uses it interchangeably for the 25 parent sequences and for the 50 possible sequence directions in which the promoter can act. It is confusing going from one to the other.

      We agree that this distinction is important. To make it clearer, we now introduce an additional term (lines 119-130). Our experiments start from 25 promoter island fragments (P1-P25), which we now call template sequences. Each template sequence comprises both DNA strands. The parent sequences are the top and bottom strands of each template sequence. Therefore, there are now 50 parent sequences (P1-GFP, P1-RFP, P2-GFP…, P25-RFP). By treating each strand as its own sequence, we no longer have to refer to the strand, avoiding the earlier confusion.

      The description of the hotspots is often unclear and trying to determine if 3 out of 9 hotspots come from one parent sequence or multiple is not possible. A table denoting this information would be most helpful.

      We agree, and now provide this information in Data S3.

      Finally, the description of the proposed mechanism of promoter activation via mutation of motifs should not be in the results but in the discussion, as it has insufficient evidence and would require further experimental validation.

      We remedied this problem by providing experimental validation of the proposed mechanisms. Specifically, we created the precise mutations that caused a loss or gain of a -10 or a -35 box, and measured the level of gene expression they drive with a plate reader. Because we chose to provide this experimental validation, we opted to leave the mechanisms of promoter activation in the results section.

      The (Fuqua and Wagner 2023) paper is not in the references.

      We have added Fuqua and Wagner, bioRxiv 2023 to the references.

      I enjoyed the paper and wish the authors the best for their future work.

      Thank you for taking the time to review our manuscript!

      Reviewer #3 (Recommendations For The Authors):

      The paper has major flaws. For example:

      The data need to be analysed with correct promoter sequence element sequences (TTGACA for the -35 element).

      The discrepancy lies in the frequency of A’s vs C’s at position #5 of the PWM. Our PWM was built with more A’s than C’s at this position, but also includes C’s in this position. However, we respectfully disagree that using a different -35 box PWM is going to change the outcomes of our study. First, positions 4-6 of the PWM barely have any information content (bits) compared to positions 1-3 (see Fig 1A). This assertion is not just based on our own PWM, but based on ample precedent in the literature. In PMID 14529615, TTG is present in 38% of all -35 boxes, but ACA only 8%. In PMID 29388765, with the -10 instance TATAAT, the -35 instance TTGCAA yields stronger promoters compared to the -35 instance TTGACA (See their Figure 3B). In PMID 29745856 (Figure 2), the most information content lies in positions 1-3, with the A and C at position 5 both nearly equally represented, as in our PWM. In PMID 33958766 (Figure 1) an experimentally-derived -35 box is even reduced to a “partial” -35 box which only includes positions 1 and 2, with consensus: TTnnnn. Additionally, the -35 box PWM that we used significantly and strongly correlates with an experimentally derived -35 box (see Supporting Information from Figure S4 of Belliveau et al., PNAS 2017. Pearson correlation coefficient = 0.89). We now provide DNA sequences for each of the figures to improve accessibility and reproducibility. A reader can now use any PWM or method they wish to interpret the data.
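
      For readers less familiar with the "information content (bits)" referred to above, the following Python sketch computes the standard per-position information content of a DNA motif column from its base frequencies. The example frequencies are hypothetical and are not taken from our PWM or from the cited studies; small-sample corrections are ignored.

```python
import math


def information_content(freqs):
    """Information content (bits) of one motif position.

    freqs maps each base to its frequency at that position. For DNA the
    maximum is 2 bits (one base always present) and the minimum is 0 bits
    (all four bases equally likely).
    """
    if abs(sum(freqs.values()) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    # 2 bits minus the Shannon entropy of the column; zero frequencies add nothing.
    return 2.0 + sum(p * math.log2(p) for p in freqs.values() if p > 0)


# Hypothetical columns, for illustration only.
strongly_conserved = {"A": 0.02, "C": 0.02, "G": 0.02, "T": 0.94}
weakly_conserved = {"A": 0.45, "C": 0.35, "G": 0.10, "T": 0.10}
```

      A column in which A and C are nearly equally common, like the second example, carries little information, so swapping the preferred base at such a position changes PWM scores only slightly.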

      The data need to be analysed taking into account the role of other promoter elements and sequences for translation.

      Point well taken. 

      Thank you for bringing this oversight to our attention. We have performed two independent analyses to explore the role of TGn in promoter emergence in evolution. First, we computationally searched for -10 boxes with the bases TGn immediately upstream of them in the parent sequences, and found 18 of these “extended -10 boxes” in the parents (lines 143-145):

      “On average, each parent sequence contains ~5.32 -10 boxes and ~7.04 -35 boxes (Fig S1). 18 of these -10 boxes also include the TGn motif upstream of the hexamer.”

      However, only 20% of these boxes were found in parents with promoter activity (lines 182-185):

      “We also note that 30% (15/50) of parents have the TGn motif upstream of a -10 box, but only 20% (3/15) of these parents have promoter activity (underlined with promoter activity: P4-RFP, P6-RFP, P8-RFP, P9-RFP, P10-RFP, P11-GFP, P12-GFP, P17-GFP, P18-GFP, P18-RFP, P19-RFP, P22-RFP, P24-GFP, P25-GFP, P25-RFP).”

      Second, we computationally searched through all of the daughter sequences to identify new -10 boxes with TGn immediately upstream. We found 114 -10 boxes with the bases TGn upstream. However, only 5 new -10 boxes (2 with TGn) were associated with increasing fluorescence (lines 338-345):

      “Mutations indeed created many new -10 and -35 boxes in our daughter sequences. On average, 39.5 and 39.4 new -10 and -35 boxes emerged at unique positions within the daughter sequences of each mutagenized parent (Fig 3A,B), with 1’562 and 1’576 new locations for -10 boxes and -35 boxes, respectively. ~22% (684/3’138) of these new boxes are spaced 15-20 bp away from their cognate box, and ~7.3% (114/1’562) of the new -10 boxes have the TGn motif upstream of them. However, only a mere five of the new -10 boxes and four of the new -35 boxes are significantly associated with increasing fluorescence by more than +0.5 a.u. (Fig 3C,D).”
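
      The two positional checks used in these analyses (TGn immediately upstream of a -10 box, and a 15-20 bp spacing between cognate boxes) can be sketched as follows in Python. The coordinate convention (0-based start positions of 6-bp boxes on the same strand) and the function names are assumptions for illustration; the box positions themselves would come from a separate PWM search that is not shown.

```python
BOX_LENGTH = 6  # -10 and -35 boxes are treated as hexamers


def has_tgn_upstream(seq, box_start):
    """True if the three bases immediately upstream of a -10 box read T, G, any base."""
    if box_start < 3:
        return False  # not enough sequence upstream of the box
    return seq[box_start - 3:box_start - 1].upper() == "TG"


def spacer_length(start_35, start_10):
    """Number of bases between the end of a -35 box and the start of a -10 box."""
    return start_10 - (start_35 + BOX_LENGTH)


def is_cognate_spacing(start_35, start_10, min_bp=15, max_bp=20):
    """True if the spacer length falls within the given range (inclusive)."""
    return min_bp <= spacer_length(start_35, start_10) <= max_bp
```

      For instance, in the toy sequence CCTGCTATAAT the hexamer TATAAT starts at position 5 and is preceded by TGC, so it would count as an extended -10 box under this rule.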

      In addition, we now study the role of UP elements. This analysis showed that the UP element plays a negligible role in promoter emergence within our dataset.  It is discussed in a new subsection of the results (lines 591-608).

      “The UP-element does not strongly influence promoter activity in our dataset.

      The UP element is an additional AT-rich promoter motif that can lie upstream of a -35 box in a promoter sequence (Estrem et al., 1998; Ross et al., 1993). We asked whether the creation of UP-elements also creates or modulates promoter activity in our dataset. To this end, we first identified a previously characterized position-weight matrix for the UP element (NNAAAWWTWTTTTNNWAAASYM, PWM threshold score = 19.2 bits) (Estrem et al., 1998) (Fig S13A). We then computationally searched for UP-element-specific hotspots within the parent sequences, i.e., locations in which mutations that gain or lose UP-elements lead to significant fluorescence increases (Mann-Whitney U-test, Fig S7 and methods. See Data S8 for the coordinates, fluorescence changes, and significance). The analysis did not identify any UP elements whose mutation significantly changes fluorescence. 

      We then repeated the analysis with a less stringent PWM threshold of 4.8 bits (1/4th of the PWM threshold score). This time, we identified 74 “UP-like” elements that are created or destroyed at unique positions within the parents. 23 of these motifs significantly change fluorescence when created or destroyed. However, even with this liberal threshold, none of these UP-like elements increase fluorescence by more than 0.5 a.u. when gained, or decrease fluorescence by more than 0.5 a.u. when lost (Fig S13B). This finding ultimately suggests that the UP element plays a negligible role in promoter emergence within our dataset.”

      Collectively, these additional analyses suggest that the presence of TGn plus a -10 box is insufficient to create promoter activity, and that the UP element does not play a significant role in promoter emergence or evolution.
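      The first search described above (a -10 box with the bases TGn immediately upstream) can be illustrated with a minimal sketch. It is not the authors' pipeline: the response identifies -10 boxes with a PWM, whereas the hypothetical `find_minus10_exact` below only finds exact matches to the consensus hexamer TATAAT. The names `has_tgn_upstream` and `extended_minus10_boxes` are likewise illustrative.

```python
from typing import List


def find_minus10_exact(seq: str, hexamer: str = "TATAAT") -> List[int]:
    """Hypothetical stand-in for a PWM scan: return 0-based start positions
    of exact matches to the consensus hexamer (overlapping matches allowed)."""
    seq = seq.upper()
    k = len(hexamer)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == hexamer]


def has_tgn_upstream(seq: str, box_start: int) -> bool:
    """True if the three bases immediately upstream of the hexamer read
    T, G, then any base (the 'n' in TGn)."""
    if box_start < 3:
        return False  # not enough upstream sequence to hold TGn
    return seq[box_start - 3:box_start - 1].upper() == "TG"


def extended_minus10_boxes(seq: str) -> List[int]:
    """Start positions of -10 boxes that carry a TGn motif upstream."""
    return [i for i in find_minus10_exact(seq) if has_tgn_upstream(seq, i)]


# Toy sequence with two consensus hexamers; only the first has TGn upstream.
example = "CCTGCTATAATCCATATAATAA"
print(find_minus10_exact(example))
print(extended_minus10_boxes(example))
```

      A PWM-based scan would replace `find_minus10_exact` and return every position scoring above the chosen threshold; the TGn check itself would stay the same.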

      The full sequences used need to be provided and mutations resulting in new promoters need to be shown.

      To Figures 3, 4, 5, and Supplemental Figures S8, S9, S10, S11, and S12, we have added the sequences that created or destroyed the promoters, and their PWM scores.

      The paper needs to be rewritten to take into account the relevant literature on i) promoter islands (i.e. sections of horizontally acquired AT-rich DNA) ii) generation and loss of promoters by mutation.

      We have rewritten the introduction. The majority of these points are now addressed in the following two new paragraphs (lines 92-112):

      “Recent work shows that mutations can help new promoters to emerge from promoter motifs or from sequences adjacent to such motifs (Bykov et al., 2020; Fuqua and Wagner, 2023; Yona et al., 2018). However, encoding -10 and -35 boxes is insufficient to drive complete transcription of a gene coding sequence. For instance, the E. coli genome contains clusters of -10 and -35 boxes that are bound by RNA polymerase and produce short oligonucleotide fragments, but rarely create complete transcripts. Such clusters are called promoter islands, and are strongly associated with horizontally-transferred DNA (Bykov et al., 2020; Panyukov and Ozoline, 2013; Purtov et al., 2014; Shavkunov et al., 2009). 

      There are two proposed explanations for why promoter islands do not create full transcripts. First, the TF H-NS may repress promoter activity in promoter islands. This is because in a Δhns background, transcript levels from the promoter islands increase (Purtov et al., 2014). However, mutagenizing a specific promoter island (appY) until it transcribes a GFP reporter reveals that in-vitro H-NS binding does not significantly change when GFP levels increase (Bykov et al., 2020). Thus, it is not clear whether H-NS actually represses the complete transcription of these sequences. The second proposed explanation is that excessive promoter motifs silence transcription. The aforementioned study found that promoter activity increases when mutations improve a -10 box to better match its consensus (TAAAAAT→TATACT), while simultaneously destroying surrounding -10 and -35 boxes (Bykov et al., 2020). However, we note that if these surrounding motifs never contributed to GFP fluorescence to begin with, then mutations could also simply have accumulated in them during random mutagenesis without affecting promoter activity.”

      In closing, we would like to thank all three reviewers again for your time to engage with this manuscript.

      Summary of specific changes that we have made to each section of the manuscript 

      • Abstract

      - We updated the abstract to include the finding that more than 1’500 new -10s and -35s are created in our dataset, but only ~0.3% of them actually create de-novo promoter activity.

      - We no longer highlight the conclusion that the majority of promoters emerge and evolve from -10 and -35 boxes.

      • Introduction

      - We have added more background information about the UP-element and the TGn motif.

      - We better describe the promoter islands and the results identified by Bykov et al., 2020.

      • Results: Promoter island sequences are enriched with motifs for -10 and -35 boxes.

      - We clarify how the -10 and -35 PWMs we use were derived.

      - We refer to the 25 promoter island fragments as “Template sequences” (P1-P25). The “parent sequences” now correspond to the top and bottom strands of each template (N=50, P1-GFP, P1-RFP, P2-GFP, …, P25-RFP).

      - We elaborate that ~7% of the -10 boxes in the template sequences have the TGn motif.

      - In the previous version of the manuscript, if there were overlapping -10 boxes or overlapping -35 boxes, we counted these to be a single -10 box or a single -35 box, respectively. In the new version of the manuscript, we now treat each motif as an independent box. Because of this, the number of -10 and -35 boxes per parent has slightly increased.

      • Results: Non-promoters vary widely in their potential to become promoters.

      - We make a clear distinction between promoters and non-promoters, and define the parent sequences.

      - We note that only 20% of parents with an “extended -10 box” have promoter activity.

      • Results: Promoter emergence correlates with minute differences in background promoter levels.

      - We added an analysis where we compare Pnew to the parent fluorescence levels, even if they are below 1.5 a.u. We find that the distribution of Pnew matches a sigmoid function.

      • Results: Promoter emergence does not correlate with simple sequence features

      - We added an analysis comparing k-mer counts to Pnew.

      - We updated the way we count -10 and -35 boxes, and recalculated the correlation with Pnew. The P and R2 values have changed, but Pnew still does not significantly correlate with -10 or -35 box counts.

      • Results: Promoters emerge and evolve only from specific subsets of -10 and -35 boxes

      - We have added an analysis where we computationally scramble the wild-type parent sequences while maintaining the coordinates of the mutual information hotspots. This reveals that the overlap with -10 and -35 motifs is not a coincidence of dense promoter motif encoding.

      - We found a computational error in our analysis and updated the percent overlap between -10 boxes and -35 boxes with mutual information hotspots. The results are similar.

      - 14% of -10 boxes overlap with hotspots with our new way of defining -10 and -35 boxes.

      • Results: New -10 and -35 boxes readily emerge, but rarely lead to de-novo promoter activity

      - We quantify how often a new -10 and -35 box is created at a unique position within our collection of promoter fragments, and how often this results in a -10 and -35 box being appropriately spaced, and how often this actually leads to de-novo promoter activity.

      - We quantify how often a TGn sequence lies upstream of a new -10 box.

      • Results: Promoters can emerge when mutations create motifs but not by destroying them.

      - For each example, we added the DNA sequences of the wild-type region of interest and the mutant region of interest that results in the gain of promoter activity, and their respective PWM scores. 

      - We created constructs to validate each example by testing their fluorescence on a plate reader.

      - We removed the P1-GFP example from the main figure, as it was a false-positive in the dataset. It is now in Fig S8.

      - We removed the Shiko Emergence metaphor because it could be confused with a binding mechanism for RNA polymerase.

      • Results – Gaining new motifs over existing motifs increases and decreases promoter activity.

      - We removed the “Tandem motif” because it is more likely caused by H-NS binding.

      - We renamed the mechanisms to be “hetero-gain” and “homo-gain” for simplicity, and clearly define how we classified each sequence into each category.

      - We now include the DNA sequences, the PWM scores, the spacer lengths, and the fluorescence values from constructs harboring the predicted point mutations.

      • Results – Histone-like nucleoid-structuring protein (H-NS) represses P12-RFP and P22-GFP.

      - This is a new analysis, which explores the role of the TF H-NS in repressing the parent sequences. 

      - We identified putative H-NS motifs in P12-RFP and P22-GFP.

      - We show experimentally that in a H-NS null background, a bidirectional promoter (P20) becomes unidirectional, even though P20 does not contain an obvious H-NS motif.

      - In the original version of the manuscript, we describe a phenomenon where gaining a -35 box upstream of a promoter’s -35 box, or a -10 box upstream of a promoter’s -10 box significantly decreases expression. We called this phenomenon a “tandem motif.” However, in the newest version of the manuscript, we find that these fluorescence decreases are rescued in a H-NS null background, suggesting the finding was actually due to H-NS binding modulation and not -10 and -35 boxes.

      • Results – The UP-element does not strongly influence promoter activity in our dataset.

      - We used a PWM for the UP element to test whether gaining or losing UP motifs was significantly associated with increasing or decreasing expression. Even with a liberal PWM threshold, the analysis did not find any UP elements whose gain or loss changes fluorescence by more than 0.5 a.u.

      • Discussion

      - We rewrote the discussion to account for the new analyses and the results on H-NS, the UP-element, and the extended -10.

      - We better explain how our results clash with the results from the Bykov paper.

      - We fit our results into the context of David Grainger’s papers.

      • Methods

      - Added an explanation about pMR1.

      - Added methods describing how we created the point mutation constructs.

      - Added the methods for the plate reader.

      - Added the methods for Illumina sequencing.

      - Added the methods for the sigmoid curve-fitting.

      • Figure 1

      - Panel E compares how Pnew (the probability of a daughter sequence having a fluorescence score greater than 1.5 a.u.) associates with the fluorescence scores of each parent sequence.

      - Panel F was originally in Figure S5. In the originally submitted version of the manuscript, if there were overlapping -10s or overlapping -35s, we counted these to be a single -10 or a single -35, respectively. In the new version of the manuscript, we now treat each motif as an independent box. Because of this, the r2 and p values have changed, but the conclusions have not (Pnew still does not significantly correlate with -10 or -35 box counts).

      • Figure 2

      - Panel C now includes a stacked barplot showing the percentage of -10 and -35 boxes that overlap with mutual information hotspots when the parent sequences are randomly scrambled computationally.

      • Figure 3

      - Panels A-C were added to explain how we define a new -10/-35 box, how many such new boxes each parent has. These panels also illustrate how we associate the presence or absence of a motif with significant changes in fluorescence scores of the daughter sequences.

      - We moved the example of P1-GFP to Figure S8 because when we tested the specific mutation which leads to gaining the -10 box, fluorescence did not change.

      - We now include the DNA sequences, the PWM scores, the spacer lengths, and the fluorescence values from reporter constructs harboring the point mutations predicted by our computational analyses.

      - Cartoons of RNA polymerase have been removed.

      • Figure 4

      - The tandem-motif has been removed from the figure.

      - Cartoons of RNA polymerase have been removed.

      - We now include the DNA sequences, the PWM scores, the spacer lengths, and the fluorescence values from constructs harboring the point mutations predicted by our computational analyses.

      • Figure 5

      - This is a new figure analyzing the role of H-NS in promoter evolution and emergence.

      • Figure S4

      - Panel B now shows the wild-type parent scores and their standard deviations from the sort-seq experiment.

      • Figure S5

      - Panels with -10 and -35 box counts moved to Figure 1.

      - The panel comparing Pnew to hotspot counts was removed.

      - Correlations between different k-mers and Pnew are added to panels C-H.

      • Figure S8

      - We now include the DNA sequences, the PWM scores, the spacer lengths, and the fluorescence values from constructs harboring the point mutations predicted by our computational analyses.

      • Figure S9

      - We now include the DNA sequences, the PWM scores, the spacer lengths, and the fluorescence values from constructs harboring the point mutations predicted by our computational analyses.

      • Figure S10

      - We now include the DNA sequences, the PWM scores, the spacer lengths, and the fluorescence values from constructs harboring the point mutations predicted by our computational analyses.

      • Figure S11

      - Added DNA sequences and PWM scores.

      • Figure S12

      - A new figure with further insights about H-NS.

      • Figure S13

      - A new figure regarding the UP-element analysis.

      • Figure S14

      - Added Panel D to show how we created mutant reporter constructs for validation.

    1. eLife Assessment

      This study presents a new quantitative method, CROWN-seq, to map the cap-adjacent RNA modification N6,2'-O-dimethyladenosine (m6Am) with single nucleotide resolution. Using thoughtful controls and well-validated reagents, the authors provide compelling evidence that the method is reliable and reproducible. Additionally, the study provides important evidence that m6Am may increase transcription in modified mRNAs; however, the data demonstrate only a correlation between m6Am and transcriptional regulation rather than causality. Overall, this study is poised to advance m6Am research, being of broad interest to the RNA biology and gene regulation fields.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Liu et al. present CROWN-seq, a technique that simultaneously identifies transcription-start nucleotides and quantifies N6,2'-O-dimethyladenosine (m6Am) stoichiometry. This method is derived from ReCappable-seq and GLORI, a chemical deamination approach that differentiates A and N6-methylated A. Using ReCappable-seq and CROWN-seq, the authors found that genes frequently utilize multiple transcription start sites, and isoforms beginning with an Am are almost always N6-methylated. These findings are consistently observed across nine cell lines. Unlike prior reports that associated m6Am with mRNA stability and expression, the authors suggest here that m6Am may increase transcription when combined with specific promoter sequences and initiation mechanisms. Additionally, they report intriguing insights on m6Am in snRNA and snoRNA and its regulation by FTO. Overall, the manuscript presents a strong body of work that will significantly advance m6Am research.

      Strengths:

      The technology development part of the work is exceptionally strong, with thoughtful controls and well-supported conclusions.

      Weaknesses:

      Given the high stoichiometry of m6Am, further association with upstream and downstream sequences (or promoter sequences) does not appear to yield strong signals. As such, transcription initiation regulation by m6Am, suggested by the current work, warrants further investigation.

    3. Reviewer #2 (Public review):

      Summary:

      In the manuscript "Decoding m6Am by simultaneous transcription-start mapping and methylation quantification" Liu and co-workers describe the development and application of CROWN-Seq, a new specialized library preparation and sequencing technique designed to detect the presence of cap-adjacent N6,2'-O-dimethyladenosine (m6Am) with single nucleotide resolution. Such a technique was a key need in the field since prior attempts to get accurate positional or quantitative measurements of m6Am positioning yielded starkly different results and failed to generate a consistent set of targets. As noted in the strengths section below the authors have developed a robust assay that moves the field forward.

      Furthermore, their results show that most mRNAs whose transcription start nucleotide (TSN) is an 'A' are in fact m6Am (85%+ for most cell lines). They also show that snRNAs and snoRNAs have a substantially lower prevalence of m6Am TSNs.

      Strengths:

      Critically, the authors spent substantial time and effort to validate and benchmark the new technique with spike-in standards during development, cross-comparison with prior techniques, and validation of the technique's performance using a genetic PCIF1 knockout. Finally, they assayed nine different cell lines to cross-validate their results. The outcome of their work (a reliable and accurate method to catalog cap-adjacent m6Am) is a particularly notable achievement and is a needed advance for the field.

      Weaknesses:

      No major concerns were identified by this reviewer.

      Mid-level Concerns:

      (1) In Lines 625 and 626, the authors state that "our data suggest that mRNAs initate (mis-spelled by authors) with either Gm, Cm, Um, or m6Am." This reviewer took those words to mean that for A-initiated mRNAs, m6Am was the 'default' TSN. This contradicts their later premise that promoter sequences play a role in whether m6Am is deposited.

      (2) Further, the following paragraph (lines 633-641) uses fairly definitive language that is unsupported by their data. For example in lines 637 and 638 they state "We found that these differences are often due to the specific TSS motif." Simply, using 'due to' implies a causative relationship between the promoter sequences and m6Am has been demonstrated. The authors do not show causation, rather they demonstrate a correlation between the promoter sequences and an m6Am TSN. Finally, despite claiming a causal relationship, the authors do not put forth any conceptual framework or possible mechanism to explain the link between the promoter sequences and transcripts initiating with an m6Am.

      (3) The authors need to soften the language concerning these data and their interpretation to reflect the correlative nature of the data presented to link m6Am and transcription initiation.

    4. Reviewer #3 (Public review):

      Summary:

      m6Am is an abundant mRNA modification present on the TSN. Unlike the structurally similar and abundant internal mRNA modification m6A, m6Am's function has been controversial. One way to resolve controversies surrounding mRNA modification functions has been to develop new ways to better profile said mRNA modification. Here, Liu et al. developed a new method (based on GLORI-seq for m6A-sequencing), for antibody-independent sequencing of m6Am (CROWN-seq). Using appropriate spike-in controls and knockout cell lines, Liu et al. clearly demonstrated CROWN-seq's precision and quantitative accuracy for profiling transcriptome-wide m6Am. Subsequently, the authors used CROWN-seq to greatly expand the number of known m6Am sites in various cell lines and also determine m6Am stoichiometry to generally be high for most genes. CROWN-seq identified gene promoter motifs that correlate best with high stoichiometry m6Am sites, thereby identifying new determinants of m6Am stoichiometry. CROWN-seq also helped reveal that m6Am does not regulate mRNA stability or translation (as opposed to past reported functions). Rather, m6Am stoichiometry correlates well with transcription levels. Finally, Liu et al. reaffirmed that FTO mainly demethylates m6Am, not of mRNA but of snRNAs and snoRNAs.

      Strengths:

      This is a well-written manuscript that describes and validates a new m6Am-sequencing method: CROWN-seq as the first m6Am-sequencing method that can both quantify m6Am stoichiometry and profile m6Am at single-base resolution. These advantages facilitated Liu et al. to uncover new potential findings related to m6Am regulation and function. I am confident that CROWN-seq will likely be the gold standard for m6Am-sequencing henceforth.

      Weaknesses:

      Though the authors have uncovered a potentially new function for m6Am, they need to be clear that without identifying a mechanism, their data might only be demonstrating a correlation between the presence of m6Am and transcriptional regulation rather than causality.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Liu et al. present CROWN-seq, a technique that simultaneously identifies transcription-start nucleotides and quantifies N6,2'-O-dimethyladenosine (m6Am) stoichiometry. This method is derived from ReCappable-seq and GLORI, a chemical deamination approach that differentiates A and N6-methylated A. Using ReCappable-seq and CROWN-seq, the authors found that genes frequently utilize multiple transcription start sites, and isoforms beginning with an Am are almost always N6-methylated. These findings are consistently observed across nine cell lines. Unlike prior reports that associated m6Am with mRNA stability and expression, the authors suggest here that m6Am may increase transcription when combined with specific promoter sequences and initiation mechanisms. Additionally, they report intriguing insights on m6Am in snRNA and snoRNA and its regulation by FTO. Overall, the manuscript presents a strong body of work that will significantly advance m6Am research.

      Strengths:

      The technology development part of the work is exceptionally strong, with thoughtful controls and well-supported conclusions.

      We appreciate the reviewer for the very positive assessment of the study. We have addressed the concerns below.

      Weaknesses:

      Given the high stoichiometry of m6Am, further association with upstream and downstream sequences (or promoter sequences) does not appear to yield strong signals. As such, transcription initiation regulation by m6Am, suggested by the current work, warrants further investigation.

      We thank the reviewer for the insightful comments. We have softened the language related to m6Am and transcription regulation. We totally agree with the reviewer that future investigation is required to determine the molecular mechanism behind m6Am and transcription regulation.

      Reviewer #2 (Public review):

      Summary:

      In the manuscript "Decoding m6Am by simultaneous transcription-start mapping and methylation quantification" Liu and co-workers describe the development and application of CROWN-Seq, a new specialized library preparation and sequencing technique designed to detect the presence of cap-adjacent N6,2'-O-dimethyladenosine (m6Am) with single nucleotide resolution. Such a technique was a key need in the field since prior attempts to get accurate positional or quantitative measurements of m6Am positioning yielded starkly different results and failed to generate a consistent set of targets. As noted in the strengths section below the authors have developed a robust assay that moves the field forward.

      Furthermore, their results show that most mRNAs whose transcription start nucleotide (TSN) is an 'A' are in fact m6Am (85%+ for most cell lines). They also show that snRNAs and snoRNAs have a substantially lower prevalence of m6Am TSNs.

      Strengths:

      Critically, the authors spent substantial time and effort to validate and benchmark the new technique with spike-in standards during development, cross-comparison with prior techniques, and validation of the technique's performance using a genetic PCIF1 knockout. Finally, they assayed nine different cell lines to cross-validate their results. The outcome of their work (a reliable and accurate method to catalog cap-adjacent m6Am) is a particularly notable achievement and is a needed advance for the field.

      Weaknesses:

      No major concerns were identified by this reviewer.

      We thank the reviewer for the positive assessment of the method and dataset. We have addressed the concerns below.

      Mid-level Concerns:

      (1) In Lines 625 and 626, the authors state that “our data suggest that mRNAs initate (mis-spelled by authors) with either Gm, Cm, Um, or m6Am.” This reviewer took those words to mean that for A-initiated mRNAs, m6Am was the ‘default’ TSN. This contradicts their later premise that promoter sequences play a role in whether m6Am is deposited.

      We thank the reviewer for the comment. We have changed this sentence into “Instead, our data suggest that mRNAs initiate with either Gm, Cm, Um, or Am, where Am are mostly m6Am modified.” The revised sentence separates the processes of transcription initiation and m6Am deposition, which will not confuse the reader.

      (2) Further, the following paragraph (lines 633-641) uses fairly definitive language that is unsupported by their data. For example in lines 637 and 638 they state “We found that these differences are often due to the specific TSS motif.” Simply, using ‘due to’ implies a causative relationship between the promoter sequences and m6Am has been demonstrated. The authors do not show causation, rather they demonstrate a correlation between the promoter sequences and an m6Am TSN. Finally, despite claiming a causal relationship, the authors do not put forth any conceptual framework or possible mechanism to explain the link between the promoter sequences and transcripts initiating with an m6Am.

      (3) The authors need to soften the language concerning these data and their interpretation to reflect the correlative nature of the data presented to link m6Am and transcription initiation.

      For (2) and (3): we have softened the language in the revised manuscript. Specifically, for lines 633-641 in the original manuscript, we have changed “are often due to” into “are often related to” in the revised manuscript, which claims a correlation rather than a causation.

      Reviewer #3 (Public review):

      Summary:

      m6Am is an abundant mRNA modification present on the TSN. Unlike the structurally similar and abundant internal mRNA modification m6A, m6Am’s function has been controversial. One way to resolve controversies surrounding mRNA modification functions has been to develop new ways to better profile said mRNA modification. Here, Liu et al. developed a new method (based on GLORI-seq for m6A-sequencing), for antibody-independent sequencing of m6Am (CROWN-seq). Using appropriate spike-in controls and knockout cell lines, Liu et al. clearly demonstrated CROWN-seq’s precision and quantitative accuracy for profiling transcriptome-wide m6Am. Subsequently, the authors used CROWN-seq to greatly expand the number of known m6Am sites in various cell lines and also determine m6Am stoichiometry to generally be high for most genes. CROWN-seq identified gene promoter motifs that correlate best with high stoichiometry m6Am sites, thereby identifying new determinants of m6Am stoichiometry. CROWN-seq also helped reveal that m6Am does not regulate mRNA stability or translation (as opposed to past reported functions). Rather, m6Am stoichiometry correlates well with transcription levels. Finally, Liu et al. reaffirmed that FTO mainly demethylates m6Am, not of mRNA but of snRNAs and snoRNAs.

      Strengths:

      This is a well-written manuscript that describes and validates a new m6Am-sequencing method: CROWN-seq as the first m6Am-sequencing method that can both quantify m6Am stoichiometry and profile m6Am at single-base resolution. These advantages facilitated Liu et al. to uncover new potential findings related to m6Am regulation and function. I am confident that CROWN-seq will likely be the gold standard for m6Am-sequencing henceforth.

      Weaknesses:

      Though the authors have uncovered a potentially new function for m6Am, they need to be clear that without identifying a mechanism, their data might only be demonstrating a correlation between the presence of m6Am and transcriptional regulation rather than causality.

      We thank the reviewer for the very positive assessment of the CROWN-seq method. We have softened the language which is related to the correlation between m6Am and transcription regulation.

    1. eLife Assessment

      This valuable study combined multiple approaches to gain insight into why rising estradiol levels, by influencing hypothalamic neurons, ultimately lead to ovulation. The experimental data were solid, but evidence for the conclusion that the findings explain how estradiol acts in the intact female was incomplete because the study lacked experimental conditions that better approximate physiological conditions. Nevertheless, the work will be of interest to reproductive biologists working on ovarian biology and female fertility.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, Qiu and colleagues examined the effects of preovulatory (i.e., proestrous or late follicular phase) levels of circulating estradiol on multiple calcium and potassium channel conductances in arcuate nucleus kisspeptin neurons. Although these cells are strongly linked to a role as the "GnRH pulse generator," the goal here was to examine the physiological properties of these cells in a hormonal milieu mimicking late proestrus, the time of the preovulatory GnRH-LH surge. Computational modeling is used to manipulate multiple conductances simultaneously and support a role for certain calcium channels in facilitating a switch in firing mode from tonic to bursting. CRISPR knockdown of the TRPC5 channel reduced overall excitability, but this was only examined in cells from ovariectomized mice without estradiol treatment.

      Comments to address most recent author response:

      The concern regarding the CRISPR experiments being confined to OVX mice is that the results can only suggest that CRISPR-mediated knockdown of TRPC5 can, at best, phenocopy the OVX+E2 condition. A reciprocal experiment in the opposite direction (for example, that returning TRPC5 to OVX levels in OVX+E2 mice prevents the changes in firing activity and pattern typical of the OVX+E2 condition) would strengthen the indication that E2-sensitive changes in TRPC5 expression and function are critically important to surge function. Acknowledging this as a limitation of the studies would help to better contextualize the value of the CRISPR experiments to an understanding of surge mechanisms when done only in OVX conditions.

      The nature of the confusion regarding the consideration of OVX+E2 conditions in the computational model primarily arises from the methods description in the supplemental file: "The effect of E2 on ionic currents is modelled as a change in the maximum conductance parameter. For currents IM, IT, ICa and ITRPC5 this change is inferred from the qPCR data assuming that the conductance is directly proportional to the mRNA expression." If these were instead based on the whole-cell recordings as the authors now indicate in their response, then this description needs to be edited and clarified accordingly. Furthermore, the section states, "For ISK, IBK, Ileak, the OVX and OVX+E2 conductances are obtained from current-voltage relationships recorded from Kiss1ARH neurons in the absence/presence of iberiotoxin (BK blocker) and apamin (SK blocker). All other currents were assumed to be unaffected by E2." This section thus does not directly indicate that the recordings in the stated figures were used in the model, and moreover suggests that currents besides ISK, IBK, and Ileak were not different in OVX+E2 conditions.

      The prior evidence stated for correlation of mRNA and channel conductance is not explicitly cited in the manuscript. It is well known that post-translational modifications, physiological modulation of individual channel biophysical properties, and many other factors can influence the end output of a membrane conductance. Therefore, the authors should, at minimum, provide a literature citation supporting the assumption used here.

    3. Reviewer #2 (Public review):

      Summary:

      Kisspeptin neurons of the arcuate nucleus (ARC) are thought to be responsible for the pulsatile GnRH secretory pattern and to mediate feedback regulation of GnRH secretion by estradiol (E2). Evidence in the literature, including the work of the authors, indicates that ARC kisspeptin neurons coordinate their activity through reciprocal synaptic interactions and the release of glutamate and of neuropeptide neurokinin B (NKB), which they co-express. The authors show here that E2 regulates the expression of genes encoding different voltage-dependent calcium channels, calcium-dependent potassium channels and canonical transient receptor potential (TRPC5) channels and of the corresponding ionic currents in ARC kisspeptin neurons. Using computer simulations of the electrical activity of ARC kisspeptin neurons, the authors also provide evidence of what these changes translate into in terms of these cells' firing patterns. The experiments reveal that E2 upregulates various voltage-gated calcium currents as well as 2 subtypes of calcium-dependent potassium currents, while decreasing TRPC5 expression (an ion channel downstream of NKB receptor activation), the slow excitatory synaptic potentials (slow EPSP) elicited in ARC kisspeptin neurons by NKB release and expression of the G protein-associated inward-rectifying potassium channel (GIRK). Based on these results, and on those of computer simulations, the authors propose that E2 promotes a functional transition of ARC kisspeptin neurons from neuropeptide-mediated sustained firing that supports coordinated activity for pulsatile GnRH secretion to a less intense burst-like firing pattern that could favor glutamate release from ARC kisspeptin neurons. The authors suggest that the latter might be important for the generation of the preovulatory surge in females.

      Strengths:

      The authors combined multiple approaches in vitro and in silico to gain insights into the impact of E2 on the electrical activity of ARC kisspeptin neurons. These include patch-clamp electrophysiology combined with selective optogenetic stimulation of ARC kisspeptin neurons, reverse transcriptase quantitative PCR, pharmacology and CRISPR-Cas9-mediated knockdown of the Trpc5 gene. The addition of computer simulations for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.

      The authors add interesting information on the complement of ionic currents in ARC kisspeptin neurons and on their regulation by E2 to what was already known in the literature. Pharmacological and electrophysiological experiments appear of the highest standards and robust statistical analyses are provided throughout. The impact of E2 replacement on calcium and potassium currents is compelling. Likewise, the results of Trpc5 gene knockdown do provide good evidence that the TRPC5 channel plays a key role in mediating the NKB-mediated slow EPSP. Surprisingly, this also revealed an unsuspected role for this channel in regulating the membrane potential and excitability of ARC kisspeptin neurons.

      Weaknesses:

      The manuscript also has weaknesses that obscure some of the conclusions drawn by the authors.

      One is that the authors compare here two conditions, OVX versus OVX replaced with high E2, that may not reflect the physiological conditions under which the proposed transition between neuropeptide-dependent sustained firing and less intense burst firing might take place (i.e. the diestrous [low E2] and proestrous [high E2] stages of the estrous cycle). This is an important caveat to keep in mind when interpreting the authors' findings. Indeed, that E2 alters certain ionic currents when added back to OVX females, does not mean that the magnitude of all of these ionic currents will vary during the estrous cycle.

      In addition, although the computational modeling indicates a role of the various E2-modulated conductances in causing a transition in ARC kisspeptin neuron firing pattern, their role is not directly tested in physiological recordings, weakening the link between these changes and the shift in firing patterns.

      Overall, the manuscript provides interesting information about the effects of E2 on specific ionic currents in ARC kisspeptin neurons and some insights into the functional impact of these changes. However, some of the conclusions of the work, with regard, in particular, to the role of these changes in ion channels and to their implications for the LH surge, are not fully supported by the findings.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review):

      Summary:

      In this work, Qiu and colleagues examined the effects of preovulatory (i.e., proestrous or late follicular phase) levels of circulating estradiol on multiple calcium and potassium channel conductances in arcuate nucleus kisspeptin neurons. Although these cells are strongly linked to a role as the "GnRH pulse generator," the goal here was to examine the physiological properties of these cells in a hormonal milieu mimicking late proestrus, the time of the preovulatory GnRH-LH surge. Computational modeling is used to manipulate multiple conductances simultaneously and support a role for certain calcium channels in facilitating a switch in firing mode from tonic to bursting. CRISPR knockdown of the TRPC5 channel reduced overall excitability, but this was only examined in cells from ovariectomized mice without estradiol treatment.

      Comments to address most recent author response:

      The concern regarding the CRISPR experiments being confined to OVX mice is that the results can only suggest that CRISPR-mediated knockdown of TRPC5 can, at best, phenocopy the OVX+E2 condition. A reciprocal experiment in the opposite direction (for example, that returning TRPC5 to OVX levels in OVX+E2 mice prevents the changes in firing activity and pattern typical of the OVX+E2 condition) would strengthen the indication that E2-sensitive changes in TRPC5 expression and function are critically important to surge function. Acknowledging this as a limitation of the studies would help to better contextualize the value of the CRISPR experiments to an understanding of surge mechanisms when done only in OVX conditions.

      We have noted in the manuscript that “It would be of interest in future experiments to do the reciprocal experiment to see if overexpressing Trpc5 channels in Kiss1ARH neurons from OVX + E2 females restores the RMP and “rescues” the synchronization phenotype.”

      The nature of the confusion regarding the consideration of OVX+E2 conditions in the computational model primarily arises from the methods description in the supplemental file: "The effect of E2 on ionic currents is modelled as a change in the maximum conductance parameter. For currents IM,IT, ICa and ITRPC5 this change is inferred from the qPCR data assuming that the conductance is directly proportional to the mRNA expression." If these were instead based on the whole-cell recordings as the authors now indicate in their response, then this description needs to be edited and clarified accordingly. Furthermore, the section states, "For ISK, IBK, Ileak, the OVX and OVX+E2 conductances are obtained from current-voltage relationships recorded from Kiss1ARH neurons in the absence/presence of iberiotoxin (BK blocker) and apamin (SK blocker). All other currents were assumed to be unaffected by E2." This section thus does not directly indicate that the recordings in the stated figures were used in the model, and moreover suggests that currents besides ISK, IBK, and Ileak were not different in OVX+E2 conditions.

      The prior evidence stated for correlation of mRNA and channel conductance is not explicitly cited in the manuscript. It is well known that post-translational modifications, physiological modulation of individual channel biophysical properties, and many other factors can influence the end output of a membrane conductance. Therefore, the authors should, at minimum, provide a literature citation supporting the assumption used here.

      We have re-written the paragraph on “Modelling the effects of E2” in the Supplemental Information (now Appendix 1) to clarify that the modeling was based on a combination of electrophysiological recordings and the qPCR data presented in this and previous publications. The statement that “all other currents were assumed to be unaffected by E2” was a misstatement and has been deleted. As per the reviewer’s request, we have listed seven publications that document the correlation between the mRNA expression and channel conductance for the various channels. We thank the reviewer for the suggestion.

      Reviewer #2 (Public review):

      Summary:

      Kisspeptin neurons of the arcuate nucleus (ARC) are thought to be responsible for the pulsatile GnRH secretory pattern and to mediate feedback regulation of GnRH secretion by estradiol (E2). Evidence in the literature, including the work of the authors, indicates that ARC kisspeptin neurons coordinate their activity through reciprocal synaptic interactions and the release of glutamate and of neuropeptide neurokinin B (NKB), which they co-express. The authors show here that E2 regulates the expression of genes encoding different voltage-dependent calcium channels, calcium-dependent potassium channels and canonical transient receptor potential (TRPC5) channels and of the corresponding ionic currents in ARC kisspeptin neurons. Using computer simulations of the electrical activity of ARC kisspeptin neurons, the authors also provide evidence of what these changes translate into in terms of these cells' firing patterns. The experiments reveal that E2 upregulates various voltage-gated calcium currents as well as 2 subtypes of calcium-dependent potassium currents, while decreasing TRPC5 expression (an ion channel downstream of NKB receptor activation), the slow excitatory synaptic potentials (slow EPSP) elicited in ARC kisspeptin neurons by NKB release and expression of the G protein-associated inward-rectifying potassium channel (GIRK). Based on these results, and on those of computer simulations, the authors propose that E2 promotes a functional transition of ARC kisspeptin neurons from neuropeptide-mediated sustained firing that supports coordinated activity for pulsatile GnRH secretion to a less intense burst-like firing pattern that could favor glutamate release from ARC kisspeptin neurons. The authors suggest that the latter might be important for the generation of the preovulatory surge in females.

      Strengths:

      The authors combined multiple approaches in vitro and in silico to gain insights into the impact of E2 on the electrical activity of ARC kisspeptin neurons. These include patch-clamp electrophysiology combined with selective optogenetic stimulation of ARC kisspeptin neurons, reverse transcriptase quantitative PCR, pharmacology and CRISPR-Cas9-mediated knockdown of the Trpc5 gene. The addition of computer simulations for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.

      The authors add interesting information on the complement of ionic currents in ARC kisspeptin neurons and on their regulation by E2 to what was already known in the literature. Pharmacological and electrophysiological experiments appear of the highest standards and robust statistical analyses are provided throughout. The impact of E2 replacement on calcium and potassium currents is compelling. Likewise, the results of Trpc5 gene knockdown do provide good evidence that the TRPC5 channel plays a key role in mediating the NKB-mediated slow EPSP. Surprisingly, this also revealed an unsuspected role for this channel in regulating the membrane potential and excitability of ARC kisspeptin neurons.

      Weaknesses:

      The manuscript also has weaknesses that obscure some of the conclusions drawn by the authors.

      One is that the authors compare here two conditions, OVX versus OVX replaced with high E2, that may not reflect the physiological conditions under which the proposed transition between neuropeptide-dependent sustained firing and less intense burst firing might take place (i.e. the diestrous [low E2] and proestrous [high E2] stages of the estrous cycle). This is an important caveat to keep in mind when interpreting the authors' findings. Indeed, that E2 alters certain ionic currents when added back to OVX females, does not mean that the magnitude of all of these ionic currents will vary during the estrous cycle.

      We do know that the slow EPSP, which is generated by TRPC5 channels, tracks beautifully with the steroid state of female mice. Using our E2 treatment paradigm that generates an LH surge in OVX females (left panel in Author response image 1), there is no difference in the amplitude of the slow EPSP in proestrous versus OVX + E2 females (right panel in Author response image 1).

      Author response image 1.

      In addition, although the computational modeling indicates a role of the various E2-modulated conductances in causing a transition in ARC kisspeptin neuron firing pattern, their role is not directly tested in physiological recordings, weakening the link between these changes and the shift in firing patterns.

      In future experiments we will test directly the physiological contribution of the other E2-modulated conductances in causing the transition in the firing pattern of arcuate Kiss1 neurons using CRISPR/SaCas9 technology as we have documented for the TRPC5 channel (e.g., Figures 11 and 12).

      Overall, the manuscript provides interesting information about the effects of E2 on specific ionic currents in ARC kisspeptin neurons and some insights into the functional impact of these changes. However, some of the conclusions of the work, with regard, in particular, to the role of these changes in ion channels and to their implications for the LH surge, are not fully supported by the findings.

      ---------

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, Qiu and colleagues examined the effects of preovulatory (i.e., proestrous or late follicular phase) levels of circulating estradiol on multiple calcium and potassium channel conductances in arcuate nucleus kisspeptin neurons. Although these cells are strongly linked to a role as the "GnRH pulse generator," the goal here was to examine the physiological properties of these cells in a hormonal milieu mimicking late proestrus, the time of the preovulatory GnRH-LH surge. Computational modeling is used to manipulate multiple conductances simultaneously and support a role for certain calcium channels in facilitating a switch in firing mode from tonic to bursting. CRISPR knockdown of the TRPC5 channel reduced overall excitability, but this was only examined in cells from ovariectomized mice without estradiol treatment. The manuscript has been substantially improved from the initial version by the addition of new experiments and clarification of important figures. Importantly, the overlap of data with previous reports from the same group has been corrected.

      Strengths:

      (1) Examination of multiple types of calcium and potassium currents, both through electrophysiology and molecular biology.

      (2) Focus on arcuate kisspeptin neurons during the surge is relatively conceptually novel as the anteroventral periventricular nucleus (AVPV) kisspeptin neurons have received much more attention as the "surge generator" population.

      (3) The modeling studies allow for direct examination of manipulation of single and multiple conductances, whereas the electrophysiology studies necessarily require examination of each current in isolation. Construction of an arcuate kisspeptin neuron model promises to be of value to the reproductive neuroendocrinology field.

      Weaknesses:

      A remaining weakness in this revised version of the manuscript is that the relevance of the CRISPR experiments is still rather tenuous given that the goal is to understand what happens in the estrogen-treatment condition, and these experiments were performed only in OVX mice. Similar concerns reflect that the computational model examining the effect of E2 infers multiple conductances based on qPCR data and an assumption that the conductances are directly proportional to the level of gene expression, and then tunes these to the current recordings obtained from OVX mice, without a direct confirmation in OVX+E2 conditions that the model parameters accurately reflect the properties of these currents in the presence of estrogen.

      We are still puzzled by the Reviewer’s concerns about doing the CRISPRing of Trpc5 in the OVX+E2 females. The Trpc5 channel expression is significantly reduced with the E2 treatment (Figure 10E), which we know translates into a minimal slow EPSP (Figure 2, Qiu eLife 2016) and is essentially equivalent to the slow EPSP amplitude in the Trpc5 mutagenesis in the ovariectomized females (Figure 12). TRPC5 channel conductance is already at “rock bottom.” The modeling informs us that such a low TRPC5 conductance will not support a long-lasting slow EPSP and sustained firing (Figure 13A).

      Also, we respectfully point out that we have published a score of papers over the past 20 years showing that the channel conductance does correlate with the mRNA expression (e.g., Qiu et al., eLife 2018). Secondly, the model does take into consideration the OVX + E2 conditions (Figure 13B,C), which are based on the extensive whole-cell recordings presented in Figures 4, 5, 6, 7, 8 and 9.

      Reviewer #2 (Public Review):

      Summary:

      Kisspeptin neurons of the arcuate nucleus (ARC) are thought to be responsible for the pulsatile GnRH secretory pattern and to mediate feedback regulation of GnRH secretion by estradiol (E2). Evidence in the literature, including the work of the authors, indicates that ARC kisspeptin neurons coordinate their activity through reciprocal synaptic interactions and the release of glutamate and of neuropeptide neurokinin B (NKB), which they co-express. The authors show here that E2 regulates the expression of genes encoding different voltage-dependent calcium channels, calcium-dependent potassium channels and canonical transient receptor potential (TRPC5) channels and of the corresponding ionic currents in ARC kisspeptin neurons. Using computer simulations of the electrical activity of ARC kisspeptin neurons, the authors also provide evidence of what these changes translate into in terms of these cells' firing patterns. The experiments reveal that E2 upregulates various voltage-gated calcium currents as well as 2 subtypes of calcium-dependent potassium currents while decreasing TRPC5 expression (an ion channel downstream of NKB receptor activation), the slow excitatory synaptic potentials (slow EPSP) elicited in ARC kisspeptin neurons by NKB release and expression of the G protein-associated inward-rectifying potassium channel (GIRK). Based on these results, and on those of computer simulations, the authors propose that E2 promotes a functional transition of ARC kisspeptin neurons from neuropeptide-mediated sustained firing that supports coordinated activity for pulsatile GnRH secretion to a less intense burst-like firing pattern that could favor glutamate release from ARC kisspeptin neurons. The authors suggest that the latter might be important for the generation of the preovulatory surge in females.

      Strengths:

      The authors combined multiple approaches in vitro and in silico to gain insights into the impact of E2 on the electrical activity of ARC kisspeptin neurons. These include patch-clamp electrophysiology combined with selective optogenetic stimulation of ARC kisspeptin neurons, reverse transcriptase quantitative PCR, pharmacology and CRISPR-Cas9-mediated knockdown of the Trpc5 gene. The addition of computer simulations for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.

      The authors add interesting information on the complement of ionic currents in ARC kisspeptin neurons and on their regulation by E2 to what was already known in the literature. Pharmacological and electrophysiological experiments appear of the highest standards and robust statistical analyses are provided throughout. The impact of E2 replacement on calcium and potassium currents is compelling. Likewise, the results of Trpc5 gene knockdown do provide good evidence that the TRPC5 channel plays a key role in mediating the NKB-mediated slow EPSP. Surprisingly, this also revealed an unsuspected role for this channel in regulating the membrane potential and excitability of ARC kisspeptin neurons.

      Weaknesses:

      The manuscript also has weaknesses that obscure some of the conclusions drawn by the authors.

      One is that the authors compare here two conditions, OVX versus OVX replaced with high E2, that may not reflect the physiological conditions under which the proposed transition between neuropeptide-dependent sustained firing and less intense burst firing might take place (i.e. the diestrous [low E2] and proestrous [high E2] stages of the estrous cycle). This is an important caveat to keep in mind when interpreting the authors' findings. Indeed, that E2 alters certain ionic currents when added back to OVX females, does not mean that the magnitude of all of these ionic currents will vary during the estrous cycle.

      Unfortunately, mice are a poor reproductive model since female mice do not have a clear follicular (estradiol-driven) phase distinct from the luteal (progesterone-driven) phase. Had we utilized a “proestrous” female, we could not with certainty distinguish between the effects of estradiol versus progesterone on the expression of the calcium and potassium channels that were the focus of this study. Therefore, using our physiological model we can state with confidence that “estradiol elicits distinct firing patterns in arcuate nucleus kisspeptin neurons….”

      Overall, the manuscript provides interesting information about the effects of E2 on specific ionic currents in ARC kisspeptin neurons and some insights into the functional impact of these changes. However, some of the conclusions of the work, with regard, in particular, to the role of these changes in ion channels and their implications for the LH surge, are not fully supported by the findings.

      As we pointed out in the Discussion, the O’Byrne lab has clearly shown the relevance of Kiss1ARH neuronal burst firing and the release of glutamate to its effects on the LH surge:

      “Rather, we postulate that glutamate neurotransmission is more important for excitation of Kiss1AVPV/PeN neurons and facilitating the GnRH (LH) surge with high circulating levels of E2 when peptide neurotransmitters are at a nadir and glutamate levels are high in female Kiss1ARH neurons. Indeed, low frequency (5 Hz) optogenetic stimulation of Kiss1ARH neurons, which only releases glutamate in E2-treated, ovariectomized females (Qiu J. et al., 2016), generates a surge-like increase in LH release during periods of optical stimulation (Lin et al., 2021; Voliotis et al., 2021).  In a subsequent study optical stimulation of Kiss1ARH neuron terminals in the AVPV at 20 Hz, a frequency commonly used for terminal stimulation in vivo, generated a similar surge of LH (Shen et al., 2022).  Additionally, intra-AVPV infusion of glutamate antagonists, AP5+CNQX, completely blocked the LH surge induced by Kiss1ARH terminal photostimulation in the AVPV (Shen et al., 2022).”

      Recommendations for the authors:

      Reviewer #2 (Recommendations for The Authors):

      The reviewer noted the following in the revised manuscript:

      - page 6, the authors may consider adding that presynaptic effects of blocking calcium channels on the slow EPSP cannot be fully ruled out. Indeed, the added experiments do indicate that some of the effects can be explained by impaired regulation of TRPC5 channels by calcium influx through calcium channels; however, the senktide-induced current is not fully blocked by the broad-spectrum calcium channel inhibitor cadmium, suggesting that the effect of blocking these channels on the slow EPSP may involve other mechanisms, such as presynaptic effects.

      Optogenetic stimulation of all Kiss1ARH neurons induces the release of NKB at “physiological” concentrations, which in turn generates a slow EPSP in the recorded Kiss1ARH neuron. Blocking voltage-gated calcium channels can inhibit the NKB release from presynaptic Kiss1ARH neurons, thereby reducing the amplitude of the slow EPSP. However, in whole-cell recordings of synaptically isolated Kiss1ARH neurons, senktide directly induces a large inward current (Figure 3F), which is generated by the opening of TRPC5 channels (Qiu et al., J. Neurosci 2021). Voltage-gated calcium channels are coupled to the activation of TRPC5 channels (Blair, Kaczmarek and Clapham, J. Gen Physiol 2009), so by blocking voltage-gated calcium channels, cadmium effectively abrogates the facilitating effects of these channels on TRPC5 channel activation and significantly reduces but does not abolish the inward (excitatory) current (Figures 3F-H). We have clarified in the Results (page 6) that the Kiss1ARH neurons were synaptically isolated as depicted in Figures 3F,G.

      - page 8, bottom, the mean value given for the apamin-sensitive current amplitude in E2 treated females does not match that plotted on the I/V graph in Figure 7F.

      Thank you for pointing out this typographical error, which we have corrected.

    1. eLife Assessment

      Abssy et al. carried out a study to test the effects of repetitive peripheral magnetic stimulation (rPMS) on pain perception in an experimental pain model and concluded that the analgesic properties of rPMS could be largely attributed to its auditory component rather than peripheral nerve stimulation per se. While the study presents valuable data on the modulation of pain perception in response to the stimulation paradigms that were tested, several issues in the experimental design and interpretation of results render the evidence incomplete to support their main claims, which should therefore be revised. In that case, these results could be of interest to pain clinicians and researchers.

    2. Reviewer #1 (Public review):

      Summary:

      This study from Abssy et al. aims to determine if different non-invasive peripheral stimulation techniques - such as magnetic and electrical stimulations - may influence pain intensity, unpleasantness, and secondary hyperalgesia using a 4-arm parallel-group study. They observed no effect on pain intensity and unpleasantness. Also, they reported that only the TENS (electrical stimulation) did not impact secondary hyperalgesia. They hypothesized that the effects were probably due to the sound emitted by RPMS (magnetic stimulation). In a follow-up study, they tried to determine if covering the sound of RPMS would abolish the effect on secondary hyperalgesia using a single-arm design. They observed no effect of RPMS.

      Strengths:

      (1) The research team recruited a relatively large sample size for this type of study.

      (2) The phasic heat pain protocol appears rigorous and well-described.

      (3) The Figures are helpful in facilitating the understanding of the study design and results.

      (4) The statistical analyses appear sound.

      Weaknesses:

      (1) The proposed design is not sufficient to answer the research question. The rationale of the study proposed in the introduction is that auditory stimulation may explain the analgesic effects of RPMS. To answer this question, the authors should have used a factorial design using 4 groups (active RPMS + sound; active RPMS + no sound; sham RPMS + sound; sham RPMS + no sound). Using this design, it would have been possible to determine if the sound, the afferent stimulation, or both are necessary to produce analgesia. Rather, they tested two types of RPMS (iTBS, cTBS) without a clear rationale, one electrical stimulation, and a placebo.

      (2) There are multiple ways that the current design could have introduced biases. The study was not randomized but pseudo-randomised. What does that mean? Was there allocation concealment? Were the assessor and data analyst blinded to group allocation? Was an intention-to-treat analysis performed? Were the participants adequately blinded (was this measured)?

      (3) The TENS parameters used were not optimal and are not those commonly used in clinical practice. This could have explained the lack of TENS effects. The lack of TENS effects has not been discussed and it is concerning. If TENS had been effective (as expected), the story about the auditory effects would not have been presented as the primary mechanisms underlying the current results.

      (4) No primary outcome has been identified. It is important to mention that the interpretation of results is based on the presence of only one statistically significant result. Pain intensity and pain unpleasantness are not affected. This was not properly addressed in the Discussion. What does it mean that secondary hyperalgesia is affected but not pain?

      (5) The use of secondary hyperalgesia as a variable requires further clarification. How is it possible to measure secondary hyperalgesia if there is no lesioned tissue? If heat creates secondary hyperalgesia without lesion, what does that mean physiologically? Is it a valid and reliable "pain" variable?

      (6) The follow-up study has been designed to cover the RPMS sound using pink noise. However, the pink noise was also present during the PHP measurement. How can we determine whether the absence of change is due to the pink noise during the RPMS or the presence of pink noise during PHP? I don't think this is possible to discriminate.

      Appraisal:

      (7) Despite all these potential issues, authors interpret their data with high confidence and with several overstatements in the Title, Abstract, and Discussion. The results do not support their conclusions. The fact that auditory stimulation may produce an analgesic effect is a hypothesis, but the current study cannot ascertain it.

    3. Reviewer #2 (Public review):

      Summary:

      In this article, Abssy, Osokin, Osborne, et al. aimed to demonstrate the effect of Peripheral Magnetic Stimulation (PMS) as a pain relief tool, studying its effects in an experimentally induced pain paradigm applied to healthy subjects. This is a relevant objective, as it will give a proxy indication of its utility as a clinical intervention to treat pain. Shockingly, in the first experiment, the authors found that this effect existed, not only in the active PMS groups but also in the sham PMS. With a clever second experiment, the authors used pink noise to mask the clicking sound of the PMS: this modification abolished the hypoalgesic effect of PMS.

      Strengths:

      This study presents an adequately calculated sample size (n = 100 for study 1 and n = 32 for study 2). This lends credibility to the results and allows for a proper disaggregated analysis to assess gender effects, something that is not often executed correctly. Nuisance variables are adequately addressed, figures and writing are clear, and I especially liked figures 4 and 5 for their ease of interpretation. They explore two different stimulation protocols for the PMS, extending their results beyond parametrization. Secondary hyperalgesia is a particularly relevant measurement, as it is a common symptom in many relevant painful conditions. Pseudorandomization and counterbalanced design are also appreciated, as well as reinforcement of the results through Bayesian statistical approaches. Regarding the scientific content, the main result (auditory modulation of pain in PMS) is exciting and very interesting by itself and will be relevant for the pain community, warranting further research, both from a fundamental and clinical perspective. Personally, I respect that they recognize that results did not match their a priori hypothesis, instead of committing HARKing. And it is a very thrilling mismatch for sure!

      It will be especially interesting for those among us dedicated to neural stimulation for pain treatment.

      Weaknesses:

      Although the study presents solid results, some specific concerns make me reluctant to accept the interpretations that the authors take from said results. I list the most important here.

      (1) My biggest concern in this paper is that the stimulation protocols are not applied after pain was induced in the subjects, but before. This is not bad in itself, but as the paper presents the stimulations as potential "treatments" it generates a severe mismatch between the objective, context (introduction), and impact (discussion) presented for the experiments, and how they are actually designed. This adds to the fact that healthy volunteers are used here to generate a study with low translational capability, that aims to be translational and provide an indication for clinics (maybe this is why the reduction in pain intensity caused by PMS when applied in patients, reported in references [29, 35 and 39], is not observed here).

      (2) TENS treatment duration is simply too short (90s) to be considered a therapeutic TENS intervention. I get that this duration was chosen to match the one of PMS, but TENS is never applied like this in the clinics, in which the duration varies from 10 minutes to an hour (or more). This specific study comparing different durations recommends 40 minutes for knee osteoarthritis pain relief (PMID: 12691335). Under these conditions, this stimulation is more similar to a sham TENS than to a real TENS treatment: I would suggest interpreting it as such. As the paper is right now, it could give the impression that PMS could produce clinical effects not observed in TENS, but while the PMS application resembles a clinical one, the TENS application does not (due to its extremely short duration). As an example, giving paracetamol at a dose 10 times below its effective dose is a placebo, not a paracetamol treatment.

      (3) This study measured pain, not central sensitization. Specifically, the effects refer to the area of secondary hyperalgesia. The IASP definition for central sensitization is "Increased responsiveness of nociceptive neurons in the central nervous system to their normal or subthreshold afferent input." (PMID: 32694387). No neuronal results are reported in this article. Therefore, central sensitization is not measured here, and we do not know if it is reduced by sound. This frontally clashes with the title of the article and with many interpretations of the results. For a deep review on this topic, I recommend PMID: 39278607 and the short article PMID: 30416715.

      (4) There is no mention of blinding/masking/concealing in this manuscript. Was the therapist blind to whether they applied one protocol, another, or a placebo? Were the evaluators blind, as this can heavily influence their measurements? And the volunteers? Was allocation concealed? Was this blinding measured afterwards? Blinding is, together with randomization, the most important methodological feature for those interventional studies. For example, not introducing blinding and concealing directly makes a study lose 4 out of 10 points in the PEDro scale, failing to fulfill criteria 3, 5, 6, and 7 (https://pedro.org.au/english/resources/pedro-scale/). Continuing with methodological considerations, the dropout percentage is high (18% for the first and 25% for the second study), both above the 15% cutoff for criterion 8 of the PEDro, losing another point. It is not mentioned whether the statistical analysis was intention-to-treat or per-protocol. Assuming the second, criterion 9 is failed too. Also, although between-group comparisons are done for study 1, they are not for study 2. Criterion 10 depends on this, so I would recommend doing it to avoid failing it. As it is right now, the study will be a 3/10 on the PEDro scale, being therefore considered "low-quality level evidence". As some of these criteria can be fulfilled in this study, I will recommend doing so to increase its quality level to medium (more in "recommendations for authors").

      (5) Data reporting and statistical treatment can be improved, as only differences are reported and regression to the mean is not accounted for in this study. Moreover, baseline levels for the dependent variables (control session) are not accessible for evaluation and they are not compared statistically, making it impossible to know if the groups were similar at baseline. This will imply failing criterion 3 of the PEDro, for a total of 2/10 points.

    4. Author response:

      Reviewer 1 (Public Review)

      (1) The proposed design is not sufficient to answer the research question. The rationale of the study proposed in the introduction is that auditory stimulation may explain the analgesic effects of RPMS. To answer this question, the authors should have used a factorial design using 4 groups (active RPMS + sound; active RPMS + no sound; sham RPMS + sound; sham RPMS + no sound). Using this design, it would have been possible to determine if the sound, the afferent stimulation, or both are necessary to produce analgesia. Rather, they tested two types of RPMS (iTBS, cTBS) without real rationale, one electrical stimulation and a placebo.

      We will clarify that the study design employed was originally designed to determine whether iTBS or cTBS would be more effective to reduce pain. We included TENS as a positive control, and sham as a negative control. We were indeed surprised by the findings, and present them herein. Future RCTs should be performed to reproduce these findings.

      (2) There are multiple ways that the current design could have introduced biases. The study was not randomized but pseudo-randomised. What does that mean? Was there allocation concealment? Were the assessor and data analyst blinded to group allocation? Was an intention-to-treat analysis performed? Were the participants adequately blinded (was it measured)?

      This study was not designed as an RCT, but rather as an experimental study. The study was pseudo-randomized to ensure that the groups had equal allocation and distribution of sexes.

      The groups were blinded to the other stimulations (they were not informed of the various arms of the study, through different consent forms).

      It was not possible to blind the experimenter, as the iTBS and cTBS protocols are very different: iTBS has multiple bursts separated by brief intervals, whereas cTBS is continuous. The data were masked for analysis, and only unblinded at the final stage. We will update the manuscript to reflect these changes.
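      As an illustration of what "pseudo-randomized to ensure equal allocation and distribution of sexes" could look like in practice, the sketch below shuffles participants within each sex and then deals them round-robin into the four arms named in this response. This is only an assumed procedure for illustration: the function name `allocate_balanced`, the seed, and the participant IDs are hypothetical, and the authors do not describe their actual allocation method here.

```python
import random
from collections import Counter

# The four arms named in the response (iTBS, cTBS, TENS as positive control, sham).
ARMS = ["iTBS", "cTBS", "TENS", "sham"]


def allocate_balanced(participants, arms=ARMS, seed=0):
    """Assign (participant_id, sex) tuples to arms, balanced within each sex.

    Hypothetical procedure: shuffle within each sex stratum, then deal
    participants to arms in rotation so every arm receives the same number
    of each sex (when stratum sizes are multiples of the number of arms).
    """
    rng = random.Random(seed)
    by_sex = {}
    for pid, sex in participants:
        by_sex.setdefault(sex, []).append(pid)

    allocation = {}
    for sex, ids in by_sex.items():
        rng.shuffle(ids)  # random order within the stratum
        for i, pid in enumerate(ids):
            allocation[pid] = arms[i % len(arms)]  # round-robin keeps arms equal
    return allocation


# Made-up cohort: 8 female and 8 male participants.
participants = [(f"P{i:02d}", "F" if i % 2 == 0 else "M") for i in range(16)]
allocation = allocate_balanced(participants)
counts = Counter((allocation[pid], sex) for pid, sex in participants)
```

      Because the order is random only within each stratum, the assignment is constrained rather than fully random, which is the usual sense of "pseudo-randomized" in this context.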

      (3) The TENS parameters used were not optimal and are not those commonly used in clinical practice. This could have explained the lack of TENS effects. The lack of TENS effects has not been discussed and it is concerning. If TENS had been effective (as expected), the story about the auditory effects would not have been presented as the primary mechanisms underlying the current results.

      We acknowledge that this is a limitation of the study. A future study should address this. However, we will not remove the arm for transparency.

      (4) No primary outcome has been identified. It is important to mention that the interpretation of results is based on the presence of only one statistically significant result. Pain intensity and pain unpleasantness are not affected. This was not properly addressed in the Discussion. What does that mean that secondary hyperalgesia is affected but not pain?

      We reiterate that this study was not designed as an RCT, but rather as an experimental study. The primary outcome measures were measures of pain sensitivity (pain intensity NRS, pain unpleasantness NRS, and secondary hyperalgesia). We will clarify this in the revised manuscript.

      We will now include discussion of the effects being solely on secondary hyperalgesia, and not on pain intensity and unpleasantness.

      (5a) The use of secondary hyperalgesia variable is concerning. How is it possible to measure secondary hyperalgesia if there is no lesioned tissue?

      Secondary hyperalgesia refers to hyperalgesia assessed in an area adjacent to or remote from the site of stimulation. In general, it is not required to lesion a tissue to activate the nociceptive system or to induce pain. We have cited other studies that have employed secondary hyperalgesia as a pain outcome measure without inducing a lesion.

      Hyperalgesia reflects increased pain on suprathreshold stimulation. To test for it, one measures the subjective response to a painful (i.e. suprathreshold) stimulation, then applies a conditioning stimulation (e.g. heat), and measures the subjective response to the same original stimulus. If the response after conditioning is higher than the baseline measure, hyperalgesia has been induced.
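      The before/after comparison described above can be written as a small check. This is a minimal sketch, assuming ratings on a numerical rating scale; the function name and the example ratings are hypothetical and are not data from the study.

```python
def hyperalgesia_change(baseline_rating, post_conditioning_rating):
    """Compare the response to the same test stimulus before and after conditioning.

    Returns the change in rating and whether hyperalgesia was induced,
    i.e. whether the post-conditioning response exceeds baseline.
    """
    change = post_conditioning_rating - baseline_rating
    return change, change > 0


# Made-up ratings for illustration only.
change, induced = hyperalgesia_change(baseline_rating=4.0, post_conditioning_rating=6.5)
```

      For secondary hyperalgesia, the same comparison would be made at a test site adjacent to or remote from the conditioned site.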

      (5b) If heat creates secondary hyperalgesia without lesion, what does that mean physiologically?

      Secondary hyperalgesia is normally interpreted as a perceptual correlate of central sensitization.

      (5c) Is it a valid and reliable "pain" variable?

      Yes and yes. A noxious heat stimulus can reliably elicit secondary hyperalgesia (see section 3.2 from Quesada et al. 2021). We also cite several studies that have used secondary hyperalgesia as an outcome measure of central sensitization in pain.

      (6) The follow-up study has been designed to cover the RPMS sound using pink noise. However, the pink noise was also present during the PHP measurement. How can we determine whether the absence of change is due to the pink noise during the RPMS or the presence of pink noise during PHP? I don't think this is possible to discriminate.

      We will add a third study that performs the control analysis with the sound of the rPMS masked, but no pink noise otherwise. The study will be performed in two groups: one with pink noise, and one without pink noise.

      Appraisal

      (7) Despite all these potential issues, authors interpret their data with high confidence and with several overstatements in the Title, Abstract, and Discussion. The results do not support their conclusions. The fact that auditory stimulation may produce an analgesic effect is a hypothesis, but the current study cannot ascertain it.

      We believe that the chief concern with the interpretation lies with concerns with the second study. The proposed third experiment will address these concerns.

      Reviewer 2 (Public Review):

      (1) My biggest concern in this paper is that the stimulation protocols are not applied after pain was induced in the subjects, but before. This is not bad in itself, but as the paper presents the stimulations as potential "treatments" it generates a severe mismatch between the objective, context (introduction), and impact (discussion) presented for the experiments, and how they are actually designed. This adds to the fact that healthy volunteers are used here to generate a study with low translational capability, that aims to be translational and provide an indication for clinics (maybe this is why the reduction in pain intensity caused by PMS when applied in patients, reported in references [29, 35 and 39], is not observed here).

      We will reframe these as prophylaxis, rather than treatment. This study was an experimental study originally designed to determine which stimulation parameters (cTBS or iTBS) would be better suited to modulate pain. We performed the study in healthy individuals undergoing acute pain, akin to a person undergoing a painful procedure, which could lead to central sensitization and pain persistence (e.g., post-surgical pain). However, before testing this in individuals undergoing actual procedures, it is essential to determine efficacy in people before translation.

      Khan et al [29] is a case study with neuropathic pain, whereas our study uses a nociceptive pain model. Lim et al [35] employed 10 sessions of rPMS stimulation in patients with acute low back pain. Similar to our study, the change in VAS driven by rPMS was no different than the sham stimulation. We notice that there is no reference 39, and will correct this.

      (2) TENS treatment duration is simply too short (90s) to be considered a therapeutic TENS intervention. I get that this duration was chosen to match the one of PMS, but TENS is never applied like this in the clinics, in which the duration varies from 10 minutes to an hour (or more). This specific study comparing different durations recommends 40 minutes for knee osteoarthritis pain relief (PMID: 12691335). Under these conditions, this stimulation is more similar to a sham TENS than to a real TENS treatment: I would suggest interpreting it as such. As the paper is right now, it could give the impression that PMS could produce clinical effects not observed in TENS, but while the PMS application resembles a clinical one, the TENS application does not (due to its extremely short duration). As an example, giving paracetamol at a dose 10 times below its effective dose is a placebo, not a paracetamol treatment.

      We acknowledge that this is a limitation, and will address this in the Discussion of the revised manuscript.

      (3) This study measured pain, not central sensitization. Specifically, the effects refer to the area of secondary hyperalgesia. The IASP definition for central sensitization is "Increased responsiveness of nociceptive neurons in the central nervous system to their normal or subthreshold afferent input." (PMID: 32694387). No neuronal results are reported in this article. Therefore, central sensitization is not measured here, and we do not know if it is reduced by sound. This frontally clashes with the title of the article and with many interpretations of the results. For a deep review on this topic, I recommend PMID: 39278607 and the short article PMID: 30416715.

      It is widely accepted that central sensitization is the neurophysiological basis of secondary hyperalgesia (see PMID: 11313449; PMID: 10581220).

      The reviewer is conflating secondary hyperalgesia due to central sensitization and chronic pain. Whether chronic pain is driven or maintained by central sensitization is not the goal of our study. However, there is ample evidence that nociceptive drive can induce plasticity in the CNS, which alters pain sensitivity, and that these changes facilitate pain.

      (4a) There is no mention of blinding/masking/concealing in this manuscript. Was the therapist blind to whether they applied one protocol, another, or a placebo? Were the evaluators blind, as this can heavily influence their measurements? And the volunteers? Was allocation concealed? Was this blinding measured afterwards? Blinding is, together with randomization, the most important methodological feature for those interventional studies. For example, not introducing blinding and concealing directly makes a study lose 4 out of 10 points in the PEDro scale, failing to fulfill criteria 3, 5, 6, and 7 (https://pedro.org.au/english/resources/pedro-scale/).

      This study was not designed as an RCT, but rather as an experimental study. The study was pseudo-randomized to ensure that the groups had equal allocation and distribution of sexes.

      The groups were blinded to the other stimulations (they were not informed of the various arms of the study, through different consent forms). However, blinding was not measured afterwards (again, this was not meant to be an RCT).

      It was not possible to blind the experimenter, as the iTBS and cTBS protocols are very different: iTBS has multiple bursts separated by brief intervals, whereas cTBS is continuous. The data were masked for analysis, and only unblinded at the final stage. We will update the manuscript to reflect these changes.

      (4b) Continuing with methodological considerations, the dropout percentage is high (18% for the first and 25% for the second study), both above the 15% cutoff for criterion 8 of the PEDro, losing another point.

      In the study, only 2 withdrew after feeling the heat, 2 were lost to follow-up, and 2 had incomplete data. That totals 6/123 in Study 1. In Study 2, none of the participants who met inclusion/exclusion criteria and who were ‘allocated’ to the study were excluded (0% dropout/data loss).

      We are unsure how to address this point, as we had clear inclusion/exclusion criteria, and these could only be measured after consenting. As this is an experimental study performed on healthy individuals in a university setting, we are not able to collect any study related data prior to consent.

      We openly reported individuals who did not meet the criteria, and thus were excluded. These criteria are a combination of what is required to collect good quality data, and what we are ethically permitted to do. We understand that in an interventional trial, >15% dropout due to intolerance or adverse events would indeed be concerning.
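      The arithmetic behind this reply can be checked directly. The sketch below uses only the counts stated above (2 withdrawals, 2 lost to follow-up, 2 with incomplete data, out of 123) and the 15% cutoff the reviewer cites for PEDro criterion 8; the function and variable names are illustrative.

```python
PEDRO_FOLLOW_UP_CUTOFF = 0.15  # cutoff cited by the reviewer for PEDro criterion 8


def loss_fraction(n_lost, n_enrolled):
    """Fraction of enrolled participants without usable outcome data."""
    return n_lost / n_enrolled


# Counts as stated in the author response for Study 1.
study1_lost = 2 + 2 + 2  # withdrew + lost to follow-up + incomplete data
study1_rate = loss_fraction(study1_lost, 123)
study1_exceeds_cutoff = study1_rate > PEDRO_FOLLOW_UP_CUTOFF

print(f"Study 1 loss: {study1_rate:.1%}")  # prints "Study 1 loss: 4.9%"
```

      The gap between this figure and the reviewer's 18% presumably comes down to whether participants excluded after consent for not meeting the inclusion criteria are counted as dropouts, which is the disagreement discussed above.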

      (5) Data reporting and statistical treatment can be improved, as only differences are reported and regression to the mean is not accounted for in this study. Moreover, baseline levels for the dependent variables (control session) are not accessible for evaluation and they are not compared statistically, making it impossible to know if the groups were similar at baseline. This will imply failing criterion 3 of the PEDro, for a total of 2/10 points.

      This only concerns study 1, as study 2 is a within-subject study design. Study 1 provides the raw data in Figure 4. We will provide the raw data for each of the primary outcome measures in a supplemental table in the revision.

    1. eLife Assessment

      The study provides valuable insight into the biological significance of SARS-CoV-2 by using a series of computational analyses of viral proteins. While the evidence is solid, the reviewers noted a lack of clarity about the objectives of the analyses. While impactful for the field, the manuscript would benefit from improved presentation.

    2. Reviewer #1 (Public Review):

      Summary:

      Park et al. conducted various analyses attempting to elucidate the biological significance of SARS-CoV-2 mutations. However, the study lacks a clear objective. The specific goals of the analyses in each subsection are unclear, as is how the results from these subsections are interconnected. Compiling results from unrelated analyses into a single paper can be confusing for readers. Clarifying the objective and narrowing down the topics would make the paper's purpose clearer.

      The logic of the study is also unclear. For instance, the authors developed an evaluation score, APESS, for analyzing viral sequences. Although they state that the APESS score correlates with viral infectivity, there is no explanation in the results section about why this is the case.

      In summary, I recommend reconsidering the structure of the paper.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have developed a machine learning tool AIVE to predict the infectivity of SARS-CoV-2 variants and also a scoring metric to measure infectivity. A large number of virus sequences were used with very detailed analysis that incorporates hydrophobic, hydrophilic, acid and alkaline characteristics. The protein structures were also considered to measure infectivity and search for core mutations. The study especially focused on the S protein of SARS-CoV-2. The contents of this study would be of interest to many researchers related to this area and the web-service would be helpful to easily analyze such data without in-depth bioinformatics expertise.

      Strengths:

      - Analysis on large scale data

      - Experimental validation on a partial set of searched mutations

      - A user-friendly web-based analysis platform that is made public

      Weaknesses:

      - Complexity of the research

      Comments on revisions:

      The authors have addressed all my comments and the manuscript is much more readable.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Park et al. conducted various analyses attempting to elucidate the biological significance of SARS-CoV-2 mutations. However, the study lacks a clear objective. The specific goals of the analyses in each subsection are unclear, as is how the results from these subsections are interconnected. Compiling results from unrelated analyses into a single paper can be confusing for readers. Clarifying the objective and narrowing down the topics would make the paper's purpose clearer.

      The logic of the study is also unclear. For instance, the authors developed an evaluation score, APESS, for analyzing viral sequences. Although they state that the APESS score correlates with viral infectivity, there is no explanation in the results section about why this is the case.

      The structure of the paper should be reconsidered.

      Thank you for your feedback. We have heeded the input that the study lacks a clear objective and made sure that the overall goal of the study is reflected in the Abstract, Results, and Discussion.

      We have clarified the specific goals of each subsection in the Results section and elaborated on how the components of our study connect to each other. We have addressed these in more detail in the ‘Recommendations for the authors’ section.

      Thank you for the feedback on APESS, our evaluation model. APESS was created based on properties of SARS-CoV-2 that we discovered in our study. When applying our evaluation model, high APESS scores indicated high infectivity. APESS is calculated from a comprehensive evaluation of SARS-CoV-2 at the nucleotide, amino acid, and protein structure levels.

      The explanations and exact calculations of APESS are given in the Materials and Methods section in line 571, but we should have been more detailed in the Results section as well. We have made sure to properly indicate this in the Results section in line 284.

      And overall, we have made edits to the manuscript that accurately explain our research by amending terms, restructuring arguments, and providing more clarity for the interconnectivity of the research.

      Reviewer #2 (Public review):

      Summary:

      The authors have developed a machine learning tool AIVE to predict the infectivity of SARS-CoV-2 variants and also a scoring metric to measure infectivity. A large number of virus sequences were used with a very detailed analysis that incorporates hydrophobic, hydrophilic, acid, and alkaline characteristics. The protein structures were also considered to measure infectivity and search for core mutations. The study especially focused on the S protein of SARS-CoV-2. The contents of this study would be of interest to many researchers related to this area and the web service would be helpful to easily analyze such data without in-depth bioinformatics expertise.

      Strengths:

      - Analysis of large-scale data.

      - Experimental validation on a partial set of searched mutations.

      - A user-friendly web-based analysis platform that is made public.

      Weaknesses:

      - Complexity of the research.

      Thank you for your kind feedback. Our study explored a wide range of topics including biochemical properties, machine learning, and viral infectivity.

      In presenting our research, we recognize that our comprehensive analysis may have slightly obscured the specific aims and overall objective of the study. We investigated properties in the viral sequences of SARS-CoV-2 and examined big data, clinical data, and expression data to elucidate their effect on viral infectivity. We then used evaluation modeling and in silico and in vitro validation.

      We have clarified the aims of our research and improved upon the flow of the manuscript by adding sentences that outline the goals of our research in the appropriate sub sections of the Results and Discussion sections.

      Reviewer #1 (Recommendations for the authors):

      The abstract should clearly state the backgrounds, objectives, strategies, and findings of this study in an orderly manner.

      Thank you for your feedback. We have restructured the Abstract to better reflect the goals and methods of our study. We start the Abstract by introducing the background of the study ‘An unprecedented amount of SARS-CoV-2 data has been accumulated compared with previous infectious diseases, enabling insights into its evolutionary process and more thorough analyses.’ in line 48. Then we more clearly stated the overall objectives of our research in line 50 as ‘This study investigates SARS-CoV-2 features as it evolves to evaluate its infectivity.’ Then, we clearly defined our specific discoveries in the virus, the purpose of our evaluation model, and how we validated our findings.

      In the Introduction, the message of each paragraph is unclear. Please clearly state the objectives of the study and what was done to achieve these objectives.

      Thank you for the feedback. We have updated the Introduction section to more clearly state the objectives of the study.

      To increase clarity, we have moved ‘Furthermore, hydrophobic properties in the amino acid sequence affect protein folding. Coronavirus hydrophobicity has significant effects on amino acid properties and protein folding.’ to line 127.

      In line 130, we rephrased the first sentence of the paragraph to ‘For these prior approaches to virus analysis and prediction, expertise with the relevant fields is required for a full understanding.’ to better establish the link between the background information and aims of the study. Then in line 134, we added ‘elucidate properties about the virus’ to clarify the aims of the study.

      In line 141, we have improved the clarity of the sentence to better present the scope and objectives of the study.

      The relationship between the sections in the Results is unclear. Clarify why each section is necessary and how they are interconnected.

      We investigated properties in the viral sequences of SARS-CoV-2 that highlighted amino acid substitutions or changes in polarity (Figure 1). In VOCs, we noted trends or absences of amino acid substitutions at specific positions (Figure 2). We examined epidemiological and clinical data to determine the infectivity, severity, and symptomaticity of lineages. Looking at expression data and binding affinity further illuminated the effect of amino acid substitutions (Figure 3). We created APESS, an evaluation model that is comprehensively calculated from the nucleotide, amino acid, and protein structure levels of the virus. Evaluation of lineages revealed that higher APESS scores were associated with higher infectivity (Figure 4). We used in silico and in vitro validation to reinforce our findings, then used machine learning to make predictions on future developments (Figure 5). We created candidate sequences for evaluation and utilized machine learning in predictions (Figure 6).

      We have added explanations to each section in Results that elucidate the objective of each section and how they connect with each other in the wider study.

      In line 157, we have added ‘We examined the amino acid sequences of SARS-CoV-2 to make discoveries about biochemical properties.’ to clearly outline the objective of the subsection.

      In line 207, we have improved the phrasing of the sentence.

      In line 278, we stressed that ‘We developed APESS, an evaluation model to analyze viral sequences based on the nucleotide, amino acid, and protein structure properties.’ to properly define the purpose and background of APESS.

      Please define abbreviations when they first appear.

      We have added the full terms for the stated abbreviations in the relevant sections of the manuscript.

      In line 107, we have added the proper abbreviation for Our World in Data (OWID).

      In lines 143, 175, and 489 we have added the full term for Variants of Concern (VOCs).

      In line 160, we have added the full term for Receptor Binding Motif (RBM).

      Reviewer #2 (Recommendations for the authors):

      (1) pg 9, line 51, full name of RBM should be declared.

      We have added the full name of Receptor Binding Motif (RBM) to the appropriate section in the Abstract.

      (2) How are the Variants of Concern (VOCs) defined?

      Thank you for the comment and we apologize for the confusion. Variants of Concern as defined by the World Health Organization are specified in the Materials and Methods section. We have also added the full name for Variants of Concern (VOCs) when they are first mentioned in the Introduction and Results sections.

      (3) pg 17, line 297. The purpose of using AI/ML to predict amino acid substitutions at specific locations is not clear. The VOCs and related mutation loci were already searched, so the AA substitution prediction step seems a little repetitive. Is it to create customized sequences? Also, if prediction (or probability) was made, some performance evaluation would be helpful.

      Thank you for this feedback. The purpose of utilizing machine learning to make predictions about amino acid substitutions is to assess the possibility of amino acid substitutions occurring at specific locations. These potential amino acid substitutions were evaluated by APESS to have high scores, linking them to high infectivity. As the feedback suggests, amino acid substitutions in VOCs are researched but our prediction sought to ascertain the likelihood of amino acid substitutions that our evaluation model associated with infectivity. In the Results section in line 330, we assessed the probability of amino acid substitutions N460K and Q493R that the study found to be significant. The datasets that we utilized for these predictions are detailed in the Materials and Methods section in line 677.

      The models we trained with machine learning predicted the probability of mutations based on samples in each group and their performance was evaluated by comparing the presence of mutations in the clades they diverged from. We have added the following sentences to line 330: “We used Accuracy, Precision, Recall, and F1 score to evaluate performance. All models showed high performance scores above 0.95 in Precision, Recall, and F1 score. For accuracy, XGBoost, scored above 0.89, exhibiting relatively high performance while LightGBM scored above 0.78.”
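      For readers less familiar with the four metrics named in the added sentences, the sketch below computes them from the counts of a binary confusion matrix. It is a generic illustration with made-up counts; the response does not state the class setup or averaging scheme behind the reported scores, so this should not be read as the authors' evaluation code.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # found among actual positives
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )  # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts: 90 true positives, 5 false positives,
# 10 false negatives, 95 true negatives.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```

      With several classes, these quantities are typically computed per class and then averaged, and the choice of averaging can move precision, recall, and F1 apart from overall accuracy.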

      (4) pg 17, line 289. The objective of creating candidate lineages is not clear and would be helpful for the readers if its purpose is elaborated on. Since there are enough SARS-CoV-2 sequences, wouldn't it be more realistic and accurate to use those real sequences instead of creating them? Furthermore, the candidate lineages should be defined but they were missing in this section. This part made it a little difficult to follow the overall paper's logic.

      The manuscript should have been clearer on what ‘candidate lineages’ signified; we apologize for the confusion. In line 314, we included the following sentences for clarity: ‘We introduced amino acid substitutions at specific locations in the SARS-CoV-2 backbone for the wildtype and VOCs. The amino acid substitutions were lysine (K), arginine (R), asparagine (N), serine (S), tyrosine (Y), and glycine (G). We then evaluated the infectivity of these candidate lineages with our evaluation model APESS.’

      The purpose of creating candidate lineages in our study was to assess the effect of specific amino acid substitutions on the virus’ infectivity. The amino acid substitutions we evaluated were lysine (K), arginine (R), asparagine (N), serine (S), tyrosine (Y), and glycine (G). We determined that examining the introduction of specific amino acid substitutions to SARS-CoV-2 sequences would highlight the significance they had on infectivity. We have revised the paragraph in line 314 of the Results section to convey what we were doing.

      (5) This study covers very detailed contents regarding lineages, mutations, and their effect on infectivity. It would be more readable if subsections could be added per group of investigation, especially in the results and discussion section.

      In the Results section, we have emphasized the objective of each subsection and how they connect with one another for the overall goals of our study.

      In line 157, we have added ‘We examined the amino acid sequences of SARS-CoV-2 to make discoveries about biochemical properties.’ to clearly outline the objective of the subsection.

      In line 207, we have improved the phrasing of the sentence.

      In line 278, we stressed that ‘We developed APESS, an evaluation model to analyze viral sequences based on the nucleotide, amino acid, and protein structure properties.’ to properly define the purpose and background of APESS.

      We have made edits to the Discussion section to more clearly indicate subsections.

      In line 389, we have added ‘In our investigation of various viruses’ to clearly indicate the background on other viruses.

      In line 409, we added the sentence ‘We made discoveries on specific amino acid substitutions at positions.’ to indicate the subsection talking about N437R, N460K, and D467 mutations.

      In line 471, we added the sentence ‘We created AIVE to feature our findings and analyses on an online platform.’ and modified the following sentence to better explain AIVE.

      (6) pg 26, line 557. The criteria for the SCPSi scores were set to 0.9 and 0.1 by the proportion of the Omicron and Delta variants. How do other criteria affect the performance of the method?

      Thank you for the question. We used 0.9/0.1 as the initial criteria in our SCPS calculation. To determine how this choice affected performance, we also tested 0.8/0.2 and 0.7/0.3 as the criteria.

      After calculating APESS with different SCPS weights (0.9/0.1, 0.8/0.2, 0.7/0.3), we used a Gaussian Mixture Model (GMM) to compare how the groups were divided based on APESS. All three groups with different SCPS weights were determined to accurately reflect data patterns when they had four components.

      When comparing parameter values, the group that used the original weights of 0.9 and 0.1 for SCPS showed the lowest values for variance and standard error across all four components. This indicates that each component was stable and clearly distinguishable from one another.

      The group where the weights were adjusted to 0.7 and 0.3 for SCPS showed significantly higher variance and a large error for the G2 component. The distribution of each component was more widespread, signifying that the stability and reliability were lower.

      The group where the weights were adjusted to 0.8 and 0.2 for SCPS was positioned between the two previous groups for finer data classification and reliability. However, the group notably lacked reliability when it came to the SE values for the G4 component.

      Thus, the original model with 0.9 and 0.1 weight is the most reliable.

      When the Gaussian Density for each group was plotted, the group with 0.9/0.1 SCPS weights showed the highest peak near 2 (G1), with a value of approximately 2. For the group with SCPS 0.8/0.2 weights, the highest peak appeared near 4.2 (G3), showing a high value around 14. For the group with SCPS 0.7/0.3 weights, the highest peak appeared near 3.7 (G3) showing a value around 5. The group with 0.9/0.1 SCPS weights exhibited a more uniform Gaussian distribution compared to the other two.
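
      The "superposition of Gaussian densities" compared here is the weighted sum of the component densities of a fitted mixture. A minimal Python sketch of that sum follows; the function name `mixture_density` and all parameter values are hypothetical placeholders, not the fitted values from the author response tables.

```python
from statistics import NormalDist


def mixture_density(x, weights, means, sds):
    """Density at x of a 1-D Gaussian mixture: sum_k w_k * N(x | mu_k, sd_k)."""
    if not (len(weights) == len(means) == len(sds)):
        raise ValueError("weights, means and sds must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * NormalDist(m, s).pdf(x)
               for w, m, s in zip(weights, means, sds))


# Hypothetical four-component mixture (G1..G4); all values are placeholders.
weights = [0.4, 0.3, 0.2, 0.1]
means = [2.0, 3.0, 4.2, 5.5]
sds = [0.2, 0.3, 0.25, 0.4]

# Evaluate the superposed density on a grid, e.g. for plotting.
grid = [i / 100 for i in range(0, 801)]
density = [mixture_density(x, weights, means, sds) for x in grid]
```

      The peak contributed by a single component is its weight divided by its standard deviation times the square root of 2π, so a narrow component can produce a density peak well above 1.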

      Author response image 1.

      Superposition of Gaussian Densities for SCPS weight 0.9/0.1

      Author response table 1.

      Statistical values of the Superposition of Gaussian Densities for SCPS weight 0.9/0.1

      Author response image 2.

      Superposition of Gaussian Densities for SCPS weight 0.8/0.2

      Author response table 2.

      Statistical values of the Superposition of Gaussian Densities for SCPS weight 0.8/0.2

      Author response image 3.

      Superposition of Gaussian Densities for SCPS weight 0.7/0.3

      Author response table 3.

      Statistical values of the Superposition of Gaussian Densities for SCPS weight 0.7/0.3

      (7) Overall, the approach is very detailed and realistic. Just curious if this approach would be also applicable to other viruses such as influenza.

      We appreciate the insightful comments from the reviewer, and this is a direction we hope to take our research in the future. Our study focused on SARS-CoV-2 and the properties we discovered from the virus’ spike protein interacting with the host’s ACE2 receptor. In our investigation of other coronaviruses such as MERS-CoV and SARS-CoV-1, we found that SARS-CoV-2 possesses a different structure and properties than these viruses, as we have illustrated in Supplementary Figure 24. We provided explanations about our investigation of other viruses in the Discussion section. In line 389, we have added ‘In our investigation of various viruses’ to better signpost this section.

    1. eLife Assessment

      The authors modified a common method to induce epilepsy in mice to provide an improved approach to screening new drugs for epilepsy. This is an important goal because of the need to develop drugs for patients who are refractory to current medications. The authors' method evokes seizures to circumvent a low rate of spontaneous seizures and the approach was validated using two common anti-seizure medications. The strength of evidence was solid in that some validation was provided, but incomplete because the method for quantification, definition of seizures, and some other aspects of the paper were not clear or absent.

    2. Reviewer #1 (Public review):

      Summary:

      This important study by Takano et al. describes a novel approach for optogenetically evoking seizures in an etiologically relevant mouse model of epilepsy. The authors developed a model that can trigger seizures "on demand" using optogenetic stimulation of CA1 principal cells in mice rendered epileptic by an intra-hippocampal kainate (IHK) injection into CA3. The authors discuss their model in the context of the limitations of current animal models used in epilepsy drug development. In particular, their model addresses concerns regarding existing models where testing typically involves inducing acute seizures in healthy animals or waiting on infrequent, spontaneous seizures in epileptic animals.

      Strengths:

      A strength of this manuscript is that this approach may facilitate the evaluation of novel therapeutics since these evoked seizures are demonstrated as being sufficiently similar to spontaneous seizures in these same mice which are more laborious to analyze. The data demonstrating the commonality of pharmacology and EEG features between evoked seizures and spontaneous seizures in epileptic mice, while also being different from evoked seizures in naïve mice, are convincing despite concerns regarding the biological significance of the differences in effect sizes of these features. The structural, functional, and behavioral differences between a seizure-naïve and epileptic mouse are complex and important issues. This study positively impacts the wider epilepsy research community by investigating seizure semiology and pharmaceutical responses in these populations.

      Weaknesses:

      While the data generally supports the authors' conclusions, a weakness of this manuscript lies in their analytical approach where EEG feature-space comparisons used the number of spontaneous or evoked seizures as their replicates as opposed to the number of IHK mice; these large data sets tend to identify relatively small effects of uncertain biological significance as being highly statistically significant. Furthermore, the clinical relevance of similarly small differences in EEG feature space measurements between seizure-naïve and epileptic mice is also uncertain. Finally, the multiple surgeries and long timetable to generate these mice may limit the value compared to existing models in drug-testing paradigms.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have attempted to modify and adapt the IH-KA model in mice to provide an improved approach to screening for new ASDs by partially mitigating the problem of randomly occurring seizures and relatively low seizure frequency in the IH-KA model. The authors used KA micro-injections to selectively kill the hippocampal CA3 area as a way to induce temporal lobe "epileptogenesis" (TLE), and then used optogenetics to activate CA1 pyramidal cells specifically. This approach allowed the authors to trigger generalized seizures where the tonic-clonic pattern of electrical activity was reminiscent of actual tonic-clonic behavioral convulsions. Administration of levetiracetam (LEV) and diazepam (DZP), two widely used ASDs with different mechanisms, reduced the probability of optogenetically activated epileptic seizures in IH-KA mice, thus seeming to provide evidence for a new approach to screen ASDs. A variety of problems and issues with the approach and the results lead to confounds that raise serious concerns about the conclusions.

      Major strengths and weaknesses of the Methods and Results:

      Strengths:

      The authors have designed a method for triggering seizures, and the figures show bona fide electrographic seizures with concomitant convulsive behavioral components. The optogenetically evoked seizures in IH-KA mice had the electrical properties of actual seizures and the tonic-clonic components were readily apparent. These seizures appeared different from seizures evoked in naïve mice, and the authors attribute this difference to the epileptogenic process, but this may not be correct.

      The ASDs (i.e., LEV and DZP) reduced the success rate of the optogenetically evoked seizures in IH-KA mice, thus suggesting the potential usefulness of the model for testing ASDs. The paper discusses whether the Epilepsy Therapy Screening Program (ETSP) will be able to use this modification of the IH-KA model in place of (1) ASD screening with acute seizures in naïve animals, where the brain has not undergone "epileptogenesis", (2) testing ASDs on hippocampal paroxysmal discharges (HPDs) in the IH-KA model, which has undergone epileptogenesis, or (3) spontaneous epileptic seizures in animal models of TLE based on systemic treatments that lead to acute convulsive status epilepticus that have later undergone epileptogenesis. This proposed version of the IH-KA model aims to address the former problem (#1, above) by using a mouse model of TLE, and to address the latter problems (#2 and #3, above) of the seemingly random occurrence of epileptic seizures and the low seizure frequency by using optogenetically "triggered" seizures.

      Weaknesses

      Although the figures provide excellent examples of individual electrographic seizures and compare induced seizures in epileptic and naïve animals, it is unclear which criteria were used to identify an actual seizure induced by the optogenetic stimulus, versus a hippocampal paroxysmal discharge (HPD), an "afterdischarge", an "electrophysiological epileptiform event" (EEE, Ref #36, D'Ambrosio et al., 2010 Epilepsy Currents), or a so-called "spike-wave-discharge" (SWD). Were HPDs or these other non-seizure events ever induced using stimulation in animals with IH-KA? A critical issue is that these other electrical events are not actual seizures, and it is unclear whether they were included in the column showing data on "electrographic afterdischarges" in Figure 5 for the studies on ASDs. This seems to be a problem in other areas of the paper, also.

      The differences between the optogenetically evoked seizures in IH-KA vs naïve mice are interpreted to be due to the "epileptogenesis" that had occurred, but the lesion from the KA-induced injury would be expected to cause differences in the electrically and behaviorally recorded seizures - even if epileptogenesis had not occurred. This is not adequately addressed.

      The authors did not test whether an apparent "kindling" effect, apparently seen in naïve controls, also occurred in animals micro-injected with kainic acid (KA). This effect could cause model instability that might result in variability in response to ASDs. It is not clear whether the number of optogenetically induced seizures in epileptic animals would affect the response to drugs. It is also unclear how much of an improvement the animal model in the present work is over other similar models of TLE, where electrically triggered seizures could simply be applied to one of them.

      The authors offer little mention of other research using animal models of TLE to screen ASDs, of which there are many published studies - many of them with other strengths and/or weaknesses. For example, although Grabenstatter and Dudek (2019, Epilepsia) used a version of the systemic KA model to obtain dose-response data on the effects of carbamazepine on spontaneous seizures, that work required use of KA-treated rats selected to have very high rates of spontaneous seizures, which requires careful and tedious selection of animals. The ETSP has published studies with an intra-amygdala kainic acid (IA-KA) model (West et al., 2022, Exp Neurol), where the authors claim that they can use spontaneous seizures to identify ASDs for DRE; however, their lack of a drug effect of carbamazepine may have been a false negative secondary to low seizure rates. The approach described in this paper may help with confounds caused by low or variable seizure rates. These types of issues should be discussed, along with others.

      While the paper may be relevant for the ETSP and contract research organizations (CROs), the paper was not written to attract the interest of biological scientists, even those in this specific area of epilepsy research. It may be of low interest to other neuroscientists.

      The outcome measure for testing LEV and DZP on seizures was essentially the fraction of unsuccessful or successful activations of seizures, where high ASD efficacy is based on showing that the optogenetic stimulation causes fewer seizures when the drug is present. The final outcome measure is thus a percentage, which would still lead to a large number of tests to be assured of adequate statistical power. Thus, there is a concern about whether this proposed approach will have high enough resolution to be more useful than conventional screening methods so that one can obtain actual dose-response data on ASDs.
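
      To make the concern about statistical power concrete, the number of stimulation trials needed to distinguish two success fractions can be approximated with the standard normal-approximation formula for comparing two independent proportions. The Python sketch below is illustrative only: the function name `trials_per_group` and the example proportions are hypothetical, and the formula assumes independent trials, which repeated stimulations within the same animal may not satisfy.

```python
from math import ceil, sqrt
from statistics import NormalDist


def trials_per_group(p_control, p_drug, alpha=0.05, power=0.80):
    """Approximate trials per group for a two-sided test of two proportions."""
    if p_control == p_drug:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    p_bar = (p_control + p_drug) / 2               # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p_control * (1 - p_control) + p_drug * (1 - p_drug))
    ) ** 2
    return ceil(numerator / (p_control - p_drug) ** 2)


# Hypothetical success fractions: 90% of stimulations evoke a seizure with
# vehicle, versus 50% (large drug effect) or 80% (small drug effect).
n_large_effect = trials_per_group(0.9, 0.5)
n_small_effect = trials_per_group(0.9, 0.8)
```

      Under these assumptions, a smaller difference in success fraction requires many more trials per condition, which is the resolution problem that matters for dose-response work.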

      The key issue the authors aim to address is the 30-40% of patients with DRE, but the real problem with DRE patients is not that these people have seizures with no effect of the ASDs; rather, although ASD may reduce seizure burden, these patients continue to have some remaining seizures even after high doses of ASDs, which often leads to adverse effects from the particular ASDs.

      In several sections of the paper, the authors argue that two different groups are similar on the basis that no statistical difference was found between the two groups (i.e., p > 0.05); however, the failure to find a statistically significant difference, particularly with relatively small sample sizes, is not rigorous evidence that the two groups are actually similar - they are just "not significantly different."
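
      Similarity is more rigorously supported by an equivalence test, such as two one-sided tests (TOST), than by a non-significant difference. The Python sketch below illustrates the idea under clearly stated assumptions: the function name `tost_equivalence`, the equivalence margin, and the data are hypothetical, and a large-sample normal approximation stands in for the t-distribution that small samples would require.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(a, b, margin, alpha=0.05):
    """Two one-sided tests for |mean(a) - mean(b)| < margin.

    Uses a normal approximation (an assumption of this sketch); returns the
    TOST p-value and whether equivalence is declared at level alpha.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = NormalDist()
    p_lower = 1 - z.cdf((diff + margin) / se)  # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)      # H0: diff >= +margin
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical measurements for two groups with nearly identical means.
group_a = [10.0, 10.2, 9.8, 10.1, 9.9] * 4
group_b = [10.1, 9.9, 10.0, 10.2, 9.8] * 4

p_wide, equivalent_wide = tost_equivalence(group_a, group_b, margin=0.5)
p_tight, equivalent_tight = tost_equivalence(group_a, group_b, margin=0.01)
```

      With the same data, equivalence is supported for the wide margin but not for the tight one, which shows that a claim of similarity depends on stating in advance how large a difference would matter.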

      It remains unclear that the optogenetically induced seizures in this model are better than similarly induced seizures in a naïve animal, and there is no evidence that the model will be useful for finding new ASDs to treat DRE.

      Do the results support the conclusions?

      Although the Results show examples of clear tonic-clonic seizures, it is not at all clear whether this approach is a significant improvement over previous methods used on animal models of TLE. The presented data from this method show that it provides an ability to detect the effect of widely used ASDs, but not that it will have the resolution to find better ASDs. The outcome measure of successful vs failed seizure inductions does not necessarily translate to a pathway for finding new ASDs for DRE, which often is a function of the side effects of the proposed new ASD. Although the recorded seizures in IH-KA mice differ in waveform from the ones in naïve mice, this could be due to the pattern of damage resulting from the micro-injection of KA or the density of expressed ChR2, which could be affected by sclerosis.

      Impact and utility of methods and data.

      The authors state that this approach should be used to test for and discover new ASDs for DRE, and also used for various open/closed loop protocols with deep-brain stimulation; however, the paper does not actually discuss rigorously or critically the background literature on other published studies in these areas or how this approach will improve future research for a broader audience than the ETSP and CROs. Thus, it is not clear whether the utility will apply more widely and how extensive a readership will be attracted to this work.

      Final Conclusions:

      Although this is an interesting if not elegant new model for testing ASDs, it could be seen as a version of kindling (plus brain damage) in a rodent model, where some of the pathology of TLE is induced through focal injection of KA in the CA3 area of the hippocampus. Unfortunately, no evidence was presented that it will be any better than other models, although it could be faster and maybe easier than models based on spontaneous seizures. Although it has some similarities to the pathology of human TLE, ablating part of the hippocampus does not account for the more widespread pathology that usually occurs elsewhere in the brain, as studied with imaging and with anatomy in surgical specimens from patients with DRE.

      Although this approach with seizure induction via an optogenetic approach adds specificity to the type of cell that is stimulated (i.e., CA1 pyramidal cells), it is not apparent why this provides a better or more effective tool than simple electrical induction of seizures in any TLE model. Most important, it remains unclear how this addresses any aspect of drug resistance. To improve the ASD discovery process, an important new model must make a significant reduction in seizure burden, and would ideally improve the percentage of patients that become seizure-free. It is not clear how this model will do that.

      In the end, the authors have created a model with some of the pathology of TLE, where they can then induce actual seizures via specific optogenetic stimulation. So, although it is potentially elegant work, it remains unclear what new information this model will tell us about epilepsy, and most importantly DRE - or how it will improve treatment outcomes.

    4. Reviewer #3 (Public review):

      Summary:

      Chen et al. develop and characterize a new approach for screening drugs for epilepsy. The idea is to increase the ability to study seizures in animals with epilepsy because most animal models have rare seizures. Thus, the authors use the existing intrahippocampal kainic acid (IHKA) mouse model, which can have very unpredictable seizures with long periods of time between seizures. The authors employ an additional method to trigger seizures in the IHKA model. This method is closed-loop optogenetic stimulation of area CA1. There are several assumptions: area CA1 is the best location, triggered seizures are the same as spontaneous seizures, and this method will be useful despite requiring a great deal of effort. Regarding the latter, using a mouse model with numerous seizures (such as the pilocarpine model) might be more efficient than using a modified IHKA protocol that requires viral injection for optogenetics, fiber insertion requiring additional surgery, and accurate targeting to reliably trigger seizures on-demand. Aside from these caveats, the authors do succeed in studying seizures more readily in a mouse model of rare seizures. However, the seizures are evoked, not spontaneous. As currently presented, it is not clear how the triggered seizures can be used to investigate if antiseizure medication can reduce seizure burden as measured by seizure severity and seizures per day.

      The authors modified the IHKA model to inject KA into CA3 instead of CA1 in order to preserve the CA1 pyramidal cells that they will later stimulate. To express the excitatory opsin channelrhodopsin (ChR2) in area CA1, they use a virus that expresses ChR2 in cells that express the Thy-1 promoter. The authors demonstrate that CA3 delivery of KA can induce a very similar chronic epilepsy phenotype to the injection of KA in CA1 and show that optical excitation of CA1 can reliably induce seizures. These are the strengths of the study.

      While the authors show that electrophysiological signatures of induced vs spontaneous seizures are similar in many ways, the authors also show several differences and it is not clear if these differences are meaningful. Notably, the induced seizures are robustly inhibited by the antiseizure medication levetiracetam and variably but significantly inhibited by diazepam, similar to many mouse models with chronic recurrent seizure activity. I agree with the authors that this modified IHKA model will be of most value for higher throughput screening of potential antiseizure therapies, but with the caveat that the data may not generalize to other epilepsy models or humans. The authors evaluate the impact of repeated stimulation on the reliability of seizure induction and show that seizures can be reliably induced by CA1 stimulation for as long as 16 days, but the utility of the model would be better demonstrated if seizures could be shown to be inducible over the range of weeks to months.

      Strengths:

      (1) The authors show that the IHKA model of chronic epilepsy can be modified to preserve CA1 pyramidal cells (but at a cost of CA3 cells), allowing on-demand optogenetic stimulation of CA1 that appears to lower seizure threshold and thus trigger a seizure event.

      (2) The authors show that repeated reactivation of CA1 even in untreated mice can promote kindling and induction of seizure activity, indeed generating two mouse models in total.

      (3) Many electrophysiological signatures are similar between the induced and spontaneous seizures, and induced seizures reliably respond to treatment with antiseizure medications.

      (4) Given that more seizures can be observed per mouse using on-demand optogenetics, this model enhances the utility of each individual mouse.

      Weaknesses:

      (1) Evaluation of seizure similarity using the SVM modeling and clustering is not sufficiently explained to show if there are meaningful differences between induced and spontaneous seizures. SVM modeling did not include analysis to assess the overfitting of each classifier since mice were modeled individually for classification.

      (2) The difference between seizures and epileptiform discharges or trains of spikes (which are not seizures) is not made clear.

      (3) The utility of increasing the number of seizures for enhancing statistical power is limited unless the sample size under evaluation is the number of seizures. However, the standard practice is for the sample size to be the number of mice.

      (4) Seizure burden is not easily tested.

      (5) It is unlikely that long-term adaptation to CA1-stimulated seizure induction is absent in these mice. A duration of evaluation longer than 16 days is warranted in light of the downward slope at days 13-16 for induced seizures in Figure 4C.

      (6) Human epilepsy is extensively heterogeneous in both etiology and individual phenotype, and it may be hard to generalize the approach.

      (7) No mention or assessment of mouse sex as a biological variable.

    5. Author response:

      In this initial response to the public review, we outline our plan to address the major concerns raised. Below, we provide a general categorization of the suggestions and our corresponding responses.

      Weakness #1: Statistical Concerns - using the number of seizures (rather than the number of animals) may identify small effects that could be insignificant. Effect size should be taken into consideration.

      Reviewer 1:

      “While the data generally supports the authors' conclusions, a weakness of this manuscript lies in their analytical approach where EEG feature-space comparisons used the number of spontaneous or evoked seizures as their replicates as opposed to the number of IHK mice; these large data sets tend to identify relatively small effects of uncertain biological significance as being highly statistically significant.”

      Reviewer 2:

      “In several sections of the paper, the authors argue that two different groups are similar on the basis that no statistical difference was found between the two groups (i.e., p > 0.05); however, the failure to find a statistically significant difference, particularly with relatively small sample sizes, is not rigorous evidence that the two groups are actually similar - they are just "not significantly different.”

      Reviewer 3:

      “(3) The utility of increasing the number of seizures for enhancing statistical power is limited unless the sample size under evaluation is the number of seizures. However, the standard practice is for the sample size to be the number of mice.”

      Reviewer 3:

      “(1) Evaluation of seizure similarity using the SVM modeling and clustering is not sufficiently explained to show if there are meaningful differences between induced and spontaneous seizures. SVM modeling did not include analysis to assess the overfitting of each classifier since mice were modeled individually for classification.”

      We understand the reviewers’ concerns. In this work, we used a linear mixed-effects model to address two levels of variability: between animals and within animals. The interactive linear mixed-effects model shows that most (~90%) of the variability in our data comes from within animals (Residual), the random effect that the model accounts for, rather than from between animals. Since variability between animals is low, the model identifies common changes in seizure propagation across animals, while accounting for the variability in seizures within each animal. Therefore, the results we find reflect changes that happen across animals, not changes in individual seizures. We will make text edits to enhance understanding of the linear mixed-effects model.
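
      The split between within-animal and between-animal variability can be illustrated with a simple one-way random-effects decomposition. The Python sketch below is not the linear mixed-effects model described above; it is a simpler, balanced-design ANOVA estimator swapped in for illustration, and the function name `variance_components` and the numbers are hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Between/within variance estimates for a balanced one-way layout.

    groups: one list of measurements (e.g. one per seizure) for each animal,
    all of equal length. Returns (var_between, var_within, within_fraction).
    """
    k = len(groups)
    n = len(groups[0]) if groups else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 animals, each with the same number "
                         "(>= 2) of measurements")
    grand = mean([x for g in groups for x in g])
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2
                    for g in groups for x in g) / (k * (n - 1))
    # Method-of-moments estimate, truncated at zero.
    var_between = max(0.0, (ms_between - ms_within) / n)
    total = var_between + ms_within
    within_fraction = ms_within / total if total else 0.0
    return var_between, ms_within, within_fraction


# Hypothetical feature values: three animals, three seizures each.
vb, vw, frac_within = variance_components([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

      In this toy example 60% of the total variance is within animals; the same calculation applied to real feature values would indicate how much of the spread is attributable to seizure-to-seizure differences rather than animal-to-animal differences.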

      To address the point raised about similarity, we will explain how the SVM classifier was trained. The purpose of the SVM is not to identify meaningful differences between induced and spontaneous seizures. Rather, it is to classify EEG sections as “seizures” or non-seizures, demonstrating the gross similarity between induced and spontaneous seizures despite minor differences. We will make text clarifications for the SVM model.

      Weakness #2: Clinical and biological significance is unclear.

      Reviewer 1:

      “Furthermore, the clinical relevance of similarly small differences in EEG feature space measurements between seizure-naïve and epileptic mice is also uncertain.”

      Reviewer 2:

      “While the paper may be relevant for the ETSP and contract research organizations (CROs), the paper was not written to attract the interest of biological scientists, even those in this specific area of epilepsy research. It may be of low interest to other neuroscientists… The key issue the authors aim to address is the 30-40% of patients with DRE, but the real problem with DRE patients is not that these people have seizures with no effect of the ASDs; rather, although ASD may reduce seizure burden, these patients continue to have some remaining seizures even after high doses of ASDs, which often leads to adverse effects from the particular ASDs… It remains unclear that the optogenetically induced seizures in this model are better than similarly induced seizures in a naïve animal, and there is no evidence that the model will be useful for finding new ASDs to treat DRE.”

      Reviewer 3:

      “(6) Human epilepsy is extensively heterogeneous in both etiology and individual phenotype, and it may be hard to generalize the approach.”

      Reviewer 2:

      “The authors state that this approach should be used to test for and discover new ASDs for DRE, and also used for various open/closed loop protocols with deep-brain stimulation; however, the paper does not actually discuss rigorously or critically the background literature on other published studies in these areas or how this approach will improve future research for a broader audience than the ETSP and CROs. Thus, it is not clear whether the utility will apply more widely and how extensive a readership will be attracted to this work.”

      We appreciate the reviewer’s concerns. We will revise the manuscript to better emphasize the potential significance of our approach. The on-demand seizure model can be applied to address biologically and clinically relevant questions beyond its utility in drug screening. For example, crossing the Thy1-ChR2 mouse line with genetic epilepsy models, such as Scn1a mutants, could reveal how optogenetic stimulation differentially induces seizures in mutant versus non-mutant mice, providing insights into seizure generation and propagation in Dravet Syndrome. Due to the cellular specificity of optogenetics, we also envision this approach being used to study circuit-specific mechanisms of seizure generation and propagation. Regarding drug-resistant epilepsy (DRE) and anti-seizure drug (ASD) screening, we agree with the reviewer that probing new classes of ASDs for DRE represents the critical goal. However, we believe a full exploration of additional ASD classes and/or modeling DRE lies outside the scope of this manuscript.

      Weakness #3: Definition of Seizure is unclear

      Reviewer 2:

      “Although the figures provide excellent examples of individual electrographic seizures and compare induced seizures in epileptic and naïve animals, it is unclear which criteria were used to identify an actual seizure induced by the optogenetic stimulus, versus a hippocampal paroxysmal discharge (HPD), an "afterdischarge", an "electrophysiological epileptiform event" (EEE, Ref #36, D'Ambrosio et al., 2010 Epilepsy Currents), or a so-called "spike-wave-discharge" (SWD). Were HPDs or these other non-seizure events ever induced using stimulation in animals with IH-KA? A critical issue is that these other electrical events are not actual seizures, and it is unclear whether they were included in the column showing data on "electrographic afterdischarges" in Figure 5 for the studies on ASDs.”

      Reviewer 3:

      “(2) The difference between seizures and epileptiform discharges or trains of spikes (which are not seizures) is not made clear.”

      Reviewer 2:

      “The differences between the optogenetically evoked seizures in IH-KA vs naïve mice are interpreted to be due to the "epileptogenesis" that had occurred, but the lesion from the KA-induced injury would be expected to cause differences in the electrically and behaviorally recorded seizures - even if epileptogenesis had not occurred. This is not adequately addressed.”

      Thank you for pointing out the unclear definition of the seizures analyzed. We agree and will revise the text to clarify this issue. In this manuscript, we focused on tonic-clonic seizures. We analyzed animal behavior during evoked events, and a high percentage of induced electrographic events were accompanied by behavioral seizures with a Racine scale of three or above. Regarding epileptogenesis, our model is based on the IHK model, in which spontaneous tonic-clonic seizures occur a few to several days after KA injection. These mice are, by definition, epileptogenic. We will further clarify this methodology in the text.

      Weakness #4: Similarity/Difference with Kindling Not Clear

      Reviewer 2:

      “The authors did not test whether an apparent "kindling" effect, apparently seen in naïve controls, also occurred in animals micro-injected with kainic acid (KA). This effect could cause model instability that might result in variability in response to ASDs. It is not clear whether the number of optogenetically induced seizures in epileptic animals would affect the response to drugs. It is also unclear how much of an improvement the animal model in the present work is over other similar models of TLE, where electrically triggered seizures could simply be applied to one of them.”

      Reviewer 3:

      “(5) It is unlikely that long-term adaptation to CA1-stimulated seizure induction is absent in these mice. A duration of evaluation longer than 16 days is warranted in light of the downward slope at days 13-16 for induced seizures in Figure 4C.”

      We appreciate the reviewer’s comments regarding the “kindling effect” as well as its similarity to the kindling model. We will carefully assess the data and address this in the revised manuscript. In electrical kindling, the activated cellular population is non-specific, including both excitatory and inhibitory neurons. In our model, we specifically activate predominantly excitatory neurons (Thy1-positive neurons), which we observed to participate in convulsant-induced seizures (as demonstrated in Thy1-GCaMP experiments). We consider this specificity an improvement over the kindling model, making our approach more biologically relevant.

      Weakness #5: Time needed to generate model is significant. Unclear if animals were pre-selected

      Reviewer 1:

      “Finally, the multiple surgeries and long timetable to generate these mice may limit the value compared to existing models in drug-testing paradigms.”

      Reviewer 2:

      “The authors offer little mention of other research using animal models of TLE to screen ASDs, of which there are many published studies - many of them with other strengths and/or weaknesses. For example, although Grabenstatter and Dudek (2019, Epilepsia) used a version of the systemic KA model to obtain dose-response data on the effects of carbamazepine on spontaneous seizures, that work required use of KA-treated rats selected to have very high rates of spontaneous seizures, which requires careful and tedious selection of animals. The ETSP has published studies with an intra-amygdala kainic acid (IA-KA) model (West et al., 2022, Exp Neurol), where the authors claim that they can use spontaneous seizures to identify ASDs for DRE; however, their lack of a drug effect of carbamazepine may have been a false negative secondary to low seizure rates. The approach described in this paper may help with confounds caused by low or variable seizure rates. These types of issues should be discussed, along with others.”

      We appreciate the reviewer’s insights. In an existing model investigating spontaneous tonic-clonic seizures (such as the intra-amygdala kainate injection model), the time investment is back-loaded, requiring two to three weeks per condition while counting spontaneous seizures, which may occur only once a day. In contrast, our model requires a front-loaded time investment. Once the animals are set up, we can test multiple drugs within a few weeks, providing significant time savings. Additionally, we did not pre-screen animals in our study. Existing models often pre-select mice with high rates of spontaneous seizures, whereas in our model, seizures can be induced even in animals with few spontaneous seizures. We believe that bypassing the need for pre-screening is a key advantage of our induced seizure model.

      Reviewer 3:

      “(7) No mention or assessment of mouse sex as a biological variable.”

      Thank you for pointing this out. Both female and male animals were included in this study. Epileptic cohort: 7 males, 3 females; naïve cohort: 3 males, 4 females.

    1. eLife Assessment

      This study presents valuable findings, based on solid methods, to link metabolic dysfunction in Wilson's disease to immune cell dysregulation and poor cholecystitis outcomes. The integration of clinical data and single-cell analyses highlights NK cell exhaustion as a key factor, offering insights with potential therapeutic implications. The work will be of interest to colleagues in inflammatory and metabolic diseases.

    2. Reviewer #2 (Public review):

      Summary:

      Wilson's disease is a rare genetic disorder caused by mutations in the ATP7B gene. Previous studies have documented that ATP7B mutations can disrupt copper metabolism, affecting brain and liver function. In this paper, the authors performed a retrospective clinical study and found that Wilson's disease has a high incidence of cholecystitis. Single-cell RNA-seq analysis revealed changes in the immune microenvironment, including the activation of immune responses and the exhaustion of natural killer cells.

      Strengths:

      A key finding of this study is that the predominant ATP7B gene mutation in the Chinese population is the 2333G>T (p. R778L) mutation. The authors reported associations between Wilson's disease and cholecystitis, as well as the exhaustion of natural killer cells.

      Weaknesses:

      The underlying mechanisms linking ATP7B mutations to cholecystitis and natural killer cell exhaustion remain unclear. Specifically, it is not yet determined whether copper metabolism alterations directly cause cholecystitis and natural killer cell exhaustion, or if these effects are secondary to liver dysfunction.

      Comments on revisions:

      The authors fully addressed my questions and I don't have further comments.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Wilson's Disease (WD) is a rare inherited pathological condition caused by a mutation in ATP7B that alters mitochondrial structure and function. Additionally, WD results in dysregulated copper metabolism in patients. These metabolic abnormalities affect the functions of the liver and can result in cholecystitis. Understanding the immune component and its contribution to WD and cholecystitis has been challenging. In this work, the authors have performed single-cell RNA sequencing of mesenchymal tissue from three WD patients and three liver hemangioma patients.

      Strengths:

      The authors describe the transcriptomic alterations in myeloid and lymphoid compartments.

      Weaknesses:

      In brief, this manuscript lacks a clear focus, and the writing needs vast improvement. Figures lack details (or are misrepresented), the results section only catalogs observations, and the discussion needs to focus on their findings' mechanistic and functional relevance. The major weakness of this manuscript is that the authors do not provide a mechanistic link between the absence of ATP7B and NK cells' impaired/altered functions. While the work is of high clinical relevance, there are various areas that could be improved.

      In this study, we reported for the first time that ATP7B mutation and the resulting metabolic abnormalities in hepatocytes cause functional alteration of immune cells in WD patients. We dissected the transcriptional profiles of liver mesenchymal cells and delineated the functional differences of main immune cells in WD patients through scRNA-seq. The NK cell exhaustion and its clinical significance were further demonstrated.

      The mechanism is indeed a concern of ours. Given that the ATP7B mutation is hepatocyte-specific, its effect on immune cells is most probably exerted through intercellular communication rather than through the direct action of ATP7B protein. How the ATP7B mutation disturbs metabolic homeostasis in hepatocytes, how metabolic pathways regulate the release of signaling substances, and how these substances act on NK cells all remain to be explained. Together with the present findings, this content is beyond the scope of a single article, so this manuscript focuses on the novel observations.

      We sincerely appreciate the comments. We have improved the manuscript based on your valuable suggestions. The mechanistic study is the subject of our subsequent research. We are actively pursuing it and have found that the ATP7B mutation rewires a certain metabolic pathway in hepatocytes, and that a critical metabolite functions as the mediator causing NK cell exhaustion.

      Reviewer #2 (Public Review):

      Summary:

      Wilson's disease is a rare genetic disorder caused by mutations in the ATP7B gene. Previous studies have documented that ATP7B mutations can disrupt copper metabolism, affecting brain and liver function. In this paper, the authors performed a retrospective clinical study and found that Wilson's disease has a high incidence of cholecystitis. Single-cell RNA-seq analysis revealed changes in the immune microenvironment, including the activation of immune responses and the exhaustion of natural killer cells.

      Strengths:

      A key finding of this study is that the predominant ATP7B gene mutation in the Chinese population is the 2333G>T (p. R778L) mutation. The authors reported associations between Wilson's disease and cholecystitis, as well as the exhaustion of natural killer cells.

      Weaknesses:

      The underlying mechanisms linking ATP7B mutations to cholecystitis and natural killer cell exhaustion remain unclear. Specifically, it is not yet determined whether copper metabolism alterations directly cause cholecystitis and natural killer cell exhaustion, or if these effects are secondary to liver dysfunction.

      In this study, we reported for the first time that ATP7B mutation and the resulting metabolic abnormalities in hepatocytes cause functional alteration of immune cells in WD patients. We dissected the transcriptional profiles of liver mesenchymal cells and delineated the functional differences of main immune cells in WD patients through scRNA-seq, focusing on the NK cell exhaustion and its clinical significance.

      The mechanism is indeed a concern of ours. Given that the ATP7B mutation is hepatocyte-specific, its effect on immune cells is most probably exerted through intercellular communication, so we are prioritizing the study of this aspect. How the ATP7B mutation disturbs metabolic homeostasis in hepatocytes, how metabolic pathways regulate the release of signaling substances, and how these substances act on NK cells all remain to be explained. Together with the present findings, this content is beyond the scope of a single article, so this manuscript focuses on the novel observations.

      We sincerely appreciate the comments. The mechanistic study is the topic of our follow-up work. We are actively pursuing this research and have found that the ATP7B mutation rewires a certain metabolic pathway in hepatocytes, and that a critical metabolite functions as the mediator causing NK cell exhaustion.

      Reviewer #1 (Recommendations For The Authors):

      Major:

      (1) Abstract. A major portion of this manuscript focuses on non-NK cells. Data that describes NK cell exhaustion is only minimal. Therefore, the authors should modify the abstract.

      Thank you for your valuable suggestion. We have supplemented the description of functional changes in other immune cells, and have modified the abstract (line 31-35).

      (2) Introduction. There are three paragraphs. The first paragraph discusses cholecystitis. However, there are too many repetitions, and the information is unclear. In the second part, the authors discuss NK cells and their exhaustion. The authors do not establish a clear rationale or logic linking NK cells to WD or cholecystitis. In the last paragraph, the authors describe their findings. Their correlation between NK cell exhaustion and the poor healing process of cholecystitis has no direct experimental proof.

      Thank you for your comments. We have deleted the repetitions and rephrased some sentences (line 72-74). Briefly, in the first paragraph, we proposed the significant prognostic value of immune cell dysfunction for cholecystitis. In the second paragraph, we introduced NK cell exhaustion and its potential to predict prognosis of certain diseases. In the third paragraph, we introduced that the liver is a central organ involved in metabolism and immunity, holding a large number of NK cells. Liver pathologies commonly impact the development and outcome of inflammation-associated diseases such as cholecystitis. WD was selected as a research model. In the last paragraph, we introduced our findings from clinical study, scRNA-seq, clinical samples, and bioinformatics analysis, and concluded at the end.

      (3) Results. Overall, the results section lacks clarity and a clear focus. Figure legends need to be significantly detailed. The authors make too many broad statements without any support. The authors also make too many overstatements.

      Thank you for your valuable suggestion. We have improved the inaccurate statements and made detailed refinement of figure legends. All the changes are marked in the manuscript, and related responses are described below.

      Figure 1: No information is provided about the functional impairment of ATP7B protein due to the mutation found in the cohort of Chinese patients. What does 'immune abnormalities' (line 127) mean? What is the relevance of showing liver fibrosis and copper accumulation in the eye in Figure 1c and d, respectively? Total cholesterol concentrations are still within the range in the plasma of WD patients, but the authors call it higher. ECAR has not changed in WD patients, but the authors claim it has (line 117).

      (1) All these gene mutations in WD disable the protein function and cause the same outcome. (2) We have deleted the inappropriate statement. (3) In clinical observation, we found that WD not only causes copper accumulation in hepatocytes, but also leads to a variety of diseases, including liver fibrosis, Kayser-Fleischer Ring, and lower risk of hyperglycemia. We showed these together with the data of cholecystitis incidence. We think these might suggest the significance of intercellular communication between hepatocytes and other cells in microenvironment. (4) We have deleted the inappropriate statement (line 108-110, 112-113).

      Figure 2: Did the authors use the liver mesenchymal tissue or mesenchymal cells? Figure 2 states that they used mesenchymal cells, different from liver mesenchymal tissue. Numbers within Figure 2b UMAP are not visible. Were the initial T and NK cells annotated as indicated in Figure S2 (CD3D, CD3E, CD3G)? If so, that does not include NK cells.

      (1) The liver mesenchymal cells were used for scRNA-seq. (2) It is possible that the image resolution was reduced due to the compression of files by the submission system during merging process. We confirm that the image resolution of all figures meets publishing requirements, and that all characters on the figures are visible. You can download figure files to view details. (3) It was our negligence that the incomplete cell markers were shown in Figure S2. We have updated the markers (CD3D, CD3E, NKG7), references (Ref #53, #55, and #56), and related figures (Figure 2e, and Figure S2c).

      Figure 3: The authors should change 'Case' to 'WD patients' both in the text and figures. DEGs in Figure 3C indicate a transcriptomic alteration in the B cell compartment, which the authors do not delineate. Also, the rationale and explanation for the CellChat analyses are minimal. Concluding that a change occurred within the TME with minimal data and explanations is unfair.

      Thank you for your comments. (1) We apologize for the confusion caused by the nomenclature and abbreviations used in the text and figures. In all scRNA-seq data analysis, presentation, and description, we used specific terms (CASE and CON) to refer to the groups of WD patients and controls, as well as their cell populations. We have now unified the nomenclature throughout the text and defined the terms where they first appear (line 126-127), avoiding the lowercase form to prevent confusion. (2) We have now compared the expression of key B cell genes between the two groups in the next section “The dysfunction of main immune cells in WD patients” (line 230-235, Figure 4e, Figure S4e). (3) We have described the results of cellular communication in more detail (line 188-194). (4) We have modified the conclusion and all related statements throughout the text (line 29-31, 82-84, 149, 194-195).

      Figure 4: This section deals with multiple cell types with minimal explanations. This section discusses various cell types, but it lacks focus. In particular, the T cell section should be separated and elaborated more in detail.

      (1) In this section, we intended to show the comparison in function of main immune cells that account for a considerable proportion, instead of just showing differently expressed genes that provide minimal information. The evaluation of functional signature, based on the integration of multiple gene expression, allows a direct understanding of the final outcome owing to transcriptional changes. (2) Given that the main functions of T cells did not change significantly and there were more significant changes in innate immunity, the T cell section is relatively short and unsuitable as a separated part.

      Figure 5: What are the distinct subsets of NK cells authors have found in the WD patients and controls? How do these subsets differ between the two groups in numbers and their transcriptomes? The presentation and labeling of Figure 5 and Supplementary Figure 5 need to be vastly improved. The pseudotime presentation in Figure 5b should be presented separately for the patients and the controls. Are the changes in gene expression presented in Figure 5a due to the change in the subset compositions? Figure 5c immuno-staining is not at all visible. A clear explanation should be given for the differences between Figure 5c and Figure 5e, where NKG2A expressions are shown. A better explanation for Figure 5d is required. Did the authors use all the antibodies with the same fluorochrome? If so, what color is that? Can the authors include the individual samples in the bar diagram in Figure 5e? Again, the data in Figure 5 is insufficient to conclude that NK cells are exhausted in WD patients. While the role of changes in the expression of T-BET and EOMES can be related to dysfunction and cellular exhaustion of NK cells, the statement made by the authors needs to be toned down as they do not test with independent experiments.

      (1) The subsets of NK cell were clustered by gene expression profile and labeled by the characteristically expressed gene, using certain algorithm in the routine procedure. They cannot be distinguished in clinical samples by one or several genes or other sorting methods. Thus, we were not able to analyze these subsets in clinical samples. (2) We have supplemented the comparison of numbers and transcriptomes of three NK subtypes between the two groups (line 268-273). (3) We have checked the figures and confirmed that all characters on the figures are visible. (4) We have separately presented the plot in Figure S5d. (5) We compared the expression level of genes presented in Figure 5a between the two groups in three NK subtypes and supplemented this part (line 264-268). The results were very consistent across the three subtypes, suggesting that the results in total NK population were contributed by all three subtypes and not affected by a single composition. (6) KLRC1 is also known as NKG2A. We are sorry for not making a clear explanation, and now we use KLRC1 only in all text to avoid confusion. We have made a more clear and detailed description for Figure 5c, 5d, and 5e (now labeled as Figure 5b, 5c, and 5d), and have included the fluorochrome in Figure 5d (now labeled as Figure 5c) and the individual value in Figure 5e (now labeled as Figure 5d) (line 293-299). (7) In this section, we found the upregulated expression of inhibitory receptors, downregulated expression of effector molecules, and the impaired NK cell-mediated cytotoxicity in NK cell of WD patients from scRNA-seq. Then we validated the findings in clinical liver section samples and clinical blood samples by mIHC and flow cytometry, respectively. 
      According to recent articles, exhausted NK cells are characterized by decreased production of effector cytokines (e.g., IFNγ), impaired cytolytic activity, downregulated expression of certain activating receptors, and upregulated expression of inhibitory receptors (e.g., 10.3389/fimmu.2017.00760, 10.1038/s41590-018-0132-0, 10.1038/s41467-019-09212-y, 10.1080/2162402X.2016.1264562). Therefore, we concluded that NK cells are exhausted in WD patients. (8) In the part about transcription factors, we kept the description of objective data and deleted the statement about the contribution of transcription factors to NK exhaustion.

      Figure 6: Data presented in Figure 6 and the conclusion made in this manuscript are predictive. There is no direct testing of ATP7B in NK cells to show the functions of this gene. Extension of this to patient survival is purely speculative. As long as authors state these facts clearly in their text, it can be acceptable. However, they do not extend their conclusions to similar liver diseases.

      The ATP7B mutation is hepatocyte-specific and does not occur in any immune cells. The function of ATP7B in NK cells was not studied. We found NK exhaustion and a poor prognosis of cholecystitis in WD patients. Given that previous studies have demonstrated that NK exhaustion is correlated with poor liver cancer prognosis, we hypothesized that NK exhaustion contributes to the poor prognosis of cholecystitis. Bioinformatics analyses were consistent with our hypothesis and supported the extension of this result to other inflammatory diseases. We have no experimental data on this point, but the result is supported by the bioinformatics analysis.

      (4) Discussion: While the authors analyzed multiple cell types, the discussion is primarily focused on NK cells. There is no clear link between copper utilization, NK cell function, and exhaustion that the authors articulate.

      Thank you for your comments. The focus of our study is NK cell exhaustion, which is experimentally proven, so we discussed this aspect. In our follow-up study, we are prioritizing the effect of intercellular communication and metabolic alteration on NK cell exhaustion. Excess copper is released into the circulation in some circumstances in WD patients, but patients generally receive long-term de-coppering therapy to maintain intracellular copper at a non-lethal level. Thus, we do not consider copper a critical factor in this study. In the original manuscript, we mentioned cuproptosis and its potential as a novel target. This was likely to cause ambiguity and misunderstanding, so we deleted this part to state our point of view clearly.

      (5) Supplementary Figures: The presentation and labeling of these figures need to be changed.

      Thank you for your suggestions. We have modified the figures and confirmed that all characters on the figures are visible.

      Reviewer #2 (Recommendations For The Authors):

      It is better to test whether ATP7B mutation can directly affect immune functions.

      Thank you for your suggestions. Given that the ATP7B mutation is hepatocyte-specific, its effect on immune cells is most probably exerted through intercellular communication. Thus, we are prioritizing the effect of intercellular communication on NK cell exhaustion and are actively pursuing this research.

    1. eLife Assessment

      This potentially valuable work characterizes the changes in the microbial composition of the nasal and fecal microbiomes in COVID-19 patients based on disease severity. This study enhances the understanding of COVID-19 severity predictors by identifying changes in bacterial species abundance in nasopharyngeal and fecal samples as a biomarker for predicting disease severity. The methods and statistics used appear to be solid and in line with the standards of the field.

    2. Reviewer #1 (Public review):

      Summary:

      The research study under review investigated the relationship between the gut and nasopharyngeal microbiota and identified potential microbiota-based biomarkers that could aid in predicting COVID-19 severity. The study reported significant changes in the richness and Shannon diversity index of the nasopharyngeal microbiome associated with severe symptoms.

      Strengths:

      The study successfully identified differences in microbiome diversity that could indicate or predict disease severity. Furthermore, the authors demonstrated a link between individual nasopharyngeal organisms and the severity of SARS-CoV-2 infection. The density of the nasopharyngeal organisms was shown to be a potential predictor of COVID-19 severity.

    3. Reviewer #3 (Public review):

      Summary:

      How the microbial composition of the human body is influenced by and influences disease progression is an important topic. For people with COVID-19, symptomatic progression and deterioration can be difficult to predict. This manuscript attempts to associate the nasal and fecal microbiomes of COVID-19 patients with the severity of disease symptoms, with the goal of identifying microbial markers that can predict disease outcomes.

      Strengths:

      Analysis of microbiomes from two distinct anatomical locations and across three distinct patient groups is a substantial undertaking. How these microbiomes influence and are influenced by COVID-19 disease progression is an important question. In particular, the putative biomarker identified here could be of clinical value with additional research.

      Weaknesses:

      The primary weaknesses of this analysis are the relatively low sample size for analyzing disease subsets and the moderate correlation values observed for putative biomarkers. Regardless, these data can be used to inform future studies aiming to understand the contribution of multifactorial dysbiosis to COVID-19 disease progression.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer 1

      We would like to express our gratitude to Reviewer 1 for providing a thorough summary of our work and highlighting its strengths. With regard to the weaknesses, we are committed to improving the manuscript by making the necessary changes. First, we will specify the exact p-value in all cases.

      Regarding the discussion section, we acknowledge the feedback regarding its potential confusion. In line with the reviewer's suggestion, we will reduce the literature review and highlight our findings.

      Finally, for the preprint we did not include confounders such as HIV infection and ethnicity, as our study population did not exhibit viral infections and comprised only Hispanic individuals. We will describe the study population more thoroughly and address these characteristics explicitly in both the methods section and the initial part of the results.

      Reviewer 2

      We thank Reviewer 2 for the comments. Although it is true that several papers have described the role of the microbiome in COVID-19 severity, we firmly believe that our current work stands out. There is not much information on this association in Mediterranean countries, especially in the south of Spain. In addition, most studies describe microbiota composition in stool or nasopharyngeal samples separately, without investigating any potential relationships between them as we do.

      (1) We agree with the reviewer's point about the limited sample size. We faced the challenge of collecting the samples during the peak of the COVID-19 pandemic, when doctors and nurses were overwhelmed and not always available to recruit patients according to the inclusion criteria. Despite these constraints, we ensured that all included samples met our specified inclusion criteria and were from subjects with confirmed symptomatology.

      In addition, our main goal was to identify whether the severity of the disease could be assessed through microbiota composition. Therefore, we did not include a healthy group. Despite not having a large N, our results should be reproducible, as they are supported by statistical analysis.

      (2) We thank the reviewer for this comment. Since our original sentence may have lacked clarity, we intend to modify it to ensure it conveys the intended meaning more effectively.

      Nonetheless, we remain confident in the significance of our findings. Not only have we found a correlation between the microbiota and COVID-19 severity, but we have also described how specific bacteria from each condition are associated with key biochemical parameters of clinical COVID-19 infection.

      (3) We appreciate the feedback provided by the reviewer. In this case, we performed 16S analysis because of its cost-effectiveness compared to metagenomic approaches. Furthermore, 16S analysis has undergone refinements that ensure comprehensive coverage and depth, along with standardized analysis protocols. Unlike 16S, metagenomic approaches lack software tools such as QIIME that facilitate standardization of analysis, which reduces the reproducibility of results.

      (4) We sincerely appreciate this insightful suggestion. Since simply listing associations between both microbiomes and COVID-19 severity may not be enough, we intend to discuss in our discussion how microbiota composition may be linked to the mechanisms underlying COVID-19 pathogenesis.

      (5) We are grateful for the constructive criticism and intend to rewrite our abstract to enhance clarity. Additionally, we will thoroughly review all figures and their descriptions to ensure accuracy and comprehensibility.

      Reviewer 3

      We acknowledge the annotations made by reviewer 3 and are committed to addressing all identified weaknesses to enhance the quality of our work. Our idea is to modify the methods section and figures to make them easier to understand.

      Specifically, in the case of Figure 1, we recognize an error in the description of the Bray-Curtis test. We appreciate the comment and will make the necessary changes. There is also another observation related to the description of Figure 1, which we will modify to improve its accuracy.

      For Figure 2, we plan to add a supplementary table showing the abundance of the detected genera. We will also update the manuscript text to clarify how we obtained this result.

      Regarding the clarification about "1% abundance," we want to emphasize that we are referring to relative abundance, where 1 represents 100%. To avoid confusion, we will explicitly state this in both the methods section and figure descriptions. In addition, it is true that the statistical test employed for the analysis is not mentioned in the figure description, and we recognize that the image may be difficult to interpret. Therefore, we will modify the text and add a supplementary table displaying the abundance and p values.
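
      The scale described above can be illustrated with a minimal sketch, assuming relative abundance is computed as a taxon's read count divided by the total reads in the sample. The taxon names, read counts, and the 0.01 cutoff below are hypothetical and are not taken from the study; the point is only that, on a 0-to-1 scale, a cutoff of 1% is written as 0.01, whereas a value of 1 would mean 100%.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical read counts for a single sample (illustrative values only).
sample_counts = {"genus_A": 605, "genus_B": 390, "genus_C": 5}

fractions = relative_abundance(sample_counts)

# On a 0-to-1 scale, a 1% cutoff is 0.01; a value of 1 would mean 100%.
threshold = 0.01
kept = {taxon: f for taxon, f in fractions.items() if f >= threshold}

for taxon, f in sorted(fractions.items()):
    status = "kept" if taxon in kept else "below cutoff"
    print(f"{taxon}: {f:.3f} ({f * 100:.1f}%) {status}")
```

      Stating the scale next to each threshold, as in the comment above, is one way to avoid the ambiguity the reviewer pointed out.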

      Furthermore, we agree with the reviewer's suggestion to investigate whether the bacteria identified as potential biomarkers for each condition are specific to their respective severity index or if there is a threshold. Thus, we will reanalyze the data and include a supplementary table with the abundance of each biomarker for each condition. We will also place greater emphasis on these results in our discussion.

      Finally, in response to the reviewer's suggestion, we are going to go through the nasopharyngeal-fecal axis part in the discussion. It is well described that COVID-19 induces a dysbiosis in both microbiomes. Consequently, we understand that the ratio we have described could be an interesting tool for assessing COVID severity development as it considers alterations in both environments. However, we acknowledge that there may be room for improvement in clarifying the significance of this intriguing finding and its implications.

    1. eLife Assessment

      This important work shows how a simple geophysical setting of gas flow over a narrow channel of water can create a physical environment that leads to the isothermal replication of nucleic acids. The work presents compelling evidence for an isothermal polymerase chain reaction in careful experiments involving evaporation and convective flows, complemented by fluid dynamics simulations. This work will be of interest to scientists working on the origin of life and more broadly, on nucleic acids and diagnostic applications.

    2. Reviewer #1 (Public review):

      This manuscript from Schwintek and coworkers describes a system in which gas flow across a small channel (10^-4-10^-3 m scale) enables the accumulation of reactants and convective flow. The authors go on to show that this can be used to perform PCR as a model of prebiotic replication.

      Strengths:

      The manuscript nicely extends the authors' prior work in thermophoresis and convection to gas flows. The demonstration of nucleic acid replication is an exciting one, and an enzyme-catalyzed proof-of-concept is a great first step towards a novel geochemical scenario for prebiotic replication reactions and other prebiotic chemistry.

      The manuscript nicely combines theory and experiment, which generally agree well with one another, and it convincingly shows that accumulation can be achieved with gas flows and that it can also be utilized in the same system for what one hopes is a precursor to a model prebiotic reaction. This continues efforts from Braun and Mast over the last 10-15 years extending a phenomenon that was appreciated by physicists and perhaps underappreciated in prebiotic chemistry to increasingly chemically relevant systems and, here, a pilot experiment with a simple biochemical system as a prebiotic model.

      I think this is exciting work and will be of broad interest to the prebiotic chemistry community. The techniques described will be useful to the community as well.

      Weaknesses:

      This work stands well on its own in advancing the field and is well-supported by the evidence presented. The weaknesses below are thus more hopes for future work than limitations of a study that I find to be a complete and well-executed piece of work.

      This paper's use of highly evolved protein enzymes is a potential limitation in its direct relevance to prebiotic chemistry. But this is less a limitation of the manuscript than the state of the field after the authors' advances. It will be of interest to see how these systems function in, e.g., RiboPCR (10.1073/pnas.1610103113) and with non-enzymatic systems.

      Similarly, some of the artifacts in this work (appreciated and noted by the authors) arising from gas bubbles evolving prevent the simulations from fully describing their results. However, gas-liquid interactions were likely important in prebiotic chemistry and the authors note several areas in which these could be important in future systems.

    3. Reviewer #2 (Public review):

      Schwintek et al. investigated whether a geological setting of a rock pore with water inflow on one end and gas passing over the opening of the pore on the other end could create a non-equilibrium system that sustains nucleic acid reactions under mild conditions. The evaporation of water as the gas passes over it concentrates the solutes at the boundary of evaporation, while the gas flux induces momentum transfer that creates currents in the water that push the concentrated molecules back into the bulk solution. This leads to the creation of steady state regions of differential salt and macromolecule concentrations that can be used to manipulate nucleic acids. First, the authors showed that fluorescent bead behavior in this system closely matched their fluid dynamic simulations. With that validation in hand, the authors next showed that fluorescently-labeled DNA behaved according to their theory as well. Using these insights, the authors performed a FRET experiment that clearly demonstrated hybridization of two DNA strands as they passed through the high Mg++ concentration zone, and, conversely, the dissociation of the strands as they passed through low Mg++ concentration zone. This isothermal hybridization and dissociation of DNA strands allowed the authors to perform an isothermal DNA amplification using a DNA polymerase enzyme. Crucially, the isothermal DNA amplification required the presence of the gas flux and could not be recapitulated using a system that was at equilibrium. These experiments advance our understanding of the geological settings that could support nucleic acid reactions that were key for the origin of life.

      The presented data compellingly supports the conclusions made by the authors. In the revised submission, the authors have made convincing arguments supported by simulations that the present findings obtained with DNA would translate to RNA as well, thus making this work highly relevant for the field of origin of life.

      A potential future experiment the authors could consider includes performing a prebiotically relevant reaction, such as non-enzymatic primer extension or ligation, in the described model of the rock pore geological setting.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      This manuscript from Schwintek and coworkers describes a system in which gas flow across a small channel (10^-4-10^-3 m scale) enables the accumulation of reactants and convective flow. The authors go on to show that this can be used to perform PCR as a model of prebiotic replication.

      Strengths:

      The manuscript nicely extends the authors' prior work in thermophoresis and convection to gas flows. The demonstration of nucleic acid replication is an exciting one, and an enzyme-catalyzed proof-of-concept is a great first step towards a novel geochemical scenario for prebiotic replication reactions and other prebiotic chemistry.

      The manuscript nicely combines theory and experiment, which generally agree well with one another, and it convincingly shows that accumulation can be achieved with gas flows and that it can also be utilized in the same system for what one hopes is a precursor to a model prebiotic reaction. This continues efforts from Braun and Mast over the last 10-15 years extending a phenomenon that was appreciated by physicists and perhaps underappreciated in prebiotic chemistry to increasingly chemically relevant systems and, here, a pilot experiment with a simple biochemical system as a prebiotic model.

      I think this is exciting work and will be of broad interest to the prebiotic chemistry community.

      Weaknesses:

      The manuscript states: "The micro scale gas-water evaporation interface consisted of a 1.5 mm wide and 250 µm thick channel that carried an upward pure water flow of 4 nl/s ≈ 10 µm/s perpendicular to an air flow of about 250 ml/min ≈ 10 m/s." This was a bit confusing on first read because Figure 2 appears to show a larger channel - based on the scale bar, it appears to be about 2 mm across on the short axis and 5 mm across on the long axis. From reading the methods, one understands the thickness is associated with the Teflon, but the 1.5 mm dimension is still a bit confusing (and what is the dimension in the long axis?) It is a little hard to tell which portion (perhaps all?) of the image is the channel. This is because discontinuities are present on the left and right sides of the experimental panels (consistent with the image showing material beyond the channel), but not the simulated panels. Based on the authors' description of the apparatus (sapphire/CNC machined Teflon/sapphire) it sounds like the geometry is well-known to them. Clarifying what is going on here (and perhaps supplying the source images for the machined Teflon) would be helpful.

      We understand. We will update the figures to better show the dimensions of the experimental chamber. We will also add a more complete figure to the supplementary information. Part of the complexity of the chamber, however, stems from the fact that the same design has also been used to create defined temperature gradients, which are not needed here; the chamber is therefore more complex than this experiment requires.

      We added a scheme of the whole PTFE chip to the top left corner of Figure 2, indicating the ROI shown in the fluorescence micrographs. Additionally, the channel walls are now clearly indicated by white dotted lines. The dimensions of the setup are now shown more clearly: the figure indicates the total width of the channel, its height up to the gas flux channel, and its depth. We changed the figure caption accordingly; it now reads: “[…] The PTFE chip cutout in the top left corner shows the ROI used for the micrographs. The color scale is equal for both simulation and experiment and Channel dimensions are 4 x 1.5 x 0.25 mm as indicated. Dotted lines visualize the location of the channel walls. […]”
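
      As a quick consistency check on the quoted flow rates, the following minimal sketch converts the 4 nl/s water inflow into a mean speed through the 1.5 mm x 0.25 mm channel cross-section. The function names are illustrative only. The cross-section of the gas channel is not stated in this response, so the second function merely back-calculates the area implied by 250 ml/min ≈ 10 m/s rather than asserting the real geometry.

```python
def mean_speed_um_per_s(flow_nl_per_s, width_mm, thickness_mm):
    """Mean flow speed through a rectangular cross-section, in µm/s."""
    flow_um3_per_s = flow_nl_per_s * 1e6                 # 1 nl = 1e6 µm^3
    area_um2 = (width_mm * 1e3) * (thickness_mm * 1e3)   # mm -> µm on each side
    return flow_um3_per_s / area_um2


def implied_area_mm2(flow_ml_per_min, speed_m_per_s):
    """Cross-sectional area implied by a volumetric flow and a mean speed, in mm^2."""
    flow_m3_per_s = flow_ml_per_min * 1e-6 / 60.0        # 1 ml = 1e-6 m^3
    return flow_m3_per_s / speed_m_per_s * 1e6           # m^2 -> mm^2


# Water inflow: values quoted in the manuscript excerpt and figure caption.
water_speed = mean_speed_um_per_s(4.0, 1.5, 0.25)
print(f"water: {water_speed:.1f} µm/s")

# Gas flow: area implied by the quoted numbers (assumption: uniform flow profile).
gas_area = implied_area_mm2(250.0, 10.0)
print(f"implied gas cross-section: {gas_area:.2f} mm^2")
```

      The water value comes out close to the quoted ≈ 10 µm/s, which supports reading 1.5 mm x 0.25 mm as the cross-section that carries the inflow.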

      The data shown in Figure 2d nicely shows nonrandom residuals (for experimental values vs. simulated) that are most pronounced at t~12 m and t~40-60m. It seems like this is (1) because some symmetry-breaking occurs that isn't accounted for by the model, and perhaps (2) because of the fact that these data are n=1. I think discussing what's going on with (1) would greatly improve the paper, and performing additional replicates to address (2) would be very informative and enhance the paper. Perhaps the negative and positive residuals would change sign in some, but not all, additional replicates?

      To address this, we will show two more replicates of the experiment and include them in Figure 2.

      We are seeing two effects when we compare fluorescence measurements of the experiments.

      Firstly, degassing of water causes the formation of air-bubbles, which are then transported upwards to the interface, disrupting fluorescence measurements. This, however, mostly occurs in experiments with elevated temperatures for PCR reactions, such as displayed in Figure 4.

      Secondly, due to the high surface tension of water, the interface is quite flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, leading to alterations in the circular flow fields below.

      Thus the conditions, while overall being in steady state, show some fluctuations. The strong dependence on interface shape is also seen in the simulation. However, modeling a dynamic interface shape is not so easy to accomplish, so we had to stick to one geometry setting. Again here, the added movies of two more experiments should clarify this issue.

      We performed three more replicates of the experiment and included the averaged data points together with their respective standard deviation as error bars in Figure 2d. Additionally, the videos of each individual repeat are now added to the supplementary files for the reader to better understand where the strong fluctuations around half an hour come from. The figure caption was adjusted to: “[…] The maximum relative concentration of DNA increased within an hour to ~30 X the initial concentration, with the trend following the simulation. Error bars are the standard deviation from four independent measurements. […]”

      The main text was also changed to better explain how the fluctuations impact the measurements: “[…] Water continuously evaporated at the interface, but nucleic acids remained in the aqueous phase accumulating near the interface. They could only escape downward either by diffusion or by the vortex induced by the gas flowing across the interface, pushing the molecules back deeper into the bulk (See the flow lines in Fig2(b) taken from the simulation).  As the gas flow continuously removed excess vapor, the evaporation rate remained constant. Thus, except for fluctuations, a stable interface shape should be expected. However, due to the high surface tension of water, the interface is very flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, likely in response to small fluctuations in gas pressure and spatial variations in water surface tension. This is leading to alterations in the circular flow fields below (Supplementary Movie 2).

      As these fluctuations are difficult to simulate, we decided to stick with one interface shape, matching evaporation and inflow speeds. The evaporation rate at the interface was therefore set to be proportional to the vapor concentration gradient and varied spatially along the interface between 5 and 10.5 µm/s (See Suppl. Fig. VI.1(d)). Using the known diffusion coefficient of 95 µm²/s for the 63mer [9], the simulation closely matched the experimental results. In both cases, DNA accumulated in regions with circular flow patterns driven by the gas flux (Fig.2(b), right panel).

      5 minutes after starting the experiment, the maximum DNA accumulation was 3-fold, while after one hour of evaporation, around 30-fold accumulation was observed. Due to molecules residing in very shallow volumes when directly at the interface, the fluorescence signal can vary drastically compared to measurements deeper in the bulk. This can be seen in the fluctuations between independent measurements (See Supplementary Movies 2b,2b,2c), especially around 0.5 h shown in Figure 2(d). The simulated maximum accumulation followed the experimental results and starts saturating after about one hour (Fig.2(d)). […]”

      The authors will most likely be familiar with the work of Victor Ugaz and colleagues, in which they demonstrated Rayleigh-Bénard-driven PCR in convection cells (10.1126/science.298.5594.793, 10.1002/anie.200700306). Not including some discussion of this work is an unfortunate oversight, and addressing it would significantly improve the manuscript and provide some valuable context to readers. Something of particular interest would be their observation that wide circular cells gave chaotic temperature profiles relative to narrow ones and that these improved PCR amplification (10.1002/anie.201004217). I think contextualizing the results shown here in light of this paper would be helpful.

      Thank you for pointing this out and reminding us; we apologize for the omission. We agree that the chaotic trajectories within Rayleigh-Bénard convection cells lead to temperature oscillations similar to the salt variations in our gas-flux system. Although convection-driven PCR in Rayleigh-Bénard cells is not isothermal like our system, it provides a useful point of comparison and context for understanding environments that can support full replication cycles. We will add a section comparing the approaches and giving some background on the history of convective PCR and how it relates to the new isothermal implementation.

      We added a main text paragraph after the last paragraph in section “Strand Separation Dynamics”: “[…] Rayleigh-Bénard convection cells generate similar patterns to those seen in Fig. 3(c). The oscillations in salt concentration resemble the temperature fluctuations observed in convection-based PCR reactions from earlier studies [32,33], which showed that chaotic temperature variations, compared to periodic ones, enhanced the efficiency of the PCR reaction. […]”

      Again, it appears n=1 is shown for Figure 4a-c - the source of the title claim of the paper - and showing some replicates and perhaps discussing them in the context of prior work would enhance the manuscript.

      We appreciate the reviewer for bringing this to our attention. We will now include the two additional repeats for the data shown in Figure 4c, while the repeats of the PAGE measurements are already displayed in Supplementary Fig. IX.2. Initially, we chose not to show the repeats in Figure 4c due to the dynamic and variable nature of the system. These variations are primarily caused by differences at the water-air interface, attributed to the high surface tension of water. Additionally, the stochastic formation of air bubbles in the inflow—despite our best efforts to avoid them—led to fluctuations in the fluorescence measurements across experiments. These bubbles cause a significant drop in fluorescence in a region of interest (ROI) until the area is refilled with the sample.

      Unlike our RNA-focused experiments, PCR requires high temperatures and degassing a PCR master mix effectively is challenging in this context. While we believe our chamber design is sufficiently gas-tight to prevent air from diffusing in, the high surface-to-volume ratio in microfluidics makes degassing highly effective, particularly at elevated temperatures. We anticipate that switching to RNA experiments at lower temperatures will mitigate this issue, which is also relevant in a prebiotic context.

      The reviewer’s comments are valid and prompt us to fully display these aspects of the system. We will now include these repeats in Figure 4c to give readers a deeper understanding of the experiment's dynamics. Additionally, we will provide videos of all three repeats, allowing readers to better grasp the nature of the fluctuations in SYBR Green fluorescence depicted in Figure 4c.

      The data from the triplicates are now added to Figure 4c, showing how air bubbles, forming through degassing at the high temperatures required for Taq polymerase, disrupt the measurement, as they momentarily dry off the channel and stop the reaction until the channel fills again. Figure caption has been adapted and now reads: “[…] Dotted lines show the data from independent repeats. Air bubbles formed through degassing can momentarily disrupt the reaction. […]”

      We additionally changed the main text to explain the experimental difficulties to the reader: “[…] In other repetitions of the reaction, this increase was sometimes even observed earlier, around the one-hour mark (dotted lines). However, air bubbles nucleated by degassing events rise and temporarily dry out the channel, interrupting the reaction until the liquid refills the channel (Supplementary Movies 4, 4b, 4c & 5). Despite our best efforts, we were unable to fully prevent this, especially given the high temperatures required for Taq polymerase activity. In an identical setting when the gas- and water flux were switched off, no fluorescence increase was found (See Fig. 4(c) red lines). Fluorescence variations are additionally caused by fluctuations in the position of the gas-water interface, as discussed earlier. […]”

      I think some caution is warranted in interpreting the PCR results because a primer-dimer would be of essentially the same length as the product. It appears as though the experiment has worked as described, but it's very difficult to be certain of this given this limitation. Doing the PCR with a significantly longer amplicon would be ideal, or alternately discussing this possible limitation would be helpful to the readers in managing expectations.

      This is a good point and should be discussed more in the manuscript. Our gel electrophoresis is capable of distinguishing between the replicated product and primer dimers. We know this because we optimized the primer and template sequences to minimize primer dimers, which makes any dimers distinguishable from the desired 61mer product. Moreover, none of the experiments performed without an added template strand showed any band in the vicinity of the product band after 4 h of reaction, in contrast to the experiments with template. This is a strong argument against the presence of primer dimers.

      We added a main text section explaining this to the reader: “[…]Suppl. Fig. IX.2 shows all independent repeats of the corresponding experiments. No product was detected in any of these cases, ruling out reaction limitations such as primer dimer formation. Primer dimers would form even in the absence of a template strand and would be identifiable through gel electrophoresis. As Taq polymerase requires a significant overlap between the two dimers to bind, this would result in a shorter product compared to the 61mer used here.  […]”

      Reviewer #2 (Public review):

      Schwintek et al. investigated whether a geological setting of a rock pore with water inflow on one end and gas passing over the opening of the pore on the other end could create a non-equilibrium system that sustains nucleic acid reactions under mild conditions. The evaporation of water as the gas passes over it concentrates the solutes at the boundary of evaporation, while the gas flux induces momentum transfer that creates currents in the water that push the concentrated molecules back into the bulk solution. This leads to the creation of steady-state regions of differential salt and macromolecule concentrations that can be used to manipulate nucleic acids. First, the authors showed that fluorescent bead behavior in this system closely matched their fluid dynamic simulations. With that validation in hand, the authors next showed that fluorescently labeled DNA behaved according to their theory as well. Using these insights, the authors performed a FRET experiment that clearly demonstrated the hybridization of two DNA strands as they passed through the high Mg++ concentration zone, and, conversely, the dissociation of the strands as they passed through the low Mg++ concentration zone. This isothermal hybridization and dissociation of DNA strands allowed the authors to perform an isothermal DNA amplification using a DNA polymerase enzyme. Crucially, the isothermal DNA amplification required the presence of the gas flux and could not be recapitulated using a system that was at equilibrium. These experiments advance our understanding of the geological settings that could support nucleic acid reactions that were key to the origin of life.

      The presented data compellingly supports the conclusions made by the authors. To increase the relevance of the work for the origin of life field, the following experiments are suggested:

      (1) While the central premise of this work is that RNA degradation presents a risk for strand separation strategies relying on elevated temperatures, all of the work is performed using DNA as the nucleic acid model. I understand the convenience of using DNA, especially in the latter replication experiment, but I think that at least the FRET experiments could be performed using RNA instead of DNA.

      We understand the request only partially. The modification introduced by the two dye molecules in the FRET probe, which allow us to probe salt concentrations by melting, is of course much larger than the change of the backbone from RNA to DNA. This is why we preferred the much more stable DNA construct, which can also be manufactured at lower cost and in much higher purity, including the modifications. We think the melting temperature characteristics of RNA and DNA in this range are sufficiently well known that we can use DNA instead of RNA for probing the salt concentration in our flow cycling.

      Only under extreme conditions of pH and salt, especially alkaline conditions, is RNA degradation through transesterification several orders of magnitude faster than the spontaneous degradative mechanisms acting upon DNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. The work presented in this article is however focussed on hybridization dynamics of nucleic acids. Here, RNA and DNA share similar properties regarding the formation of double strands and their respective melting temperatures. While RNA has been shown to form more stable duplex structures exhibiting higher melting temperatures compared to DNA [Dimitrov, R. A., & Zuker, M. (2004). Prediction of hybridization and melting for double-stranded nucleic acids. Biophysical Journal, 87(1), 215-226.], the general impact of changes in salt, temperature and pH [Mariani, A., Bonfio, C., Johnson, C. M., & Sutherland, J. D. (2018). pH-Driven RNA strand separation under prebiotically plausible conditions. Biochemistry, 57(45), 6382-6386.] on respective melting temperatures follows the same trend for both nucleic acid types. Also the diffusive properties of RNA and DNA are very similar [Baaske, P., Weinert, F. M., Duhr, S., Lemke, K. H., Russell, M. J., & Braun, D. (2007). Extreme accumulation of nucleotides in simulated hydrothermal pore systems. Proceedings of the National Academy of Sciences, 104(22), 9346-9351.].

      Since this work is a proof of principle for the discussed environment being able to host nucleic acid replication, we aimed to avoid second order effects such as degradation by hydrolysis by using DNA as a proxy polymer. This enabled us to focus on the physical effects of the environment on local salt and nucleic acid concentration. The experiments performed with FRET are used to visualize local salt concentration changes and their impact on the melting temperature of dissolved nucleic acids.  While performing these experiments with RNA would without doubt cover a broader application within the field of origin of life, we aimed at a step-by-step / proof of principle approach, especially since the environmental phenomena studied here have not been previously investigated in the OOL context. Incorporating RNA-related complexity into this system should however be addressed in future studies. This will likely require modifications to the experimental boundary conditions, such as adjusting pH, temperature, and salt concentration, to account for the greater duplex stability of RNA. For instance, lowering the pH would reduce the RNA melting temperature [Ianeselli, A., Atienza, M., Kudella, P. W., Gerland, U., Mast, C. B., & Braun, D. (2022). Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA. Nature Physics, 18(5), 579-585.].

      (2) Additionally, showing that RNA does not degrade under the conditions employed by the authors (I am particularly worried about the high Mg++ zones created by the flux) would further strengthen the already very strong and compelling work.

      Based on literature values for hydrolysis rates of RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.], we estimate RNA to have a half-life of multiple months under the deployed conditions in the FRET experiment (high concentration zones contain <1 mM of Mg2+). Additionally, dsRNA is multiple orders of magnitude more stable than ssRNA with regards to degradation through hydrolysis [Zhang, K., Hodge, J., Chatterjee, A., Moon, T. S., & Parker, K. M. (2021). Duplex structure of double-stranded RNA provides stability against hydrolysis relative to single-stranded RNA. Environmental Science & Technology, 55(12), 8045-8053.], improving RNA stability especially in zones of high FRET signal. Furthermore, at the neutral pH deployed in this work, RNA does not readily degrade. In previous work from our lab [Salditt, A., Karr, L., Salibi, E., Le Vay, K., Braun, D., & Mutschler, H. (2023). Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment. Nature Communications, 14(1), 1495.], we showed that the lifetime of RNA under conditions reaching 40 mM Mg2+ at the air-water interface at 45°C was sufficient to support ribozymatically mediated ligation reactions in experiments lasting multiple hours.

      With that in mind, gaining insight into the median Mg2+ concentration across multiple averaged nucleic acid trajectories in our system (see Fig. 3c&d) and numerically convoluting this with hydrolysis dynamics from literature would be highly valuable. We anticipate that longer residence times in trajectories distant from the interface will improve RNA stability compared to a system with uniformly high Mg2+ concentrations.

      We added a new Supplementary section for this. We used the trace from Figure 3(c) and calculated the hydrolysis rate for each timestep using literature values for RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. We conclude that the conditions deployed for the experiment are not harsh on RNA, with hydrolysis rates on the order of 10^-6 1/min. The figure below (also now in the supplementary information) shows the hydrolysis of RNA under the conditions of the experiment in Figure 3. RNA is not expected to hydrolyze under these conditions on the timescales in which a replication reaction would occur. With a half-life of around 83 days, even a prebiotically plausible, very slow replication reaction would not be constrained by hydrolysis in this scenario.

      We now reference this supplementary section in the main text: “[…] In the experimental conditions used here, RNA would also not readily degrade, even if the strand enters the high salt regimes (See Suppl. Sec. IX). Using literature values for hydrolysis rates under the deployed conditions, we estimate dissolved RNA to have a half life of around 83 days. […]”
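
      The relation between the quoted half-life and the quoted rate regime can be sketched with first-order kinetics. This is a minimal illustration, not the calculation from the supplementary section: it assumes simple first-order decay, and the per-timestep rates in the example trajectory are hypothetical placeholders rather than values derived from Figure 3(c).

```python
import math

MINUTES_PER_DAY = 24 * 60


def rate_from_half_life(half_life_days):
    """First-order rate constant (1/min) corresponding to a half-life in days."""
    return math.log(2) / (half_life_days * MINUTES_PER_DAY)


def half_life_days(rate_per_min):
    """Half-life in days for a first-order rate constant given in 1/min."""
    return math.log(2) / rate_per_min / MINUTES_PER_DAY


def surviving_fraction(rates_per_min, dt_min):
    """Intact fraction after a trajectory with a time-varying first-order rate.

    Integrates k(t) over the trajectory with a simple rectangle rule.
    """
    exponent = sum(k * dt_min for k in rates_per_min)
    return math.exp(-exponent)


# An 83-day half-life corresponds to a rate of a few 1e-6 per minute.
k_83_days = rate_from_half_life(83.0)
print(f"k = {k_83_days:.2e} 1/min")

# Hypothetical 4 h trajectory alternating between a low-salt and a high-salt
# zone, sampled once per minute (placeholder rates, for illustration only).
trajectory = [2e-6, 1e-5] * 120
intact = surviving_fraction(trajectory, dt_min=1.0)
print(f"intact fraction after 4 h: {intact:.4f}")
```

      Under these assumptions, a rate in the 10^-6 1/min regime and a half-life of roughly 83 days are two statements of the same quantity, and the intact fraction after a few hours stays very close to one.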

      (3) Finally, I am curious whether the authors have considered designing a simulation or experiment that uses the imidazole- or 2′,3′-cyclic phosphate-activated ribonucleotides. For instance, a fully paired RNA duplex and a fluorescently-labeled primer could be incubated in the presence of activated ribonucleotides +/- flux and subsequently analyzed by gel electrophoresis to determine how much primer extension has occurred. The reason for this suggestion is that, due to the slow kinetics of chemical primer extension, the reannealing of the fully complementary strands as they pass through the high Mg++ zone, which is required for primer extension, may outcompete the primer extension reaction. In the case of the DNA polymerase, the enzymatic catalysis likely outcompetes the reannealing, but this may not recapitulate the uncatalyzed chemical reaction.

      This is certainly on our to-do list for future experiments in this setting. Our current focus is on templated ligation rather than templated polymerization, and we are working hard to implement an RNA-only, enzyme-free ligation chain reaction based on more optimized parameters for templated ligation from 2’3’-cyclic phosphate activation, which were just published [High-Fidelity RNA Copying via 2′,3′-Cyclic Phosphate Ligation, Adriana C. Serrão, Sreekar Wunnava, Avinash V. Dass, Lennard Ufer, Philipp Schwintek, Christof B. Mast, and Dieter Braun, JACS doi.org/10.1021/jacs.3c10813 (2024)]. However, we would first try this at an air-water interface, which has been shown to work with RNA in a temperature gradient [Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment, Annalena Salditt, Leonie Karr, Elia Salibi, Kristian Le Vay, Dieter Braun & Hannes Mutschler, Nature Communications doi.org/10.1038/s41467-023-37206-4 (2023)], before making the jump to the isothermal setting we describe here. We understand the question, but it has been good practice in the past to first get to know a setting with PCR and then move to RNA.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Could the authors comment on the likelihood of the geological environments where the water inflow velocity equals the evaporation velocity?

      This is an important point to mention in the manuscript, thank you for pointing that out. To produce a defined experiment, we were pushing the water out with a syringe pump, but regulated in a way that the evaporation was matching our flow rate. We imagine that a real system will self-regulate the inflow of the water column on the one hand side by a more complex geometry of the gas flow, matching the evaporation with the reflow of water automatically. The interface would either recede or move closer to the gas flux, depending on whether the inflow exceeds or falls short of the evaporation rate. As the interface moves closer, evaporation speeds up, while moving away slows it down. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface in place.

      We have already seen a bit of this dynamic in the experiments, but have so far not found a good geometry within our 2-dimensional, constant-thickness design to make it work for a longer time. Very likely a 3-dimensional reservoir of water with lower frictional forces would be able to do this, but this would require a full redesign with multi-thickness microfluidics. The more we think about it, the more we envisage implementing the next version of the experiment with a real porous volcanic rock inside a humidity chamber that simulates a full 6h prebiotic day. We would then lose the reproducibility of the experiment, but would likely gain a way for recondensation of water as dew on a cold morning to refill the water reservoirs in the rocks. Sorry that I am digressing towards experiments in the future.

      We added a paragraph after the second paragraph in Results and Discussion.

      It now reads: […] For a real early Earth environment we envision a system that self-regulates the water column's inflow by automatically balancing evaporation with capillary flows. The interface adjusts its position relative to the gas flux, moving closer if the inflow is less than the evaporation rate, or receding if it exceeds it. When the interface nears the gas flux, evaporation accelerates, while moving it away slows evaporation. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface's position. […]

      (2) Could the authors speculate on using gases other than ambient air to provide the flux and possibly even chemical energy? For example, using carbonyl sulfide or vaporized methyl isocyanide could drive amino acid and nucleotide activation, respectively, at the gas-water interface.

      This is an interesting prospect for future work with this system. We thought also about introducing ammonia for pH control and possible reactions. We were amazed in the past that having CO2 instead of air had a profound impact on the replication and the strand separation [Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA, Alan Ianeselli, Miguel Atienza, Patrick Kudella, Ulrich Gerland, Christof Mast & Dieter Braun, Nature Physics doi.org/10.1038/s41567-022-01516-z (2022)]. So going more in this direction absolutely makes sense and as it acts mostly on the length-selectively accumulated molecules at the interface, only the selected molecules will be affected, which adds to the selection pressure of early evolutionary scenarios.

      Of course, in the manuscript, we use ambient air as a proxy for any gas, focusing primarily on the energy introduced through momentum transfer and evaporation. We speculate that soluble gasses could establish chemical gradients, such as pH or redox potential, from the bulk solution to the interface, similar to the Mg2+ accumulation shown in Figure 3c. The nature of these gradients would depend on each gas's solubility and diffusivity. We have already observed such effects in thermal gradients [Keil, L. M., Möller, F. M., Kieß, M., Kudella, P. W., & Mast, C. B. (2017). Proton gradients and pH oscillations emerge from heat flow at the microscale. Nature communications, 8(1), 1897.] and finding similar behavior in an isothermal environment would be a significant discovery.

      Added a paragraph in the Conclusion to showcase this: [… ] Furthermore we expect that other gases, such as CO2, could establish chemical gradients in this environment. Such gradients have been observed in thermal gradients before [23] and finding similar behaviour in an isothermal environment would be a significant discovery.[…]

      (3) Line 162: Instead of "risk," I suggest using "rate".

      Thanks for pointing this out! Will be changed.

      Fixed.

      (4) Using FRET of a DNA duplex as an indicator of salt concentration is a decent proxy, but a more direct measurement of salt concentration would provide further merit to the explicit statement that it is the salt concentration that is changing in the system and not another hidden parameter.

      Directly observing salt concentration using microscopy is a difficult task. While there are dyes that change their fluorescence depending on the local Na+ or Mg2+ concentration, they do not operate differentially, i.e. by taking a ratio between two color channels. Only a ratiometric readout avoids artifacts from the dye molecules themselves being accumulated by the non-equilibrium setting. We were able to do this for pH in the past, but did not find comparable optical salt sensors. This is the reason we ended up with a FRET pair, with the advantage that we actually probe the strand separation that we are interested in anyway. Using such a dye in future work would, however, without a doubt enhance the understanding of not only this system, but also our thermal gradient environments.

      (5) Figure 3a: Could the authors add information on "Dried DNA" to the caption? I am assuming this is the DNA that dried off on the sides of the vessel but cannot be sure.

      Thanks to the reviewer for pointing this out. This is correct and we will describe this better in the revised manuscript.

      Added a sentence in the caption to address this: […] Fluctuations in interface position can dry and redissolve DNA repeatedly (see “Dried DNA” in right panel). […]

      (6) Figure 4b and c: How reproducible is this data? Have the authors performed this reaction multiple independent times? If so, this data should be added to the manuscript.

      The gel electrophoresis was performed in triplicate, and the data are shown in full in the supplementary information. The data in c are hard to reproduce, as the interface is not static and thus ROI measurements are difficult to average over repeats. Including the data from the independent repeats will, however, give the reader insight into some of the experimental difficulties, such as air bubbles that form from degassing as the liquid heats up and travel upwards to the interface, disrupting the ongoing fluorescence measurements.

      This was also pointed out by reviewer 1 and addressed there.

      (7) Line 256: "shielding from harmful UV" statement only applies to RNA oligomers as UV light may actually be beneficial for earlier steps during ribonucleoside synthesis. I suggest rephrasing to "shielding nucleic acid oligomers from UV damage.".

      Will be adjusted as mentioned.

      Fixed.

      (8) The final paragraph in the Results and Discussion section would flow better if placed in the Conclusion section.

      This is a good point and we will merge results and discussion closer together.

      Fixed.

      (9) Line 262, "...of early Life" is slightly overstating the conclusions of the study. I suggest rephrasing to "...of nucleic acids that could have supported early life."

      This is a fair comment. We thank the reviewer for their detailed analysis of the manuscript!

      Changed the phrase to: […]In this work we investigated a prebiotically plausible and abundant geological environment to support the replication of nucleic acids. […]

      (10) In references, some of the journal names are in sentence case while others are in title case (see references 23 and 26 for example).

      Thanks - this will be fixed.

      Fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study provides compelling evidence that RAR, rather than its obligate dimerization partner RXR, is functionally limiting for chromatin binding. This manuscript provides a paradigm for how to dissect the complicated regulatory networks formed by dimerizing transcription factor families.

      Dahal and colleagues use advanced SMT techniques to revisit the role of RXR in DNA-binding of the type-2 nuclear receptor (T2NR) RAR. The dominant consensus model for regulated DNA binding of T2NRs posits that they compete for a limited pool of RXR to form an obligate T2NR-RXR dimer. Using advanced SMT and proximity-assisted photoactivation technologies, Dahal et al. now test the effect of manipulating the endogenous pool size of RAR and RXR on heterodimerization and DNA-binding in live U2OS cells. Surprisingly, it turns out that RAR, rather than RXR, is functionally limiting for heterodimerization and chromatin binding. By inference, the relative pool size of various T2NRs expressed in a given cell, rather than RXR, is likely to determine chromatin binding and transcriptional output.

      The conclusions of this study are well supported by the experimental results and provide unexpected novel insights into the functioning of the clinically important class of T2NR TFs. Moreover, the presented results show how the use of novel technologies can put long-standing theories on how transcription factors work upside down. This manuscript provides a paradigm for how to further dissect the complicated regulatory networks formed by T2NRs or other dimerizing TFs. I found this to be a complete story that does not require additional experimental work. However, I do have some suggestions for the authors to consider.

      Reviewer #1 (Recommendations For The Authors):

      (1) Does the increased chromatin binding measured when the RAR levels are increased reflect a higher occupancy of a similar set of loci, or are additional loci bound? The authors could discuss this issue in the context of the published literature. Obviously, this could be addressed experimentally by ChIP-seq or a similar analysis, but this would extend beyond the main topic of this manuscript.

      We attempted to explore this experimentally using ChIP-seq with multiple RAR- and RXR-specific antibodies. Unfortunately, our results were inconclusive, as the antibody enrichment relative to the IgG control was insufficient for reliable interpretation. Specifically, our ChIP-seq enrichment levels were only around 1.5-fold, while the accepted standard for meaningful ChIP enrichment is typically at least 2-fold. Due to these technical limitations, we decided to defer these experiments for now.

      However, we agree with the reviewer that understanding whether the increased chromatin binding of RAR reflects higher occupancy at the same set of loci or binding to additional loci is a key question. In similar experiments involving the transcription factor TFEB (Esbin et al., 2024, Genes Dev, doi: 10.1101/gad.351633.124), where an increase in the SMT bound fraction occurred, both scenarios (higher occupancy at known loci and binding to additional loci) were observed by ChIP-seq. Addressing this intriguing possibility in future studies focused on RAR and RXR would therefore be interesting.

      (2) The results presented suggest convincingly that endogenous RXR is normally in excess to its binding partners (in U2OS cells). This point could be strengthened further by reducing RXR levels, e.g., by knocking out 1 allele or the use of shRNAs (although the latter method might be too hard to control). Overexpression of another T2NR might also help determine the buffer capacity of RXR.

      We appreciate the reviewer's acknowledgment that our results convincingly demonstrate that endogenous RXR is typically in excess relative to its binding partners in U2OS cells. We agree that this conclusion could be further reinforced by experiments such as overexpression of another T2NR to test RXR's buffering capacity. We are actively pursuing follow-up experiments involving overexpression of additional T2NRs to address this question in more detail. These studies are ongoing, and we plan to explore the buffer capacity of RXR more extensively in a future manuscript.

      (3) The ~10% difference in fbound of RAR and RXR (in Figs 1 and 2), while they should be 1:1 dimers, is explained by invoking the expression of RXR isoforms. Can the authors be more specific concerning the nature of these isoforms?

      We have provided detailed information about different T2NRs expressed in U2OS cells according to the Expression Atlas and the Human Protein Atlas Database in Supplementary Table S1. Table S1 specifically shows that both the RXRα and RXRβ isoforms are expressed in U2OS cells. Additionally, the caption of Table S1 explicitly notes the presence of isoform RXRβ in U2OS cells. In the main text, we reference Table S1 when discussing the 10% difference in fbound between RARα and RXRα, and we have now suggested that the expression of RXRβ likely accounts for the observed discrepancy.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Surprising Features of Nuclear Receptor Interaction Networks Revealed by Live Cell Single Molecule Imaging", Dahal et al combine fast single molecule tracking (SMT) with proximity-assisted photoactivation (PAPA) to study the interaction between RARa and RXRa. The prevalent model in the nuclear receptor field suggests that type II nuclear receptors compete for a limiting pool of their partner RXRa. Contrary to this, the authors find that over-expression of RARa but not RXRa increases the fraction of RXRa molecules bound to chromatin, which leads them to conclude that the limiting factor is the abundance of RARa and not RXRa. The authors also perform experiments with a known RARa agonist, all trans retinoic acid (atRA) which has little effect on the bound fraction. Using PAPA, they show that chromatin binding increases upon dimerization of RARa and RXRa.

      Strengths:

      In my view, the biggest strength of this study is the use of endogenously tagged RARa and RXRa cell lines. As the authors point out, most previous studies used either in vitro assays or over-expression. I commend the authors on the generation of single-cell clones of knock-in RARa-Halo and Halo-RXRa. The authors then carefully measure the abundance of each protein using FACS, which is very helpful when comparing across conditions. The manuscript is generally well written and figures are easy to follow. The consistent color-scheme used throughout the manuscript is very helpful.

      Weaknesses:

      (1) Agonist treatment:

      The authors test the effect of all trans retinoic acid (atRA) on the bound fraction of RARa and RXRa and find that "These results are consistent with the classic model in which dimerization and chromatin binding of T2NRs are ligand independent." However, all the agonist treatments are done in media containing FBS. FBS is not chemically defined and has been found to have between 10 and 50 nM atRA (see references in PMID 32359651 for example). The addition of 1 nM or 100 nM atRA is unlikely to result in a strong effect since the medium already contains comparable or higher levels of agonist. To test their hypothesis of ligand-independent dimerization, the authors should deplete the media of atRA by growing the cells in a medium containing charcoal-stripped FBS for at least 24 hours before adding agonist.

      We acknowledge the reviewer's concern regarding the presence of atRA in FBS and agree that it may introduce baseline levels of agonist. However, in our experiments, both the 1 nM and 100 nM atRA treatments resulted in observable changes in RAR expression levels (Figure S3C). Additionally, the luciferase assays demonstrated that 100 nM atRA significantly increased retinoic acid-responsive promoter activity (Figure S1C). Given these clear responses to atRA, we believe the observed lack of effect on the chromatin-bound fraction cannot be attributed to the presence of comparable or higher levels of atRA in the FBS, as the reviewer suggests. Moreover, since our results align with the established literature and do not impact the core findings of our study, we decided not to pursue the suggested experiments with charcoal-stripped FBS in this manuscript.  

      (2) Photobleaching and its effect on bound fraction measurements:

      The authors discard the first 500 to 1000 frames due to the high localization density in the initial frames. This will preferentially discard bound molecules that will bleach in the initial frames of the movie and lead to an over-estimation of the unbound fraction.

      For experiments with over-expression of RAR-Halo and Halo-RXR, the authors state that the cells were pre-bleached and that these frames were used to calculate the mean intensity of the nuclei. When pre-bleaching, bound molecules will preferentially bleach before the diffusing population. This will again lead to an over-representation of the unbound fraction since this is the population that will remain relatively unaffected by the pre-bleaching. Indeed, the bound fraction for over-expressed RARa and RXRa is significantly lower than that for the corresponding knock in lines. To confirm whether this is a biological result, I suggest that the authors either reduce the amount of dye they use so that this pre-bleaching is not necessary or use the direct reactivation strategy they use for their PAPA experiments to eliminate the pre-bleaching step.

      As for the measurement of the nuclear intensity, since the authors have access to multiple HaloTag dyes, they can saturate the HaloTagged proteins with a high concentration of JF646 or JFX650 to measure the mean intensity of the protein while still using the PA-JFX549 for SMT. Together, these will eliminate the need to prebleach or discard any frames.

      The Janelia Fluor dyes used in our experiments are known for their high photostability (Grimm et al., 2021, JACS Au, doi: 10.1021/jacsau.1c00006). During the initial 80 ms imaging to calculate the mean nuclear intensity, the laser power was kept at very low intensity (~3%) for a brief duration (~10 seconds), in contrast to the high-intensity (~100%) used during the tracking experiments, which span around 3 minutes. This low-power illumination does not induce significant photobleaching but merely puts the dyes in a temporary dark state. Therefore, this pre-bleaching step closely resembles the direct reactivation strategy employed in our PAPA experiments.

      To further address the reviewer's concern, we performed a frame cut-off analysis for our SMT movies of endogenous RARα-Halo and over-expressed RARα-Halo (Figure S9B). The analysis shows no significant change in the bound fraction of either endogenous or over-expressed RARα-Halo when discarding the initial 1000 frames. Based on these results, we conclude that the pre-bleaching does not lead to an overestimation of the unbound fraction, and that our experimental approach is robust.

      (3) Heterogeneous expression of the SNAP fusion proteins:

      The cell lines expressing SNAP tagged transgenes shown in Fig S6 have very heterogeneous expression of the SNAP proteins. While the bulk measurements done by Western blotting are useful, while doing single-cell experiments (especially with small numbers - ~20 - of cells), it is important to control for expression levels. Since these transgenic stable lines were not FACS sorted, it would be helpful for the reader to know the spread in the distribution of mean intensities of the SNAP proteins for the cells that the SMT data are presented for. This step is crucial while claiming the absence of an effect upon over-expression and can easily be done with a SNAPTag ligand such as SF650 using the procedure outlined for the over-expressed HaloTag proteins.

      We agree with the reviewer that there is heterogeneity in SNAP protein expression across the transgenic lines. In response to the reviewer’s suggestion, we performed the proposed experiment to assess the distribution of mean intensities for two key experimental conditions: Halo-RXRα with overexpressed RARα-SNAP and Halo-RXRα with overexpressed RARαRR-SNAP. These results again confirm that the increase in chromatin-bound fraction of Halo-RXRα is observed only in the presence of RARα capable of heterodimerizing with RXRα, supporting our main conclusion (Figure S9).

      For these experiments, we followed the same labelling procedure described in the methods section for tracking endogenous Halo-tagged proteins alongside transgenic SNAP proteins. As shown in Figure S9, for ~ 70 cell nuclei, the distribution of mean intensities is similar for both conditions, with the bound fraction of Halo-RXRα significantly increasing in the presence of RARα-SNAP compared to RARαRR-SNAP. This analysis underscores that the observed effects are indeed due to the functional differences between the two RARα variants rather than variability in expression levels.

      (4) Definition of bound molecules:

      The authors state that molecules with a diffusion coefficient less than 0.15 um2/s are considered bound and those between 1-15 um2/s are considered unbound. Clarification is needed on how this threshold was determined. In previous publications using saSPT, the authors have used a cutoff of 0.1 um2/s (for example, PMID 36066004, 36322456). Do the results rely on a specific cutoff? A diffusion coefficient by itself is only a useful measure of normal diffusion. Bound molecules are unlikely to be undergoing Brownian motion, but the state array method implemented here does not seem to account for non-normal diffusive modes. How valid is this assumption here?

      We acknowledge the inconsistency in the diffusion coefficient thresholds for defining the chromatin-bound fraction used across our group’s publications. The choice of threshold or cutoff (0.1 µm²/s vs 0.15 µm²/s) is largely arbitrary and does not significantly impact the results. To validate this, we tested the effect of different cutoffs on fbound (%) for endogenously expressed Halo-tagged RARα and RXRα (Figure S10). As shown in Figure S10, there was no substantial difference in fbound (%) calculated using a 0.1 µm²/s versus 0.15 µm²/s cutoff (e.g., RARα clone c156: 47±1% vs 49±1%; RXRα clone D6: 34±1% vs 35±1%). 

      Since we have consistently applied the 0.15 µm²/s cutoff throughout this manuscript across all experimental conditions, the comparative analysis of fbound (%) remains valid. While we agree that a Brownian diffusion model may not fully capture the motion of bound molecules, our state array model accounts for localization error, which likely incorporates some of the chromatin motion features. Moreover, the distinction between bound (<0.15 µm²/s) and unbound (1-15 µm²/s) populations is sufficiently large that using a normal diffusion model is reasonable for our analysis.
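
      A cutoff-sensitivity check of this kind can be sketched as follows. This is only an illustration under stated assumptions: the diffusion-coefficient grid and the bimodal occupancy profile are synthetic stand-ins, not output from saSPT or data from the manuscript, and `bound_fraction` is a hypothetical helper.

```python
import numpy as np


def bound_fraction(diff_coefs, occupancy, cutoff):
    """Fraction of total occupancy at diffusion coefficients below `cutoff`.

    `diff_coefs` is a grid of diffusion coefficients (µm²/s) and `occupancy`
    the weight assigned to each grid point (hypothetical inputs).
    """
    diff_coefs = np.asarray(diff_coefs, dtype=float)
    occupancy = np.asarray(occupancy, dtype=float)
    return occupancy[diff_coefs < cutoff].sum() / occupancy.sum()


def peak(log_d, center, width):
    """Gaussian peak on a log10(D) axis, scaled so its area is width-independent."""
    return np.exp(-0.5 * ((log_d - center) / width) ** 2) / width


# Synthetic bimodal profile: a slow ("bound") and a fast ("unbound") population.
grid = np.logspace(-2, 2, 200)  # µm²/s
log_d = np.log10(grid)
occupancy = (0.45 * peak(log_d, np.log10(0.03), 0.15)
             + 0.55 * peak(log_d, np.log10(4.0), 0.30))

for cutoff in (0.10, 0.15):
    print(f"cutoff {cutoff:.2f} µm²/s: "
          f"f_bound = {bound_fraction(grid, occupancy, cutoff):.3f}")
```

      When the two populations are well separated on the diffusion-coefficient axis, as assumed here, very little occupancy lies between the two cutoffs, so the bound fraction barely changes.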

      (5) Movies:

      Since this is an imaging manuscript, I request the authors to provide representative movies for all the presented conditions. This is an essential component for a reader to evaluate the data and for them to benchmark their own images if they are to try to reproduce these findings.

      We have now included representative movies for all the SMT experimental conditions presented in the manuscript. Please see data availability section of the manuscript.

      (6) Definition of an ROI:

      The authors state that "ROI of random size but with maximum possible area was selected to fit into the interior of the nuclei" while imaging. However, the readout speed of the Andor iXon Ultra 897 depends on the size of the defined ROI. If the ROI was variable for every movie, how do the authors ensure the same sampling rate?

      We used the frame transfer mode on the Andor iXon Ultra 897 camera for our acquisitions, which allows for fast frame rate measurements without altering the exposure time between frames. Additionally, we verified the metadata of all our movies to ensure a consistent frame interval of 7.4 ms across all conditions. This confirms that the sampling rate was maintained uniformly, despite the variability in ROI size. 

      Reviewer #2 (Recommendations For The Authors):

      (1) 'Hoechst' is mis-spelled.

      We have now corrected this typo in the manuscript.

      (2) Cos7 appears in several places throughout the text. I assume this is a typo. If so, please correct it. If not, please explain if some experiments were done in Cos7 cells and kindly provide a justification for that.

      The use of Cos7 cells is intentional and not a typo. Cos7 cells have been previously utilized in studies investigating the interaction between T2NRs (Kliewer et al., 1992, Nature, doi: 10.1038/355446a0). In our study, due to technical issues with antibodies for coIP in U2OS cells, we initially used Cos7 cells for control experiments to verify that Halo-tagging of RARα and RXRα did not disrupt their interaction, by transiently expressing the constructs in Cos7 cells. Following these control experiments, we confirmed the direct interaction of endogenously expressed RAR and RXR in U2OS cells with their respective binding partners using the SMT-PAPA assay. Since these results confirmed that Halo-tagging did not interfere with RAR-RXR interactions, we chose not to repeat the coIP experiments in U2OS cells.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to investigate the stoichiometric effect between core factors and partners forming the heterodimeric transcription factor network in living cells at endogenous expression levels. Using state-of-the-art single-molecule analysis techniques, the authors tracked individual RARα and RXRα molecules labeled by HALO-tag knock-in. They discovered an asymmetric response to the overexpression of counter-partners. Specifically, the fact that an increase in RARα did not lead to an increase in RXRα chromatin binding is incompatible with the previous competitive core model. Furthermore, by using a technique that visualizes only molecules proximal to partners, they directly linked transcription factor heterodimerization to chromatin binding.

      Strengths:

      The carefully designed experiments, from knock-in cell constructions to single-molecule imaging analysis, strengthen the evidence of the stoichiometric perturbation response of endogenous proteins. The novel finding that RXR, previously thought to be a target of competition among partners, is in excess provides new insight into key factors in dimerization network regulation. By combining the cutting-edge single-molecule imaging analysis with the technique for detecting interactions developed by the authors' group, they have directly illustrated the relationship between the physical interactions of dimeric transcription factors and chromatin binding. This has enabled interaction analysis in live cells that was challenging in single-molecule imaging, proving it is a powerful tool for studying endogenous proteins.

      Weaknesses:

      As the authors have mentioned, they have not investigated the effects of other T2NRs or RXR isoforms. These invisible factors leave room for interpretation regarding the origin of chromatin binding of endogenous proteins (Recommendations 4). In the PAPA experiments, overexpressed factors are visualized, but changes in chromatin binding of endogenous proteins due to interactions with the overexpressed proteins have not been investigated. This might be tested by reversing the fluorescent ligands for the Sender and Receiver. Additionally, the PAPA experiments are likely to be strengthened by control experiments (Recommendations 5).

      We agree that this would be an interesting experiment. However, there are three technical challenges that complicate its implementation: First, as demonstrated in our original PAPA paper, dark state formation is less efficient when dyes are conjugated to Halo compared to SNAPf, making the reverse configuration less optimal. Second, SNAPf-tagged proteins have slower labeling kinetics than Halo-tagged proteins, often resulting in under-labeling of SNAPf. Third, our SNAPf transgenes were integrated polyclonally. Since background PAPA scales with the concentration of the sender-labeled protein, variable concentrations of the sender-labeled SNAPf proteins would introduce significant variability, complicating the interpretation of the background PAPA signal. Due to these concerns, we believe that performing reciprocal measurements with reversed fluorescent ligands may not yield reliable results.

      Reviewer #3 (Recommendations For The Authors):

      (1) The term "Surprising features" in the title is ambiguous and may force readers to search for what it specifically refers to. Including a word that evokes specific features might be helpful.

      Our findings contradict previous work, which suggested that chromatin binding of T2NRs is regulated by competition for a limited pool of RXR. In contrast, we found that RAR expression can limit RXR chromatin binding, but not the other way around, which challenges the existing model. This unexpected result is what we refer to as a "surprising feature" in our title, and we believe it accurately reflects the novel insights our study provides. We also think that this is clearly conveyed in our manuscript abstract, supporting the use of "Surprising features" in the title. 

      (2) p.3, line 11 - The threshold of 0.15 μm2s-1 seems to be a crucial value directly linked to the value of fbound. What is the rationale for choosing this specific value? If consistent conclusions can be obtained using threshold values that are similar but different, it would strengthen the robustness of the results.

      Please refer to our response to Reviewer #2’s Public Review point 4. The threshold choice is arbitrary and doesn’t affect the overall conclusions. To test this, we compared fbound (%) values calculated using both 0.1 μm²s-1 and 0.15 μm²s-1 cutoffs. For example, with endogenously expressed Halo-tagged RARα (clone c156), we observed fbound values of 47±1% vs 49±1%, and for RXRα (clone D6), 34±1% vs 35±1%, respectively (Figure S10). Since we have consistently applied the 0.15 μm²s-1 cutoff across all experimental conditions in this manuscript, the comparisons of fbound (%) between different conditions are robust and valid.

      (3) p.4, line 13 - "the fbound of endogenous RARα-Halo (47±1%) was largely unchanged upon expression of SNAP (47±1%)" part of the sentence is not surprising. It would make more sense if it were expressed as "the fbound of endogenous RARα-Halo (47±1%) was largely unchanged upon expression of RXRα-SNAP (49±1%), consistent with the control SNAP (47±1%)."

      We understand how the original phrasing may be confusing to the readers and have restructured the sentence as suggested by the reviewer for clarity.

      (4) p.6, line 26 - The discussion that "most chromatin binding of endogenous RXRα in U2OS cells depends on heterodimerization partners other than RARα" seems to contradict the top right figure in Figure 4. If that's the case, the binding partner for the bound red molecule might be yellow rather than blue. Given a decrease in the number of RARα molecules with an unchanged binding ratio, the total number of binding molecules has decreased. Could it be interpreted that the potential reduction in RXRα chromatin binding, accompanying the decrease in binding RARα, is compensated for by other partners?

      We agree with the reviewer that both the yellow and blue molecules in Figure 4 represent T2NRs that can heterodimerize with RXR. For simplicity, we chose to omit the depiction of RXR dimerization with other T2NRs (represented in yellow) in Figure 4. We have now included a note in the figure caption to clarify this. We plan to follow up on the buffer capacity of RXR with other T2NRs in a separate manuscript and will discuss this aspect in more detail once we have data from those experiments.

      (5) Fig. 3 - I expected that DR localizations always appear more frequently than PAPA localizations by the difference in the number of distal molecules. Why does the linear line for SNAP-RXRα in Fig. 3 B have a slope exceeding 1? Also, although the sublinearity is attributed to binding saturation, is there any possibility that this sublinearity originates from the PAPA system like the saturation of PAPA reactivation? Control samples like Halo-SNAPf-3xNLS might address these concerns.

      The number of DR and PAPA localizations depends on the arbitrarily chosen intensity and duration of green and violet light pulses. For any given protein pair, different experimental settings can result in PAPA localizations being greater than, less than, or equal to the number of DR localizations. Therefore, the informative metric is not the absolute number of DR and PAPA localizations, but rather how the ratio of PAPA to DR localizations changes between different conditions—such as between interacting pairs and non-interacting controls.

      Regarding the sublinearity, we agree that it is essential to consider whether the observed sublinearity might stem from saturation of the PAPA signal. We know of two ways in which this could occur:

      First, PAPA can be saturated as the duration of the green light pulse increases and dark-state complexes are depleted. However, this cannot explain the nonlinearity that we observe, because the duration of the green light pulse is constant, and thus the probability that a given complex is reactivated by PAPA is also constant. Likewise, holding the violet pulse duration constant yields a constant probability that a given molecule is reactivated by DR. PAPA localizations are expected to scale linearly with the number of complexes, while DR localizations are expected to scale linearly with the total number of molecules. Sublinear scaling of PAPA localizations with DR localizations thus implies that the number of complexes scales sublinearly with the total concentration of the protein.

      Second, saturation could occur if PAPA localizations are undercounted compared to DR localizations. While this is a valid concern, we consider it unlikely in this case because 1) our localization density is below the level at which our tracking algorithm typically undercounts localizations, and 2) we observe sublinearity for RXR → RAR PAPA even though the number of PAPA localizations is lower than the DR localizations; undercounting due to excessive localization density would be expected to introduce the opposite bias in this case.
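
      The scaling argument in the first point can be written out explicitly. The following is a minimal sketch, not taken from the manuscript: the symbols N (total number of labeled molecules), C (number of complexes), and the per-molecule and per-complex reactivation probabilities p_DR and p_PAPA are introduced here purely for illustration, and the probabilities are assumed constant because the violet and green pulse durations are held fixed.

```latex
% Assumed notation (not from the manuscript):
%   N       total number of labeled molecules in the nucleus
%   C       number of molecules engaged in complexes
%   p_DR    probability that a given molecule is reactivated by DR
%   p_PAPA  probability that a given complex is reactivated by PAPA
\begin{align}
  \langle n_{\mathrm{DR}} \rangle &= p_{\mathrm{DR}}\, N, &
  \langle n_{\mathrm{PAPA}} \rangle &= p_{\mathrm{PAPA}}\, C, \\
  \frac{\langle n_{\mathrm{PAPA}} \rangle}{\langle n_{\mathrm{DR}} \rangle}
    &= \frac{p_{\mathrm{PAPA}}}{p_{\mathrm{DR}}} \cdot \frac{C}{N}.
\end{align}
```

      Under these assumptions the prefactor is constant, so a PAPA-to-DR ratio that falls as DR localizations increase implies that C/N falls as N increases, that is, the number of complexes grows sublinearly with total protein.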

      (6) Fig. 4 - The differences between A, B, and C on the right side of the model are subtle, making it difficult to discern where to see. Emphasizing the difference in molecule numbers or grouping free molecules at the top might help clarify these distinctions.

      We appreciate the reviewer’s feedback. In response, we have revised Figure 4 by grouping the free molecules on the top right side for panels A, B and C, as suggested.

      (7) While the main results are obtained through single-molecule imaging, no single-molecule fluorescence images or trajectory plots are provided. Even just for representative conditions, these could serve as a guide for readers trying to reproduce the experiments with different custom-built microscope setups. Also, considering data availability, depositing the source data might be necessary, at least for the diffusion spectra.

      We have now included representative movies for all the presented SMT conditions as source data. Please see data availability section of the manuscript.

      (8) Tick lines are not visible on many of the graph axes. 

      We have revised the figures to ensure that the tick lines are now clearly visible on all graph axes.

      (9) Inconsistencies in the formatting are present in the methods, such as "hrs" vs. "hours", spacing between numbers and units, and "MgCl2". "u" should be "μ" and "x" should be "×". 

      We have corrected the formatting errors.

      (10) Table S4, rows 16 and 17 - Are "RAR"s typos for "RXR"s? 

      We have corrected this in the manuscript.

      (11) p.10~12 - Are three "Hoestch"s typos for "Hoechst"s? 

      This is now corrected in the manuscript.

      (12) p.11, line 17 - According to the referenced paper, the abbreviation should be "HILO" in all capital letters, not "HiLO". 

      This is now corrected in the manuscript.

      (13) "%" on p.3, line 18, and "." on p.6, line 27 are missing. 

      The missing “%” and “.” have now been added.

    2. eLife Assessment

      This important study provides data that challenges the standard model that binding of Type 2 Nuclear Receptors to chromatin is limited by the available pool of their common heterodimerization partner Retinoid X Receptor. The evidence supporting the conclusions is compelling, utilizing state-of-the-art single-molecule microscopy. This work will be of broad interest to cell biologists who wish to determine limiting factors in gene regulatory networks.

    3. Reviewer #1 (Public review):

      This study provides compelling evidence that RAR, rather than its obligate dimerization partner RXR, is functionally limiting for chromatin binding. This manuscript provides a paradigm for how to dissect the complicated regulatory networks formed by dimerizing transcription factor families.

      Dahal and colleagues use advanced SMT techniques to revisit the role of RXR in DNA-binding of the type-2 nuclear receptor (T2NR) RAR. The dominant consensus model for regulated DNA binding of T2NRs poses that they compete for a limited pool of RXR to form an obligate T2NR-RXR dimer. Using advanced SMT and proximity-assisted photoactivation technologies, Dahal et al. now test the effect of manipulating the endogenous pool size of RAR and RXR on heterodimerization and DNA-binding in live U2OS cells. Surprisingly, it turns out that RAR, rather than RXR, is functionally limiting for heterodimerization and chromatin binding. By inference, the relative pool size of various T2NRs expressed in a given cell, rather than RXR, is likely to determine chromatin binding and transcriptional output.

      The conclusions of this study are well supported by the experimental results and provide unexpected novel insights into the functioning of the clinically important class of T2NR TFs. Moreover, the presented results show how the use of novel technologies can put long-standing theories on how transcription factors work upside down. This manuscript provides a paradigm for how to further dissect the complicated regulatory networks formed by T2NRs or other dimerizing TFs. I am convinced by the revised manuscript and have no additional concerns or comments.

    4. Reviewer #2 (Public review):

      Summary:

      In the manuscript "Surprising Features of Nuclear Receptor Interaction Networks Revealed by Live Cell Single Molecule Imaging", Dahal et al combine fast single molecule tracking (SMT) with proximity-assisted photoactivation (PAPA) to study the interaction between RARa and RXRa. The prevalent model in the nuclear receptor field suggests that type II nuclear receptors compete for a limiting pool of their partner RXRa. Contrary to this, the authors find that over-expression of RARa but not RXRa increases the fraction of RXRa molecules bound to chromatin, which leads them to conclude that the limiting factor is the abundance of RARa and not RXRa. The authors also perform experiments with a known RARa agonist, all trans retinoic acid (atRA) which has little effect on the bound fraction. Using PAPA, they show that chromatin binding increases upon dimerization of RARa and RXRa.

      The authors have done well to address my comments and specify limitations where they could not.

    5. Reviewer #3 (Public review):

      Summary:

      This study aims to investigate the stoichiometric effect between core factors and partners forming the heterodimeric transcription factor network in living cells at endogenous expression levels. Using state-of-the-art single-molecule analysis techniques, the authors tracked individual RARα and RXRα molecules labeled by HALO-tag knock-in. They discovered an asymmetric response to the overexpression of counter-partners. Specifically, the fact that an increase in RARα did not lead to an increase in RXRα chromatin binding is incompatible with the previous competitive core model. Furthermore, by using a technique that visualizes only molecules proximal to partners, they directly linked transcription factor heterodimerization to chromatin binding.

      Strengths:

      The carefully designed experiments, from knock-in cell constructions to single-molecule imaging analysis, strengthen the evidence of the stoichiometric perturbation response of endogenous proteins. The novel finding that RXR, previously thought to be a target of competition among partners, is in excess provides new insight into key factors in dimerization network regulation. By combining the cutting-edge single-molecule imaging analysis with the technique for detecting interactions developed by the authors' group, they have directly illustrated the relationship between the physical interactions of dimeric transcription factors and chromatin binding. This has enabled interaction analysis in live cells that was challenging in single-molecule imaging, proving it is a powerful tool for studying endogenous proteins.

      Weaknesses:

      None noted.

    1. eLife Assessment

      This valuable study addresses the potential roles of the master regulator of X chromosome inactivation, the Xist long non-coding RNA, in the regulation of autosomal genes. Using data from mouse cells, the authors propose that Xist can coat specific autosomal promoters, which in turn leads to the attenuation of their transcriptional activity. The evidence from individual genes is interesting, and the model aligns with recently published results from humans. However, despite some improvements during revision, the data and statistical analyses in the current study are not yet strong enough to allow for conclusive inferences, leaving the evidence for mouse cells behaving like human cells incomplete. The topic of the work is of broad interest, in particular to colleagues studying gene regulation and noncoding RNAs.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Yao S. and colleagues aims to monitor the potential autosomal regulatory role of the master regulator of X chromosome inactivation, the Xist long non-coding RNA. It has recently become apparent that in the human system, Xist RNA can not only spread in cis on the future inactive X chromosome but also reach some autosomal regions where it recruits transcriptional repression and Polycomb marking. Previous work has also reported that Xist RNA can show a diffused signal in some biological contexts in FISH experiments.

      In this study, the authors investigate whether Xist represses autosomal loci in differentiating female mouse embryonic stem cells (ESCs) and somatic mouse embryonic fibroblasts (MEFs). They perform a time course of ESC differentiation followed by Capture Hybridization of Associated RNA Targets (CHART) on both female and male ESCs, as well as pulldowns with sense oligos for Xist. The authors also examine transcriptional activity through RNA-seq and integrate this data with prior ChIP-seq experiments. Additional experiments were conducted in MEFs and Xist-ΔB repeat mutants, the latter of which fail to recruit Polycomb repressors.

      Based on this experimental design, the authors make several bold claims:

      (1) Xist binds to about a hundred specific autosomal regions.<br /> (2) This binding is specific to promoter regions rather than broad spreading.<br /> (3) Xist autosomal signal is inversely correlated with PRC1/2 marks but positively correlated with transcription.<br /> (4) Xist targeting results in the attenuation of transcription at autosomal regions.<br /> (5) The B-repeat region is important for autosomal Xist binding and gene repression.<br /> (6) Xist binding to autosomal regions also occurs in somatic cells but does not lead to gene repression.

      Together, these claims suggest that Xist might play a role in modulating the expression of autosomal genes in specific developmental and cellular contexts in mice.

      Strengths:

      This paper deals with an interesting hypothesis that Xist ncRNA can also function at autosomal loci.

      Weaknesses:

      The revised manuscript now includes many additional bioinformatic analyses to support the premise that Xist RNA targets a specific set of about 100 promoters and attenuates their expression in the early stages of differentiation. I have previously raised significant concerns about the bioinformatic analyses and the robustness of the data, especially those linked to CHART-seq datasets. Despite some improvements, fundamental problems with the analysis remain, precluding a conclusion on whether Xist RNA binds specific autosomal promoters. The main concerns include:

      (1) The authors nicely explain the use of biological replicates; however, they still fail to provide the analysis I requested on d0 and sense probes. While some quantification is presented in Figures 1E and 1F, the peak calling I asked for has still not been performed. In the response document, the authors report that about 600 peaks were identified in d0 female ESCs compared to about 100 in differentiated conditions. They explain this by the well-known phenomenon of having a background of differentiated cells in d0. In my opinion, this reasoning is flawed. With 98% of cells not inducing Xist in the culture, it is hard to imagine why 600 peaks would be detected in the peak calling analysis. Rather, this demonstrates a high background in the CHART peak calling. To assess this further, I have reanalyzed d7 CHART datasets and found robust enrichment of the sense probe on promoters of genes, even stronger than the antisense probe. MACS peak calling also identifies a robust number of peaks on the sense probe. Indeed, even though Figure 1F shows low sense probe enrichment, this is because it focuses on the antisense peaks only. An opposite effect is observed when focusing on all genes or on sense peaks. Therefore, it is difficult to decide which part of the signal is truly due to Xist binding and which reflects an inherent problem with the CHART signal. These results cast serious doubts on the biological conclusions of this work and point to a very high background level of promoter signal in both sense and antisense samples.

      (2) The authors do not address the conundrum of their results: how is it possible to have a genome-wide autosomal accumulation of Xist signal at promoters (see Figures 1A and 1B), while simultaneously specifically affecting only 100 promoters in the genome? The signal is either general (as Figures 1A and 1B suggest) or specific (as implied by the peak calling), but it cannot be both. Current data points to the fact that CHART has a bias for the most open parts of the chromatin.

      (3) The text is still very confusing when it comes to Polycomb. Some experiments point to the fact that there are few PRC1/2 marks at putative Xist autosomal binding sites (Figure 3C), while the use of X1 induces the loss of PRC2 marks. I still find this internally contradictory. The authors sadly do not address my concerns with additional analysis. Their current data indicate that upon Xist upregulation, Xist-RNA binds to autosomal regions that are highly expressed and devoid of Polycomb. These loci then become transcriptionally attenuated and gain some (but low) level of PRC2 in a Xist-dependent fashion. If this model is true, then all these regions should not have Xist in d0 of differentiation and should also have slightly lower levels of PRC2. The argument that there is a low level of Xist in 2-5% of cells should not be a problem because most of the signal will come from the 98% of cells not expressing Xist (as seen in Figure 1A). Without timepoint 0, the whole premise of the paper is difficult to interpret. Either the d0 samples are good enough, or the system is so leaky that it is nearly impossible to identify Xist-specific effects. Males are a useful control but are obviously a genetically very different line with distinct epigenetic and signaling statuses. It is crucial to compare the timing of repression/PRC accumulation to conclude if and how Xist is functional on these loci.

      (4) The authors did not address my concerns about the transcriptional analysis. I believe that the control genes are not selected properly. This analysis should not have been performed on just 100 randomly selected regions/genes. Instead, bootstrapping of 100 randomly selected regions/genes should be done, e.g., 1000 times. Additionally, one should only sample from expressed genes to have a comparable control gene set. For example, in Figures 4D and 4E, the distribution of control regions is entirely different. To stress again, relying on a set of 100 randomly selected genes/regions is not statistically robust; controls have to be matched, and bootstrapping has to be performed. Finally, each timepoint uses a different set of autosomal targets. There is a need to visualize the same set of genes across all timepoints (including d0). For example, are genes bound by Xist at d7 highly expressed at d0 and then attenuated only at d7? What happens to them at d14 (see points from 3)? The arguments about d0 heterogeneity are again not convincing (nor is Figure 3H, which shows a different set of genes).

      (5) Transcriptional analysis is often shown only as tracks; however, the reads for key example genes have to be quantified properly and not just visualized or amalgamated in a violin plot.
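
      The resampling control requested in point (4) could be sketched roughly as follows. This is an illustrative outline only, not the reviewer's or the authors' analysis: the function names, the expression cutoff `min_expr`, and the use of mean expression as the summary statistic are all assumptions, and matching is reduced here to restricting the pool to expressed, non-target genes. Each control set is drawn without replacement, and the draw is repeated `n_boot` times to build a null distribution.

```python
import random
import statistics


def bootstrap_control_means(expression, target_genes, n_boot=1000,
                            min_expr=1.0, seed=0):
    """Build a null distribution of mean expression from random control sets.

    expression   : dict mapping gene name -> expression value (e.g. TPM)
    target_genes : genes in the set of interest (excluded from the pool)
    min_expr     : hypothetical cutoff defining an "expressed" gene
    """
    rng = random.Random(seed)
    targets = set(target_genes)
    # Candidate controls: expressed genes that are not themselves targets.
    pool = [g for g, v in expression.items()
            if v >= min_expr and g not in targets]
    k = len(targets)
    if len(pool) < k:
        raise ValueError("not enough expressed control genes to sample from")
    # Each iteration draws one control set of the same size as the target set.
    return [statistics.mean(expression[g] for g in rng.sample(pool, k))
            for _ in range(n_boot)]


def empirical_p_lower(observed, null_values):
    """One-sided empirical p-value: how often the null is <= observed."""
    hits = sum(v <= observed for v in null_values)
    return (1 + hits) / (1 + len(null_values))
```

      In use, the observed mean expression of the target set at a given timepoint would be compared against the returned null distribution with `empirical_p_lower`, and the same target set would be reused across d0, d4, d7, and d14.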

    3. Reviewer #2 (Public review):

      Summary:

      To follow up on recent reports of Xist-autosome interaction, the authors examine female (and male transgenic) mESCs and MEFs by CHARTseq. Upon finding that only 10% of reads map to X, they sought to identify reproducible alternative sites of Xist-binding, and identify ~100 autosomal Xist-binding sites in active chromatin regions. They demonstrate a transient down-regulation of autosomal expression. They utilize published male transgenic inducible Xist mESC data to support their findings. In their system, inhibition of Xist reduces autosomal impact.

      Strengths:

      The authors address a topical and interesting question with a series of models including developmental timepoints and utilize unbiased approaches (CHARTseq, RNAseq). For the CHARTseq they have controls of both sense probes and male cells; and indeed do detect considerable background with their controls. The use of 'metagene' plots provides a visual summation of genic impact. They compare with published data.

      Weaknesses:

      The revised text and rebuttal clarified my confusion of the 'follow-up' analyses (Figure 4) compared to published datasets. Further, the figure legends have been improved.

      While the controls were a strength, it appears that when focussed on bound regions, the background (from sense probes) is now also substantially higher than global background (compare 1E to 1A/B). Thus, why do these autosomal targets enrich for the sense probes, and how to distinguish from such background for the ∆B experiments? If male and sense are both controls, then why is sense lower for males than females, doesn't this suggest Xist impact? While authors note d0 might detect Tsix, the signal is only slightly reduced by d14 and never equivalent. Indeed, the new PCA (S1C) does show as noted that female Xist interactions are distinct from sense and male, but the male signal is even more distinct from sense probes.

      It would have been preferable to see the dispersion of the Xist RNA cloud in these ∆B cells, rather than a reference.

      Only 2 replicates were used, but there were multiple time-points: d0, d4, d7, d14; further, the correlation analysis showed good reproducibility, and in response to reviews they note that 2 replicates are standard of practice.

      The conclusion that RepB is "required for localization to the ~100 genes" is based on density (panel 2E); however, these autosomal targets retain enrichment at TSSs (panel 2A) and indeed the text suggests they are the same sites, suggesting that in fact the choice of autosomal region binding is not RepB dependent. Thus, this remains unresolved for me.

      The introduction is clear, and the senior author is a leader in the field; however, by this reviewer's count 19 of the 52 references include the senior author.

      Better descriptors for the supplemental Excel files would be helpful.

      Aim achievement: The authors do identify autosomal sites with enrichment of chromatin marks and evidence of silencing. Their revised text clarifies many issues, although this reviewer still remains unconvinced that the autosomal targeting is RepB-dependent.

      The impact of Xist on autosomes is important for consideration of impact of changes in Xist expression with disease (notably cancers). Knowing the targets (if consistent) would enable assessment of such impact.

    4. Reviewer #3 (Public review):

      Summary:

      Yao et al use CHART to identify chromatin associated with Xist in female mouse ESCs, and, as control, male ESCs at various timepoints of differentiation. Besides binding of Xist to X chromosome regions they found significant binding to autosomes, concentrating mostly on promoter regions of around 100 autosomal genes, as elucidated by MACS. The authors went on to show that the RepB repeat is mostly responsible for these autosomal interactions using a female ESC line in which RepB is deleted. Evidence is provided that Xist interacts with active autosomal genes containing lower coverage of repressive marks H3K27me3 and H2AK119ub and that RepB dependent Xist binding leads to dampening of expression, but not silencing of autosomal genes. These results were confirmed by overexpression studies using transgenic ESCs with doxycycline-inducible Xist as well as via a small molecule inhibitor of Xist (X1), inducing/inhibiting the dampening of autosomal genes, respectively. Finally, using MEFs and Xist mutants RepB or RepE the authors provide evidence that Xist is bound to autosomal genes in cells after the XCI process but appears not to affect gene expression. The data presented appear generally clear and consistent and indicate some differences between human and mouse autosomal regulation by Xist. Thus, these results are timely and should be published.

      Strengths:

      Regulation of autosomal gene expression by Xist is a "big deal" as misregulation of this lncRNA causes developmental defects and human disease. Moreover, this finding may explain developmental differences between the sexes. The results in this manuscript identify specific mouse autosomal genes bound by Xist and decipher critical Xist regions that mediate this binding and gene dampening. The methods used in this study are appropriate, and the overall data presented appear convincing and are consistent, indicating some differences between human and mouse autosomal regulation by Xist.

      Comments on revisions:

      In the revised manuscript, the authors have addressed my previous criticisms satisfactorily. Moreover, the manuscript has been much improved with new confirmatory results and additional control experiments. This, combined with more detailed descriptions/explanations facilitates data interpretation, making the paper more transparent and easier to read.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Yao S. and colleagues aims to monitor the potential autosomal regulatory role of the master regulator of X chromosome inactivation, the Xist long non-coding RNA. It has recently become apparent that in the human system, Xist RNA can not only spread in cis on the future inactive X chromosome but also reach some autosomal regions where it recruits transcriptional repression and Polycomb marking. Previous work has also reported that Xist RNA can show a diffused signal in some biological contexts in FISH experiments.

      In this study, the authors investigate whether Xist represses autosomal loci in differentiating female mouse embryonic stem cells (ESCs) and somatic mouse embryonic fibroblasts (MEFs). They perform a time course of ESC differentiation followed by Capture Hybridization of Associated RNA Targets (CHART) on both female and male ESCs, as well as pulldowns with sense oligos for Xist. The authors also examine transcriptional activity through RNA-seq and integrate this data with prior ChIP-seq experiments. Additional experiments were conducted in MEFs and Xist-ΔB repeat mutants, the latter of which fail to recruit Polycomb repressors.

      Based on this experimental design, the authors make several bold claims:

      (1) Xist binds to about a hundred specific autosomal regions.

      (2) This binding is specific to promoter regions rather than broad spreading.

      (3) Xist autosomal signal is inversely correlated with PRC1/2 marks but positively correlated with transcription.

      (4) Xist targeting results in the attenuation of transcription at autosomal regions.

      (5) The B-repeat region is important for autosomal Xist binding and gene repression.

      (6) Xist binding to autosomal regions also occurs in somatic cells but does not lead to gene repression.

      Together, these claims suggest that Xist might play a role in modulating the expression of autosomal genes in specific developmental and cellular contexts in mice.

      Strengths:

      This paper deals with an interesting hypothesis that Xist ncRNA can also function at autosomal loci.

      Weaknesses: The claims reported in this paper are largely unsubstantiated by the data, with multiple misinterpretations, lacking controls, and inadequate statistics. Fundamental flaws in the experimental design/analysis preclude the validity of the findings. Major concerns are listed below: (1) The entire paper is based on the CHART observation that Xist is specifically targeted to autosomal promoters. Overall, the data analysis is flawed and does not support such conclusions. Importantly the sense WT and the 0h controls are not used, nor are the biological replicates. 

      We respectfully disagree with Rev1 but nevertheless thank the reviewer for making some suggestions that helped to strengthen our manuscript.  We have provided new experiments and analyses in the revised manuscript. Please see responses below.

      Rev1 seems to have missed or misunderstood some key experiments. In fact, the sense WT and 0h controls were shown. Furthermore, we included at least two biological replicates for each experiment.

      We used both male ES cells (which do not express Xist) and sense probes as key negative controls, as outlined in Figure S1. Crucially, we only analyzed peaks that were reproducible between biological replicates. The Xist CHART peaks in differentiating female ES cells were significantly enriched above the “background” defined by the sense probe and male controls. Specifically, in comparison to undifferentiated female ES cells (day 0) where both X chromosomes are active and Xist is not induced, Xist CHART robustly pulled down the X chromosome during cell differentiation (day 4, day 7, and day 14). In contrast, male ES cells showed no significant pull-down of the X chromosome, and the sense group also exhibited markedly reduced binding (new Figure S1B). Furthermore, Principal Component Analysis (PCA) of CHART-seq reads (day 4 as an example), including Xist, sense, and input samples from WT and ΔRepB female cells, further confirmed that the sense probe CHART was clearly distinguishable from Xist CHART signals. Please see revised Figure S1C. Together, these findings underscore the specificity and robustness of our CHART results.

      Data is typically visualized without quantification, and when quantified, control loci/gene sets are erroneously selected. Firstly, CHART validation on the X in FigS1 is misleading and not based on any quantifications (e.g., see the scale on Kdm6a (0-190) compared to Cdkl5 (0-40)). If scaled appropriately, there is Xist signal on the escapee. 

      Rev1 may have misread the presented data. In the example raised by Rev1, Fig. S1 is inherently quantitative: e.g., a ratio is a number in Fig. S1A (now Fig. S1B) and all gene tracks in Fig. 1B-E are shown with scales. We showed X-linked genes in Fig. S1 (now Fig. S2) as a control to demonstrate that the CHART worked and that Xist accumulated over time from day 0 to day 14. Our new Figure 1B demonstrates the Xist accumulation in graph format. 

      Our paper focuses on Xist autosomal binding sites. Thus, the X-linked examples were placed in the supplement. Escapee genes do in fact accumulate Xist at their promoter regions and this finding is consistent with data published by Simon et al. (2013, Nature). It was therefore not desirable in this paper to reanalyze X-linked genes, including escapees. Nevertheless, to address the reviewer’s concerns, we present new data in new Figure S3A. Here we analyzed the density of Xist binding across X-linked genes, including both active and inactive genes, as well as escapee genes. From this quantitative analysis, it should be clear that escapees do bind Xist. However, from the metagene plots in Figure S3B, we confirm the previous conclusion that escapees bind Xist at high levels just upstream of the promoter and that there is a depletion of Xist in the escapee gene body, consistent with a barrier preventing Xist from moving into the active gene. 

      All X-linked loci should have been quantified and classified based on escape status; sense control should also be quantified, and biological replicates should be shown separately. 

      Please see above response.

      Additionally, in the revised manuscript, we examined the Irreproducible Discovery Rate (IDR) to validate the reproducibility of peaks between the two replicates, and we included a representative example from female WT ES cells at day 4 (revised Figure S4A). The results showed a strong correlation between the replicates, with an IDR threshold of 0.05 (red point > 0.05). As described in the Methods section, to ensure reliable and robust peak identification, we performed peak calling (MACS2) separately on each replicate, and then used bedtools intersect to identify peaks that overlapped between the two replicates. This stringent process, including strict q-value settings in MACS2, ensures the reliability and reproducibility of the peaks presented in this study.
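
      The replicate-overlap step described above can be illustrated with a small sketch. This is not the pipeline used in the study: it is a plain-Python stand-in that keeps a replicate-1 peak whenever it overlaps at least one replicate-2 peak on the same chromosome, which is similar in spirit to reporting each overlapping entry once. The function name and the tuple layout `(chrom, start, end)` with half-open coordinates are assumptions made for the example.

```python
from collections import defaultdict


def reproducible_peaks(rep1, rep2):
    """Return rep1 peaks that overlap at least one rep2 peak.

    Peaks are (chrom, start, end) tuples with half-open coordinates,
    so intervals that merely touch (end == start) do not overlap.
    """
    # Group replicate-2 intervals by chromosome for quick lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in rep2:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in rep1:
        candidates = by_chrom.get(chrom, [])
        # Two half-open intervals overlap if each starts before the other ends.
        if any(start < e2 and s2 < end for s2, e2 in candidates):
            kept.append((chrom, start, end))
    return kept
```

      A linear scan per peak is adequate for a few hundred peaks; for genome-scale peak sets a sorted sweep or an interval tree would be the more usual choice.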

      Secondly, and most importantly, Figure 1 does not convincingly show specific Xist autosomal binding. Panel A quantification is on extremely variable y-scales and actually shows that Xist is recruited globally to nearly all autosomal genes, likely indicating an unspecific signal. Again, the sense and 0h controls should have been quantified along with biological replicates. 

      Figure 1 shows heatmaps and corresponding metagenes for d0, d4, d7, and d14 female ES cells. Two biological replicates are analyzed. In our revised manuscript, we have used Pearson and Spearman correlation coefficients to measure the strength and direction of the relationship between the two biological replicates and shown that the two replicates have high reproducibility (new Figure S1A). On d0, the Xist coverage on autosomes and the X chromosome is low, but there is a clear increase on d4, d7, and d14, particularly at the TSS of autosomal genes, as shown by the metagene plots in Figure 1A-B and the CHART density maps in new Figure 1E-F. We also show relative depletion of Xist signals in the male and sense negative controls.

      Upon inspecting genome browser tracks of all regions reported in the manuscript (Rbm14, Srp9, Brf1, Cand2, Thra, Kmt2c, Kmt2e, Stau2, and Bcl7b), the signal is unspecific on all sites with the possible exception of Kmt2e. On all other loci, there is either a strong signal in the 0h ESC controls or more signal in some of the sense controls. This implies that peak calling is picking up false positive regions. How many peaks would have been picked up if the sense or the 0h controls were used for peak calling? It is likely that there would be a lot since there are also possible "peaks" (e.g., Fzd9) in control tracks. 

      The analysis cannot be performed by visual inspection. A statistical analysis must be performed to call signal above noise. This is why we performed peak-calling on two biological replicates and identified overlapping peaks using bedtools intersect to improve reliability. Significant peaks are noted as black bars under each track. As mentioned above, for our analysis, we focused on the top 100 peaks based on peak scores to ensure robustness. Xist has significantly higher signal compared to the sense probe in the Xist-autosomal peak regions (revised Figure 1E-F). Additionally, we conducted peak calling on undifferentiated ES cells (d0) and detected a significantly higher number of peaks (~600) compared to the differentiated states (d4 or d7) (~100).

      Single-cell sequencing studies have shown that about 2% of undifferentiated mESCs express detectable Xist (Pacini et al., Nat Commun, 2021). The Xist peaks in “day 0” cells may be due to the differentiating population.

      Further inspection of the data was not possible as the authors did not provide access to the raw fastq files. When inspecting results from past published experiments {Engreitz, 2013 #1839} reported regions were not bound by Xist. 

      On the contrary, we deposited the raw data files to GEO prior to the submission of the paper and included the reviewer link to access them. As of August 24, 2024, GEO publicly released these files, allowing for full inspection of the data. 

      Regarding the Engreitz publication, it is not recommended to compare our current study to their analysis for the crucial reason that the Engreitz study was not conducted under physiological conditions. The authors overexpressed the Xist gene in male ES cells. Because Xist RNA can silence genes in male cells as well, this ectopic overexpression normally leads to cell death — thus forcing examination of effects in a narrow time window before Xist can fully spread and act across the genome. Comparing our experiments (endogenous Xist expression in female ES cells) to the ectopic overexpression in male ES cells of Engreitz et al. should therefore not be undertaken.

      Thirdly, contrary to the authors' claim, deleting the B repeat does not lead to a loss of autosomal signal. Indeed, comparing Fig1A and Fig2B side by side clearly shows no difference in the autosomal signal, likely because the autosomal signal is CHART background. Properly quantifying the signal with separate replicates as well as the sense and 0h controls is vital. Overall current data together with published results indicate that CHART peak calling on autosomes is due to technical noise or artefacts.

      In our revised manuscript, we have included the quantitative results as mentioned above in the main and supplementary figure (new Figure 1E-F, Figure 2E-F, and S3A). The data clearly show an enrichment in the Xist CHART samples in differentiating female ES cells.

      We believe the reviewer may be comparing the original Figure 1A and Figure 2A (not Figure 2B). As mentioned above, the analysis cannot be performed by visual inspection. Please see new Figure 2E and 2F. From these data, it should be clear that deleting RepB causes a decrease in Xist targeting to autosomal loci.

      (2) The RNA-seq analysis is also flawed and precludes strong statements. Firstly, the analysis frequently lacks statistical analysis (Fig3B, FigS2B-C) and is often based on visualizations (Fig 3D-G) without quantifications. Day 4 B-repeat deletion does not lead to a significant change in the expression of genes close to Xist signal (Fig3H, d14 does not fully show). 

      Please see new revised Figure 3B and Figures S2B-C (now revised as Figures S6A and S6B). 

      Secondly, for all transcriptional analysis, it is important to show autosomal non-target genes, which is not always done. 

      In the revised manuscript, we included non-target genes for each analysis (new Figure 4E-F, 5D and 5F, 7C and 7E, S7F, S8).

      Indeed, both males and B repeat deletion will lead to transcriptional changes on autosomes as a secondary effect from different X inactivation status. The control set, if used, is inappropriate as it compares one randomly selected set of ~100 genes. This introduces sampling error and compares different classes of genes. Since Xist signal targets more active genes, it is important to always compare autosomal target genes to all other autosomal genes with similar basal expression patterns.

      Please see new Figure S8. We included 100 randomly selected non-target sites on autosomes for this comparative analysis. For consistency, we applied the same flanking regions (10 kb) in the analysis of both target and non-target genes. We believe that this selection method for non-targets is appropriate for two reasons: first, it allows us to control for Xist binding and non-binding; second, it ensures a similar number of genes in both groups, providing a robust foundation for statistical analysis.

      (3) The ChIP-seq analysis also has some problems. The authors claim that there is no positive correlation between genes close to Xist autosomal binding (10kb) compared to those 50kb away (Fig 3C, S2D); however, this analysis is based entirely on metagene visualization. Signal within the Xist binding sites should be quantified (not genes close by) and compared to other types of genomic loci and promoters. Focusing on the 50kb group only as controls is misleading.

      We believe the reviewer may have misunderstood our conclusions. As stated in the paper, we observed lower coverage of the histone marks H3K27me3 and H2AK119ub, associated with PRC2 and PRC1, respectively. Our conclusions regarding PRC1/2 support the RNA-seq results, indicating that Xist tends to bind to actively expressed genes. In other words, these genes exhibit lower levels of PRC-mediated silencing signals. This observation underscores the relationship between Xist binding and gene activity, highlighting that Xist preferentially associates with regions that are less subject to silencing by polycomb repressive complexes.

      Secondly, the authors only look at PRC mark signal upon differentiation; what about the 0h timepoint, i.e., is there pre-marking? 

      Day 0 is not an appropriate timepoint for this analysis because Xist is not yet induced. There is also a small fraction of cells (<5%) that spontaneously differentiate and start to undergo XCI. For these reasons, the day 0 timepoint is considered somewhat heterogeneous and it would be difficult to make conclusions regarding Xist peaks in these samples.

      Most worryingly, the data analysis is not consistent between figures (see Fig3C vs 5H-I). In Fig5, the group of Xist targets was chosen as those within 100kb of Xist binding, which would encompass all the control regions from Fig3C. In this analysis, the authors report that there is Xist-dependent H3K27me3 deposition, and in fact, here the Xist autosomal targets have more of it than the controls. Overall, all of this analysis is misleading, and clear conclusions cannot be made.

      We believe that the reviewer may have also misunderstood the analysis in Figure 5. Figure 5 shows the effect of the Xist inhibitor, X1, on H3K27me3 and gene expression. X1 reduces PRC2 targeting and gene silencing, consistent with X1’s effect on RepA as published in Aguilar et al. 2022.

      All in all, because the fundamental observation is not robust (see point 1), all subsequent analyses are also affected. There are also multiple other inconsistencies within the analysis; however, they have not been included here for brevity.

      We again respectfully disagree with Rev1 but thank the reviewer for making suggestions that helped to strengthen our manuscript.  We believe that the revised manuscript with new analyses is improved in part because of the reviewer’s critical comments.

      Reviewer #2 (Public review):

      Summary:

      To follow-up on recent reports of Xist-autosome interaction the authors examine female (and male transgenic) mESCs and MEFs by CHARTseq. Upon finding that only 10% of reads map to X, they sought to identify reproducible alternative sites of Xist-binding, and identify ~100 autosomal Xist-binding sites and show a transient impact on expression.

      Strengths:

      The authors address a topical and interesting question with a series of models including developmental timepoints and utilize unbiased approaches (CHARTseq, RNAseq). For the CHARTseq they have controls of both sense probes and male cells; and indeed do detect considerable background with their controls. The use of deletions emphasizes that intact functional Xist is involved. The use of 'metagene' plots provides a visual summation of genic impact.

      Reviewer 2 has made some excellent suggestions. We have revised the manuscript accordingly and are grateful to the reviewer for the recommendations.

      Weaknesses:

      Overall, the result presentation has many 'sample' gene presentations (in contrast to the stronger 'metagene' summation of all genes). The manuscript often relies on discussion of prior X chromosomal studies, while the data generated would allow assessment of the X within this study to confirm concordance with prior results using the current methodology/cell lines. 

      Many of the 'follow-up' analyses are in fact reprocessing and comparison of published datasets. The figure legends are limited, and sample size and/or source of control is not always clear. While similar numbers of autosomal Xist-binding sites were often observed, the presented data did not clarify how many were consistent across time-points/cell types. While there were multiple time points/lines assessed, only 2 replicates were generally done.

      We apologize for the deficiencies in the legend.  The revised manuscript has corrected them.

      We generated many new datasets with deep sequencing, with at least two biological replicates for each. Such experiments are extremely expensive by nature. Thus, two biological replicates are typically considered acceptable.

      Additionally, we performed reanalysis of published datasets to test whether, in the hands of other investigators, cell lines expressing Xist also supported autosomal targeting. Figure 4 is a case in point. Here we examined Tg1 and Tg2, which respond to doxycycline to overexpress Xist from an ectopic site. Transcriptomic analysis showed significant downregulation of autosomal Xist targets, as exemplified by Rbm14 and Bcl7b (new Figure 4C, S9B). In contrast, non-targets of Xist such as Stau1 did not demonstrate significant changes in gene expression (new Figure 4E and 4G). Looking across all autosomal target genes, we observed a significant decrease in mean expression in the Xist overexpressing cell lines (new Figure 4D). The fact that the autosomal changes were also observed in datasets generated by other investigators greatly strengthens our conclusions.

      Aim achievement:

      The authors do identify autosomal sites with enrichment of chromatin marks and evidence of silencing. More details regarding sample size and controls (both treatment, and most importantly choice of 'non-targets' - discussed in comments to authors) are required to determine if the results support the conclusions.

      Specific scenarios for which I am concerned about the strength of evidence underlying the conclusion:

      I found the conclusion "Thus, RepB is required not only for Xist to localize to the X-chromosome but also for its localization to the ~100 autosomal genes" (p5) in contrast to the statement 2 lines prior: "A similar number of Xist peaks across autosomes in ΔRepB cells was observed and the autosomal targets remained similar". Some quantitative statistics would assist in determining impact, both on autosomes and also X; perhaps similar to the quintile analysis done for expression.

      We have added the Xist coverage panel for day 4 and 7 in the identified Xist-autosomal peak regions (new Figure 1E-F, Figure 2E-F), as mentioned above. The results clearly demonstrate that the deletion of RepB decreases Xist binding to autosomes. Also, we showed that ΔRepB increased X-linked gene expression in our revised Figure 3D.

      It is stated that there is a significant suppression of X-linked genes with the autosomal transgenes; however, only an example is shown in Figure 4B. To support this statement, a full X chromosomal geneset should be shown in panels F and G, which should also list the number of replicates. 

      Please see new Figure 4B.

      As these are hybrid cells, perhaps allelic suppression could be monitored? Is Med14 usually subject to X inactivation in the Ctrl cells, and is the expression reduced from both X chromosomes or preferentially the active (or inactive) X chromosome?

      If Rev2 is referring to Figure 4, the dataset used in Figure 4 comes from another research group and was previously published (Loda, A. et al. Nat Commun, 2017).

      If Rev2 is referring to our ES cells, they are N2 cell lines.  The X chromosomes are fully hybridized (Cas/Mus), but the autosomes are not fully hybridized (Ogawa et al., Science, 2008). Med14 is subject to XCI and is expressed from the Xa, silenced on the Xi. 

      The expression change for autosomes after transgene induction is barely significant; and it was not clear what was used as the Ctrl? This is a critical comparator as doxycycline alone can change expression patterns.

      We agree that there was a modest change in expression after transgene induction, but it is a significant change. Again, the dataset is from a published study where the authors generated doxycycline-responsive Xist transgenes (see above). The control in this case is Dox-treated wildtype cells. We now clarify these points.

      In the discussion there is the statement. "Genetic analysis coupled to transcriptomic analysis showed that Xist down-regulates the target autosomal genes without silencing them. This effect leads to clear sex difference - where female cells express the ~100 or so autosomal genes at a lower level than male cells (Figure 7H)." This sweeping statement fails to include that in MEFs there is no significant expression difference, in transgenics only borderline significance, and at d14 no significant expression difference. The down-regulation overall seems to be transient during development while targeting is ongoing?

      Indeed, the Xist effects on autosomes seem to occur during cell differentiation in ES cells. While there is no apparent effect in MEFs, we cannot exclude effects on other somatic cells. Regardless of whether the effects are in early development or throughout life, the sex differences may have life-long effects in mammals. The study conducted in human cells by the Plath lab also concluded that the differences primarily affect stem cells.

      Finally, I would have liked to see discussion of the consistency of the identified genes to support the conclusion that the autosomal sites are not merely the results of Xist diffusion.

      We address this in the third paragraph of the Discussion. Our main argument is that if autosomal binding were caused by diffusion, then RepB deletion or X1 treatment would have led to increased binding at autosomal sites, as Xist would bind less to the X chromosome. However, as demonstrated in our study, both treatments resulted in reduced Xist binding on both the X chromosome and autosomes. This finding suggests that the binding is specific and reliant on Xist's RepA and RepB domains, rather than being a passive diffusion process.

      To examine overlap between the conditions (days of differentiation and WT/RepB cells), we generated Venn Diagrams as now shown in Figure S4E.

      The impact of Xist on autosomes is important for consideration of impact of changes in Xist expression with disease (notably cancers). Knowing the targets (if consistent) would enable assessment of such impact.

      We thank Rev2 for the very helpful review and for the forward-looking experiments. Indeed, the physiological changes brought on by autosomal targeting will be of future interest.

      Reviewer #3 (Public review):

      Summary:

      Yao et al use CHART to identify chromatin associated with Xist in female mouse ESCs, and, as control, male ESCs at various timepoints of differentiation. Besides binding of Xist to X chromosome regions they found significant binding to autosomes, concentrating mostly on promoter regions of around 100 autosomal genes, as elucidated by MACS. The authors went on to show that the RepB repeat is mostly responsible for these autosomal interactions using a female ESC line in which RepB is deleted. Evidence is provided that Xist interacts with active autosomal genes containing lower coverage of repressive marks H3K27me3 and H2AK119ub and that RepB dependent Xist binding leads to dampening of expression, but not silencing of autosomal genes. These results were confirmed by overexpression studies using transgenic ESCs with doxycycline-inducible Xist as well as via a small molecule inhibitor of Xist (X1), inducing/inhibiting the dampening of autosomal genes, respectively. Finally, using MEFs and Xist mutants RepB or RepE the authors provide evidence that Xist is bound to autosomal genes in cells after the XCI process but appears not to affect gene expression. The data presented appear generally clear and consistent and indicate some differences between human and mouse autosomal regulation by Xist. Thus, these results are timely and should be published.

      We thank Rev3 for the positive remarks and great suggestions.  We have amended the manuscript per below. 

      Strengths:

      Regulation of autosomal gene expression by Xist is a "big deal" as misregulation of this lncRNA causes developmental defects and human disease. Moreover, this finding may explain sex-specific developmental differences between the sexes. The results in this manuscript identify specific mouse autosomal genes bound by Xist and decipher critical Xist regions that mediate this binding and gene dampening. The methods used in this study are appropriate, and the overall data presented appear convincing and are consistent, indicating some differences between human and mouse autosomal regulation by Xist.

      Weaknesses:

      (1) The figure legends and/or descriptions of data are often very short lacking detail, and this unnecessarily impedes the reading of the manuscript, in particular the figures would benefit not only from more detailed descriptions/explanations of what has been done but also what is shown. 

      We have included more detailed descriptions in the figure legends and throughout the manuscript.

      This will facilitate the reading and overall comprehension by the reader. One out of many examples: In Fig S1B in the CHART data at d4 and d7 there is not only signal in female WT Xist antisense but also in female sense control. For a reader that is not an expert in XCI it would be helpful to point out in the legend that this signal corresponds to the lncRNA Tsix (I suppose), that is transcribed on the other strand.

      We thank the reviewer for this excellent point.  We have amended the Results section accordingly.

      (2) Different scales are used in the lower panels of Figures 1A and 2A, which makes it difficult to directly compare signals between the different differentiation stages.

      We have included a figure combining all timepoints — d0, d4, d7, and d14 WT female Xist CHART signals  — on the X chromosome and autosomes to support our thesis. Please see new Figure 1B.

      (3) In this study some of the findings on mouse cells contrast previously published results in human ESCs: 1) Xist binding occurs preferentially to promoters in mice, not in human. 2) Binding of Xist is mostly detected in polycomb-depleted regions in mice but there is a positive correlation between Xist RNA and PRC2 marks in human ESCs. These differences are surprising but may be very interesting and relevant. While I am aware that this might be a difficult task, it would be helpful to experimentally address this issue in order to distinguish whether species specific and/or methodological differences between the studies are responsible for these differences.

      Indeed, our findings in mouse cells contrast with those observed in humans. As discussed in the manuscript, this discrepancy may be attributed to factors such as cell type, differentiation methods, and the Xist pull-down technique employed (our CHART method utilizes a 20 nt oligo library, whereas RAP uses long oligos). We agree that future work should investigate the underlying causes of these differences between mouse and human systems.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      For Figure 2: labelling ∆B on the panel A timeline (e.g. d0-∆B) would make the results clearer for the audience. Panel B makes most sense beside panel E of Figure 1, so combine here and skip in Figure 1?

      We have modified Figure 2A and thank Rev2 for this suggestion. As for the embedded tables: since we performed peak calling for WT and ∆B separately, we believe that showing both the peak numbers and their corresponding peak patterns provides a clearer representation of the data.

      I agree that at day 7 there appears to be a difference in X; but by day 14 this looks much more minimal - is it just time-shifted rather than altered? Perhaps this could be discussed. Autosomal binding sites show no change in number.

      Day 7 exhibits the strongest Xist binding on the X chromosome, consistent with the de novo establishment phase of XCI when Xist is expressed at the highest levels (300 copies/cell during de novo XCI versus ~100 copies/cell during maintenance; Sunwoo et al., 2015, as cited). Per our RNA-seq analysis here, we also observed the highest Xist expression on day 7 and reduced levels on day 14 (Fig. S5A). This expression difference explains the reduced Xist CHART levels on day 14 compared to day 7.

      While the X has previously been examined, it would seem beneficial to conduct the same expression analyses (Figure 3) for the X (perhaps supplemental), as the authors have the data 'in hand'. I feel comparison to X in the main figure for panels A and B would fit, while a similar analysis for the X for panel C could be supplemental, presumably supporting the published data to which this data is currently compared. 

      This is a good suggestion. Please find the new data in Figures 2E-F and 3D, which demonstrate that the RepB deletion inhibits Xist binding on the X chromosome, resulting in increased X-linked gene expression, as previously mentioned. Since Xist binds across the X chromosome, we did not perform peak calling as we did for the autosomes. Therefore, applying a similar analysis as in Figures 3A-B may not be appropriate in this case.

      Such a direct comparison to X-data from the same study would be important. For panel H: How many replicates (2)? This should be in the legend. What is the change in median expression? Again, a supplemental figure showing impact on X-linked targets would be useful. Do male and female ESCs show an expression difference prior to differentiation (ie d0)? The data underlying this Figure should be in one of the supplementary tables, showing the full statistical tests and average change. The supplementary tables 8-12 list the WT target genes, not expression differences with the deletion. Again, given that the difference appears transient, might the ∆B cells be altered in rate of differentiation?

      Panel H (revised Figure 3G) includes two replicates, and this has been added to the legends. We have provided a figure demonstrating that RepB deletion increases the expression levels of X-linked genes on days 4, 7, and 14 (revised Figure 3D). Male and female ESCs show differences in the expression of X-linked genes, as both X chromosomes are active in females at this stage prior to differentiation (revised Figure S5C).

      A supplementary table with statistical tests and average change information has been included in our revised version (Table S11).

      On the other hand, these Xist-autosomal target genes displayed no significant differences between WT male, female, or ∆B female cells on day 0 — prior to onset of XCI and Xist expression. Please see new Figure 3H. 

      As for whether ∆B cells are altered in their rate of differentiation, the analysis by Colognori et al. 2019 indicates that ∆B cells differentiate similarly to WT cells. (In Figure 6 of Colognori et al. 2019, autosomal genes were expressed similarly in WT and ∆B cells, whereas XCI was affected only in ∆B cells.)

      We have also modified the legends for our supplementary tables.

      Why were the transgene lines examined upon neuronal differentiation rather than the same approach as in Figures 1-3? I would have thought neuronal differentiation might be more similar to d14, where limited changes remain? Could the authors clarify and discuss?

      We apologize for the confusion. The Tg lines in Figure 4 came from a previously published study. We performed reanalysis of published datasets because we wanted to test whether, in the hands of other investigators, cell lines expressing Xist also supported autosomal targeting. Here we examined Tg1 and Tg2, which respond to doxycycline to overexpress Xist from an ectopic site. Transcriptomic analysis showed significant downregulation of autosomal Xist targets, as exemplified by Bcl7b and Rbm14 (Figure 4C and S9B). In contrast, non-targets of Xist such as Stau1 did not demonstrate significant changes in gene expression (Figure 4E and 4F). Looking across all autosomal target genes, we observed a significant decrease in mean expression in the Xist overexpressing cell lines (Figure 4D). The fact that the autosomal changes were also observed in datasets generated by other investigators greatly strengthens our conclusions. We have clarified this in the Results section.

      Figure 5 - the legend should specify the number of replicates and clarify the blue/green (intuitive, but not specified). Are the 'target' / 'non-target' genes from d4 Chart (but the RNA from d5)? How are 'non-targets' defined - do they match the 'targets' in certain criteria (expression level, chromatin features, GC content)? Do they change per differentiation protocol?

      We have modified the legends to clarify that the 'target' and 'non-target' genes are derived from the day 4 CHART-seq data, while the RNA data is from day 5, as that study sequenced day 5 and not day 4. Non-targets were randomly chosen based on (i) the absence of Xist binding and (ii) similar expression levels. Please see revised Figure S8.
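
      To make the two criteria above concrete (no Xist binding, similar expression), here is one possible way to build an expression-matched control set. It is a deterministic nearest-neighbour sketch rather than the random selection described in the response, and the gene names and expression values are hypothetical.

```python
from typing import Dict, Set


def matched_non_targets(expression: Dict[str, float],
                        targets: Set[str]) -> Dict[str, str]:
    """Pair each target gene with the unbound gene closest in expression.

    Each control gene is used at most once, so the control set ends up
    the same size as the target set whenever enough unbound genes exist.
    """
    # Candidate controls: every gene that is not a Xist target.
    pool = {gene: value for gene, value in expression.items()
            if gene not in targets}
    pairs: Dict[str, str] = {}
    for target in sorted(targets):
        if not pool:
            break
        # Closest expression wins; gene name breaks ties reproducibly.
        best = min(pool, key=lambda g: (abs(pool[g] - expression[target]), g))
        pairs[target] = best
        del pool[best]
    return pairs


# Hypothetical expression values (arbitrary units).
expression = {"targetA": 10.0, "targetB": 50.0,
              "gene1": 9.0, "gene2": 48.0, "gene3": 200.0}
pairs = matched_non_targets(expression, {"targetA", "targetB"})
```

      Matching one control per target keeps the two groups the same size and gives them similar basal expression, which addresses the concern that Xist preferentially binds more active genes.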

      It would be helpful to compare Xist expression levels across the various models, and the MEF model could be better described - are they polyploid as often happens?

      We have included the Xist expression levels of ES cells and MEF cells in the revised version (revised Figure S5A, 6D). The transformed MEFs are indeed tetraploid, as is typical.

      For 6A to be informative, one needs to know % mapping to X in ES timeline, which is in supplemental, so perhaps 6A should also be supplemental?

      We have moved 6A to the supplemental figure.

      It is odd that ∆B seems to have had more impact in MEFs, and I would like more discussion - but I also think I am missing something: "We observed that Xist signals were more substantially reduced on both the Xi and autosomal regions in ΔRepE MEFs compared to ΔRepB cells", yet in lower panel 6 G it looks like ∆B is LOWER than ∆E? Am I misinterpreting?

      We apologize for the confusing writing.  The revised text now reads:  “To investigate, we utilized a deletion of Xist’s Repeat E (∆RepE), which was previously demonstrated to severely abrogate localization of Xist to the Xi 41,42. We reasoned that the severe loss of Xist binding might unmask a transcriptomic difference. As expected, we observed that Xist signals were somewhat more reduced on the Xi in ΔRepE MEFs compared to ΔRepB cells (Figure 6E-6F). Despite this reduction, peak coverages in autosomal target genes did not increase in ΔRepE MEFs (Figure 6E-6F). However, there was an overall decrease in the number of significant autosomal peaks in ∆RepE MEFs relative to WT cells (Figure 6A). Regardless, we observed no significant transcriptomic differences in ∆RepE MEFs relative to WT MEFs (Figure 7A-7E). Additionally, further examination of RNA sequencing data from male and female MEF cells in two published studies 43,44 corroborated that the expression levels of these autosomal Xist targets did not exhibit significant changes (Figure 7F and 7G). Altogether, the analysis in MEFs demonstrates that Xist continues to bind autosomal genes in post-XCI somatic cells. However, autosomal binding of Xist in post-XCI cells does not overtly impact expression of the associated autosomal genes. Nonetheless, we cannot exclude more subtle changes that do not meet the significance cut-off.”

      Overall, I would like to see how consistent these autosomal peaks are - I shudder to suggest Venn diagrams, but something to show whether there are day/lineage specific peaks and/or ∆repeat B/E resistant peaks. 

      We now present Venn diagrams comparing MEF, ES_d4, and ES_d7, showing approximately 50% overlap between MEF and ES cells (revised Figure S10B). This may be expected, as each timepoint is a different developmental stage of XCI, with expected gene expression differences.

      Very minor comments:

      It would be easier if the supplemental tables were tabs in 1 file!

      We will defer to the editor on how best to format the supplemental tables.

      Similar to the text, could gene names be included in the supplemental?

      We have provided gene names in the supplemental files.

      Figure 3 legend: should 'representing' be representative?

      We have modified it.

      "Xist patterns identified in human cells" p 5; it is challenging to follow human versus mouse, so specify or ensure correct use of XIST/Xist.

      Indeed, we edited the manuscript accordingly.

      Gene names should be italicized.

      We have italicized gene names in our manuscript.

      Ref. 38 lacks details (...).

      We have updated the reference.

      Peak-like characters - perhaps characteristics? P8

      We have modified this.

      Reviewer #3 (Recommendations for the authors):

      On page 6, the 6th sentence in the first paragraph needs correction. "Consistent with Xist's behavior on the X chromosome."

      We have modified the sentence. Thank you.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Longhurst et al. investigates the mechanisms of chemoresistance and chemosensitivity towards three compounds that inhibit cell cycle progression: camptothecin, colchicine, and palbociclib. Genome-wide genetic screens were conducted using the HAP1 Cas9 cell line, revealing compound-specific and shared pathways of resistance and sensitivity. The researchers then focused on novel mechanisms that confer resistance to palbociclib, identifying PRC2.1. Genetic and pharmacological disruption of PRC2.1 function, but not related PRC2.2, leads to resistance to palbociclib. The researchers then show that disruption of PRC2.1 function (for example, by MTF2 deletion), results in locus-specific changes in H3K27 methylation and increases in D-type cyclin expression. It is suggested that increased expression of D-type cyclins results in palbociclib resistance.

      Strengths:

      The results of this study are interesting and contribute insights into the molecular mechanisms of CDK4/6 inhibitors. Importantly, while CDK4/6 inhibitors are effective in the clinic, tumour recurrence is very high due to acquired resistance.

      Weaknesses:

      A key resistance mechanism is Rb loss, so it is important to understand if resistance conferred by PRC2.1 loss is mediated by Rb, and whether restoration of PRC2.1 function in Rb-deplete cells results in renewed palbociclib sensitivity. It is also important to understand the clinical implications of the results presented. The inclusion of these data would significantly improve the paper. However, besides some presentation issues and typos as described below, it is my opinion that the results are robust and of broad interest.

      Major questions:

      (1) Is the resistance to CDK4/6 inhibition conferred by mutation of MTF2 mediated by Rb?

      (2) Are mutations in PRC2.1 found in genetic analyses of tumour samples in patients with acquired resistance?

      We thank the reviewer for their editing and experimental suggestions, and have integrated their responses into our re-submitted manuscript.

      We also agree that understanding the role of RB1 in mediating palbociclib resistance in the proposed resistance mechanism is of particular interest. However, as there are three RB proteins expressed in human cells, this is a technically difficult question to probe genetically. Despite this technical challenge, we have provided multiple lines of evidence in our resubmitted manuscript that the resistance to palbociclib observed in our PRC2.1-deficient cells is mediated through the canonical CDK4/6-RB1 pathway. First, disruption of RB1 in HAP1 cells results in palbociclib resistance to a level comparable to that of PRC2.1 disruption (Fig. 4E). Second, inactivation of SUZ12 or MTF2 increases the number of cells entering S-phase in palbociclib treatment (Fig. 4G) with no increase in basal rates of apoptosis (Fig. S2D), suggesting that any proliferation advantage observed in PRC2.1-defective cells is due to resistance to palbociclib-induced cell cycle arrest. Third, we show that overexpression of CCND1 and CCND2 is sufficient to drive resistance to palbociclib in wild-type HAP1 cells (Fig. S5F). And finally, the increased levels of CCND1 and CCND2 observed in cells lacking PRC2.1 activity result in higher CDK4/6 activity as measured by RB1 phosphorylation, despite palbociclib blockade (Fig. 6F). All these lines of evidence strongly suggest that MTF2-containing PRC2.1 regulates G1 progression through the canonical CDK4/6-RB1 pathway by repressing CCND1 and CCND2 expression.

      Whether or not MTF2 deletion leads to palbociclib resistance in clinical samples is also a question of particular interest. Currently, we are unaware of any reports that specifically mention MTF2 deletion as leading to palbociclib resistance, and we were unable to find another example in our own cancer database review. However, we have included references to other examples of MTF2 mutation resulting in chemotherapeutic resistance in our discussion. Additionally, although MTF2 is rarely observed to be mutated in cancers (Ngubo et al. 2023), it is highly differentially expressed, and investigating decreased MTF2 transcription in palbociclib-resistant tumors, though challenging, might prove fruitful. However, as mechanisms of palbociclib resistance are an area of active investigation, we speculate that future studies might uncover additional examples of MTF2 mediating resistance to this clinically important chemotherapeutic.

      Reviewer #2 (Public Review):

      Summary:

      Longhurst et al. assessed cell cycle regulators using a chemogenetic CRISPR-Cas9 screen in the haploid human cell line HAP1. Besides known cell cycle regulators they identified the PRC2.1 subcomplex to be specifically involved in G1 progression, given that the absence of members of the complex makes the cells resistant to Palbociclib. They further showed that in HAP1 cells the PRC2.1, but not the PRC2.2 complex is important to repress the cyclins CCND1 and CCND2. This can explain the enhanced resistance to Palbociclib, a CDK4/6 inhibitor, after PRC2.1 deletion.

      Strengths:

      The initial CRISPR screen is very interesting because it uses three distinct chemicals that disturb the cell cycle at various stages. This screen mostly identified known cell cycle regulators, which demonstrates the validity of the approach. The results can be used as a resource for future research.

      The most interesting outcome of the experiment is the finding that knockouts of the PRC2.1 complex make the cell resistant to Palbociclib. In a further experiment, the authors focused on MTF2 and JARID2 as the main components of PRC2.1 and PRC2.2, respectively. Via extensive analyses, including genome-wide experiments, they confirmed that MTF2 is particularly important to repress the cyclins CCND1 and CCND2. The absence of MTF2 therefore leads to increased expression of these genes, sufficient to make the cell resistant to Palbociclib. This result will likely be of wide interest to the community.

      Weaknesses:

      The main weakness of the manuscript is that the experiments were performed in only one cell line. To draw more general conclusions, it would be essential to confirm some of the results in other cell lines.

      In addition, some of the findings, such as the results from the CRISPR screen as well as the stronger impact of the MTF2 KO on H3K27me3 and gene expression (compared to JARID2 KO), are not unexpected, given that similar results were already obtained before by other labs.

      We thank the reviewer for their suggestions and we believe that we have addressed their main concern about the generality of the MTF2 regulation of D-type cyclin expression in our resubmitted manuscript. We have now shown through shRNA knockdown that MTF2 represses CCND1 in two additional cell lines, the breast cancer MDA-MB-231 and immortalized monkey COS7 cell line (Fig. 6E). However, it is important to note that MTF2 did not control CCND1 expression in every cell line tested (Fig. 6D), underscoring the context-dependent nature of this regulation. Future studies will illuminate the cell or tumor types in which this regulation is observed.

      Additionally, while MTF2 has previously been shown to exert a greater effect on H3K27me3 levels in some circumstances (Loh et al. 2021, Rothberg et al. 2018), a number of notable reports in ES cell lines have concluded that PRC2 localization and H3K27me3 at the majority of genomic sites are dependent on both PRC2.1 and PRC2.2 activity (Healy et al. 2019, Højfeldt et al. 2019, Perino et al. 2020, Oksuz et al. 2018). Therefore, we think it is important to highlight the greater dependence on MTF2 for promoter proximal H3K27me3 levels in our transformed cell line context.  

      Reviewer #3 (Public Review):

      This study begins with a chemogenetic screen to discover previously unrecognized regulators of the cell cycle. Using a CRISPR-Cas9 library in HAP1 cells and an assay that scores cell fitness, the authors identify genes that sensitize or desensitize cells to the presence of palbociclib, colchicine, and camptothecin. These three drugs inhibit proliferation through different mechanisms, and with each treatment, expected and unexpected pathways were found to affect drug sensitivity. The authors focus the rest of the experiments and analysis on the polycomb complex PRC2, as the deletion of several of its subunits in the screen conferred palbociclib resistance. The authors find that PRC2, specifically a complex dependent on the MTF2 subunit, methylates histone 3 lysine 27 (H3K27) in promoters of genes associated with various processes including cell-cycle control. Further experiments demonstrate that Cyclin D expression increases upon loss of PRC2 subunits, providing a potential mechanism for palbociclib resistance.

      The strengths of the paper are the design and execution of the chemogenetic screen, which provides a wealth of potentially useful information. The data convincingly demonstrate in the HAP1 cell line that the MTF2-PRC2 complex sustains the effects of palbociclib (Figure 4), methylates H3K27 in CpG-rich promoters (Figure 5), and represses Cyclin D expression (Figure 6). These results could be of great interest to those studying cell-cycle control, resistance mechanisms to therapeutic cell-cycle inhibitors, and chromatin regulation and gene expression.

      There are several weaknesses that limit the overall quality and potential impact of the study. First, none of the results from the colchicine and camptothecin screens (Figures 1 and 2) are experimentally validated, which lessens the rigor of those data and conclusions. Second, all experiments validating and further exploring results from the palbociclib screen are restricted to the Hap1 cell line, so the reproducibility and generality of the results are not established. While it is reasonable to perform the initial screen to generate hypotheses in the Hap1 line, other cancer and non-transformed lines should be used to test further the validity of conclusions from data in Figures 4-6. Third, conclusions drawn from data in Figures 3D and 4D are not fully supported by the experimental design or results. Finally, there have been other similar chemogenetic screens performed with palbociclib, most notably the study described by Chaikovsky et al. (PMID: 33854239). Results here should be compared and contrasted to other similar studies.

      We thank the reviewer for their suggestions regarding our manuscript. While the genes recovered as mediating cellular responses to camptothecin and colchicine were never confirmed following our chemogenetic screens, we felt our primary findings were in the area of palbociclib resistance and decided to focus our follow-up investigations on those genes. We included the results of the camptothecin and colchicine chemogenetic screens as confirmation of the specificity of PRC2 mutation resulting in resistance to palbociclib (Fig. 4C) and for others in the community to use as a resource for future investigations. We have also clarified our results for Figure 3D and 4D in our revised manuscript, as well as included additional plots of these results (Fig. S1D-S1F). And, with our resubmitted manuscript, we believe we have addressed their concern of the generality of our results by demonstrating our primary finding that MTF2 regulates D-type cyclins in additional cell lines other than HAP1. We feel these results indicate that while not “general”, there are additional cellular contexts in which our main result holds true. In line with this, and to address how our chemogenetic screens fit into the landscape of previous studies, including Chaikovsky et al., we have added the following lines to our discussion: “Additionally, other chemogenetic screens utilizing palbociclib and have not identified that inactivation of PRC2 components as either enhancing or reducing palbociclib-induced proliferation defects, suggesting that PRC2 mutation is neutral in the cell lines studied. These observations not only underscore the context-dependent ramifications of mutation of these PRC2 complex members, but also may help inform the context in which CDK4/6 inhibitors are most efficacious.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) "We found that only thirteen and twenty genes resulted in sensitivity or resistance, respectively, in every conditions tested and were deemed non-specific and excluded from any further analysis (see Table S2)." It's unclear to me why these genes were deemed 'nonspecific'. Are these genes functionally important for the general exclusion of xenobiotic molecules?

      By this, we simply meant that these effects were not specific to one condition. Such genes could affect drug half-life or a general stress response, but are less likely to have functions directly tied to the pathway targeted by a drug than are genes whose loss affects only one condition.  

      (2) "Given that increased CCND1 levels is sufficient to drive increased CDK4/6 kinase activity, upregulation of these D-type cyclins is likely to be a significant contributor to the palbociclib resistance in MTF2∆ cells." It's unclear to me what is the basis for this statement. This is only true if there is free CDK4/6. If CDK4/6 is already fully occupied by D-type cyclins, then increased CCND1 levels would not be expected to have an effect. 

      While we anticipated that increased levels of CCND1 would result in more CDK4/6-D-type cyclin association, we now demonstrate in the new Figure S5F that there is more CCND1 in complex with CDK6 in both SUZ12∆ and MTF2∆ cell lines. Furthermore, we are able to show in Figure S5G that overexpression of D-type cyclins results in resistance to palbociclib-induced proliferation defects in HAP1 cells.

      (3) The description of the results is very confusing in places, especially regarding "resistance" versus "sensitivity" genes. For example: "CCNE1, CDK6, CDK2, CCND2 and CCND1, all of which are integral to promoting the G1/S phase transition, ranked as the 2nd, 24th, 27th, 29th and 46th most important genes for palbociclib resistance, respectively (Figures 1F and 1G). CCND1 and CCND2 bind either CDK4 or CDK6, the molecular targets of palbociclib, whereas CDK2 and CCNE1 form a related CDK kinase that promotes the G1/S transition.

      Similarly, cells with sgRNAs targeting RB1, whose phosphorylation by CDK4/6 is a critical step in G1 progression, displayed substantial resistance to palbociclib." My reading of this paragraph suggests that disruption of the CDK6 locus is associated with palbociclib resistance - surely this is a typo and instead should have been sensitivity? Please explain.

      We thank the reviewer for pointing this out and have corrected this typo.

      (4) "Sensitivity to palbociclib was enhanced in cells expressing sgRNAs targeting H4 acetylation, positive regulators of Pol II transcription, and regulators of the DNA Damage Response pathway (Figures 3A and 3B), although this sensitivity was much weaker than that seen with DNA damaging agents. This observation is consistent with long-term treatment with palbociclib inducing DNA damage, as has been suggested by a number of recent publications 65,66." This is also consistent with recent work on Cdk7 inhibitors (Wilson et al. Mol Cell 2023), as Cdk7 inhibition is expected to affect both CDK1/2/4/6 activities and Pol II transcription.

      We thank the reviewer for bringing this observation to our attention and we have added this citation to this passage in our manuscript.

      (5) Figure 3D - would it not make sense to plot the data such that palbo concentration is on the x-axis? It is also difficult to interpret since the data are normalized to starting "% proliferation" at the indicated palbo treatment, when it is likely that % proliferation changes significantly with palbo concentration. Indeed, this is the graphing format used for a later figure (Figure 4D). The data with rotenone suggests palbo antagonizes rotenone-mediated reduction in proliferation. But it's unclear to me whether the graph shows the converse - that rotenone treatment modulates palbo-induced cell cycle arrest.

      The reviewer is correct that increasing doses of palbociclib in the absence of oxidative phosphorylation inhibitors do indeed have an effect on proliferation. However, it is helpful to normalize proliferation values to each initial dose of palbociclib and then compare this to the different oxidative phosphorylation inhibitor treatment combinations. To illustrate that the oxidative phosphorylation inhibitors do indeed antagonize palbociclib-induced proliferation defects, we have now included the data graphed as each oxidative phosphorylation inhibitor vs palbociclib as Supplemental Figures S1D-S1F.

      • The highest concentration of GSK126 tested (5µM) does not appear to confer resistance, but perhaps this is due to off-target effects or cytotoxicity?

      We agree with the reviewer that the highest dose of GSK126 does not appear to confer resistance at low doses of palbociclib. However, at higher doses of palbociclib it does appear to have this effect. We have included a statement in our results section to address this reviewer’s observations.

      • Disruption of Emi1 leads to resistance (Figure 1F, FZR1), yet overexpression induces resistance (Mouery et al. bioRxiv 2023). Explain.

      We do not understand why EMI1 responds in this way, and therefore we cannot comment on this in the text. 

      Typos/stylistic comments:

      • Typo "However, the net result of these opposing effects on cell cycle progression, and the contribution of the individual subcomplexes to this regulation, rained unclear."

      We thank the reviewer for pointing this out, and we have corrected it.  

      • Use of the word "growth" - I think the authors should be more precise. Is "proliferation" meant here?

      We thank the reviewer for pointing this out, and we have corrected it.

      • In Figure 4G, two of the panels have 8.42%. Is this correct, or may it be a copy/paste error?

      This was an error, but is no longer relevant as we have reconducted and reanalyzed this experiment.

      Reviewer #2 (Recommendations For The Authors):

      Major Points

      (1) Some of the conclusions should be confirmed in additional cell lines. I would suggest testing the resistance to Palbociclib in several additional cell lines, where MTF2 and JARID2 are deleted. If the conclusion can be generalized, one would expect that the differential role of MTF2 versus JARID2 can be confirmed in more cell lines.

      While the PRC2.1-dependent repression of D-type cyclins does not appear to be general, we have now demonstrated in Figures S5E and 6F that there are multiple different cellular contexts in which our observations are consistent. Specifically, we demonstrate that GSK126 causes upregulation of CCND1 in both immortalized non-tumor cells (COS7 cells) and in the breast cancer cell line MDA-MB-231. Moreover, in both cases we showed that this effect is PRC2.1-dependent, as shRNA knockdown of MTF2 increases expression of CCND1.

      (2) In addition, it may be attractive to make use of publicly available RNA-seq data of MTF2 and JARID2 knockout/down cells, to investigate the generality of the finding that PRC2.1 regulates CCND1 and CCND2.

      While it would be useful to address this issue, Figure S5E demonstrates that the repression of D-type cyclin expression by PRC2.1 is context dependent. Furthermore, prior to identifying the lines shown in Figures 6F and S5E, we were not aware of which lines to focus our investigations on. However, we have now demonstrated a few cellular contexts in which either chemical inhibition of PRC2 or knockdown of MTF2 results in de-repression of CCND1 expression.

      (3) At a bare minimum the authors should strongly discuss the limitations of the study, and tone down the conclusions.

      We would agree with this based upon the data in the originally submitted manuscript; however, now that we have shown that this effect is more general, this is less critical. That said, we do not see this effect in all cell lines, and we have made this apparent in the final version of the manuscript.

      Minor point

      (1) In my view, Figures 1-3 should be shortened to the most essential points, and some data/figures should be moved to the supplementary figures. Especially the STING gene-network graphs are in my view not particularly meaningful.

      While we understand the opinion of this reviewer, we feel that these data will be of significant interest to some readers.  

      (2) Figure 6E and 6F/G appear to be largely redundant. This can perhaps be made more concise.

      This has been addressed in the new version of Figure 6.

      (3) Figure 5D should be enlarged. 

      We thank the reviewer for this suggestion and have enlarged the image.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript could be edited to improve clarity. In several places, the scientific logic motivating an experiment is confusing, and there are several hypotheses and conclusions that seem opposite from what the data are suggesting. Some aspects of the figures were also unclear. Specific examples include the following:

      (1) Last sentence of abstract : "Our results demonstrate a role for PRC2.1, but not PRC2.2, in promoting G1 progression." Data show that knockout of PRC2.1 components promotes G1 progression through upregulation of CycD, so the conclusion here is the opposite.

      We thank the reviewer for catching this error. We have now changed this to “in antagonizing G1 progression”.

      (2) In the second paragraph of the results, CCNE1, CDK2, etc are described as scoring high for palbociclib resistance, but those genes scored as sensitizing. Also, in that paragraph, it is described that a drug is sensitizing cells to loss of a gene, which seems like incorrect logic. It should be clarified that knock-out of a gene either sensitizes or desensitizes cells to the drug.

      We thank the reviewer for catching this error. We have now corrected it.  

      (3) In the motivation for the experiment in Figure 3D, it is written: "we asked whether chemical inhibition of oxidative phosphorylation could rescue sensitivity to palbociclib". Considering that knock-out of genes that mediate oxidative phosphorylation confer resistance to palbociclib, it is confusing why it was expected that chemical inhibitors would restore sensitivity.

      We are sorry if the original wording was confusing. We have now changed this to “combined inhibition of oxidative phosphorylation and CDK4/6 activity mutually rescue the proliferation defect imposed by agents targeting the other process”.  

      (4) If the intention of Figure 3D is to test the hypothesis that chemical inhibition of oxidative phosphorylation modulates sensitivity to palbociclib, the clarity of Figure 3D would be improved if data were shown such that palbociclib concentration is on the x-axis and the different curves are different drug concentrations.

      It appears that there is some mutual suppression, in which inhibition of each process partly rescues cells from inhibition of the other. In fact, with these drugs the stronger of the two effects is the rescue from mitochondrial poisons by palbociclib. We have now discussed this in the text.

      (5) The authors should check the units on the x-axis in Figure 4D, should they be log[uM Palbo] or log [nM Palbo]?

      We thank the reviewer for catching this error. We have now corrected it.

      (6) It should be clarified which data are summarized in the graph to the right in Figure 4G, are these experiments with palbociclib?

      This is currently included in the figure legends.

      (7) The text suggests that the control CCNE1 knockout is shown in Figure 4E, but those data are missing.

      This has been corrected in Figure 4E.

      Several conclusions are not well supported by the data and should be revised or more data and analysis should be added.

      (1) The titular conclusion that the "PRC2.1 Subcomplex Opposes G1 Progression through Regulation of CCND1 and CCND2" has only been demonstrated in the context of a Cdk4/6 inhibitor in HAP1 cells. There is little evidence that this claim is broadly applicable. For example, data in Figure 4G show small and not demonstrably significant differences in G1 and S phase populations in the mock experiments. Also, experiments in other cells are needed to support the rigor and generality of the conclusion.

      Our chemogenetic screen and competitive proliferation assay data in Figures 4A, 4C and 4E support the conclusion that PRC2.1 and PRC2.2 play opposing roles in G1 progression. Furthermore, we have repeated the initial BrdU incorporation experiments shown in Figure 4G and have been able to demonstrate that JARID2∆ cells do indeed display a significant decrease of cells entering into S-phase when treated with palbociclib. Most importantly, in Figures 6D and 6E we show additional cell lines where this is the case. Therefore, we feel that this title is valid in the current version of the manuscript, where we have shown it to be the case in multiple tumor-derived human cell lines as well as immortalized non-human primate cells.

      (2) It is unclear how the data in Figure 3D support the conclusion that the administered inhibitors of oxidative phosphorylation influence response to palbociclib.

      As noted in the response to point 4, we have now discussed this mutual rescue more thoroughly in the text.  

      (3) In Figure 4D, the IC50 values should be calculated and statistical significance based on biological replicates should be determined. Also, the conclusion that "increasing doses of GSK126 withstood palbociclib-induced growth suppression" is overstated, as ultimately all drug conditions succumb to palbociclib suppression of proliferation, although there may be differences in sensitivity.

      We have now included a statistical analysis of each data point in Figure 4D.

      Editorial comments:

      (1) The title does not seem to optimally capture the content of the paper. Please consider changing it, e.g. focusing on palbociclib resistance. 

      While we used this particular drug to make the original observation, we feel it is more general to discuss the underlying biology (cyclin gene control) than the pharmacological methodology. Moreover, we have now extended our findings about the regulation of D-type cyclins by PRC2.1 to several cell lines, derived from both cancers and primary cells, reinforcing the fact that this effect is observed more broadly.

      (2) Please indicate the biological system (haploid human HAP1 cells) in either title or abstract.

      The abstract now indicates that we have observed this in CML, breast cancer and immortalized primary cells.

    2. eLife Assessment

      This valuable study reports a chemogenetic screen for resistance and sensitivity to three cell cycle inhibitors used in the clinic: camptothecin, colchicine, and palbociclib. The screen provides a wealth of information that will be of interest to cell cycle and cancer biologists. Convincing evidence is provided that resistance to palbociclib can result from loss of PRC2.1 activity, which raises cyclin D levels. The effect of PRC2.1 on cyclin D is not universal across tested cell lines with the causal differences not yet understood.

    3. Reviewer #1 (Public review):

      The study by Longhurst et al. investigates the mechanisms of chemoresistance and chemosensitivity towards three compounds that inhibit cell cycle progression: camptothecin, colchicine, and palbociclib. Genome-wide genetic screens were conducted using the HAP1 Cas9 cell line, revealing compound-specific and shared pathways of resistance and sensitivity. The researchers then focused on novel mechanisms that confer resistance to palbociclib, identifying PRC2.1. Genetic and pharmacological disruption of PRC2.1 function, but not related PRC2.2, leads to resistance to palbociclib. The researchers then show that disruption of PRC2.1 function (for example, by MTF2 deletion), results in locus-specific changes in H3K27 methylation and increases in D-type cyclin expression. The study shows that increased expression of D-type cyclins results in palbociclib resistance.

      Strengths:

      The results of this study are interesting, and the study contributes insights into the molecular mechanisms of CDK4/6 inhibitors. Importantly, while CDK4/6 inhibitors are effective in the clinic, tumour recurrence is very high due to acquired resistance.

      Weaknesses:

      A key resistance mechanism is Rb loss, so it is important to understand if resistance conferred by PRC2.1 loss is mediated by Rb, and whether restoration of PRC2.1 function in Rb-deplete cells results in renewed palbociclib sensitivity. It is also important to understand the clinical implications of the results presented. Inclusion of these data would significantly improve the paper. At present, it is unclear if mutations in PRC2.1 are found in genetic analyses of tumour samples in patients with acquired resistance.

    4. Reviewer #2 (Public review):

      Summary:

      Longhurst et al. assessed cell cycle regulators using a chemogenetic CRISPR-Cas9 screen in the haploid human cell line HAP1. Besides known cell cycle regulators they identified the PRC2.1 subcomplex to be specifically involved in G1 progression, given that the absence of members of the complex makes the cells resistant to Palbociclib. They further showed that in HAP1 cells the PRC2.1, but not the PRC2.2 complex is important to repress the cyclins CCND1 and CCND2. This can explain the enhanced resistance to Palbociclib, a CDK4/6-Inhibitor, after PRC2.1 deletion.

      Strengths:

      The initial CRISPR screen is very interesting, because it uses three distinct chemicals that disturb the cell cycle at various stages. This screen mostly identified known cell cycle regulators, which demonstrates the validity of the approach. The results can be used as a resource for future research.

      The most interesting outcome of the experiment is the finding that knockouts of the PRC2.1 complex make the cell resistant to Palbociclib. In further experiments, the authors focused on MTF2 and JARID2 as main components of PRC2.1 and PRC2.2, respectively. Via extensive analyses, including genome-wide experiments, they confirmed that MTF2 is particularly important to repress the cyclins CCND1 and CCND2. Absence of MTF2 therefore leads to increased expression of these genes, sufficient to make the cell resistant to Palbociclib. This result will likely be of wide interest to the community.

      Weaknesses:

      The work is limited to specific biological contexts, and the generality of the conclusions is uncertain.

      Comments on revisions:

      The revision offers new insights and is overall satisfying. I have no further recommendations that I consider essential.

    5. Reviewer #3 (Public review):

      This study begins with a chemogenetic screen to discover previously unrecognized regulators of the cell cycle. Using a CRISPR-Cas9 library in HAP1 cells and an assay that scores cell fitness, the authors identify genes that sensitize or desensitize cells to the presence of palbociclib, colchicine, and camptothecin. The results suggest that these three drugs inhibit proliferation through different mechanisms, and with each treatment, expected and unexpected pathways were found to affect drug sensitivity. The authors focus the rest of the experiments and analysis on the polycomb complex PRC2, as deletion of several of its subunits in the screen conferred palbociclib resistance. The authors find that PRC2, specifically a complex dependent on the MTF2 subunit, methylates histone 3 lysine 27 (H3K27) in promoters of genes associated with various processes including cell-cycle control. Further experiments demonstrate that Cyclin D expression increases upon loss of PRC2 subunits, providing a potential mechanism for palbociclib resistance.

      The strengths of the paper are the design and execution of the chemogenetic screen, which provides a wealth of potentially useful information. The data convincingly demonstrate in the HAP1 cell line that the MTF2-PRC2 complex sustains the effects of palbociclib (Fig. 4), methylates H3K27 in CpG-rich promoters (Fig. 5), and represses Cyclin D expression (Fig. 6). The correlation between MTF2-PRC2 inhibition and increased Cyclin D levels is shown in multiple cell lines using both genetic and chemical approaches. These results could be of great interest to those studying cell-cycle control, resistance mechanisms to therapeutic cell-cycle inhibitors, and chromatin regulation and gene expression.

      There are a few weaknesses that somewhat temper the overall quality and potential impact of the study. First, the results from the colchicine and camptothecin screens (Fig. 1 and 2) are not experimentally validated, which lessens the rigor of those data and conclusions. Second, some experiments validating and further exploring results from the palbociclib screen (Figs. 4 and 5) are restricted to the Hap1 cell line, so the generality of some conclusions is not established. Third, conclusions drawn from data in Fig. 4D are not fully supported by proper use of biological replicates and analysis of the results.

      Comments on revisions:

      Proper statistical analysis considering biological replicates is still not applied to determine whether differences in palbociclib IC50 values at different GSK126 concentrations are significant.

    1. eLife Assessment

      This useful study provides incomplete evidence regarding the pathophysiological role of low estrogen levels post-menopause in hypertension, focusing on L-AABA as a key mediator. The results describe a novel hypothesis for the pathophysiology of hypertension in this population and are of interest to experts in hypertension and vascular biology.

    2. Reviewer #1 (Public review):

      The authors aim to investigate the relationship between low estrogen levels, postmenopausal hypertension, and the potential role of the molecule L-AABA as a biomarker for hypertension. By employing metabolomic analysis and various statistical methods, the study seeks to understand how estrogen deficiency affects blood pressure and identify key metabolites involved in this process, with a particular focus on L-AABA.

      Strengths:

      The study addresses a relevant and understudied area: the role of estrogen and metabolites in postmenopausal hypertension. It presents a novel hypothesis that L-AABA may serve as a protective factor against hypertension, which could have significant clinical implications if proven.

      Weaknesses:

      The evidence linking L-AABA to hypertension is largely correlative, lacking experimental validation or mechanistic proof. Key limitations, such as the inadequacy of the ovariectomy model in replicating human menopause, are acknowledged but not addressed with alternative approaches. In summary, while the study offers an intriguing hypothesis, its conclusions are premature and require further experimental validation and human data to substantiate the claims.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Dr. Yao Li et al. documented the metabolomic profile of the aorta from OVX rats and that from OVX plus E2. These conditions mimic post-menopause hypertension and hormonal replacement therapy.

      Strengths:

      The authors state that this is probably the first study to examine the metabolic changes in the aorta of post-menopause hypertension.

      Weaknesses:

      There are several weaknesses, and a few of them are quite serious.

      (1) The aorta is not a resistant artery and has little to do with hypertension. The authors should have used resistant arteries for this study. The expression of several adrenergic and cholinergic receptors differs between the aorta and resistant arteries. It is unknown whether the aorta metabolomic profile has any relevance to BP and whether it is similar to that of the resistant arteries. I understand the logistics issue of obtaining enough tissue from resistant arteries. At least, once some leads are discovered in the aorta, the authors should validate them in resistant arteries. This should be feasible.

      (2) The aorta and all the arteries have three layers. It is critically important to know whether the metabolic changes occur in the intima or in the media, while the adventitia probably has little to do with vasoconstriction and hypertension. If the authors want to use the aorta to conduct the preliminary study, they should completely remove the adventitia and then use samples with and without their endothelium stripped and then assess their metabolomic profiles. After the leads are obtained from this preliminary profiling, they should be validated in endothelium and smooth muscles of the resistant artery. The current experiments are not appropriately designed.

      (3) The tail-cuff BP measurement is a technique of the last century. The current gold standard for BP measurement is telemetry. The tail-cuff method is particularly problematic in this study because the 1-2 h restraint of the rats for more than 10 BP measurements will cause significant stress in the animals, and their stress hormone secretion might bias the metabolomic profiles in the OVX versus sham-operated rats. The problem can be totally avoided by using telemetry.

      (4) Although the decrease in L-AABA in the OVX rats was highly significant (p-value of 10^-4), the fold change is small (2-3 fold). Such a small change should be validated using a different method to be convincing.

      (5) The authors claim (or hypothesize) that the reduced AABA level in OVX can cause vascular remodeling. This can be easily validated by the histology of the OVX-resistant artery, and they should do that during the revision. The authors should also examine the M1 macrophage function from the OVX mice to validate their claimed link of AABA to M1.

      (6) As mentioned above, the authors need to pinpoint the changes of AABA to target cells, i.e., endothelial cells, SMC, or M1, and then use in vitro or in vivo cell biology approaches to assess whether these cells in the OVX rat indeed have an abnormality in function and, indeed, such functional changes are responsible for the BP phenotype.

      (7) The results of the current study can be condensed into 1 or 2 figures that can serve as a base or a starting point for a deeper scientific study.

      Summary

      The experimental design of this manuscript is inappropriate, and the methods are not up to the current standards. The whole study is descriptive and rudimentary. It lacks validation and mechanism. The data from this manuscript might be of some value and can serve as the first step for more investigation of the mechanism of post-menopause hypertension.

    4. Reviewer #3 (Public review):

      Summary:

      The decrease in estrogen levels is strongly associated with postmenopausal hypertension. Dr. Yao Li and colleagues aimed to investigate the metabolomic mechanisms underlying postmenopausal hypertension using OVX and OVX+E2 rat models. They successfully established a correlation between reduced estrogen levels and the development of hypertension in rats. They identified L-alpha-aminobutyric acid (AABA) as a potential marker for postmenopausal hypertension. The research explored the metabolic alterations in aortic tissues and proposed several potential mechanisms contributing to postmenopausal hypertension.

      Strengths:

      The group performed a comprehensive enrichment analysis and various statistical analyses of the metabolomics data.

      Weaknesses:

      (1) The manuscript is descriptive in nature, although the authors state that their primary objective is to explore the potential mechanisms linking low estrogen levels with postmenopausal hypertension. No mechanistic insights are provided in this study, as the authors acknowledge in the discussion. The connection between E2, AABA, and macrophages needs to be validated in endothelial cells, vascular smooth muscle cells, and other aortic tissue cells. Without such verification, the manuscript predominantly raises hypotheses based only on metabolomic data.

      (2) The serum contains three forms of estrogen: Estradiol, Estrone, and Estriol. The authors used the Rat E2 ELISA kit. Ideally, all three forms of estrogen should be measured.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      The authors aim to investigate the relationship between low estrogen levels, postmenopausal hypertension, and the potential role of the molecule L-AABA as a biomarker for hypertension. By employing metabolomic analysis and various statistical methods, the study seeks to understand how estrogen deficiency affects blood pressure and identify key metabolites involved in this process, with a particular focus on L-AABA.

      Strengths:

      The study addresses a relevant and understudied area: the role of estrogen and metabolites in postmenopausal hypertension. It presents a novel hypothesis that L-AABA may serve as a protective factor against hypertension, which could have significant clinical implications if proven.

      We appreciate the acknowledgment of our study’s focus on an important and understudied area. Our hypothesis regarding L-AABA’s role as a possible protective factor against hypertension indeed holds promise for advancing clinical implications.

      Weaknesses:

      The evidence linking L-AABA to hypertension is largely correlative, lacking experimental validation or mechanistic proof. Key limitations, such as the inadequacy of the ovariectomy model in replicating human menopause, are acknowledged but not addressed with alternative approaches. In summary, while the study offers an intriguing hypothesis, its conclusions are premature and require further experimental validation and human data to substantiate the claims.

      We recognize the limitations regarding the correlative nature of our findings and the inadequacy of the OVX model in replicating human menopause. Future research will prioritize experimental validation and incorporate human studies to solidify our conclusions.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Dr. Yao Li et al. documented the metabolomic profile of the aorta from OVX rats and that from OVX plus E2. These conditions mimic post-menopause hypertension and hormonal replacement therapy.

      Strengths:

      The authors state that this is probably the first study to examine the metabolic changes in the aorta of post-menopause hypertension.

      As pointed out by the reviewer, our study may be the first to investigate changes in aortic metabolism in postmenopausal hypertension. As an exploratory study, our goal is to depict the overall characteristics and explore possible research directions.

      Weaknesses:

      There are several weaknesses, and a few of them are quite serious.

      (1) The aorta is not a resistant artery and has little to do with hypertension. The authors should have used resistant arteries for this study. The expression of several adrenergic and cholinergic receptors differs between the aorta and resistant arteries. It is unknown whether the aorta metabolomic profile has any relevance to BP and whether it is similar to that of the resistant arteries. I understand the logistics issue of obtaining enough tissue from resistant arteries. At least, once some leads are discovered in the aorta, the authors should validate them in resistant arteries. This should be feasible.

      We acknowledge the limitation of using the aorta and will aim to include studies on resistant arteries to validate our metabolomic findings.

      (2) The aorta and all the arteries have three layers. It is critically important to know whether the metabolic changes occur in the intima or in the media, while the adventitia probably has little to do with vasoconstriction and hypertension. If the authors want to use the aorta to conduct the preliminary study, they should completely remove the adventitia and then use samples with and without their endothelium stripped and then assess their metabolomic profiles. After the leads are obtained from this preliminary profiling, they should be validated in endothelium and smooth muscles of the resistant artery. The current experiments are not appropriately designed.

      Future studies will involve detailed profiling of specific arterial layers, focusing on the intima and media to enhance the relevance of our findings related to hypertension.

      (3) The tail-cuff BP measurement is a technique of the last century. The current gold standard for BP measurement is telemetry. The tail-cuff method is particularly problematic in this study because the 1-2 h restraint of the rats for more than 10 BP measurements will cause significant stress in the animals, and their stress hormone secretion might bias the metabolomic profiles in the OVX versus sham-operated rats. The problem can be totally avoided by using telemetry.

      We appreciate the suggestion and will consider telemetry for more accurate blood pressure measurements in future experiments to minimize stress-related bias.

      (4) Although the decrease in L-AABA in the OVX rats was highly significant (p-value of 10^-4), the fold change is small (2-3 fold). Such a small change should be validated using a different method to be convincing.

      We plan to employ additional methods to validate the observed changes in L-AABA levels in the following research, ensuring robustness of our findings.

      (5) The authors claim (or hypothesize) that the reduced AABA level in OVX can cause vascular remodeling. This can be easily validated by the histology of the OVX-resistant artery, and they should do that during the revision. The authors should also examine the M1 macrophage function from the OVX mice to validate their claimed link of AABA to M1.

      We intend to conduct histological analyses and examine M1 macrophage function in OVX-resistant arteries to validate our hypothesis in the following research.

      (6) As mentioned above, the authors need to pinpoint the changes of AABA to target cells, i.e., endothelial cells, SMC, or M1, and then use in vitro or in vivo cell biology approaches to assess whether these cells in the OVX rat indeed have an abnormality in function and, indeed, such functional changes are responsible for the BP phenotype.

      Addressing these points, we aim to pinpoint specific cell types affected by AABA variations and conduct in vitro and in vivo studies to examine their physiological impacts in the following research.

      (7) The results of the current study can be condensed into 1 or 2 figures that can serve as a base or a starting point for a deeper scientific study.

      Thank you for your suggestion. As this is an omics study, our approach may differ from that of traditional mechanistic studies.

      Summary

      The experimental design of this manuscript is inappropriate, and the methods are not up to the current standards. The whole study is descriptive and rudimentary. It lacks validation and mechanism. The data from this manuscript might be of some value and can serve as the first step for more investigation of the mechanism of post-menopause hypertension.

      Reviewer #3 (Public review):

      Summary:

      The decrease in estrogen levels is strongly associated with postmenopausal hypertension. Dr. Yao Li and colleagues aimed to investigate the metabolomic mechanisms underlying postmenopausal hypertension using OVX and OVX+E2 rat models. They successfully established a correlation between reduced estrogen levels and the development of hypertension in rats. They identified L-alpha-aminobutyric acid (AABA) as a potential marker for postmenopausal hypertension. The research explored the metabolic alterations in aortic tissues and proposed several potential mechanisms contributing to postmenopausal hypertension.

      Strengths:

      The group performed a comprehensive enrichment analysis and various statistical analyses of the metabolomics data.

      As summarized by the reviewer, our current study conducted a comprehensive analysis of metabolomics data. It is also a reliable foundation for further mechanism research.

      Weaknesses:

      (1) The manuscript is descriptive in nature, although the authors state that their primary objective is to explore the potential mechanisms linking low estrogen levels with postmenopausal hypertension. No mechanistic insights are provided in this study, as the authors acknowledge in the discussion. The connection between E2, AABA, and macrophages needs to be validated in endothelial cells, vascular smooth muscle cells, and other aortic tissue cells. Without such verification, the manuscript predominantly raises hypotheses based only on metabolomic data.

      We have proposed research hypotheses based on detailed omics data. Further research on the mechanisms involving endothelial and vascular smooth muscle cells to validate the pathway connections between E2, AABA, and macrophages is undoubtedly the future direction of this study.

      (2) The serum contains three forms of estrogen: Estradiol, Estrone, and Estriol. The authors used the Rat E2 ELISA kit. Ideally, all three forms of estrogen should be measured.

      Future assays will aim to measure Estradiol, Estrone, and Estriol to capture a more comprehensive picture of estrogen’s role in postmenopausal hypertension.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This useful study reports on the discovery of an antimicrobial agent that kills Neisseria gonorrhoeae. Sensitivity is attributed to a combination of DedA-assisted uptake of oxydifficidin into the cytoplasm and the presence of an oxydifficidin-sensitive RplL ribosomal protein. Due to the narrow scope, the broader antibacterial spectrum remains unclear and therefore the evidence supporting the conclusions is incomplete, with key methods and data lacking. This work will be of interest to microbiologists and synthetic biologists.

      General comment about narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The main focus of this study is on its previously unreported potent anti-gonococcal activity and mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kan et al. report the serendipitous discovery of a Bacillus amyloliquefaciens strain that kills N. gonorrhoeae. They use TnSeq to identify that the anti-gonococcal agent is oxydifficidin and show that it acts at the ribosome and that one of the dedA gene products in N. gonorrhoeae MS11 is important for moving the oxydifficidin across the membrane.

      Strengths:

      This is an impressive amount of work, moving from a serendipitous observation through TnSeq to characterize the mechanism by which Oxydifficidin works.

      Weaknesses:

      (1) There are important gaps in the manuscript's methods.

      The requested additions to the method describing bacterial sequencing and anti-gonococcal activity screening will be made. However, we do not think the absence of these generic methods reduces the significance of our findings.

      (2) The work should evaluate antibiotics relevant to N. gonorrhoeae.

      (1) It is not clear to us why reevaluating the activity of well-characterized antibiotics against known N. gonorrhoeae clinical strains would add value to this manuscript. The activity of clinically relevant antibiotics against antibiotic-resistant N. gonorrhoeae clinical isolates is well described in the literature. Our use of antibiotics in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      (2) If the reviewer insists, we would be happy to include MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone).

      (3) The genetic diversity of dedA and rplL in N. gonorrhoeae is not clear, neither is it clear whether oxydifficidin is active against more relevant strains and species than tested so far.

      (1) We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      (2) While the usefulness of screening more clinically relevant antibiotics against clinical isolates as suggested in comment 2 was not clear to us, we agree that screening these strains for oxydifficidin activity would be beneficial. We have ordered Neisseria gonorrhoeae strains AR1280 and AR1281 (CDC) and Neisseria meningitidis ATCC 13090. They will be tested when they arrive.

      Reviewer #2 (Public Review):

      Summary:

      Kan et al. present the discovery of oxydifficidin as a potential antimicrobial against N. gonorrhoeae, including multi-drug resistant strains. The authors show the role of DedA flippase-assisted uptake and the specificity of RplL in the mechanism of action for oxydifficidin. This novel mode of action could potentially offer a new therapeutic avenue, providing a critical addition to the limited arsenal of antibiotics effective against gonorrhea.

      Strengths:

      This study underscores the potential of revisiting natural products for antibiotic discovery against pathogens of current concern and highlights a new target mechanism that could inform future drug development. Indeed, there is a growing body of recent research utilizing AI and predictive computational informatics to revisit potential antimicrobial agents and metabolites from cultured bacterial species. The discovery of oxydifficidin's interaction with RplL and its DedA-assisted uptake mechanism opens new research directions in understanding and combating antibiotic-resistant N. gonorrhoeae. Methodologically, the study is rigorous, employing various experimental techniques such as genome sequencing, bioassay-guided fractionation, LCMS, NMR, and Tn-mutagenesis.

      Weaknesses:

      The scope is somewhat narrow, focusing primarily on N. gonorrhoeae. This limits the generalizability of the findings and leaves questions about its broader antibacterial spectrum. Moreover, while the study demonstrates the in vitro effectiveness of oxydifficidin, there is a lack of in vivo validation (i.e., animal models) for assessing pre-clinical potential of oxydifficidin. Potential SNPs within dedA or RplL raise concerns about how quickly resistance could emerge in clinical settings.

      (1) Spectrum/narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The focus of this study is on its previously unreported potent anti-gonococcal activity and its mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      (2) Animal models: We acknowledge the reviewer’s insight regarding the importance of in vivo validation to enhance oxydifficidin’s pre-clinical potential. However, due to the labor-intensive process needed to isolate oxydifficidin, obtaining a sufficient quantity for animal studies is beyond the scope of this study. Our future work will focus on optimizing the yield of oxydifficidin and developing a topical mouse model for subsequent investigations.

      (3) Potential SNPs: Please see our response to Reviewer #1’s comment 3. We acknowledge that potential SNPs within dedA and rplL raise concerns regarding clinical resistance, which is a common issue for protein-targeting antibiotics. Yet, as pointed out in the manuscript, obtaining mutants in the lab was a very low yield endeavor.

      Reviewer #3 (Public Review):

      Summary:

      The authors have shown that oxydifficidin is a potent inhibitor of Neisseria gonorrhoeae. They identified RplL as the target of action and showed that resistance could occur via mutation in the DedA flippase and RplL.

      Strengths:

      This was a very thorough and clearly argued set of experiments that supported their conclusions.

      Weaknesses:

      There was no obvious weakness in the experimental design. Although it is promising that the DedA mutations resulted in attenuation of fitness, it remains an open question, not tested in this study, whether secondary rounds of mutation could overcome this selective disadvantage.

      We thank the reviewer for the positive comment. We agree that investigating factors that could compensate for the fitness attenuation caused by DedA mutation would enhance our understanding of the role of DedA.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The term "N. gonorrhoeae wildtype" should not be used. It is uninformative, as the species contains a large amount of diversity. Instead, please name the strain. From Figure 1, it looks like the authors used MS11. Since MS11 is a longstanding lab strain and likely does not reflect circulating N. gonorrhoeae, and since H041 is no longer in circulation, the authors should ideally test the compound against more representative strains of N. gonorrhoeae. This includes panels of isolates available through the CDC, for example (https://www.cdc.gov/drugresistance/resistance-bank/index.html). I encourage the authors to include FC428 or another recently identified isolate with the penA 60 allele to demonstrate oxydifficidin's activity against contemporary concerning isolates/lineages.

      (1) “N. gonorrhoeae MS11” is now used instead of “N. gonorrhoeae WT” in this manuscript.

      (2) In our revised manuscript, we have added MIC data for recently identified Neisseria gonorrhoeae isolates AR#1280 and AR#1281 which contain the penA 60 allele (Table 1). The data shows oxydifficidin maintains its potent activity against these multidrug-resistant strains. We also added a description of this data to the results section as shown below.

      Original text: “Oxydifficidin was more potent against N. gonorrhoeae MS11 than almost all other antibiotics we tested. In fact, it was only slightly less active than the highly optimized third-generation cephalosporin, ceftazidime.([18]) However, unlike third-generation cephalosporins, oxydifficidin retained activity against the multidrug resistant H041 clinical isolate (Table 1).([4]) H041 is resistant to the “standard of care” cephalosporin ceftriaxone (2 µg/mL) as well as a number of other antibiotics that are normally active against N. gonorrhoeae (penicillin G, 4 µg/mL; cefixime, 8 µg/mL; levofloxacin, 32 µg/mL).”

      Changed to: “Oxydifficidin was more potent against N. gonorrhoeae MS11 than most other antibiotics we tested. Notably, unlike clinically used antibiotics such as ceftriaxone, azithromycin, and ciprofloxacin, oxydifficidin retained activity against all multidrug-resistant clinical isolates we examined (Table 1).” (Line 77-79)

      (2) Does oxydifficidin have activity against N. meningitidis? It is the species most closely related to N. gonorrhoeae and the other pathogenic Neisseria.

      Oxydifficidin has potent activity against N. meningitidis ATCC 13090. In our revised manuscript, we have included its MIC data in Figure 1c.

      (3) Given claims that oxydifficidin activity in N. gonorrhoeae as compared to other Neisseria reflects N. gonorrhoeae's dedA and sensitive rplL, it would be good to assess the allelic diversity of these genes in N. gonorrhoeae. There are over 20,000 genomes from clinical isolates of N. gonorrhoeae in databases. It should be straightforward to check whether dedA and rplL allelic variants already exist in the population. Should variants be observed, oxydifficidin should be tested against the associated strains of N. gonorrhoeae.

      Response: We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      New text: “A survey of 220 N. gonorrhoeae strains with high-quality assemblies in NCBI found no mutations in the DedA protein.” (Line 104-105)

      “These two mutations were not found in the survey of the same collection of N. gonorrhoeae strains used to look for DedA mutations.” (Line 143-144)
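The survey described above (comparing a reference protein against the same protein in many strains and asking whether any changes fall at resistance-associated sites) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `substitutions` and `survey`, the toy sequences, and the watched positions are all hypothetical, and the sequences are assumed to be already aligned to equal length. In the real survey the watched positions would be RplL residues 76 and 84.

```python
def substitutions(reference, query):
    """List amino acid changes (e.g. 'R3K') between two pre-aligned sequences.

    Positions are 1-based and numbered along the reference; alignment gaps
    ('-') in either sequence are skipped rather than reported.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    changes = []
    pos = 0
    for ref_aa, qry_aa in zip(reference, query):
        if ref_aa == "-":
            continue  # insertion relative to the reference: no reference position
        pos += 1
        if qry_aa != ref_aa and qry_aa != "-":
            changes.append(f"{ref_aa}{pos}{qry_aa}")
    return changes


def survey(reference, strains, watch_positions):
    """Report all substitutions per strain and those at watched positions."""
    report = {}
    for name, seq in strains.items():
        subs = substitutions(reference, seq)
        hits = [s for s in subs if int(s[1:-1]) in watch_positions]
        report[name] = {"substitutions": subs, "at_watched_sites": hits}
    return report


# Toy data only: these are made-up peptides, not real DedA or RplL sequences.
reference = "MKRTAGLK"
strains = {
    "strain_a": "MKRTAGLK",  # identical to the reference
    "strain_b": "MKRSAGLK",  # change away from the watched sites
    "strain_c": "MKKTAGLK",  # change at a watched site
}
result = survey(reference, strains, watch_positions={3, 8})
for name, entry in result.items():
    print(name, entry)
```

A strain whose `at_watched_sites` list is empty carries no change at the positions of interest, which is the pattern the response reports for R76 and K84 across the surveyed assemblies.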

      (4) Clinically relevant antibiotics for N. gonorrhoeae are penicillin, tetracycline, spectinomycin, gentamicin, ciprofloxacin, azithromycin, ceftriaxone; moreover, zoliflodacin and gepotidacin have reportedly successfully completed phase 3 trials. The authors should redo their MIC testing with these antibiotics (e.g., for Figures 1 and 2 and Tables 1 and 2), both because this will enable direct comparison with the many clinical isolates that have undergone testing and because these are the drugs most pertinent to clinical practice. Ampicillin, ceftazidime, chloramphenicol, bacitracin, and daptomycin are not relevant. Could the authors explain why they tested vancomycin, polymyxin B, irgasan, melittin, avilamycin, and thiostrepton?

      Our use of antibiotics with diverse modes of action (e.g. vancomycin, polymyxin B, irgasan, melittin, avilamycin, and thiostrepton) in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      To address the reviewer’s concern, in our revised manuscript, we have added MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone) to Table 1.

      (5) Please describe the characteristics of the transposon library (finding four transposons in a single strain does seem unexpected, given how most transposon libraries aim for one transposon insertion per strain).

      We understand that one transposon insertion per strain is ideal for transposon libraries. This Bacillus strain proved to be recalcitrant to genetic manipulation. In the rare cases where we obtained resistant colonies upon electroporation with the transposon, all colonies contained multiple (≥ 4) transposon insertions. This made it impractical to build a library with one transposon insertion per library member.

      We assumed that the anti-N. gonorrhoeae activity most likely originated from a natural product BGC, which typically range from 10-100 kb in size.

      Based on the average of 50 kb per BGC, ~80 transposon insertions would be required to fully search the 4.2 Mb genome of Bacillus amyloliquefaciens BK for a BGC. At 4 mutations per transformant, 1x coverage of the genome would require only 20 library members.
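The coverage estimate above is simple arithmetic and can be checked with a short sketch. The function names are hypothetical, and the calculation assumes an idealized case in which insertions are spaced evenly, one per BGC-sized window; randomly placed insertions would need more library members to reach the same coverage.

```python
import math


def insertions_for_coverage(genome_bp, bgc_bp):
    """Insertions needed to place one in every BGC-sized window of the genome."""
    return math.ceil(genome_bp / bgc_bp)


def library_members_needed(n_insertions, insertions_per_member):
    """Library members needed when each member carries several insertions."""
    return math.ceil(n_insertions / insertions_per_member)


genome_bp = 4_200_000  # 4.2 Mb genome, as stated in the response
bgc_bp = 50_000        # 50 kb average BGC size, as stated in the response

insertions = insertions_for_coverage(genome_bp, bgc_bp)
members = library_members_needed(insertions, insertions_per_member=4)

print(f"insertions for 1x coverage: {insertions}")
print(f"library members at 4 insertions each: {members}")
```

The exact values, 84 insertions and 21 library members, match the rounded figures of about 80 and 20 quoted in the response.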

      After extensive electroporation of transposon into Bacillus amyloliquefaciens BK, we were able to obtain a library of 50 members, including one mutant (Tn5-3) that lacked anti-N. gonorrhoeae activity.

      New text added to the methods section:

      “A library containing 50 transposon mutants was obtained. In the mutants examined, each strain contained ≥4 transposon insertions” (Line 337-339)

      (6) Please describe in the methods how you sequenced and annotated the genome of Bacillus amyloliquefaciens BK.

      The sequencing method is now described in the “Genomic Sequencing and annotation of Bacillus amyloliquefaciens” section. The genome of Bacillus amyloliquefaciens BK was not fully annotated. Mutations were identified as described in the updated methods section below.

      New text:

      “Genomic Sequencing and annotation of Bacillus amyloliquefaciens

      Genomic DNA from Bacillus amyloliquefaciens BK WT and transposon mutant Tn5-3 was isolated using PureLink Microbiome DNA purification kit (Invitrogen) according to the manufacturer’s instructions.

      The Bacillus amyloliquefaciens BK WT genome was assembled by mapping its sequencing data onto the annotated genome of Bacillus amyloliquefaciens FZB42 using Geneious Prime. Differences in the mutant strain Tn5-3 were identified by mapping its sequencing data onto the assembled Bacillus amyloliquefaciens BK WT genome. The mutated genes were then annotated using NCBI BLAST. The oxydifficidin BGC was annotated using the antiSMASH online server.” (Line 253-260)

      (7) Please describe in the methods how you screened the library for strains that lacked anti-gonococcal activity.

      The method has been added to our revised manuscript as the section “Screening of Bacillus Strains Lacking Anti-N. gonorrhoeae Activity”.

      New text:

      “Screening of Bacillus Strains Lacking Anti-N. gonorrhoeae Activity

      The transposon mutants of Bacillus amyloliquefaciens BK were grown overnight in LB medium at 30 °C. Each overnight culture was then diluted 1:5000, and 1 μl of the diluted culture was spotted onto a GCB agar plate swabbed with N. gonorrhoeae cells. The plate was then incubated overnight at 37 °C with 5% CO2. The mutant strain (Tn5-3) lacking anti-N. gonorrhoeae activity was identified due to its failure to produce a zone of growth inhibition in the resulting N. gonorrhoeae lawn.” (Line 341-346)

      (8) Was only one strain found that was a 'non-producer' of anti-N. gonorrhoeae activity? Line 68 suggests that this was only one of multiple non-producers. Is that correct? If so, did you work up the others, and did they also have disruptions in the same biosynthetic gene cluster?

      Only one strain was identified as a “non-producer” of anti-N. gonorrhoeae activity. We have modified the text to clarify this point.

      Original text: “The sequencing of one non-producer strain revealed that it surprisingly contained four transposon insertions and one frame shift mutation.”

      Changed to: “The sequencing of the non-producer strain revealed that it surprisingly contained four transposon insertions and one frame shift mutation.” (Line 53-54 )

      (9) All sequences (including Bacillus amyloliquefaciens BK) must be deposited in a public database (e.g., NCBI) and the accession numbers reported in the manuscript.

      Genomic sequence data for Bacillus amyloliquefaciens BK have been deposited in GenBank, and the accession number (GCA_019093835.1) now appears in the legend of Figure S1a.

      Figure S1a legend:

      “Genome-based phylogenetic tree containing Bacillus amyloliquefaciens BK and closely related Bacillus spp. The tree was built by Genome Clustering of MicroScope using neighbor-joining method. The NCBI accession numbers of Bacillus strains used in the tree are GCA_000196735.1, GCA_000204275.1, GCA_000015785.2, GCA_019093835.1, GCA_000009045.1, GCA_000011645.1, GCA_000172815.1, GCA_000008005.1, and GCA_000007845.1 (from top to bottom).”

      Minor

      (10) Statements in the article would benefit from fact-checking. For example:

      - gonorrhea is not the second most prevalent sexually transmitted infection worldwide; it is the second most reported bacterial sexually transmitted infection.

      - Treatment is ceftriaxone 500mg IM x1 in the US, but 1g IM x1 in the UK and Europe. The UK guidelines also permit ciprofloxacin, should sequencing indicate gyrA 91S. I suggest reviewing / specifying which treatment guidelines you're referring to.

      We appreciate the reviewer’s corrections. The word “prevalent” is now changed to “reported”.

      Original text: “Gonorrhea, which is caused by Neisseria gonorrhoeae, is the second most prevalent sexually transmitted infection worldwide.”

      Changed to: “Gonorrhea, which is caused by Neisseria gonorrhoeae, is the second most reported sexually transmitted infection worldwide.” (Line 2-3)

      Original text: “Gonorrhea is the second most prevalent sexually transmitted infection worldwide, its causative agent is the bacterium Neisseria gonorrhoeae.”

      Changed to: “Gonorrhea is the second most reported sexually transmitted infection worldwide, its causative agent is the bacterium Neisseria gonorrhoeae.” (Line 18-19)

      “In the USA” is now added to the sentence stating gonorrhea treatment.

      Original text: “The high dose (500 mg) of the cephalosporin ceftriaxone is currently the only recommended therapy for treating gonorrhea infections.”

      Changed to: “The high dose (500 mg) of the cephalosporin ceftriaxone is currently the only recommended therapy for treating gonorrhea infections in the USA.” (Line 20-22)

      (11) Please make sure all results are in the results section. The report of cell morphology, for example, should be in the results, not the discussion.

      In our revised manuscript, we have included the cell morphology data in the results section with the text changes below.

      Original text: “Interestingly, not only was dedA deficient N. gonorrhoeae less susceptible to oxydifficidin, oxydifficidin also kills this mutant more slowly (Figure 2b) than WT N. gonorrhoeae MS11.”

      Changed to: “Interestingly, not only was dedA deficient N. gonorrhoeae less susceptible to oxydifficidin, oxydifficidin also kills this mutant more slowly (Figure 2b) than WT N. gonorrhoeae MS11. The dedA deletion mutant also showed an altered cell morphology with reduced membrane integrity and lower formation of micro-colonies (Figure S4).” (Line 100-104)

      Original text: “The dedA deletion mutant also showed an altered cell morphology with reduced membrane integrity and lower formation of micro-colonies (Figure S4), indicating that it should show reduced pathogenesis and fitness, and, as a result, not accumulate in a clinical setting, which adds to the therapeutic appeal of oxydifficidin.”

      Changed to: “The dedA deletion mutant exhibited altered cell morphology, characterized by diminished membrane integrity and reduced micro-colony formation, indicating that it should show reduced pathogenesis and fitness, and, as a result, not accumulate in a clinical setting, which adds to the therapeutic appeal of oxydifficidin” (Line 206-210)

      (12) Tables 1 and 2 should be combined and should address the most relevant antibiotics

      The MIC data of additional relevant antibiotics are now included in Table 1. However, we still believe that keeping Tables 1 and 2 separate enhances the clarity of the manuscript. Table 2 specifically focuses on diverse ribosomal targeting antibiotics, which highlights the unique binding site of oxydifficidin.

      (13) Supplemental Figure 1a. The tree could be better resolved, and there are four entries with the identical listing of "Bacillus amyloliquefaciens subsp. plantarum" on different branches. In the methods or the legend, please indicate the accession numbers for these genomes. Also please specify how this tree was made-is it a maximum likelihood tree? Something else?

      The tree is now better resolved and includes new entries. The requested information regarding accession numbers and tree construction method has been included in the figure legend.

      New supplemental Figure 1a legend:

      “a. Genome-based phylogenetic tree containing Bacillus amyloliquefaciens BK and closely related Bacillus spp. The tree was built by Genome Clustering of MicroScope using neighbor-joining method. The NCBI accession numbers of Bacillus strains used in the tree are GCA_000196735.1, GCA_000204275.1, GCA_000015785.2, GCA_019093835.1, GCA_000009045.1, GCA_000011645.1, GCA_000172815.1, GCA_000008005.1, and GCA_000007845.1 (from top to bottom).”

      Reviewer #2 (Recommendations For The Authors):

      The conclusions drawn in the manuscript are well-supported by the experimental data presented.

      I have the below minor comments:

      (1) "serendipitously identified" - I feel this wording should be avoided throughout the manuscript. The point of a research paper is to communicate methodology and experimental detail, and this language portrays the opposite.

      While we agree that methodology and experimental procedures are paramount in scientific reporting, we believe it is equally important to convey, particularly to younger generations, that a part of the scientific process is often unplanned and can benefit from chance observations. Therefore, we would like to keep this wording.

      (2) The introduction should include the biological roles/function of DedA proteins in bacteria.

      DedA proteins perform a wide array of biological roles and functions in bacteria. In the results section (Line 107-116), we have described the most well-established of these functions, particularly the flippase activity, which appears to be directly related to oxydifficidin sensitivity. We believe that introducing this information in the results section enhances the manuscript’s clarity and flow.

      (3) "When we screened this contaminant for antibacterial activity against lawns of other Gram-negative bacteria it did not produce a zone of growth of inhibition against any of the bacteria we tested (e.g., Escherichia coli, Vibrio cholerae, Caulobacter crescentus)." Can these data Figures be included in the Supplements?

      This result was recorded in the lead author’s notebook, but no image was saved.

      (4) Line 52: Were any basic analyses performed on the Tn-mutants, i.e., how many insertion sites? Depth of mutants? Was a library constructed in this study or previously? Why were only BGCs assessed?

      Please see our response to Reviewer #1’s comment (5). We focused on BGCs because we believed the anti-N. gonorrhoeae activity most likely resulted from a molecule encoded by a natural product BGC.

      (5) Line 98: Do the other 2 predicted DedA-like proteins also have a role in uptake of oxydifficidin? Is there some redundancy in uptake?

      We generated knockout mutants for two other predicted DedA-like proteins in N. gonorrhoeae MS11, and the MIC of oxydifficidin for these mutants remained the same as for the N. gonorrhoeae MS11 wild type strain. Therefore, we believe that the DedA protein discussed in this manuscript is the primary transporter of oxydifficidin. However, we cannot completely rule out the possibility of redundancy in oxydifficidin uptake by other DedA-like proteins.

      New text: “We also generated deletion mutants for two other predicted dedA-like genes, and the MIC of oxydifficidin for these mutants remained the same as for the N. gonorrhoeae MS11 wild type strain.” (Line 98-100)

      Reviewer #3 (Recommendations For The Authors):

      This is a well presented manuscript and I could not immediately see any issues with it.

      We appreciate the reviewer’s positive feedback.

    2. eLife Assessment

      Kan et al. report the discovery of a Bacillus amyloliquefaciens strain that kills Neisseria gonorrhoeae via oxydifficidin, which targets ribosomal proteins. Resistance occurred via mutation in the DedA flippase to influence oxydifficidin uptake. The overall mechanism of action is well described, making this an important study with implications for combating clinical antibiotic resistance. The evidence presented is convincing due to the rigour employed in the methodological approach. The authors should consider performing a more comprehensive genetic analysis of DedA and RplL in this clinically relevant strain. This work will be of broad interest to microbiologists and synthetic biologists.

    3. Reviewer #1 (Public review):

      Summary:

      Kan et al. report the serendipitous discovery of a Bacillus amyloliquefaciens strain that kills N. gonorrhoeae. They use TnSeq to identify that the anti-gonococcal agent is oxydifficidin and show that it acts at the ribosome and that one of the dedA gene products in N. gonorrhoeae MS11 is important for moving the oxydifficidin across the membrane.

      Strengths:

      - This is an impressive amount of work, moving from a serendipitous observation through TnSeq to characterize the mechanism by which Oxydifficidin works.

      Weaknesses:

      - The genetic diversity of dedA and rplL in N. gonorrhoeae is still not clear, as the authors looked at diversity of these genes in only 220 isolates (of unclear relationship to each other).

      It's not so much a weakness as a source of confusion: how did the authors choose to screen a tiny transposon library of 50 mutants? Since they were surprised to find 4 transposon insertions (if I'm reading it correctly), what was the motivation for even looking at this small library? And since the mutation that led them to the biosynthetic gene cluster wasn't even a transposon insertion but a frameshift, it seems they had another huge episode of serendipity.

    4. Reviewer #2 (Public review):

      Summary:

      Kan et al. present the discovery of oxydifficidin as a potential antimicrobial against N. gonorrhoeae, including multi-drug resistant strains. The authors show the role of DedA flippase-assisted uptake and the specificity of RplL in the mechanism of action for oxydifficidin. This mode of action could potentially offer a new therapeutic avenue, providing a critical addition to the limited arsenal of antibiotics effective against gonorrhea.

      Strengths:

      This study shows the potential of revisiting anti-bacterial agents/products for antibacterial activity against pathogens of current concern and highlights a new anti-gonococcal mechanism of action. Indeed, there is a growing body of recent research revisiting potential antimicrobial agents and metabolites from cultured bacterial species. The discovery of the oxydifficidin interaction with RplL and its DedA-assisted uptake mechanism opens new research directions in understanding and combating antibiotic-resistant N. gonorrhoeae. Oxydifficidin is also active against N. meningitidis, a closely related species. Methodologically, the study is rigorous, employing various experimental techniques including Tn-mutagenesis (TraDIS, Tn-Seq).

      Weaknesses:

      While the study demonstrates the in vitro effectiveness of oxydifficidin, there is a lack of in vivo validation (i.e., animal models) for assessing the pre-clinical potential of oxydifficidin. However, I acknowledge that this would be a tremendous amount of work and likely outside the scope of this study. Potential SNPs within dedA or RplL raise concerns about how quickly resistance could emerge in clinical settings.

    5. Reviewer #3 (Public review):

      Summary:

      The authors have shown that oxydifficidin is a potent inhibitor of Neisseria gonorrhoeae. They were able to identify the target of action to rpsL and showed that resistance could occur via mutation in the DedA flippase and RpsL.

      Strengths:

      This was a very thorough and clearly argued set of experiments that supported their conclusions.

      Weaknesses:

      There was no obvious weakness in the experimental design. Although it is promising that the DedA mutations resulted in attenuation of fitness, it remains an open question whether secondary rounds of mutation could overcome this selective disadvantage; this was not tested in this study.

      Comments on revisions:

      All of my suggestions were considered, and the responses to the other reviewers appear sound and have improved the manuscript.

    1. eLife Assessment

      Protein and lipid homeostasis is essential for maintaining cellular functions but their crosstalk remains largely unknown. This important manuscript deals with this interesting topic and applies the powerful unbiased tools of somatic cell genetics to discover evidence suggesting a link between sphingolipids/cholesterol ester metabolism and lysosomal protein aggregation. The authors provide compelling orthogonal evidence to support their conclusions.

    2. Reviewer #1 (Public Review):

      In this manuscript, Yong and colleagues link perturbations in lysosomal lipid metabolism with the generation of protein aggregates resulting from proteasome inhibition. The main tool used is the ProteoStat stain to assess protein aggregate burden in native cells (i.e. cells under no exogenous or endogenous stress). They initially use CRISPR-based genome-wide screens to identify several genes that affect this aggregate burden. Interestingly, knockdown of genes involved in lysosomal acidification was a major signature, which led to the identification of other culprit lysosome-associated genes, including ones involved in lipid metabolism. A subsequent CRISPR screen focused on lipidomic analysis led to the identification of sphingolipids and cholesterol esters as lipid classes with effects on proteostasis.

      Comments on revised version:

      They did a decent job addressing most of my comments and the new data (including LysoIP) makes for much more plausible conclusions.

      They propose the idea that microautophagy is mediating the delivery of these aggregates to lysosomes.

      It appears there are enough experiments and support now for their premise.

      The lysosomal lipid metabolism link to proteostasis is still a lingering question in this work but they addressed each of the points I raised regarding it and revised the manuscript accordingly with pertinent discussion.

      It is difficult to truly address the lipid link and I think we have to acknowledge that. But overall, looking at the effort and conclusions, this has been improved enough to be a valuable contribution to the field.

    3. Reviewer #2 (Public Review):

      In this paper, starting with unbiased CRISPRi screening, the authors found that perturbations in lipid homeostasis lead to proteostasis impairment. The screen and most follow-up experiments used the dye ProteoStat, which detects protein aggregates and the aggresome. Based upon their screen hits and subsequent analyses, the authors determined that increased levels of sphingolipids and cholesterol esters induce proteostasis defects, along with formation of protein aggregates that appear to be localized in the lysosome. The lysosome increases in content, but its function is not detectably perturbed.

      Comments on revised version:

      I am satisfied with the authors' actions in response to my public and specific suggestions, but not yet with the manuscript itself. I think that the paper would be improved if they showed the evidence arguing against an effect on proteasome activity but I can live with this omission. I think that the readability and ease of grasping the main points are improved by Figure 7. Inclusion of these simple but informative conceptual summaries is a must.

    4. Author response:

      We are submitting a revised manuscript with major additions that address the main concerns in the initial reviews. At the highest level, this revision provides i) orthogonal biochemical measurements that yield concrete evidence of lysosomal protein aggregates, and ii) a plausible mechanism linking lysosomal lipid handling and protein aggregation through disruption of ESCRT function. We believe these additions significantly improve the completeness of this study and the conclusions that can be drawn from the data.

      Below are more specific highlights of the additions in this revision:

      - We included orthogonal techniques (thioflavin-T staining and Lyso-IP followed by differential extraction) and confirmed the accumulation of RIPA-insoluble protein aggregates at the lysosomes in cells under lipid perturbation (Figure 3).

      - We performed TMT-Proteomics and identified accumulation of insoluble ESCRT components at the lysosomes under lipid perturbation (Figure 4). Two new authors involved in this effort are added onto the manuscript.

      - The ESCRT result prompted us to revisit lysosomal membrane integrity. With improved imaging conditions and analysis we were able to see increased membrane permeabilization under lipid perturbation. VPS4A overexpression partially rescued this phenotype, suggesting that lipid accumulation impairs ESCRT disassembly (Figure 5).

      - Together, the results suggest that lipid perturbation impairs ESCRT function, compromising both lysosomal membrane repair and microautophagy, resulting in the accumulation of endogenous protein aggregates at the lysosomes (Graphical Abstract).

      Reviewer #1 (Recommendations For The Authors):

      (1) Perhaps the most prominent limitation of this work is the unilateral focus on native cells (i.e. cells under no endogenous or exogenous stress) as the model for protein aggregate formation. Furthermore, although the ProteoStat stain has been utilized by many investigators before, the sole reliance on this stain as the read-out for their assays is concerning. To compound the concern, the ProteoStat-positive puncta co-localize with lysosomal markers, which was surprising even to the authors. All in all, it behooves the authors to test proteostasis in multiple parallel ways to actually define what they are studying. How is it possible that protein aggregates under native conditions are only co-localized with lysosomes? Are we really studying protein aggregates, which should predominantly be cytoplasmic insoluble aggregates?

      (a) They need to get away from a simple stain like ProteoStat and conduct co-stainings with other markers such as poly-ubiquitin antibodies and other chaperones to define what and where else exactly are these aggregates.

      Co-staining with poly-ubiquitin was included in the original manuscript. We added orthogonal staining with another widely used amyloid dye, Thioflavin-T, and provided fine-grained quantification of lysosomal vs cytosolic localization of various signals (Figures S4A-C & 3A-B).

      (b) They need to do Immunoblots with and without triton insolubility to see if these aggregates are insoluble as most would predict. They can do lysosomal isolation vs cytoplasmic to see if the insoluble aggregates are really lysosomal.

      We performed Lyso-IP followed by differential detergent extraction to confirm the accumulation of insoluble proteins at the lysosomes (Figure 3C). Proteomic analysis identified some of these insoluble proteins as ESCRT subunits (Figure 4).

      (c) They should compare aggregate formation in the native state versus cells with lysosomal inhibition via Bafilomycin or chloroquine versus cells with proteasomal inhibition. The lysosomal inhibition experiments are particularly informative given the lysosomal relevance they have uncovered.

      We included other small-molecule inhibitors at different time points to compare the effects of different modes of proteostasis challenge (Figure S4A-D). Together with the ESCRT finding, our results suggest a role for microautophagy in our system, and provide a model of how ProteoStat- and/or ubiquitin-positive substrates become partitioned between the cytoplasm and lysosomes under different perturbations.

      (d) Many protein aggregates which are too bulky for proteasome degradation will traditionally be dealt with by aggrephagy. Why is this not observed?

      Knockdown of core macroautophagy components did not impact Proteostat intensity in our CRISPRi screen, suggesting that basal macroautophagy plays a negligible role in clearing endogenous amyloid-like structures in our experimental system. We provide an alternative model that these aggregates instead arrive at the lysosomes via microautophagy.

      (2) After addressing #1, they can validate if the genes they identified by CRISPR screens are also important in modulation of protein aggregate burden in other systems. For example, if they inhibit lysosomes by Bafilo or Chloroquine to obtain protein aggregates and then Knockdown the identified genes in the CRISPR screens, will they get the same results?

      We addressed the effect of different modes of proteostasis challenge as recommended above. Deacidifying the lysosomes alone causes intense protein aggregation (Figure S4A-D) and eventually cell death, and was thus not combined with other perturbations.

      (3) They identify lysosomal lipid metabolism genes/pathways as the culprit for inducing proteostasis. In particular, sphingolipid and cholesteryl ester species appear to be operational here. However, there is no specific lipid species or specific lipid metabolism gene that is causative. Rather, you have to knock down entire processes to have an effect. This suggests that the focus on lysosome health (i.e. permeability, proteolysis, etc.) is rudimentary. When you have to knock down entire classes of lipids, this would indicate broader effects on cellular lipids (including membrane lipids beyond the lysosome) and related cellular health.

      We included data on the effect of knocking down MYLIP, PSAP, and as a comparison PSMD2 on the growth rate of K562 cells (Figure S5A). MYLIP and PSAP KDs, which cause predominantly an accumulation of lipids, do not impede cell growth. Increasing lipid uptake by MYLIP KD increases cell proliferation under our culture conditions, suggesting a general negative impact on cell health was not required for the association between lipid levels and protein aggregates.

      (a) They conduct a superficial methyl-beta-cyclodextrin experiment with equivocal results. The use of MBCD for different time-courses to deplete various membrane cholesterol pools including the plasma membrane pool is important to ascertain what aspect of the cellular cholesterol is affecting proteostasis. MBCD +/- cholesterol reintroduction time-courses for rescue will also be key to determine the culprit cellular cholesterol pool.

      The MBCD / Filipin experiment helped us determine that ProteoStat doesn’t directly stain cholesterol, nor any major plasma membrane components. Free cholesterol was implicated in neither the screen nor the lipidomics and was not the subject of targeted experiments.

      (b) The same concept can be applied to sphingolipids. There are sphingolipids in abundance in multiple membrane compartments. Which ones are causal here? More nuanced evaluation of this with sphingolipid staining/tracking can be conducted.

      We attempted experiments where sphingolipids were added back to cells grown in FBS-depleted media. However, we were not able to deliver these lipid species consistently, and doing so while ensuring the correct subcellular localization at physiologically relevant levels would require substantial methods development.

      (c) As part of this, are lipid rafts and/or caveolae being affected by the perturbations in cholesterol and sphingolipids? Lipid rafts are highly enriched in these two lipids, which could link to their proteostasis observation.

      Indeed, ceramides released from SM hydrolysis are proposed to self-assemble into microdomains with negative curvature that can promote the formation of intralumenal vesicles (Alonso and Goni, 2018; Niekamp et al 2022). We propose that SM accumulation may hinder this process by counteracting the negative membrane curvature and thereby impede microautophagy.

      (d) How about ER membrane lipids? The UPR and subsequent effects on proteostasis are intricately involved with ER lipid bilayer composition.

      We did not perform lipidomics on ER membranes in this study, though we note that at steady state, sphingolipids and cholesterol esters are not expected to be enriched at the ER (Ikonen and Zhou, 2021). We checked whether lipid-related genetic perturbations induced the UPR in published perturb-seq data in K562 cells. Neither MYLIP nor PSAP knockdown induced a UPR.

      In conclusion, the manuscript is interesting but the excitement over a link between lysosome-related lipid metabolism and proteostasis needs to be tamped until a more robust experimental approach is employed to generate supportive and corroborating results.

      Reviewer #2 (Recommendations For The Authors):

      - The paper has a number of grammatically awkward sentences. Editing these would enhance clarity.

      - It is important to show the co-localization of aggregates with the lysosome. This is shown in supplements but should be in a main figure. Here the authors cite previous work indicating that ProteoStat puncta co-localize with ubiquitinated proteins and state that they do not see this, then essentially just move on. Is there an explanation for this discrepancy and can it be resolved? What do they think is really going on? What happens to levels of ubiquitinated proteins when lipid metabolism is perturbed as in these experiments?

      We have included the lipid-induced lysosomal protein aggregation data in the main text (Figure 3A-B), and provided fine-grained quantification of the cytosolic-vs-lysosomal ProteoStat / Ub / ThT signals under different aggregate-inducing conditions (Figure S4A-D). We discuss these results in the main text and propose a model involving ESCRT-mediated microautophagy. This is supported further by the LysoIP-proteomics and LMP analysis.

      - Please add an indicator of amino acid numbers to Fig. 3C.

      These annotations are now included (now Figure S3C).

      - The legend for 3D is mislabelled.

      We have corrected the legend (now Figure S3D).

      Reviewer #3 (Recommendations For The Authors):

      Protein homeostasis and lipid homeostasis are both important for maintaining cellular functions. However, their crosstalk remains largely unknown. The manuscript entitled "Impairment of lipid homoeostasis causes accumulation of protein aggregates in the lysosome" deals with this interesting topic. An important link between lysosomal protein aggregation and sphingolipid/cholesterol ester metabolism was discovered. The topic, which belongs to the Cell Biology domain, also falls within the aims and scope of eLife. Here are the revisions I recommend:

      (1) From lipidomics analysis, a remarkable correlation between levels of sphingomyelin and cholesterol ester and ProteoStat staining was found. Could the authors explain how sphingomyelin and cholesterol ester are quantified? The two lipids are not included as internal standards from the lipidomics experiment.

      Sphingomyelin and cholesterol ester internal standards are included in the Avanti 330707 SPLASH® LIPIDOMIX® Mass Spec Standard, which was supplied at 3% v/v to the MeOH/H2O cell lysis buffer. We have amended the Methods section to clarify this.

      (2) Could the authors perhaps delete Figure 1B and show it in Figure 2A only? There is no need to show the same figure twice. The thresholds for both False Discovery Rate and Median Enrichment need to be added. In Figure 2A, the lysosomal hydrolases (GBA, LIPA, GALC) seem to be located in the statistically insignificant region. Based on previous studies, GBA could have an effect on sphingolipid levels; how, then, can one explain that sphingomyelin was highly correlated with ProteoStat staining?

      We have combined the two volcano plots into a single figure (now Figure 1D), and added a line to help visualize the gene effects while considering the combined contribution of FDR and enrichment. Individual lysosomal hydrolases indeed have insignificant effects on ProteoStat and this is discussed in the main text as having relatively constrained impacts on the general lipidome. For example, while GBA and GALC KDs can lead to accumulation of their immediate substrates (glucosylceramide and galactosylceramide, respectively), they do not directly impinge on sphingomyelin.

      (3) The authors show the correlation between ProteoStat staining and different lipids/lipid classes in Figure 3B and Figure S3A. It is not necessary to show the correlation with individual lipids (such as sphingomyelin(d18:1/24:0) and cholesterol ester(18:2)). The correlation with the full collection of lipid classes, which is only listed in Figure 3B and Figure S3A, would be more representative. It is suggested to add information on how many individual lipids in each class are used for the correlation analysis. It is also suggested to replace Figure 3A with Figure S3A and to move Figure 3A to the supplement.

      We decided to retain the correlation of two individual lipids (a sphingomyelin and a cholesterol ester species) with ProteoStat as examples to illustrate with clarity how we obtained the class-wide comparison. The number of individual lipids included in each class for correlation analysis is now included in Figures 2F and S3A.

      (4) The authors state that lipid uptake and metabolism modulate proteostasis. However, only cholesterol and LDL were tested. It would be more precise to state that cholesterol uptake and metabolism modulate proteostasis. In addition, sphingolipids and cholesterol esters accumulate with increased lysosomal protein aggregation. It would be interesting to see the effects of sphingolipid uptake, since sphingolipids correlate with proteostasis better than cholesterol does.

      We attempted to add back specific sphingolipids to assess sufficiency. However, we found it challenging to ensure that these lipids were distributed to the correct subcellular locations at physiologically relevant levels. Without this crucial information, it was difficult to draw any conclusions about the sufficiency of the sphingolipids we tested to impair proteostasis.

      Alonso A, Goñi FM. 2018. The Physical Properties of Ceramides in Membranes. Annu Rev Biophys 47:633–654. doi:10.1146/annurev-biophys-070317-033309

      Ikonen E, Zhou X. 2021. Cholesterol transport between cellular membranes: A balancing act between interconnected lipid fluxes. Dev Cell 56:1430–1436. doi:10.1016/j.devcel.2021.04.025

      Niekamp P, Scharte F, Sokoya T, Vittadello L, Kim Y, Deng Y, Südhoff E, Hilderink A, Imlau M, Clarke CJ, Hensel M, Burd CG, Holthuis JCM. 2022. Ca2+-activated sphingomyelin scrambling and turnover mediate ESCRT-independent lysosomal repair. Nat Commun 13:1875. doi:10.1038/s41467-022-29481-4

    1. eLife Assessment

      This study provides a valuable look at genome-wide RNaseA-resistant RNA-DNA interactions in human embryonic stem cells. The research indicated that RNase treatment maintained long-range RNA-chromatin connections characterized by significant sequence conservation while abolishing permissive interactions. Interestingly, coding and non-coding RNA transcripts exhibited differing sensitivity to RNase treatment. Although the study findings reveal an intriguing RNase-inaccessible regulatory RNA-chromatin interactome, conclusions about the identity and regulatory significance of RNase-resistant RNA-chromatin interactions are incomplete and would benefit from more rigorous approaches that include additional computational and experimental controls.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript constitutes further analysis of a dataset generated for a previously published study from the same group. The experiments in the previous work use an RNA-DNA proximity assay to capture RNAs that interact with chromatin, especially beyond their site of transcription, by crosslinking and proximity ligation. Compared to other studies by the same group, the previous work added one novel feature to this treatment: the nuclei were treated with RNase A prior to crosslinking. The initial study concluded that long-range chromatin interaction via chromatin looping is affected by RNase treatment. In the current manuscript, the group analyze the data from this experiment in more detail. They describe some notable features of RNAs that remain after RNase treatment and where they are associated within the genome. Overall, the further analyses are somewhat useful, with some exceptions for specific analyses that are not clear in the current manuscript. The work is very complementary to the previously published original study, to the point that it is surprising it was not included in that study.

      Strengths:

      (1) The analyses are a useful complement that fill in gaps from the Calandrelli et al paper. Some of the findings are suggestive of RNA-protein networks that operate at long distances to regulate promoters.

      Weaknesses:

      (1) The beginning of the Results section, and elsewhere, describes steps that likely were performed in the previous publication from which the data are being further analyzed and possibly partially reanalyzed. The current manuscript should more clearly describe if there are any aspects of the pipeline that have been modified from the Calandrelli study (which does not have much detail regarding iMARGI parameters in the published paper) for the further analysis in this manuscript.

      (2) The RNase treatment approach is similar to that addressed in recent papers from the Jenner and Davidovich groups (https://doi.org/10.1016/j.celrep.2024.113856; https://doi.org/10.1016/j.celrep.2024.113858) where these groups found RNase treatment significantly affected solubility of chromatin, causing aggregation. The authors should address this work and place it in light of their current study.

      (3) Figure 1f: it is not clear what it means for genes to be "non-differentially expressed" in this context. Isn't this also RNase-insensitive? And how is the "Ctrl specific" RNA set determined? This is confusing, since RNase is assumed to degrade most of the RNA in these samples.

      (4) Figure 2a: The results are somewhat surprising, given that protein-coding genes are depleted more in the RNase treatment. Is the Ctrl set the same as in 1f? This emphasizes the importance of defining that population better.

      (5) Figure 3a: The text references this figure in ways that do not match the figure, referencing at least nine column clusters when there are only six. Heatmaps of certain TFs and "RAH explained" percentages don't seem to match the Results section description, either. The authors claim EZH2 binding sites are the top TF overlap with RAHs and yet do not include EZH2 in Figure 3a. Suz12 (EZH2 binding partner) and H3K27me3 (EZH2 product) are also referenced in the text for this figure, but not included in the figure itself.

      (6) The manuscript uses the term "non-diffusive RNA-chromatin interactome" which is not directly supported by data. The authors use the term initially to describe the RNase-resistant species in their previous work, but through the current study, they support a model where the RNase resistance is simply due to protection by protein binding, not by any constraints on diffusion in particular chromatin environments.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors re-analyze RNase-treated iMARGI data to systematically identify and analyze RNase-resistant RNA-chromatin interactions. In general

      Strengths:

      Analyses are well-thought-out and generally solid.

      Weaknesses:

      Conclusions are massively overstated, and though the analytical pipelines used are solid, the conclusions deriving from them lack the backing of solid computational and molecular controls.

    4. Reviewer #3 (Public review):

      Summary:

      The study investigated stable RNA-chromatin interactions by applying RNase treatment before the iMARGI (in situ mapping of RNA-genome interactome) procedure to remove promiscuous, unprotected RNA transcripts and selectively enrich for RNase-inaccessible, potentially functional RNA-chromatin interactions (RNA-transcription factor and RNA-histone). The researchers found that short-range interactions (<1kb) are RNase resistant, possibly due to protection by RNA polymerases. They noticed that long-range RNA-chromatin interactions (>2Mbp or interchromosomal) were also enriched after RNase treatment, hypothesizing that these interactions are stabilized by chromatin-binding proteins. They found that genic caRNAs were sensitive, while repeat-derived caRNAs, such as rRNA and satellite repeats, were resistant to RNase. Long non-coding RNAs (lncRNAs), particularly those associated with diseases, were over-represented among RNase-insensitive transcripts, indicating their potential regulatory significance. Additionally, RNase-insensitive caRNAs exhibited higher evolutionary conservation, implying that they are protected by protein complexes, especially in long-range interactions. RNA Attachment Hot Zones (RAHs) enriched post-RNase treatment were found to localize in functional genomic regions such as promoters, transcription factor binding sites (TFBS), and histone modification sites. Importantly, RNase treatment amplified specific RNA-transcription factor interactions, with caRNA signals being preserved at TFBS for factors with RNA-binding capabilities, suggesting that direct RNA-protein binding helps protect caRNAs from degradation. They also found that different TFs are enriched with specific caRNA species, distinguishing them from their genomic footprints. In addition, transcripts with higher abundance tend to enrich at more TFBS.

      Overall, the study highlights the role of RNase-inaccessible caRNAs in chromatin regulation and provides insight into their functional significance in genome organization.

      Strengths:

      This study involves rigorous and comprehensive data analysis involving datasets with very high sequencing depth and appropriate statistical tests (e.g., chi-square tests to validate the association between caRNAs and TFBS statistically). This analysis was further strengthened by comparing their results with orthogonal datasets, such as RedChIP and fRIP-seq, providing robust, cross-validated evidence for the caRNA-TFBS associations. In addition to examining broad interactions, the authors identified specific long-range RNA-chromatin interactions and pinpointed specific transcription factors and histone modification markers that are associated with these interactions. The authors explored the evolutionary implications of RNase-insensitive caRNAs and their potential medical relevance, particularly by identifying caRNAs linked to disease-associated genes and long non-coding RNAs (lncRNAs). This combination of detailed analysis, along with functional relevance, broadens the scope of the research, making it a significant contribution to chromatin biology.

      Weaknesses:

      However, I have the following concerns regarding the studies:

      (1) I don't understand the logic behind calling promoters, enhancers, and similar regions "functionally important regions" when describing the enrichment of RNase-insensitive interactions. Genic regions that are RNase-sensitive are also functionally relevant. So, what makes promoters, enhancers, etc., unique in terms of functionality?

      (2) While the study offers strong evidence for associations between caRNAs, transcription factors, and chromatin markers, it lacks direct functional validation experiments, such as RNA knockdown or CRISPR interference, to confirm the specific roles of these RNAs in gene regulation or chromatin structure modifications.

      (3) Another limitation is the incomplete investigation of caRNAs with short-range interactions (<1kb). The authors hypothesized that these are protected by RNA polymerases but did not provide supporting experimental evidence or references to previous studies. Offering either experimental validation or a rationale for excluding these short-range interactions would strengthen this hypothesis. The authors' conclusion that "chromatin-associated RNAs (caRNAs) involved in short- to middle-range interactions are more susceptible to RNase treatment" does not make clear what specific distance counts as "short-range". The data shown in Supplementary Figure 2a contradicted the conclusion in the discussion that "long-distance RNA-chromatin interactions are preferentially preserved after RNase treatment, while short-range interactions are depleted," as well as the suggestion made in the paper linking RNase inaccessibility to evolutionary conservation.

      (4) The study heavily relies on RNase treatment to isolate stable RNA-chromatin interactions, which might neglect important transient or weak interactions and overlook the functional relevance of RNase-sensitive interactions, hence missing the dynamic nature of RNA-chromatin interactions.

      (5) The analysis is limited to human embryonic stem cells (H1 cells), which might restrict the generalizability of the findings. Expanding the study to include a cell type that represents a broader range of cell types or tissues will strengthen the conclusions.

      (6) The term "RNase A treatment" in the methods section could be clearer if specified as "RNase-treated iMARGI," which encompasses the standard iMARGI protocol.

      (7) There is some ambiguity regarding whether the researchers generated new data or reanalyzed existing datasets. While it is mentioned early on that previously published RNase-treated iMARGI datasets were reanalyzed, the text later states that "three biological replicates were generated for the RNase-treated samples." Clarifying whether the data were newly generated in this study or obtained from public datasets would improve the clarity.

      (8) The color scheme should be the same for the heatmaps of control and RNase-treated samples in Figure 4.

    5. Author response:

      We thank the editors and reviewers for their thorough evaluation of our manuscript. We appreciate the constructive feedback and insights provided. 

      We acknowledge that some of our conclusions would benefit from more measured statements and additional computational controls. We will revise the manuscript to better reflect the scope and limitations of our analytical approach. While we cannot add new experimental validations at this stage, we will strengthen our computational analyses and clarify our methodology.

      Below, we outline our planned revisions to address the major points raised in the public reviews:

      Clarification of Terms and Definitions:

      (1) We will make it clearer in our manuscript that we reuse the same raw datasets from our previous study, as described in Calandrelli et al., 2023, and that there is no modification to the experimental methods or data.

      (2) We will provide clear definitions for:

      - "Non-differentially expressed" genes

      - "Ctrl specific" RNA sets

      - The composition of control populations in different analyses

      (3) We will revise the use of "non-diffusive RNA-chromatin interactome" and “RNase-resistant” terminology to better reflect our actual findings.

      (4) We will also improve clarity regarding:

      - The rationale for focusing on specific genomic regions

      - The interpretation of evolutionary conservation data

      (5) We will provide additional rationale on the exclusion of short-range interactions.

      Figure Revisions:

      (1) Figure 3a: We will correct any discrepancy between text references and figure content.

      (2) Figure 4: We will standardize the color scheme between control and RNase-treated samples.

      (3) We will follow the reviewer's suggestion to move Figure 1g to the supplementary file.

      Additional Computational Analyses:

      (1) We will consider adding controls for RNA length effects and integrating existing knowledge on how the extent of protection varies across different RBPs.

      Discussions:

      (1) We will carefully rephrase our conclusions to more accurately reflect the scope and limitations of our computational findings, ensuring we do not overstate the implications.

      (2) We will expand the discussion of limitations, including:

      - The focus on RNase-resistant interactions only

      - The cell-type specificity of our findings

      - The lack of functional validation

      - The limited ability to discern and study the transient or weak RNA-chromatin interactions using the current dataset

      (3) Regarding the recent papers from Jenner and Davidovich groups about RNase treatment effects on chromatin solubility:

      - We will discuss these findings in our revised manuscript

      - We will address potential limitations this may impose on our interpretations

    1. eLife Assessment

      The current study presents useful findings about the inhibition of a membrane pyrophosphatase by non-hydrolyzable phosphonate substrate analogs. The study proposes a model in which the two monomers in a functional dimer interact with the phosphonate molecules in an asymmetric fashion. While asymmetry has been previously demonstrated through other studies, the DEER spectroscopy data presented in the current study provide incomplete evidence of the proposed asymmetry near the binding site.

    2. Reviewer #1 (Public review):

      Summary:

      This work examines the binding of several phosphonate compounds to a membrane-bound pyrophosphatase using several different approaches, including crystallography, electron paramagnetic resonance spectroscopy, and functional measurements of ion pumping and pyrophosphatase activity. The work attempts to synthesize these different approaches into a model of inhibition by phosphonates in which the two subunits of the functional dimer interact differently with the phosphonate.

      Strengths:

      This study integrates a variety of approaches, including structural biology, spectroscopic measurements of protein dynamics, and functional measurements. Overall, data analysis was thoughtful, with careful analysis of the substrate binding sites (for example, calculation of POLDER omit maps).

      Weaknesses:

      Unfortunately, the protein did not crystallize with the more potent phosphonate inhibitors. Instead, structures were solved with two compounds with weak inhibitory constants >200 micromolar, which limits the molecular insight into compounds that could possibly be developed into small molecule inhibitors. Likewise, the authors choose to focus the spectroscopy experiments on these weaker binders, missing an opportunity to provide insight into the interaction between more potent binders and the protein.

      In general, the manuscript falls short of providing any major new insight into membrane-bound pyrophosphatases, which are a very well-studied system. Subtle changes in the structures and ensemble distance distributions suggest that the molecular conformations might change a little bit under different conditions, but this isn't a very surprising outcome. It's not clear whether these changes are functionally important, or just part of the normal experimental/protein ensemble variation.

      The ZLD-bound crystal structure doesn't predict the DEER distances, and the conformation of Na+ binding site sidechains in the ZLD structure doesn't predict whether sodium currents occur. This might suggest that the ZLD structure captures a conformation that does not recapitulate what is happening in solution/ a membrane.

    3. Reviewer #2 (Public review):

      Summary:

      Crystallographic analysis revealed the asymmetric conformation of the dimer in the inhibitor-bound state. Based on this result, which is consistent with previous time-resolved analysis, the authors verified the dynamics and the distance between introduced spin labels by DEER spectroscopy in solution and predicted possible patterns of the asymmetric dimer.

      Strengths:

      Crystal structures with inhibitor bound provide detailed coordination in the binding pocket, and thus useful information for the PPase field and possibly for drug development.

      Weaknesses:

      The distance information measured by DEER is advantageous for verifying the dynamics and structure of membrane proteins in solution. However, regarding the T211 data, which, as the authors themselves stated, lack measurement precision, it is unclear to readers how confidently one can judge the conclusions drawn from these data for the cytoplasmic side.

      The distance information for the luminal site, which the authors claim is more accurate, does not indicate either the possibility or the basis for why it is the ensemble of two components and not simply a structure with a shorter distance than the crystal structure.

    4. Reviewer #3 (Public review):

      Summary:

      Membrane-bound pyrophosphatases (mPPases) are homodimeric proteins that hydrolyze pyrophosphate and pump H+/Na+ across membranes. They are attractive drug targets against protist pathogens. Non-hydrolysable PPi analogue bisphosphonates such as risedronate (RSD) and pamidronate (PMD) serve as primary drugs currently used. Bisphosphonates have a P-C-P bond whose central carbon can accommodate up to two substituents, allowing a large compound variability. Here the authors solved two TmPPase structures in complex with the bisphosphonates etidronate (ETD) and zoledronate (ZLD) and monitored their conformational ensemble using DEER spectroscopy in solution. These results reveal the inhibition mechanism of these compounds, which is crucial for developing future small molecule inhibitors.

      Strengths:

      The authors show that seven different bisphosphonates can inhibit TmPPase with IC50 values in the micromolar range. Branched aliphatic and aromatic modifications showed weaker inhibition.

      High-resolution structures for TmPPase with ETD (3.2 Å) and ZLD (3.3 Å) are determined. These structures reveal the binding mode and shed light on the inhibition mechanism. The nature of modification on the bisphosphonate alters the conformation of the binding pocket.

      The conformational heterogeneity is further investigated using DEER spectroscopy under several conditions.

      Weaknesses:

      The authors observed asymmetry in the TmPPase-ETD structure above the hydrolytic center. The structural asymmetry arises due to differences in the orientation of ETD within each monomer at the active site. As a result, loop5-6 of the two monomers is oriented differently, resulting in the observed asymmetry. The authors attempt to further establish this asymmetry using DEER spectroscopy experiments. However, the (over)interpretation of these data leads to more confusion than any further understanding. DEER data suggest that the asymmetry observed in the TmPPase-ETD structure in this region might be funneled from the broad conformational space under the crystallization conditions.

      DEER data for position T211R1 at the enzyme entrance reveal a highly flexible conformation of loop5-6 (and do not provide any direct evidence for asymmetry, Figure EV8). Similarly, data for position S525R1 near the exit channel do not directly support the proposed asymmetry for ETD. Despite the high quality of the data, they reveal a very similar distance distribution. The reported changes in distances are very small (+/- 0.3 nm), which can be accommodated by a change of spin label rotamer distribution alone. Further, these spin labels are located on a flexible loop, thereby making it difficult to directly relate any distance changes to the global conformation.

      The interpretations listed below are not supported by the data presented:

      (1) 'In the presence of Ca2+, the distance distribution shifts towards shorter distances, suggesting that the two monomers come closer at the periplasmic side, and consistent with the predicted distances derived from the TmPPase:Ca structure.'

      Problem: This is a far-fetched interpretation of a tiny change, which is not reliable for the reasons described in the paragraph above.

      (2) 'Based on the DEER data on the IDP-bound TmPPase, we observed significant deviations between the experimental and the in silico distances derived from the TmPPase:IDP X-ray structure for both cytoplasmic- (T211R1) and periplasmic-end (S525R1) sites (Figure 4D and Figure EV8D). This deviation could be explained by the dimer adopting an asymmetric conformation under the physiological conditions used for DEER, with one monomer in a closed state and the other in an open state.'

      Problem: The authors are trying to establish asymmetry using the DEER data. Unfortunately, contrary to the authors' claim, no significant difference is observed (between simulation and experiment) for position 525 (Figure 4D bottom panel). The observed difference for position 211 must be accounted for by the flexibility, and the data provide no direct evidence for any asymmetry.

      (3) 'Our new structures, together with DEER distance measurements that monitor the conformational ensemble equilibrium of TmPPase in solution, provide further solid experimental evidence of asymmetry in gating and transitional changes upon substrate/inhibitor binding.'

      Problem: See above. The DEER data do not support any asymmetry.

      (4) 'Based on these observations, and the DEER data for +IDP, which is consistent with an asymmetric conformation of TmPPase being present in solution, we propose five distinct models of TmPPase (Figure 7).'

      Problem: Again, the DEER data do not support any asymmetry and the authors may revisit the proposed models.

      (5) 'In model 2 (Figure 7), one active site is semi-closed, while the other remains open. This is supported by the distance distributions for S525R1 and T211R1 for +Ca/ETD informed by DEER, which agrees with the in silico distance predictions generated by the asymmetric TmPPase:ETD X-ray structure'

      Problem: This is neither convincing nor supported by the data.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work examines the binding of several phosphonate compounds to a membrane-bound pyrophosphatase using several different approaches, including crystallography, electron paramagnetic resonance spectroscopy, and functional measurements of ion pumping and pyrophosphatase activity. The work attempts to synthesize these different approaches into a model of inhibition by phosphonates in which the two subunits of the functional dimer interact differently with the phosphonate.

      Strengths:

      This study integrates a variety of approaches, including structural biology, spectroscopic measurements of protein dynamics, and functional measurements. Overall, data analysis was thoughtful, with careful analysis of the substrate binding sites (for example, calculation of POLDER omit maps).

      Weaknesses:

      Unfortunately, the protein did not crystallize with the more potent phosphonate inhibitors. Instead, structures were solved with two compounds with weak inhibitory constants >200 micromolar, which limits the molecular insight into compounds that could possibly be developed into small molecule inhibitors. Likewise, the authors choose to focus the spectroscopy experiments on these weaker binders, missing an opportunity to provide insight into the interaction between more potent binders and the protein.

      We acknowledge the reviewer's concern regarding the choice of weaker inhibitors. We attempted co-crystallization with all available inhibitors, including those with higher potency. However, despite numerous efforts, these potent inhibitors yielded low-resolution crystals, making them unsuitable for detailed structural analysis. Therefore, we chose to focus on the weaker binders, as we were able to obtain high-quality crystal structures for these compounds. This allowed us to perform DEER spectroscopy with the added advantage of accurately analyzing the data against structural models derived from X-ray crystallography. Using these weaker inhibitors enabled a more precise interpretation of the DEER data, thus providing reliable insights into the conformational dynamics and inhibition mechanism. However, as suggested by the reviewer, in the revised version, we will perform DEER analysis on the more potent inhibitors to provide additional insight into their interactions.

      In general, the manuscript falls short of providing any major new insight into membrane-bound pyrophosphatases, which are a very well-studied system. Subtle changes in the structures and ensemble distance distributions suggest that the molecular conformations might change a little bit under different conditions, but this isn't a very surprising outcome. It's not clear whether these changes are functionally important, or just part of the normal experimental/protein ensemble variation.

      We respectfully disagree with the reviewer. The scale of motions seen in this study corresponds to that seen in the full panoply of crystal structures of mPPases. Some proteins undergo very large conformational changes during catalysis – such as the rotary ATPase. This one doesn’t, meaning that the precise motions we describe are likely to be relevant. Conformational changes in the ensemble, whether large or small, represent essential protein motions which underlie key mPPase catalytic function. Our DEER spectroscopy data demonstrate the sensitivity and resolution necessary to monitor these subtle changes in equilibria, even if these are only a few Angstroms. For several of the conditions we investigated by DEER in solution, corresponding X-ray structures have been solved, with the derived distances agreeing well with the DEER distributions. This further validates the biological relevance of the structures, including serial time-resolved ones that indicate asymmetry.

      The ZLD-bound crystal structure doesn't predict the DEER distances, and the conformation of Na+ binding site sidechains in the ZLD structure doesn't predict whether sodium currents occur. This might suggest that the ZLD structure captures a conformation that does not recapitulate what is happening in solution/ a membrane.

      We agree with the reviewer that the ZLD-bound crystal structure does not predict the DEER distances. However, we believe this discrepancy arises from the bulkiness of the ZLD inhibitor, which prevents the closure of the hydrolytic centre. Additionally, the absence of Na+ at the ion gate in the ZLD-bound structure suggests that Na+ transport does not occur, a conclusion further supported by our electrometric measurements. We agree with the reviewer that the distances observed in the DEER experiments might represent a potential new conformation in solution, which may not be captured by the static X-ray structure, thereby offering insights into the dynamic nature of the protein under physiological conditions. Finally, the static X-ray structures have not captured the asymmetric conformations that must exist to explain half-of-the-sites reactivity.

      Reviewer #2 (Public review):

      Summary:

      Crystallographic analysis revealed the asymmetric conformation of the dimer in the inhibitor-bound state. Based on this result, which is consistent with previous time-resolved analysis, the authors verified the dynamics and the distance between introduced spin labels by DEER spectroscopy in solution and predicted possible patterns of the asymmetric dimer.

      Strengths:

      Crystal structures with inhibitor bound provide detailed coordination in the binding pocket, and thus useful information for the PPase field and possibly for drug development.

      Weaknesses:

      The distance information measured by DEER is advantageous for verifying the dynamics and structure of membrane proteins in solution. However, regarding the T211 data, which, as the authors themselves stated, lack measurement precision, it is unclear to readers how confidently one can judge the conclusions drawn from these data for the cytoplasmic side.

      We thank the reviewer for acknowledging the advantageous use of the DEER methodology for identifying dynamic states of membrane proteins in solution. We used two sites in our analysis: S525 (periplasm) and T211 (cytoplasm). As we clearly stated in the original manuscript, S525R1 yielded high-quality DEER data, while T211R1 yielded weak (or no) visual oscillations, leading to broad, though different, distributions for the several conditions tested. Our main conclusions are based on the S525R1 data. We included the T211R1 data because, although it does not provide definitive evidence, it is consistent with our proposed model and offers additional insights into biologically relevant conditions. Furthermore, the shifts in the centre of mass (Fig EV8D) of the broad T211R1 distributions show a trend that is consistent with our model; although they do not prove it, they do not exclude it either. Lastly, these data do confirm an important structural feature of mPPase in solution: the intrinsically highly dynamic state of loop5-6, where T211 is located, which is consistent with our previous (Kellosalo et al., Science, 2012; Li et al., Nat. Commun, 2016; Vidilaseris et al., Sci. Adv., 2019; Strauss et al., EMBO Rep., 2024) and current X-ray crystallography data.

      The distance information for the luminal site, which the authors claim is more accurate, does not indicate either the possibility or the basis for why it is the ensemble of two components and not simply a structure with a shorter distance than the crystal structure.

      We thank the reviewer for pointing out this possibility and alternative interpretation of our DEER data. In the revised version, we will show that our DEER data are consistent with (and do not exclude) asymmetry and rephrase to be inclusive of other possibilities. Importantly, this additional possibility does not affect the current interpretation of the data in our manuscript.

      Reviewer #3 (Public review):

      Summary:

      Membrane-bound pyrophosphatases (mPPases) are homodimeric proteins that hydrolyze pyrophosphate and pump H+/Na+ across membranes. They are attractive drug targets against protist pathogens. Non-hydrolysable PPi analogue bisphosphonates such as risedronate (RSD) and pamidronate (PMD) serve as the primary drugs currently used. Bisphosphonates have a P-C-P bond whose central carbon can accommodate up to two substituents, allowing large compound variability. Here the authors solved two TmPPase structures in complex with the bisphosphonates etidronate (ETD) and zoledronate (ZLD) and monitored their conformational ensemble using DEER spectroscopy in solution. These results reveal the inhibition mechanism of these compounds, which is crucial for developing future small molecule inhibitors.

      Strengths:

      The authors show that seven different bisphosphonates can inhibit TmPPase with IC50 values in the micromolar range. Branched aliphatic and aromatic modifications showed weaker inhibition.

      High-resolution structures for TmPPase with ETD (3.2 Å) and ZLD (3.3 Å) are determined. These structures reveal the binding mode and shed light on the inhibition mechanism. The nature of modification on the bisphosphonate alters the conformation of the binding pocket.

      The conformational heterogeneity is further investigated using DEER spectroscopy under several conditions.

      Weaknesses:

      The authors observed asymmetry in the TmPPase-ETD structure above the hydrolytic center. The structural asymmetry arises due to differences in the orientation of ETD within each monomer at the active site. As a result, loop5-6 of the two monomers is oriented differently, resulting in the observed asymmetry. The authors attempt to further establish this asymmetry using DEER spectroscopy experiments. However, the (over)interpretation of these data leads to more confusion than any further understanding. DEER data suggest that the asymmetry observed in the TmPPase-ETD structure in this region might be funneled from the broad conformational space under the crystallization conditions.

      See also the response below - We respectfully disagree with the reviewer. The asymmetry was previously established using serial time crystallography (Strauss et al., EMBO Rep, 2024) and biochemical assays (e.g. Malinen et al., Prot. Sci., 2022; Artukka et al., Biochem J, 2018; Luoto et al., PNAS, 2013) and also partially seen in one static structure (Vidilaseris et al., Sci Adv 2019). DEER data only show that the previously proposed asymmetry could also be present within the conformational ensemble in solution conditions. Indeed, our data do not (and cannot) exclude this possibility.

      DEER data for position T211R1 at the enzyme entrance reveal a highly flexible conformation of loop5-6 (and do not provide any direct evidence for asymmetry, Figure EV8).

      Please see the relevant response above. We acknowledge that T211 is indeed situated on a highly dynamic loop, which is important for gating, and our DEER data confirm its high flexibility. Given that we have not observed oscillations for this site, leading to broad distributions, we stated in the original manuscript that we will not establish the presence of any asymmetry in solution on the basis of T211, relying instead on the S525 site, for which we have acquired high-quality DEER data, as was also pointed out by all reviewers.

      Similarly, data for position S525R1 near the exit channel do not directly support the proposed asymmetry for ETD.

      The reviewer appears to suggest that we hold the S525R1 DEER data as direct proof of asymmetry; this is combative on the grounds that to directly prove asymmetry would require time-resolved DEER measurements, far beyond the scope of this work. Rather, we have applied DEER measurements to explore whether asymmetry (observed previously via time-resolved X-ray crystallography) is also present (or indeed a possibility) in solution. We simply state that the DEER data are consistent with asymmetry (i.e., that the mean distance increases in the presence of ETD compared to the apo-state). This is a restrained interpretation of the data.

      Despite the high quality of the data, they reveal a very similar distance distribution. The reported changes in distances are very small (+/- 0.3 nm), which can be accommodated by a change of spin label rotamer distribution alone. Further, these spin labels are located on a flexible loop, thereby making it difficult to directly relate any distance changes to the global conformation.

      We thank the reviewer for recognising the high quality of our DEER data for S525R1, where visual oscillations in the raw traces, as in our case, reportedly lead to highly accurate and reliable distributions, able to separate (in fortuitous cases) helical movements of only a few Angstroms. The ability of DEER/PELDOR to offer near-Angstrom resolution was previously demonstrated by the acquisition and solution of high-resolution multi-subunit spin-labelled membrane protein structures (Pliotas et al., PNAS, 2012; Pliotas et al., Nat Struct Mol Biol, 2015; Pliotas, Methods Enzymol, 2017), as well as by its ability to detect small conformational changes (of similar magnitude to those in mPPase) in different integral membrane protein systems (Kapsalis et al., Nature Comms, 2019; Kubatova et al., PNAS, 2023; Schmidt et al., JACS, 2024; Lane et al., Structure, 2024; Hett et al., JACS, 2021; Zhao et al., Nature, 2024), occurring under different conditions and/or stimuli in solution and/or lipid environment. The changes here are not very small (e.g. ~ 7 Angstroms between the two mean distance extremes (Ca vs IDP)) for DEER’s proven detection sensitivity, and with all other conditions showing changes between those extremes.

      These changes are relatively small, but they are expected for membrane ion pumps. Indeed, none of the mPPase structures show helical movements of greater than half a turn, and that only in helices 6 and 12. There appear to be larger-scale loop closing motions of the 5-6 loop that includes T211, due to the presence of E217, which binds to one of the Mg2+ ions that coordinate the leaving group phosphate. (This is, inter alia, the reason that this loop is so flexible: it cannot order before substrate is bound.) Here we have the resolution to detect such subtle differences by DEER, given there are clear shifts in our time domain data and these are reflected in the mean distances in the distributions. Therefore, our study demonstrates the sensitivity and resolution DEER offers in detecting subtle conformational transitions, which are key in membrane protein pathways. To further belabour this point, we do not quantify the DEER data (for instance through parametric fitting) to extract populations of different conformational states and we appreciate that to do so would be highly prone to error; however, we do (and can, we feel without overinterpretation) assert that the mean distances shift.

      The interpretations listed below are not supported by the data presented:

      (1) 'In the presence of Ca2+, the distance distribution shifts towards shorter distances, suggesting that the two monomers come closer at the periplasmic side, and consistent with the predicted distances derived from the TmPPase:Ca structure.' Problem: This is a far-stretched interpretation of a tiny change, which is not reliable for the reasons described in the paragraph above.

      While the authors overall agree with the reviewer's assessment that ±0.3 nm is a small (not a minor) change, there are literature examples quantifying (or using for quantification) distribution peaks separated by similar Δr (Kubatova et al., PNAS, 2023; Schmidt et al., JACS, 2024; Hett et al., JACS, 2021; Zhao et al., Nature, 2024). In particular, none of the mPPase structures show helical movements of greater than half a turn (in helices 6 and 12 in particular). There appear to be larger-scale loop closing motions of the 5-6 loop that includes T211, due to the presence of E217, which binds to one of the Mg2+ ions that coordinate the leaving group phosphate. (This is, inter alia, the reason that this loop is so flexible: it cannot order before substrate is bound.)

      Importantly, we have fitted Gaussians to the experimental distance distributions of 525R1 output by the Comparative Deer Analyzer 2.0 and observed a change in the distribution width in the presence of Ca2+, implying the rotameric freedom of the spin label is restricted. However, the CW-EPR spectra for 525R1 indicate that the rotational correlation time of the spin label is highly consistent between conditions (the spectra are almost identical); this cannot be explained simply by rotameric preference of the spin label (as asserted by reviewer 3), as there is no (further) immobilisation observed from the CW-EPR of the apo-state (Figure EV9) to that in the presence of Ca2+. Furthermore, in the absence of conformational changes, it is reasonable to assume (and demonstrable from the CW-EPR data) that the rotamer cloud should not significantly change between conditions. However, Gaussian fits of the two extreme cases yielding the longest (i.e., in the presence of IDP) and shortest (in the presence of ZTD) mean distances for the 525R1 DEER data indicated significant (i.e., above the noise floor after Tikhonov validation) probability density for the IDP condition at 50 Å (P(r) = 0.18). This occurs at four standard deviations above the mean of the ZTD condition, which by random chance should occur with <0.007% probability. Indeed, one can say that to observe 18% probability density at four standard deviations above the mean by random chance would occur on the order of one in 4 x 10^6.
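
      The tail-probability figure quoted above can be checked with a short calculation. The sketch below assumes that the reference distance distribution is well approximated by a single Gaussian (an assumption made here purely for illustration; the function name is hypothetical) and computes the probability mass lying at least four standard deviations from the mean.

```python
from math import erfc, sqrt


def gaussian_tail(k: float) -> tuple[float, float]:
    """Return (one-sided, two-sided) probability of a normal variate
    lying at least k standard deviations from its mean."""
    one_sided = 0.5 * erfc(k / sqrt(2.0))  # P(Z >= k)
    return one_sided, 2.0 * one_sided      # P(|Z| >= k)


one_sided, two_sided = gaussian_tail(4.0)
print(f"one-sided: {one_sided:.3e}, two-sided: {two_sided:.3e}")
```

      Under this Gaussian assumption, the two-sided value is roughly 0.006%, which is in line with the <0.007% bound stated in the response.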

      As in the previous response, the method can detect changes of such magnitude, which are not small but physiologically relevant and expected for integral membrane proteins such as mPPases. Indeed, even in equally (or more) complex systems such as heptameric mechanosensitive channel proteins, DEER provided sub-Angstrom accuracy when a spin-labelled high-resolution XRC structure was solved (Pliotas et al., PNAS, 2012; Pliotas et al., Nat Struct Mol Biol, 2015). Although this is an ideal and not very common case, in which DEER accuracy was experimentally validated by another high-resolution structural method on a modified membrane protein, it demonstrates the power of the method: especially when strong oscillations are present in the raw DEER data (as here for mPPase 525R1), Angstrom resolution is achievable in such challenging protein classes, even when multiple distances are present.

      (2) 'Based on the DEER data on the IDP-bound TmPPase, we observed significant deviations between the experimental and the in silico distances derived from the TmPPase:IDP X-ray structure for both cytoplasmic- (T211R1) and periplasmic-end (S525R1) sites (Figure 4D and Figure EV8D). This deviation could be explained by the dimer adopting an asymmetric conformation under the physiological conditions used for DEER, with one monomer in a closed state and the other in an open state.'

      Problem: The authors are trying to establish asymmetry using the DEER data. Unfortunately, no significant difference is observed (between simulation and experiment) for position 525 as the authors claim (Figure 4D bottom panel). The observed difference for position 211 must be accounted for by the flexibility and the data provide no direct evidence for any asymmetry.

      Reviewer 3 is wrong in suggesting that we are trying to prove asymmetry through the DEER data. Asymmetry is well established in the literature (e.g. Vidilaseris et al., Sci Adv 2019, where we show (1) that the exit channel inhibitor ATC (i.e., close to 525) binds better in solution to the TmPPase:PPi complex than the TmPPase:PPi2 complex, and (2) that ATC binds in an asymmetric fashion to the TmPPase:IDP2 complex with just one ATC dimer on one of the exit channels). We merely use the DEER data to support this well-established fact.

      However, we agree that the DEER data in the presence of IDP do not provide direct proof for asymmetry; in particular, mutant T211R1 yields in silico distributions too short for measurement by DEER. It is possible that the deviations observed (and particularly likely for T211R1) arise from conformational heterogeneity in solution. We will rephrase this paragraph accordingly: “Owing to the broad nature of the T211R1 (cytoplasmic site) distance distributions, we refrain from interpreting shifts in this data. For the 525R1 (periplasmic site) for which we obtained data of high quality (as also pointed out by both reviewers 2 and 3) we observed deviations between the experimental and the in-silico distances derived from the TmPPase:IDP X-ray structure. While this deviation is less pronounced than for the +ZTD condition, the deviation is consistent with an asymmetric conformation in solution.”

      (3) 'Our new structures, together with DEER distance measurements that monitor the conformational ensemble equilibrium of TmPPase in solution, provide further solid experimental evidence of asymmetry in gating and transitional changes upon substrate/inhibitor binding.'

      Problem: See above. The DEER data do not support any asymmetry.

      We feel that the reviewer comments here are somewhat unfounded. The DEER data (and we will limit discussion only to the 525R1 mutant in this regard) satisfy relevant criteria of the white paper (Schiemann et al., 2021, JACS) from the EPR community (signal-to-noise ratio w.r.t modulation depth of > 20 in all cases; replicates have been performed and will be added into the main-text or supplementary; near quantitative labelling efficiency (evidenced by lack of free spin label signal in the CW-EPR spectra); analysed using the CDA (now Figure EV10, this data we will promote to the main-text) to avoid confirmation bias).

      While the DEER data do not prove asymmetry, we do not claim proof of asymmetry in the above sentence. We agree to rephrase the offending sentence above as: “Our new structures, together with DEER distance measurements that monitor the conformational ensemble of TmPPase in solution, do not exclude asymmetry in gating and transitional changes upon substrate/inhibitor binding and are consistent with our proposed model.” We feel that this reframed conjecture of asymmetry is well founded; indeed, comparing the experimental apo-state 525R1 distance distribution with in-silico modelling performed on the hybridised asymmetric structure (i.e., comprised of one monomer bound to Ca2+ and another bound to IDP) yields an overlap coefficient (Islam and Roux, JPC B, 2015) of >0.97. This implies the envelope of the modelled distance distribution is quantitatively inside the envelope of the experimental distance distribution. Thus, the DEER data do not exclude asymmetry (previously observed by time-resolved XRC) in solution. While we appreciate that ideally one would measure time-resolved DEER to directly correlate kinetics of conformational changes within the ensemble to the catalytic cycle of mPPase (and this is something we aim to do in the future), it is beyond the scope of this study.

      Indeed, half-of-the-sites reactivity has been demonstrated in at least the following papers (Vidilaseris et al., Sci. Adv., 2019; Strauss et al., EMBO Rep., 2024; Malinen et al., Prot Sci, 2022; Artukka et al., Biochem J, 2018; Luoto et al., PNAS, 2013). Half-of-the-sites activity requires asymmetry in the mechanism, and therefore asymmetric motions in the active site (viz 211) and exit channel (viz 525). As mentioned above, we have demonstrated this for other inhibitors (Vidilaseris et al 2019) and as part of a time-resolved experiment (Strauss et al 2024). In fact, given the wealth of evidence showing that the symmetrical crystal structures sample a non- or less-productive conformation of the protein, it would be quixotic to propose the DEER experiments - in solution - do not generate asymmetric conformations. It certainly doesn’t obey Occam’s razor of choosing the simplest possible explanation that covers the data.

      (4) Based on these observations, and the DEER data for +IDP, which is consistent with an asymmetric conformation of TmPPase being present in solution, we propose five distinct models of TmPPase (Figure 7).

      Problem: Again, the DEER data do not support any asymmetry and the authors may revisit the proposed models.

      We respectfully disagree with the reviewer. Please see our detailed response above. However, in the revised version, we will clarify that the proposed models are not solely based on the DEER data but are grounded in both current and previously solved structures, with the DEER data providing additional consistency with these models.

      (5) 'In model 2 (Figure 7), one active site is semi-closed, while the other remains open. This is supported by the distance distributions for S525R1 and T211R1 for +Ca/ETD informed by DEER, which agrees with the in silico distance predictions generated by the asymmetric TmPPase:ETD X-ray structure'

      Problem: Neither convincing nor supported by the data

      We respectfully disagree with the reviewer. However, owing to the conformational heterogeneity of T211R1, in the revised version, we will exclude it in the above sentence, to the effect: Please see our detailed response above.

    1. eLife Assessment

      This important study used whole genome data to investigate Beefalo ancestry for the first time. It provides insight into the genetics of Beefalo cattle, definitively challenging the long-held claim of 37.5% buffalo ancestry reported by the American Beefalo Association. The results are convincing, with a comprehensive range of well-established population genomics methods being used to estimate ancestry in these animals. This work will be of significant interest to evolutionary biologists, population geneticists, animal breeders, and those involved in the conservation genetics of bovine species.

    2. Reviewer #1 (Public review):

      Summary:

      This study used whole genome data to investigate Beefalo ancestry for the first time, filling a gap in the field. The authors used preserved semen samples to generate genomic data on 47 registered Beefalo and 3 bison hybrids, calling into question the ABA's stated goal of ⅜ bison ancestry. In addition, the authors show that ancestry profiles of Beefalo and bison hybrid genomes are consistent with repeated backcrossing to either parental species, demonstrating the value of genomic information in examining gene flow between species in the genus Bison. This is an interesting study that still has some major weaknesses, but overall, the work demonstrates the utility of genomic information in validating specific breeding claims for a more complete understanding of gene flow and genetic variation among bovine species.
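
      To make the ⅜ target and the backcrossing argument concrete, the following minimal sketch computes expected ancestry fractions under simple pedigree arithmetic, where an offspring's expected ancestry is the mean of its parents' fractions. The mating scheme and function names are hypothetical illustrations, not the ABA's or the authors' documented procedure, and realized ancestry in real animals varies around these expectations.

```python
from fractions import Fraction


def offspring_ancestry(sire: Fraction, dam: Fraction) -> Fraction:
    """Expected bison ancestry of an offspring: the mean of its parents' fractions."""
    return (sire + dam) / 2


BISON, CATTLE = Fraction(1), Fraction(0)

f1 = offspring_ancestry(BISON, CATTLE)             # first-generation hybrid
backcross = offspring_ancestry(f1, CATTLE)         # F1 backcrossed to cattle
three_eighths = offspring_ancestry(f1, backcross)  # one hypothetical route to 3/8

# Repeated backcrossing to cattle halves the expected bison fraction each generation.
lineage = [f1]
for _ in range(3):
    lineage.append(offspring_ancestry(lineage[-1], CATTLE))

print(three_eighths, [str(x) for x in lineage])
```

      Under this arithmetic, a few generations of backcrossing to cattle are enough to bring expected bison ancestry well below the 37.5% target.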

      Strengths:

      Numerous genetic analysis methods such as PCA, ADMIXTURE, F4 ratios, and local ancestry inference techniques revealed that no single Beefalo set meets the ancestry requirements set by the American Beefalo Association (ABA) and some beefalo had detectable indicine cattle ancestry.

      Weaknesses:

      While this study contributes to our knowledge of Beefalo ancestry, there are some key issues that need to be addressed in terms of analysing the specific results as well as writing the article.

    3. Reviewer #2 (Public review):

      Summary:

      Shapiro et al. set out to verify the American Beefalo Association's claim that Beefalo cattle possess 37.5% bison ancestry. They employ a comprehensive range of well-established population genomics methods to estimate ancestry in these hybrid populations, including PCA, ADMIXTURE, D and F statistics, and local ancestry inference. Their findings conclusively demonstrate that most Beefalo lack the claimed bison ancestry, with only 8 out of 47 samples showing any detectable bison ancestry, ranging from 2 - 18%.

      Strengths:

      The primary strength of this analysis lies in the comprehensive dataset available to the authors, which includes important foundational Beefalo individuals and various reference populations. The rigorous and multi-faceted methodological approach employs several well-established techniques in population genomics for detecting and measuring admixture. Each method used has a firm basis in the field, providing consistent and robust results. The authors' approach of using PCA to initially assess the data within a global context, followed by more specific analyses using ADMIXTURE and D-statistics, provides a clear and logical progression of evidence. The presentation of these results in figures is particularly effective, clearly illustrating the key findings of the study. Additionally, the examination of both autosomal and sex chromosome ancestry offers a more complete understanding of Beefalo genetic composition and the mechanics of bison-cattle hybridisation.

      Weaknesses:

      One limitation of this analysis is the relatively low coverage (~2x) of many Beefalo samples. However, the authors have taken steps to mitigate biases that may arise from this. Another weakness is the limited sampling of contemporary Beefalo populations, as the study focuses primarily on historical samples. This may limit our understanding of how Beefalo genetics may have changed over time.

      Appraisal:

      The authors have clearly achieved their primary aim using a rigorous and comprehensive methodology. Their extensive dataset and multi-faceted analytical approach provide strong support for their conclusions. The study not only addresses its main research question but also reveals unexpected insights into Beefalo genetics, particularly the presence of zebu ancestry.

      Discussion:

      This study is valuable for several reasons beyond its primary findings. First, it definitively addresses and refutes the claim of 37.5% bison ancestry in Beefalo, providing crucial information for those studying these interspecies hybrids and the viability of their offspring. Second, it reveals the unexpected presence of zebu ancestry in many Beefalo, raising intriguing questions about the breed's development and the potential role of zebu cattle in achieving desired traits. This finding suggests that the distinctive appearance of Beefalo may be due in part to zebu admixture rather than bison ancestry. Third, the study highlights the significant barriers to admixture between bison and cattle, both in controlled breeding programs and potentially in wild populations. This has important implications for conservation genetics and our understanding of gene flow between these species. Lastly, the study demonstrates the power of genomic analysis in verifying breed claims and understanding the complex history of domestic animal breeds. These findings open new avenues for research in bovine genomics, breed development, and the dynamics of interspecies hybridisation.

    4. Reviewer #3 (Public review):

      Summary:

      I really like this topic and study. But I think much can be more focused and tightened up. All the components are here - just some more refining to really make the storyline clear, the journey of discovery, and the impact of such knowledge.

      Strengths:

      The authors dive directly into the question of genomic ancestry as compared to the breed club's reported ancestry with heavy, quantitative data and critical analytical methods. The questioning line is direct and does not meander. The reader learns about the challenges of breeding associations and the value of understood ancestry, and the study presents a clear need for re-evaluating the breed standards and expectations of Beefalo (if ancestry is indeed the primary goal instead of a phenotype-driven breed mission).

      Weaknesses:

      Much of the quantitative results are only referred to in the main text with qualitative language. Please incorporate more written quantitative results to highlight evidence that underlines the study narrative because it is quite an interesting study!

    1. eLife Assessment

      This study presents a valuable finding that MK2 inhibitor CMPD1 can inhibit the growth, migration, and invasion of breast cancer cells both in vitro and in vivo by inducing microtubule depolymerization, preferentially at the microtubule plus-end, leading to cell division arrest. The evidence supporting the conclusion of this paper is solid, although additional experiments and controls are needed to further strengthen the claim. This work will be of interest to breast cancer researchers.

    2. Reviewer #1 (Public review):

      In this paper, the authors reveal that the MK2 inhibitor CMPD1 can inhibit the growth, migration, and invasion of breast cancer cells both in vitro and in vivo by inducing microtubule depolymerization, preferentially at the microtubule plus-end, leading to cell division arrest, mitotic defects, and apoptotic cell death. They also showed that CMPD1 treatment upregulates genes associated with cell migration and cell death, and downregulates genes related to mitosis and chromosome segregation in breast cancer cells, suggesting a potential mechanism of CMPD1 inhibition in breast cancer. Besides, they used the combination of an MK2-specific inhibitor, MK2-IN-3, with the microtubule depolymerizer vinblastine to simultaneously disrupt both the MK2 signaling pathway and microtubule dynamics, and they claim that inhibiting the p38-MK2 pathway may help to enhance the efficacy of MTAs in the treatment of breast cancer. However, there are a few concerns, including:

      (1) What is the effect of CMPD1 on breast cancer metastasis?

      (2) The mechanism is lacking as to how MK2 inhibitors enhance the efficacy of MTAs.

    3. Reviewer #2 (Public review):

      Summary:

      This study explores the potential of inhibiting the p38-MK2 signaling pathway to enhance the efficacy of microtubule-targeting agents (MTAs) in breast cancer treatment using a dual-target inhibitor.

      Strengths:

      The study identifies the p38-MK2 pathway as a promising target to enhance the efficacy of microtubule-targeting agents (MTAs), offering a novel therapeutic strategy for breast cancer treatment. In addition, the study employs a wide range of techniques, especially live-cell imaging, to assess the microtubule dynamics in TNBC cells.

      Weaknesses:

      The study primarily uses RPE1 cells as the control for normal cells, which may not fully capture the response of normal mammary epithelial cells. While CMPD1 is shown to be effective in suppressing tumor growth in MDA-MB-231 xenograft, the study lacks detailed toxicity data to confirm its safety profile in vivo.

    4. Reviewer #3 (Public review):

      Summary:

      The authors demonstrated that MK2i could enhance the therapeutic efficacy of MTAs. Using tumor xenograft and migration assays, the authors suggested that the p38-MK2 pathway may serve as a promising therapeutic target in combination with MTAs in cancer treatment.

      Strengths:

      The authors provided a potential treatment for breast cancer.

      Weaknesses:

      (1) In Figure 2, the authors used a human retinal pigment epithelial-1 (RPE1) cell line to show that breast cancer cells are more sensitive to CMPD1 treatment. MCF10A cells would be suggested here as a suitable control. In addition, to compare the sensitivity, the IC50 in different cell lines should be measured.

      (2) The data of MDA-MB-231 in Figure 1D is not consistent with CAL-51 and T47D, also not consistent with the data in Figures 2B-C.

      (3) To support the authors' conclusion in Figure 5, an additional animal experiment performed by tail vein injection would be helpful.

      (4) Page 14, to evaluate the combination result of MK2i and vinblastine, an in vivo animal assay must be performed.

      (5) The authors used RNA-seq to show some pathways affected by CMPD1. What are the key/top genes that were affected? How about the mechanism?

      (6) Line 127, more experiments should be involved to support the conclusion.

    1. eLife Assessment

      This important study highlights the key role of NK cells and PD-L1+ neutrophils in worsening sepsis responses in the context of MASH (metabolic dysfunction-associated steatohepatitis). While the data are solid, the overall evidence for the role of neutrophils in mediating this effect, which is based on a choline-deficient high-fat diet model of various knockouts or selective ablation of immune cell types, remains incomplete. The study will be of interest to researchers in immunopathological disease mechanisms.

    2. Reviewer #1 (Public review):

      Summary:

      By using an established NAFLD model, choline-deficient high-fat diet, Barros et al show that LPS challenge causes excessive IFN-γ production by hepatic NK cells which further induces recruitment and polarization of a PD-L1 positive neutrophil subset leading to massive TNFα production and increased host mortality. Genetic inhibition of IFN-γ or pharmacological blockade of PD-L1 decreases recruitment of these neutrophils and TNFα release, consequently preventing liver damage and decreasing host death.

      Since NAFLD is often accompanied by chronic, low-grade inflammation, it can lead to an overactive but dysfunctional immune response and increase the body's overall susceptibility to infections, therefore this is a very important research question.

      Weaknesses:

      I have quite a lot of concerns with this manuscript. One is that the authors did not actually show that the observed effect is really due to NAFLD itself. The role of choline is already known in the context of sepsis, since its deficiency (which can be observed in about two weeks through deterioration of liver structure and function) leads to organ dysfunction both in humans and animals. Nolan and Vilayat in 1968 showed that the hepatic injury and mortality due to endotoxinaemic shock induced by intraperitoneal injection of LPS was significantly increased in adult female Holtzman rats fed on a choline-deficient diet. Therefore, in order to really show that the effect is mediated by NAFLD, some other diet model must be used (e.g. high-fat, high-fructose, and high-cholesterol diet).

    3. Reviewer #2 (Public review):

      Summary:

      This is an extremely interesting mouse study, trying to understand how sepsis is tolerated during obesity/NAFLD. The researchers combine a well-established model of NASH (Choline-deficiency with High Fat Diet) with a sepsis model (IP injection of 10mg/kg LPS), leading to dramatic mortality in mice. Using this model, they characterize the complex contributions of immune cells. Specifically, they find that NK-cells and Neutrophils contribute the most to mortality in this model due to IFNG and PD-L1+ Neutrophils.

      Strengths:

      The biggest strength of the manuscript is how clear the primary phenotypes/endpoints of their model are. Within 6 hours of LPS injection, there is a stark elevation of liver inflammation and damage, which is exacerbated by a High Fat/Choline-Deficient diet (HFCD). And after 1 day, almost all of the mice die. Using these endpoints, the authors were able to identify which cells were critical for mortality in the model and the specific mediators involved.

      Weaknesses:

      A few key details regarding the experimental design and interpretation are missing.

      Most important is the choice of a high-fat diet with choline deficiency. I believe this model was chosen because the experiments are shorter and typically result in a liver inflammatory phenotype with not as clear of an adipose/obesity phenotype. I actually think it is typically considered a NASH (Non-alcoholic Steatohepatitis) model. I don't think the manuscript includes any data regarding the physiology of these mice that you would expect in an obesity model: body weight, liver weight, blood glucose, etc.

      You should include a description in the methods for how the survival studies were conducted. Were the mice just checked on once a day for death, or were there other endpoints for euthanasia, like severe weight loss?

      The measurement of IFNG and TNF in tissue throughout the manuscript seemed inconsistent. For example, IFNG in Figure 3A is 0.05pg/g for Chow+LPS, and 0.15pg/g for HFCD+LPS. But in Figure 4H, Chow+LPS is 0.18pg/g and HFCD+LPS is 0.18pg/g, so there is no effect of HFCD in the IgG controls. Also, in Figure 4I and 4J, the TNF values are dramatically different for the controls (0.1 vs 1pg/g).

      You can't conclude that CD4+ and CD8+ T cells or monocytes don't play a role in liver damage from your data, because you did not measure liver damage, only mortality. I understand using mortality as an endpoint, but without ALT/AST measurements or histology, it's hard to say what exactly happened in the livers.

      I'm not sure the authors can conclude that neutrophils expressing PD-L1 live longer in the hepatic environment from an in vitro experiment. I think this is an interesting result in terms of crosstalk between these two cell types, but I'm not sure that in vivo the neutrophils would live longer.

    4. Reviewer #3 (Public review):

      Summary:

      The authors investigated how non-alcoholic fatty liver disease (NAFLD) influences liver damage during endotoxemia (a condition characterized by elevated endotoxins, like lipopolysaccharide or LPS, in the bloodstream) using a mouse model. Mice with NAFLD were given a moderate dose of LPS, which intensified liver inflammation and mortality compared to controls. The study concludes that targeting neutrophil activity and TNF-α signaling could be a promising approach to reducing excessive inflammation and liver injury in NAFLD patients experiencing endotoxemia. This can have important implications for the treatment but I think the manuscript requires revisions.

      Strengths:

      (1) The study presents both in vivo and ex vivo assays and results to support the authors' hypothesis.

      (2) Several cell types and their interaction with each other have been analyzed.

      (3) The authors made use of the publicly available databases.

      Weaknesses:

      (1) Some figures contradict each other.

      (2) Some of the cause-and-effect presentations need additional experiments and different approaches to be proven correct.

      (3) Candidate/mechanism selection strategies are not very clear.

    1. eLife Assessment

      This study reports that activation of TFEB promotes lysosomal exocytosis and clearance of cholesterol from lysosomes, the strength of evidence for which is solid and considered valuable in the context of Niemann-Pick Disease Type C. However, beyond this aspect of the study, the reviewers found the strength of the evidence to be incomplete. The manuscript also needs careful editing to improve readability.

    2. Reviewer #1 (Public review):

      Summary:

      The authors are trying to determine if SFN treatment results in dephosphorylation of TFEB, subsequent activation of autophagy-related genes, exocytosis of lysosomes, and reduction in lysosomal cholesterol levels in models of NPC disease.

      Strengths:

      (1) Clear evidence that SFN results in translocation of TFEB to the nucleus.

      (2) In vivo data demonstrating that SFN can rescue Purkinje neuron number and weight in NPC1-/- animals.

      Weaknesses:

      (1) Lack of molecular details regarding how SFN results in dephosphorylation of TFEB leading to activation of the aforementioned pathways. Currently, datasets represent correlations.

      (2) Based on the manuscript narrative, discussion, and data it is unclear exactly how steady-state cholesterol would change in models of NPC disease following SFN treatment. Yes, there is good evidence that lysosomal flux to (and presumably across) the plasma membrane increases with SFN. However, lysosomal biogenesis genes also seem to be increasing. Given that NPC inhibition, NPC1 knockout, or NPC1 disease mutations are constitutively present and the cell models of NPC disease contain lysosomes (even with SFN) how could a simple increase in lysosomal flux decrease cholesterol levels? It would seem important to quantify the number of lysosomes per cell in each condition to begin to disentangle differences in steady state number of lysosomes, number of new lysosomes, and number of lysosomes being exocytosed.

      (3) Lack of evidence supporting the authors' premise that "SFN could be a good therapeutic candidate for neuropathology in NPC disease".

    3. Reviewer #2 (Public review):

      Summary:

      This study presents a valuable finding that the activation of TFEB by sulforaphane (SFN) could promote lysosomal exocytosis and biogenesis in NPC, suggesting a potential mechanism by SFN for the removal of cholesterol accumulation, which may contribute to the development of new therapeutic approaches for NPC treatment.

      Strengths:

      The cell-based assays are convincing, utilizing appropriate and validated methodologies to support the conclusion that SFN facilitates the removal of lysosomal cholesterol via TFEB activation.

      Weaknesses:

      (1) The in vivo experiments demonstrate the therapeutic potential of SFN for NPC. A clear dose-response analysis would further strengthen the proposed therapeutic mechanism of SFN. Additional data supporting the activation of TFEB by SFN for cholesterol clearance in vivo would strengthen the overall impact of the study.

      (2) In Figure 4, the authors demonstrate increased lysosomal exocytosis and biogenesis by SFN in NPC cells. Including a TFEB-KO/KD in this assay would provide additional validation of whether these effects are TFEB-dependent.

      (3) For lysosomal pH measurement, the combination of pHrodo-dex and CF-dex enables ratiometric pH measurement. However, the pKa of pHrodo red-dex (according to Invitrogen) is ~6.8, while lysosomal pH is typically around 4.7. This discrepancy may account for the lack of observed lysosomal pH changes between WT and U18666A-treated cells. Notably, previous studies (PMID: 28742019) have reported an increase in lysosomal pH in U18666A-treated cells.

      (4) The authors are also encouraged to perform colocalization studies between CF-dex and a lysosomal marker, as some researchers may be concerned that NPC1 deficiency could reduce or block the trafficking of dextran along endocytosis.

      (5) In vivo data supporting the activation of TFEB by SFN for cholesterol clearance would significantly enhance the impact of the study. For example, measuring whole-animal or brain cholesterol levels would provide stronger evidence of SFN's therapeutic potential.
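      The concern in point (3) above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the indicator behaves as a simple single-site acid obeying the Henderson-Hasselbalch relation and that its signal tracks the protonated fraction, which is a simplification and not a property confirmed by the manuscript or the vendor. The function name and the chosen pH values are hypothetical; only the pKa of ~6.8 and the lysosomal pH of ~4.7 come from the review.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a single-site indicator in its protonated form
    (Henderson-Hasselbalch), assuming ideal behaviour."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA = 6.8  # pKa quoted by the reviewer for pHrodo red-dex

# Compare a typical lysosomal pH with mildly alkalinised and neutral values.
for ph in (4.7, 5.5, 6.8, 7.4):
    print(f"pH {ph:.1f}: protonated fraction = {protonated_fraction(ph, PKA):.3f}")
```

      Under these assumptions the indicator is already almost fully protonated at pH 4.7 and changes by only a few percent if the lysosome alkalinises to pH 5.5, which is why a probe with a pKa near 6.8 could miss a modest lysosomal pH shift.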

    4. Reviewer #3 (Public review):

      Summary:

      The authors demonstrate that activation of TFEB facilitates cholesterol clearance in cell models of Niemann-Pick type C (NPC). This is done through a variety of approaches including activation of TFEB by sulforaphane (SFN), a naturally occurring small-molecule TFEB agonist. SFN induces TFEB nuclear translocation and promotes lysosomal exocytosis. In an NPC mouse model, SFN dephosphorylates/activates TFEB in the brain and rescues the loss of Purkinje cells.

      Strengths:

      NPC is a severe disease and there is little in the way of treatment. The manuscript points towards some treatment options. However, the title "Small-molecule activation of TFEB Alleviates Niemann-Pick Disease..." is far too strong and should be changed.

      Weaknesses:

      (1) The manuscript is extremely hard to read due to the writing; it needs careful editing for grammar and English.

      (2) There are a number of important technical issues that need to be addressed.

      (3) The TFEB influence on filipin staining in Figure 1A is somewhat subtle. In the mCherry alone panels there is a transfected cell with no filipin staining and the mCherry-TFEBS211A cells still show some filipin staining.

      (4) Figure 1C is impressive for the upregulation of filipin with U18666A treatment. However, SFN is used at 15 microM. This must be hitting multiple pathways. Vauzour et al (PMID: 20166144) use SFN at 10 nM to 1microM. Other manuscripts use it in the low microM range. The authors should repeat at least some key experiments using SFN at a range of concentrations from perhaps 100 nM to 5 microM. The use of 15 microM throughout is an overall concern.

    5. Author Response:

      Thank you for your interest in our paper. We would also like to thank the anonymous reviewers for their critical and constructive comments. Although the reviewers found our work interesting, they raised several important concerns about our study. To address these concerns, we will mainly perform the following new experiments.

      1. Examine whether the antioxidant NAC can block SFN-induced TFEB nuclear translocation in NPC cells.

      2. Examine whether calcineurin inhibitors (FK506+CsA) or a Ca2+ inhibitor (Bapta-AM) can block SFN-induced TFEB nuclear translocation in NPC cells.

      3. Investigate whether cholesterol is cleared by SFN-mediated activation of TFEB in tissues in vivo.

      4. Investigate whether SFN-evoked lysosomal exocytosis is TFEB-dependent by using TFEB-KO cells.

      5. Examine the effect of NPC1 deficiency on dextran trafficking by studying the localization of CF-dex and Lamp1.

      6. Perform cytotoxicity experiments to examine whether SFN as used in this study is cytotoxic in various cell lines.

      In addition, according to the reviewers’ suggestions, we will make clarifications and corrections wherever appropriate in the manuscript. Below please find our point-by-point responses and plans to the reviewers’ comments.

      Reviewer #1 (Public review):

      Summary:

      The authors are trying to determine if SFN treatment results in dephosphorylation of TFEB, subsequent activation of autophagy-related genes, exocytosis of lysosomes, and reduction in lysosomal cholesterol levels in models of NPC disease.

      Strengths:

      (1) Clear evidence that SFN results in translocation of TFEB to the nucleus.

      (2) In vivo data demonstrating that SFN can rescue Purkinje neuron number and weight in NPC1-/- animals.

      Thank you for the support!

      Weaknesses:

      (1) Lack of molecular details regarding how SFN results in dephosphorylation of TFEB leading to activation of the aforementioned pathways. Currently, datasets represent correlations.

      Thank you for this constructive comment. The reviewer is right that the molecular mechanism of SFN-activated TFEB is not discussed in detail in this manuscript. This is because we have previously shown that SFN induces TFEB nuclear translocation via a Ca2+-dependent but MTOR (mechanistic target of rapamycin kinase)-independent mechanism through a moderate increase in reactive oxygen species (ROS), and that calcineurin-mediated TFEB dephosphorylation underlies SFN-induced TFEB activation. These data were published in Autophagy in 2021 (Li, Shao et al. 2021). Therefore, we did not cover this part in the present study. We will add the molecular mechanism of TFEB activation by SFN to the discussion. To further confirm this mechanism in NPC cells, we will also perform experiments to: 1) examine whether the antioxidant NAC can block SFN-induced TFEB nuclear translocation in NPC cells; 2) examine whether calcineurin inhibitors (FK506+CsA) can block SFN-induced TFEB nuclear translocation in NPC cells.

      (2) Based on the manuscript narrative, discussion, and data it is unclear exactly how steady-state cholesterol would change in models of NPC disease following SFN treatment. Yes, there is good evidence that lysosomal flux to (and presumably across) the plasma membrane increases with SFN. However, lysosomal biogenesis genes also seem to be increasing. Given that NPC inhibition, NPC1 knockout, or NPC1 disease mutations are constitutively present and the cell models of NPC disease contain lysosomes (even with SFN) how could a simple increase in lysosomal flux decrease cholesterol levels? It would seem important to quantify the number of lysosomes per cell in each condition to begin to disentangle differences in steady state number of lysosomes, number of new lysosomes, and number of lysosomes being exocytosed.

      Thank you for the suggestion. It is important to define the three states: 1) original number of lysosomes, 2) number of new lysosomes, and 3) number of lysosomes being exocytosed. However, we have checked the literature, and so far there seems to be no good method that can clearly differentiate these three states of lysosomes.

      (3) Lack of evidence supporting the authors' premise that "SFN could be a good therapeutic candidate for neuropathology in NPC disease".

      Suggestion was taken! We will investigate whether cholesterol is reduced by SFN-mediated activation of TFEB in vivo to strengthen the point that SFN could be a potential therapeutic compound for NPC treatment. To avoid confusion, we have removed this sentence.

      Reviewer #2 (Public review):

      Summary:

      This study presents a valuable finding that the activation of TFEB by sulforaphane (SFN) could promote lysosomal exocytosis and biogenesis in NPC, suggesting a potential mechanism by SFN for the removal of cholesterol accumulation, which may contribute to the development of new therapeutic approaches for NPC treatment.

      Strengths:

      The cell-based assays are convincing, utilizing appropriate and validated methodologies to support the conclusion that SFN facilitates the removal of lysosomal cholesterol via TFEB activation.

      Weaknesses:

      (1) The in vivo experiments demonstrate the therapeutic potential of SFN for NPC. A clear dose-response analysis would further strengthen the proposed therapeutic mechanism of SFN. Additional data supporting the activation of TFEB by SFN for cholesterol clearance in vivo would strengthen the overall impact of the study.

      We understand the reviewer’s point. We examined two doses of SFN: 30 and 50 mg/kg. As shown in Fig. 6, SFN at 50 mg/kg, but not at 30 mg/kg, prevents a degree of Purkinje cell loss in lobule IV/V of the cerebellum, suggesting a dose-correlated preventive effect of SFN. In vivo experiments with higher concentrations of SFN and an optimized dosage form of SFN are planned for a future study but will not be included in this study.

      We will investigate whether cholesterol was cleared by activation of TFEB by SFN in vivo.

      (2) In Figure 4, the authors demonstrate increased lysosomal exocytosis and biogenesis by SFN in NPC cells. Including a TFEB-KO/KD in this assay would provide additional validation of whether these effects are TFEB-dependent.

      Thank you for this valuable suggestion. We will investigate whether SFN-evoked lysosomal exocytosis is TFEB-dependent by using TFEB-KO cells.

      (3) For lysosomal pH measurement, the combination of pHrodo-dex and CF-dex enables ratiometric pH measurement. However, the pKa of pHrodo red-dex (according to Invitrogen) is ~6.8, while lysosomal pH is typically around 4.7. This discrepancy may account for the lack of observed lysosomal pH changes between WT and U18666A-treated cells. Notably, previous studies (PMID: 28742019) have reported an increase in lysosomal pH in U18666A-treated cells.

      We understand the reviewer’s point. However, we used pHrodo™ Green-Dextran (P35368, Invitrogen), not pHrodo red-dex, to measure the lysosomal luminal acidity. According to the product information from Invitrogen, pHrodo Green-dex conjugates are non-fluorescent at neutral pH but fluoresce bright green at acidic pH ranges 4-9, such as those in endosomes and lysosomes. Therefore, pHrodo Green-dex can be used to monitor the acidity of lysosomes (Hu, Li et al. 2022). We also used LysoTracker Red DND-99 (Thermo Scientific, L7528) to measure lysosomal pH (Fig. 4G, H), and the results are consistent with those of the pHrodo Green/CF measurement. Overall, in our hands, we have not detected a pH change of lysosomes in U18666A-treated NPC1 cell models.

      (4) The authors are also encouraged to perform colocalization studies between CF-dex and a lysosomal marker, as some researchers may be concerned that NPC1 deficiency could reduce or block the trafficking of dextran along endocytosis.

      Suggestion was taken! We will examine the effect of NPC1 deficiency on dextran trafficking by studying the localization of CF-dex and Lamp1.

      (5) In vivo data supporting the activation of TFEB by SFN for cholesterol clearance would significantly enhance the impact of the study. For example, measuring whole-animal or brain cholesterol levels would provide stronger evidence of SFN's therapeutic potential.

      We really appreciate the reviewer’s suggestions. We will investigate whether cholesterol was cleared by activation of TFEB by SFN in vivo.

      Reviewer #3 (Public review):

      Summary:

      The authors demonstrate that activation of TFEB facilitates cholesterol clearance in cell models of Niemann-Pick type C (NPC). This is done through a variety of approaches including activation of TFEB by sulforaphane (SFN), a naturally occurring small-molecule TFEB agonist. SFN induces TFEB nuclear translocation and promotes lysosomal exocytosis. In an NPC mouse model, SFN dephosphorylates/activates TFEB in the brain and rescues the loss of Purkinje cells.

      Strengths:

      NPC is a severe disease and there is little in the way of treatment. The manuscript points towards some treatment options. However, the title "Small-molecule activation of TFEB Alleviates Niemann-Pick Disease..." is far too strong and should be changed.

      Weaknesses:

      (1) The manuscript is extremely hard to read due to the writing; it needs careful editing for grammar and English.

      We will thoroughly check grammar to improve the manuscript.

      (2) There are a number of important technical issues that need to be addressed.

      We will address the technical issues mentioned in the following.

      (3) The TFEB influence on filipin staining in Figure 1A is somewhat subtle. In the mCherry alone panels there is a transfected cell with no filipin staining and the mCherry-TFEBS211A cells still show some filipin staining.

      We understand the reviewer’s point. We will investigate whether cholesterol is cleared by activation of TFEB by SFN in vivo.

      (4) Figure 1C is impressive for the upregulation of filipin with U18666A treatment. However, SFN is used at 15 microM. This must be hitting multiple pathways. Vauzour et al (PMID: 20166144) use SFN at 10 nM to 1microM. Other manuscripts use it in the low microM range. The authors should repeat at least some key experiments using SFN at a range of concentrations from perhaps 100 nM to 5 microM. The use of 15 microM throughout is an overall concern.

      We understand the reviewer’s point. See RESPONSE #1. We have previously shown that SFN (10–15 μM, 2–9 h) induces robust TFEB nuclear translocation in a dose- and time-dependent manner in HeLa GFP-TFEB stable cells as well as in other human cell lines without cytotoxicity (Li, Shao et al. 2021). Based on these previous results, we chose SFN (15 μM) in this study to examine its effect on cholesterol clearance. We will add this information to the discussion. We will also perform dose-response TFEB nuclear translocation experiments in NPC model cells, as well as cytotoxicity experiments to examine whether the concentrations of SFN used in the various cell lines are toxic.

      References:

      Hu, M. Q., P. Li, C. Wang, X. H. Feng, Q. Geng, W. Chen, M. Marthi, W. L. Zhang, C. L. Gao, W. Reid, J. Swanson, W. L. Du, R. Hume and H. X. Xu (2022). “Parkinson's disease-risk protein TMEM175 is a proton-activated proton channel in lysosomes.” Cell 185(13): 2292-+.

      Li, D., R. Shao, N. Wang, N. Zhou, K. Du, J. Shi, Y. Wang, Z. Zhao, X. Ye, X. Zhang and H. Xu (2021). “Sulforaphane Activates a lysosome-dependent transcriptional program to mitigate oxidative stress.” Autophagy 17(4): 872-887.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The work from Petazzi et al. aimed at identifying novel factors supporting the differentiation of human hematopoietic progenitors from induced pluripotent stem cells (iPSCs). The authors developed an inducible CRISPR-mediated activation strategy (iCRISPRa) to test the impact of newly identified candidate factors on the generation of hematopoietic progenitors in vitro. They first compared previously published transcriptomic data of iPSC-derived hemato-endothelial populations with cells isolated ex vivo from the aorta-gonad-mesonephros (AGM) region of the human embryo and they identified 9 transcription factors expressed in the aortic hemogenic endothelium that were poorly expressed in the in vitro differentiated cells. They then tested the activation of these candidate factors in an iPSC-based culture system supporting the differentiation of hematopoietic progenitors in vitro. They found that the IGF binding protein 2 (IGFBP2) was the most upregulated gene in arterial endothelium after activation and they demonstrated that IGFBP2 promotes the generation of functional hematopoietic progenitors in vitro.

      Strengths:

      The authors developed an extremely useful doxycycline-inducible system to activate the expression of specific candidate genes in human iPSC. This approach allows us to simultaneously test the impact of 9 different transcription factors on in vitro differentiation of hematopoietic cells, and the system appears to be very versatile and applicable to a broad variety of studies.

      The system was extensively validated for the expression of 1 transcription factor (RUNX1) in both HeLa cells and human iPSC, and a detailed characterization of this test experiment was provided.

      The authors exhaustively demonstrated the role of IGFBP2 in promoting the generation of functional hematopoietic progenitors in vitro from iPSCs. Even though the use of the IGFBP2-interacting proteins IGF1 and IGF2 has been previously reported in human iPSC-derived hematopoietic differentiation in vitro (Ditadi and Sturgeon, Methods 2016; Ng et al., Nature Biotechnology 2016), and IGFBP-2 itself has been shown to promote adult HSC expansion ex vivo (Zhang et al., Blood 2008), its role in supporting in vitro hematopoiesis was demonstrated here for the first time.

      Weaknesses:

      Although the authors performed a very thorough characterization of the system in proof-of-principle experiments activating a single transcription factor, the data provided when 9 independent factors were used is not sufficient to fully validate the experimental strategy. Indeed, in the current version of the manuscript, it is not clear whether the results presented in both the scRNAseq analysis and the functional assays are the consequence of the simultaneous activation of all 9 TF or just a subset of them. This is essential to establish whether all the proposed factors play a role during embryonic hematopoiesis, and a more complete analysis of the scRNAseq dataset could help clarify this aspect.

      Similarly, the data presented in the manuscript are not sufficient to clarify at what stage of the endothelial-to-hematopoietic transition (EHT) the TF activation has an impact. Indeed, even though the overall increase of functional hematopoietic progenitors is fully demonstrated, the assays proposed in the manuscript do not clarify whether this is due to a specific effect at the endothelial level or to an increased proliferation rate of the generated hematopoietic progenitors. Similar conclusions can be applied to the functional validation of IGFBP2 in vitro.

      The overall conclusions are sometimes vague and not always supported by the data. For instance, the authors state that the CRISPR activation strategy resulted in transcriptional remodeling and a steer in cell identity, but they do not specify which cell types are involved and at what level of the EHT process this is happening. In the discussion, the authors also claim that they provided evidence to support that RUNX1T1 could regulate IGFBP2 expression. However, this is exclusively based on the enrichment of RUNX1T1 gRNA in cells expressing higher levels of IGFBP2 and it does not demonstrate any direct or indirect association of the two factors.

      We thank the reviewer for the positive comments about the importance of our work and have now addressed the points raised as weaknesses by performing additional analysis and experiments, adding a new schematic of the mechanism, and rewording our claims.

      We have clarified the different effects mediated by the activation and the IGFBP2 addition in a summary section at the end of the results and added Figure 6, showing this in visual form. We have also clearly stated the limitations related to the correlation between RUNX1T1 and IGFBP2 in the discussion and toned down our claims regarding this throughout the entire paper. We have also reworded the text to clarify the specific cell types identified in the sequencing data that we refer to.

      Reviewer #2 (Public Review):

      To enable robust production of hematopoietic progenitors in-vitro, Petazzi et al examined the role of transcription factors in the arterial hemogenic endothelium. They use IGFBP2 as a candidate gene to increase the directed differentiation of iPSCs into hematopoietic progenitors. They have established a novel induced-CRISPR mediated activation strategy to drive the expression of multiple endogenous transcription factors and show enhanced production of hematopoietic progenitors through expansion of the arterial endothelial cells. Further, upregulation of IGFBP2 in the arterial cells facilitates the metabolic switch from glycolysis to oxidative phosphorylation, inducing hematopoietic differentiation. While the overall study and resources generated are good, assertions in the manuscript are not entirely supported by the experimental data and some claims need further experimental validation.

      We thank the reviewer for the positive comments. We have provided new data and analysis to make sure that all our assertions are clearly supported, and we have reworded those where the reviewers identified limitations.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      The assessment could change from "incomplete" to "solid" if the authors: i) improve data analysis (for both scRNAseq and functional assays) by providing additional information that could strengthen their conclusions, as suggested in the specific comments by both reviewers; ii) either provide new functional evidence supporting their mechanistic conclusion or alternatively tone down the claims that are not fully supported by data and acknowledge the limitations raised by reviewers in the discussion; iii) address the issue of paracrine signaling to expand only hematopoietic progenitors.

      We have now improved the data analysis and provided additional functional tests to strengthen our conclusions and toned down those that were identified by the reviewers as not supported enough and included a discussion on these limitations. We have also reworded the section about the paracrine signaling throughout the paper.

      Reviewer #1 (Recommendations For The Authors):

      Figure 1 contains exclusively published data. It might be more appropriate to use it as a supplementary figure or as part of a more exhaustive figure (maybe combining Figures 1 and 2 together?).

      Figure 1 contains novel bioinformatic analyses that form the basis of our research, and it has a different content and focus from Figure 2, which is already a large figure. We therefore believe it is better to keep it as a separate figure, which now also contains a new panel.

      It seems there is an issue with Figure S3 labelling:

      • In line 112, Figure S2A-B does not display genomic PCR and sequencing results;

      • In line 123, Figure S3D-E does not show viability and proliferation data;

      • In line 127, Figure S3G does not show mCherry expression in response to DOX;

      We apologise for the confusion with the numbers; we have now correctly labelled the figures.

      It would be more informative to include gates and frequency on flow cytometry plots in Figure S3, to be able to evaluate the extent of the reduction in mCherry expression.

      We have now included the gating and frequency of mCherry-expressing cells in Supplementary Figure 3D.

      It is not clear from the text and figures whether the SB treatment was maintained throughout the hematopoietic differentiation protocol (line 122):

      • If so, it would be important to confirm that HDAC treatment does not affect EHT cultures

      • If not, can the authors provide some evidence that transgene silencing is not occurring during hematopoietic differentiation?

      We have clarified that we decided to treat the cells with SB exclusively in maintenance conditions because HDACs have been shown to be essential for the EHT (lines 138-142). We have now also included additional data showing the high expression of the mCherry tag reporting the iSAM expression on day 8 (Supplementary Figure 4F).

      Can the authors provide a simple diagram summarizing the experimental strategy for each differentiation experiment in the respective supplementary figure? For instance, at what stage of the protocol was DOX added in Figure 3? Or at what stage IGFBP2 was added in Figure 5? It would be a very useful addition to the interpretation of the results.

      We have now included three schematics for all the experiments in the manuscript in Supplementary Figure 4A-C.

      In Figure 3, the authors should provide more detailed information about the data filtering of the scRNAseq experiment, and more specifically:

      • How many cells were included in the analysis for each library after QC and filtering?

      • How "cells in which the gRNAs expression was detected" were selected? Do they include only cells showing expression of gRNAs for all 9 TF?

      This information is now included in the method section, lines 773-781; the detailed code is available on the GitHub link provided in the same section. We have filtered the cells expressing one gRNA for the non-targeting gRNA (iSAM_NT) control and more than one for the iSAM_AGM sample.
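      The filtering rule described in this response can be sketched as follows. This is not the authors' code (which they state is available on GitHub); it is a minimal illustration that assumes each cell carries a set of detected gRNA identifiers and reads "one gRNA" literally as exactly one detected guide. The function name, cell identifiers, and the guide labels "NT" and "TF_B" are hypothetical.

```python
def keep_cell(sample: str, detected_grnas: set[str]) -> bool:
    """Apply the per-sample gRNA rule (assumed interpretation):
    iSAM_NT cells need exactly one detected gRNA,
    iSAM_AGM cells need more than one."""
    n = len(detected_grnas)
    if sample == "iSAM_NT":
        return n == 1
    if sample == "iSAM_AGM":
        return n > 1
    raise ValueError(f"unknown sample: {sample}")


# Hypothetical cells: (cell id, sample, detected gRNAs)
cells = [
    ("cell_1", "iSAM_NT", {"NT"}),
    ("cell_2", "iSAM_NT", set()),
    ("cell_3", "iSAM_AGM", {"RUNX1T1"}),
    ("cell_4", "iSAM_AGM", {"RUNX1T1", "TF_B"}),
]

kept = [cell_id for cell_id, sample, grnas in cells if keep_cell(sample, grnas)]
print(kept)
```

      Note that under this rule a retained iSAM_AGM cell need not carry guides for all 9 factors, which is the distinction the reviewer's question turns on.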

      In Figure 3A, it is not clear whether the expression of the 9 factors is consistently detected in all cells or just a subset of them, and the heatmap in Figure 3A does not provide this information. It would be more accurate to provide expression on a per-cell basis, for instance, as a violin plot displaying single dots representing each cell. 

      We have now included this violin plot in Supplementary Figure 4G as requested. However, this visualisation is difficult to interpret because some of the target genes’ expression seems variable in both experimental and control conditions. We had envisaged that this could have been the case and so this is why we had included the three different controls.  For this reason we chose to show the normalised expression which takes all the different variables into account (Figure 3A). 

      In Figure 3B-C, it seems that clusters EHT1 and EHT2 do not express endothelial markers anymore. Are these fully differentiated hematopoietic cells rather than cells undergoing EHT? In general, it would be quite important to provide evidence of expressed marker genes characterizing each cluster (eg. heatmap summarizing top DEG in the supplementary figure?). 

      We have now provided a spreadsheet containing the clusters’ markers that we used (Supplementary Table 1) and a heatmap in Figure 3E. Furthermore, we have now edited Figure 3C to include pan-endothelial markers (PECAM1 and CDH5). These data show that the EHT1 and EHT2 clusters both express endothelial markers, which are progressively downregulated, as expected during the endothelial-to-hematopoietic transition. We have also included and discussed this in the manuscript (lines 192-195) and added a schematic of the mechanism in Figure 6.

      In Figure 3E, displaying the proportion of clusters within each sample/library would be a more accurate way of comparing the cell types present in each library (removing potential bias introduced by loading different numbers of cells in each sample).

      We have now included the requested data in Supplementary Figure 4I and it confirms again the expansion of arterial cells in the activated cells.    
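
      The normalisation the reviewer asks for, comparing cluster proportions within each library rather than raw cell counts, can be sketched as follows. This is a minimal illustration with made-up sample names, cluster labels, and counts; it is not taken from the authors' analysis.

```python
from collections import Counter, defaultdict


def cluster_proportions(cells):
    """Return {sample: {cluster: fraction of that sample's cells}}.

    `cells` is an iterable of (sample, cluster) pairs, one per cell.
    """
    counts = defaultdict(Counter)
    for sample, cluster in cells:
        counts[sample][cluster] += 1
    proportions = {}
    for sample, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        proportions[sample] = {c: n / total for c, n in cluster_counts.items()}
    return proportions


# Toy data (hypothetical): the two libraries contain different numbers of cells.
cells = (
    [("control", "arterial")] * 10 + [("control", "venous")] * 30
    + [("activated", "arterial")] * 40 + [("activated", "venous")] * 40
)
props = cluster_proportions(cells)
print(props["control"]["arterial"], props["activated"]["arterial"])
```

      Dividing by each library's own total removes the bias introduced by loading different numbers of cells, so a change in the arterial fraction reflects composition rather than library size.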

      In Figure 3G, by plating 20,000 total CD34+, the assay does not account for potential differences in sample composition. It is then hard to discriminate between the increased number of progenitors in the input or an enhanced ability of HE to undergo EHT. This is an important aspect to consider to precisely identify at what level the activation of the 9 factors is acting. A proper quantification of flow cytometry data summarizing the % of progenitors, arterial cells, etc. would be useful to interpret these results.

      Lines 204-205 have been reworded. We are very much aware of the fact that the CD34+ cell population consists of a range of cells across the EHT process, and this is precisely why we carried out these single-cell sequencing analyses. We purposely tested the effect of the observed changes in composition by colony assays.

      In Figure 3G, it seems that NT cells w/o DOX have very little CFU potential (if any). Can the authors provide an explanation for this?

      We think that the limited CFU potential is due to the extensive genetic manipulation and selection that the cells underwent for the derivation of all the iSAM lines, but this did not prevent us from observing an effect of gene activation on CFU numbers. This is one of the primary reasons that we then validated our overall findings using the parental iPSC line in control conditions and with the addition of IGFBP2. We show that the parental iPSC line gives rise to hematopoietic progenitors, both immunophenotypically (Figure 4D) and functionally, at expected levels (Figure 4B, left column).

      Figure 4A shows an upregulation of IGFBP2 in arterial cells as a result of TF activation. However, from the data presented here, it is not possible to evaluate whether this is specific to the arterial cluster, or it is a common effect shared by all cell types regardless of their identity. 

      Data has now been included in Supplementary Figure 4H, which shows that all the cells show an increase in IGFBP2, but arterial cells show the highest increase. We have now edited the text to reflect this, in lines 228-230.

      In Figure 5A-B only a minority of arterial cells express RUNX1 in response to IGFBP2 treatment. Is this sufficient to explain the very significant increase in the generation of functional hematopoietic progenitors described in Figure 4? Quantification and statistical analysis of RUNX1 upregulation would strengthen this conclusion.

      We have now provided the statistical analysis showing significant upregulation of RUNX1 upon IGFBP2 addition. The p values are now provided in the figure 5 legend.

      In Figure 5 the authors conclude that IGFBP2 remodels the metabolic profile of endothelial cells. However, it is not clear which cell types and clusters were included in the analysis of Figure 5C-G. Is the switch from Glycolysis to Oxidative Phosphorylation specific to endothelial cells? Or it is a more general effect on the entire culture, including hematopoietic cells? 

      We based this conclusion on the fact that the single-cell RNAseq allows us to verify that the metabolic differences are obtained in the endothelial cells. Given that we sorted the adherent cells, the majority of these are endothelial cells, as shown in Figure 5A. The Seahorse pipeline includes a number of washing steps, so the analyses are performed on the adherent compartment, which we know consists primarily of endothelial cells. We cannot exclude some contamination from non-endothelial cells, but we highlight to this reviewer that the metabolic changes were initially identified in endothelial cells in the single-cell sequencing data. Taken together, we believe that this implies that metabolic changes are specific to this population. We have clarified this in line 317.

      In the discussion, the authors conclude that they "provide evidence to support the hypothesis that RUNX1T1 could regulate IGFBP2 expression". To further support this conclusion, the authors could provide a correlation analysis of the expression of the two genes in the cell type of interest. 

      Following the observation of high IGFBP2 expression across clusters, we have now reworded this sentence in lines 382-385. We tried to perform the correlation analysis, but we believe it is not appropriate given the detection level of the gRNA. We have now included this as a limitation in the discussion (lines 416-427) and have toned down the conclusions we drew about RUNX1T1 throughout the manuscript.

      As mentioned by the authors, IGFBP2 binds IGF1 and IGF2 modulating their function. Both IGF1 (http://dx.doi.org/10.1016/j.ymeth.2015.10.001) and IGF2 (doi:10.1038/nbt.3702) have been used in iPSC differentiation into definitive hematopoietic cells. It would be relevant to discuss/reference this in the discussion.

      We have now included the suggested reference in the section where we discuss the role of IGFBP2 in binding IGF1 and IGF2.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure 1 compares the transcriptome of human AGM and in-vitro derived hemogenic endothelial cells (HECs). It is not clear why only the genes downregulated in the latter were chosen. Are there any significantly upregulated genes, knockdown/knockout which could also serve a similar purpose? Single-cell transcriptome database analysis is very preliminary. A detailed panel with differences in cluster properties of HECs between the two systems should be provided. A heatmap of all differentially expressed genes between the two samples must be generated, along with a logical explanation for choosing the given set of genes. 

      We have now included another panel in figure 1 to better clarify the logic behind the strategy used to identify our target genes (Figure 1A).

      (2) Figure 2 - a panel describing the workflow of gRNA design and targeting for the 9 candidate genes, along with lentiviral packaging and transduction would make it easier to follow. 

      We have now included three schematics for all the experiments in the manuscript in Supplementary Figure 4A-C.

      (3) Figure 3- to assess the effect of arterial cell expansion on the emergence of hematopoietic progenitors, CD34+ Dll4+ cells should be sorted for OP9 co-culture assay.

      Using only CD34+ cells does not answer the question raised. Also, the CFU assay performed does not fully support the claim of enhanced hematopoietic differentiation since only CFU-E and CFU-GM colonies are increased in Dox-treated samples, with no effect on other colony types. OP9 co-culture assay with these cells would be required to strengthen this claim. 

      We wanted to clarify that the effect on the methylcellulose coming from the activated cells was not limited to CFU-E, as the reviewer reported; instead, it also affected CFU-GM and CFU-M. 

      We have now performed additional experiments where we sorted the CD34+ compartment into DLL4- and DLL4+ in Supplementary Figure 5D-E, which we discussed in lines 250-258. 

      (4) In Figure 3F, there appears to be a lot of variation in the DLL4% fold change values for DOX treated iSAM_AGM sample, which weakens the claim of increased arterial expansion.

      Can the authors explain the probable reason? It is suggested that the two other controls (iSAM_+DOX and iSAM_-DOX) should be included in this analysis. It is imperative to also show % populations rather than just fold change to gain confidence.

      We agree that there is a lot of variability. This is because differentiation happens in 3D in embryoid bodies, which contain many different cell types that differentiate in different proportions across independent experiments. We have now included the raw data in Supplementary Figure 4D, with additional statistical analysis to show the expansion of arterial cells, and have also included the suggested additional controls.

      (5) How does activation of these target genes cause increased arterialization? Is the emergence of non-HE populations suppressed? Or is it specific to the HE? The data on this should be clarified and also discussed.

      We have provided additional data clarifying the connection between increased arterialisation and hemogenic potential. We showed that the activation induces increased arterialisation and that IGFBP2 acts by supporting the acquisition of hemogenic potential. We have discussed this in lines 326-348 and provided a new figure to explain this in detail (Figure 6).

      (6) Considering that IGFBP2 was chosen from the activated target gene(s) cluster, can the authors explain why the reduced CFU-M phenomenon observed in Figure 3G does not appear in the MethoCult assay for IGFBP2 treated cells (Figure 4B)?

      The difference could be explained by the fact that in Figure 3G, the cells underwent activation of multiple genes, while in Figure 4B, they were only exposed to IGFBP2. Our results show that IGFBP2 could at least partially explain the phenotype that we see with the activation, but we believe that during the activation experiments, there might be other signals available that might not be induced by IGFBP2 alone. We have also added a summary section and a figure to clarify the different mechanisms of action of the gene activation and IGFBP2.

      (7) Figure 4- while the experiments conducted support the role of IGFBP2 in increasing hematopoietic output, there is no experimental evidence to prove its function through paracrine signalling in HECs. The authors need to provide some evidence of how IGFBP2 supplementation specifically expands only the hematopoietic progenitors. Experimental strategies involving specifically targeting IGFBP2 in hemogenic/arterial endothelial cells are required to prove its cell type specific function. Additionally, assessing the in vivo functional potential of the hematopoietic cells generated in the presence of IGFBP2, by bone-marrow transplantation of CD34+ CD43+ cells, is essential. 

      The role of IGFBP2 in the context of HSC production and expansion was not the topic of our research, and we have not claimed that IGFBP2 affects the long-term repopulating capacity of HSPCs. Therefore, we believe that the requested experiments are not required to support the specific claims that we do make. We have now provided more experiments and bioinformatic analysis that support the role of IGFBP2 in inducing the progression of EHT from arterial cells to hemogenic endothelium, and to avoid misunderstandings, we have toned down our claims by editing the text regarding its paracrine effects.

      (8) Figure 4C-D - It is recommended to plot % populations along with fold change value. As this is a key finding, it is important to perform flow cytometry for additional hematopoietic markers- CD144, CD235a and CD41a to demonstrate whether this strategy can also expand erythroid-megakaryocyte progenitors.

      Figure 4C already shows the percentage values; we have now added the percentages for Figure 4D in Supplementary Figure 5C. We have also performed additional analysis as requested and added the data obtained to Supplementary Figure 5D.

      (9) In Figure 5, analysis showing the frequency of cells constituting different clusters, between untreated and IGFBP2-treated samples in the single-cell transcriptome analysis is essential. Additional experiments are required to validate the function of IGFBP2 through modulation of metabolic activity. Inhibition of oxidative phosphorylation in the IGFBP2-treated cells should reduce the hematopoietic output. Authors should consider doing these experiments to provide a stronger mechanistic insight into IGFBP2-mediated regulation of hematopoietic emergence.

      We have now included the requested cluster composition in Supplementary Figure 5F. We decided not to include further tests on the metabolic profile of IGFBP2-treated cells because we already discuss other papers showing, with selective inhibitors, that the EHT coincides with a glycolysis-to-OxPhos switch.

      (10) It is very striking to see that IGFBP2 supplementation changes the transcriptional profile of developing hematopoietic cells by increasing transcription of OXPHOS-related genes with concomitant reduction of glycolytic signatures, particularly at Day 13. However, the mitochondrial ATP rate measurements do not seem convincing. The bioenergetic profiles show that when mitochondrial inhibitors are added, both groups exhibit decreased OCR values and, on the other hand, higher ECAR. This indicates that both groups have the capability to utilize OXPHOS or glycolysis and may only differ in their basal respiration rates.

      Differences in proliferation rate can cause basal respiration to change. There is no information on how the bioenergetic profile was normalized (cell no./protein amount). Given that IGFBP2 has been shown to increase proliferation, it is very likely that the cells treated with IGFBP2 proliferated faster and therefore have higher OCR. The data needs to be normalized appropriately to negate this possibility.

      We have previously tested whether IGFBP2 causes an increase in proliferation by analysing the cell cycle of cells treated with it, as we initially thought this could be a mechanism of action. We have now provided the quantification of the cell cycle in the cells treated with IGFBP2, showing that no effect on the cell cycle was observed (Supplementary Figure 4E). Following this analysis, we decided to plate the same number of cells and check their density under the microscope before running the experiment; each experiment was done in triplicate for each condition. We have now added this information to the Methods section (lines 806-813). We did not comment on the basal difference, which we agree might be due to several factors; we only compared the difference in response to the inhibitors, which is not affected by the basal level but exclusively by their Δ values. We have also included the formulas used to calculate the ATP production rate.

      Overall, it appears that IGFBP2 does not seem to primarily cause metabolic changes, but simply accelerates the metabolic dependency on OXPHOS. Hence, the term 'metabolic remodelling' must be avoided unless IGFBP2 depletion/loss of function analysis is shown.

      We thank the reviewer for suggesting how to interpret the data about the dependency on OXPHOS. We have now changed the conclusions and claims about the effect of IGFBP2. We have also included a cell cycle analysis of the hematopoietic cells derived upon IGFBP2 addition to show that they don’t show differences in proliferation that could cause the increase in colony formation we observed. Regarding the assay, we plated the same number of cells for each group to make sure we were comparing the same number of cells, which we also assessed under the microscope before the test, and we eliminated the suspension cells during the washes that preceded the measurement. The reviewer is correct in indicating that there is a basal difference in the value of OCR and ECAR where the IGFBP2 is lower at the start and not higher, which would not conceal higher proliferation. Finally, the ATP production rate is calculated on the variation of OCR and ECAR upon the addition of inhibitors, which normalizes for the basal differences.

    2. eLife Assessment

      This study presents useful findings to inform and improve the in vitro differentiation of hematopoietic progenitor cells from human induced pluripotent stem cells. Relying on a well-characterised technical approach, the data analysis is overall solid and reasonably supports the main conclusions.

    3. Reviewer #1 (Public Review):

      Summary:

      The work from Petazzi et al. aimed at identifying novel factors supporting the differentiation of human hematopoietic progenitors from induced-pluripotent stem cells (iPSCs). The authors developed an inducible CRISPR-mediated activation strategy (iCRISPRa) to test the impact of newly identified candidate factors on the generation of hematopoietic progenitors in vitro. They first compared previously published transcriptomic data of iPSC-derived hemato-endothelial populations with cells isolated ex vivo from the aorta-gonad-mesonephros (AGM) region of the human embryo and they identified 9 transcription factors expressed in the aortic hemogenic endothelium that were poorly expressed in the in vitro differentiated cells. They then tested the activation of these candidate factors in an iPSC-based culture system supporting the differentiation of hematopoietic progenitors in vitro. They found that the IGF binding protein 2 (IGFBP2) was the most upregulated gene in arterial endothelium after activation and they demonstrated that IGFBP2 promotes the generation of functional hematopoietic progenitors in vitro.

      Strengths:

      The authors developed a very useful doxycycline-inducible system to activate the expression of specific candidate genes in human iPSC. This approach allows us to simultaneously test the impact of 9 different transcription factors on in vitro differentiation of hematopoietic cells, and the system appears to be very versatile and applicable to a broad variety of studies. Using this approach, the authors exhaustively demonstrated the role of IGFBP2 in promoting the generation of functional hematopoietic progenitors in vitro from iPSCs.

      Weaknesses:

      The authors performed a very thorough characterization of the system in proof-of-principle experiments activating a single transcription factor. However, when 9 independent factors were used, it is not always clear whether the observed results were the consequence of the simultaneous activation of all 9 TF or just a subset of them.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Summary:

      In this manuscript, the molecular mechanism of interaction of daptomycin (DAP) with bacterial membrane phospholipids has been explored by fluorescence and CD spectroscopy, mass spectrometry, and RP-HPLC. The mechanism of binding was found to be a two-step process. A fast reversible step of binding to the surface and a slow irreversible step of membrane insertion. Fluorescence-based titrations were performed and analysed to infer that daptomycin bound simultaneously two molecules of PG with nanomolar affinity in the presence of calcium. Conformational change but not membrane insertion was observed for DAP in the presence of cardiolipin and calcium.

      Strengths:

      The strength of the study is skillful execution of biophysical experiments, especially stopped-flow kinetics that capture the first surface binding event, and careful delineation of the stoichiometry.

      Weaknesses:

      The weakness of the study is that it does not add substantially to the previously known information and fails to provide additional molecular details. The current study provides incremental information on DAP-PG-calcium association but fails to capture the complex in mass spectrometry. The ITC and NMR studies with G3P are inconclusive. There are no structural models presented. Another aspect missing from the study is the reconciliation between PG in the monomer, micellar, and membrane forms.

      Besides the two-stage process, another important finding in the current work is the stable complex that plays a critical role in the drug uptake both in vitro and in B. subtilis. This complex has been shown to be a stable species in HPLC and its binding stoichiometry and affinity have been quantitatively characterized. The complex may not be stable enough in gas phase to be detected in the MS analysis, which was designed to detect the phospholipid and Dap components, not the complex itself. The structural model of this complex is clearly proposed and presented in Figure 6. 

      The NMR and ITC studies have a very clear conclusion that Dap has a weak interaction with the PG headgroup alone, which is unable to account for the Dap-PG interaction observed in the fluorescence studies. Thus, the whole PG molecule has to be involved in the interaction, leading to the discovery of the stable complex.  

      Reviewer #2 (Recommendations For The Authors):

      (1) I appreciate and agree with the comment that there are stages of daptomycin insertion, and these might involve the formation of different complexes with different binding partners (e.g. pre-insertion vs quaternary vs bactericidal). However, it seems like lipid II is an apparent participant in daptomycin membrane dynamics (Grein et al. Nature Communications 2020). It's not clear why this was excluded from analysis by the authors, or what basis there is for the discussion statement that the quaternary complex can shift into the bactericidal complex by exchanging 1 PG for lipid II. 

      We agree that lipid II and other isoprenyl lipids may be involved in the uptake and insertion of daptomycin into membrane according to the results of the Nat. Comm. paper. However, these isoprenyl lipids are very small components of the membrane in comparison to PG and their contribution to the drug uptake is thus expected to be much less significant. Nonetheless, we included farnesyl pyrophosphate (FPP) as an analog of bactoprenol pyrophosphate (C55PP), which was reported to have the same promoting effect as lipid II in the previous study, in our study but found no promoting effect in the fluorescence assay (Fig. 2B). In addition, no complex was formed when FPP replaced PG in our preparation and analysis of the drug-lipid complex. In consideration of these negative results and the expected small contribution, other isoprenyl lipids or their analogs were not included in the study.

      The statement of forming the proposed bactericidal complex from the identified complex is a speculation that is possible only when lipid II has a higher affinity for Dap than a PG ligand. To avoid confusion, we deleted the sentence in the revision.

      (2) The detailed examination of daptomycin dynamics, particularly on the millisecond scale, in this paper is ideal for characterizing the effect of lipid II on daptomycin insertion. It would be helpful to either include lipid II in some analyses (micelle binding, fluorescence shifts, CD) or at least address why it was excluded from the scope of this work.

      As mentioned in the response to the first comment, we did not exclude isoprenyl lipids in our study but used some of their analogs in the fluorescence assay. Besides FPP mentioned above, we also tested geranyl pyrophosphate and geranyl monophosphate but obtained the same negative results. Lipid II was not directly used because it is one of the three isoprenyl lipids reported to have the same promoting effects in the Nat. Comm. paper and also because its preparation is not easy. Even if lipid II were different from other isoprenyl lipids in promoting membrane binding, its contribution is likely negligible at the reversible stage compared to the phospholipids because of its minuscule content in bacterial membrane. This is the main reason we did not use the isoprenyl lipids in the fast kinetic study (this stage only involves reversible binding, not insertion). 

      (3) Grein et al. 2020 saw that PG did not have a strong effect on daptomycin interaction with membranes. I believe this discrepancy is more likely due to the complex physical parameters of supported bilayers versus micelles/vesicles or some other methodological variable, but if the authors have more insight on this, it would be valuable commentary in the discussion.

      We totally agree that the discrepancy is likely due to the different conditions in the assays. It is hard to tell exactly what causes the difference. Thus, we did not attempt to comment on the cause of this difference in the discussion.

      (4) Isolation of the daptomycin complex from B. subtilis cells clearly had different traces from the in vitro complex; is it possible that lipid II is present in the B. subtilis complex? If not, a time-course extraction could be useful to support the model that different complexes have different activities. Isolates from early-stage incubation with daptomycin may lack lipid II but isolates from longer incubations may have lipid II present as the complex shifts from insertion to bactericidal.

      From the day we isolated the complex from B. subtilis, we have been looking for evidence for the previously proposed lipid complexes containing lipid II or other isoprenyl lipids but have not been successful. We did not see any sign of lipid II or other isoprenyl lipids in the MALDI or ESI mass spectroscopic data. The minute peaks in the HPLC traces are not the expected complexes in separate LC-MS analysis. However, this does not mean that such complexes are not present in the isolated PG-containing complex because: (1) the amount of such complexes may be too small to be detected due to the low content of the isoprenyl lipids; (2) the isoprenyl lipids, particularly lipid II, are not easily ionizable due to their size and unique structure for detection in mass spectrometry. 

      We don’t think the drug treatment time is the reason for the failure in detecting lipid II or other isoprenyl lipids. In our reported experiment, the cells were treated with a very high dose of Dap for 2 hours before extraction. In a separate experiment done recently, we treated B. subtilis at 1/3 of the used dose under the same condition and found all treated cells were dead after 1 hour in a titration assay, consistent with the results from reported time-killing assays in the literature. From this result, the proposed bactericidal lipid-containing complex should have been formed in the treated cells used in our extraction and isolated along with the PG-containing complex. It was not detected likely due to the reasons discussed above. To avoid the interference of the PG-containing complex, a large amount of bacterial cells might have to be treated at a low dose to isolate enough amount of the lipid II-containing complex for identification. However, isolation or identification of the lipid II-containing complex is outside the scope of the current investigation and is therefore not pursued. 

      (5) Part of the daptomycin mechanism of interacting with bacterial membranes involves the flipping of daptomycin from one leaflet to another. There was some mentioned work on the consistency of results between micelles and vesicles, but the dynamics or existence of a flipping complex in the bilayer system wasn't addressed at all in this paper.

      The current investigation makes no attempt to solve all problems in the daptomycin mode of action and is limited to the uptake of the drug, up to the point when Dap is inserted into the membrane. Within this scope, flipping of the complex is not yet involved and is thus irrelevant to the study. How the complex is flipped and used to kill the bacteria is what should be investigated next.  

      (6) The authors mention data with phosphatidylethanolamine in the text, but I could not find the data in the main or supplemental figures. I recommend including it in at least one of the figures.

      It is much appreciated that this error was identified. The POPE data were lost when the graphic (Fig. 2B) was assembled in Adobe to create Figure 2. We have redrawn the graphic and reassembled the figure to solve this problem. Fig. 2B has also been modified to use micromolar for the concentration of the lipids.

      (7) Readability point: I'd suggest some consistency in the concentrations mentioned. Making the concentrations either all molar-based or all percentage-based would make comparison across figures easier.

      As suggested, we have changed the % into micromolar concentrations in Fig. 2B and also in Fig. 3A. 

      (8) The model figure is quite difficult to interpret, particularly the final stage of the tail unfolding. I recommend the authors use a zoomed-in inset for this stage, or at least simplify the diagram by removing the non-participating lipid structures. The figure legend for the model figure should also have a brief description of the events and what the arrows mean, particularly the POPS PG arrow in the final panel of the figure. I am assuming here the authors are implying that daptomycin can transiently interact with one lipid species and move to another, but the arrow here suggests that daptomycin is moving through the lipid headgroup space.

      We really appreciate the suggestions. As suggested, we have added an inset to show the pre-insertion complex more clearly. In addition, we have removed the green arrows originally intended to show the re-organization/movement of the phospholipids. Moreover, the legend has been changed to ‘Proposed mechanism for the two-phased uptake of Dap into bacterial membrane. In the first phase, Dap reversibly binds to negative phospholipids with a hidden tail in the headgroup region, where it combines with two PG molecules to form a pre-insertion complex. In the second phase, the hidden tail unfolds and irreversibly inserts into the membrane. The inset shows the headgroup of the pre-insertion complex with the broad arrow showing the direction for the unfolding of the hidden tail. The red dots denote Ca2+.’

      (9) The authors listed the Kd for daptomycin and 2 PG as 7.2 x 10^-15 M^2. Is this correct? This is an affinity in the femtomolar range.

      Please note that this Kd is for the simultaneous binding of two PG molecules, not for the binding of a single ligand, which is what a Kd usually refers to. Assuming that each PG contributes equally to this interaction, the binding affinity for each ligand is then the square root of 7.2 x 10^-15 M^2, which equals 8.5 x 10^-8 M. This is equivalent to a nanomolar affinity for PG and is a reasonably high affinity.
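
      The arithmetic in this response can be checked with a short sketch. The function name is hypothetical, and the equal-contribution assumption is the one stated by the authors, under which the per-ligand constant is the n-th root of the overall dissociation constant.

```python
def per_ligand_kd(kd_overall, n_ligands=2):
    """Per-ligand dissociation constant, assuming each ligand contributes equally.

    For n ligands bound simultaneously, the overall Kd has units of M^n,
    so its n-th root has units of M.
    """
    return kd_overall ** (1.0 / n_ligands)


kd_overall = 7.2e-15  # M^2, overall Kd reported for daptomycin binding two PG molecules
kd_single = per_ligand_kd(kd_overall)
print(f"{kd_single:.1e} M")  # prints 8.5e-08 M
```

      The result, about 85 nM per PG, is why an overall constant that looks femtomolar in magnitude corresponds to a nanomolar affinity for each ligand; the units (M^2 versus M) make the two numbers not directly comparable.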

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors reported an increase in daptomycin intensity with the increasing amount of negatively charged DMPG. A similar observation has been reported for GUVs, however, the authors did not refer to this paper in their manuscript: E. Krok, M. Stephan, R. Dimova, L. Piatkowski, Tunable biomimetic bacterial membranes from binary and ternary lipid mixtures and their application in antimicrobial testing, Biochim. Biophys. Acta - Biomembr. 1865 (2023) [1]. This paper is also consistent with the authors' observation that there is negligible fluorescence detected for the membranes composed of PC lipids upon exposure to the Dap treatment.

      As suggested, this paper is cited as ref. 29 in the revision by adding the following sentence at the end of the section ‘Dependence of Dap uptake on phosphatidylglycerol.’: ‘PG-dependent increase of the steady-state fluorescence was also observed in giant unilamellar vesicles (GUVs).29’. The numbering is changed accordingly for the remaining references.  

      (2) Please include the plot of the steady-state Kyn fluorescence vs the content of POPA (Figure 2C shows traces for DMPG, CL, and POPS). Both POPA and POPS lipids are negatively charged, however, POPS seems to interact with Dap, while POPA does not. In my opinion, this observation is really interesting and might deserve a more thorough discussion. The authors might want to describe what could be the mechanism behind this lipid-specific mode of binding.

      As suggested, a plot is now added for POPA in Fig. 2C, which is essentially a flat line with no significant increase in Kyn fluorescence. Indeed, the different effect of the negative phospholipids is very interesting, indicating that the reversible binding of Dap to the lipid surface is dependent not only on the Ca2+-mediated ionic interaction but also on the structure of the headgroup. In other words, Dap recognizes the phospholipids at the surface binding stage. Considering this headgroup specificity, the last sentence in the second paragraph in ‘Discussion’ is changed from ‘In addition, due to the low lipid specificity, this reversible binding likely involves Ca2+-mediated ionic interaction between Dap and the phosphoryl moiety of the headgroups.’ to ‘In addition, due to the specificity for negative phospholipids (Fig. 2B and 2C), this reversible binding of Dap likely involves both a nonspecific Ca2+-mediated ionic interaction and a specific interaction with the remaining part of the headgroups.’

      (3) The authors write that they propose a novel mechanism for the Ca2+-dependent insertion of Dap to the bacterial membrane, however, they rather ignored the already published findings and hypotheses regarding this process. In fact the role of Ca2+, as well as the proposed conformational changes of Dap, which allow its deeper insertion into the membrane are well known:

      The role of Ca2+ ions in the mechanism of binding is actually three-fold: (i) neutralization of daptomycin charge [2], (ii) creating the connection between lipids and daptomycin and (iii) inducing two daptomycin conformational changes. It should be noted that the interactions between calcium ions and daptomycin are 2-3 orders of magnitude stronger than between daptomycin and PG lipids [3,4]. Thus, upon the addition of CaCl2 to the solution, the divalent cations of calcium bind preferentially to the daptomycin, rather than to the negatively charged PG lipids, which results in the decrease of daptomycin net negative charge but also leads to its first conformational change [4]. Upon binding between calcium ions and two aspartate residues, the area of the hydrophobic surface increases, which allows the daptomycin to interact with the negatively charged membrane. In the next step, Ca2+ acts as a bridge connecting daptomycin with the anionic lipids. This event leads to the second conformational change, which enables deeper insertion of daptomycin into the lipid membrane and enables its fluorescence [4]. The overall mechanism has a sequential character, where the binding of daptomycin-Ca2+ complex to the negatively charged PG (or CA) occurs at the end.

      The authors should focus on emphasizing the novelty of their manuscript, keeping in mind the already published paper.

      We agree with the comments on the three general roles of calcium ion in the Dap interaction with membrane. The current investigation does not ignore the previous findings, which involve many more works than mentioned above, but takes these findings as common knowledge. Actually, the role of calcium ion is not the focus of current work. Instead, the current work focuses on how the drug is taken up and inserted into the membrane in the presence of the ion and how its structure changes in this process. With the known roles of calcium ion in mind, we propose an uptake mechanism (Fig. 6) that shows no conflict with the common knowledge.

      We would like to point out that the ‘deeper insertion into the membrane’ in the comment is different from the membrane insertion referred to in our manuscript. This ‘deeper insertion’ still remains in the reversible stage of binding to the membrane surface because all negative phospholipids can do this (causing a conformational change and fluorescence increase, as quantified in Fig. 2C), but because of our work we now know that only PG can enable irreversible membrane insertion. In addition, the comment that calcium binding to daptomycin causes the first conformational change is not supported by our finding that no conformational change is found for Dap in the presence of calcium in a lipid-free environment (Fig. 3B). One important aspect of the novelty and contribution of our work is to clear up some of these ambiguities in the literature. Another contribution of our work is to demonstrate the formation of a stable complex between Dap and PG with a defined stoichiometry and its crucial role in the drug uptake.

      (4) One paragraph in the section "Ca2+- dependent interaction between Dap and DMPG" is devoted to a discussion of the formation of precipitate upon extraction of DMPG-containing micelles, exposed to Dap in the calcium-rich environment. In contrast, in the absence of Dap, no precipitate was detected. The authors did not provide any visual proof for their statement. Please include proper photographs in the supplementary information.

      The precipitate formed upon extraction of the DMPG-containing micelles was too little to be visually identifiable but could be collected by centrifugation and detected by fluorescence or HPLC after dissolving in DMSO. For visualization, we show below the precipitate formed using higher amounts of Dap and DMPG. The Dap-DMPG-Ca2+ complex (left tube) was formed by mixing 1 mM Dap, 2 mM DMPG and 1 mM Ca2+ and the control (right tube) was a mixture of 2 mM DMPG and 1 mM Ca2+. This is now added as Fig. S7 in the supplementary information (the index is modified accordingly) and cited in the main text.

      (5) The authors wrote that it is not clear how many calcium ions are bound to Dap-2PG complex (page 11, Discussion section). There are already reports discussing this issue. I recommend citing the paper discussing that exactly two Ca2+ ions bind to a single Dap molecule: R. Taylor, K. Butt, B. Scott, T. Zhang, J.K. Muraih, E. Mintzer, S. Taylor, M. Palmer, Two successive calcium-dependent transitions mediate membrane binding and oligomerization of daptomycin and the related antibiotic A54145, Biochim. Biophys. Acta - Biomembr. 1858, (2016) 1999-2005 [5]

      We were aware of the cited work that shows binding of two Ca2+ but also noted that there are more works showing one Ca2+ in the binding, such as the paper in [Ho, S. W., Jung, D., Calhoun, J. R., Lear, J. D., Okon, M., Scott, W. R. P., Hancock, R. E. W., & Straus, S. K. (2008), Effect of divalent cations on the structure of the antibiotic daptomycin. European Biophysics Journal, 37(4), 421–433.]. That was the reason we said ‘it is not clear how many calcium ions are bound to Dap-2PG complex’. Now, both papers are cited (as Ref. #33, 34) to support this statement.

      (6) The authors wrote two contradictory statements:

      -  PG cannot be found in mammalian cell membranes:

      "Moreover, the complete dependence of the membrane insertion on PG also explains why Dap selectively attacks Gram-positive bacteria without affecting mammalian cells, because PG is present only in bacterial membrane but not in mammalian membrane. " (Page 10, Discussion section, last sentence of the first paragraph)

      "However, Dap absorbed on bacterial surface is continuously inserted into the acyl layer via formation of complex with PG in a time scale of minutes, whereas no irreversible insertion of Dap occurs on mammalian membrane due to the absence of PG while the bound Dap is continuously released to the circulation as the drug is depleted by the bacteria." (Page 13, Discussion section)

      -  PG in trace amounts is present in mammalian membranes:

      "The proposed requirement of the pre-insertion quaternary complex increases the threshold of PG content for the membrane insertion to happen and thus makes it impossible on the surface of mammalian cells even if their plasma membrane contains a trace amount of PG." (Page 13, Discussion section).

      In fact, phosphatidylglycerol comprises 1-2 mol% of the mammalian cell membranes. Please, correct this information, which in this form is misleading to the readers.

      We appreciate the comments about the PG content in mammalian cells. Changes are made as listed below:

      (1) p10, the sentence is changed to ‘Moreover, the complete dependence of the membrane insertion on PG also explains why Dap selectively attacks Gram-positive bacteria without affecting mammalian cells, because PG is a major phospholipid in bacterial membrane but is a minor component in mammalian membrane.’ 

      (2) p13, the sentence is changed to ‘However, Dap absorbed on bacterial surface is continuously inserted into the acyl layer via formation of complex with PG in a time scale of minutes, whereas little irreversible insertion of Dap occurs on mammalian membrane due to the low content of PG while the bound Dap is continuously released to the circulation as the drug is depleted by the bacteria.’

      (3) p13, another sentence is modified to ‘The proposed requirement of the pre-insertion quaternary complex increases the threshold of PG content for the membrane insertion to happen and thus makes it less likely on the surface of mammalian cells that contain PG at a low level in the membrane.’ 

      (7) Please include information that Dap is effective only against Gram-positive bacteria and does not show antimicrobial properties against Gram-negative strains. The authors focused on emphasizing that Dap does not affect mammalian membranes, most likely due to the low PG content, however even membranes of Gram-negative bacteria are not susceptible to the Dap, despite the relatively high content of negatively charged PG in the inner membrane (e.g. inner cell membrane of E. coli has ~20% PG).

      The requested information is already included in ‘Introduction’. In this part, Dap is introduced as being active only against Gram-positive bacteria, implying that it is not active against Gram-negative bacteria. The reason Dap is inactive against E. coli or other Gram-negative bacteria is that the outer membrane prevents the antibiotic from accessing the PG in the inner membrane to cause any harm. When the outer membrane is removed, Dap will also attack the plasma membrane of Gram-negative bacteria.

      Literature cited in the comments:

      (1) E. Krok, M. Stephan, R. Dimova, L. Piatkowski, Tunable biomimetic bacterial membranes from binary and ternary lipid mixtures and their application in antimicrobial testing, Biochim. Biophys. Acta - Biomembr. 1865 (2023). https://doi.org/10.1101/2023.02.12.528174.

      (2) S.W. Ho, D. Jung, J.R. Calhoun, J.D. Lear, M. Okon, W.R.P. Scott, R.E.W. Hancock, S.K. Straus, Effect of divalent cations on the structure of the antibiotic daptomycin, Eur. Biophys. J. 37 (2008) 421-433. https://doi.org/10.1007/S00249-007-0227-2.

      (3) A. Pokorny, P.F. Almeida, The Antibiotic Peptide Daptomycin Functions by Reorganizing the Membrane, J. Membr. Biol. 254 (2021) 97-108. https://doi.org/10.1007/s00232-021-00175-0.

      (4) L. Robbel, M.A. Marahiel, Daptomycin, a bacterial lipopeptide synthesized by a nonribosomal machinery, J. Biol. Chem. 285 (2010) 27501-27508. https://doi.org/10.1074/JBC.R110.128181.

      (5) R. Taylor, K. Butt, B. Scott, T. Zhang, J.K. Muraih, E. Mintzer, S. Taylor, M. Palmer, Two successive calcium-dependent transitions mediate membrane binding and oligomerization of daptomycin and the related antibiotic A54145, Biochim. Biophys. Acta - Biomembr. 1858 (2016) 1999-2005. https://doi.org/10.1016/J.BBAMEM.2016.05.020.

    2. eLife Assessment

      This valuable study describes the molecular mechanism of daptomycin insertion into bacterial membranes. The authors provide solid in vitro evidence for the early events of daptomycin interaction with phospholipid headgroups and stronger, specific interaction with phosphatidylglycerol. This work will be of interest to bacterial membrane biologists and biochemists working in the antimicrobial resistance field.

    3. Reviewer #3 (Public review):

      Summary:

      Machhua et al. in their work focused on unravelling the molecular mechanism of daptomycin binding and interaction with bacterial cell membranes. Daptomycin (Dap) is an acidic, cyclic lipopeptide composed of 13 amino acids, known for preferential binding to anionic lipids, particularly phosphatidylglycerol (PG), which are prevalent components in the membranes of Gram-positive bacteria. The process of binding and antimicrobial efficacy of Dap are significantly influenced by the ionic composition of the surrounding environment, especially the presence of Ca2+ ions. The authors underscore the presence of significant knowledge gaps in our understanding of daptomycin's mode of action. Several critical questions remain unanswered, including the basis for selective recognition and accumulation in membranes of Gram-positive strains, the specific role of Ca2+ ions in this process, and the mechanisms by which daptomycin binds to and inserts into the cell membrane.

      Dap is intrinsically fluorescent due to its kynurenine residue (Kyn-13) and this property allows direct imaging of Dap binding to model cell membranes without the need of additional labeling. Taking advantage of this Dap autofluorescence, authors monitored the emission intensity of micelles, composed of varying DMPG content upon their exposure to Dap and compared it with the kinetics of fluorescence observed for zwitterionic DMPC and other negatively charged lipids such as cardiolipin (CA), POPA and POPS. The authors noted that the linear relationship between DMPG content and Dap fluorescence is strongly lipid-specific, as it was not observed for other anionic lipids. The manuscript sheds light on the specificity of Dap's interaction with CA and DMPG lipids. Through Ca2+ sequestration with EGTA, the authors demonstrated that the binding of Dap with CA is reversible, while its interaction with DMPG results in the irreversible insertion of Dap into the lipid membrane structure, caused by the significant conformational change of this lipopeptide. The formation of a stable DMPG-Dap complex was also verified in bacterial cells isolated from Gram-positive bacteria B. subtilis, where Dap exhibited a permanent binding to PG lipids.

      Altogether, the authors endeavored to provide novel insights into the molecular basis of Dap binding, interaction, and the mechanism of insertion into bacterial cell membranes. Such understanding holds promise for the development of innovative strategies in combating drug resistance and the emergence of so-called superbugs.

      Strengths:

      - The manuscript by Machhua et al. provides a comprehensive analysis of the Dap mechanism of binding and interaction with the membrane. It discusses various aspects of this only apparently trivial interaction, such as the importance of PG presence in the membrane, the impact of Ca2+ ions, and different mechanisms of Dap binding with other negatively charged lipids.
      - The authors focused not only on model membranes (micelles) but also extended their research to bacterial cell membranes obtained from B. subtilis.
      - The research is not only a report of the experimental findings but also tries to give potential hypotheses explaining the molecular mechanisms behind the observed results.

      Weaknesses:

      - The authors overestimate their findings, stating that they propose a novel mechanism of Dap interaction with bacterial cell membranes. This research is the extension of the hypotheses that have already been reported.
      - The literature study and overall discussion about the mechanism of action of Ca2+ ions or conformational changes of daptomycin could be improved.

    1. eLife Assessment

      This important study provides empirical evidence of the effects of genetic diversity and species diversity on ecosystem functions across multi-trophic levels in an aquatic ecosystem. The support for these findings is solid, but a more nuanced interpretation of the results could strengthen the conclusions. The work will be of interest to ecologists working on multi-trophic relationships and biodiversity.

    2. Reviewer #1 (Public review):

      Summary:

      This work used a comprehensive dataset to compare the effects of species diversity and genetic diversity within each trophic level and across three trophic levels. The results stated that species diversity had negative effects on ecosystem functions, while genetic diversity had positive effects. Additionally, these effects were observed only within each trophic level and not across the three trophic levels studied. Although the effects of biodiversity, especially genetic diversity across multi-trophic levels, have been shown to be important, there are still very few empirical studies on this topic due to the complex relationships and difficulty in obtaining data. This study collected an excellent dataset to address this question, enhancing our understanding of genetic diversity effects in aquatic ecosystems.

      Strengths:

      The study collected an extensive dataset that includes species diversity of primary producers (riparian trees), primary consumers (macroinvertebrate shredders), and secondary consumers (fish). It also includes genetic diversity of the dominant species in each trophic level, biomass production, decomposition rates, and environmental data. The writing is logical and easy to follow.

      Weaknesses:

      The two main conclusions are overly generalized: (1) species diversity had negative effects on ecosystem functions, while genetic diversity had positive effects, and (2) these effects were observed only within each trophic level, not across the three levels. Analysis of the raw data shows that species and genetic diversity have different effects depending on the ecosystem function. For example, neither affected invertebrate biomass, but species diversity positively influenced fish biomass, while genetic diversity had no effect. Furthermore, Table S2 reveals that only four effect sizes were significant (P < 0.05): one positive genetic effect, one negative genetic effect, and two negative species effects, with two effects within a trophic level and two across trophic levels. Additionally, using a P < 0.2 threshold to omit lines in the SEMs is uncommon and was not adequately justified. A more cautious interpretation of the results, with acknowledgment of the variability observed in the raw data, would strengthen the manuscript.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work used a comprehensive dataset to compare the effects of species diversity and genetic diversity within each trophic level and across three trophic levels. The results showed that species diversity had negative effects on ecosystem functions, while genetic diversity had positive effects. These effects were observed only within each trophic level and not across the three trophic levels studied. Although the effects of biodiversity, especially genetic diversity across multi-trophic levels, have been shown to be important, there are still very few empirical studies on this topic due to the complex relationships and difficulty in obtaining data. This study collected an excellent dataset to address this question, enhancing our understanding of genetic diversity effects in aquatic ecosystems.

      Strengths:

      The study collected an extensive dataset that includes species diversity of primary producers (riparian trees), primary consumers (macroinvertebrate shredders), and secondary consumers (fish). It also includes the genetic diversity of the dominant species at each trophic level, biomass production, decomposition rates, and environmental data.

      The conclusions of this paper are mostly well supported by the data and the writing is logical and easy to follow.

      Weaknesses:

      (1) While the dataset is impressive, the authors conducted analyses more akin to a "meta-analysis," leaving out important basic information about the raw data in the manuscript. Given the complexity of the relationships between different trophic levels and ecosystem functions, it would be beneficial for the authors to show the results of each SEM (structural equation model).

      We understand the point raised by the reviewer. We now provide individual SEMs (Figure 3), although we limit causal relationships to those for which the p-value was below 0.2 for the sake of graphical clarity. We also provide the percentage of explained variance for each ecosystem function. We describe these graphs in the Results section (see l. 317-328) and discuss them (see l. 387-398). Note that we do not detail each function separately as this would (in our opinion) result in a long descriptive paragraph from which it might be difficult to extract the key information. Rather, we summarize the percentage of explained variance for each function and discuss the strength of environmental vs biodiversity effects for some examples. In the Discussion, we explain why environmental effects (on functions and biodiversity) are relatively weak. We mainly attribute this to the sampling scheme, which follows an East-West gradient (weak altitudinal range) rather than an upstream-downstream gradient as is traditionally done in rivers. The reasoning behind this sampling scheme is explained in our companion paper (Fargeot et al. Oikos 2023), to which we now refer more explicitly in the MS. Briefly, using an upstream-downstream gradient would certainly have strengthened the effects of the environment, but this would have made the inference of biodiversity effects extremely complex due to strong collinearity among environmental and biodiversity parameters.

      (2) The main results presented in the manuscript are derived from a "metadata" analysis of effect sizes. However, the methods used to obtain these effect sizes are not sufficiently clarified. By analyzing the effect sizes of species diversity and genetic diversity on these ecosystem functions, the results showed that species diversity had negative effects, while genetic diversity had positive effects on ecosystem functions. The negative effects of species diversity contradict many studies conducted in biodiversity experiments. The authors argue that their study is more relevant because it is based on a natural system, which is closer to reality, but they also acknowledge that natural systems make it harder to detect underlying mechanisms. Providing more results based on the raw data and offering more explanations of the possible mechanisms in the introduction and discussion might help readers understand why and in what context species diversity could have negative effects.

      We now provide more details. However, we are unfortunately not sure that this helped in reaching a stronger explanation of the underlying mechanisms. To be frank, we did not succeed in improving mechanistic inferences based on the outputs of the SEMs. We visually explored some additional relationships (e.g. relationships between the biomass of the focal species and that of other species in the assemblage) that we now discuss a bit more, but again, this did not really help in better understanding the processes. We realize this is a limitation of our study and that this can be frustrating for readers. Nonetheless, as said in the Discussion, field-based studies must be taken for what they are: observational studies forming the basis for future mechanistic studies. Although we failed to explain mechanisms, we still think that we provide important field-based evidence for the importance of biodiversity (as a whole) for ecosystem functions.

      (3) Environmental variation was included in the analyses to test if the environment would modulate the effects of biodiversity on ecosystem functions. However, the main results and conclusions did not sufficiently address this aspect.

      This is now addressed; see our response to your first comment. We now explain (Results section) and discuss environmental effects. As explained in the MS, environmental effects are similar in strength to those of biodiversity and are not that high, which is partly explained by the sampling scheme (see Fargeot et al. 2023). This is a choice we made at the onset of the experiment, as we wanted to focus on biodiversity effects and avoid the strong collinearity that is generally found in rivers (which impedes robust statistical inference).

      Reviewer #2 (Public review):

      Summary:

      Fargeot et al. investigated the relative importance of genetic and species diversity on ecosystem function and examined whether this relationship varies within or between trophic-level responses. To do so, they conducted a well-designed field survey measuring species diversity at 3 trophic levels (primary producers [trees], primary consumers [macroinvertebrate shredders], and secondary consumers [fishes]), genetic diversity in a dominant species within each of these 3 trophic levels and 7 ecosystem functions across 52 riverine sites in southern France. They show that the effects of genetic and species diversity on ecosystem functions are similar in magnitude, but when examining within-trophic level responses, operate in different directions: genetic diversity having a positive effect and species diversity a negative one. This data adds to growing evidence from manipulated experiments that both species and genetic diversity can impact ecosystem function and builds upon this by showing these effects can be observed in nature.

      Strengths:

      The study design has resulted in a robust dataset to ask questions about the relative importance of genetic and species diversity of ecosystem function across and within trophic levels.

      Overall, their data supports their conclusions - at least within the system that they are studying - but as mentioned below, it is unclear from this study how general these conclusions would be.

      Weaknesses:

      (4) While a robust dataset, the authors only show the data output from the SEM (i.e., effect size for each individual diversity type per trophic level (6) on each ecosystem function (7)), instead of showing much of the individual data. Although the summary SEM results are interesting and informative, I find that a weakness of this approach is that it is unclear how environmental factors (which were included but not discussed in the results) and levels of diversity were correlated across sites. As species and genetic diversity are often correlated but also can have reciprocal feedbacks on each other (e.g., Vellend 2005), there may be constraints that underpin why the authors observed positive effects of one type of diversity (genetic) but negative effects of the other (species). It may have also been informative to run SEM with links between levels of diversity. By focusing only on the summary of SEM data, the authors may be reducing the strength of their field dataset and ability to draw inferences from multiple questions and understand specific study-system responses.

      We have addressed this remark and we ask the reviewers and the readers to refer to our response to comment 1 from reviewer 1. Regarding co-variation among biodiversity estimates (SGDCs according to Vellend’s framework), we have addressed these issues in a companion paper that we now cite and expand on further in the MS (Fargeot et al. Oikos, 2023). Given the size of the dataset and its complexity (and associated analyses), we decided to focus on patterns of species and genetic biodiversity in a first paper (Oikos paper) and then on the link between biodiversity and functions (this paper). As can be read in the Oikos paper, there is no co-variation in terms of biodiversity estimates; species diversity is not correlated with genetic diversity, and within each facet, there is no co-variation among species. In addition, environmental predictors are highly estimate-specific (i.e. environmental predictors sustaining species and genetic estimates are idiosyncratic). As a result (see the new Figure 3), environmental effects are relatively weak (the same intensity as those of biodiversity) and collinearity among parameters is relatively weak. The second point is important, as it permits better inference of parameters from the models and allows us to discuss direct relationships (as observed in Figure 3, indirect environmental effects are relatively rare). We provide in the Discussion a bit more explanation about the absence of co-variation among biodiversity estimates (see l. 433-440).

      (5) My understanding of SEM is that it gives outputs of the strength/significance of each pathway/relationship; if so, it isn't clear why this wasn't used and why confidence intervals of Z scores were instead used to determine which individual BEFs were significant. In addition, including the 7 SEM pathway outputs in an appendix would have been useful.

      We now provide p-values (Table S2) and the seven models (Figure 3).
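
      For readers unfamiliar with the approach the reviewer refers to, the sketch below shows one common way to obtain a Z score and its confidence interval from a standardized coefficient, namely Fisher's transformation. Whether this matches the exact procedure used in the manuscript is an assumption, and treating a standardized path coefficient like a correlation is itself an approximation. The coefficient 0.30 is a made-up illustrative value; n = 52 is the number of sites mentioned in the public review.

```python
import math


def fisher_z_ci(r: float, n: int, z_crit: float = 1.96):
    """Fisher z-transform of a correlation-like coefficient r measured on
    n independent sites, with an approximate 95% confidence interval.
    Assumes |r| < 1 and n > 3."""
    z = math.atanh(r)                 # Fisher's z
    se = 1.0 / math.sqrt(n - 3)       # large-sample standard error of z
    return z, z - z_crit * se, z + z_crit * se


# Hypothetical standardized coefficient (illustration only, not from the paper).
z, lower, upper = fisher_z_ci(0.30, 52)

# An effect is called significant here when the interval excludes zero.
significant = lower > 0 or upper < 0
print(f"Zr = {z:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}], significant = {significant}")
```

      With 52 sites the standard error is 1/7, so only coefficients of moderate size yield intervals that exclude zero, which is consistent with few individual effects being significant.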

      (6) I don't fully agree with the authors calling this a meta-analysis, as this is a single study of multiple sites within a single region and a specific time point, and not a collection of multiple studies or ecosystems conducted by multiple authors. Rather, the authors are using meta-analysis summary metrics to evaluate their data. The authors tend to focus on these patterns as general trends, but as the data is all from this riverine system, this study could have benefited from focusing on what was going on in this system to underpin these patterns. I'd argue more data is needed to know whether across sites and ecosystems, species diversity and genetic diversity have opposite effects on ecosystem function within trophic levels.

      We agree. “Meta-regression” would perhaps be more adequate than “meta-analyses”. We changed the formulation.

      Reviewer #3 (Public review):

      The manuscript by Fargeot and colleagues assesses the relative effects of species and genetic diversity on ecosystem functioning. This study is very well written and examines the interesting question of whether within-species or among-species diversity correlates with ecosystem functioning, and whether these effects are consistent across trophic levels. The main findings are that genetic diversity appears to have a stronger positive effect on function than species diversity (which appears negative). These results are interesting and have value.

      However, I do have some concerns that could influence the interpretation.

      (7) Scale: the different measures of diversity and function for the different trophic levels are measured over very different spatial scales, for example, trees along 200 m transects and 15 cm traps. It is not clear whether trees 200 m away are having an effect on small-scale function.

      Tree identification and invertebrate (and fish) sampling are done at the same scale. Trees are spread along the river so that their leaves fall directly into the river. Traps have been installed all along the same transect in various micro-habitats. Diversity has been measured at the exact same scale for all organisms. We have modified the MS to make this clear.

      (8) Size of diversity gradients: More information is needed on the actual diversity gradients. One of the issues with surveys of natural systems is that they are of species that have already gone through selection filters from a regional pool, and theoretically, if the environments are similar, you should get similar sets of species, without monocultures. So, if the species diversity gradients range from say, 6 to 8 species, but genetic diversity gradients span an order of magnitude more, you can explain much more variance with genetic diversity. Related to this, species diversity effects on function are often asymptotic at high diversity and so if you are only sampling at the high diversity range, we should expect a strong effect.

      Fish species number varies from 1 to 11, invertebrate family number varies from 15 to 42 and the tree species number varies from 7 to 20 (see Fargeot et al. 2023 for details). We have added this information in the M&M. The gradients are hence relatively large and do not cover a restricted set of values. There is a variance in species number among sites, even if sites are collected along a relatively weak altitudinal gradient. This is obviously complex to compare to SNP (genomic) diversity. Genetic and species effects are similar in effect sizes (percentage of explained variance), so it does not seem we have biased one of the two gradients of biodiversity.

      (9) Ecosystem functions: The functions are largely biomass estimates (except decomposition), and I fail to see how the biomass of a single species can be construed as an ecosystem function. Aren't you just estimating a selection effect in this case?

      The biomass estimated for a certain area represents an estimate of productivity, whatever the number of species being considered. Obviously, productivity of a species can be due to environmental constraints; the biomass is expected to be lower at the niche margin (selection effect). But if these environmental effects are taken into account (which is the case in the SEMs), then the residual variation can be explained by biodiversity effects. We provide an explanation (l. 217-219).

      (10) Note that the article claims to be one of the only studies to look at function across trophic levels, but there are several others out there, for example:

      Thanks, we now cite some of these studies (Li et al 2020, Moi et al. 2021, Seibold et al. 2018).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Introduction:

      The introduction of the manuscript is generally well-structured, and the scientific questions are clearly presented. However, in each paragraph where specific aspects are introduced, the authors do not focus sufficiently on the given points. The current introduction discusses the weaknesses of previous studies extensively but lacks detailed explanations of mechanisms and a clear anticipation of this study's contributions.

      For example:

      L72-77: The authors mention that "genetic diversity may functionally compensate for a species loss," but this point is not highly relevant to the main analyses of this study, which focus on comparing the relative effects of species diversity and genetic diversity.

      Yes true, we understand the point made by the reviewers. We deleted this part of the sentence.

      L87-95: As previously noted, "whether environmental variation decreases or enhances the relative influence of genetic and species diversity on ecosystem functions" was not addressed in this study. Additionally, the last sentence seems unnecessary here, as it does not relate to "environmental variation." The phrase "generate insightful knowledge for future mechanistic models" is vague. It would be helpful to specify what kind of knowledge and what types of future mechanistic models are being referred to.

      We modified these two sentences. We now posit the prediction that what has been observed under controlled conditions (that genetic and species diversity have effects of similar magnitude) might not be the norm under fluctuating environments (because it has been shown that environmental variation modulates the strength of interspecific BEFs and creates huge variance).

      L96-116: The use of "for instance" three times in this paragraph makes the structure seem scattered, as only examples are provided. Improving the transition words can help the text focus better on the main point.

      We have modified some parts of this section to better reflect predictions.

      L115-116: Again, it would be beneficial to specify what kind of insightful information can be provided.

      We have modified this sentence by making more explicit some of the information that may be gained.

      L117-134: Stating clear expectations can help the introduction focus on the mechanisms and assist readers in following the results.

      We now provide some predictions. We were reluctant to make predictions in the first version of the MS, as we have the feeling that predictions can go in very different directions depending on how we set the scene. We therefore stick to the predictions that we think are the most logical (the simplest ones). This illustrates the lack of theoretical papers on these issues.

      Methods:

      L287-293: The method for estimating the standard effect size is unclear. I assume it was derived from the SEM models? This needs further clarification.

      Yes, it is derived from the standardized estimate from each pSEM. This is now explained in the MS.

      Results:

      As mentioned in the public review, it is very important to show the results of analyzing raw data.

      Done, see Figure 3 and Results section.

      Table 1: The font and format of the PCA table are different from other tables and appear vague, resembling a picture rather than a table.

      Changed.

      Table 2 (and supplementary table): "D.f." is not explained in the table legend. Is 1 the numerator df and 30 the denominator df? Is the denominator the residual? Additionally, the table legend mentions "magnitude and direction." ANOVA only tests if the biodiversity effects are significantly different between species or genetic diversity, but not the magnitude. For example, -0.5 and 0.5 are very different, but their effect magnitudes are the same.

      This is a mistake; sorry, the format of the Table was from a previous version of the MS in which we used linear models rather than linear mixed models (both lead to the same results). The ANOVA used to test the significance of fixed terms in the linear mixed model is based on Wald chi-square tests, so both tables should have read “Chi-value” rather than “F-value”, and the only degree of freedom in this test is the one at the numerator. This has been changed. We have changed the caption of the Table (“ANOVA table for the linear mixed model testing whether the relationships between biodiversity and ecosystem functions measured in a riverine trophic chain differ between the biodiversity facets (species or genetic diversity) and the types of BEF (within- or between-trophic levels)”).
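
      To make the distinction between an F-test and a Wald chi-square test concrete, the following is a minimal sketch of a single-parameter Wald test, which has one degree of freedom and no denominator (residual) degrees of freedom. The function name and the numerical inputs are hypothetical and are not taken from the manuscript; the actual tests were run on the authors' linear mixed models.

```python
import math


def wald_chi2_test(estimate: float, std_error: float) -> tuple[float, float]:
    """Return (chi-square statistic, p-value) for a single-parameter Wald test."""
    chi2 = (estimate / std_error) ** 2
    # With 1 degree of freedom, the chi-square survival function reduces to
    # erfc(sqrt(chi2 / 2)), so no external statistics library is required.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical coefficient and standard error, for illustration only.
chi2, p = wald_chi2_test(estimate=0.30, std_error=0.12)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

      Because the statistic squares the estimate, the test reports whether an effect differs from zero but carries no information about its sign, which is consistent with the reviewer's remark that direction has to be read from the estimates themselves.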

      Minor:

      There should always be a space between a number and a unit. In the manuscript, spaces are inconsistently used between numbers and units.

      Corrected

      Reviewer #2 (Recommendations for the authors):

      (1) In the introduction, the authors could focus more and build out what they predicted/hypothesized as well as what has been found in the manipulated experiments that examined the role of species and genetic diversity. That would enhance the background information for a more general audience, and highlight expected results and why.

      We modified the Introduction according to comments made by reviewer 1 and clarified the predictions as best as we can.

      (2) Similarly, the discussion is fairly big picture, but this dataset focused exclusively on this 3-trophic interaction in a riverine system. It could be beneficial to dig into the ecology to find out why the opposite effects of species and genetic diversity are seen within trophic levels in this system.

      We have added some explanations based on the specific pSEM (see our responses to the public reviews for details). But as said in the responses to the public reviews, even with more detailed models, it is hard to tease apart mechanisms. One important point is that genetic and species diversity do not correlate with each other (they do not co-vary over space), which means the effect of one facet is independent from the other. However, apart from that, we can’t really tell more without more mechanistic approaches. We understand this is frustrating, but this is the nature of field-based data. This does not mean they are useless. On the contrary, they confirm and expand patterns found under controlled conditions (which for ecologists is quite important as nature is our playground), but they are limited in inferences of mechanisms.

      (3) It would also be informative if the authors specified what positive and negative Z scores mean. It seems counterintuitive in Figure 3. For example, in the upper left, it's denoted as a larger intraspecific effect - which I'd assume is higher genetic (within species) diversity - but is this not where species diversity effects are higher? In theory this figure could be similar to Figure 1 from Des Roches et al. 2018 - where showing the 1:1 line of where species and genetic diversity effects are similar and then how some are more impacted by SD or GD as that links to the overall question, right?

      For example: Figure 3 makes it seem that GD effects are stronger (more positive) for within trophic responses (which is reflected in the text), but in that quadrant, it states that the interspecific effect is larger?

      Yes, you are right, Figure 3 (now Figure 4) is not ideal. We added an explicit explanation for interpreting Zr in the main text. In addition, we modified the text in the quadrant as this was not correct. Note that it cannot be directly compared to that of Des Roches et al. In Des Roches et al., there is a single effect size (ES) per situation (which is roughly expressed as “ES = effect of species - effect of genotypes”). Here, there are two ES per situation, one for the species effect, the other for the genetic effect, which makes the biplot more complex (as species and genetic effects can be similar in magnitude, but opposite in direction, e.g., 0.5 and -0.5). We could have done as Des Roches et al. (“ES = effect of species - effect of genotypes”), but as we don’t have absolute ES (as in Des Roches et al.), the resulting signs of the ES are nonsensical. It was not easy for us to find a clever solution (or said differently, we were not clever enough to find an easy solution). Nonetheless, we tried another visualization by including “sub-quadrants” within the four main quadrants. We hope this will be clearer.
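
      The sign problem described above can be illustrated with a short sketch. It assumes that Zr denotes Fisher's z-transformation of a correlation-type (standardized) coefficient, which is a common convention but is not spelled out in this response; the function names and the input values of -0.5 and 0.5 are illustrative only.

```python
import math


def fisher_zr(r: float) -> float:
    """Fisher's z-transformation of a coefficient r in the open interval (-1, 1)."""
    return math.atanh(r)


def compare_effects(species_r: float, genetic_r: float) -> dict:
    """Contrast a signed difference with a difference of magnitudes."""
    zr_species = fisher_zr(species_r)
    zr_genetic = fisher_zr(genetic_r)
    return {
        "zr_species": zr_species,
        "zr_genetic": zr_genetic,
        # Mixes direction and strength: large whenever the signs are opposite.
        "signed_difference": zr_species - zr_genetic,
        # Compares strength only: zero when the two effects are equally strong.
        "magnitude_difference": abs(zr_species) - abs(zr_genetic),
    }


# Two effects of equal strength but opposite direction (illustrative values).
result = compare_effects(species_r=-0.5, genetic_r=0.5)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

      With signed effect sizes, a single difference cannot separate "which facet is stronger" from "which facet is positive", which is why a two-axis plot with one effect size per facet retains information that a single difference would lose.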

      (4) It's unclear why authors included both a simplified linear mixed model with diversity type and biodiversity facet as fixed factors, and then a second linear model that included trophic level (with those other 2 factors and interactions), but only showed results of trophic level from that more complex model. It is unclear why they include two models when the more complex one would have evaluated all aspects of their research question and shown the same patterns.

      You are right, the more complex model evaluates both aspects. Nonetheless, as the hypotheses were strictly separated, we thought it simpler to associate one model with one hypothesis. We agree that this duplicates information, but we would like to keep the two models to make the text more gradual.

    1. eLife Assessment

      This valuable work provides novel insights into the substrate binding mechanism of a tripartite ATP-independent periplasmic (TRAP) transporter, which may be helpful for the development of specific inhibitors. The structural analysis is convincing, but additional work will be required to establish the transport mechanism as well as binding sites for all ligands. This study will be of interest to the membrane transport and bacterial biochemistry communities.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript reports the substrate-bound structure of SiaQM from F. nucleatum, which is the membrane component of a Neu5Ac-specific Tripartite ATP-independent Periplasmic (TRAP) transporter. Until recently, there was no experimentally derived structural information regarding the membrane components of TRAP transporters, limiting our understanding of the transport mechanism. Since 2022, there have been 3 different studies reporting the structures of the membrane components of Neu5Ac-specific TRAP transporters. While it was possible to narrow down the binding site location by comparing the structures to proteins of the same fold, a structure with substrate bound has been missing. In this work, the authors report the Na+-bound state and the Na+ plus Neu5Ac state of FnSiaQM, revealing information regarding substrate coordination. In previous studies, 2 Na+ ion sites were identified. Here, the authors also tentatively assign a 3rd Na+ site. The authors reconstitute the transporter to assess the effects of mutating the binding site residues they identified in their structures. Of the 2 positions tested, only one of them appears to be critical to substrate binding.

      Strengths:

      The main strength of this work is the capture of the substrate bound state of SiaQM, which provides insight into an important part of the transport cycle.

      Weaknesses:

      The main weakness is the lack of experimental validation of the structural findings. The authors identified the Neu5Ac binding site, but only test 2 residues for their involvement in substrate interactions, which is quite limited. However, comparison with previous mutagenesis studies on homologues supports the location of the Neu5Ac binding site. The authors tentatively identified a 3rd Na+ binding site, which if true would be an impactful finding, but this site was not sufficiently experimentally tested for its contribution to Na+ dependent transport. This lack of experimental validation prevents the authors from unequivocally assigning this site as a Na+ binding site. However, the reporting of these new data is important as it will facilitate follow up studies by the authors or other researchers.

      Comments on revisions:

      Overall, the authors have done a good job of addressing the reviewers' comments. It's good to know that the authors are working on the characterisation of the potential metal binding site mutants - characterising just a few of these will provide much needed experimental support for this potential Na+ site. The new MD simulations provide some additional support for the new Na+ site and could be included. However, as the authors know, direct experimental characterisation of mutants is the ideal evidence of the Na+ site.

      Aside from the characterisation of mutants, which seems to be held up by technical issues, the only remaining issue is the comparison of the Na+- and Na+/Neu5Ac-bound states with ASCT2. It still does not make sense to me why the authors are not directly comparing their Na+ only and Na+/Neu5Ac states with the structures of VcINDY in the Na+-only and Na+/succinate bound states. These VcINDY structures also revealed no conformational changes in the HP loops upon binding succinate, as the authors see for SiaQM. Therefore, this comparison is very supportive. It is understood that the similarity to the DASS structure is mentioned on p.17, but it is also interesting and useful to note that TRAP and DASS transporters also share a lack of substrate-induced local conformational changes, to the extent these things have been measured.

    3. Reviewer #3 (Public review):

      The manuscript by Goyal et al reports substrate-bound and substrate-free structures of a tripartite ATP-independent periplasmic (TRAP) transporter from a previously uncharacterized homolog, F. nucleatum. This is one of the most mechanistically fascinating transporter families, by means of its QM domain (the domain reported in this manuscript) operating as a monomeric 'elevator', and its P domain functioning as a substrate-binding 'operator' that is required to deliver the substrate to the QM domain; together, this is termed an 'elevator with an operator' mechanism. Remarkably, previous structures had not demonstrated the substrate Neu5Ac bound. In addition, they confirm the previously reported Na+ binding sites, and report a new metal binding site in the transporter, which seems to be mechanistically relevant. Finally, they mutate the substrate binding site and use proteoliposomal uptake assays to show the mechanistic relevance of the proposed substrate binding residues.

      Strengths:

      The structures are of good quality, the presentation of the structural data has improved, the functional data is robust, the text is well-written, and the authors are appropriately careful with their interpretations. Determination of a substrate bound structure is an important achievement and fills an important gap in the 'elevator with an operator' mechanism.

      Weaknesses:

      Although the possibility of the third metal site is compelling, I do not feel it is appropriate to model in a publicly deposited PDB structure without directly confirming experimentally. The authors do not extensively test the binding sites due to technical limitations of producing relevant mutants; however, their model is consistent with genetic assays of previously characterized orthologs, which will be of benefit to the field. Finally, some clarifications of EM processing would be useful to readers, and it would be nice to have a figure visualizing the unmodeled lipid densities - this would be important to contextualize to their proposed mechanism.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This manuscript reports the substrate-bound structure of SiaQM from F. nucleatum, which is the membrane component of a Neu5Ac-specific Tripartite ATP-independent Periplasmic (TRAP) transporter. Until recently, there was no experimentally derived structural information regarding the membrane components of the TRAP transporter, limiting our understanding of the transport mechanism. Since 2022, there have been 3 different studies reporting the structures of the membrane components of Neu5Ac-specific TRAP transporters. While it was possible to narrow down the binding site location by comparing the structures to proteins of the same fold, a structure with substrate bound has been missing. In this work, the authors report the Na+-bound state and the Na+ plus Neu5Ac state of FnSiaQM, revealing information regarding substrate coordination. In previous studies, 2 Na+ ion sites were identified. Here, the authors also tentatively assign a 3rd Na+ site. The authors reconstitute the transporter to assess the effects of mutating the binding site residues they identified in their structures. Of the 2 positions tested, only one of them appears to be critical to substrate binding.

      Strengths:

      The main strength of this work is the capture of the substrate-bound state of SiaQM, which provides insight into an important part of the transport cycle.

      Weaknesses:

      The main weakness is the lack of experimental validation of the structural findings. The authors identified the Neu5Ac binding site, but only tested 2 residues for their involvement in substrate interactions, which was very limited. The authors tentatively identified a 3rd Na+ binding site, which if true would be an impactful finding, but this site was not tested for its contribution to Na+ dependent transport, and the authors themselves report that the structural evidence is not wholly convincing. This lack of experimental validation undermines the confidence of the findings. However, the reporting of these new data is important as it will facilitate follow-up studies by the authors or other researchers.

      The main concern, also mentioned by other reviewers, is the lack of mutational data and functional studies on the identified binding sites. Two other structures of TRAP transporters have been determined, one from Haemophilus influenzae (Hi) and the other from Photobacterium profundum (Pp). We will refer to the references in this paper as [1], Peter et al. as [2], and Davies et al. as [3]. The table below lists all the mutations made in the Neu5Ac binding site, including direct polar interactions between Neu5Ac and the side chains, as well as the newly identified metal sites.

      The structure of Fusobacterium nucleatum (Fn) that we have reported shows a significant sequence identity with the previously reported Hi structure. When we superimpose the Pp and Fn structures, we observe that nearly all the residues that bind to the Neu5Ac and the third metal site are conserved. This suggests that mutagenesis and functional studies from other research can be related to the structure presented in our work.

      The table below shows that all three residues that directly interact with Neu5Ac have been tested by site-directed mutagenesis for their role in Neu5Ac transport. Both D521 and S300 are critical for transport, while S345 is not. We do not believe that a mutation of D521A in Fn, followed by transport studies, will provide any new information.

      However, Peter et al. have mutated only one of the 5 residues near the newly identified metal binding site, which resulted in no transport. The rest of the residues have not been functionally tested. We propose to mutate these residues into Ala, express and purify the proteins, and then carry out transport assays on those that show expression. We will include this information in the revised manuscript.

      Author response table 1.

      Reviewer #2 (Public Review):

      In this exciting new paper from the Ramaswamy group at Purdue, the authors provide a new structure of the membrane domains of a tripartite ATP-independent periplasmic (TRAP) transporter for the important sugar acid, N-acetylneuraminic acid or sialic acid (Neu5Ac). While there have been a number of other structures in the last couple of years (the first for any TRAP-T) this is the first to trap the structure with Neu5Ac bound to the membrane domains. This is an important breakthrough as in this system the ligand is delivered by a substrate-binding protein (SBP), in this case, called SiaP, where Neu5Ac binding is well studied but the 'hand over' to the membrane component is not clear. The structure of the membrane domains, SiaQM, revealed strong similarities to other SBP-independent Na+-dependent carriers that use an elevator mechanism and have defined Na+ and ligand binding sites. Here they solve the cryo-EM structure of the protein from the bacterial oral pathogen Fusobacterium nucleatum and identify a potential third (and theoretically predicted) Na+ binding site but also locate for the first time the Neu5Ac binding site. While this sits in a region of the protein that one might expect it to sit, based on comparison to other transporters like VcINDY, it provides the first molecular details of the binding site architecture and identifies a key role for Ser300 in the transport process, which their structure suggests coordinates the carboxylate group of Neu5Ac. The work also uses biochemical methods to confirm the transporter from F. nucleatum is active and similar to those used by selected other human and animal pathogens and now provides a framework for the design of inhibitors of these systems.

      The strengths of the paper lie in the locating of Neu5Ac bound to SiaQM, providing important new information on how TRAP transporters function. The complementary biochemical analysis also confirms that this is not an atypical system and that the results are likely true for all sialic acid-specific TRAP systems.

      The main weakness is the lack of follow-up on the identified binding site in terms of structure-function analysis. While Ser300 is shown to be important, only one other residue is mutated and a much more extensive analysis of the newly identified binding site would have been useful.

      Please see the comments above.

      Reviewer #3 (Public Review):

      The manuscript by Goyal et al reports substrate-bound and substrate-free structures of a tripartite ATP-independent periplasmic (TRAP) transporter from a previously uncharacterized homolog, F. nucleatum. This is one of the most mechanistically fascinating transporter families, by means of its QM domain (the domain reported in this manuscript) operating as a monomeric 'elevator', and its P domain functioning as a substrate-binding 'operator' that is required to deliver the substrate to the QM domain; together, this is termed an 'elevator with an operator' mechanism. Remarkably, previous structures had not demonstrated the substrate Neu5Ac bound. In addition, they confirm the previously reported Na+ binding sites and report a new metal binding site in the transporter, which seems to be mechanistically relevant. Finally, they mutate the substrate binding site and use proteoliposomal uptake assays to show the mechanistic relevance of the proposed substrate binding residues.

      The structures are of good quality, the functional data is robust, the text is well-written, and the authors are appropriately careful with their interpretations. Determination of a substrate-bound structure is an important achievement and fills an important gap in the 'elevator with an operator' mechanism. Nevertheless, I have concerns with the data presentation, which in its current state does not intuitively demonstrate the discussed findings. Furthermore, the structural analysis appears limited, and even slight improvements in data processing and resulting resolution would greatly improve the authors' claims. I have several suggestions to hopefully improve the clarity and quality of the manuscript.

      We appreciate your feedback and will make the necessary modifications to the manuscript incorporating most of the suggestions. We will submit the revised version once the experiments are completed. We are also working on improving the quality of the figures and have made several attempts to enhance the resolution using CryoSPARC or RELION, but without success. We will continue to explore newer methods in an effort to achieve higher resolution and to model more lipids, particularly in the binding pocket.

      Reviewing Editor (Recommendations for the Authors):

      After discussing the reviews, the reviewers and reviewing editor have agreed on a list of the most important suggested revisions for the authors, which, if satisfactorily addressed, would improve the assessment of the work. These suggested revisions are listed below. We also include the full Recommendations For The Authors from each of the individual reviewers.

      (1) The authors tentatively identified a 3rd Na+ binding site, which if true would be an impactful finding, but this site was not tested for its contribution to Na+ dependent transport, and the authors themselves report that the structural evidence is not wholly convincing. Additional mutagenesis and activity experiments to test the contribution of this site to transport would strengthen the manuscript. Measuring Na+ concentration-response relations and calculating Hill slopes in WT vs. an M site mutant would be a good experiment. Given the lack of functional data and poor density, it does not seem appropriate to build the M site sodium in the PDB model.

      The density is sufficiently well defined to suggest a bound metal (waters would not be clearly defined at this resolution). While our modeling of the site as a Na+ is arbitrary, this was done to satisfy the refinement programs, where we have a known scatterer modeled. We could model this density with other metals, but unlike crystallographic refinement, real-space refinement of cryoEM maps does not produce a difference map, which might have allowed us to identify the metal, though not conclusively. The density of the maps is good (we have added better figures to demonstrate this). We tried making multiple mutations to test for activity; unfortunately, we are still struggling to express proteins with mutations in this site in sufficient quantities to carry out transport assays.

      In the absence of being able to do the experiments, we did MD simulations (carried out by Senwei Quan and Jane Allison at University of Auckland).  Our results are shown below – we are not certain without further studies that these should be included in the current paper (we will add them as authors if the editor feels that this evidence is critical).

      Author response table 2.

      We are showing this for review to suggest that K+, Ca2+, and Na+ were tried, and only Na+ stays stably in the binding pocket. The rest of the results will also have to be explained, which would change the focus of the paper.

      We also provided the sequence to AlphaFold3 and asked it to identify the possible metal binding sites; when the input was Na+, it found all three binding sites.

      Summary: Both our experimental data and computational studies suggest the observed metal binding site is real, but at the moment it is not possible to refine the structure with an unidentified metal placed in the site. Computational studies suggest that this is a high-probability Na+ site.

      Demonstration of cooperativity between the Na+ site and transport requires carrying out these experiments with mutations in these sites in a concentration-dependent manner. Unfortunately, our attempts to produce well-expressed and purified proteins with these mutations within a short time frame failed.
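
      For readers unfamiliar with the suggested experiment, the following is a minimal sketch of how a Hill slope could be estimated from a Na+ concentration-response series. It assumes the standard Hill equation, v = Vmax·[Na+]^n / (K^n + [Na+]^n), and a known Vmax; the function names, concentrations, and parameter values are hypothetical, and the rates are synthetic values generated from the equation itself, not measurements from this study.

```python
import math


def hill_rate(conc: float, vmax: float, k_half: float, n_hill: float) -> float:
    """Transport rate predicted by the Hill equation at a given concentration."""
    return vmax * conc ** n_hill / (k_half ** n_hill + conc ** n_hill)


def hill_slope(concs: list[float], rates: list[float], vmax: float) -> float:
    """Estimate the Hill coefficient from the linearised Hill plot.

    Uses ordinary least squares on log(v / (vmax - v)) = n*log(c) - n*log(K).
    Every rate must lie strictly between 0 and vmax.
    """
    xs = [math.log(c) for c in concs]
    ys = [math.log(v / (vmax - v)) for v in rates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, noise-free series (hypothetical concentrations in mM).
concs = [1.0, 3.0, 10.0, 30.0, 100.0]
rates = [hill_rate(c, vmax=1.0, k_half=10.0, n_hill=2.0) for c in concs]
print(f"estimated Hill slope: {hill_slope(concs, rates, vmax=1.0):.2f}")
```

      In practice, the slope would be fitted separately for the wild type and for a mutant of the proposed metal site, and a lower slope in the mutant would be consistent with the loss of one cooperative Na+ site; with real, noisy data a nonlinear fit that also estimates Vmax is usually preferable to this linearisation.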

      (2) The authors identified the Neu5Ac binding site but only tested 2 residues for their involvement in substrate interactions, which was very limited. Given that the major highlight of this paper is the identification of the Neu5Ac binding site, it would strengthen the manuscript if the authors provided a more extensive series of mutagenesis experiments - testing at least the effect of D521A would be important. One inconsistency is Ser345 mutagenesis not affecting transport, and the authors should further discuss in the text why they think that is.

      D521A has been tested in H. influenzae, and this mutation results in loss of transport.  This residue is highly conserved and occupies the same position. We expect the result to remain the same. 

      We have added a few extra lines to discuss Serine 345: “Ser 345 OG is 3.5Å away from the C1-carboxylate oxygen – a distance that would result in a weak interaction between the two groups. It is, therefore, not surprising that the mutation into Ala did not affect transport. The space created by the mutation can be occupied by a water molecule.”

      (3) The purification and assessment of the stability of the protein are described in text alone with no accompanying data. It would be beneficial to include these data (e.g. in the Supplementary info) as it allows the reader to evaluate the protein quality.

      This is now added as Supplementary Figure 2.

      (4) The structural figures throughout the paper could benefit from more clarity to better support the conclusions. Specific critiques are listed below:

      - Figure 1: since the unbound map has a similar reported resolution, displaying the unbound structure's substrate binding site with the same contour would clearly demonstrate that the appearance of this density is substrate-dependent.

      - Figure 1: the atomic fit of the ligand to the density, and the suggested coordination by side chain and backbone residues, would be useful in this figure.

      - Figure 1: I think it would be more intuitive to compare apo and bound structures with the same local resolution scale.

      We have remade Figure 1 “Architecture of FnSiaQM with nanobody. (A and B) Cryo-EM maps of FnSiaQM unliganded and sialic acid bound at 3.2 and 3.17 Å, respectively. The TM domain of FnSiaQM is colored using the rainbow model (N-terminus in blue and C-terminus in red). The nanobody density is colored purely in red. The density for modeled lipids is colored in tan and the unmodelled density in gray. The figures were made with Chimera at thresholds of 1.2 and 1.3 for the unliganded and sialic acid-bound maps. (C and D) The cytoplasmic view of apo and sialic acid bound FnSiaQM, respectively. Color coding is the same as in panels A and B. The density corresponding to sialic acid and sodium ions is in purple. The substrate binding sites of apo and sialic acid bound FnSiaQM are shown with key residues labeled. The density (blue mesh) around these atoms was made in PyMOL at 2 and 1.5 σ for the apo and the sialic acid-bound maps, respectively, with a carve radius of 2 Å.”

      The local resolution maps have been moved to Supplementary Figure 3.

      - Figure 3, Figure 5a: The mesh structures throughout the manuscript are blocky and very difficult to look at and interpret, especially for the ion binding sites, which are currently suggestive of but not definitively ion densities. Either using transparent surfaces, higher triangle counts, or smoothing the surface might help this.

      We have made Figure 3 again with higher triangle counts.  We tried all three suggestions and this provided the best figure. We have replaced Figure 5A with density for Neu5Ac and residues around it.

      - Figure 5A: It would be important to show the densities of the entire binding pocket, especially coordinating side chains, to show the reader what is and isn't demonstrated by this structure.

      - It's not clear how Figure 5D is supposed to show that the cavity can accommodate Neu5Gc, as suggested by the text - please make the discussed cavity clearer in the Figure.

      We have now marked with an arrow the Methyl Carbon where the hydroxyl group is added.  We have mentioned that in the legend.  It is open to the periplasmic side of the cavity.

      - Supplementary Figure 4: Please label coordinating residue sites.

      Labels have been added to Supplementary Figure 6 which was earlier Supplementary Figure 4.

      (5) Intro section: the authors should introduce the work on HiSiaP around the role of the R147 residue in high-affinity Neu5Ac binding, which coordinates the carboxylate of Neu5Ac, and which is a generally conserved mechanism for organic acid binding in other TRAP transporters. This context will help magnify their discovery later that in the membrane domains, it is a key serine and not an arginine that coordinates the carboxylate group (probably as the local concentration of Neu5Ac is high and tight binding site is not desirable for rapid transport, which is mentioned in the discussion).

      Thank you for pointing this out. We have added a new sentence to the introduction.

      “All the SiaP structures show the presence of a conserved Arginine that binds to the C1-carboxylate of Neu5Ac, and this Arg residue is critical as the high electrostatic affinity may be important to have a strong binding affinity that sequesters the small amounts that reach the bacterial periplasmic space  (Glaenzer et al., 2017).”

      (6) TRAP transporters exist for many organic compounds and not just sialic acid, which might be nice to make the reader aware of.

      We initially did not do this as this is an advance paper and this was discussed in the earlier paper (Currie et al., 2024). However, we have now added a sentence to the introduction: “Additionally, amino acids, C4-dicarboxylates, aromatic substrates and alpha-keto acids are also transported by TRAP transporters (Vetting et al., 2015).”

      (7) On p. 12, the authors describe the Neu5Ac binding site as a large solvent-exposed vestibule, having previously described the substrate-bound state as occluded. These descriptions should be adjusted to make clear which structure is being referenced. The clarity of this would be substantially improved if the authors included a figure that showed this occlusion - currently none of the structure figures clearly demonstrate what the authors are referring to. There are several conspicuous unmodeled densities proximal to the substrate, reminiscent of lipids (in between transport and scaffold domain) and possibly waters/ions. Given this, it is really surprising that the substrate binding site is described as "solvent-exposed" since the larger molecules seem to occlude the pocket. The authors should further process their dataset and discuss the implications of these surrounding densities.

      We have processed the data sets carefully with both cryoSPARC and RELION, and the resolution described here is the same with both programs, with the cryoSPARC maps slightly better in terms of interpretability of the peripheral helices; these are the maps described in the manuscript. The current sample (FnTRAP) with the nanobody is a relatively stable sample (in our experience with other similar proteins), as is evident from the number of images and particles needed to achieve a decent resolution, and thus the workflow is straightforward and simple. There are a number of non-protein densities, which in principle can be modelled, but we have chosen a conservative approach and not modelled these extra densities (except for the two lipids and a few ions) because of the limited resolution. It is possible that increasing the number of particles will result in an increase in resolution, but given the estimated B-factors (125 and 135 Å² for unliganded and liganded, respectively), this will certainly require many more images with no guarantee of increased resolution.

      The question of outward open Vs outward occluded is a valid point. We have now modified this in the manuscript. “The Neu5Ac binding site has a large solvent-exposed vestibule towards the cytoplasmic side, while its periplasmic side is sealed off. Cryo-EM map shows the presence of multiple densities that could be modeled as lipids, possibly preventing the substrate from leaving the transporter. However, the densities are not well defined to model them as specific lipids, hence they have not been modeled.  We describe this as the “inward-facing open state” with the substrate-bound.”

      (8) On p.15, the activity of FnSiaPQM in liposomes is reported, although the impetus for this study is not clear. Presumably, the reason for its inclusion is to ensure that the structurally characterized protein is active. It would be useful to say this at the start of the section if this is the case. This study nicely shows that the energetics and requirements of transport are identical to all the previous studies on Neu5Ac TRAP transporters - it would be good to acknowledge this somewhere in this section as well.

      These changes have been incorporated. We have added a line to say why we did this and added, as the last line, that this is similar to other SiaPQMs characterized.

      (9) Figure 5C. The authors show the transport activity with and without valinomycin. The authors do not explain the rationale for testing and reporting both conditions for these mutants; an explanation is required, or the data should be simplified. The expected membrane potential induced by valinomycin should be mentioned in the legend.

      We have simplified Figure 5C and added the expected membrane potential value.

      (10) The authors state that the S300A mutant is inactive. However, unless the authors also measured the background binding/transport of radiolabelled substrate in the absence of protein, then the accuracy of this statement is not clear because Figure 5C does indicate some activity for S300A, albeit much lower than WT. This is an important point in light of the authors' suggestion that the membrane protein does not need a binding site of high affinity or stringent selectivity.

      We thank the reviewer for pointing this out. We have now added a line in the experimental protocols: “The experimental values were corrected by subtracting the control, i.e. the radioactivity taken up in liposomes reconstituted in the absence of protein. The radioactivity associated with the control samples, i.e. empty liposomes was less than 10% with respect to proteoliposomes.”.
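
The correction quoted above is simple arithmetic, and a short sketch may make it concrete. The function names and the counts below are hypothetical, chosen only for illustration; they are not values from the study.

```python
def background_corrected(proteoliposome_counts, empty_liposome_counts):
    """Subtract the no-protein control from the proteoliposome signal."""
    return proteoliposome_counts - empty_liposome_counts


def control_fraction(proteoliposome_counts, empty_liposome_counts):
    """Control signal expressed as a fraction of the proteoliposome signal."""
    return empty_liposome_counts / proteoliposome_counts


# Hypothetical radioactivity counts (arbitrary units), not data from the paper.
uptake = background_corrected(1000, 80)
fraction = control_fraction(1000, 80)
print(uptake, fraction)
```

With these made-up numbers the control accounts for 8% of the proteoliposome signal, which would satisfy the "less than 10%" criterion described in the response.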

      (11) There are several issues and important omissions in the work cited:

      - It is not normal practice to cite a reference in the abstract and the citation is only to the second structure of HiSiaQM, which does not fairly reflect previous work in the field by only referring to their own work. Also throughout the article, it is normal practice with in-text citations to order them chronologically, i.e. earliest first. Please update this.

      This article was submitted as a “Research advance article”. The instructions specifically say that a Research advance article should cite the article in eLife that the paper advances. Hence the citation of the “second structure of HiSiaQM”. In fact, in the manuscript we explicitly say “The first structure of _Hi_SiaQM (4.7 Å resolution) demonstrated that it is composed of 15 transmembrane helices and two helical hairpins.” We are following the policy laid out.

      Zotero organizes multiple references in alphabetical order; we did not choose to do it that way, and the suggestion of bias is not true. The final version of the accepted paper will have numbers, and this issue will automatically be corrected.

      - Intro: please cite the primary papers discovering other families of sialic acid transporters.

      - Intro: When introducing information on the binding site, dissociation constant of Neu5Ac, and thermodynamics of ligand binding to SiaP, the authors should also include references to the work done by others in addition to their own work.

      The Setty et al. paper was the first to demonstrate that the two-component systems are distinct, and that the binding protein of the TRAP system binds enthalpically while the binding protein of the ABC system binds entropically (SiaP vs SatA). As the reviewer points out, this is significant because it highlights how the Arg binding to the carboxylate, which is the enthalpic driver in this case and contributes to the difference between sugar binding to SiaP and SatA. Many studies have published binding affinities of molecules to SiaP, but this paper offers valuable insight into the differences between these systems. We have cited a number of the SiaP papers from other groups, including acknowledging the first structure of SiaP from H. influenzae by Muller et al., in 2006.

      - p.5 "TRAP transporters are postulated to employ an elevator-type mechanism...". This postulation has been experimentally tested and published, so should be discussed and referenced (Peter et al. 2024. https://doi.org/10.1038/s41467-023-44327-3).

      We have now corrected this error. We removed “are postulated to” and added the reference.

      - p.5 "Notably, the transport of Neu5Ac by TRAP transporters requires at least two sodium ions (Davies et al., 2023)." The requirement for at least 2 Na+ ions for Neu5Ac transport was first demonstrated in Mulligan et al. PNAS 2009, so should also be cited (for completion, so should Mulligan et al. JBC. 2012 and Currie et al. elife 2023, which have also shown this requirement is a commonality amongst all Neu5Ac TRAP transporters).

      Added.

      - P.12, Mulligan et al, JBC, 2012 should be added to the citations in the first sentence.

      Added.

      - p.19 "Interestingly, even the dicarboxylate transporter from V. cholerae (VcINDY) binds to its ligand via electrostatic interactions with both carboxylate groups". Other references are more appropriate than the one used to support this statement.

      Also added references for Mancusso et al., 2012, Nie et al., 2017 and Sauer et al., 2022 here.

      - p.19. "The structure of the protein in the outward-facing conformation is unknown". The authors do not discuss the mechanistic findings from Peter et al 2024 Nat Comm here. The work described in that paper revealed an experimentally verified model of the OFS of HiSiaQM, so really needs to be included.

      This is not an experimentally determined 3D structure. They have shown the possible existence of this by microscopy, but the structure is not determined. The work mentioned is a wonderful piece of work, but it does not report the three-dimensional structure of the protein in the outward-facing conformation to allow us to understand the nature of the molecular interactions. 

      - The reference to Kinz-Thompson et al 2022 on p. 6 is not appropriate - neither the HiSiaQM papers nor the PpSiaQM paper makes reference to this work when identifying the binding site. More suitable references are used, for example, Mancusso et al 2012, Nie et al 2017 and Sauer et al 2022; this should be reported accurately.

      Added the suggested references. We think the paper (Kinz-Thompson et al., 2022) is relevant and have also kept that reference.

      - Garaeva et al report the opposite of what the authors mention - "In the human neutral amino acid transporter (ASCT2), which also uses the elevator mechanism, the HP1 and HP2 loops have been proposed to undergo conformational changes to enable substrate binding and release (Garaeva et al., 2019)." In fact, this paper suggested a one-gate model of transport (HP2), where HP1 seems uninvolved in gating.

      The Reviewer is correct.  We were wrong and not clear.  The entire paragraph has been rewritten.

      “While both the HP1 and HP2 loops have been hypothesized to be involved in gating, in the human neutral amino acid transporter (ASCT2), which also uses the elevator mechanism, only the HP2 loops have been shown to undergo conformational changes to enable substrate binding and release (Garaeva et al., 2019). Hence, it is suggested that there is a single gate that controls substrate binding. Superposition of the _Pp_SiaQM and _Hi_SiaQM structures does not reveal any change in these loop structures upon substrate binding. For TRAP transporters, the substrate is delivered to the QM protein by the P protein; hence, these loop changes may not play a role in ligand binding or release. This may support the idea that there is minimal substrate specificity within SiaQM and that it will transport the cargo delivered by SiaP, which is more selective.”

      - p.19 "suggesting that SSS transporters have probably evolved to transport nine-carbon sugars such as Neu5Ac (Wahlgren et al, 2018)." Surely this goes without saying since Wahlgren et al 2018 demonstrated that SiaT, an SSS, could transport sialic acid? It's unclear why this was included here - perhaps it needs to be rewritten to make the point more clearly, but as it stands, this statement appears self-evident. Furthermore, these proteins can transport all kinds of molecules (see TCDB 2.A.21). This statement needs to be clarified. 

      This was a comparison to other Neu5Ac binding sites in other Neu5Ac transporters. We have modified the sentence: “The polar groups bind to both the C1-carboxylate side of the molecule and the C8-C9 carbonyls, suggesting that Proteus mirabilis Neu5Ac transporter (SSS type) evolved specifically to transport nine-carbon sugars such as Neu5Ac (Wahlgren et al., 2018)”. These were arguments we were making to suggest that the lack of tight binding could also mean reduced specificity.

      - The authors reconstitute the FnSiaQM and measure transport with SiaP, which resembles closely what is known for both HiSiaPQM, VcSiaPQM, which is not cited (https://doi.org/10.1074/jbc.M111.281030).

      - Regarding lipids between transport and scaffold domains: there is precedent for such lipids in the elevator transporter GltPh, Wang, and Boudker (eLife 2020) proposed similar displacements during transport and would be appropriate to cite here.

      We have now cited the reference to the Mulligan et al., 2012 paper.  We also added a sentence on the findings of GltPh paper by Wang and Boudker.  Thank you for pointing this out.

      (12) p.9 "TRAP transporters, as their name suggests, comprise three units: a substrate-binding protein (SiaP) and two membrane-embedded transporter units (SiaQ and SiaM) (Severi et al., 2007)." This is somewhat odd phrasing because the existence of fused membrane components has been well-documented for a long time. The addition of "Many" at the start of the sentence fixes this.

      Added Many.

      (13) On p.12 the authors compare the ligand-induced conformational changes of FnSiaQM with ASCT2, citing Garaeva et al, 2019. This comparison does not make sense considering TRAP transporters and ASCT2 do not share a common fold. A far superior comparison is with DASS transporters, which actually do have the same fold as TRAP transporters. And, importantly, the Na+ and substrate-induced conformational changes have been investigated for DASS transporters revealing a unique mechanism likely shared by TRAP transporters (Sauer et al, Nat Comm, 2022). The text on p.12 should be adjusted to replace the ASCT comparison with a VcINDY comparison.

      The purpose of citing the ASCT2 paper was only concerning the HP1 and HP2 gates.  The authors show that HP2 changes conformation only.  Comparing the two FnSiaQM structures – with and without ligand, we see no change in either the HP1 or the HP2 loops.  On Page 17, when we describe the structure, we do specifically mention that the overall architecture is similar to VcINDY and the DASS transporters.

      (14) p.12 "For TRAP transporters, the substrate is delivered to the QM protein by the SiaP protein;" "SiaP protein" should be "P protein".

      Corrected.

      (15) p.18. "periplasmic membrane" should be "cytoplasmic membrane".

      Corrected.

      (16) p.19. "This prevents Neu5Ac from binding..." There is no evidence for this so this needs to be softened, e.g. "This likely prevents Neu5Ac from...".

      Agree – Modified.

      (17) Figure 2B is rather small, cramped, and difficult to see. We suggest that the authors make that panel larger, or include it as a stand-alone supplementary figure.

      We have moved this figure into a supplementary figure as suggested by the reviewer.

      (18) The authors describe the Neu5Ac binding site in SiaQM. It would be helpful if the authors provided a figure in support of the statement that the Neu5Ac binding site architecture is similar to dicarboxylate in VcINDY (especially as Neu5Ac is a monocarboxylate).

      The Neu5Ac binding site is NOT similar to the VcINDY binding site. But, we understand the origin of the comment. We have now changed the sentence: “The overall architecture of the Neu5Ac binding site is similar to that of citrate/malate/fumarate in the di/tricarboxylate transporter of V. cholerae (_Vc_INDY), but the residues involved in providing specificity are different (Kinz-Thompson et al., 2022; Mancusso et al., 2012; Nie et al., 2017; Sauer et al., 2022). Neu5Ac binds to the transport domain without direct interactions with the residues in the scaffold domain. The majority of the interactions are with residues in the HP1 and HP2 loops of the transport domain (Figure 5B). Asp521 (HP2), Ser300 (HP1), and Ser345 (helix 5) interact with the substrate through their side chains, except for one interaction between the main chain amino group of residue 301 and the C1-carboxylate oxygen of Neu5Ac. Mutation of the residue equivalent to Asp521 has been shown to result in loss of transport (Peter et al., 2022). To evaluate the role of residues Ser-300 and Ser-345, we mutated them to alanine and performed the transport assays.”

      (19) When comparing the binding modes of Neu5Ac to different proteins in Figure 6, it would be helpful to include the structure in this paper as well.

      The Neu5Ac binding site is present in figure 5. We would prefer not to show it again in Figure 6.

      Additionally, there is a clear binding mode of Neu5Ac in Figure 1 as well.

      (20) The manuscript would benefit from a more detailed comparison between Na+-bound (described as apo) and Na+/Neu5Ac structures, especially the prospective gates. If this transporter behaves anything like the archetypical ion-coupled glutamate transporters, some structural changes in the gates might be expected to facilitate transport domain movement when the substrate is loaded, but not when only Na+ is bound. It would be important to discuss and visualize these changes.

      We have described in the manuscript that there is NO change in the HP1 and HP2 gates between the unliganded structure and the Neu5Ac bound structure. The major difference we observe is the ordering of the third metal binding site.

      A figure comparing the substrate binding pockets between the different high-resolution structures would also be informative. Do the bonding distances between ligands and side chains significantly change between homologs?

      This is the only Neu5Ac bound structure. Since the specificity to the substrate comes from the variability of the residues that interact with it, we do not believe that this figure would add much value.

      (21) A supplementary figure (or an inset to Figure 2) showing pairwise percent identity between different characterized QM transporters would be useful.

      We have now added a Supplementary Figure 4 showing the comparison of the three QM sequences whose structures have been determined.

      (22) There is relatively minimal EM processing. More rigorous processing would require relatively little effort and could boost resolution, making this a vastly improved manuscript with a much more confident interpretation of structures.

      We described the overall workflow. The processing was rigorous. After obtaining the first maps, we created templates with the structure and did template-based picking. We then did several rounds of 2D classification followed by homogeneous refinement and non-uniform refinement. We then made masks and carried out local refinement. We then took the best maps and did a 3D classification, refined the 3D classes independently, and regrouped them based on how similar they were. We then went back and picked particles again (we used different methods of particle picking, but template-based picking resulted in the final set of particles used) and went through the whole process again. At the end of the refinement, we carried out global and local CTF refinement followed by reference-based motion correction. The final refinement was then done with the Bayesian polished particles; it was a local refinement with a mask over only the transporter and the nanobody. After the reviews came in, we tried multi-body refinement in Relion5. It did not improve resolution. We have expanded the legend to supplementary Figure 2 (without listing all the different things we tried). The best resolution we obtained for the structure was 3.1 Å. However, it is important to note that the local resolution of the map around the ligand is good.

      We realized this is not easy to depict in a local resolution map. So, we wrote a script that takes every atom, takes a radius of 5 Å around it (again, we tried different radii and used the optimal one; we are preparing a manuscript to describe this), averages all the local resolution values within that 5 Å sphere, and assigns the average as the B-factor of that atom. We have moved the local resolution map figure to the supplement and replaced Figure 1 with a cartoon in which the color represents the local resolution around each atom.
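
The script itself is not shown in the response. As a rough illustration of the averaging it describes, a minimal sketch might look like the following. It assumes the local-resolution map has already been reduced to a list of sample positions (in Å, in the same frame as the model) with one resolution value each; the function name, the data layout, and the toy numbers are all hypothetical, and reading the map or writing the values back into a coordinate file is left out.

```python
import math


def per_atom_local_resolution(atoms, samples, radius=5.0):
    """Average the local-resolution samples lying within `radius` Å of each atom.

    atoms:   list of (x, y, z) atom coordinates in Å
    samples: list of ((x, y, z), resolution) pairs taken from a local-resolution map
    Returns one value per atom, or None when no sample falls inside the sphere.
    """
    averaged = []
    for atom in atoms:
        nearby = [res for pos, res in samples if math.dist(atom, pos) <= radius]
        averaged.append(sum(nearby) / len(nearby) if nearby else None)
    return averaged


# Toy data, made up for illustration only.
atoms = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
samples = [
    ((1.0, 0.0, 0.0), 3.0),   # 1 Å from the first atom
    ((0.0, 4.0, 0.0), 3.4),   # 4 Å from the first atom
    ((0.0, 0.0, 6.0), 5.0),   # 6 Å from the first atom, outside the sphere
    ((21.0, 0.0, 0.0), 4.2),  # 1 Å from the second atom
]
values = per_atom_local_resolution(atoms, samples)
print(values)
```

The returned values could then be stored in the B-factor column of the model so that standard viewers color each atom by its surrounding local resolution. The brute-force loop is adequate for a sketch; a spatial index would be the natural choice for a full map.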

      (23) Calling the structure without Neu5Ac bound an "apo" structure is confusing since it indeed has the ligand Na+ present and bound. "Na+" and "Na+/Neu5Ac" structures would be more appropriate.

      Changed all “apo” to “unliganded”.

    1. eLife Assessment

      This valuable study reports a potential connection between the seminal microbiome and sperm quality/male fertility. The data are generally convincing. This study will be of interest to clinicians and biomedical researchers who work on microbiome and male fertility.

    2. Reviewer #1 (Public review):

      Summary:

      The authors analyzed the bacterial colonization of human sperm using 16S rRNA profiling. Patterns of microbiota colonization were subsequently correlated with clinical data, such as spermiogram analysis, presence of reactive oxygen species (ROS), and DNA fragmentation. The authors identified three main clusters dominated by Streptococcus, Prevotella, and Lactobacillus & Gardnerella, respectively, which aligns with previous observations. Specific associations were observed for certain bacterial genera, such as Flavobacterium and semen quality. Overall, it is a well-conducted study that further supports the importance of the seminal microbiota.

      Strengths:

      - The authors performed the analysis on 223 samples, which is the largest dataset in semen microbiota analysis so far

      - Inclusion of negative controls to control contaminations.

      - Inclusion of a positive control group consisting of men with proven fertility.

      Weaknesses:

      - The manuscript needs comprehensive proofreading for language and formatting. In many instances spaces are missing or not required.

      - Could the authors explore correlation network analyses to get additional insights in the structure of different clusters?

      - The github link is not correct.

      - It is not possible to access the dataset on ENA.

      - Add the graphs obtained with decontam analysis as a supplementary figure.

      - There is nothing about the RPL group in the results section, while the authors discuss this issue in the introduction. What about the controls with proven fertility?

      - While correctly stated in the title, the term microbiota should be used throughout the manuscript instead of "microbiome"

      Comments on revised version:

      Discussion: Could the authors discuss more the findings about Flavobacterium? Has it ever been associated with the urogenital tract? What is the relative abundance in the present study: this type of bacterium has been previously associated with contaminations (PMID: 25387460, 30497919).

      Figure 1: Increase the size of panel A.

      Figure 3: Can the authors indicate the relative abundance of each genus/species by the size of the node?

      Supplementary data: I don't see anywhere the decontam plots.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      - The manuscript needs comprehensive proofreading for language and formatting. In many instances, spaces are missing or not required.

      Thank you for your comments. The manuscript has been thoroughly proofread for errors in language and formatting.

      - Could the authors explore correlation network analyses to get additional insights into the structure of different clusters? 

      We have added a co-occurrence analysis (at species taxonomic level) based on SparCC to the manuscript (Figure 2).

      This is described on Page 9 line 141-148

      - The GitHub link is not correct. 

      The github repository has now been made public.

      - It is not possible to access the dataset on ENA. 

      We have changed the ENA study PRJEB57401 status to open.

      - Add the graphs obtained with decontam analysis as a supplementary figure. 

      We have added the outputs of decontam (.csv files with feature lists of ASVs that were filtered based on the prevalence and frequency tests) to the github repository.

      - There is nothing about the RPL group in the results section, while the authors discuss this issue in the introduction. What about the controls with proven fertility? 

      Thank you. We have amended the manuscript to compare characteristics between the RPL, unexplained subfertility and controls groups.

      Line 1279-130 page 8:  

      “The study group represented 85% of samples with high sperm DNA fragmentation, 85% of samples with elevated ROS and 79% of samples with oligospermia. Rates of abnormal seminal parameters including low sperm concentration, reduced progressive motility and ROS concentrations were found to be highest in the MFI group (Supplementary Figure 1). Baseline characteristics between the RPL, unexplained subfertility and controls groups were similar.”

      Line 150-154 Page 9: 

      “Bacterial richness, diversity and load were similar between all patient groups examined in the study (Supplementary Figure 4).”

      - While correctly stated in the title, the term microbiota should be used throughout the manuscript instead of "microbiome" 

      Thank you. This misnomer has been amended throughout the manuscript.

      Minor corrections:

      Line 25: provoke is not a good term here. 

      Thank you. The term ‘provoke’ has been removed

      Line 26: why does semen culture have a limited scope? 

      Thank you. Line 40-41 Page 3 has been amended:

      “It is therefore plausible that asymptomatic seminal infections may be associated with impaired reproductive function in some men. Since semen culture has a limited scope for studying the seminal microbiota due to its inability to identify all present microbiota, next generation sequencing (NGS) approaches have been reported recently by a growing number of investigators (13, 14, 15, 16, 17, 18, 19)”.

      Line 68: write μl correctly

      Thank you. This has been corrected

      Line 131: several organisms at the genus level. 

      Thank you. This has been corrected

      Line 136: what are the relative abundances of these genera? Is this relevant? 

      The mean relative abundances for the key taxa mentioned in each cluster are all above 20%. This information has been added to the manuscript text on page 9, line 153.

      Line 173: Molina et al. 

      Thank you. This has been corrected

      Line 173: the contaminations are referred to the low biomass nature of testicular samples. If present, bacteria of accessory gland secretions are an integral part of the seminal microbiota itself. Please review these sentences. 

      Thank you. This has been reworked to highlight the importance of urethral contamination; as you later note, a limitation of our study is the failure to provide paired urine and semen samples.

      Page 11 line 194-196

      “Molina et al report that 50%-70% of detected bacterial reads may be environmental contaminants in a sample from extracted testicular spermatozoa (35); with the addition of passage along the urethra it is likely that contamination of ejaculated semen would be much higher.”

      Table 1: remove results interpretation from table caption. 

      Thank you, this has been acted upon.

      Table 1: why in some cases, like in DNA fragmentation index, the total is not equal to n=223? 

      This is due to missing data/ analysis not possible for some men due to the requirement of a minimum number of sperm in the ejaculate to perform DNA fragmentation testing.

      Table 1: "frag" is not defined. 

      Thank you, this has been amended

      Tables 2, 3 & 4: bacterial genera in italics. 

      Thank you, this has been amended

      Figure 1A: add the fertility status information above the cluster colors. 

      Thank you, this has been amended in Figure 1.

      Figure 1C: the color code is confusing. Use different colors for each cluster. 

      Figure 1 legend: bacterial genera in italics. 

      Figures 1 & 2: the authors should use similar chart formatting in the two tables. 

      Thank you, this has been amended

      Reviewer 2:

      (1) The patient groups have different diagnoses and should be handled as different groups, and not fused into one 'patient' group in analyses.

      Why are the data in tables presented as controls and cases? I would consider men from couples with recurrent pregnancy loss, unexplained infertility, and male factor infertility to have different seminal parameters (not to fuse them into one group). This means, that the statistical analyses should be performed considering each group separately, and not to fuse 3 different infertility diagnoses into one patient group.

      We have conducted the detailed analyses requested by the reviewer, comparing seminal DNA, ROS and microbiota characteristics between the individual patient groups (Supplementary Figures 1 and 4). No specific taxa (at either genus or species level) were found to differ in relative abundance between the diagnostic groups. However, we expect associations between parameters such as reactive oxygen species, or DNA fragmentation, and relative abundance of bacterial species, to be general and not restricted to or specific to each diagnostic group. Therefore, we also conducted further analyses aggregating data from all patient groups to investigate relationships common to these different forms of male reproductive dysfunction.

      (2) Were any covariables included in the statistical analyses, e.g. age, BMI, smoking, time of sexual abstinence, etc? 

      Covariates were not included in the statistical analyses. This has been added in the manuscript to the limitations.

      Page 14 line 267-268

      “Additionally, we did not have other covariables such as smoking status with which to include in further analyses”.

      (3) Furthermore, it is known that 16S rRNA gene analysis does not provide sensitive enough detection of bacteria on the species level. How much do the authors trust their results on the species level? 

      The limitations of taxonomic assignment using 16S rRNA gene metataxonomics are well documented. However, the capacity to assign sequence amplicons at species level depends on the sequence variability of the 16S rRNA gene for each of the taxa reported and the specific gene region chosen. In this study, amplification of the V1-V2 region was performed using a mixed 28f primer set (see methods for details) that enables resolution and assignment of several bacterial species highly relevant to the reproductive tract including Lactobacillus spp., such as L. crispatus and L. iners, (e.g. https://doi.org/10.3389/fcell.2021.641921, https://doi.org/10.1128/msystems.01039-23, https://doi.org/10.1186/s12915-023-01702-2). In this study, we report the presence of L. iners, but not L. crispatus in semen samples, and we have also identified a specific association/co-occurrence between Gardnerella vaginalis and Lactobacillus iners, similar to that observed in vaginal bacterial communities.

      (4) Were the analyses of bacterial genera and species abundances with seminal quality parameters controlled for diagnosis and other confounders? 

      As stated in point 2, no adjustment was made for co-variates. No differences in microbiome composition were observed among the three diagnostic groups, so no adjustments were made to our analysis.

      (5) The authors stress that their study is the biggest on the microbiome in semen. However, when considering that the study consists of 4 groups (with n=46-63), it does not stand out from previous studies. 

      Our study is overall the largest investigating interactions between the seminal microbiome and male reproductive dysfunction. Other studies have included greater numbers of men with infertility.

      (6) Weaknesses: There is a lack of paired seminal/urinal samples. 

      Thank you. This limitation has been added.

      Page 14 line 266-267

      “A further limitation of this study, and others, is the lack of reciprocal genital tract microbiota testing of the female partners, or paired seminal and urinary samples from male participants”.

      Recommendation for authors to consider:

      Including previous classical reviews in the introduction: DOI: 10.1097/MOU.0000000000000742; DOI: 10.1038/s41585-019-0250-y

      Thank you. This has been added.

      Mentioning in the M&M section that there is a supplementary text with a more detailed M&M part. 

      Thank you. This has been added. Further methodological detail can be found in supplementary text.

      Revising the use of 'microbiota' and 'microbiome', they are not synonyms. When talking of 16S rRNA gene analysis, we consider 'microbiome' analysis. 

      Thank you. This misnomer has been amended throughout the manuscript.

      Revising the text, there are several errata (e.g. missing verbs).

      Thank you for your comments. The manuscript has been thoroughly proofread for errors in language and formatting.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary: 

      In the manuscript entitled "Magnesium modulates phospholipid metabolism to promote bacterial phenotypic resistance to antibiotics", Li et al demonstrated the role of magnesium in promoting phenotypic resistance in V. alginolyticus. Using standard microbiological and metabolomic techniques, the authors have shown the significance of fatty acid biosynthesis pathway behind the resistance mechanism. This study is significant as it sheds light on the role of an exogenous factor in altering membrane composition, polarization, and fluidity which ultimately leads to antimicrobial resistance. 

      Strengths: 

      (1) The experiments were carried out methodically and logically. 

      (2) An adequate number of replicates were used for the experiments. 

      Weaknesses: 

      (1) The introduction section needs to be more informative and to the point.  

      Thank you so much for your suggestion. We have revised the introduction to make it more informative and to the point, as follows:

      “Non-inheritable antibiotic or phenotypic resistance represents a serious challenge for treating bacterial infections. Phenotypic resistance does not involve genetic mutations and is transient, allowing bacteria to resume normal growth. Biofilm and bacterial persisters are two phenotypic resistance types that have been extensively studied (Brandis et al., 2023; Corona & Martinez, 2013). Biofilms have complex structures, containing elements that impede antibiotic diffusion, sequestering and inhibiting their activity (Ciofu et al., 2022). Biofilm-forming bacteria and persisters also have distinct metabolic states that significantly reduce their antibiotic susceptibility (Yan & Bassler, 2019). These two types of phenotypic resistance share the common feature of retarded or even ceased growth in the presence of antibiotics (Corona & Martinez, 2013). However, specific factors that promote phenotypic resistance and allow bacteria to proliferate in the presence of antibiotics remain poorly defined.

      Metal ions have a diverse impact on the chemical, physical, and physiological processes of antibiotic resistance (Booth et al., 2011; Lu et al., 2020; Poole, 2017). This includes genetic elements that confer resistance to metals and antibiotics (Poole, 2017) and metal cations that directly hinder (or enhance) the activity of specific antibiotic drugs (Zhang et al., 2014). The metabolic environment can also impact the sensitivity of bacteria to antibiotics (Jiang et al., 2023; Lee & Collins, 2012; Peng et al., 2015; Zhang et al., 2020; Zhao et al., 2021). Light metal ions, such as magnesium, sodium, and potassium, can behave as cofactors for different enzymes (Du et al., 2016) and influence drug efficacy. Heavy metal ions, including Cu2+ and Zn2+, confer resistance to antibiotics (Yazdankhah et al., 2014; Zhang et al., 2018). Recent reports suggest that sodium negatively regulates redox states to promote the antibiotic resistance of Vibrio alginolyticus (Yang et al., 2018), while actively growing Bacillus subtilis cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (2) The weakest point of this paper is in the logistics through the results section. The way authors represented the figures and interpreted them in the results section (or the figure legends) does not match. The figures are difficult to interpret and are not at all self-explanatory. 

      Thank you so much for your suggestion. We have checked that the Results text and figure legends match the figures, and have revised them accordingly.

      (3) There is too much mislabeling of the figure panels in the main text, which makes it difficult to find out which figures the authors are explaining. There should be more explanation on why and how they did the experiments and how the results were interpreted.

      Thank you so much for your suggestion. We have checked the figures and main text to ensure that every figure is cited correctly and clearly.

      Reviewer #2 (Public Review): 

      Summary: 

      In this study, the authors aimed to identify if and how magnesium affects the ability of two particular bacteria species to resist the action of antibiotics. In my view, the authors succeeded in their goals and presented a compelling study that will have important implications for the antibiotic resistance research community. Since metals like magnesium are present in all lab media compositions and are present in the host, the data presented in this study certainly will inspire additional research by the community. These could include research into whether other types of metals also induce multi-drug resistance, whether this phenomenon can be observed in other bacterial species, especially pathogenic species that cause clinical disease, and whether the underlying molecular determinants (i.e. enzymes) of metal-induced phenotypic resistance could be new antimicrobial drug targets themselves. 

      Strengths: 

      This study's strengths include that the authors used a variety of methodologies, all of which point to a clear effect of exogenous Mg2+ on drug resistance in the targeted species. I also commend the authors for carrying out a comprehensive study, spanning evaluation of whole cell phenotypes, metabolic pathways, genetic manipulation, to enzyme activity level evaluation. The fact that the authors uncovered a molecular mechanism underlying Mg2+-induced phenotypic resistance is particularly important as the key proteins should be studied further.

      Weaknesses: 

      I believe there are weaknesses in the manuscript, however. The authors take for granted that the reader is familiar with all the assays utilized, and do not properly explain some experiments, and thus I highly suggest that the authors add a brief statement in each situation describing the rationale for each selected methodology (more details are in the private review to the authors). The Results section is also quite long and bogs down at times, and I suggest that the authors reduce its length by 10 to 20%. In contrast, the Introduction is sparse and lacks key aspects, for example, there should be mention of the study's main purpose and approaches, plus an introduction to the authors' choice of species and their known drug resistance properties, as well as the drug of choice (balofloxacin). Another notable weakness is that the authors evaluated Mg2+-induced phenotypic resistance only against two closely related species, and thus the generalizability of this mechanism of drug resistance is not known. The paper would be strengthened if the authors could demonstrate this type of phenotypic resistance in at least one more Gram-negative species and at least one Gram-positive species (antimicrobial susceptibility evaluations would suffice), each of which should be pathogenic to humans. Demonstrating magnesium-induced phenotypic drug resistance in the WHO Priority Bacterial Pathogens would be particularly important. 

      In general, the conclusions drawn by the authors are justified by the data, except for the interpretation of some experiments. Importantly, this paper has discovered new antimicrobial resistance mechanisms and has also pointed to potential new targets for antimicrobials. 

      Thank you so much for your suggestion! We followed your suggestions to revise the manuscript as follows:

      (1) We added brief statements explaining the results and the rationale for each methodology, as suggested in your private review.

      (2) To streamline the story and make it more logical, we moved the entire second Results section to the supplementary text and a supplementary figure.

      (3) We revised the introduction by adding information to make it more informative and to the point, as follows:

      “Non-inheritable antibiotic or phenotypic resistance represents a serious challenge for treating bacterial infections. Phenotypic resistance does not involve genetic mutations and is transient, allowing bacteria to resume normal growth. Biofilm and bacterial persisters are two phenotypic resistance types that have been extensively studied (Brandis et al., 2023; Corona & Martinez, 2013). Biofilms have complex structures, containing elements that impede antibiotic diffusion, sequestering and inhibiting their activity (Ciofu et al., 2022). Biofilm-forming bacteria and persisters also have distinct metabolic states that significantly reduce their antibiotic susceptibility (Yan & Bassler, 2019). These two types of phenotypic resistance share the common feature of retarded or even ceased growth in the presence of antibiotics (Corona & Martinez, 2013). However, specific factors that promote phenotypic resistance and allow bacteria to proliferate in the presence of antibiotics remain poorly defined.

      Metal ions have a diverse impact on the chemical, physical, and physiological processes of antibiotic resistance (Booth et al., 2011; Lu et al., 2020; Poole, 2017). This includes genetic elements that confer resistance to metals and antibiotics (Poole, 2017) and metal cations that directly hinder (or enhance) the activity of specific antibiotic drugs (Zhang et al., 2014). The metabolic environment can also impact the sensitivity of bacteria to antibiotics (Jiang et al., 2023; Lee & Collins, 2012; Peng et al., 2015; Zhang et al., 2020; Zhao et al., 2021). Light metal ions, such as magnesium, sodium, and potassium, can behave as cofactors for different enzymes (Du et al., 2016) and influence drug efficacy. Heavy metal ions, including Cu2+ and Zn2+, confer resistance to antibiotics (Yazdankhah et al., 2014; Zhang et al., 2018). Recent reports suggest that sodium negatively regulates redox states to promote the antibiotic resistance of Vibrio alginolyticus (Yang et al., 2018), while actively growing Bacillus subtilis cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (4) We examined the effect of magnesium in WHO-listed priority pathogens, which confirmed the results, as follows:

      “Importantly, exogenous MgCl2 also increased MICs of clinical isolates, carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa and carbapenem-resistant Acinetobacter baumannii to balofloxacin (Fig 1G).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) There are many grammatical mistakes to point out. The manuscript needs proofreading and editing.

      We appreciate this comment! The manuscript has been revised by a native speaker.

      (2) The introduction could be more informative. A little more description of magnesium - such as what it does to antibiotics and how it's known to affect the microbiome - might be helpful for the general readers. The question remains why out of all the metal ions that might affect antibiotic resistance (many of them are less explored), authors particularly decided to work on the effect of magnesium. The introduction should cover the rationale of their hypothesis. Also, the authors might want to briefly talk about the model organisms (V. alginolyticus and V. parahaemolyticus) describing how threatening they are and how they are becoming resistant to antibiotics.

      We appreciate this comment! We have revised the introduction to provide additional information, as follows:

      “In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (3) Figure 1C is mislabeled as 1B (line 100). Line 101: The sentence is not clear and very confusing. What is meant by 15.6mM - 62.4 mM? Are they talking about the concentration of BLFX (though in the figure the concentration was shown in µg)? Please rewrite the sentence in a simplified way. Also, the zone of inhibition was decreased with increasing MgCl2, not increased. 

      We appreciate this comment! These have been revised; in particular, Fig 1B has been corrected to Fig 1C. Line 101 is now Line 122, and the sentence has been revised as follows:

      “At balofloxacin doses of 1.56, 3.125, 6.25, and 12.5 µg, the zone of inhibition decreased with increasing MgCl2 (Fig 1D)”

      (4) In the western blot images, it would be nice to indicate the MW of the protein bands shown. The loading control used for the experiments should be clearly mentioned in the figure legends. 

      We appreciate this comment! The MWs are indicated in the western-blot image throughout the manuscript. 

      The loading control is clearly stated in the figure legend as follows:

      “Whole cell lysates resolved by SDS-PAGE were stained with Coomassie brilliant blue as a loading control.”

      (5) Figures 2 B and C: the figure legend does not explain what the authors wanted to show. It's not clear how they plotted the inhibitory curve, or the binding efficacy. These panels need an explanation of how the analysis was done.

      We appreciate this comment! Figure 2 has been moved to Suppl. Fig 2, and its description has been moved to the Suppl. Text. We have revised the description of the results, which is now in the Suppl. Text, as follows:

      “Prior studies suggest that the chelation of antibiotics by magnesium ions inhibits antibiotic uptake (Deitchman et al., 2018; Lunestad and Goksøyr, 1990). To investigate whether magnesium binds to balofloxacin, balofloxacin was pre-incubated with magnesium, and zone of inhibition (ZOI) analysis was conducted. Six different concentrations of balofloxacin were separately incubated with six different concentrations of MgCl2, and then spotted on filter paper so that a defined amount of balofloxacin could be used for ZOI. While lower concentrations of MgCl2 (0.78, 3.125, or 12.5 mM) did not alter the ZOI, higher concentrations, including 50 and 200 mM MgCl2, decreased the ZOI (Suppl. Fig 2A), suggesting that even high doses of magnesium had only a partial effect on balofloxacin through direct binding. For example, at 200 mM MgCl2 and 5 or 10 μg/mL balofloxacin, the balofloxacin ZOI was 53.2 and 70.3% of the ZOI at 0 mM MgCl2, suggesting that 50% of the antibiotics were still functional. Intracellular BLFX also decreased with increasing MgCl2 (Suppl. Fig 2B), while exogenous Mg2+ increased intracellular Mg2+ levels in a dose-dependent manner. For example, exogenous 50 and 200 mM MgCl2 increased intracellular Mg2+ levels to 1.21 and 1.31 mM, respectively (Suppl. Fig 2C). The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E); however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      (6) For the metabolomics results, it will help immensely if the authors provide a volcano plot of the identified metabolites and plot the heat map according to the -log2 metabolite intensities. In Figure 3A, it's not clear what information is conveyed through Euclidean distance calculations of the heat map. In Figure 3 B, the authors mentioned that the OPLS-DA test was conducted, although the figure shows a PCA plot, so it's not clear how these two are connected. Figure 3 E: the figure legend says scattered plot, but the panel represents color-coded numerical values, not a scattered plot. Also, it's not clear how they got those values. 

      We appreciate this comment! We agree that a volcano plot of the differential metabolites would be useful. However, we did not adopt volcano plots in this study because the metabolomes are magnesium concentration-dependent and comprise six parallel groups, so volcano plots would give a complicated view of the comparisons among groups. We also tried plotting the heat map according to the -log2 metabolite intensities. Although this analysis clustered the 200 mM and 50 mM groups better, the data for the low magnesium concentrations were not consistent, which may be due to the minor metabolic changes at low magnesium concentrations. Thank you for your understanding.

      For the Euclidean distance calculations, we explain in the figure legend as follows:

      “Euclidean distance calculations were used to generate a heatmap that shows clustering of the biological and technical replicates of each treatment.” 
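
      As an illustration of what a Euclidean-distance comparison of replicates involves, here is a minimal sketch. It is not the authors' pipeline: the sample names, the abundance values, and the helper `nearest_neighbor` are all hypothetical. Replicates of the same treatment are expected to lie closer together than samples from different treatments, which is what drives their clustering in a heatmap dendrogram.

```python
import math

# Hypothetical, made-up metabolite abundances (arbitrary units), for illustration only.
profiles = {
    "0 mM rep1": [1.0, 2.0, 3.0],
    "0 mM rep2": [1.1, 2.1, 2.9],
    "200 mM rep1": [3.0, 0.5, 4.0],
    "200 mM rep2": [3.2, 0.4, 4.1],
}

def euclidean_distances(samples):
    """Return {(a, b): distance} for every unordered pair of samples."""
    names = list(samples)
    return {
        (a, b): math.dist(samples[a], samples[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

def nearest_neighbor(samples, name):
    """Return the sample whose profile is closest to that of `name`."""
    others = (other for other in samples if other != name)
    return min(others, key=lambda other: math.dist(samples[name], samples[other]))

distances = euclidean_distances(profiles)
for (a, b), d in distances.items():
    print(f"{a} vs {b}: {d:.3f}")
```

      In this toy data set each replicate's nearest neighbor is the other replicate of the same treatment, so a distance-based clustering would group the replicates together.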

      Figure 2B (Figure 3B in the previous version) has been replaced with the OPLS-DA analysis in the revised version.

      The legend of Figure 2E (Figure 3E in the previous version) has been revised as follows:

      “E. Areas of the peaks of palmitic acid and stearic acid generated by GC-MS analysis.” 

      (7) In Figure 4, the panels are not properly referred to in the figure legends (or in the text). Please make sure to refer to the correct panel.

      We appreciate this comment! The figure legends have been corrected to match the panel and text. 

      Figure 4F: how was the synergy analysis done? In the methods section, the authors described the antibiotic bactericidal assay protocol, but there was no clear indication of how they generated the isobologram. 

      We appreciate this comment! We provide additional information in the Figure 3F legend (Figure 4F in the previous version), as follows:

      “Synergy analysis for BLFX with palmitic acid for V. alginolyticus. Synergy was assessed by comparing the dose needed for 50% inhibition of the synergistic agents (white) and non-synergistic (i.e., additive) agents (purple).”
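
      One common way to formalize an isobologram comparison at 50% inhibition is the Loewe additivity interaction index. The sketch below shows that general idea only; it is an assumption that this resembles the calculation behind the figure, and the doses, the single-agent 50% inhibition values, and the `tolerance` band are hypothetical.

```python
def interaction_index(dose_a, dose_b, ic50_a, ic50_b):
    """Loewe interaction index for a combination (dose_a, dose_b) that gives 50% inhibition.

    ic50_a and ic50_b are the doses of each agent alone that give 50% inhibition.
    """
    return dose_a / ic50_a + dose_b / ic50_b

def classify(index, tolerance=0.1):
    """Label a combination; the tolerance band around 1 is an arbitrary assumption."""
    if index < 1 - tolerance:
        return "synergy"
    if index > 1 + tolerance:
        return "antagonism"
    return "additive"

# Hypothetical single-agent values: 4 units of agent A or 8 units of agent B alone
# give 50% inhibition.
for dose_a, dose_b in [(1, 2), (2, 4), (4, 8)]:
    index = interaction_index(dose_a, dose_b, ic50_a=4, ic50_b=8)
    print(dose_a, dose_b, index, classify(index))
```

      On an isobologram, combinations with an index of 1 fall on the straight line joining the two single-agent doses, points below the line indicate synergy, and points above it indicate antagonism.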

      (8) Figure 5 A: the scatter plot is plotted according to the area along the Y axis: which "area" is represented here? There is absolutely no explanation, neither in the results nor in the figure legends. Using box plots might be a better option than using a scattered plot.

      We appreciate this comment! “Area” has been defined in the revised manuscript as follows:

      “The area indicates the area of the peak of the metabolite in total ion chromatography of GC-MS.” 

      (9) In Figure 6 A, the heat map is plotted according to the column Z scores. What is meant by "column Z score"? The corresponding figure legend says, "heat map showing differential abundance of lipid". Z scores do not represent an abundance of a variable, so the conclusion might not be appropriate here. 

      We appreciate this comment! In Figure 5A (Figure 6A in the previous version), the column Z score reflects the abundance of the metabolites analyzed; it is generated automatically in the heat map analysis as an indicator for the metabolites tested. The legend has been revised as follows:

      “Heatmap showing changes in differential lipid levels at the indicated concentration of MgCl2.”  
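
      To make the term concrete, the following minimal sketch shows how a column Z score is typically computed. It assumes that each column holds one lipid measured across samples and that the sample standard deviation is used; the peak areas are hypothetical and the heat map software used in the study may differ in these details.

```python
from statistics import mean, stdev

def column_z_scores(rows):
    """Standardize each column to mean 0 and sample standard deviation 1.

    A column with identical values has zero standard deviation and would
    raise ZeroDivisionError; this sketch does not handle that case.
    """
    scaled_columns = []
    for column in zip(*rows):
        mu, sigma = mean(column), stdev(column)
        scaled_columns.append([(x - mu) / sigma for x in column])
    # Transpose back so that rows are samples again.
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical peak areas: rows are samples, columns are two lipids.
areas = [
    [100.0, 5.0],
    [200.0, 7.0],
    [300.0, 9.0],
]
z = column_z_scores(areas)
print(z)
```

      Both columns yield the same Z scores even though their raw areas differ by more than an order of magnitude, which is why a Z score indicates change relative to the column mean and not absolute abundance.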

      (10) Line 313-314: it should be Figure EV6C.  

      We appreciate this comment! The citation has been corrected.

      (11) The authors have shown that Mg+2 does not alter the LPS transport system, however, there was some significant increase in LPS expression at 200mM MgCl2. It would be interesting if the authors could also check if Mg+2 has any effect on the outer membrane protein (OMP) integrity (by checking OMP components BamA and LptD).  

      We appreciate this comment! We have carefully examined membrane permeability in Figure 7. We therefore did not perform additional experiments to assess changes in BamA and LptD. Thank you very much for your understanding.

      (12) I wonder if the authors could check the effect of extracellular Mg+2 during the co-treatment of palmitic acid, linoleic acid, and balofloxacin. Will there still be the antagonistic effect or the presence of Mg+2 could change the phenotype? 

      We appreciate this comment! An additional experiment was performed, as follows:

      “Furthermore, magnesium had a minimal effect on the antagonistic effect of palmitic acid, linolenic acid, and balofloxacin (Fig 4G), suggesting that this mineral functions through lipid metabolism.” 

      Reviewer #2 (Recommendations For The Authors)

      (1) As mentioned in the Public Review, I strongly believe that the impact of this study will be more significant if magnesium-induced phenotypic drug resistance could be demonstrated in at least one other Gram-negative and one other Gram-positive species, both of which should be human pathogens. The full suite of experiments would not be necessary for this suggestion; evaluation of the effect of Mg concentration in growth media on the drug resistance of other species, testing the different antibiotic types used in this study, would be sufficient.

      We appreciate this comment! Additional experiments have been performed to test this idea. As shown in Figure 1G, Mg2+ has a similar effect on carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa and carbapenem-resistant Acinetobacter baumannii as on the Vibrio species. These results are described as follows:

      “Importantly, exogenous MgCl2 also increased MICs of clinical isolates, carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa and carbapenem-resistant Acinetobacter baumannii to balofloxacin (Fig 1G).”

      (2) I recommend that the Introduction section be expanded. I recommend one or two sentences introducing the two Vibrio species selected for study. I.e. why did the authors choose these two species? What is known about their phenotypic drug resistance in the literature? Why did the authors select balofloxacin for their studies, is it a common antimicrobial used vs Vibrios? As well, the end of the Introduction section ends abruptly with no transition to the present study itself. The end of the introduction should include one or two sentences introducing the main purpose of the study, its approach, and the techniques undertaken. For example, "In this study, we evaluated whether magnesium induces phenotypic resistance in Vibrio species and the molecular/genetic basis for such resistance. We used genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility evaluations." 

      We appreciate this comment! We revised the introduction to provide additional information, as follows:

      “In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection, however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species. 

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations. ”

      (3) The authors introduce the acronym AWST but never use it again in the paper, instead they use SWT. The authors should introduce SWT only for consistency. 

      We appreciate this comment! We have corrected all instances of “SWT” to “ASWT”.

      (4) Line 76 is not clear: what is meant by "some of which could influence drug efficacy" - the enzymes that utilize light metal ions are co-factors? Or the metals directly?  

      We appreciate this comment! The point we wanted to convey is that light metal ions can serve as cofactors that catalyze biochemical reactions. Such reactions could alter drug efficacy; e.g., Fe-S clusters are metallocofactors for proteins that regulate redox chemistry, including antibiotic-induced redox changes. However, this information is not appropriate for this manuscript, so we deleted the sentence. 

      (5) Line 90: add a reference corroborating that this chemical composition is a mimic of marine water. The NaCl concentration used in particular looks quite low. 

      We appreciate this comment! It was a typographical error. The NaCl concentration was 210 mM, as shown in Suppl. Table 1. We also provide details of the chemical composition of the marine water as follows:

      “Marine environments and agriculture, where antibiotics are commonly used, are rich in magnesium. To investigate whether this mineral impacts antibiotic activity, the minimal inhibitory concentration (MIC) of V. alginolyticus ATCC33787 and V. parahaemolyticus VP01, which we referred as ATCC33787 and VP01 afterwards, isolated from marine aquaculture, to balofloxacin (BLFX) in Luria-Bertani medium (LB medium) plus 3% NaCl as LBS medium and “artificial seawater” (ASWT) medium that included the major ion species in marine water (Wilson, 1975) (LB medium plus 210 mM NaCl, 35 mM Mg2SO4, 7 mM KCl, and 7 mM CaCl2) were assessed”

      (6) Line 98 and Figure 1B. M9 is indicated in the text but does not appear in the figure, the figure only shows SWT. This should be checked. Line 99: based on Figure 1C, the authors are adding MgCl2 to SWT, SWT should be mentioned in this line. Line 100: I believe this is referring to Figure 1C, which should be checked. 

      We appreciate this comment! 

      Line 98, which is now Line 118: We have corrected M9 to ASWT as follows:

      “However, the MIC for BLFX was higher in ASWT medium supplemented with Mg2SO4 or MgCl2 than in LB medium (Fig 1B).”

      Line 99, which is now Line 133: the sentence is corrected as follows:

      “The MIC for BLFX increased at higher concentrations of MgCl2 in ASWT”

      Line 100, which is now Line 135: we have corrected Fig 1B to Fig. 1C.

      (7) Line 101: text and Figure 1D are not consistent, as Figure 1D does not show this level of precision in added MgCl2 as indicated in the text (15.6 - 62.4 mM).  

      We appreciate this comment! The sentence has been corrected as follows: “At balofloxacin doses of 1.56, 3.125, 6.25, and 12.5 µg, the zone of inhibition decreased with increasing MgCl2 (Fig 1D)”.

      (8) MgCl2 clearly induces increasing levels of BLFX resistance, and to high levels, but not for every antibiotic. For example, the level of increased resistance to blactams is low (ceftriaxone) and plateaus (ceftazidime). As well, resistance to gentamicin plateaus at a lower level than the other aminoglycosides. These observations do not take away from the conclusion that Mg induces multi-drug resistance, but since the behaviour of the MICs for these drugs is different than the other drugs, they should be mentioned. Also, Figure 1F - tetracyclines (plural) is used for vertical axis label - does this refer to the tetracycline itself or the class itself, and if the class, which one was tested? 

      We appreciate this comment! We revised the description as follows: “Notably, magnesium had a reduced effect on ceftriaxone and gentamicin than other antibiotics.”

      The “Tetracyclines” axis label is now “Oxytetracycline” in the revised manuscript. 

      - The magnesium chelation experiments presented in Figure 2 are not clear. The authors should briefly mention how this was done around line 128, and what data underlies the values in Figure 2C. Figure 2B is also not clear to me at all. Similarly, how the authors measured intracellular balofloxacin and Mg2+ is not clear and should be mentioned briefly around lines 130-132. 

      We appreciate this comment! This part has been rewritten as follows: “To investigate whether magnesium binds to balofloxacin, balofloxacin was preincubated with magnesium, and zone of inhibition (ZOI) analysis was conducted. Six different concentrations of balofloxacin were separately incubated with six different concentrations of MgCl2, and then spotted on filter paper so that a defined amount of balofloxacin could be used for ZOI. While lower concentrations of MgCl2, (0.78, 3.125, or 12.5 mM) did not alter the ZOI, higher concentrations, including 50 and 200 mM MgCl2, decreased the ZOI (Suppl. Fig 2A), suggesting that even high doses of magnesium had only a partial effect on balofloxacin through direct binding. For example, at 200 mM MgCl2 and 5 or 10 μg/mL balofloxacin, the balofloxacin ZOI was 53.2 and 70.3% of the ZOI at 0 mM MgCl2, suggesting that  50% of the antibiotics were still functional. Intracellular BLFX also decreased with increasing MgCl2 (Suppl. Fig 2B), while exogenous Mg2+ increased intracellular Mg2+ levels in a dose-dependent manner. For example, exogenous 50 and 200 mM MgCl2 increased intracellular Mg2+ levels to 1.21 and 1.31 mM, respectively (Suppl. Fig 2C). The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E), however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      - Line 135: LPS cannot be "expressed", as the authors word it here. This should be corrected. Also, the inspection of Figure 2G actually shows the levels of LPS increase with increased Mg2+. The authors should re-evaluate these results and change their description around this area of the Results. 

      We appreciate this comment! We have moved the whole of Figure 2 to the Supplementary Text and Supplementary Figure 2. We rewrote this part as follows: “The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E), however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      - Section: MgCl2 affects bacterial metabolism. Authors switched to M9 medium - why? This contrasts with other sections using SWT and should be explained. Also, I cannot evaluate whether the statistical analysis of the data here was performed correctly and was appropriate for this type of experiment. I advise the authors to move the details in lines 166-169 to the Materials and Methods and replace this section instead with a more accessible description of the statistical analysis that a non-expert would be able to appreciate. Furthermore, analysis of Figure 3A indicates that the levels of asparagine, 4-hydroxybutyric acid, uracil, cystathionine, fumaric acid, and aminoethanol have significantly changed at high MgCl2, but these are not mentioned in the text. I suggest the authors mention these if they are relevant to the 12 enriched pathways, especially the biosynthesis of fatty acids. 

      We appreciate this comment! 

      We indicate the reason for using M9 medium with the following text:

      “To better understand how magnesium affects bacterial metabolism”

      The details in lines 166-169 have been moved to the Materials and Methods. 

      We have carefully examined the abundance of the metabolites and the enriched pathways. Among the listed metabolites, only fumarate is within the enriched pathways. We mention this point in our revised manuscript as follows:

      “The increase in fatty acid biosynthesis could be partially explained by an imbalanced pyruvate cycle/TCA cycle, in which fumarate levels increased at higher Mg2+ while succinate levels increased at lower Mg2+ (Suppl. Fig 5B). These findings indicated that glycolysis fluxes into fatty acid biosynthesis rather than the pyruvate cycle/TCA cycle. The relevance of fatty acids and BLFX was demonstrated by the observation that exogenous palmitic acid increased bacterial resistance to balofloxacin (Fig 2F). These results suggest that fatty acid metabolism may be critical to magnesium-based phenotypic resistance.”

      - Line 211 appears to refer to Figure 4F and should be checked. Similarly in line 216 - appears this should be Figure 4H, and line 218 should be Figure 4H. Line 226: add a reference to Fig 4I (after arcA was decreased). Line 227: what are genes N646_1004 and N646_1885? Based on Fig 4J these are crp - authors should add to line 227. Line 228 appears to refer to Figure 4J, not Figure 4I. Line 229 - should be Figure 4K, not Figure 4I. Line 231 - should be 4L, not 4K. Line 239 - should be 4M.

      We appreciate this comment! The text and figures now match. 

      - Line 312: the descriptions of "11 lipids, 32 lipids, and 53", and then "26 lipids, 52 lipids, and 107 lipids" are not clear at all and should be corrected. 

      We appreciate this comment! The sentence is revised as follows:

      “The abundance of 11, 32, and 53 lipids was increased in 3.125, 50, and 200 mM MgCl2-treated bacteria, respectively, while the abundance of 26, 52, and 107 lipids was decreased in 3.125, 50, and 200 mM MgCl2-treated bacteria, respectively (Suppl. Fig 7C)”

      - Line 340. What is the assay the authors are using to measure the levels of the PGS and PSS enzymes? This is not mentioned or clear in this part of the Results.  

      We appreciate this comment! We provide the information in the manuscript as follows:

      “Levels of PGS and PSS were quantified by ELISA kits according to manufacturer’s instructions (Shanghai Fusheng Industrial Co., Ltd., China)”

      - Line 372: What is the assay for measuring membrane depolarization? This is not mentioned and I suggest it should be. Line 374: Figure 7B does not show time dependence, only dose dependence, this should be corrected, it is assumed the authors are referring to Fig 7C for the time dependence data. 

      We appreciate this comment! We provide the information in the Results as follows:

      “The voltage-sensitive dye, DiBAC4(3) showed that 12.5–200 mM MgCl2 promoted membrane depolarization in a dose-dependent manner (Fig 6A)”

      We also explain how DiBAC4(3) can be used to measure membrane depolarization in the Materials and Methods section as follows:

      “DiBAC4(3) is a voltage-sensitive probe that penetrates depolarized cells, binding intracellular proteins or membranes exhibiting enhanced fluorescence and red spectral shift.”

      To make clear which figure is meant, we revised the sentence as follows:

      “Meanwhile, MgCl2 had a dose-dependent (Fig 6B) and time-dependent (Fig 6C) effect on proton motive force (PMF).”

      - Line 384: mention how FM5-95 measures membrane permeability. The authors should also clarify how this reagent is used to measure membrane fluidity, and it is not clear if the data for this is presented in Figure 7 - please clarify. Regarding SYTO9 dye experiment: the authors should briefly explain the experimental design - how SYTO9 dye operates and why FACS was chosen. What is labeled with FITC?  

      We appreciate this comment! We clarify the reason for using FM5-95 in the Materials and Methods section as follows:

      “Measurement of fluidity by fluorescence microscopy

      Membrane fluidity was measured as previously described (Wen et al., 2022). Briefly, ATCC33787 cells were cultured in medium with the indicated concentrations of MgCl2, collected, and adjusted to an OD of 0.6. A 100 μL aliquot of bacterial cells from each sample was diluted to 1 mL, and 10 μL of FM5-95 (10 mg/mL; Thermo Fisher Scientific, USA) was added. FM5-95 is a lipophilic styryl dye that inserts into the outer leaflet of the bacterial membrane and becomes fluorescent. This dye preferentially binds to microdomains with high membrane fluidity (Wen et al., 2022). After incubation for 20 min at 30 ℃ with shaking in the dark, the sample was centrifuged for 10 min at 12,000 rpm. The pellets were resuspended in 20 μL of 3% NaCl. A 2 μL aliquot of the sample was dropped on an agarose slide and photographed under an inverted fluorescence microscope.”

      This data is presented as micrographs in Fig. 6D, which shows the decreased FM5-95 staining with increasing concentrations of MgCl2. We make this description clear with the following revision:

      “FM5-95 staining decreased with increasing concentrations of Mg2+, and no staining was observed in the presence of 200 mM Mg2+ (Fig 6D).”

      We explain the reason for using SYTO9 as follows:

      “SYTO9, a green fluorescent dye that binds to nucleic acid, enters and stains bacteria cells when there is an increase in membrane permeability (Lehtinen et al., 2004; McGoverin et al., 2020). Staining decreased with increasing MgCl2, indicating that bacterial membrane permeability declined in an Mg2+ dose-dependent manner (Fig 6E).”

      We did not perform FACS in this study; we only analyzed the fluorescence distribution with the instrument. To make this clear, we revised the sentence as follows:

      “After incubation for 15 min at 30 ℃ with shaking in the dark, the mixtures were filtered and measured by flow cytometry (BD FACSCalibur, USA).”

      - Lines 391-397. The statement that palmitic acid shifts the peaks in Figure 7F is not supported by the data. There is essential no change in the major peak position within each MgCl2 concentration set with increasing palmitic acid. For the linolenic acid data, it is clear that linolenic acid increases permeability only at 50 mM MgCl2-this should be mentioned in the text. 

      We appreciate this comment! We revised the sentence as follows:

      “Exogenous palmitic acid also shifted the fluorescence signal peaks to the left in an MgCl2-dependent manner while palmitic acid only slightly shifted the peaks (Fig 6F). In contrast, exogenous linolenic acid shifted the peak to the right in a dose-dependent manner at 50 mM MgCl2 (Fig 6G).” 

      - Line 404-405 - as mentioned earlier, the assay for the update of BLFX should be mentioned (if it is done so earlier in the text, then it does not need to be here).  

      We appreciate this comment! It has been mentioned in the introduction.  

      - Discussion: CpxA/R-OmprF pathway is mentioned here for the first time. Is this one of the pathways modified by MgCl2 as determined during the course of the study? If so, this should be reworded to mention that. If not, the relevance of this particular pathway as it relates to light metals and phenotypic resistance should be discussed.

      We appreciate this comment! Since it is not relevant to the discussion of Mg2+ and fatty acid biosynthesis, we deleted this sentence in the revised manuscript.  

      -The following grammatical errors should be corrected:

      -line 55 change to: "genetic mutations; instead, this type of resistance is transient, and bacteria resume normal growth"

      -line 57: change to "resistance types are biofilm" 

      -line 61: change to "states that significantly" 

      -line 63: change to "resistance share the common feature in they retard or even cease in the presence" 

      -line 65: change to "resistance that allow bacteria to proliferate" 

      -line 81: change "But whether" to "Whether" 

      -line 178: change to "may be critical to the Mg-based phenotypic resistance"

      -line 86: change to "Marine environments and agriculture are rich in magnesium, where..." 

      -line 93: change in to vs

      -line 154: insert space after metabolism 

      -line 158: change 'identified" to "focused on the levels of" 

      -line 160: change "The levels of forty-one metabolites" 

      -line 198: change shared to share 

      -line 310: increased is duplicated, delete one 

      -line 451: add "the" before ratio 

      -line 453: gram should be capitalized 

      -line 462: "the regulation" should be reworded to "More importantly, the effect of exogenous MgCl targets the..." 

      -line 469: add dash between Mg2+ and limited

      -line 478: change "the crucial" to "a crucial" 

      -there are numerous locations in the manuscript where the word "magnetism" is used when clearly the word is supposed to be magnesium - this should be corrected

      We appreciate this comment! These have been corrected or revised. 

      Editors comments:

      Page 2 line 27; Page 25 line number 426; page 27 line number 481: In the abstract and discussion, only Vibrio alginolyticus was mentioned, even though two Vibrio species were used in the study. It would be helpful to understand the rationale behind the focus on this particular species.

      We appreciate this comment! We have revised the introduction to provide additional information as follows:

      “Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection, however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.”

      On Page 2, line 34: The abstract contains some undefined abbreviations, such as 'PE' and 'PG', which should be explained. 

      We appreciate this comment! We define PE and PG in the revised abstract as follows:

      “phosphatidylethanolamine (PE) biosynthesis is reduced and phosphatidylglycerol (PG)”

      On Page 2, line 31-32: For the statement "Exogenous supplementation of fatty acids confirm the role of fatty acids in antibiotic resistance…" it would be beneficial to specify whether the fatty acids were saturated or unsaturated. 

      We appreciate this comment! We revised the sentence as follows:

      “Exogenous supplementation of unsaturated and saturated fatty acids increased and decreased bacterial susceptibility to antibiotics, respectively, confirming the role of fatty acids in antibiotic resistance.”

      The potential effects of the specific ions (SO4 and Cl2) present in the Mg2SO4 and MgCl2 compounds used in the study were not discussed. It would be useful to understand if these ions had any influence on the observed outcomes.

      We appreciate this comment! We revised the sentence as follows:

      “However, the MIC for BLFX was higher in ASWT medium supplemented with Mg2SO4 or MgCl2 than in LB medium (Fig 1B). And Mg2SO4 or MgCl2 had no difference on MIC, suggesting it is Mg2+ not other ions contribute to the MIC change.”

      On Page 8, line 141: The heading of Figure 2, "Mg2+ elevates intracellular Mg2+," seems redundant and could be revised for clarity or modified. 

      We appreciate this comment! Figure 2 has been moved to the supplementary figures as Suppl. Fig 2. The title is revised as follows:

      “Figure 2. Mg2+ decreases balofloxacin uptake.”

      On Page 4, line 91: some terms/abbreviations, such as 'LB' and 'M9,' require expansion or definition to ensure the reader's understanding.

      We appreciate this comment! We include the expansions of LB and M9 in the revised manuscript as follows:

      “Luria-Bertani medium (LB medium)” and “M9 minimal medium (M9 medium)”

      Page 4, line 92: The real seawater composition used in the experiments should be supported by a reference.

      We appreciate this comment! We provide the reference in the revised manuscript as follows:

      ““artificial seawater” (ASWT) medium that included the major ion species in marine water (Wilson, 1975) (LB medium plus 210 mM NaCl, 35 mM Mg2SO4, 7 mM KCl, and 7 mM CaCl2)”

      Page 4, line number 93: the full names of the bacterial strains (e.g., ATCC33787 and VP01) should be provided instead of just the strain numbers.

      We appreciate this comment! We revised the sentence as follows:

      “To investigate whether this mineral impacts antibiotic activity, the minimal inhibitory concentration (MIC) of V. alginolyticus ATCC33787 and V. parahaemolyticus VP01, which we referred as ATCC33787 and VP01 afterwards,”

      Finally, there appears to be a potential contradiction between the statements on page 12, lines 211-212 and 214-216, regarding the effects of Mg2+ on the synthesis of unsaturated fatty acids. Further explanation may be needed to reconcile these seemingly contradictory points.

      We appreciate this comment! Lines 221-226 (previously lines 211-212) concern gene expression for fatty acid biosynthesis, whereas lines 228 and 233 (previously lines 214-216) concern gene expression for fatty acid degradation. We agree that the previous description was somewhat confusing. We revised the sentence to emphasize that the focus is on fatty acid degradation so that readers can tell the two apart. 

      In the text, we revised it as follows:

      “In addition, we also quantified gene expression during fatty acid degradation to determine whether Mg2+ affects this process”

      In the figure legend, we also indicate that:

      “H. qRT-PCR for the expression of genes encoding fatty acid degradation in the absence or presence of the indicated concentrations of MgCl2”

    2. eLife Assessment

      The study explored the influence of magnesium on phenotypic antibiotic resistance in two strains of Vibrios: V. alginolyticus ATCC33787 and V. parahaemolyticus VP01. This research is fundamental for revealing the phenotypic antibiotic resistance mechanism utilized by the specified model bacteria in elevated levels of magnesium. The study produced convincing evidence indicating that in high concentrations of magnesium, the efficacy of selected antibiotics was diminished due to decreased biosynthesis of unsaturated fatty acids and PE, along with an increase in the biosynthesis of PG.

    3. Reviewer #1 (Public review):

      Summary:

      In the manuscript entitled "Magnesium modulates phospholipid metabolism to promote bacterial phenotypic resistance to antibiotics", Li et al demonstrated the role of magnesium in promoting phenotypic resistance in V. alginolyticus. Using standard microbiological and metabolomic techniques, the authors have shown the significance of fatty acid biosynthesis pathway behind the resistance mechanism. This study is significant as it sheds light on the role of an exogenous factor in altering membrane composition, polarization and fluidity which ultimately leads to antimicrobial resistance.

      Strengths:

      Authors have used different approaches to demonstrate the effect of Mg+2 on drug resistance in Vibrio alginolyticus. The revised version of the manuscript is much improved, with a very informative introduction and a variety of methodologies with clear explanation of the experiments performed. Also, additional experiments were performed as suggested by the reviewers which certainly enhanced the quality of the paper. I believe the findings of this study will be of high impact in the bacterial community.

      Weaknesses:

      There are a few grammatical mistakes.

      Comments on revisions:

      The authors have done a comprehensive job of addressing all my concerns in their revised version.

    4. Reviewer #2 (Public review):

      Summary:

      In this study, the authors aimed to identify if and how magnesium affects the ability of two particular bacteria species to resist the action of antibiotics. In my view, the authors succeeded in their goals and present a compelling study that will have important implications for the antibiotic resistance research community. Since metals like magnesium are present in all lab media compositions and are present in the host, the data presented in this study certainly will inspire additional research by the community. These could include research into whether other types of metals also induce multi-drug resistance, whether this phenomenon can be observed in other bacterial species, especially pathogenic species that cause clinical disease, and whether the underlying molecular determinants (i.e. enzymes) of metal-induced phenotypic resistance could be new antimicrobial drug targets themselves.

      Strengths:

      This study's strengths include that the authors used a variety of methodologies, all of which point to a clear effect of exogenous Mg2+ on drug resistance in the targeted species. I also commend the authors for carrying out a comprehensive study, spanning evaluation of whole cell phenotypes, metabolic pathways, genetic manipulation, to enzyme activity level evaluation. The fact that the authors uncovered a molecular mechanism underlying Mg2+-induced phenotypic resistance is particularly important as the key proteins should be studied further.

      Weaknesses:

      I thank the authors for improving their manuscript based on my previous suggestions. I still believe the Results section is long and bogs down at times.

      In general, the conclusions drawn by the authors are justified by the data, except for the interpretation of some experiments. Importantly, this paper has discovered new antimicrobial resistance mechanisms and has also pointed to potential new targets for antimicrobials.

      Comments on revisions:

      I just wanted to thank the authors for addressing most of my previous comments.