  1. Feb 2024
    1. Author Response

      The following is the authors’ response to the original reviews.

      Summary:

      In this interesting work, the authors investigated an important topical question: when we see travelling waves in cortical activity, is this due to true wave-like spread, or due to sequentially activated sources? In simulations, it is shown that sequential brain module activation can show up as a travelling wave - even in improved methods such as phase delay maps - and a variety of parameters is investigated. Then, in ex-vivo turtle eye-brain preparations, the authors show that visual cortex waves observable in local field potentials are in fact often better explained as areas D1 and D2 being sequentially activated. This has implications for how we think about travelling wave methodology and relevant analytical tools.

      Strengths:

      I enjoyed reading the discussion. The authors are careful in their claims, and point out that some phenomena may still indeed be genuine travelling waves, but we should have a higher evidence bar to claim this for a particular process in light of this paper and Zhigalov & Jensen (2023) (ref 44). Given this careful discussion, the claims made are well-supported by the experimental results. The discussion also gives a nice overview of potential options in light of this and future directions.

      The illustration of different gaussian covariances leading to very different latency maps was interesting to see.

      Furthermore, the methods are detailed and clearly structured and the Supplementary Figures, particularly single trial results, are useful and convincing.

      We are glad the reviewer found our manuscript “interesting”, the questions we raise “important”, our claims “well-supported by the experimental results”, and our methods “detailed and clearly structured”.

      The details of the sequentially activated Gaussian simulations give some useful results, but the fundamental idea still appears to be "sequential activation is often indistinguishable from a travelling wave", an idea advanced e.g. by Zhigalov & Jensen (2023). It takes a while until the (in my opinion) more intriguing experimental results.

      To emphasize the experimental results, we swapped the order of the analytical and experimental results. Correspondingly, Figure 2 now illustrates the more intriguing experimental results and Figure 3 the analytical results. In addition, we added subtitles to the different sections of the Results to ease navigation through the paper and to let readers access the different sections more easily.

      One of the key claims is that the spikes are more consistent with two sequentially activated modules rather than a continuous wave (with Fig 3k and 3l key to support this). Whilst this is more consistent, it is worth mentioning that there seems to be stochasticity to this and between-trial variability, especially for spikes.

      In the revised manuscript we added the reviewer’s comment about stochasticity, and we discuss its possible origins:

      "The transition was also not clear when examining spiking responses in some of the trials (as indicated by high DIP scores, Figure 2K). However, the observation that temporal grouping became more pronounced when using ALSA (a more robust estimate of local excitability) (Figure 2L,N), suggests that high DIP values may result from variability in the spike times of single neurons, and not necessarily from the lack of modular activation. Such issues can be resolved by denser sampling of spiking activity in the tissue."

      Recommendations For The Authors:

      The eye-cortex turtle preparation is not the most common. I would add more context about how specific the results are to this preparation vs how comparable it is to human data.

      We added a sentence explaining the relevance of our preparation: “Finally, while the layered organization of turtle cortex is different than that of mammalian cortex, the basic excitability features of both tissues are similar (Connors and Kriegstein, 1986; Hemberger et al., 2019; Kriegstein and Connors, 1986; Larkum et al., 2008; Shein-Idelson et al., 2017b), and substantial differences in the manner by which field potentials and spikes spread through the tissue are not to be expected.”

      Philosophical question: when does a 'module' become small enough for it to count as a travelling wave? More on this could be added to the discussion. I think we are in the very early days for a true understanding of travelling waves, and I wonder if these sequentially activated modules will functionally correspond to the known cortical segregation, or if it varies by area/task.

      We agree with the reviewer that macroscopic waves could be composed of smaller modules (or single neurons at the smallest scale). Our results suggest that modular patterns can be classified as wave patterns both at large scales (of brain areas) and smaller scales of local neural circuits. Therefore, we believe it is necessary to make this distinction across different scales. We sharpened this point in the first paragraph of the discussion:

      "…We showed that LFP measurements indicative of waves propagating across turtle cortex are underlined by discrete and consecutively activated neuronal populations, and not by a continuously propagating wavefront of spikes (Figure 2). Similarly, activation profiles that resemble continuous travelling waves in EEG simulations can be underlined by consecutive activation of two discrete cortical regions (Figure 1). We replicated these results using an analytical model and demonstrated that a simple scenario of sequentially activated Gaussians can exhibit WLPs with a rich diversity of spatiotemporal profiles (Figure 3). Our results offer insight into the scenarios and conditions for WLP detection by identifying failure points that should be considered when identifying travelling waves and therefore suggest caution when interpreting continuous phase latency maps as microscopically propagating wave patterns. Such failure points may exist both when examining activity at the scale of brain regions (Figure 1) and smaller neural circuits (Figure 2). Therefore, our results suggest that the discrepancy between modular and wave activation should be examined across spatial scales. Specifically, it is not necessarily the case that at the fine grained (single neuron) scale activation patterns are modular, but, following coarse graining, smooth wave patterns emerge. Rather, modular activation may hierarchically exist across scales (Kaiser and Hilgetag, 2010; Meunier et al., 2010) and may be masked by smeared spatial supra-threshold excitability boundaries. Below we discuss these limitations across techniques and their implications.”

      I would advise the authors to focus on the experimental data, perhaps by putting the simulations second, and by putting some of the equation details that are in Methods into the Supplementary Information. Whilst the simulation parameter space is well-explored, the fundamental idea of spreading Gaussians is relatively simple, and the current manuscript organization detracted from the main message for me a little bit.

      Following the referee’s suggestion, we swapped the order of the section with experimental data and the one with the analytic model (see response to comment 1). In addition, to ease the reading of the Methods, we moved the mathematical derivation and related equations to Appendix 1.
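
      The basic idea of sequentially activated Gaussians referred to in this exchange can be illustrated with a minimal sketch. It assumes two spatially overlapping Gaussian sources that are activated one after the other, each with a Gaussian time course. The source positions, widths, activation times and the function name `latency_map` are illustrative assumptions and are not the parameters of the analytical model in the manuscript. Although only two discrete sources are active, the time of the peak changes gradually with position, which is the kind of smooth latency map that could be read as a travelling wave.

```python
import math


def gaussian(u, sigma):
    """Unnormalised Gaussian profile."""
    return math.exp(-0.5 * (u / sigma) ** 2)


def latency_map(xs, ts, x1=-1.0, x2=1.0, sigma_x=1.0,
                t1=40.0, t2=60.0, sigma_t=20.0):
    """Peak latency at each position for two sequentially activated sources.

    All parameter values are hypothetical and chosen only for illustration.
    """
    latencies = []
    for x in xs:
        # Spatial weight of each source at this recording position.
        w1 = gaussian(x - x1, sigma_x)
        w2 = gaussian(x - x2, sigma_x)
        # Measured signal: weighted sum of the two temporal profiles.
        signal = [w1 * gaussian(t - t1, sigma_t) + w2 * gaussian(t - t2, sigma_t)
                  for t in ts]
        peak_index = max(range(len(ts)), key=signal.__getitem__)
        latencies.append(ts[peak_index])
    return latencies


xs = [-4.0 + 0.5 * i for i in range(17)]   # positions from -4 to 4
ts = [0.5 * i for i in range(201)]         # time from 0 to 100 (arbitrary ms)
latencies = latency_map(xs, ts)
for x, latency in zip(xs, latencies):
    print(f"x = {x:+.1f}  peak latency = {latency:.1f}")
```

      In this sketch the smooth gradient depends on the temporal profiles overlapping (the delay between sources is no larger than twice the temporal width); with a longer delay the summed signal becomes bimodal and the latency map shows a step instead.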

      Things I thought about that you may also enjoy thinking about: Could we tell something about sequential sources vs travelling waves by the nature of the wave - e.g. shape or dispersion? If some wave properties are conserved whilst travelling, this could be evidence for travelling vs two sources.

      This is a wonderful suggestion. We are currently working on a follow up publication with a new approach to do exactly that! We think that this new body of work is outside the scope of this paper.

      Could synaptic potentials spread like waves, but spikes more in modular bursts? This would also explain the LFP vs spikes difference - maybe travelling waves of EPSPs are there priming the network, 'looking' for suitable modules to activate, which then activate sequentially. The current discussion is quite spike-focused - could some information be in synaptic potentials after all?

      This is an interesting idea with intriguing functional implications. We added this idea to our discussion (see paragraph below). In addition, to emphasize our discussion on synaptic potentials, we reorganized the paragraphs in the discussion to separate between our discussion on sub-threshold excitability (which is mostly synaptic) and supra-threshold excitability which is the focus of the second part of the discussion.

      “Variability in responses may also be explained by differences in propagation mechanisms (Ermentrout and Kleinfeld, 2001; Muller et al., 2018; Wu et al., 2008). Several reports suggest that waves are underlined by propagation along axonal collaterals (Muller et al., 2018, 2014). Both the transmembrane voltage-gated currents excited during action potentials as well as the post-synaptic currents along axonal boutons can potentially contribute to measured signals. However, such waves travel at high propagation speeds and are not compatible with the wide diversity of wave velocities and mechanisms of local neuronal interactions (Ermentrout and Kleinfeld, 2001; Feller et al., 1996). An intriguing possibility is that such axonal waves prime neuronal excitability by sub-threshold inputs that later result in modular supra-threshold activation. The ability to experimentally discriminate between axonal inputs and local spiking excitability (e.g. by reporters with different wavelengths) can potentially resolve such discrepancies.

      Our turtle cortex results (Figure 2) exemplify how contrasting sub-threshold LFP measurements with supra-threshold spiking measurements can yield different conclusions about the nature of activity spread….”

    1. Author Response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      […] While this does not rule out criticality in the brain, it decidedly weakens the evidence for it, which was based on the following logic: critical systems give rise to power law behavior; power law behavior is observed in cortical networks; therefore, cortical networks operate near a critical point. Given, as shown in this paper, that power laws can arise from noncritical processes, the logic breaks. Moreover, the authors show that criticality does not imply optimal information transmission (one of its proposed functions). This highlights the necessity for more rigorous analyses to affirm criticality in the brain. In particular, it suggests that attention should be focused on the question "does the brain implement a dynamical latent variable model?".

      These authors are not the first to show that slowly varying firing rates can give rise to power law behavior (see, for example, Touboul and Destexhe, 2017; Priesemann and Shriki, 2018). However, to our knowledge they are the first to show crackling, and to compute information transmission in the critical state.

      We thank the reviewers for their thoughtful assessment of our paper.

      We would push back on the assessment that our model ‘has nothing to do with criticality,’ and that we observed ‘signatures of criticality [that] emerge through fundamentally non-critical mechanisms.’ This assessment partially stems from the definition of criticality provided in the Public Comment, that ‘criticality is a very specific set of phenomena in physics in which fundamentally local interactions produce unexpected long-range behavior.’

      Our disagreement is largely focused on this definition, which we do not think is a standard definition. Taking the favorite textbook example, the Ising model, criticality is characterized by a set of power-law divergences in thermodynamic quantities (e.g., susceptibility, specific heat, magnetization) at the critical temperature, with exponents of these power laws governed by scaling laws. It is not defined by local interactions. The all-to-all Ising model is generally viewed as showing critical behavior at a certain temperature, even though interactions there are manifestly non-local. It is possible that, by “local” in the definition, the Public Comment meant that interactions are “collective” and among microscopic degrees of freedom. However, that same all-to-all Ising model is mathematically equivalent to the mean-field model, where criticality is achieved through large fluctuations of the mean field, but not through microscopic interactions.

      More commonly, criticality is defined by power laws and scaling relationships that emerge at a critical value of a parameter(s) of the system. That is, criticality is defined by its signatures. What is crucial in all such definitions is that this atypical, critical state requires fine tuning. For example, in the textbook example of the Ising model, a parameter (the temperature) must be tuned to a critical value for critical behavior to appear. In the branching process model that generates avalanche criticality, criticality requires tuning m=1. The key result of our paper is that all signatures expected for avalanche criticality (power laws, crackling, and, as shown below, estimates of the branching rate m), and hence the criticality itself, appear without fine-tuning.
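
      The fine-tuning requirement of the branching process can be made concrete with a small simulation. The sketch below is an illustration only: the two-attempt offspring rule, the size cap and the function names are assumptions introduced here and are not taken from the paper. For a branching ratio m < 1 the mean avalanche size stays close to 1/(1 − m), whereas at m = 1 the sizes become heavy-tailed and the sample mean is limited only by the cap.

```python
import random


def avalanche_size(m, rng, cap=10_000):
    """Total number of activations in one avalanche started by a single unit."""
    active, size = 1, 1
    while active > 0 and size < cap:
        # Each active unit makes two attempts, each succeeding with
        # probability m / 2, so the mean number of descendants is m.
        active = sum(1 for _ in range(2 * active) if rng.random() < m / 2)
        size += active
    return min(size, cap)


def mean_size(m, n_avalanches=5000, seed=0):
    """Sample mean of the (capped) avalanche size for branching ratio m."""
    rng = random.Random(seed)
    return sum(avalanche_size(m, rng) for _ in range(n_avalanches)) / n_avalanches


subcritical = mean_size(0.5)   # theory for m < 1: 1 / (1 - m) = 2
critical = mean_size(1.0)      # no finite mean; bounded only by the cap
print(f"m = 0.5: mean size {subcritical:.2f}")
print(f"m = 1.0: mean size {critical:.2f}")
```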

      As we discussed in our introduction, there are a few other instances of signatures of criticality (and hence of criticality itself) emerging without fine-tuning. The first we are aware of was the demonstration of Zipf’s Law (by Schwab, et al. 2014, and Aitchison et al. 2016), a power-law relationship between rank and frequency of states, which was shown to emerge generically in systems driven by a broadly distributed latent variable. A second example, arising from applications of coarse-graining analysis to neural data (cf., Meshulam et al. 2019; also, Morales et al., 2023), was demonstrated in our earlier paper (Morrell et al. 2021). Thus, here we have a third example: the model in this paper generates signatures of criticality in the statistics of avalanches of activity, and it does so without fine-tuning (cf., Fig. 2-3).

      The rate at which these ‘criticality without fine-tuning' examples are piling up may inspire revisiting the requirement of fine-tuning in the definition of criticality, and our ongoing work (Ngampruetikorn et al. 2023) suggests that criticality may be more accurately defined through large fluctuations (variance > 1/N) rather than through fine-tuning or scaling relations.

      References:

      • Schwab DJ, Nemenman I, Mehta P. “Zipf’s Law and Criticality in Multivariate Data without Fine-Tuning.” Phys Rev Lett. 2014 Aug; doi: 10.1103/PhysRevLett.113.068102.

      • Aitchison L, Corradi N, Latham PE. “Zipf’s Law Arising Naturally When There Are Underlying, Unobserved Variables.” PLOS Computational Biology. 2016 12; 12(12):1-32. doi: 10.1371/journal.pcbi.1005110.

      • Meshulam L, Gauthier JL, Brody CD, Tank DW, Bialek W. “Coarse Graining, Fixed Points, and Scaling in a Large Population of Neurons.” Phys Rev Lett. 2019 Oct; doi: 10.1103/PhysRevLett.123.178103.

      • Morales GB, di Santo S, Muñoz MA. “Quasiuniversal scaling in mouse-brain neuronal activity stems from edge-of-instability critical dynamics.” Proceedings of the National Academy of Sciences. 2023; 120(9):e2208998120.

      • Morrell MC, Sederberg AJ, Nemenman I. “Latent Dynamical Variables Produce Signatures of Spatiotemporal Criticality in Large Biological Systems.” Phys Rev Lett. 2021 Mar; doi: 10.1103/PhysRevLett.126.118302.

      • Ngampruetikorn V, Nemenman I, Schwab D. “Extrinsic vs Intrinsic Criticality in Systems with Many Components.” arXiv:2309.13898 [physics.bio-ph].

      Major comments:

      1) For many readers, the essential messages of the paper may not be immediately clear. For example, is the paper criticizing the criticality hypothesis of cortical networks, or does the criticism extend deeper, to the theoretical predictions of "crackling" relationships in physical systems as they can emerge without criticality? Statements like "We show that a system coupled to one or many dynamical latent variables can generate avalanche criticality ..." could be misinterpreted as affirming criticality. A more accurate language is needed; for instance, the paper could state that the model generates relationships observed in critical systems. The paper should provide a clearer conclusion and interpretation of the findings in the context of the criticality hypothesis of cortical dynamics.

      Please see the response to the Public Review, above. To clarify the essential message that the dynamical latent variable model produces avalanche criticality without fine-tuning, we have made revisions to the abstract and introduction. This point was already made in the discussion (first sentence).

      Key sentences changed in the abstract:

      "… We find that populations coupled to multiple latent variables produce critical behavior across a broader parameter range than those coupled to a single, quasi-static latent variable, but in both cases, avalanche criticality is observed without fine-tuning of model parameters. … Our results suggest that avalanche criticality arises in neural systems in which activity is effectively modeled as a population driven by a few dynamical variables and these variables can be inferred from the population activity."

      In the introduction, we changed the final sentence to read:

      "These results demonstrate how criticality in neural recordings can arise from latent dynamics in neural activity, without need for fine-tuning of network parameters."

      2) On lines 97-99, the authors state that "We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from recurrent dynamics locally". This idea is also repeated at the beginning of the Summary section. Perhaps being agnostic isn't such a good idea: it's possible that the recurrent dynamics is in a critical regime, which would just push the problem upstream. Presumably you're thinking of recurrent dynamics with slow timescales that's not critical? Or are you happy if it's in the critical regime? This should be clarified.

      We have amended this sentence to clarify that any latent dynamics with large fluctuations would suffice:

      ”We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from large fluctuations in local recurrent dynamics.”

      3) Even though the model in Equation 2 has been described in a previous publication and the Methods section, more details regarding the origin and justification of this model in the context of cortical networks would be helpful in the Results section. Was it chosen just for simplicity, or was there a deeper reason?

      This model was chosen for its simplicity: there are no direct interactions between neurons, coupling between neurons and latent variables is random, and simulation is straightforward. More complex latent dynamics or non-random structure in the coupling matrices could have been used, but our aim was to explore this model in the simplest setting possible.
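
      A minimal sketch of this kind of model is given below, under stated assumptions. The logistic form of the firing probability, the Gaussian couplings, the discretised Ornstein-Uhlenbeck update and all parameter values and function names are illustrative choices made here; they are not the exact equations or settings of the paper. The sketch simulates non-interacting binary neurons driven by a few slow latent variables and then splits the summed activity into avalanches, defined as runs of non-silent bins bounded by silent bins.

```python
import math
import random


def simulate_population(n_neurons=50, n_latent=5, n_steps=2000,
                        tau=500.0, epsilon=4.0, eta=2.0, seed=0):
    """Population spike count per time bin for non-interacting binary neurons."""
    rng = random.Random(seed)
    # Random coupling of every neuron to every latent variable (assumed Gaussian).
    coupling = [[rng.gauss(0.0, 1.0) for _ in range(n_latent)]
                for _ in range(n_neurons)]
    # Discretised Ornstein-Uhlenbeck update with unit stationary variance.
    decay = math.exp(-1.0 / tau)
    noise = math.sqrt(1.0 - decay ** 2)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n_latent)]
    counts = []
    for _ in range(n_steps):
        latent = [decay * h + noise * rng.gauss(0.0, 1.0) for h in latent]
        count = 0
        for row in coupling:
            # Assumed logistic form: larger epsilon lowers the firing probability.
            field = epsilon - eta * sum(j * h for j, h in zip(row, latent))
            if rng.random() < 1.0 / (1.0 + math.exp(field)):
                count += 1
        counts.append(count)
    return counts


def extract_avalanches(counts):
    """Sizes and durations of runs of non-silent bins."""
    sizes, durations = [], []
    size = duration = 0
    for count in counts:
        if count > 0:
            size += count
            duration += 1
        elif duration > 0:
            sizes.append(size)
            durations.append(duration)
            size = duration = 0
    if duration > 0:
        # An avalanche still running at the end of the record is kept
        # here for simplicity; an analysis might prefer to drop it.
        sizes.append(size)
        durations.append(duration)
    return sizes, durations


counts = simulate_population()
sizes, durations = extract_avalanches(counts)
print(f"{len(sizes)} avalanches, largest size {max(sizes, default=0)}")
```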

      We have revised the Results (“Avalanche scaling in a dynamical latent variable model,” first paragraph) to justify the choice of the model:

      "We study a model of a population of neurons that are not coupled to each other directly but are driven by a small number of dynamical latent variables -- that is, slowly changing inputs that are not themselves measured (Fig.~\ref{fig:fig1}A). We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from large fluctuations in local recurrent dynamics. The model was chosen for its simplicity, and because we have previously shown that this model with at least about five latent variables can produce power laws under the coarse-graining analysis \citep{Morrell2021}."

      We have added the following to the beginning of the Methods section expanding on the reasons for this choice:

      "We study a model from Morrell 2021, originally constructed as a model of large populations of neurons in mouse hippocampus. Neurons are non-interacting, receiving inputs reflective of place-field selectivity as well as input current arising from a random projection from a small number of dynamical latent variables, representing inputs shared across the population of neurons that are not directly measured or controlled. In the current paper, we incorporate only the latent variables (no place variables), and we assume that every cell is coupled to every latent variable with some randomly drawn coupling strength."

      4) The Methods section (paragraph starting on line 340) connects the time scale to actual time scales in neuronal systems, stating that "The timescales of latent variables examined range from about 3 seconds to 3000 seconds, assuming 3-ms bins". While bins of 3 ms are relevant for electrophysiological data from LFPs or high-density EEG/MEG, time scales above 10 seconds are difficult to generate through biophysically clear processes like ionic channels and synaptic transmission. The paper suggests that slow time scales of the latent variables are crucial for obtaining power law behavior resembling criticality. Yet, one way to generate such slow time scales is via critical slowing down, implying that some brain areas providing input to the network under study may operate near criticality. This pushes the problem toward explaining the criticality of those external networks. Hence, discussing potential sources for slow time scales in latent variables is crucial. One possibility you might want to consider is sources external to the organism, which could easily have time scales in the 1-24 hour range.

      As the reviewers note, it is a possibility that slow timescales arise from some other brain area in which dynamics are slow due to critical dynamics, but many other plausible sources exist. These include slowly varying sensory stimuli or external sources, as suggested by the reviewers. It is also possible to generate “effective” slow dynamics from non-critical internal sources. One example, from recordings in awake mice, is the slow change in the level of arousal that occurs on the scale of many seconds to minutes. These changes arise from release of neuromodulators that have broad effects on neural populations and correlations in activity (for a focused review, see Poulet and Crochet, 2019).

      We have added the following sentence to the Methods section where timescales of latent variables were discussed:

      "The timescales of latent variables examined range from about $3$ seconds to $3000$ seconds, assuming $3$-ms bins. Inputs with such timescales may arise from external sources, such as sensory stimuli, or from internal sources, such as changes in physiological state."

      5) It is common in neuronal avalanche analysis to calculate the branching parameter using the ratio of events in consecutive bins. Near-critical systems should display values close to 1, especially in simulations without subsampling. Including the estimated values of the branching parameter for the different cases investigated in this study could provide more comprehensive data. While the paper acknowledges that the obtained exponents in the model differ from those in a critical branching process, it would still be beneficial to offer the branching parameter of the observed avalanches for comparison.

      The reviewers requested that the branching parameter be computed in our model. We point out that, for the quasi-stationary latent variables (as in Fig. 3), a branching parameter of 1 is expected because the summed activity at time t+k is, on average, equal to the summed activity at time t, regardless of k. Numerics are consistent with this expectation. Following the methodology for an unbiased estimate of the branching parameter from Wilting and Priesemann (2018), we checked an example set of parameters (epsilon = 8, eta = 3) for quasi-stationary latent fields. We found that the naïve (biased) estimate of the branching parameter was 0.94, and that the unbiased estimator was exp(−1.4⋅10^{−8}) ≈ 0.999999986.

      For faster time scales, it is no longer true that summed activity is constant over time, as the temporal correlations in activity decay exponentially. Using the five-field simulation from Figure 2, we calculated the branching parameter for several values of tau. The biased estimates of m are 0.76 (𝜏=50), 0.79 (𝜏=500), and 0.79 (𝜏=5000). The corrected estimates are 0.98 (𝜏=50), 0.998 (𝜏=500), and 0.9998 (𝜏=5000).
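
      The difference between a naïve and a corrected estimate can be sketched as follows. This is a simplified illustration in the spirit of the multistep regression idea of Wilting and Priesemann (2018), not their reference implementation and not the authors' analysis code: the test signal is an assumed driven branching process observed through random subsampling, the log-linear fit of the lagged regression slopes is a simplification, and all names and parameter values are hypothetical. The naïve estimate is the slope of activity at t+1 against activity at t; the corrected estimate uses how the slopes decay with lag, r_k ≈ b·m^k, so the bias factor b drops out.

```python
import math
import random


def simulate_subsampled(m=0.9, alpha=0.5, n_steps=30000, seed=1):
    """Driven branching process observed through random subsampling."""
    rng = random.Random(seed)
    active = 20
    observed = []
    for _ in range(n_steps):
        # Two attempts per active unit, each with probability m / 2 (mean m).
        offspring = sum(1 for _ in range(2 * active) if rng.random() < m / 2)
        # External drive with mean 2 keeps the process from dying out.
        drive = sum(1 for _ in range(4) if rng.random() < 0.5)
        active = offspring + drive
        # Each active unit is seen with probability alpha.
        observed.append(sum(1 for _ in range(active) if rng.random() < alpha))
    return observed


def regression_slope(series, lag):
    """Least-squares slope of series[t + lag] against series[t]."""
    x, y = series[:-lag], series[lag:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def multistep_estimate(series, max_lag=10):
    """Fit r_k = b * m**k on a log scale and return m.

    math.log raises ValueError if a slope is not positive, which signals
    that max_lag is too large for the amount of data.
    """
    lags = list(range(1, max_lag + 1))
    logs = [math.log(regression_slope(series, k)) for k in lags]
    mean_k, mean_l = sum(lags) / len(lags), sum(logs) / len(logs)
    slope = (sum((k - mean_k) * (l - mean_l) for k, l in zip(lags, logs))
             / sum((k - mean_k) ** 2 for k in lags))
    return math.exp(slope)


activity = simulate_subsampled()
m_naive = regression_slope(activity, 1)
m_corrected = multistep_estimate(activity)
print(f"naive estimate {m_naive:.3f}, multistep estimate {m_corrected:.3f}")
```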

      6) In the Discussion (l 269), the paper suggests potential differences between networks cultured in vitro and in vivo. While significant differences indeed exist, it's worth noting that exponents consistent with a critical branching process have also been observed in vivo (Petermann et al 2009; Hahn et al. 2010), as well as in large-scale human data.

      We thank the reviewers for pointing out these studies, and we have added the missing one (Hahn et al. 2010) to our reference list. The following was added to the discussion, in the section “Explaining Experimental Exponents:”

      "A subset of the in vivo recordings analyzed from anesthetized cat (Hahn et al. 2010) and macaque monkeys (Petermann et al. 2009) exhibited a size distribution exponent close to 1.5."

      Along these lines, we noted two additional studies of high relevance that have been published since our initial submission (Capek et al. 2023, Lombardi et al. 2023), and we have added these references to the discussion of experimental exponents.

      Minor comments:

      1) The term 'latent variable' should be rigorously explained, as it is likely to be unfamiliar to some readers.

      Sentences and clauses have been added to the Introduction, Results and the Methods to clarify the term:

      Intro: “Numerous studies have reported relatively low-dimensional structure in the activity of large populations of neurons [refs], which can be modeled by a population of neurons that are broadly and heterogeneously coupled to multiple dynamical latent (i.e., unobserved) variables.”

      Results: “We studied a population of neurons that are not coupled to each other directly but are driven by a small number of dynamical latent variables -- that is, slowly changing inputs that are not themselves measured.”

      Methods: “Neurons are non-interacting, receiving inputs reflective of place-field selectivity as well as input current reflecting a random projection from a small number of dynamical latent variables, representing inputs shared across the population of neurons that are not directly measured.”

      2) There's a relatively important typo in the equations: Eq. 2 and Eq. 6 differ by a minus sign in the exponent. Eqs. 3 and 4 use the plus sign, but epsilon_0 on line 198 uses the minus sign. All very confusing until we figured out what was going on. But easy to fix.

      Thank you for catching this. We have made the following corrections:

      1) Figures adopted the sign convention that epsilon > 0, with larger values of epsilon decreasing the activity level. Signs in Eqs. 3 and 4 have been corrected to match.

      2) Equation 5 was missing a minus sign in front of the Hamiltonian. Restoring this minus sign fixed the discrepancy between 2 and 6.

      3) In Eq. 7, the left hand side is zeta'/zeta', which is equal to 1. Maybe it should be zeta'/zeta?

      Fixed, thank you.

      Additional comments:

      The authors are free to ignore these; they are meant to improve the paper.

      We are extremely grateful for the close reading of our paper and note the actions taken below.

      1) We personally would not use the abbreviation DLV; we find abbreviations extremely hard to remember. And DLV is not used that often.

      Done, thank you for the suggestion.

      2) l 198: epsilon_0 = -log(2^{1/N}-1) was kind of hard to picture -- we had to do a little algebra to make sense of it. Why not write e^{-epsilon_0} = 2^{1/N}-1 \approx log(2)/N, which in turn implies that epsilon_0 ~ log(N)?
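
      The approximation suggested in this comment can be written out step by step. The derivation below uses only the expression for $\epsilon_0$ quoted in the comment and a first-order expansion of the exponential for large $N$:

```latex
\begin{align}
e^{-\epsilon_0} &= 2^{1/N} - 1 = e^{(\log 2)/N} - 1
                 = \frac{\log 2}{N} + O\!\left(N^{-2}\right), \\
\epsilon_0 &= -\log\!\left(2^{1/N} - 1\right)
            = \log N - \log(\log 2) + O\!\left(N^{-1}\right)
            \approx \log N + 0.37 .
\end{align}
```

      Hence $\epsilon_0$ grows logarithmically with the population size, $\epsilon_0 \sim \log N$.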

      Thank you, good point. We have added a sentence now to better explain:

      "...which is maximized at $\epsilon_0 = - \log (2^{1/N} - 1)$, independent of $J_i$ and $\eta$. After some algebra, we find that $\epsilon_0 \sim \log N$ for large $N$."

      3) Typo on l 202: "We plot P_ava as a function of epsilon in Fig. 4B". 4B --> 4D.

      Done

      4) It would be easier on the reader if the tables were all in one place. It would be even nicer to put the parameters in the figure captions. Or at least N; that one is kind of important.

      Table placement was a Latex issue, which we have now fixed. We also have included links between tables and relevant figures and indicated network size.

      5) What's x_i in Eqs. 7 and 8?

      We added a sentence of explanation. These are the individual observations of avalanche sizes or durations, depending on what is being fit.
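
      To show how individual observations $x_i$ enter such a fit, the sketch below estimates a power-law exponent by maximum likelihood from a list of avalanche sizes. It is an illustration under assumptions: the support is truncated to a finite range so that a plain sum stands in for the zeta function, the root is found by bisection, and the function name and parameter values are hypothetical rather than those of the paper. The condition solved is that the model's expected value of log x equals the sample mean of log x_i, which has the same derivative-over-normaliser form as the ratio mentioned for Eq. 7.

```python
import math
import random


def fit_exponent(samples, x_min, x_max, lo=1.0, hi=5.0, n_iter=60):
    """Maximum-likelihood exponent of P(x) ~ x**(-alpha) on x_min..x_max."""
    log_k = [math.log(k) for k in range(x_min, x_max + 1)]
    mean_log = sum(math.log(x) for x in samples) / len(samples)

    def excess(alpha):
        # Model expectation of log x minus the sample mean of log x_i.
        weights = [math.exp(-alpha * lk) for lk in log_k]
        expected = sum(w * lk for w, lk in zip(weights, log_k)) / sum(weights)
        return expected - mean_log

    # The expectation decreases with alpha, so bisection finds the root.
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic "avalanche sizes" drawn from a known exponent for checking.
rng = random.Random(0)
support = list(range(1, 1001))
true_alpha = 1.5
weights = [k ** -true_alpha for k in support]
samples = rng.choices(support, weights=weights, k=5000)
alpha_hat = fit_exponent(samples, x_min=1, x_max=1000)
print(f"true exponent {true_alpha}, fitted exponent {alpha_hat:.3f}")
```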

      6) The latent variables evolve according to an Ornstein-Uhlenbeck process. But we might equally expect oscillations or non-normal behavior coupling dynamical modes, and these are likely to give different behavior with respect to avalanches. It might be worth commenting on this.

      7) The model assumes a normal distribution of the coupling strengths between the latent variables and the binary units. Discussing the potential effects of different types of random coupling could provide interesting insights.

      Both 6 and 7 are interesting questions. At this point, we could speculate that the main results would be qualitatively unchanged, provided dynamics are sufficiently slow and that the distribution of coupling strengths is sufficiently broad (that is, there is variance in the coupling matrix across individual neurons). Further studies would be needed to make these statements more precise.

      8) In Fig 1, tau_f = 1E4 whereas in Fig 2 tau_f = 5E3. Why the difference?

      For Figure 1, we chose a set of parameters that gave clear scaling. In Figure 2, we saw some value in showing more than one example of scaling, hence different parameters for the examples in Fig 2 than Fig 1. Note that the Fig 1 simulations are represented in Fig. 2 G-J, as the 5-field simulation with tau_F = 1e4.

  2. Jan 2024
    1. Author Response

      eLife assessment

      This study presents a valuable finding on a new role of Foxp3+ regulatory T cells in sensory perception, which may have an impact on our understanding of somatosensory perception. The authors identified a previously unappreciated action of enkephalins released by immune cells in the resolution of pain and several upstream signals that can regulate the expression of the proenkephalin gene PENK in Foxp3+ Tregs. However, whereas the generation of transgenic mice with conditional deletion of PENK in Foxp3+ cells and PENK fate-mapping is novel and generates compelling data, the analysis of Tregs in the control and transgenic mice is incomplete, and the study includes neither proper tamoxifen controls nor an examination of the role of PENK+ skin T cells to further support the hypothesis. Nonetheless, the study would be of interest to biologists working in the field of neuroimmunology and inflammation.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors explore mechanisms through which T-regs attenuate acute pain using a heat sensitivity paradigm. Analysis of available transcriptomic data revealed expression of the proenkephalin (Penk) gene in T-regs. The authors explore the contribution of T-reg Penk in the resolution of heat sensitivity.

      Strengths:

      Investigating the potential role of T-reg Penk in the resolution of acute pain is a strength.

      Weaknesses:

      The overall experimental design is superficial and lacks sufficient rigor to draw any meaningful conclusions.

      For instance:

      1) There were no TAM controls. What is the evidence that TAM does not alter heat-sensitive receptors?

      Author response: By comparing panel A and C, it appears that heat-sensitivity in controls (blue dots) is slightly different before and after TMX administration, suggesting that heat-sensitive receptors are moderately altered by TMX per se. However, heat sensitivity is increased twofold in KO animals. Thus, a possible effect of TAM on heat receptors is not responsible for the heat hyperalgesia seen in KO, as shown in figure 4 and S3.

      2) There are no controls demonstrating that recombination actually occurred. How do the authors know a single dose of TAM is sufficient?

      Author response: These experiments are in progress. Specificity of the deletion will be presented in an updated version of the manuscript in the near future.

      3) Why was only heat sensitivity assessed? The behavioral tests are inadequate to derive any meaningful conclusions. Further, why wasn't the behavioral data plotted longitudinally?

      Author response: We respectfully point the reviewer to figure S3, where the longitudinal data are presented. New behavioral tests are being performed. The results will be presented in a revised version.

      Reviewer #2 (Public Review):

      Summary:

      The present study addresses the role of enkephalins, which are specifically expressed by regulatory T cells (Treg), in sensory perception in mice. The authors used a combination of transcriptomic databases available online to characterize the molecular signature of Treg. The proenkephalin gene Penk is among the most enriched transcripts, suggesting that Treg plays an analgesic role through the release of endogenous opioids. In addition, in silico analysis suggests that Penk is regulated by the TNFR superfamily; this being experimentally confirmed. Using flow cytometry analysis, the authors then show that Penk is mostly expressed in Treg of the skin and colon, compared to other immune cells. Finally, genetic conditional excision of Penk, selectively in Treg, results in heat hypersensitivity, as assessed by behavior analysis.

      Strengths:

      The manuscript is clear and reveals a previously unappreciated role of enkephalins, as released by immune cells, in sensory perception. The rationale in this manuscript is easy to follow, and conclusions are well supported by data.

      Weaknesses:

      The sensory deficit of Penk cKO appears to be quite limited compared to control littermates.

      Reviewer #3 (Public Review):

      Summary:

      Aubert et al investigated the role of PENK in regulatory T cells. Through the mining of publicly available transcriptome data, the authors confirmed that PENK expression is selectively enriched in regulatory but not conventional T cells. Further data mining suggested that OX40, 4-1BB as well as BATF, can regulate PENK expression in Tregs. The authors generated fate-mapping mice to confirm selective PENK expression in Tregs and activated effector T cells in the colon and spleen. Interestingly, transgenic mice with conditional deletion of PENK in Tregs resulted in hypersensitivity to heat, which the authors attributed to heat hyperalgesia.

      Strengths:

      The generation of transgenic mice with conditional deletion of PENK in foxp3 and PENK fate-mapping is novel and can potentially yield significant findings. The identification of upstream signals that regulate PENK is interesting but unlikely to be the main reason why PENK is predominantly expressed in Tregs as both BATF and TNFR are expressed in effector T cells.

      Weaknesses:

      There is a lack of direct evidence and detailed analysis of Tregs in the control and transgenic mice to support the authors' hypothesis. PENK was previously reported to be expressed in skin Tregs and play a significant role in regulating skin homeostasis: this should be considered as an alternative mechanism that may explain the changed sensitivity to heat observed in the paper.

      Author response: Supplementary figures are being prepared and new results are being collected to show that the KO does not perturb immune and/or skin homeostasis at the time of the experiments. These will be presented in a revised version.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS. That said, there are some points where I feel the authors could have taken their care a bit further and, as a result, inform the community even more about what is in their data.

      We thank the reviewer for this globally positive evaluation of our work, and we appreciate the advice to improve our manuscript.

      The introduction sets up the contrast of 'ecological' (mostly foraging) and social variables of a primate's life that can be reflected in the relative size of brain regions. This debate is for a large part a relic of the literature and the authors themselves state in a number of places that perhaps the contrast is a bit artificial. I feel that they could go further in this. Social behavior could easily be a solution to foraging problems, making them variables that are not in competition, but simply different levels of explanation. This point has been made in some of the recent work by Robin Dunbar and Susanne Shultz.

      Thank you for this constructive comment. We acknowledge that the contrast between social vs ecological brain is relatively marginal here. Based also on the first remark by reviewer 3, we have reformulated the introduction to emphasize what we think is actually more critical: the link between cognitive functions as defined in laboratory conditions and socio-ecological variables measured in natural conditions, and the fact that we use brain measures here as a potential tool to relate these laboratory vs natural variables through a common scenario. Also, we were already mentioning the potential interaction between social and foraging processes in the discussion, but we are happy to add a reference to recent studies by S. Shultz and R. Dunbar (2022), which is indeed directly relevant. We thank the reviewer for pointing out this literature.

      In a similar vein, the hypotheses of relating frontal pole to 'meta-cognition' and dorsolateral PFC to 'working memory' is a dramatic oversimplification of the complexity of cognitive function and does a disservice to the careful approach of the rest of the manuscript.

      We agree that the formulation of which functions we were attributing to the distinct brain regions might not have been clear enough, but the functional relation between frontal pole and metacognition on the one hand, and DLPFC and working memory on the other hand, has been firmly established in the literature, both through laboratory studies and through clinical data. Clearly, no single brain region is necessary and sufficient for any cognitive operation, but decades of neuropsychology have demonstrated the differential implication of distinct brain regions in distinct functions, which is all we mean here. We have made a specific point on that topic in the discussion (cf p. 16). We have also reformulated the introduction to clarify that, even if the relation between these regions and their functions (FP/ metacognition; DLPFC/ working memory) was clear in laboratory conditions, it was not clear whether this mapping could be used for real life conditions. Whether that simplification was somehow justified beyond the lab (and the clinic), and whether these neuro-cognitive concepts could be applied to natural conditions, are therefore critical questions that we wanted to address. The central goal of the present study was precisely to evaluate the extent to which this brain/cognition relation could be used to understand more natural behaviors and functions, and we hope that it appears more clearly now.

      One can also question the predicted relationship between frontal pole meta-cognition and social abilities versus foraging, as Passingham and Wise show in their 2012 book that it is frontal pole size that correlates with learning ability-an argument that they used to relate this part of the brain to foraging abilities. I would strongly suggest the authors refrain from using such descriptive terms. Why not simply use the names of the variables actually showing significant correlations with relative size of the areas?

      We basically agree with the reviewer, and we acknowledge the lack of clarity in the introduction of the previous manuscript. There was indeed a lot of ambiguity in what we were referring to as ‘function’, associated with a given brain region. “Function” referred to way too many things! We have reformulated the introduction not only to clarify the different types of functions that were attributed to distinct brain regions in the literature but also to clarify how this study was addressing the question: by trying to articulate concepts from neuroscience laboratory studies with concepts from behavioral ecology and evolution using intuitive scenarios. We hope that the present version of the introduction makes that point clearer.

      The major methodological judgements in this paper are of course in the delineation of the frontal pole and dorsolateral prefrontal cortex. As I said above, I appreciate how carefully the authors describe their anatomical procedure, allowing researchers to replicate and extend their work. They are also careful not to relate their regions of interest to precise cytoarchitectonic areas, as such a claim would be impossible to make without more evidence. That said, there is a judgement call made in using the principal sulcus as a boundary defining landmark for FP in monkeys and the superior frontal sulcus in apes. I do not believe that these sulci are homologous. Indeed, the authors themselves go on to argue that dorsolateral prefrontal cortex, where studied using cytoarchitecture, stretches to the fundus of principal sulcus in monkeys, but all the way to the inferior frontal sulcus in apes. That means that using the fundus of PS is not a good landmark.

      We thank the reviewer for his kind remarks on our careful descriptions. It is not clear, however, whether our choice of using the principal sulcus as a boundary for FP in monkeys vs the superior frontal sulcus in apes is actually a judgement call. First and foremost, there is no clear and unambiguous definition of what the boundaries of the FP should be. Cytoarchitectonic maps could provide one, but this is clearly out of reach here. In humans and great apes we used Bludau et al 2014 (i.e. sup frontal sulcus), and in monkeys, we chose a conservative landmark that eliminated area 9, which is traditionally associated with the DLPFC (Petrides, 2005; Petrides et al, 2012; Semendeferi et al, 2001).

      Of course, any definition will attract criticism, so the best solution might be to run the analysis multiple times, using different definitions for the areas, and see how this affects results.

      Indeed, functional maps indicate that the dorsal part of anterior PFC in monkeys is functionally part of FP. But again, cytoarchitectonic maps also indicate that this part of the brain includes BA 9, which is traditionally associated with DLPFC (Petrides, 2005; Petrides et al, 2012). As already pointed out in the discussion, there is a functional continuum between FP and DLPFC and our goal when using PS as dorsal border was to be very conservative and to exclude the ambiguous area. But we agree with the reviewer that given that this decision is arbitrary, it was worth exploring other definitions of the FP volume. So, we did complete a new analysis with a less conservative definition of the FP, to include this ambiguous dorsal area, and it is now included in the supplementary material. Maybe as expected, including the ambiguous area in the FP volume shifted the relation with socio-ecological variables towards the pattern displayed by the DLPFC (i.e. the influence of population density decreased). The most parsimonious interpretation of this result is that when extending the border of the FP region to cover a part of the brain which might belong to the DLPFC, or which might be somehow functionally intermediate between the two, the specific relation of the FP with socio-ecological variables decreases. Thus, even if we agree that it was important to conduct this analysis, we believe that it only confirms the difficulty of identifying a clear boundary between FP and DLPFC. Again, we have clearly explained throughout the manuscript that we admit the lack of precision in our definitions of the functional brain regions. In that frame, the conservative option seems more appropriate and, for the sake of clarity, the results of the additional analysis of a FP volume that includes the ambiguous area are included only in the supplementary material.

      If I understand correctly, the PGLS was run separately for the three brain measures (whole brain, FP, DLPFC). However, given that the measures are so highly correlated, is there an argument for an analysis that allows testing on residuals? In other words, to test effects of relative size of FP and DLPFC over and above brain size?

      Generally, using residuals as “data” (or pseudo-data) is not recommended in statistical analyses. Two widely cited references from the ecological literature are:

      Garcia-Berthou E. (2001) On the Misuse of Residuals in Ecology: Testing Regression Residuals vs. the Analysis of Covariance. Journal of Animal Ecology, 70 (4): 708-711.

      Freckleton RP. (2002). On the misuse of residuals in ecology: regression of residuals vs. multiple regression. Journal of Animal Ecology 71: 542–545. https://doi.org/10.1046/j.1365-2656.2002.00618.x.

      The main reason for this recommendation is that residuals are dependent on the fitted model, and thus on the particular sample under consideration and the eventual significant effects that can be inferred.
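
      The concern about residuals can be illustrated with a small simulation. The following is a minimal sketch on synthetic data, not the analysis used in the study: the variables `x1` and `x2`, their correlation (about 0.8), and the true effect of `x2` (0.5) are all assumptions chosen for illustration, and plain least squares stands in for PGLS. It compares the two-step approach (regress the residuals of `y ~ x1` on `x2`) with a single multiple regression.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X, with an intercept prepended."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def compare_residual_vs_multiple(n=5000, seed=0):
    """Return (slope from regression of residuals, slope from multiple regression)."""
    rng = np.random.default_rng(seed)
    # Hypothetical stand-ins: x1 ~ overall brain size, x2 ~ a correlated predictor.
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)  # corr(x1, x2) is about 0.8
    y = 1.0 * x1 + 0.5 * x2 + 0.5 * rng.normal(size=n)  # true x2 effect = 0.5

    # Two-step approach: remove x1 from y, then regress the residuals on x2.
    b0, b1 = ols(x1, y)
    residuals = y - (b0 + b1 * x1)
    slope_resid = ols(x2, residuals)[1]

    # One-step approach: estimate both effects jointly.
    slope_multi = ols(np.column_stack([x1, x2]), y)[2]
    return slope_resid, slope_multi


slope_resid, slope_multi = compare_residual_vs_multiple()
print(f"residual regression slope: {slope_resid:.2f}")
print(f"multiple regression slope: {slope_multi:.2f}")
```

      Under these assumptions the two-step slope is expected to shrink toward roughly 0.5 × (1 − 0.8²) = 0.18, whereas the joint fit should recover a value close to 0.5, which is the bias discussed by Freckleton (2002).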

      In the discussion and introduction, the authors discuss how size of the area is a proxy for number of neurons. However, as shown by Herculano-Houzel, this assumption does not hold across species. Across monkeys and apes, for instance, there is a difference in how many neurons can be packed per volume of brain. There is even earlier work from Semendeferi showing how frontal pole especially shows distinct neuron-to-volume ratios.

      We appreciate the reviewer’s comment, but the references to Herculano-Houzel that we have in mind do indicate that the assumption is legitimate within primates.

      Herculano-Houzel et al (2007) show that the neuronal density of the cortex is well conserved across primate species (but only monkeys were studied). The conclusion of that study is that using volumes as a proxy for number of neurons, as a measure of computational capacity, should be avoided between rodents and primates (and as they showed later, even more so with birds, for which neuronal density is higher). BUT within primates, since neuronal densities are conserved, volume is a good predictor of number of neurons. Gabi et al (2016) provide evidence that the neuronal density of the PFC is well conserved between humans and non-human primates, which implies that including humans and great apes in the comparison is legitimate. In addition, the brain regions included in the analysis presumably include very similar architectonic regions (e.g. BA 10 for FP, BA 9/46 for DLPFC), which also suggests that the neuronal density should be relatively well conserved across species. Altogether, we believe that there is sufficient evidence to support the idea that the volume of a PFC region in primates is a good proxy for the number of neurons in that region, and therefore of its computational capacity.

      Semendeferi and colleagues (2001) pointed out some differences in cytoarchitectonic properties across parts of the FP and discussed how these properties could 1) be used to identify area 10 across species 2) be associated with distinct computational properties, with the idea that thicker ‘cell body free’ layers would leave more space for establishing connections (across dendrites and axons). This pioneering work, together with more recent imaging studies on functional connectivity (e.g. Sallet et al, 2013), emphasizes the critical contribution of connectivity patterns as a tool for comparative anatomy. But unfortunately, as pointed out in the discussion already, this is currently out of reach for us.

      We acknowledge the limitations, and to be fair, the notion of computational capacity itself is hard to define operationally. Based on the work of Herculano-Houzel et al, average density is conserved enough across primates (including humans) to justify our approximation. We have tried to define our regions of interest using both anatomical and functional maps and, thanks to the reviewer’s suggestions, we even tried several ways to segment these regions. Functional maps in macaques and humans do not exactly match cytoarchitectonic maps, presumably because functions rely not only upon the cytoarchitectonics but also on connectivity patterns (e.g. Sallet et al, 2013).

      In sum, we appreciate the reviewer’s point but feel that, given the current understanding of brain functions and the relative conservation of neuronal density across primate PFC regions, the volume of a PFC region seems to be a reasonable proxy for its number of neurons, and therefore its computational capacity. We have added these points to the discussion, and we hope that the reader will be able to get a fair sense of how legitimate that position is, given the literature.

      Overall, I think this is a very valuable approach and the study demonstrates what can now be achieved in evolutionary neuroscience. I do believe that the authors can be even more thorough and precise in their measurements and claims.

      Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum National d’Histoire Naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience.

      We must not have been clear enough in our manuscript, because our goal is precisely not to separate humans from other primates. This is why, in contrast to other studies, we have included human and non-human primates in the same models. If our goal had been to study human evolution, we would have included fossil data (endocasts) from the human lineage.

      But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

      We admit that lissencephalic species could not be included in this study because we use sulci as key landmarks. We believe that including lissencephalic primates would have introduced a bias and noise in our comparisons, as the delimitations and landmarks would have been different for gyrencephalic and lissencephalic primates. Concerning development, it is simply beyond the scope of our study.

      Major comments.

      1) Is the brain modular? Is there modularity in brain evolution?: The entire manuscript is organized around the idea that the brain is a mosaic of units that have separate evolutionary trajectories:

      "In terms of evolution, the functional heterogeneity of distinct brain regions is captured by the notion of 'mosaic brain', where distinct brain regions could show a specific relation with various socio-ecological challenges, and therefore have relatively separate evolutionary trajectories".

      This hypothesis is problematic for several reasons. One of them is that each evolutionary module of the brain mosaic should originate in embryological development from a defined progenitor (or progenitors) domain [see García-Calero and Puelles (2020)]. Also, each evolutionary module should comprise connections with other modules; in the present case, FP and DLPFC have not evolved alone but in concert with, at least, their corresponding thalamic nuclei and striatal sector. Did those nuclei and sectors also expand across the selected primate species? Can the authors relate FP and DLPFC expansion to a shared progenitor domain across the analyzed species? This would be key to proposing homology hypotheses for FP and DLPFC across the selected species. The authors use the comparative approach throughout but never explicitly state their criteria for defining homology of the cerebral cortex sectors analyzed.

      We do not understand what the referee is referring to with the word ‘module’, and why it relates to development. The same goes for the anatomical relation with subcortical structures. Yes, the identity of distinct functional cortical regions relies upon subcortical inputs during development, but clearly this is neither technically feasible, nor relevant here anyway.

      We acknowledge, however, that our definition of functional regions was not precise enough, and we have updated the introduction to clarify that point. In short, we clearly do not want to make a strong case for the functional borders that we chose for the regions of interest here (FP and DLPFC), but rather use those regions as proxies for their corresponding functions as defined in laboratory conditions for a couple of species (rhesus macaques and humans, essentially).

      Contemporary developmental biology has shown that the selection of morphological brain features happens within severe developmental constraints. Thus, the authors need a hypothesis linking the evolutionary expansion of FP and DLPFC during development. Otherwise, the claims for the mosaic brain and modularity lack fundamental support.

      Once again, we do not think that our definition of modules matches what the reviewer has in mind, i.e. modules defined by populations of neurons that developed together (e.g. visual thalamic neurons innervating visual cortices, themselves innervating visual thalamic neurons). Rather, the notion of mosaic brain refers to the fact that different parts of the brain are susceptible to distinct (but not necessarily exclusive) sources of selective pressures. The extent to which these ‘developmental’ modules are related to ‘evolutionary’ modules is clearly beyond the scope of this paper.

      Our goal here was to evaluate the extent to which modules that were defined based on cognitive operations identified in laboratory conditions could be related (across species) to socio-ecological factors as measured in wild animals. Again, we agree that the way these modules/ functional maps were defined in the paper was confusing, and we hope that the new version of the manuscript makes this point clearer.

      Also, the authors refer most of the time to brain regions, which is confusing because they are analyzing cerebral cortex regions.

      We do not understand why the term ‘brain’ is more confusing than ‘cerebral cortex’, especially for a wide audience.

      2) Definition and delimitation of FP and DLPFC: The preceding questions are also related to the definition and parcellation of FP and DLPFC. How are homologous cortical sectors defined across primate species? And then, how are those sectors parcellated?

      The authors delimited the FP:

      "...according to different criteria: it should match the functional anatomy for known species (macaques and humans, essentially) and be reliable enough to be applied to other species using macroscopic neuroanatomical landmarks".

      There is an implicit homology criterion here: two cortical regions in two primate species are homologs if these regions have similar functional anatomy based on cortico-cortical connections. Also, macroscopic neuroanatomical landmarks serve to limit the homologs across species.

      This is highly problematic. First, because similar function means analogy and not necessarily homology [for further explanation see Puelles et al. (2019); García-Cabezas et al. (2022)].

      We are not sure we follow the Reviewer’s point here. First, it is not clear what would be the evolutionary scenario implied by this comment (evolutionary divergence followed by reversion leading to convergence?). Second, based on the literature, both the DLPFC and the FP display strong similarities between macaques and humans, in terms of connectivity patterns (Sallet et al, 2013), in terms of lesion-induced deficits and in terms of task-related activity (Mansouri et al, 2017). These criteria are usually sufficient to call 2 regions functionally equivalent. We do not see how this explanation is "highly problematic" as it is clearly the most parsimonious based on our current knowledge.

      Second, because there are several lissencephalic primate species; in these primates, like marmosets and squirrel monkeys, the whole approach of the authors could not have been implemented. Should we suppose that lissencephalic primates lack FP or DLPFC?

      We understand neither the reviewer’s logic, nor the tone. We understand that the reviewer is concerned by the debate on whether some laboratory species are more relevant than others for studying the human prefrontal cortex, but this is clearly not the objective of our work. As explained in the manuscript, we identified FP and DLPFC based on functional maps in humans and laboratory monkeys (macaques), and we used specific gyri as landmarks that could be reliably used in other species. And, as rightfully pointed out by reviewer 1, this is in and of itself not so trivial. Of course, lissencephalic animals could not be studied because we could not find these landmarks, but why would it mean that they do not have a prefrontal cortex? The reviewer implies that species that we did not study do not have a prefrontal cortex, which makes little sense. Standards in the field of comparative anatomy of the PFC, especially when it involves rodents (also lissencephalic), include cytoarchitectonic and connectivity criteria, but obviously we are not in a position to address them here. We have, however, included references to the seminal work of Angela Roberts and collaborators in the discussion on marmoset prefrontal functions, to reinforce the idea that the functional organization is relatively well conserved across all primates (with or without gyri on their brain) (Dias et al, 1996; Roberts et al, 2007).

      Do these primates have significantly more simplistic ways of life than gyrencephalic primates? Marmosets and squirrel monkeys have quite small brains; does it imply that they have not experienced the influence of socio-ecological factors on the size of FP, DLPFC, and the rest of the brain?

      Again, none of this is relevant here, because we could not draw conclusions on species that we cannot study for methodological reasons. The reviewer seems to believe that an absence of evidence is equivalent to evidence of absence, but we do not.

      The authors state that:

      "the strong development of executive functions in species with larger prefrontal cortices is related to an absolute increase in number of neurons, rather than in an increase in the ratio between the number of neurons in the PFC vs the rest of the brain".

      How does it apply to marmosets and squirrel monkeys?

      Again, we do not understand the reviewer’s point, since it is widely admitted that lissencephalic monkeys display both a prefrontal cortex and executive functions (again, see the work of Angela Roberts cited above). Our goal here was certainly not to get into the debate of what is the prefrontal cortex in a handful of laboratory species, but to evaluate the relevance of laboratory based neuro-cognitive concepts for understanding primates in general, and in their natural environment.

      References:

      García-Cabezas MA, Hacker JL, Zikopoulos B (2022) Homology of neocortical areas in rats and primates based on cortical type analysis: an update of the Hypothesis on the Dual Origin of the Neocortex. Brain structure & function Online ahead of print. doi:10.1007/s00429-022-02548-0

      García-Calero E, Puelles L (2020) Histogenetic radial models as aids to understanding complex brain structures: The amygdalar radial model as a recent example. Front Neuroanat 14:590011. doi:10.3389/fnana.2020.590011

      Nieuwenhuys R, Puelles L (2016) Towards a New Neuromorphology. doi:10.1007/978-3-319-25693-1

      Puelles L, Alonso A, Garcia-Calero E, Martinez-de-la-Torre M (2019) Concentric ring topology of mammalian cortical sectors and relevance for patterning studies. J Comp Neurol 527 (10):1731-1752. doi:10.1002/cne.24650

      Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis). My comments are organized by section below:

      We thank the reviewer for the globally positive evaluation and for the constructive remarks.

      Introduction:

      • Well written and thorough, but the questions presented could use restructuring.

      Again, we thank the reviewer, and we believe that this is coherent with some of the remarks of reviewer 1. We have extensively revised the introduction, toning down the social vs ecological brain issue to focus more on what is the objective of the work (evaluating the relevance of lab based neuro-cognitive concepts for understanding natural behavior in primates).

      Methods:

      • It is unclear which combinations of models were compared or why only population density and distance travelled tested appear to have been included.

      The details of the model comparison analysis were presented as a table in the supplementary material (#3, details of the model comparison data), but we understand that this was not clear enough. We have provided more explanation both in the main manuscript and in the supplements. All variables were considered a priori; however, we first carried out exploratory analyses, which led us to exclude some variables because of their lack of resolution (not enough categories for qualitative variables) or strong cross-correlations with other quantitative variables. Many more than three variables were included in the models, but the combination of these three (body mass, daily traveled distance and population density) best predicted the size of the brain regions (it had the smallest AIC). We provide additional information about these exploratory analyses in the supplementary material, sections 2 and 3.
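
      To make the logic of an AIC-based comparison concrete, the following is a minimal sketch only. It assumes ordinary least squares on synthetic data rather than the PGLS models and measured values used in the study, and the function names (ols_aic, rank_models) and the simulated predictors are hypothetical. Every non-empty combination of candidate predictors is fitted, and the combinations are ranked by AIC, smallest first.

```python
import itertools
import numpy as np


def ols_aic(X, y):
    """AIC of an ordinary least-squares fit with an intercept (Gaussian errors)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus the residual variance
    return n * np.log(rss / n) + 2 * k


def rank_models(predictors, y):
    """Fit every non-empty subset of predictors; return (AIC, subset) pairs, best first."""
    names = list(predictors)
    results = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            X = np.column_stack([predictors[name] for name in subset])
            results.append((ols_aic(X, y), subset))
    return sorted(results)


if __name__ == "__main__":
    # Synthetic illustration only: 16 simulated "species", not the study's data.
    rng = np.random.default_rng(1)
    n_species = 16
    simulated = {
        "body_mass": rng.normal(size=n_species),
        "daily_distance": rng.normal(size=n_species),
        "population_density": rng.normal(size=n_species),
        "group_size": rng.normal(size=n_species),
    }
    region_volume = (
        1.0 * simulated["body_mass"]
        + 0.5 * simulated["daily_distance"]
        - 0.5 * simulated["population_density"]
        + rng.normal(scale=0.3, size=n_species)
    )
    for aic, subset in rank_models(simulated, region_volume)[:3]:
        print(f"AIC = {aic:.2f}  predictors = {subset}")
```

      A phylogenetically informed analysis would replace the least-squares fit with a PGLS fit; the ranking step itself would be unchanged.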

      • Brain size (vs. body size) should be used as a predictor in the models.

      We do not understand the theoretical reason for replacing body size with brain size in the models. Brain size is not a socio-ecological variable. And of course, that would be impossible for modeling brain size itself. Or is the reviewer suggesting that brain size be used as a covariate, to evaluate the effects of the other variables in the model over and above their effect on brain size? But what is the theoretical basis for this?

      • It is not appropriate to compare the impact of different predictors using their coefficients if the variables were not scaled prior to analysis.

      We thank the Reviewer for this comment; however, standardized coefficients are not unproblematic because their calculations are based on the estimated standard-deviations of the variables which are likely to be affected by sampling (in effect more than the means). We note that the methods of standardized coefficients have attracted several criticisms in the literature (see the References section in https://en.wikipedia.org/wiki/Standardized_coefficient). Nevertheless, we now provide a table with these coefficients which makes an easy comparison for the present study. We also updated tables 1, 2 and 3 to include standardized beta values.
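
      For readers unfamiliar with the quantity under discussion, the sketch below illustrates what a standardized coefficient is. It is an illustration under stated assumptions only: ordinary least squares, synthetic data, and hypothetical function names. A raw slope is rescaled by the ratio of the sample standard deviations of the predictor and the response, which is equivalent to refitting the model on z-scored variables. The dependence on estimated standard deviations is exactly why these coefficients are sensitive to sampling.

```python
import numpy as np


def ols_slopes(X, y):
    """Slopes (intercept dropped) of an ordinary least-squares fit."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]


def standardized_slopes(X, y):
    """Rescale each raw slope by sd(predictor) / sd(response)."""
    return ols_slopes(X, y) * X.std(axis=0, ddof=1) / y.std(ddof=1)


def zscore(values):
    """Center to mean 0 and scale to sample standard deviation 1, column-wise."""
    return (values - values.mean(axis=0)) / values.std(axis=0, ddof=1)


if __name__ == "__main__":
    # Synthetic predictors on very different scales (not the study's data).
    rng = np.random.default_rng(2)
    X = np.column_stack([
        rng.normal(scale=1000.0, size=50),
        rng.normal(scale=0.01, size=50),
    ])
    y = 0.002 * X[:, 0] + 150.0 * X[:, 1] + rng.normal(size=50)
    print("raw slopes:          ", ols_slopes(X, y))
    print("standardized slopes: ", standardized_slopes(X, y))
    print("slopes on z-scores:  ", ols_slopes(zscore(X), zscore(y)))
```

      The last two printed lines agree up to floating-point error, whereas the raw slopes differ by orders of magnitude simply because of the units of the predictors.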

      Reviewer #1 (Recommendations For The Authors):

      N/A

      Reviewer #2 (Recommendations For The Authors):

      Contemporary developmental biology has shown that the brain of all mammals, including primates, develops out of a bauplan (or blueprint) made of several fundamental morphological units that have invariant topological relations across species (Nieuwenhuys and Puelles 2016).

      At some point in the discussion the authors acknowledge that:

      "Our aim here was clearly not to provide a clear identification of anatomical boundaries across brain regions in individual species, as others have done using much finer neuroanatomical methods. Such a fine neuroanatomical characterization appears impossible to carry on for a sample size of species compatible with PGLS".

      I do not think it would be impossible to carry such neuroanatomical characterization. It would take time and effort, but it is feasible. Such characterization, if performed within the framework of contemporary developmental biology, would allow for well-founded definition and delineation of cortical sectors across primate species, including lissencephalic ones, and would allow for meaningful homologies and interspecies comparisons.

      We do not see how our work would benefit from developmental biology at this point, because it is concerned with evolution, and these are very distinct biological phenomena. We do not understand the reviewer’s focus on lissencephalic species, because they are not so prevalent across primates, and it is unlikely that adding a couple of lissencephalic species would change the conclusions much.

      Minor points:

      • Please, format references according to the instructions of the journal.

      Ok - done

      • The authors could use the same color code across Figures 1, 2, and 3.

      Ok – done

      • The authors say that group hunting "only occurs in a few primate species", but it also occurs in wolves, whales, and other mammalian species.

      We focus on primates here; these other species are irrelevant. Again, this is beside the point.

      Reviewer #3 (Recommendations For The Authors):

      My comments are organized by section below:

      Introduction:

      • Well written and thorough

      • The two questions presented towards the end of the intro are not clear and do not guide the structure of the methods/results sections. I believe it would be more appropriate to ask: 1) whether the relative proportions of the FP and DLPFC (relative to ROB) are consistent across primates; and 2) whether the relative size of these regions is best predicted by social and/or ecological variables. Then, the results sections could be organized according to these questions (current results section 1 = 1; current results sections 2, 3, 4 = 2.1, 2.2, 2.3)

      As explained above, we agree with the reviewer that the introduction was somewhat misleading and we have edited it extensively. We do not, however, agree with the reviewer regarding the relative (vs absolute) measure. We have discussed this in our response to reviewer 1 regarding the comparison of regional volumes as proxies for number of neurons. The best predictor of the computing capacity of a brain region is its number of neurons, but there is no reason to believe that this capacity should decrease if the rest of the brain increases, as implied by the relative measure that the reviewer proposes. That debate is probably critical in the field of comparative neuroanatomy, and confronting different perspectives would surely be both interesting and insightful, but we feel that it is beyond the scope of the present article.

      Methods:

      • While the methods are straightforward and generally well described, it is unclear which combinations of models were compared or why only population density and distance travelled tested appear to have been included (in e.g., Fig SI 3.1) even though many more variables were collected.

      We agree that this was not clear enough, and we have tried to improve the description of our model comparison approach, both in the main text and in the supplementary material.

      • Why was body mass rather than ROB used as a predictor in the models? The authors should instead/also include analyses using ROB (so the analysis is of FP and DLPFC size relative to brain size). Using body mass confounds the analyses since they will be impacted by differences in brain size relative body size.


      Again, we have addressed this issue above. First, body size is a socio-ecological variable (if anything, it especially predicts energetic needs and energy expenditure), but ROB is clearly not. We do not see the theoretical relevance of ROB in a socio-ecological model. Second, from a neurobiological point of view, since within primates the volume of a given brain region is directly related to its number of neurons (again, see work of Herculano-Houzel), which is a good proxy for its computing capacity, we do not see the theoretical reason for considering ROB.

      • It is not appropriate to compare the impact of different predictors using their coefficients if the variables were not scaled prior to analysis. The authors need to implement this in their approach to make such claims.

      We thank the reviewer again for pointing that out. We have addressed this question above.

      • Differences across primates in terms of frontal lobe networks throughout the brain should be acknowledged (e.g., Barrett et al. 2020, J Neurosci).

      We have added that reference to the discussion, together with other references showing that the difference between human and non-human primates is significant, but essentially quantitative, rather than qualitative (the building blocks are relatively well conserved, but their relative weight differs a lot). Thank you for pointing it out.

      I hope the authors find my comments helpful in revising their manuscript.

      And we thank again the reviewer for the helpful and constructive comments.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This fundamental study identifies the homeodomain transcription factor and suspected autism-candidate gene Meis2 as transcriptional regulators of maturation and end-organ innervation of low-threshold mechanoreceptors (LTMRs) in the dorsal root ganglia (DRG) of mice. For a few years, the view on autism spectrum disorders (ASD) has shifted from a disorder that exclusively affects the brain to a condition that also includes the peripheral somatosensory system, even though our knowledge about the genes involved is incomplete. The study by Desiderio and colleagues is therefore not only scientifically interesting but may also have clinical relevance. The work is convincing, with appropriate and validated methodology in line with current state-of-the-art and the findings contribute both to understanding and potential application.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work examined transcription factor Meis2 in the development of mouse and chick DRG neurons, using a combination of techniques, such as the generation of a new conditional mutant strain of Meis2, behavioral assays, in situ hybridization, transcriptomic study, immunohistochemistry, and electrophysiological (ex vivo skin-nerve preparation) recordings. The authors found that Meis2 was selectively expressed in A fiber LTMRs and that its disruption affects the A-LTMRs' end-organ innervation, transcriptome, electrophysiological properties, and light touch-sensation.

      Strengths:

      1) The authors utilized a well-designed mouse genetics strategy to generate a mouse model where the Meis2 is selectively ablated from pre- and post-mitotic mouse DRG neurons. They used a combination of readouts, such as in situ hybridization, immunohistochemistry, transcriptomic analysis, skin-nerve preparation, electrophysiological recordings, and behavioral assays to determine the role of Meis2 in mouse DRG afferents.

      2) They observed a similar preferential expression of Meis2 in large-diameter DRG neurons during development in chicken, suggesting evolutionarily conserved functions of this transcription factor.

      3) The authors conducted several behavioral assays to probe the reduction of light-touch sensitivity in mouse glabrous and hairy skin. Their behavioral findings support the idea that the function of Meis2 is essential for the development and/or maturation of LTMRs.

      4) RNAseq data provide potential molecular pathways through which Meis2 regulates embryonic target-field innervation.

      5) Well-performed electrophysiological study using skin-nerve preparation and recordings from saphenous and tibial nerves to investigate physiological deficits of Meis2 mutant sensory afferents.

      6) Nice whole-mount IHC of the hair skin, convincingly showing morphological deficits of Meis2 mutant SA- and RA- LTMRs.

      Overall, this manuscript is well-written. The experimental design and data quality are good, and the conclusion from the experimental results is logical.

      Weaknesses:

      1) Although the authors justify this study for the involvement of Meis2 in Autism and Autism associated disorders, no experiments really investigated Autism-like specific behavior in the Meis2 ablated mice.

      Indeed, in the first version of the manuscript, we use current understanding of ASD in mouse models and associated sensory defects to articulate our introduction and discussion. As noticed by reviewer 1, none of our experiments really investigated ASD. To avoid over-interpretation of the data, we have now removed sentences mentioning ASD and related references throughout the manuscript.

      2) For mechanical force sensing-related behavioral assays, the authors performed VFH and dynamic cotton swabs for the glabrous skin, and sticky tape on the back (hairy skin) for the hairy skin. A few additional experiments involving glabrous skin plantar surfaces, such as stick tape or flow texture discrimination, would make the conclusion stronger.

      We fully agree that performing more behavioral analyses, investigating in more detail the primary sensory defects as well as some ASD-related behaviors, would reinforce our conclusions. Our behavioral analysis clearly showed a loss of sensitivity in response to mechanical stimuli within the light touch range but not for higher range mechanical or noxious thermal stimuli. While the experiments suggested by the reviewer are interesting and would strengthen our conclusions, they are far from trivial and require large cohorts. Given the current laboratory conditions as stated at the outset, these unfortunately are not within reach.

      3) The authors considered von Frey filaments (1 and 1.4 g) as noxious mechanical stimuli (Figure 1E and statement on lines 181-183), which is questionable. Alligator clips or pinpricks are more certain to activate mechanical nociceptors.

      To avoid misinterpretation of the higher von Frey filament tests, we deleted the following statement on page 7: “In the von Frey test, the thresholds for paw withdrawal were similar between all genotypes when using filaments exerting forces ranging from 1 to 1.4g, which likely reflects the activation of mechanical nociception suggesting that Meis2 gene inactivation did not affect nociceptor function.”. The sentence “… while sparing other somatosensory behaviors” was also deleted.

      4) There are disconnections and inconsistencies among findings from morphological characterization, physiological recordings, and behavior assays. For example, Meis2 mutant SA-LTMRs show a deficiency in Merkel cell innervation in the glabrous skin but not in hairy skin. With no clear justification, the authors pooled recordings of SA-LTMRs from both glabrous and hairy skin and found a significant increase in mean vibration threshold. Will the results be significantly different if the data are analyzed separately? In addition, whole-mount IHC of Meissner's corpuscles showed morphological changes, but electrophysiological recordings didn't find significant alteration of RAI LTMRs. What does the morphological change mean then? Since the authors found that Meis2 mice are less sensitive to a dynamic cotton swab, which is usually considered as an RA-LTMR mediated behavior, is the SAI-LTMR deficit here responsible for this behavior? Connections among results from different methods are not clear, and the inconsistency should be discussed.

      We thank Reviewer 1 for the careful review of our data and fully agree with the weaknesses identified, which we were ourselves aware of at the time of submission, in particular the lack of stronger connections between histological and electrophysiological data. Electrophysiological studies were conducted on a first cohort of mice in which we focused mostly on WT and Meis2 mutant mice. The goal was to describe differences in electrophysiological properties of identified mechanoreceptors from these two genotypes. Since substantial differences between WT and Islet1-Cre mice were not expected, only very few mice with this genotype were examined at that time to confirm this assumption. We fully agree with reviewer 1 that confirming differences in SA-LTMR responses in hairy and glabrous skin at the electrophysiological level would be interesting and worthwhile. It is assumed that the physiological properties of SA-LTMRs are equivalent in both skin types. Indeed, direct comparisons have been made between glabrous and hairy skin SA-LTMRs, revealing that they have equivalent receptor properties (see Walcher et al J Physiol quoted in the manuscript). We had not recorded from a sufficient number of hairy and glabrous skin SA-LTMRs to make any statistically meaningful comparison. When we noticed the dramatic differences in the innervation patterns of Merkel cell complexes between glabrous and hairy skin, we immediately planned a second cohort of mice, but as explained at the outset of our response to the Public Review, this cohort was sacrificed due to the pandemic lockdown. However, the obtained dataset clearly shows that in Meis2 mutant mice many SA-LTMRs had similar vibration thresholds to those of wild types.

      For Meissner corpuscle, histological analysis evidenced clear morphological differences that could of course be investigated at the level of the dual innervation previously reported by Neubarth et al. It is uncertain whether differences in their electrophysiological responses would be revealed by increasing the number of recorded fibers. For this reason, we clearly stated this limitation in the results section page 7 “There was a tendency for RA-LTMRs in Isl1Cre/+::Meis2LoxP/LoxP mutant mice to fire fewer action potentials to sinusoids and to the ramp phase of a series 2 second duration ramp and hold stimuli, but these differences were not statistically significant (Figure 5B). Nevertheless it is important to point out that an electrical search strategy revealed that many Aβ-fibers did not have mechanosensitive receptive fields. Thus by focusing on LTMRs with a mechanosensitive receptive field, we ignore the fact that fewer fibers are mechanosensitive. This is now more extensively discussed in the discussion section of the manuscript page 13:

      “Indeed, the electrophysiology methods used here can only identify sensory afferents that have a mechanosensitive receptive field. Primary afferents that have an axon in the skin but no mechanosensitivity can only be identified with a so-called electrical search protocol (45, 46) which was not used here. It is therefore quite likely that many primary afferents that failed to form endings would not be recorded in these experiments e.g. SA-LTMRs and RA-LTMRs that fail to innervate end-organs (Fig.4-6).”

      “From our data, we could not conclude whether SA-LTMR electrophysiological responses are differentially affected in the glabrous versus hairy skin of Meis2 mutants, as suggested by histological analysis. Further electrophysiological analysis focused on SA-LTMRs selectively innervating the glabrous or hairy skin would be necessary to answer this question. Similarly, the decreased sensitivity of Meis2 mutant mice in the cotton swab assay and the morphological defects of Meissner corpuscles evidenced in histological analysis do not correlate with RA-LTMR electrophysiological responses, for which a tendency toward decreased responses was however measured. The latter might result from an insufficient number of fibers recorded, whereas the former may be due to pooling SA-LTMRs from both the hairy and glabrous skin.”

      Reviewer #2 (Public Review):

      Summary:

      Desiderio and colleagues investigated the role of the TALE (three amino acid loop extension) homeodomain transcription factor Meis2 during maturation and target innervation of mechanoreceptors and their sensation to touch. They start with a series of careful in situ hybridizations to examine Meis2 transcript expression in mouse and chick DRGs of different embryonic stages. By this approach, they identify Meis2+ neurons as slowly- and rapidly adapting A-beta LTMRs, respectively. Retrograde tracing experiments in newborn mice confirmed that Meis2-expressing sensory neurons project to the skin, while unilateral limb bud ablations in chick embryos in ovo showed that these neurons require target-derived signals for survival. The authors further generated a conditional knock-out (cKO) mouse model in which Meis2 is selectively lost in Islet1-expressing, postmitotic neurons in the DRG (Islet1Cre/+::Meis2flox/flox, abbreviated below as cKO). WT and Islet1Cre/+ littermates served as controls. cKO mice did not exhibit any obvious alteration in volume or cellular composition of the DRGs but showed significantly reduced sensitivity to touch stimuli and various innervation defects to different end-organ targets. RNA-sequencing experiments of E18.5 DRGs taken from WT, Islet1Cre/+, and cKO mice reveal extensive gene expression differences between cKO cells and the two controls, including synaptic proteins and components of the GABAergic signaling system. Gene expression also differed considerably between WT and heterozygous Islet1Cre/+ mice while several of the other parameters tested did not. These findings suggest that Islet1 heterozygosity affects gene expression in sensory neurons but not sensory neuron functionality. However, only some of the parameters tested were assessed for all three genotypes. Histological analysis and electrophysiological recordings shed light on the physiological defects resulting from the loss of Meis2. By immunohistochemical approaches, the authors describe distinct innervation defects in glabrous and hairy skin (reduced innervation of Merkel cells by SA1-LTMRs in glabrous but not hairy skin, reduced complexity of A-beta RA1-LTMRs innervating Meissner's corpuscles in glabrous skin, reduced branching and innervation of A-beta RA1-LTMRs in hairy skin). Electrophysiological recordings from ex vivo skin nerve preparations found that several, but not all, of these histological defects are matched by altered responses to external stimuli, indicating that compensation may play a considerable role in this system.

      Strengths:

      This is a well-conducted study that combines different experimental approaches to convincingly show that the transcription factor Meis2 plays an important role in the perception of light touch. The authors describe a new mouse model for compromised touch sensation and identify a number of genes whose expression depends on Meis2 in mouse DRGs. Given that dysbalanced MEIS2 expression in humans has been linked to autism and that autism seems to involve an inappropriate response to light touch, the present study makes a novel and important link between this gene and ASD.

      Weaknesses:

      The authors make use of different experimental approaches to investigate the role of Meis2 in touch sensation, but the results obtained by these techniques could be connected better. For instance, the authors identify several genes involved in synapse formation, synaptic transmission, neuronal projections, or axon and dendrite maturation that are up- or downregulated upon targeted Meis2 deletion, but it is unresolved whether these changes can in any way explain the histological, electrophysiological, or behavioral deficits observed in cKO animals. The use of two different controls (WT and Islet1Cre/+) is unsatisfactory and it is not clear why some parameters were studied in all three genotypes (WT, Islet1Cre/+ and cKO) and others only in WT and cKO. In addition, Meis2 mutant mice apparently are less responsive to touch, whereas in humans, mutation or genomic deletion involving the MEIS2 gene locus is associated with ASD, a condition that, if anything, is associated with an elevated sensitivity to touch. It would be interesting to know how the authors reconcile these two findings. As a minor weakness, the first manuscript suffers from some ambiguities and errors, but these can be easily corrected.

      We thank the reviewer for the insightful comments and suggestions.

      The use of two different controls (WT and Islet1Cre/+) is unsatisfactory and it is not clear why some parameters were studied in all three genotypes (WT, Islet1Cre/+ and cKO) and others only in WT and cKO.

      First, we identified a labelling mistake in Figures 4D, 5A and 6A, where the controls shown are from Islet1+/Cre mice and not from WT mice as reported in the first version. We apologize for this mistake, which has now been corrected. This labelling error does not in any way affect our conclusion; on the contrary, it shows that innervation defects are not the consequence of Islet1 heterozygosity.

      The reviewer wonders why, for some data, both control genotypes are presented and, for others, only one is presented. It is quite possible that gene expression changes occur due to a synergistic effect of both heterozygous Meis2 deletion and heterozygous Islet1 deletion. However, we found no evidence that this led to defects in target-field innervation or to changes in the physiological properties of sensory neurons.

      Whereas it could reasonably be envisaged that the expression of some genes is modified by a synergistic effect of both heterozygous Meis2 deletion and heterozygous deletion of Islet1, several lines of evidence support that the defects in target-field innervation and electrophysiological responses are exclusively due to Meis2 deletion. Previous work on Islet1-specific deletion in DRG sensory neurons opens the possibility that some of the phenotypes we report here are in part due to an effect of Islet1 heterozygous deletion or to a synergistic effect with Meis2 homozygous deletion.

      1) When Islet1 is conditionally deleted in mice using the Wnt1-Cre strain or at later stages using a tamoxifen inducible-Cre, homozygous pups die a few hours after birth. Early Islet1 deletion results in an increased apoptosis in the DRG, a massive loss of DRG sensory neurons and sensory defects associated to nociceptors mostly and some touch neurons while proprioceptive neurons are spared (Sun et al., 2008 now included in the revised version of the manuscript). There was a decrease in the number of Ntrk1+ and Ntrk2+ neurons whereas Ntrk3+ neurons number appeared normal. When Islet1 is inactivated later in development, the number of Ntrk1+ and Ntrk2+ neurons were normal and only the expression of nociceptor specific markers was decreased. Since neither the DRG volume, nor the number of Ntrk1+, Ntrk2+ and Ntrk3+ neurons are changed in Meis2 cKO using the Islet1-Cre strain, an early significant effect of Islet1 heterozygous deletion is very unlikely.

      2) For distal innervation defects, it is clear from the Wnt1-Cre::Meis2 data (Figure 3E) that the distal innervation phenotype occurred while Meis2 is inactivated independently of Islet1 expression.

      3) Finally, the lack of differences between WT and Islet+/Cre mice in behavioral assays and in electrophysiological characterization of RA-LTMR of the hairy skin (Figure 6C) and SA-LTMR (Figure 4B and C) argues for a lack of significant consequences of Islet1 heterozygous deletion on these parameters.

      4) For bulk RNAseq studies, all datasets have now been re-analyzed following Reviewer 2's specific comments (see below). To avoid misinterpretation of the data, the results are now presented differently (see pages 8 and 9) and more critically discussed (see pages 14 and 15). In particular, we included and discuss references on Islet1 cKO mice.

      We also agree with reviewer 2 that our RNAseq study only provides cues about genes whose expression could impact distal innervation and electrophysiological responses. However, proving which of those genes are fully responsible for the morphological and electrophysiological defects would require extensive mouse genetic investigations such as restoring their normal expression level in a Meis2 mutant context, which is beyond the scope of the present study.

      Finally, the reviewer questioned how we could reconcile the lower touch sensitivity in Meis2 mutant mice with the exacerbated touch sensitivity found in ASD patients and mouse models of ASD. As suggested by reviewer 1, our study did not really investigate ASD specifically. Therefore, to avoid over-interpretation of the data and to follow Reviewer 1's recommendation, we have removed all references to ASD in the revised version of the manuscript. Indeed, to our knowledge, none of the case reports on Meis2 mutant patients investigated sensory function in general and light touch in particular, maybe because of the severe intellectual disability characterizing these patients.

      Reviewer #1 (Recommendations For The Authors):

      In addition to the aforesaid suggestions in the section 2, there are some minor issues:

      We thank the reviewer for the careful reading and for identifying all these typos. All of them have been corrected in the revised version of the manuscript.

      1) There should not be a full stop mark in the title of the article.

      This has been corrected in the new version of the manuscript.

      2) Figure 1C, 1D, please correct the typo "controlateral' to "contralateral".

      This has been corrected in the new version of the manuscript.

      3) Figure 1D, lower graph, Y-axis, please correct the typo 'umber' to "number".

      This has been corrected in the new version of the manuscript.

      4) To make it easy for readers, add the names of the behavioral tests on top of the graphs in Fig 1E-H.

      The name of behavioral tests is now added to the figure.

      5) It would be easier to read the markers' names in IHC and ISH images if they were written outside of image panels. The blue staining color in image 1B could be easily mixed with the background. Suggest change colors.

      Markers for IHC and ISH images are now written outside the image panels, or colors have been changed, in Figures 1 and 2 for better clarity.

      6) The font size of Genes' name in Figure 3B is too small and not readable.

      Figure 3 has now been changed following Reviewer 2 recommendation. The small font size in Figure 3B is no longer present in the figure.

      7) Quantification of Fig 3E (number of fibers innervating each dermal papilla or footpad, for example).

      Unfortunately, we did not keep the Wnt1Cre::Meis2LoxP/LoxP strain, which prevents further analysis (see the outset of our response to the public review).

      8) In Figure 4, please arrange IHC images and their quantification results adjacent to each other.

      The figure has been reorganized, and changes in the results section and figure legends were made accordingly.

      9) For consistency, please use either LTMR or LTM (See Figure 4F, 5A, 6C), but not both.

      This has been homogenized throughout the manuscript.

      10) Add arrows/heads to mark the overlaps in Figure 4D.

      Arrows are now added in Figure 4D to point at the overlap between Nefh and CK8 staining.

      11) Figure 5A, 6A, Lines 236, 240, 247, 258, 305, 308, 313, 347, and many more in Figure legends: please check in entire manuscript and make the mouse genotype nomenclature (+/Cre?) consistent. In some places, Cre is written in all upper case (Line 657).

      This has been homogenized throughout the manuscript.

      12) Figure 4G: Histogram color could be darker for better contrast.

      The color of the histograms has been changed in Figures 5 and 6 for better clarity.

      13) Please add the figure number to the Figure 6.

      The figure number is now indicated on the figure.

      14) Figure 6B: Y-axis typo, correct "Nfeh" to "Nefh".

      This typo is now corrected.

      15) Either explain Figure 2B information before that of Figure 2C (In lines 204-207) in the text or change the figure panel sequence to keep the consistent flow of contents.

      The figure has been modified and the panel sequence now follows that of the main text.

      16) Line 213 has a typo: change "form" to "from".

      This typo is now corrected.

      17) Line 423 has a typo. Correct "al" to "all".

      This typo is now corrected.

      18) Line 625 has a typo. Correct "fo" to "of".

      This typo is now corrected.

      19) Line 669 has a typo. Correct "Alexa Fluo" to "Fluor".

      This typo is now corrected.

      20) Line 744: To be consistent in the entire manuscript, write "Nfh" as "Nefh".

      This typo is now corrected.

      21) 740-749: Please add host names for all primary antibodies, as some are given but some are not for the current version.

      We have now indicated the host species for all primary antibodies used in the study.

      22) Line 751 has a typo: change "a" to "as".

      This typo is now corrected.

      23) Line 754: what is for 20'?

      This typo is now corrected.

      24) Line 832: change "day test" to "testing day".

      The change has been made.

      25) Please mention for how many seconds the VFH was administered on the plantar surface in the method.

      A new sentence has been added to the “Von Frey withdrawal test” Methods section (page 30): “During each application, bend filament was maintained for approximately four to five seconds”.

      26) For the sticky tape test, in lieu of hind paw attending bouts, wet-dog shake behavior, the authors also found some scratching behaviors. Did they separately quantify these behaviors? It would be interesting to see exactly which behavior significantly reduced after Meis2 inactivation.

      Unfortunately, when we designed the sticky tape test, we did not consider separating the behaviors scored as “positive” reactions. As these experiments were not video recorded, we cannot extract this information without generating a new cohort of mice and repeating the experiment.

      27) Line 344-345: consider rephrasing the sentence.

      This sentence has been removed.

      Reviewer #2 (Recommendations For The Authors):

      This is a beautiful and well-conducted study with all the strengths listed in the paragraphs above. Nevertheless, there are still some open questions, ambiguities in the presentation, and minor errors that I would recommend addressing.

      Major Points:

      1) The authors performed RNA-seq analysis from E18.5 mouse total DEGs from three different genotypes, WT, Islet1Cre/+ and cKO. Although this approach identified several interesting Meis2-dependent candidate genes, the presentation of the results is confusing, and the publication would gain impact if the RNA-seq results were better connected to the histological, behavioral, and electrophysiological data. Specific concerns:

      1.1) The gene expression profiles of WT and Islet1Cre/+ samples are remarkably divergent. According to Yang Development 2006, Islet1-Cre was generated by knocking in Cre into the endogenous Islet1 locus and replacing the Isl1 ATG, hence resulting in a heterozygous null for Islet1. When purely technical derivations can be excluded, the RNAseq results presented here suggest that heterozygous loss of Islet1 causes considerable gene expression changes in the postnatal DRG. For analysis of the RNAseq results, the authors focus on genes that are differentially expressed between one experimental condition (Islet1Cre/+::Meis2flox/flox) and either one of two controls (WT or Islet1Cre/+). Hence, they pool the genes that are differently expressed between cKO and Islet1Cre/+ with the genes that are different between cKO and WT. This approach mixes gene expression differences that result from two different genetic alterations, heterozygosity of Islet1 and targeted deletion of Meis2, respectively. It seems much more logical to compare the results pairwise.

      We agree with reviewer 2 that heterozygous deletion of Islet1 causes a significant change in gene expression that seems to correlate very little with any of the phenotypes we investigated in the study. When Islet1 is conditionally deleted in mice using the Wnt1-cre strain, pups die a few hours after birth and display increased apoptosis in the DRG, massive loss of DRG sensory neurons, and sensory defects that mostly affect nociceptors and some touch neurons, while proprioceptive neurons are spared (Sun et al., 2008, now included in the revised version of the manuscript). The numbers of Ntrk1+ and Ntrk2+ neurons decrease, whereas the numbers of Ntrk3+ neurons appear normal. Later Isl1 inactivation does not change the number of neurons or the expression of Ntrk1 and Ntrk2. As explained in the answer to the public reviews, the bulk RNAseq data have now been reanalyzed following the reviewer's suggestions and are presented accordingly in the related figures.

      Sun et al. also reported DEGs following Islet1 homozygous deletion, but data on Islet1 heterozygous deletion are not included in their study. Out of the 60 most dysregulated genes they identified, only 6 were differentially expressed in our datasets. Importantly, DEGs in their study were identified using microarrays. In another study, the same group showed that Brn3a (another transcription factor important for DRG neuron differentiation) and Islet1 exhibit negative epistasis on sensory gene expression (Dykes et al., 2011, now included in the revised version of the manuscript). Thus, we cannot rule out that similar rules apply to Islet1 and Meis2. However, given the high diversity of DRG sensory neurons, interpreting our bulk RNAseq analysis in this direction might lead to misinterpretation.

      1.2) Along the same line, gene expression changes in Islet1Cre/+ DRGs seem to have little functional consequences, at least in the cases where all three genotypes were analyzed (target dependency (Fig. 1E), behavior (Fig. 1F), innervation (Fig. 4F, 6C)). Why were some parameters measured in all three genotypes and others only for WT and cKO? The authors probably reason that parameters that do not differ between WT and cKO animals will likely also not differ between WT and Islet1Cre/+. But what about parameters that do differ? Considering that the innervation of Merkel cells (Fig. 4E) and Meissner corpuscles (Fig. 5A) differ profoundly between WT and cKO, it would be interesting to know what this innervation looks like in Islet1Cre/+ DRGs. NEFH staining together with CK8 or S100beta from existing tissue sections should easily answer this question.

      As explained in the answer to the public reviews, there was a mistake in the annotation of the control in Figure 4D and E and in Figure 5, which has now been corrected. The target-dependency experiments were conducted in chick embryos and therefore have no associated genotype.

      1.3) Was a minimum cut-off for gene expression applied? The up-and downregulated genes in Fig. 3B list a number of pseudogenes and predicted genes. A quick (and incomplete) check for their expression in Fig2 Supple Table 1 shows that only a few reads were detected for most of them. With such low expression, even small changes will show up as significant differences.

      In our first analysis, a cut-off of 10 reads was applied. As reviewer 2 mentioned, this cut-off included several pseudogenes and predicted genes with low expression, for which small changes were significant. We have now re-analyzed the dataset using a cut-off of 100 reads. This excluded most of the previous predicted genes and pseudogenes from the analysis and resulted in a much smaller number of DEGs for each dataset. As recommended by reviewer 2, we have also now performed the DAVID analysis separately. These results are now presented in Figure 3 and the corresponding supplementary figures.
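      The effect of raising a minimum read cut-off can be sketched as follows. This is only an illustration under stated assumptions: the gene names and counts are invented, the helper `filter_by_min_reads` is hypothetical, and the choice to apply the cut-off to the summed reads across samples is an assumption, since the response does not say how the cut-off was computed.

```python
# Hypothetical illustration only: gene names and read counts are invented,
# and applying the cut-off to the sum across samples is an assumption.

def filter_by_min_reads(counts, min_reads):
    """Keep genes whose total read count across samples is at least min_reads."""
    return {
        gene: reads
        for gene, reads in counts.items()
        if sum(reads) >= min_reads
    }


# Reads per sample for three made-up genes.
counts = {
    "GeneA": [250, 310, 190],   # well expressed
    "Gm0001": [3, 5, 4],        # predicted gene with only a few reads
    "GeneB": [40, 35, 30],      # moderately expressed
}

kept_10 = filter_by_min_reads(counts, 10)    # lenient cut-off
kept_100 = filter_by_min_reads(counts, 100)  # stricter cut-off

print(sorted(kept_10))
print(sorted(kept_100))
```

      In this toy example the low-count predicted gene passes the lenient cut-off but is dropped by the stricter one, so small absolute changes in its counts can no longer appear as significant differences.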

      1.4) Given that bulk RNAseq from whole embryonic DRGs was performed, it would be interesting to know what cell type(s) express the Meis2-dependent transcripts. To address this question, the authors resort to published scRNAseq data by Usoskin Nat Neurosci 2015. They correlate the expression of all 488 DEGs (different between cKO and either WT or Islet1Cre/+) with the expression of Meis2 in the sensory neuron subtypes that were classified in the Usoskin paper. From that they conclude that many Meis2-dependent genes were expressed in the same sensory neuron classes as Meis2 itself. This is not apparent from Fig. 3 Supplementary 2. Neither do the 488 DEGs seem to be in any way enriched in the MEIS2-expressing cell clusters NF2/3/4/5, nor is cluster PEP1 particularly high in Meis2 expression. Immunostaining for MEIS2 together with a few selected DEGs would be a better way to assess co-expression.

      We agree with reviewer 2 that the correlation between DEGs and the expression of Meis2 in the sensory neuron subtypes was far from striking. In our opinion, the new analysis now shows a more robust correlation. However, it has to be kept in mind that not all DEGs are expected to be direct Meis2 target genes, and therefore not all are expected to be enriched in the same Meis2-expressing population. This also holds true for genes that could be de-repressed or induced following Meis2 inactivation. In addition, the scRNAseq by Usoskin et al. was performed on adult sensory neurons, whereas our bulk RNAseq was performed on E18.5 embryos. Because gene expression in developing sensory neurons is well known to be highly dynamic, the transcriptional signature of sensory neuron subclasses in E18.5 embryos is not expected to perfectly match that of adult subclasses. Finally, we agree that immunostaining for Meis2 together with a few selected DEGs would give a better answer on whether they co-localize, but our lack of experience with those antibodies, together with the lack of financial support for the proposal, prevents us from addressing this pertinent point.

      1.5) The authors identify Gabra1 and Gabra4 as upregulated and Gabrr1 as downregulated genes in MEIS2 cKO animals. Does this reflect a change in GABA-receptor subunit composition in LTMRs?

      This is an interesting point. First, in our new analysis, increasing the cut-off to 100 reads excluded Gabrr1 from the DEGs. Based on our results, we cannot conclude whether Gabra1 and Gabra4 up-regulation reflects a change in GABA receptor composition. However, in the GEO term associated with GABAergic synapse, Gabra1 and Gabra4 were up-regulated whereas the ionotropic glutamate receptor Grid1 was downregulated, which rather suggests an imbalanced GABA/glutamate transmission. Finally, the increased GABAR expression in the LTMRs might be expected to increase pre-synaptic inhibition at the LTMR synapses onto target neurons in the dorsal horn, thus decreasing synaptic transmission from these neurons into spinal circuits.

      2) The authors assessed SA-LTMR innervating Merkel cells in glabrous and hairy skin by IFC staining for neurofilament H and electrophysiological recordings. Due to the small sample size, they pooled recordings, reasoning that nerves that do not successfully innervate Merkel cells (i.e. cKO glabrous skin) do not evoke electrophysiological responses following a touch stimulus.

      2.1) It is undoubtedly true that non-innervating nerves will likely not show electrophysiological responses. However, by pooling the recordings of SA-LTMRs from glabrous and hairy skin, the data obtained from the 20% successful recordings of SA-LTMRs from glabrous cKO skin (according to Fig. 4E, upper panel) will be overrepresented and hence lead to a systematic bias. How many recordings were made from the glabrous and hairy skin of each genotype? In case the number of recordings from cKO/glabrous skin is the limiting factor, does the observed difference in vibration threshold hold true when only recordings from hairy skin are compared?

      As explained in the text and in our answers to reviewer 1, data for hairy and glabrous SAMs were initially pooled because no differences between them were expected, and the electrophysiological experiments planned next were compromised by the Covid19 pandemic. We are sorry that, at this point, we cannot provide additional experiments to clarify this important point. In addition, as mention

      3) From the IFC images shown in Fig. 6A, it is not clear how the authors quantified branch points and innervated hair follicles.

      A branch point is counted each time a nerve splits into two or more nerves. Innervated follicles are follicles that are entangled by circumferential and/or lanceolate Nefh+ endings.

      4) The quality of the data is very high, but there are several ambiguities and errors in their presentation.

      We apologize for this mistake. Figure 1 Supplementary 1 that reports data from Cat walk analysis is now appropriately included in the files.

      4.2) Fig. 3A is confusing and the figure legend just repeats what is already said in the text. What do yellow, blue, and pink represent?

      Figure 3 has been fully remade. The legend is now better indicated in Figure 3A. We hope it is now clearer.

      4.3) What genotype do the black, grey, and white boxplots in Fig. 6C and Fig. 3 Supplementary 1B correspond to?

      The legends were missing for Figure 6C and Figure 3 supplementary 1B. They are now appropriately included.

      4.4) Up- and downregulated genes are assigned differently in Fig. 3 and Fig. 3 Supplementary 2. The figure legend of Fig. 3 Suppl 2 lists panel B as up-regulated genes but the same genes are labeled down-regulated in Fig. 3.

      We apologize for this previous mistake. Figure 3 and corresponding supplementary figures have been redone in the new version.

      4.5) Fig. 3E would benefit from a more detailed description. One can easily appreciate that the neurofilament H staining in the cKO sample is different from that of the WT sample but what exactly can be seen here?

      We added the following sentence in the results section: “In WT newborn mice, numerous Nefh+ sensory fibers surround all dermal papillae of the hairy skin and footpad of the glabrous skin, whereas in Wnt1Cre::Meis2LoxP/LoxP littermates, very few Nefh+ sensory fibers are present and they poorly innervate the dermal papillae and footpads.“.

      4.6) The figure legend to Fig. 4A is unclear. Does the graph show the sum of all recordings performed? From the text, one would guess that the bars correspond to the cKO samples, but this is not specified. Do the controls correspond to WT, Islet1Cre/+ or a mixture of both? In addition, the graph in the lower panel is labeled % Ab fibers, the figure legend reads % of tap units among Ab fibers.

      The graphs show the number of tap units identified among all recorded Aβ fibers. The numbers give the number of tap units over the number of recorded fibers. This has now been reformulated in the latest version of the manuscript.

      4.7) The abbreviation SAM in figure legends 4F, G is not introduced.

      This is now indicated in the figure legend.

      4.8) Readers who are not familiar with the traces above the graphs in 4F and 4G will find a more detailed description helpful.

      This is now indicated in the figure legend.

      4.9) Lines 274-275: Does the statement "Finally, consistent with the lack of neuronal loss in Isl1Cre/+::Meis2LoxP/LoxP, the number of recorded fibers were identical in WT and Isl1Cre/+::Meis2LoxP/LoxP." refer to Fig. 4G? This is not specified in the text.

      These data were not included in the first version of the manuscript as we thought they were not significantly informative. They simply indicate the overall number of fibers that were recorded in the electrophysiological experiments. The sentence has now been removed from the latest version of the manuscript to avoid misunderstanding.

      4.10) There is no Fig. 6 supplementary 1.

      The typo is now corrected. The corresponding data were in fact in Figure 5 Supplementary 1.

      Minor points:

      • Gangfuß et al. report that a patient previously diagnosed with a range of neurological deficits including the diagnosis of severe infantile autism is heterozygous mutant for MEIS2. Although this study links MEIS2 gene function to ASD in the wider sense, adding a few additional references will make the link stronger. Examples are Shimojima et al., Hum Genome Var 2017 or Bae et al., Science 2022.

      These two references have now been included in the introduction section of the manuscript.

      • In some figures (e.g. Fig. 4) the numbering of the panels does not follow the order in which the respective data are mentioned in the text.

      Figure 4 is now re-organized so that panels follow the same order as in the results section.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Nitrogen metabolism is of fundamental importance to biology. However, the metabolism and biochemistry of guanidine and guanidine-containing compounds, including arginine and homoarginine, have been understudied over the last few decades. Very few guanidine-forming enzymes have been identified. Funck et al define a new type of guanidine-forming enzyme. It was previously known that 2-oxoglutarate oxygenase catalysis in bacteria can produce guanidine via oxidation of arginine. Interestingly, the same enzyme that produces guanidine from arginine also oxidises 2-oxoglutarate to give the plant signalling molecule ethylene. Funck et al show that a mechanistically related oxygenase enzyme from plants can also produce guanidine, but instead of using arginine as a substrate, it uses homoarginine. The work will stimulate interest in the cellular roles of homoarginine, a metabolite present in plants and other organisms including humans and, more generally, in the biochemistry and metabolism of guanidines.

      1) Significance

      Studies on the metabolism and biochemistry of the small nitrogen-rich molecule guanidine and related compounds including arginine have been largely ignored over the last few decades. Very few guanidine-forming enzymes have been identified. Funck et al define a new guanidine-forming enzyme that works by oxidation of homoarginine, a metabolite present in organisms ranging from plants to humans. The new enzyme requires oxygen and 2-oxoglutarate as cosubstrates and is related to, but distinct from, a known enzyme that oxidises arginine to produce guanidine, but which can also oxidise 2-oxoglutarate to produce the plant signalling molecule ethylene.

      Overall, I thought this was an exceptionally well written and interesting manuscript. Although a 2-oxoglutarate dependent guanidine-forming enzyme is known (EFE), the discovery that a related enzyme oxidises homoarginine is really interesting, especially given the presence of homoarginine in plant seeds. There is more work to be done in terms of functional assignment, but this can be the subject of future studies. I also fully endorse the authors' view that guanidine and related compounds have been massively understudied in recent times. I would like to see the possibility that the new enzyme makes ethylene explored. Congratulations to the authors on a very nice study.

      Response: We thank the reviewer for the positive evaluation of our manuscript. In the revised version, we have emphasized more clearly that we found no evidence for ethylene production by the recombinant enzymes. The other suggestions of the reviewer are also considered in the revised version as detailed below.

      Reviewer #2 (Public Review):

      In this study, Dietmar Funck and colleagues have made a significant breakthrough by identifying three isoforms of plant 2-oxoglutarate-dependent dioxygenases (2-ODD-C23) as homo/arginine-6-hydroxylases, catalyzing the degradation of 6-hydroxyhomoarginine into 2-aminoadipate-6-semialdehyde (AASA) and guanidine. This discovery marks the very first confirmation of plant or eukaryotic enzymes capable of guanidine production.

      The authors selected three plant 2-ODD-C23 enzymes with the highest sequence similarity to bacterial guanidine-producing (EFE) enzymes. They proceeded to clone and express the recombinant enzymes in E. coli, demonstrating the capacity of all three Arabidopsis isoforms to produce guanidine. Additionally, by precise biochemical experiments, the authors established these three 2-ODD-C23 enzymes as homoarginine-6-hydroxylases (and arginine-hydroxylase for one of them). Furthermore, the authors utilized transgenic plants expressing GFP fusion proteins to show the cytoplasmic localization of all three 2-ODD-C23 enzymes. Most notably, using T-DNA mutant lines and CRISPR/Cas9-generated lines, along with combinations of them, they demonstrate the guanidine-producing capacity of each enzyme isoform in planta. These results provide robust evidence that these three 2-ODD-C23 Arabidopsis isoforms are indeed homoarginine-6-hydroxylases responsible for guanidine generation.

      The findings presented in this manuscript are a significant contribution to our understanding of plant biology, particularly given that this work is the first demonstration of enzymatic guanidine production in eukaryotic cells. However, there are a couple of concerns and potential avenues for further investigation that the authors should consider incorporating.

      Firstly, the observation of cytoplasmic and nuclear GFP signals in the transgenic plants may also indicate cleaved GFP from the fusion proteins. Thus, the authors should perform Western blot analysis to confirm the correct size of the 2-ODD-C23 fusion proteins in the transgenic protoplasts.

      Secondly, it may be worth measuring pipecolate (and proline?) levels under biotic stress conditions (particularly those that induce transcript changes of these enzymes, Fig S8). Given the results suggesting a potential regulation of the pathway by biotic stress conditions (e.g. MeJA), these experiments could provide valuable insights into the physiological role of guanidine-producing enzymes in plants. This additional analysis may reveal the significance of these enzymes in plant defense mechanisms.

      Response: We also thank reviewer 2 for the positive evaluation and useful suggestions. We performed the proposed GFP Western blot, which indeed indicated the presence of both full-length fusion proteins and free GFP, which can explain the partial nuclear localization. We fully agree that further experiments with biotic and abiotic stress will be required to determine the physiological function of the 2-ODD-C23 enzymes. However, the list of potential experiments is long and they are beyond the scope of the present manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Specific points

      Overall, I thought this was a very interesting study, comprising biochemical, cellular, and in vivo studies. Of course more could be done on each of these, and likely will be, but I think the assignment of biochemical function is very strong, across all three approaches. The one new experiment I would like to see is a clear demonstration of whether ethylene is produced - unlikely but should be tested.

      We had mentioned our failure to detect ethylene production by the plant enzymes in the previous version and have made it more prominent and reliable by including ethylene production as a positive control in the new supplementary figure S5.

      Abstract

      Delete 'hitherto overlooked' - this is implicit 'but is more likely' to 'is likely'?

      Agreed and modified

      Introduction

      Second sentence - what about relevant small molecule primary metabolites including precursors of proteins/nucleic acids.

      We modified the sentence accordingly.

      Paragraph 2 - maybe also note EFE produces glutamate semi aldehyde, via arginine C-5 oxidation.

      Paragraph 2 has been re-phrased according to your suggestion.

      Overall, I thought the introduction was exceptionally well written.

      Perhaps either in the introduction, or later, note there are other 2OG oxygenases that oxidise arginine/arginine derivatives in various ways, e.g. clavaminate synthase/arginine hydroxylases/desaturases.

      We added a sentence mentioning the arginine hydroxylases VioC and OrfP to the introduction and included VioC into the sequence comparison in supplementary figure 2 to show that these enzymes, as well as NapI, are very different from EFE and the plant hydroxylases.

      Results

      Paragraph 1 - qualify similarity and refer to/give a structurally informed sequence alignment, including EFE

      A new supplemental figure S2 was added with sequence identity values and a structurally informed alignment. The text has been modified accordingly.

      Paragraph 2 - briefly state method of guanidine analysis

      We included a reference to the M&M section and mentioned LC-MS in paragraph 2.

      Figure 1 - trivial point - proteins are not expressed/genes are

      We have modified the legend to figure 1. However, we would like to point out that terms like “recombinant protein expression” are widely used in the field. A quick search with Google Ngram Viewer shows that “protein expression” started to appear in the mid-1980s and that its use has stayed constant at 1/8th of that of “gene expression”.

      Define errors clearly in all figure legends, clearly defining biological/technical repeats

      Page 6 - was the His-tag cleared to ensure no issues with Ni contamination?

      We treat individual plants or independent bacterial cultures as biological replicates. Only in the case of enzyme activity assays with NAD(P)H, technical replicates were used and this has been indicated in the legend of figure 6.

      Lower case 'p' in pentafluorobenzyl corrected

      In Figure 2 make clear the hydroxylated intermediates are not observed

      We now use grey color for the intermediates and have put them in brackets. Additionally we state in the figure legend that these intermediates were not detected.

      Pages 6-7 - I may have missed this but it's important to investigate what happens to the 2OG. Is succinate the only product or is ethylene also produced? This possibility should also be considered in the plant studies, i.e. is there any evidence for responses related to perturbed ethylene metabolism. The authors consider a signalling role relating to AASA/P6C, but seem to ignore a potential ethylene connection.

      As stated above, we checked for ethylene production, with a negative result. EFE produced 6 times more guanidine than the plant enzymes under the same conditions, but even 100-fold lower ethylene production would have been clearly detected.

      Page 12 - 'plants have been shown to....' Perhaps note how hydroxy guanidine is made?

      We now mention the canavanine-γ-lyase that cleaves canavanine into hydroxyguanidine and homoserine.

      Overall, I thought the discussion was good, but perhaps a bit long/too speculative on pages 12/13 and this detracted from the biochemical assignment of the enzyme. I'd suggest shortening the discussion somewhat - the precise roles of the enzyme can be the subject of future work. As indicated above, some discussion on potential links to ethylene would be appreciated.

      Since reviewer 2 wanted more (speculative) discussion on the role of the 2-ODD-C23 enzymes and there was no detectable ethylene production, we took the liberty to leave the discussion largely unaltered.

      I'd also like to see some more consideration/metabolic analyses of guanidine related metabolism in the genetically modified plants.

      Such analyses will certainly be included in future experiments once we get an idea about the physiological role of the 2-ODD-C23 enzymes.

      Page 16 - mass spectrometry

      Corrected.

      Please add a structurally informed sequence alignment with EFE and other 2OG oxygenases acting on arginine/derivatives.

      An excerpt of the alignment is now presented in supplementary figure S2.

      Reviewer #2 (Recommendations For The Authors):

      I would like to see more discussion in the manuscript about the possible interconnection/roles between 2-ODD-C23 guanidine-producing, lysine- ALD1-Pipecolate producing, and proline metabolism pathways during both biotic and abiotic stresses.

      Since we were unable to detect pipecolate in any of our plant samples, and our preliminary results with biotic stress did not produce any evidence for a function of the 2-ODD-C23 enzymes in the tested defense responses, we would like to postpone such an extended discussion until we find a condition in which the physiological function of these enzymes is evident.

      Fig. 4: Authors should change colors for Col-0, 0.2 HoArg and ctrl? They look too similar in my pdf file.

      We changed the colors in figure 4 and hope that the enhanced contrast is maintained during the production of the final version of our article.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      The single-mutant and double-mutant crp/rpoB strains were made by co-transduction with a nearby gene deletion (kanR-marked). I couldn't tell from the methods section whether these mutants, e.g., crp-H22N delta-chiA, were compared to wild-type cells or deletion mutants, e.g., delta chiA, in the proteomics experiments. I encourage the authors to explain this more clearly in the methods section, and to briefly mention in the Results section and relevant figure legends that the crp/rpoB mutant strains (and possibly the "wild-type" strains) also have gene deletions. If the comparison "wild-type" strains are fully wild-type (i.e., not deleted for chiA/yjaH), it is especially important to mention this in the Results section and the figure legends since the phenotypic changes could be due to the gene deletions rather than the mutations in crp/rpoB

      We appreciate and agree with the editor's suggestion to clarify this point.

      Accordingly, we have made the following changes to the text:

      p11 L30-34 in the main text:

      "The second experiment similarly compared an engineered BW25113 (BW) strain, containing the two regulatory mutations from the compact set (i.e., crp H22N and rpoB A1245V) together with the deletions used to insert them (see methods and DataS1 file), to a “wild type” BW strain (a corresponding knockout strain without the mutations, see methods)."

      p28 under Chemostat proteomics experiment L13-16 in methods:

      "The starting volume of each bioreactor was 150 ml M9 media supplemented with either 30 mM and 10mM D-xylose for the evolved and ancestor samples or only 10mM D-xylose for BW including compact set mutations and/or the deletions used for their insertions (DataS1 file). The minimal media also included trace elements and vitamin B1 was omitted."

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Sender et al describe a model to estimate what fraction of DNA becomes cell-free DNA in plasma. This is of great interest to the community, as the amount of DNA from a certain tissue (for example, a tumor) that becomes available for detection in the blood has important implications for disease detection.

      However, the authors' methods do not consider important variables related to cell-free DNA shedding and storage, and their results may thus be inaccurate. At this stage of the paper, the methods section lacks important detail. Thus, it is difficult to fully assess the manuscript and its results.

      Strengths:

      The question asked by the authors has potentially important implications for disease diagnosis. Understanding how genomic DNA degrades in the human circulation can guide towards ways to enrich for DNA of interest or may lead to unexpected methods of conserving cell-free DNA. Thus, the question "how much genomic DNA becomes cfDNA" is of great interest to the scientific and medical community. Once the weaknesses of the manuscript are addressed, I believe this manuscript has the potential to be a widely used resource.

      Weaknesses:

      There are two major weaknesses in how the analysis is presented. First, the methods lack detail. Second, the analysis does not consider key variables in their model.

      Issues pertaining to the methods section.

      The current manuscript builds a flux model, mostly taking values and results from three previous studies: 1) The amount of cellular turnover by cell type, taken from Sender & Milo, 2021

      2) The fractions of various tissues that contribute DNA to the plasma, taken from Moss et al, 2018 and Loyfer et al, 2023

      My expertise lies in cell-free DNA, and so I will limit my comments to the manuscripts in (2). Paper by Loyfer et al (additional context):

      Loyfer et al is a recent landmark paper that presents a computational method for deconvoluting tissues of origin based on methylation profiles of flow-sorted cell types. Thus, the manuscript provides a well-curated methylation dataset of sorted cell-types. The majority of this manuscript describes the methylation patterns and features of the reference methylomes (bulk, sorted cell types), with a smaller portion devoted to cell-free DNA tissue of origin deconvolution.

      I believe the data the authors are retrieving from the Loyfer study are from the 23 healthy plasma cfDNA methylomes analyzed in the study, and not the re-analysis of the 52 COVID-19 samples from Cheng et al (MED 2021).

      Paper by Moss et al (additional context):

      Moss et al is another landmark paper that predates the Loyfer et al manuscript. The technology used in this study (methylation arrays) is outdated but is an incredible resource for the community. This paper evaluates cfDNA tissues of origin in health and different disease scenarios. Again, I assume the current manuscript only pulled data from healthy patients, although I cannot be sure as it is not described in the methods section.

      This manuscript:

      The current manuscript takes (I think) the total cfDNA concentration from males and females from the Moss et al manuscript (pooled cfDNA; 2 young male groups, 2 old male groups, 2 young female groups, 2 old female groups, Supplementary Dataset; "total_cfDNA_conc" tab). I believe this is the data used as total cfDNA concentration. It would be beneficial for all readers if the authors clarified this point.

      The tissue-of-origin data in the supplemental dataset ("fraction" tab) cover 8 cell types (erythrocytes, monocytes/macrophages, megakaryocytes, granulocytes, hepatocytes, endothelial cells, lymphocytes, other). The fractions in the spreadsheet do not match the Loyfer or Moss manuscripts for healthy individuals. Thus, I do not know what values the supplementary dataset represents. I also don't know which deconvolution values are used for the flux model.

      The integration of these two methods lacks detail. Are the authors here using yields (i.e., cfDNA concentrations) from Moss et al, and tissue fractions from Loyfer et al? If so, why? There are more samples in the Loyfer manuscript, so why are the samples from Moss et al. being used? The authors are also selectively ignoring cell-types that are present in healthy individuals (Neurons from Moss et al, 2018). Why?

      Appraisal:

      At this stage of the manuscript, I think additional evidence and analysis is required to confirm the results in the manuscript.

      Impact:

      Once the authors present additional analysis to substantiate their results, this manuscript will be highly impactful on the community. The field of liquid biopsies (non-invasive diagnostics) has the potential to revolutionize the medical field (and has already in certain areas, such as prenatal diagnostics). Yet, there is a lack of basic science questions in the field. This manuscript is an important step forward in asking more "basic science" questions that seek to answer a fundamental biological question.

      We thank the reviewer for the valuable comments on our analysis. In response to the feedback, we have updated the analysis to address all critical points as described below and revised the text to enhance the clarity of our methodology. One notable improvement to our analysis involved ensuring better alignment between the cohort data for cfDNA plasma concentration and cell turnover estimates. To achieve this, we utilized the total plasma concentration of cfDNA from a study conducted by Meddeb et al. 2019, taking into account the influence of age and sex on these concentrations and specifically focusing on a cohort of relatively young and healthy individuals. Additionally, we considered expected variations related to sex, age, and other pertinent factors, as outlined in the studies by Meddeb et al. 2019 and Madsen et al. 2019.

      In addition, we have addressed concerns regarding the technical aspects of cfDNA analysis, providing detailed explanations of their limited impact on our analysis and the resulting conclusions.

      Reviewer #2 (Public Review):

      Summary:

      Cell-free DNA (cfDNA) are short DNA fragments released into the circulation when cells die. Plasma cfDNA level is thought to reflect the degree of cell-death or tissue injury. Indeed, plasma cfDNA is a reliable diagnostic biomarker for multiple diseases, providing insights into disease severity and outcomes. In this manuscript, Dr. Sender and colleagues address a fundamental question: What fraction of DNA released from cell death is detectable as plasma cfDNA? The authors use public data to estimate the amount of DNA produced from dying cells. They also utilize public data to estimate plasma cfDNA levels. Their calculations showed that <10% of DNA released is detectable as plasma cfDNA, the fraction of detectable cfDNA varying by tissue sources. The study demonstrates new and fundamental principles that could improve disease diagnosis and treatment via cfDNA.

      Strengths:

      1) The experimental approach is resource-mindful, taking advantage of publicly available data to estimate the fraction of detectable cfDNA in physiological states. The authors did not assess if the fraction of detectable cfDNA changes in disease conditions. Nonetheless, their pioneering study lays the foundation and provides the methods needed for a similar assessment in disease states.

      2) The findings of this study potentially explain discrepancies in measured versus expected tissue-specific cfDNA from some tissues. For example, the gastrointestinal tract is subject to high cell turnover and release of DNA. Yet, only a small fraction of that DNA ends up in plasma as gastrointestinal cfDNA.

      3) The study proposes potential mechanisms that could account for the low fraction of detectable cfDNA in plasma relative to DNA released. This includes intracellular or tissue machinery that could "chew up" DNA released from dying cells, allowing only a small fraction to escape into plasma as cfDNA. Could this explain why the gastrointestinal tract, with its elaborate phagosome machinery, contributes a small fraction of plasma cfDNA? Given the role of cfDNA as a damage-associated molecular pattern in some diseases, targeting such machinery may provide novel therapeutic opportunities.

      Weaknesses:

      In vitro and in vivo studies are needed to validate these findings and define the tissue machinery that contributes to cfDNA production. The validation studies should address the following limitations of the study design:

      1) Align the cohorts to estimate DNA production and plasma cfDNA levels. Cellular turnover rate and plasma cfDNA levels vary with age, sex, circadian clock, and other factors (Madsen AT et al, EBioMedicine, 2019). This study estimated DNA production using data abstracted from a homogeneous group of healthy control males (Sender & Milo, Nat Med 2021). On the other hand, plasma cfDNA levels were obtained from datasets of a more diverse cohort of healthy males and females with a wide range of ages (Loyfer et al. Nature, 2023 and Moss et al., Nat Commun, 2018).

      2) "cfDNA fragments are not created equal". Recent studies demonstrate that cfDNA composition varies with disease state. For example, cfDNA GC content, fraction of short fragments, and composition of some genomic elements increase in heart transplant rejection compared to no-rejection state (Agbor-Enoh, Circulation, 2021). The genomic location and disease state may therefore be important factors to consider in these analyses.

      3) Alternative sources of DNA production should be considered. Aside from cell death, DNA can be released from cells via active secretion. This and other additional sources of DNA should be considered in future studies. The distinct characteristics of mitochondrial DNA to genomic DNA should also be considered.

      We appreciate the reviewer's comments on our analysis. In response to the feedback, we have updated the analysis to address the key points and revised the text accordingly.

      1) We have incorporated several enhancements to improve the coherence of our analysis. In our revised analysis, we drew upon the total plasma concentration of cfDNA documented by Meddeb et al. (2019), while considering the influence of age and sex on these concentrations. To align the cohorts, we focused on relatively young and healthy individuals, specifically those below the age of 47. This approach allowed for a more meaningful comparison with the estimated DNA flux from a reference male human aged between 20 and 30 years.

      Neither Meddeb et al. nor Loyfer et al. provide a specific estimate for a cohort of young males; however, we factored in the expected variations stemming from sex, age, and other relevant factors, as described in the literature (Meddeb et al. 2019; Madsen et al. 2019). We thereby show that sex and age have a small effect on cfDNA concentrations and are unlikely to alter our conclusions substantially when considering a healthy population. We summarize the changes in the first paragraph, replacing the “Tissue-specific cfDNA concentration” subsection of the Methods, and the fourth paragraph added to the Discussion.

      2) In this study, we addressed the total amount of cfDNA in healthy individuals without regard to GC content, representation of different genomic regions, or fragment length, as the goal was to understand if cell death rates are fully accounted for by cfDNA concentration. We agree that it will be interesting to study the relative representation of the genome in cfDNA and the processes that determine cfDNA concentration in pathologies beyond the rate of cell death. These topics for future research fall beyond this study's scope.

      3) We know of only a few specific cases in which DNA is released from cells that are not dying. These include the release of DNA from erythroblasts and megakaryocytes to generate anucleated erythrocytes and platelets (Moss et al. 2022, cited in our paper) and the release of NETs from neutrophils.

      The presence of cfDNA fragments originating from megakaryocytes and erythroblasts indicates the elimination of megakaryocytes and erythroblasts and the birth of erythrocytes and platelets. However, the considerations in the rest of the paper still apply: the concentration of cfDNA from these sources is far lower than expected from the cell turnover rate.

      Concerning NETosis: the presence of cfDNA originating in neutrophils that have not died would reduce the concentration of cfDNA from dying neutrophils and thus further increase the discrepancy, which is the topic of our study (under-representation of DNA from dying cells in plasma).

      We neglected mitochondrial DNA, as it is not measured in methylation cell-of-origin analysis. Similarly to the argument above, if some of the total DNA measured in plasma is, in fact, mitochondrial, this would mean that genomic cfDNA concentration is actually lower than the estimates, meaning that an even smaller fraction of DNA from dying cells is measured in plasma.

      Recommendations For The Authors

      Reviewer #1 (Recommendations For The Authors):

      I think readers would appreciate the authors commenting or addressing the following points, in addition to addressing the concerns I raised about the methods section in the public review:

      What variables and considerations did the authors omit in this study?

      1) Cell-free DNA is found in virtually every biofluid.

      Thus, the fact that cell-free DNA is not present in the plasma does not mean it cannot be detected elsewhere. This also implies that phagocytosis may not be the only factor related to cfDNA not being present in the blood. One example (of many, many others) is neutrophil-derived cell-free DNA, which is present in the urine.

      Indeed, dying cells and their DNA can be consumed locally, released into the blood, or shed outside the body. The latter is a function of tissue topology. For example, intestinal epithelial cell turnover releases material to the lumen of the gut (i.e., stool); kidney and bladder cell turnover releases material to urine; and lung epithelium releases material to the air spaces. In these cases, the absence of cfDNA in plasma is expected. However, in cases where tissue topology dictates release to blood, low representation in cfDNA indicates local consumption or a related mechanism. In Figure 1 of the manuscript, we distinguish between tissues according to their topology, denoting organs that shed material to the outside with open circles.

      Neutrophil-derived DNA in urine likely represents a local process in the kidney (neutrophils that penetrate the epithelium and fall into the urine). Neutrophils that die elsewhere in the body must release cfDNA to the blood before it can reach the urine. Hence, quantifying plasma cfDNA is a legitimate approach for assessing the relationship between cell death and cfDNA. The revised text clarifies this point. We made revisions to the initial paragraph in the results section and a paragraph within the discussion to provide clarity on this topic:

      “Based on atlases of human cell type-specific methylation signatures, Moss et al. and Loyfer et al. analyzed the main cell types contributing to plasma cfDNA. They found the primary sources of plasma cfDNA to be blood cells: granulocytes, megakaryocytes, macrophages, and/or monocytes (the signature could not differentiate between the last two), lymphocytes, and erythrocyte progenitors. Other cells that had detectable contributions are endothelial cells and hepatocytes. Qualitatively, these cells represent most of the leading cell types in cellular turnover, as shown in Sender & Milo 2021 (Sender and Milo 2021). Epithelial cells of the gastrointestinal tract, lung, kidney, bladder, and skin are other cell types that significantly contribute to cellular turnover. Dying cells in these tissues are shed into the gut lumen, the air spaces, the urine, or out of the skin (note that while DNA from gut, lung, and kidney epithelial cells can be found in stool, bronchoalveolar lavage, and urine, the fate of DNA from skin cells is not known). This arrangement may explain why DNA from these cell types is not represented in plasma cfDNA in healthy conditions. Therefore, it appears that cells with high cfDNA plasma levels are those with relatively high turnover that are not being shed out of the body.”

      “A comparison between the different types of cells shows a trend in which less DNA flux from cells with higher turnover gets to the bloodstream. In particular, a tiny fraction (1 in 3×10^4) of DNA from erythroid progenitors arrives at the plasma, indicating an extreme efficiency of the DNA recovery mechanism. Erythroid progenitors are arranged in erythroblastic islands. Up to a few tens of erythroid progenitors surround a single macrophage that collects the nuclei extruded during the erythrocyte maturation process (pyrenocytes) (Chasis and Mohandas 2008). The amount of DNA discarded through the maturation of over 200 billion erythrocytes per day (Sender and Milo 2021) exceeds all other sources of homeostatic discarded DNA. Our findings indicate that the organization of dedicated erythroblastic islands functions highly efficiently regarding DNA utilization. Neutrophils are another high-turnover cell type with a low level of cfDNA. When contemplating the process of NETosis (Vorobjeva and Chernyak 2020), the existence of cfDNA originating from live neutrophils would potentially diminish the concentration of cfDNA released by dying neutrophils, thereby amplifying the observed ratio for this particular cell type. The overall trend of higher turnover resulting in a lower cfDNA to DNA flux ratio may indicate similar design principles, in which the utilization of DNA is better in tissues with higher turnover. However, our analysis is limited to only several cell types (due to cfDNA test and deconvolution sensitivities), and extrapolation to cells with lower cell turnover is problematic.”
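      The scale of the erythroid numbers quoted above can be illustrated with a short back-of-the-envelope calculation. This is only a sketch, not the authors' flux model: the erythrocyte count (over 200 billion per day, treated here as exactly 200 billion) and the 1 in 3×10^4 fraction come from the quoted text, while the DNA content per diploid cell (about 6.5 pg) is an assumed textbook value that does not appear in the response.

```python
# Order-of-magnitude sketch of DNA discarded during erythrocyte maturation.
# From the quoted text: ~200 billion erythrocytes/day, 1 in 3x10^4 reaching plasma.
# ASSUMPTION (not from the source): ~6.5 pg of DNA per diploid human nucleus.

ERYTHROCYTES_PER_DAY = 200e9
DNA_PER_DIPLOID_CELL_PG = 6.5          # assumed value, for illustration only
FRACTION_REACHING_PLASMA = 1 / 3e4


def discarded_dna_grams_per_day(cells_per_day: float, pg_per_cell: float) -> float:
    """Mass of nuclear DNA discarded per day, in grams (1 pg = 1e-12 g)."""
    return cells_per_day * pg_per_cell * 1e-12


flux_g = discarded_dna_grams_per_day(ERYTHROCYTES_PER_DAY, DNA_PER_DIPLOID_CELL_PG)
reaching_plasma_ug = flux_g * FRACTION_REACHING_PLASMA * 1e6  # grams -> micrograms

print(f"DNA discarded: {flux_g:.2f} g/day")
print(f"Implied amount reaching plasma: {reaching_plasma_ug:.1f} ug/day")
```

      Under these assumptions, roughly a gram of DNA is discarded per day, of which only tens of micrograms would appear in plasma. The exact figures depend on the assumed per-cell DNA mass.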

      2) Effect of biofluid storage.

      Cell-free DNA continues to degrade after it is extracted via blood draw. This is not expected to change tissue of origin predictions (although that remains to be shown in the literature), but definitely affects extraction yield. This is not accounted for (or even discussed) in the manuscript. It would be important to understand how this was done for the data presented here.

      The paper integrates data from multiple recent studies that adhered to state-of-the-art procedures requiring rapid processing of blood samples. In fact, earlier studies that were not careful to isolate plasma quickly typically reported very high concentrations due to the lysis of leukocytes and artifactual release of genomic DNA. Rapid plasma isolation and DNA extraction typically yield 5ng/ml in healthy donors, as stated in the paper (last paragraph of Results).

      3) Batch effects

      Batch effects are not discussed here and can affect cfDNA yields.

      Our analysis relies on data reported by multiple studies from different groups, which independently arrive at similar key findings (total concentration of cfDNA and the relative contribution of different tissues). Thus, batch effects are unlikely to affect the calculations markedly.

      4) Cell-free DNA extraction kits

      Different kits and methods extract cell-free DNA at different quantities. Importantly, much recent research has shown that most kits are not sensitive to ultrashort cell-free DNA (of lengths ~50bp). This may represent most of the DNA present in plasma. This raises an important question: are the yields that are being used in Moss et al (where I presume the total concentration is taken from) accurate? Is there more cell-free DNA that was missed? While the importance of this ultrashort cfDNA has yet to be shown, it is in the blood. Thus, the authors' model may underestimate ratios by not accounting for this. This is mentioned in the discussion, but it is not evident why it was not added into the model.

      The Qiagen cfDNA extraction kit can detect 50bp fragments. As shown in the specification sheets of the kit (https://www.qiagen.com/us/products/diagnostics-and-clinical-research/solutions-for-laboratory-developed-tests/qiasymphony-dsp-circulating-dna-kit), urine DNA contains abundant DNA fragments that peak at 50bp. In contrast, plasma cfDNA does not contain such fragments at appreciable concentrations. This suggests that small fragments, 50-150bp long, are not a major component of cfDNA, and thus, our measurements of the total concentration of cfDNA are not dramatically underestimated.

      The convention regarding the size distribution of cfDNA fragments is based on extensive evidence using multiple approaches. For example, a study that profiled the DNA released by multiple cell lines in vitro (Aucamp et al. 2017) used another kit for DNA isolation – the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Düren, Germany). This kit does extract fragments that are 50bp long (nucleospin-gel-and-pcr-clean-up-mini). Indeed, the DNA released from cultured cells did contain a peak at 50bp, but it was minor compared with the nucleosome-size peak.

      More recently, several studies did suggest the presence of ultra-short cfDNA fragments, 50 bp long on average, and concluded that such fragments might be present at a molar concentration comparable to that of nucleosome-protected DNA (for example, Hisano et al. 2021).

      Thus, our model estimates can be off by up to 2-fold (that is, actual cfDNA concentration measured in most studies overlooks the small fragments and thus underestimates the actual concentration of cfDNA by 2-fold). This is incorporated into the revised manuscript.

      We note that we cannot exclude the presence of abundant ultra-short DNA fragments (e.g., 10bp long). However, such fragments are not measurable in cfDNA analysis. Thus, we can refine our conclusion and state that only a small fraction of DNA of dying cells appears as measured cfDNA. We included a section in the methods detailing the integration of a potential factor for the short fragments and revised the discussion:

      “The overall plasma cfDNA concentration was multiplied by a factor of 1.5 to accommodate for the presence of small fragments of approximately 50 base pairs of cfDNA in the plasma. These fragments are suggested to contribute comparable molar concentrations (Hisano, Ito, and Miura 2021). Despite having approximately one-third of the mass, it is reasonable to presume that these fragments represent a similar number of genomes. This assumption is based on the idea that their source is a broken nucleosome unit, and the fragments represent the portion that was not degraded. Given the restricted data and its interpretation, we consider factors spanning the range of 1 (negligible effect) and 2 (doubling of the amount). The chosen factor, 1.5, is selected as the midpoint within this range of uncertainty.”

      “In this study, we report a surprising, dramatic discrepancy between the measured levels of cfDNA in the plasma and the potential DNA flux from dying cells. One hypothetical explanation for that discrepancy is the limited sensitivity of typical cfDNA assays to short DNA fragments, which may contribute a significant fraction of the overall cfDNA mass. Regular cfDNA analysis shows a size distribution concentrated around a length of 165 base pairs (bp). The sizes in ctDNA vary more, but most are longer than 100 bp (Alcaide et al. 2020; Udomruk et al. 2021). Recent studies suggested a significant fraction of single-strand ultrashort fragments (length of 25-60 bp) (Cheng et al. 2022; Hisano, Ito, and Miura 2021). However, the total amount of DNA contained in these fragments is less than or comparable to that of the longer “regular” nucleosome-protected cfDNA fragments (Cheng et al. 2022; Hisano, Ito, and Miura 2021), arguing against ultrashort fragments as a dominant explanation for the “missing” cfDNA material. We integrated the estimate provided by Hisano et al. into our analysis as a modifying factor for both the total concentration and uncertainty of plasma cfDNA. Importantly, this incorporation did not alter the overall conclusions, as the discrepancy between the cfDNA plasma concentration and potential DNA flux remains on the same order of magnitude. We note that we cannot exclude the presence of abundant DNA fragments that are even shorter (e.g., 10bp long) and are not measurable in cfDNA analysis. Thus, our formal conclusion is that only a small fraction of the DNA of dying cells appears as measurable cfDNA.”
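      The short-fragment correction described in the quoted Methods text can be sketched in a few lines. The range of 1 to 2 and the midpoint of 1.5 are taken from the quoted paragraph, as are the fragment lengths (about 50 bp and 165 bp). The input of 5 ng/ml is the typical healthy-donor yield mentioned earlier in the response and is used purely as an illustrative value; the function name and return structure are hypothetical.

```python
# Sketch of the short-fragment correction: scale the measured concentration by
# the midpoint of a plausible range of factors (1 = negligible effect,
# 2 = doubling). The function name and dict layout are illustrative only.

def correct_for_short_fragments(measured_ng_per_ml: float,
                                low: float = 1.0,
                                high: float = 2.0) -> dict:
    """Return the corrected concentration and the range spanned by [low, high]."""
    factor = (low + high) / 2.0
    return {
        "factor": factor,
        "central": measured_ng_per_ml * factor,
        "lower": measured_ng_per_ml * low,
        "upper": measured_ng_per_ml * high,
    }


# Mass of a ~50 bp fragment relative to a ~165 bp nucleosome-size fragment,
# i.e. the "approximately one-third of the mass" in the quoted text.
mass_ratio = 50 / 165

result = correct_for_short_fragments(5.0)  # 5 ng/ml: illustrative input
print(result)
print(f"mass ratio of short to regular fragment: {mass_ratio:.2f}")
```

      Treating the factor as a range rather than a point value keeps the correction inside the uncertainty budget instead of presenting it as a precise measurement.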

      5) Health status of samples analyzed.

      Health, sex, and physical activity affect cfDNA yields. This is not accounted for or discussed in the manuscript.

      We incorporated several enhancements to improve our analysis in response to the provided feedback. In our revised analysis, we drew upon the total plasma concentration of cfDNA documented by Meddeb et al. (2019), while considering the influence of age and sex on these concentrations. To align the cohorts, we focused on relatively young and healthy individuals, specifically those below the age of 47. This approach allowed for a more meaningful comparison with the estimated DNA flux from a reference male human aged between 20 and 30 years.

      Furthermore, we factored in the expected variations stemming from sex, age, and other relevant factors, as described by Meddeb et al. (2019) and Madsen et al. (2019). Our intent in doing so was to demonstrate that these factors are unlikely to alter our conclusions substantially when considering a healthy population. We summarize the changes in the first paragraph, replacing the “Tissue-specific cfDNA concentration” subsection of the Methods, and the fourth paragraph added to the Discussion:

      “Our estimates for total plasma cfDNA concentration were derived from the median concentration observed in individuals below 47 years of age (n=52), as reported by Meddeb et al. (2019). To complement this, we integrated our total concentration estimates with data on the proportion of cfDNA originating from specific cell types, leveraging a plasma methylome deconvolution method described by Loyfer et al. (2023), which did not provide absolute quantities of cfDNA. To quantify the uncertainty associated with our cfDNA concentration estimates, we employed a methodology that considered several sources of variation. First, we incorporated the confidence interval of the median concentration reported by Meddeb et al. as a measure of uncertainty. Additionally, we accounted for individual-specific and analytic variations based on the study by Madsen et al. (2019), encompassing factors such as the precise timing of measurements and assay precision. These sources of uncertainty were combined using the approach outlined below.”

      “Our current analysis focused on estimating plasma cfDNA concentration and cellular turnover in a cohort of healthy, relatively young individuals. The total plasma cfDNA concentrations were sourced from healthy individuals below 47 years, as reported by (Meddeb et al. 2019). We use data analyzed based on plasma samples from healthy individuals to estimate the proportion of cfDNA originating from specific cell types (Loyfer et al. 2023). These values were then compared to the potential DNA flux resulting from homeostatic cellular turnover, estimated for reference healthy males aged between 20 and 30 (Sender and Milo 2021). In our analysis, we considered various sources of uncertainty, including inter-individual variation, variability in the timing of sample collection, and analytical precision (Madsen et al. 2019; Meddeb et al. 2019). These factors collectively contributed to an uncertainty factor of less than 3. Importantly, this level of uncertainty does not alter our conclusion regarding the relatively small fraction of DNA present in plasma as cfDNA. Furthermore, we acknowledge that age and sex can impact total cfDNA concentration, as demonstrated by (Meddeb et al. 2019), with potential variations of up to 30%. However, as the results of our analysis present a much larger difference, these effects do not change the conclusions drawn from our analysis. Nevertheless, age and health status may influence the proportion of cfDNA originating from specific cell types and their corresponding cellular turnover rates. Consequently, the ratios themselves may vary in the elderly population or individuals with underlying health conditions.”
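      The response does not reproduce the procedure used to combine the sources of uncertainty. One common way to combine independent multiplicative uncertainty factors is to add their logarithms in quadrature; the sketch below shows that approach as an assumption, not as the authors' method. The individual factor values are hypothetical placeholders chosen only to show how a combined factor below 3 could arise.

```python
import math

# Sketch: combine independent multiplicative ("fold") uncertainty factors by
# summing squared log-factors. This is an ASSUMED procedure; the response only
# says the sources "were combined using the approach outlined below".

def combine_uncertainty_factors(factors):
    """Combine fold-uncertainty factors (each >= 1) in quadrature in log space."""
    if any(f < 1.0 for f in factors):
        raise ValueError("each uncertainty factor must be >= 1")
    sum_sq = sum(math.log(f) ** 2 for f in factors)
    return math.exp(math.sqrt(sum_sq))


# Hypothetical values, for illustration only (not taken from the cited studies).
hypothetical_factors = {
    "confidence interval of the median": 1.3,
    "inter-individual and timing variation": 1.8,
    "analytical precision": 1.2,
}

combined = combine_uncertainty_factors(hypothetical_factors.values())
print(f"combined uncertainty factor: {combined:.2f}")
```

      Working in log space keeps the result symmetric in fold terms: a combined factor of f means the estimate is taken to lie between value/f and value×f.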

      Reviewer #2 (Recommendations For The Authors):

      1) Align the cohorts to estimate DNA production and plasma cfDNA levels. Cellular turnover rate and plasma cfDNA levels vary with age, sex, circadian clock, and other factors (Madsen AT et al, EBioMedicine, 2019). This study estimated DNA production using data abstracted from a homogeneous group of healthy control males (Sender & Milo, Nat Med 2021). On the other hand, plasma cfDNA levels were obtained from datasets of a more diverse cohort of healthy males and females with a wide range of ages (Loyfer et al. Nature, 2023 and Moss et al., Nat Commun, 2018).

      We have incorporated several enhancements to improve the coherence of our analysis. In our revised analysis, we drew upon the total plasma concentration of cfDNA documented by Meddeb et al. (2019), while considering the influence of age and sex on these concentrations. To align the cohorts, we focused on relatively young and healthy individuals, specifically those below the age of 47. This approach allowed for a more meaningful comparison with the estimated DNA flux from a reference male human aged between 20 and 30 years.

      Neither Meddeb et al. nor Loyfer et al. provide a specific estimate for a cohort of young males; however, we factored in the expected variations stemming from sex, age, and other relevant factors, as described in the literature (Meddeb et al. 2019; Madsen et al. 2019). We thereby show that sex and age have a small effect on cfDNA concentrations and are unlikely to alter our conclusions substantially when considering a healthy population.

      We summarize the changes in the first paragraph, replacing the “Tissue-specific cfDNA concentration” subsection of the Methods, and the fourth paragraph added to the Discussion.

      “Our estimates for total plasma cfDNA concentration were derived from the median concentration observed in individuals below 47 years of age (n=52), as reported by Meddeb et al. (2019). To complement this, we integrated our total concentration estimates with data on the proportion of cfDNA originating from specific cell types, leveraging a plasma methylome deconvolution method described by Loyfer et al. (2023), which did not provide absolute quantities of cfDNA. To quantify the uncertainty associated with our cfDNA concentration estimates, we employed a methodology that considered several sources of variation. First, we incorporated the confidence interval of the median concentration reported by Meddeb et al. as a measure of uncertainty. Additionally, we accounted for individual-specific and analytic variations based on the study by Madsen et al. (2019), encompassing factors such as the precise timing of measurements and assay precision. These sources of uncertainty were combined using the approach outlined below.”

      “Our current analysis focused on estimating plasma cfDNA concentration and cellular turnover in a cohort of healthy, relatively young individuals. The total plasma cfDNA concentrations were sourced from healthy individuals below 47 years, as reported by (Meddeb et al. 2019). We use data analyzed based on plasma samples from healthy individuals to estimate the proportion of cfDNA originating from specific cell types (Loyfer et al. 2023). These values were then compared to the potential DNA flux resulting from homeostatic cellular turnover, estimated for reference healthy males aged between 20 and 30 (Sender and Milo 2021). In our analysis, we considered various sources of uncertainty, including inter-individual variation, variability in the timing of sample collection, and analytical precision (Madsen et al. 2019; Meddeb et al. 2019). These factors collectively contributed to an uncertainty factor of less than 3. Importantly, this level of uncertainty does not alter our conclusion regarding the relatively small fraction of DNA present in plasma as cfDNA. Furthermore, we acknowledge that age and sex can impact total cfDNA concentration, as demonstrated by (Meddeb et al. 2019), with potential variations of up to 30%. However, as the results of our analysis present a much larger difference, these effects do not change the conclusions drawn from our analysis. Nevertheless, age and health status may influence the proportion of cfDNA originating from specific cell types and their corresponding cellular turnover rates. Consequently, the ratios themselves may vary in the elderly population or individuals with underlying health conditions.”

      2) "cfDNA fragments are not created equal". Recent studies demonstrate that cfDNA composition varies with disease state. For example, cfDNA GC content, fraction of short fragments, and composition of some genomic elements increase in heart transplant rejection compared to no-rejection state (Agbor-Enoh, Circulation, 2021). The genomic location and disease state may therefore be important factors to consider in these analyses.

      In this study, we addressed the total amount of cfDNA in healthy individuals without regard to GC content, representation of different genomic regions, or fragment length, as the goal was to understand if cell death rates are fully accounted for by cfDNA concentration. We agree that it will be interesting to study the relative representation of the genome in cfDNA and the processes that determine cfDNA concentration in pathologies beyond the rate of cell death. These topics for future research fall beyond this study's scope.

      3) Alternative sources of DNA production should be considered. Aside from cell death, DNA can be released from cells via active secretion. This and other additional sources of DNA should be considered in future studies. The distinct characteristics of mitochondrial DNA to genomic DNA should also be considered.

      We know of only a few specific cases in which DNA is released from cells that are not dying. These include the release of DNA from erythroblasts and megakaryocytes to generate anucleated erythrocytes and platelets (Moss et al. 2022, cited in our paper) and the release of NETs from neutrophils.

      The presence of cfDNA fragments originating from megakaryocytes and erythroblasts indicates the elimination of megakaryocytes and erythroblasts and the birth of erythrocytes and platelets. However, the considerations in the rest of the paper still apply: the concentration of cfDNA from these sources is far lower than expected from the cell turnover rate.

      Concerning NETosis: the presence of cfDNA originating in neutrophils that have not died would reduce the concentration of cfDNA from dying neutrophils and thus further increase the discrepancy, which is the topic of our study (under-representation of DNA from dying cells in plasma).

      We updated a paragraph in the discussion regarding this issue:

      “A comparison between the different types of cells shows a trend in which less DNA flux from cells with higher turnover gets to the bloodstream. In particular, a tiny fraction (1 in 3×10^4) of DNA from erythroid progenitors arrives at the plasma, indicating an extreme efficiency of the DNA recovery mechanism. Erythroid progenitors are arranged in erythroblastic islands. Up to a few tens of erythroid progenitors surround a single macrophage that collects the nuclei extruded during the erythrocyte maturation process (pyrenocytes) (Chasis and Mohandas 2008). The amount of DNA discarded through the maturation of over 200 billion erythrocytes per day (Sender and Milo 2021) exceeds all other sources of homeostatic discarded DNA. Our findings indicate that the organization of dedicated erythroblastic islands functions highly efficiently regarding DNA utilization. Neutrophils are another high-turnover cell type with a low level of cfDNA. When contemplating the process of NETosis (Vorobjeva and Chernyak 2020), the existence of cfDNA originating from live neutrophils would potentially diminish the concentration of cfDNA released by dying neutrophils, thereby amplifying the observed ratio for this particular cell type. The overall trend of higher turnover resulting in a lower cfDNA to DNA flux ratio may indicate similar design principles, in which the utilization of DNA is better in tissues with higher turnover. However, our analysis is limited to only several cell types (due to cfDNA test and deconvolution sensitivities), and extrapolation to cells with lower cell turnover is problematic.”

      We neglected mitochondrial DNA, as it is not measured in methylation cell-of-origin analysis. Similarly to the argument above, if some of the total DNA measured in plasma is in fact mitochondrial, this would mean that genomic cfDNA concentration is actually lower than the estimates, meaning that an even smaller fraction of DNA from dying cells is measured in plasma.
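
      The direction of this argument can be checked with a small calculation. The sketch below is illustrative only: the function name, the mitochondrial share, and the concentration and flux values are hypothetical and are not taken from the manuscript. It shows that any non-zero mitochondrial share of the measured cfDNA lowers the ratio of genomic cfDNA to the DNA flux from dying cells.

```python
def genomic_cfdna_to_flux_ratio(total_cfdna, mito_fraction, dna_flux):
    """Ratio of genomic cfDNA to DNA flux from dying cells.

    total_cfdna and dna_flux must be in the same (arbitrary) units;
    mito_fraction is the assumed share of total cfDNA that is mitochondrial.
    """
    if not 0.0 <= mito_fraction < 1.0:
        raise ValueError("mito_fraction must be in [0, 1)")
    if dna_flux <= 0:
        raise ValueError("dna_flux must be positive")
    genomic_cfdna = total_cfdna * (1.0 - mito_fraction)
    return genomic_cfdna / dna_flux


# Hypothetical numbers, chosen only to show the direction of the effect.
ratio_ignoring_mito = genomic_cfdna_to_flux_ratio(1.0, 0.0, 1000.0)
ratio_with_mito = genomic_cfdna_to_flux_ratio(1.0, 0.1, 1000.0)
print(ratio_ignoring_mito, ratio_with_mito)
```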

    1. Author Response

      The following is the authors’ response to the current reviews.

      We would firstly like to thank all reviewers for their comments and support of this manuscript.

      Reviewer #1 (Recommendations For The Authors):

      No further recommendations.

      Reviewer #2 (Recommendations For The Authors):

      All of my comments have been sufficiently addressed.

      Reviewer #3 (Recommendations For The Authors):

      Thanks for responding to my former recommendations constructively. I believe these points have been fully addressed in this new version.

      However, I have not seen any comments on the points I raised in my former public review concerning the I-2 dependence of the FonSIX4 cell death. Do you know whether FonSIX4 would trigger cell death in tissues not expressing any I-2?

      We are a little confused concerning this comment. I-2 is a different class of resistance protein (NLR) that recognises Avr2 and this is likely to be intracellular. From the previous public review, we believe reviewer 3 may have been asking us to clarify the dependence of I (MM or M82) on FonSIX4 cell death. We have performed these controls by expressing FonSIX4 and associated FonSIX4/Avr1 chimeras in N. benthamiana (with the PR-1 signal peptide for efficient secretion of effectors) and it does not cause cell death in the absence of the I receptor – see S11F Fig. This was not explicitly conveyed in text so we have included the following in text: “Using the N. benthamiana assay we show FonSIX4 is recognised by I receptors from both cultivars (IM82 and iMoneymaker) and cell death is dependent on the presence of IM82 or iMoneymaker (Fig 5B, S11 Fig).”

      I still recommend discussing whether the Avr1 residues crucial for Avr activity are in the same structural regions of the C-terminal domain where previous work has identified residues under diversifying selection in symbiotic fungal FOLD proteins.

      The region important for recognition does encompass some residues within the structural region previously reported to be under diversifying selection in FOLD effectors from Rhizophagus irregularis (two residues within one beta-strand). However, we also see residues that do not overlap with this area. We also note that the mycFOLD proteins analysed in symbiotic fungi are heavily skewed towards strong structural similarity with FolSIX6 (similar cysteine spacing within both N and C-domains and structural orientation of the N and C-domains) rather than Avr1. We are under the impression that Avr1 was not included in the analysis of diversifying selection in symbiotic fungal FOLD proteins, and it is unclear to us whether close Avr1 homologues are present. With this in mind, and considering our already lengthy discussion (as previously highlighted during review), we have decided not to include further discussion concerning this point.


      The following is the authors’ response to the original reviews.

      We would like to thank the editor(s) and reviewers for their work concerning our manuscript. Most of the suggested changes were related to text changes which we have incorporated into the revised version. Please find our response to reviewers below.

      Reviewer #1 (Recommendations For The Authors):

      I only have very minor suggestions for the authors. The first one comes from reading the manuscript and finding it very dense with so many acronyms. This will limit the audience that will read the study and appreciate its impact. This is more noticeable in the Results, with many passages that I would suggest moving to Methodology.

      We thank reviewer 1 for their very positive review. We understand that due to the nature of this study, which includes many protein alleles/mutations that were expressed with different boundaries etc., it is difficult to achieve this. Reviewer 2 asked for more details to be provided. We hope we have achieved a nice balance in the revised manuscript.

      Something else that would facilitate the reading of the manuscript is the effectors name. The authors use the SIX name or the Avr name for some effectors and it makes it difficult to follow up.

      We have tried to make this consistent for Avr1 (SIX4), Avr2 (SIX3) and Avr3 (SIX1). Other SIX effectors are not known Avrs so the SIX names were used.

      Reading the manuscript and seeing how in most of the sections the authors used a computational approach followed by an experimental approach, I wonder why Alphafold2-multimer was not used to investigate the interaction between the effector and the receptor?

      This is a great suggestion, and we have certainly investigated this. However, to date there is no experimental evidence that directly supports an interaction between I and Avr1. Post review, we spent some time trying to capture an interaction using a co-immunoprecipitation approach, but so far we have not been able to obtain robust data that support this. We are currently looking to study this utilising protein biophysics/biochemistry, but this work will take some time.

      Reviewer #2 (Recommendations For The Authors):

      We thank reviewer 2 for the very thorough editing and recommendations. We have incorporated all minor text edits below into the manuscript.

      Line 43: perhaps "Effector recognition" instead of "Effector detection", to be consistent with line 51?

      Line 60: Change to "leads".

      Line 79: Italicise Avr2.

      Line 94: Add the acronym ETI in parentheses after "effector-triggered immunity".

      Line 106: "(Leptosphaeria Avirulence-Suppressing)" should be "(Leptosphaeria Avirulence and Suppressing)".

      Line 112: Change "defined" to "define".

      Line 119: Spell out the species name on first use.

      Line 205: Glomeromycota is a division rather than a genus. Consistent with Fig 2, it also does not need to be italicized.

      Line 207: Change "basidiomycete" to "Division Basidiomycota", consistent with Fig 2.

      Line 214: Change "alignment of Avr1, Avr3, SIX6 and SIX13" to "alignment of the mature Avr1, Avr3, SIX6 and SIX13 sequences".

      Line 324: Change "solved structures" to "solved protein structures".

      Line 335: Spell out acronyms like "MS" on first use in figure legends. Also dpi in other figure legends.

      Line 341: replace "effector-triggered immunity (ETI)" with "(ETI)" - see comment on Line 94.

      Line 370: Change "domains" to "domain".

      Line 374: In the title, change "C-terminus" to "C-domain", consistent with the rest of the figure legend.

      Line 404: Change "(basidiomycetes and ascomycetes)" to "(Basidiomycota and Ascomycota fungi)", consistent with Fig 2C.

      Line 416: Change "in" to "by".

      Line 427: un-italicize the parentheses.

      Line 519: First mention of NLR. Spell out the acronym on first use in main text. S5 and S11 figure titles should be bolded.

      Line 852: Replace "@" with "at".

      S4 Table: Gene names should be italicised.

      S5 Table: Needs to be indicated that the primer sequences are in the 5´-3´ orientation.

      With regards to the Agrobacterium tumefaciens-mediated transient expression assays involving co-expression of the Avr1 effector and I immune receptor, the authors need to make clear how many biological replicates were performed as this information is only provided for the ion leakage assay.

      We have added these data to the figure legend.

      Line 57: For me, the text "Fol secretes a limited number of structurally related effectors" reads as Fol secretes structurally related effectors, but very few of them are structurally related. Perhaps it would be better to say that the effector repertoire of Fol is made up of proteins that adopt a limited number of structural folds, or that the effector repertoire can be classified into a reduced set of structural families?

      This edit has been incorporated.

      Lines 66-67: Subtle re-wording required for "The best-characterized pathosystem is F. oxysporum f. sp. lycopersici (Fol)", as a pathosystem is made up of a pathogen and its host. Perhaps "The best-characterized pathosystem involves F. oxysporum f. sp. lycopersici (Fol) and tomato".

      Sentence has been reworded.

      Line 113 and throughout: Stick with one of "resistance protein", "receptor", "immune receptor" and "immunity receptor" throughout the manuscript.

      We have decided to use both receptor and immunity receptor as not all receptors investigated in the manuscript provide immunity.

      Lines 149-150: The title does not fully represent what is shown in the figure. The text "that is unique among fungal effectors" can be deleted as there is nothing in Fig 1 that shows that the fold is unique to fungal effectors.

      Figure title has been changed.

      Line 173: The RMSD of Avr3 is stated as being 3.7 Å, but in S3 Fig it is stated as being 3.6 Å.

      This was a mistake in the main text and has been corrected.

      Lines 202-204: This sentence needs to be reworded, as the way that it is written implies that the Diversispora and Rhizophagus genera are in the Ascomycota division. Also, "Ascomycetes" should be changed to "Ascomycota fungi", consistent with Fig 2.

      Sentence has been reworded.

      Line 233: "Scores above 8". What type of scores? Z-scores?

      These are Z-scores. This has been added in text.

      Lines 242-246: It is stated that SIX9 and SIX11 share structural similarity to various RNA-binding proteins, but no scores used to make these assessments are given. The scores should be provided in the text.

      Z-scores have been added.

      Fig 4A: SIX3 should be Avr2, consistent with line 292. The gene names should be italicised in Fig 4A.

      SIX3 was changed to Avr2. Gene names have been italicised.

      Line 356: Subtle rewording required, as "co-infiltrated with both IM82 and iMoneymaker" implies that you infiltrated with protein rather than Agrobacterium strains.

      Sentence has been reworded.

      Fig 5A, Fig 5C and Line 380: Light blue is used, but this looks grey. Perhaps change colour, as grey is already used to show the pro-domain in Fig 5A (or simply change the colour used to highlight the pro-domain)?

      Colour depicting the C-domain was changed.

      Lines 530-531: This text is no longer correct. Rlm4 and Rlm3 are now known to be alleles of Rlm9. See: Haddadi, P., Larkan, N. J., Van de Wouw, A., Zhang, Y., Neik, T. X., Beynon, E., ... & Borhan, M. H. (2022). Brassica napus genes Rlm4 and Rlm7, conferring resistance to Leptosphaeria maculans, are alleles of the Rlm9 wall‐associated kinase‐like resistance locus. Plant Biotechnology Journal, 20(7), 1229.

      We thank the reviewer for picking this up. This text has been updated.

      Line 553: Provide more information on what the PR1 signal peptide is.

      More information about the PR1 signal peptide has been added.

      Lines 767-781: Descriptions and naming conventions of proteins throughout the figure legend need to be consistent and better reflect their makeup. For example, I think it would be best to put the sequence range after each protein mentioned - e.g. Avr1^(18-242) or Avr1^(59-242) instead of Avr1, PSL1_C37S^(18-111) instead of PSL1_C37S, etc. Furthermore, it is often stated that a protein is full-length when it lacks a signal peptide - my thought is that if a protein lacks its signal peptide, it is not full-length. The acronym "PD" also needs to be spelled out as "pro-domain (PD)" in the figure legend.

      We have incorporated sequence range for proteins that were produced upon first use. Sequence ranges that were modelled in AlphaFold2 were not added in text because they can be found in Supplementary Table 3.

      Lines 853-845: It is stated the sizes of proteins are indicated above the chromatogram in S10 Fig, but this is not the case. It is also not clear from S10B Fig that the faint peaks correspond to the peaks in the Fig 4B chromatogram. In S10D Fig, the stick of C58S is difficult to see. Perhaps change the colour or use an arrow/asterisk?

      Protein size estimates have been added above the chromatogram. Added text to indicate that the faint peaks correspond to peaks in Fig 4B. Added an asterisk in S10D Fig to identify the location of C58.

      S14 Fig is not mentioned/referenced in the main text of the manuscript.

      This was a mistake and has been added.

      The reference list needs to be updated to accommodate those referenced bioRxiv preprints that have now been published in peer-reviewed journals.

      The reference list has been updated.

      Reviewer #3 (Recommendations For The Authors):

      It would be good to discuss whether the pro-domains affect virulence or avirulence activity.

      Kex2, the protease that cleaves the pro-domain, functions in the Golgi. We therefore suspect that the pro-domain is removed prior to secretion. For recombinant protein production in E. coli we find that these pro-domains are necessary to obtain soluble protein (doi: 10.1111/nph.17516). As we require the pro-domain for protein production and cannot completely remove it from our preps, we cannot perform experiments to test this and therefore cannot comment further. In a paper that identified SIX effectors in tomato utilising a proteomics approach (https://bsppjournals.onlinelibrary.wiley.com/doi/10.1111/j.1364-3703.2007.00384.x), it appears that the pro-domains were not captured in the analysis. This supports the conclusion that they are not associated with the mature/secreted protein.

      The authors stated that the C-terminal domain of SIX6 has a single disulfide bond unique to SIX6. Please clarify in which context is it unique: in Fusarium or across all FOLD proteins?

      This is in direct comparison to Avr1 and Avr3. The disulfide in the C-domain of SIX6 is unique compared to Avr1 and Avr3. This has been made clear in text.

      The structural similarity of FOLD proteins to other known structures has been discussed (lines 460ff), but it is not clear whether all structures and models identified in this work would yield cysteine inhibitor and tumor necrosis factors as best structural matches in the database or whether this is specific to a single FOLD protein. Please consider discussing recently published findings by others (Teulet et al. 2023, New Phytologist) on this aspect.

      This analysis was performed for Avr1; we obtained relatively low similarity hits for Avr3/Six6. We have updated this text accordingly: “Unfortunately, the FOLD effectors share little overall structural similarity with known structures in the PDB outside of the similarity with each other. At a domain level, the N-domain of the FOLD effector Avr1 has some structural similarities with cystatin cysteine protease inhibitors (PDB code: 4N6V, PDB code: 5ZC1) [60, 61], and the C-domain with tumour necrosis factors (PDB code: 6X83) [62] and carbohydrate-binding lectins (PDB code: 2WQ4) [63]. Relatively weak hits were observed for Avr3/Six6.”

      It might be useful to clearly point out that the ToxA fold and the C-terminus of the FOLD fold are different.

      We have secondary structural topology maps of the FOLD and ToxA-like families in S8 Fig which highlight the differences in topology between these two families.

      Please add information to Fig.S8 listing the approach to generate the secondary structure topology maps.

      We have added this information in the figure caption.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The authors found that nifuroxazide has the potential to augment the efficacy of radiotherapy in HCC by reducing PD-L1 expression. This effect may be attributed to increased degradation of PD-L1 through the ubiquitination-proteasome pathway. The paper provides new ideas and insights to improve treatment effectiveness, however, there are additional points that could be addressed.

      • The paper highlights that the combination of nifuroxazide increases tumor cell apoptosis. A discussion regarding the potential crosstalk or regulatory mechanisms between apoptotic pathways and PD-L1 expression would be valuable.

      Response: Thank you very much for your suggestion. Research has shown that regulating the STAT3/PD-L1 pathway can effectively increase apoptosis in lung cancer cells (1). Our study confirmed that nifuroxazide can effectively inhibit the expression of p-STAT3 and PD-L1 in liver cancer cells, which may be the reason for the increased apoptosis of these cells. We have added relevant descriptions in the discussion.

      • The benefits and advantages of nifuroxazide combination could be compared to the current clinical treatment options.

      Response: Thank you greatly for your insightful feedback. The primary objective of this study is to explore whether nifuroxazide can effectively enhance the degradation of PD-L1, thereby increasing the radiosensitivity of HCC. Our research reveals that compared to radiation therapy alone, combination therapy involving nifuroxazide and radiation significantly inhibits tumor growth in mice and boosts the anti-tumor immune response. This finding could potentially provide a valuable strategy for patients who exhibit resistance to radiation therapy in clinical practice. Moreover, clinical trial investigations have demonstrated that nivolumab, a PD-1 monoclonal antibody, when combined with radiation therapy for HCC, exhibits promising safety and efficacy (2). This evidence supports the future application of nifuroxazide in the treatment of HCC. However, to reach this objective, we must continue to conduct extensive research, including comparing nifuroxazide with existing therapies in clinical practice. We believe that nifuroxazide not only significantly inhibits the expression of PD-L1 protein in HCC cells but also functions as a PD-L1 inhibitor. Furthermore, it effectively curbs the proliferation and migration of HCC cells, induces tumor cell apoptosis, and may exhibit enhanced anti-tumor effects, making it a promising candidate for clinical use. We have incorporated relevant discussion content in the article to address these points.

      Reviewer #2 (Public Review):

      Summary:

      Zhao et al. aimed to explore an important question - how to overcome the resistance of hepatocellular carcinoma cells to radiotherapy? Given that the immune-suppressive microenvironment is a major mechanism underlying resistance to radiotherapy, they reasoned that a drug that blocks the PD-1/PD-L1 pathway could improve the efficacy of radiation therapy and chose to investigate the effect of Nifuroxazide, an inhibitor of stat3 activation, on radiotherapy efficacy in treating hepatocellular carcinoma cells. From in vitro experiments, they find combination treatment (Nifuroxazide+ radiotherapy) increases apoptosis and reduces proliferation and migration, in comparison to radiotherapy alone. From in vivo experiments, they demonstrate that combined treatment reduces the size and weight of tumors in vivo and enhances mice survival. These data indicate a better efficacy of combination therapy compared to radiotherapy alone. Moreover, they also determined the effect of combination therapy on tumor microenvironment as well as peripheral immune response. They find that combination therapy increases infiltration of CD4+ and CD8+ cells as well as M1 macrophages in the tumor microenvironment. Interestingly, they find that the ratio of Treg cells in spleen is increased by radiotherapy but decreased by Nifuroxazide. Considering the immune-suppressive role of Treg cells, this finding is consistent with reduced tumor growth by combination therapy. However, it is unclear whether the combined therapy affects the ratio of Treg cells in the tumors or not. The most intriguing part of the study is the determination of the effect of Nifuroxazide on PD-L1 expression in the context of radiotherapy. Considering Nifuroxazide is a stat3 activation inhibitor and stat3 inhibition leads to reduced expression of PD-L1, one would expect Nifuroxazide decreases PD-L1 expression through stat3. However, they found that the effect of Nifuroxazide on PD-L1 is dependent on GSK3 mediated Proteasome pathways and independent of stat3, in the given experimental context. To determine the relevance to human hepatocellular carcinoma, they also measured the PD-L1 expression in human tumor tissues of HCC patients pre- and post-radiotherapy. The increased PD-L1 expression level in HCC after radiotherapy is impressive. However, it is unclear whether the patients being selected in the study had resistant disease to radiotherapy or not.

      Overall, the data are convincing and supportive of the conclusions.

      Strengths:

      1) Novel finding: Identified novel mechanism underlying the effect of Nifuroxazide on PD-L1 expression in hepatocellular carcinoma cells in the context of radiotherapy.

      2) Comprehensive experimental approaches: using different approaches to prove the same finding. For example, in Fig 4, both IHC and WB were used. In Fig 5, both IF and WB were used.

      3) Human disease relevance: Compared observations in mice with human tumor samples.

      The question in the summary, “However, it is unclear whether the combined therapy affects the ratio of Treg cells in the tumors or not”.

      Response: Thank you very much for your valuable feedback. We have included additional flow cytometry results regarding the expression of relevant Treg cells (CD4+CD25+Foxp3+ T lymphocytes) in tumor tissues (Supplementary Fig 2). Our findings indicate that the number of Treg cells in tumor tissues significantly decreased following combination therapy with nifuroxazide and radiotherapy.

      The question in the summary, “However, it is unclear whether the patients being selected in the study had resistant disease to radiotherapy or not”.

      Response: Thank you very much for your valuable feedback. All the HCC patients selected in this study experienced recurrence after radiation treatment.

      Weaknesses:

      1) It is hard to tell whether the observed phenotype and mechanism are generic or specific to the limited cell lines used in the study. The in vitro experiments were performed in one human cell line and the in vivo experiments were performed in one mouse cell line.

      Response: Thank you very much for your feedback. We have included additional experimental data from another human cell line Huh7 (Supplementary Fig 3).

      2) The study did not distinguish the effect of increased radiosensitivity by nifuroxazide from combined anti-tumor effects by two different treatments.

      Response: Thank you greatly for your insightful feedback. In this study, we primarily compared the antitumor effects of nifuroxazide combined with radiotherapy versus either nifuroxazide or radiotherapy alone, and confirmed that the combined treatment demonstrated a more potent anti-hepatocellular carcinoma effect compared to single therapy. Furthermore, to achieve the goal of utilizing nifuroxazide for the treatment of clinical hepatocellular carcinoma, additional research is necessary, including comparisons with other clinically established therapies. We have also incorporated relevant discussions in our analysis.

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors embarked on an exploration of how nifuroxazide could enhance the responsiveness to radiotherapy by employing both an in vitro cell culture system and an in vivo mouse tumor model.

      Strengths:

      The researchers conducted an array of experiments aimed at revealing the function of nifuroxazide in aiding the radiotherapy-induced reduction of proliferation, migration, and invasion of HepG2 cells.

      Weaknesses:

      The authors did not provide the molecular mechanism through which nifuroxazide collaborates with radiotherapy to effectively curtail the proliferation, migration, and invasion of HCC cells. Moreover, the evidence supporting the assertion that nifuroxazide contributes to the degradation of radiotherapy-induced upregulation of PD-L1 via the ubiquitin-proteasome pathway appears to be insufficient. Importantly, further validation of this discovery should involve the utilization of an additional syngeneic mouse HCC tumor model or an orthotopic HCC tumor model.

      Response: Thank you very much for your insightful comments. Nifuroxazide has been demonstrated to inhibit the expression of p-STAT3, thereby suppressing tumor cell proliferation and migration (3, 4). In our study, we observed that after 48 hours of treatment with Nifuroxazide, the expression of p-STAT3 in irradiated cells was significantly inhibited. Furthermore, compared to radiation alone, combined Nifuroxazide and radiotherapy resulted in a more pronounced decrease in PCNA expression. Simultaneously, we performed additional detection of migration-related protein MMP2 expression (revised Fig 2B), confirming that combined Nifuroxazide and radiotherapy led to a more significant inhibition of MMP2 expression. These findings suggest that the combined treatment may be responsible for the synergistic suppression of HCC cell proliferation and migration. We have included relevant discussions in our manuscript.

      Our initial results indicate that Nifuroxazide inhibits the expression of PD-L1 at the protein level, but does not affect its mRNA level. Interestingly, upon treatment with a proteasome inhibitor MG132, the inhibitory effect of Nifuroxazide on PD-L1 was eliminated, suggesting that Nifuroxazide may enhance the degradation of PD-L1 protein. Our experiments have demonstrated the inhibitory effect of Nifuroxazide on PD-L1 in both human and mouse cell lines. However, to translate these findings into clinical application for the treatment of hepatocellular carcinoma, additional research is necessary, including validation in genetically engineered mouse models of HCC. We have addressed these points in the discussion section of our manuscript.

      Reviewer #1 (Recommendations For The Authors):

      1) Please improve the quality of Figure 3E. It is hard to figure out the bar and details.

      Response: Thank you for your valuable feedback. We have meticulously revised the figures to enhance their clarity and presentation (revised Fig 3E).

      2) In Figure 7E, please elucidate the methods used for calculating the PD-L1 mRNA level. Please adjust the picture angle and label the marker size on the left as well.

      Response: Thank you for your feedback. We have incorporated a method for calculating PD-L1 mRNA levels and revised the corresponding figures accordingly (revised Fig 7E).

      Reviewer #2 (Recommendations For The Authors):

      Questions:

      1) What is the advantage of using a combination of nifuroxazide and radiotherapy in comparison to using a combination of anti-PD1/PDL1 and radiotherapy?

      Response: Thank you very much for your insightful comments. We believe that the advantage of nifuroxazide over PD-1 or PD-L1 antibodies lies in its ability not only to effectively inhibit PD-L1 expression but also to suppress tumor cell proliferation, migration, and promote cell apoptosis (Supplementary Fig 1). We have also expanded on these aspects in the discussion section of the manuscript.

      2) For the characterization of tumor microenvironment and immune cells in the spleen, were the same cell populations being investigated? What about NK and Treg cells in tumors? What about M1 macrophages in spleen?

      Response: Thank you very much for your insightful suggestion. We have measured the infiltration of NK and Treg cells in tumor tissues (Supplementary Fig 2), as well as the abundance of M1 macrophages (revised Fig 6) in the spleen, and provided additional relevant data to strengthen our study.

      Other comments:

      1) The data in Fig 1 is solid. However, it is hard to distinguish the effect of increased radiosensitivity by nifuroxazide from combined anti-tumor effects by two different treatments. The anti-tumor role of Nifuroxazide has been reported in melanoma, colorectal carcinoma, and hepatocellular carcinoma previously (PMID: 26830149; 28055016, 26154152). Therefore, the increased apoptosis and decreased proliferation and migration could be caused by nifuroxazide and not related to the sensitivity of cells to radiation therapy.

      Response: Thank you very much for your constructive feedback. As you suggested, the anti-tumor role of nifuroxazide has been reported. However, the innovation of our study does not lie in confirming its antitumor effects but rather in demonstrating how nifuroxazide can enhance radiotherapy's efficacy in treating hepatocellular carcinoma by inhibiting PD-L1 levels.

      We compared the efficacy of combined therapy versus radiotherapy and found that compared to radiation alone, combined therapy more significantly inhibited hepatocellular carcinoma cell proliferation and migration. In our animal model, we compared the therapeutic effects of combined therapy, nifuroxazide, and radiotherapy on hepatocellular carcinoma-bearing mice. We observed that compared to individual treatment groups, combined therapy more profoundly suppressed tumor growth and enhanced the antitumor effects in the mice.

      In response to your feedback, we have expanded the discussion on the impact of combined therapy versus nifuroxazide or radiotherapy on hepatocellular carcinoma cell proliferation, migration, and apoptosis (Supplementary Fig 1). The data show that compared to either individual therapy, combined therapy further inhibited cell proliferation and migration while promoting apoptosis.

      2) There is no direct evidence to show the improved efficacy of radiation therapy by nifuroxazide through the degradation of PD-L1.

      Response: Thank you very much for your valuable suggestions. In our cell experiments, we found that nifuroxazide inhibits the increased expression of PD-L1 in cells induced by radiation therapy, and this inhibitory effect is counteracted when using the proteasome inhibitor MG132. Therefore, we speculate that nifuroxazide may inhibit PD-L1 expression through a proteasome-dependent mechanism. To better reflect this, we have revised the title of our manuscript to "Nifuroxazide Suppresses PD-L1 Expression and Enhances the Efficacy of Radiotherapy in Hepatocellular Carcinoma."

      3) "The oncogene Stat3.....was effectively inhibited by radiotherapy in cells" - this sentence may be rephrased to make the point clear. The authors might mean to say "activation of the oncogene stat3...."

      "The results demonstrated that the combination therapy increased the expression of PARP," the authors might mean to say "expression of c-PARP"

      Response: Thank you very much for your feedback. We have revised the relevant sentence descriptions to improve clarity and accuracy.

      4) "histomorphology significantly improved after the treatment with nifuroxazide and radiation therapy (Fig 3E)." How to define "improved histomorphology"? The authors may want to provide more details to clarify "improved".

      Response: Thank you very much for your feedback. We have revised the relevant sentence descriptions to improve clarity and accuracy.

      5) In addition to normalizing protein expression by tubulin, the authors may consider normalizing p-stat3 expression level by stat3.

      Response: Thank you very much for your feedback. We have conducted a quantitative analysis of the expression levels of p-STAT3 and STAT3 (revised Fig 2A).
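
      The normalization the reviewer suggests can be illustrated with a minimal sketch. This is not the authors' quantification code: the function name and the band intensities below are hypothetical, chosen only to show how a p-STAT3/STAT3 ratio differs from a p-STAT3/tubulin ratio.

```python
def normalize_band(target_intensity: float, reference_intensity: float) -> float:
    """Return the target band intensity relative to a reference band."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return target_intensity / reference_intensity


# Hypothetical densitometry readings (arbitrary units), for illustration only.
p_stat3 = 50.0
total_stat3 = 100.0
tubulin = 200.0

# Phospho-signal relative to the loading control.
p_stat3_vs_tubulin = normalize_band(p_stat3, tubulin)

# Phospho-signal relative to total STAT3, which separates a change in
# phosphorylation from a change in total STAT3 abundance.
p_stat3_vs_total = normalize_band(p_stat3, total_stat3)
```

      Reporting both ratios makes it possible to tell whether a drop in p-STAT3 reflects reduced activation or simply less STAT3 protein.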

      6) Figure 3C and D, using a different color to represent each group might help the readers to better differentiate each group.

      Response: Thank you very much for your feedback. Following your suggestion, we have revised the figures accordingly (revised Fig 3C and 3D).

      Reviewer #3 (Recommendations For The Authors):

      In this study, the authors revealed the pivotal role of nifuroxazide in augmenting the efficacy of radiotherapy. This was evidenced by its synergistic effect in suppressing the proliferation and migratory capabilities of HCC cells, alongside its capacity to induce apoptosis in these cells. Furthermore, their findings underscored the substantial synergy between nifuroxazide and radiotherapy in retarding tumor growth, thereby extending survival rates in a tumor-bearing murine model. Moreover, the authors observed that nifuroxazide combined with radiotherapy significantly increases the tumor-infiltrating CD4+ T cells, CD8+ T cells, and M1 macrophages. Finally, the authors found that nifuroxazide countered the radiotherapy-induced upregulation of PD-L1 through the ubiquitin-proteasome pathway. However, the main claims are only partially supported by the evidence. The following are my concerns and suggestions.

      1) In Figures 1 and 2, the authors convincingly demonstrate the synergistic impact of nifuroxazide and radiotherapy on curtailing the proliferation, colony formation, and migratory capabilities of HCC cells, while also instigating apoptosis in these cells. However, the underlying molecular mechanism remains elusive. A recent study highlighted nifuroxazide's potential to impede the proliferation of glioblastoma cells and induce apoptosis via the MAP3K1/JAK2/STAT3 pathway (Wang X., et al., Int Immunopharmacol. 2023 May;118:109987. doi: 10.1016/j.intimp.2023.109987). It would be valuable for the authors to investigate whether nifuroxazide employs a similar molecular mechanism to regulate proliferation and apoptosis in the context of HCC. This could offer deeper insights into the mechanisms at play in their observed effects.

      Response: Thank you very much for your insightful comments. As you pointed out, previous studies have reported that nifuroxazide exerts antitumor effects by inhibiting the STAT3 pathway. However, in our experiments, we observed that radiation therapy significantly increased the expression of PD-L1, but showed a trend of decreased p-STAT3 expression. Therefore, we believe that nifuroxazide does not inhibit PD-L1 expression through the STAT3 pathway. Subsequently, our further research revealed that the inhibitory effect of nifuroxazide on PD-L1 can be counteracted by a proteasome inhibitor. Thus, we propose that nifuroxazide inhibits PD-L1 expression through a proteasome-dependent mechanism, thereby enhancing the efficacy of radiation therapy in hepatocellular carcinoma.

      2) Figures 1 and 2 solely rely on the HepG2 cell line to establish their conclusions. To validate these findings robustly, it is recommended that another HCC cell line be included in the study. This additional cell line will contribute to the generalizability and reliability of the results, enhancing the overall credibility of the study's conclusions.

      Response: Thank you very much for your suggestion. We have included additional experimental results with the relevant cell line Huh7 (supplementary Fig 3).

      3) Figure 3 demonstrates the use of only one syngeneic mouse H22 tumor model. To ensure the robustness and validity of this finding, it would be advisable to incorporate at least one more syngeneic mouse HCC tumor model or even an orthotopic mouse tumor model. The inclusion of additional models would bolster the significance and reliability of the observed results, contributing to a more comprehensive understanding of the phenomenon under investigation.

      Response: Thank you for your valuable suggestion. In the H22 mouse tumor model, we assessed survival rate and tumor growth. The results confirm that the combination of nifuroxazide and radiation therapy exhibits a promising synergistic antitumor effect. However, before nifuroxazide combined with radiation therapy can be applied to the clinical treatment of hepatocellular carcinoma, extensive further research is needed, including validation in genetically identical mouse HCC tumor models. We have also added this point to the discussion section of our manuscript.

      4) In Figure 5, employing an alternative method, such as the flow cytometry assay, to analyze and corroborate the tumor-infiltrating immune cell profiling following various treatments would enhance the rigor of the study. This additional approach would provide a complementary perspective and validate the findings, strengthening the overall reliability and impact of the results presented.

      Response: Thank you for your insightful suggestion. We have included additional experimental data to strengthen our study (supplementary Fig 2).

      5) In Figure 7, the conclusion drawn regarding nifuroxazide's impact on PD-L1 expression through ubiquitination-proteasome mechanisms seems to lack the robust evidence needed to firmly establish nifuroxazide's role in regulating PD-L1 ubiquitination. To reinforce this aspect of the study, the authors may conduct comprehensive in vitro and in vivo ubiquitination assays. Performing these assays would offer direct insights into whether nifuroxazide genuinely influences PD-L1 ubiquitination, thus fortifying the credibility and importance of the reported findings.

      Response: Thank you for your valuable feedback. Our initial findings suggest that nifuroxazide reduces PD-L1 protein levels but does not affect PD-L1 mRNA levels. Moreover, upon treatment with the proteasome inhibitor MG132, the inhibitory effect of nifuroxazide on PD-L1 was abolished. Concurrently, we observed that nifuroxazide significantly enhances GSK-3β expression in both cell and animal experiments. Consequently, we propose that nifuroxazide augments the degradation of PD-L1 protein.

      6) Statistical methods should be included in the captions of all the figures with statistical graphs. The size of the scale should be supplemented with a description in the captions.

      Response: Thank you for your valuable suggestion. We have made the appropriate modifications to our study based on your recommendations.

      7) Considering the outcomes presented in the study, it appears that the title "Nifuroxazide enhances radiotherapy efficacy against hepatocellular carcinoma by upregulating PD-L1 degradation via the ubiquitin-proteasome pathway" may not accurately reflect the findings.

      Response: Thank you for your insightful feedback. We have revised the title to read, "Inhibitory Effects of Nifuroxazide on PD-L1 Expression and Enhanced Radiotherapy Efficacy in Hepatocellular Carcinoma".

      References

      1) Xie C, Zhou X, Liang C, Li X, Ge M, Chen Y, et al. Apatinib triggers autophagic and apoptotic cell death via VEGFR2/STAT3/PD-L1 and ROS/Nrf2/p62 signaling in lung cancer. Journal of experimental & clinical cancer research : CR. 2021;40(1):266. doi: 10.1186/s13046-021-02069-4.

      2) de la Torre-Alaez M, Matilla A, Varela M, Inarrairaegui M, Reig M, Lledo JL, et al. Nivolumab after selective internal radiation therapy for the treatment of hepatocellular carcinoma: a phase 2, single-arm study. Journal for immunotherapy of cancer. 2022;10(11). doi: 10.1136/jitc-2022-005457.

      3) Yang F, Hu M, Lei Q, Xia Y, Zhu Y, Song X, et al. Nifuroxazide induces apoptosis and impairs pulmonary metastasis in breast cancer model. Cell Death Dis. 2015;6(3):e1701. doi: 10.1038/cddis.2015.63.

      4) Nelson EA, Walker SR, Kepich A, Gashin LB, Hideshima T, Ikeda H, et al. Nifuroxazide inhibits survival of multiple myeloma cells by directly inhibiting STAT3. Blood. 2008;112(13):5095-102. doi: 10.1182/blood-2007-12-129718.

    1. Author Response

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript aimed at elucidating the substrate specificity of two M23 endopeptidases, lysostaphin (LSS) and LytM, in S. aureus. Endopeptidases are known to cleave the glycine-bridges of staphylococcal cell wall peptidoglycan (PG). To address this question, various glycine-bridge peptides were synthesized as substrates, the catalytic domains of LSS and LytM were recombinantly expressed and purified, and the reactions were analyzed using solution-state NMR. The major finding is that LytM is not only a Gly-Gly endopeptidase, but also cleaves D-Ala-Gly. Technically, the advantage of using real-time NMR was emphasized in the manuscript. The study explores an interesting aspect of cell wall hydrolases in terms of substrate-level regulation. It potentially identified new enzymatic activity of LytM. However, the biological significance and relevance of the conclusions remain unclear, as the results are mostly from synthetic substrates.

      Strengths:

      The study explores an interesting aspect of cell wall hydrolases in terms of substrate-level regulation. It potentially identified new enzymatic activity of LytM.

      Weaknesses:

      1) Significance: while the current study provided a detailed analysis of various substrates, the conclusions are mainly based on synthesized peptides. One experiment used purified muropeptides (Fig. 3H); however, the results were unclear from this figure.

      We acknowledge the Reviewer for comments and concerns regarding the potential weaknesses of this study.

      Because peptidoglycan is insoluble, it is not amenable to solution-state NMR studies. However, soluble peptidoglycan (PG) fragments for NMR analyses can be obtained by digesting bacterial sacculi or via chemical synthesis. Whereas digestion results in mixtures of products, synthesis yields pure molecules. Analysis of NMR spectra of muropeptide-mimicking synthetic peptides before and after enzyme addition provides tools to identify peaks in the much more complex spectra of mutanolysin-treated sacculus.

      We will improve data presentation in Figure 3H in the revised version of our manuscript and emphasize the similarity of product peaks in spectra acquired from experiments using either synthetic peptides or mutanolysin-digested sacculus.

      The results from synthesized peptides may not necessarily correlate with their biological functions in vivo.

      The Reviewer refers several times to the use of synthetic peptides in this study. While it is unclear to us whether the concern is about the synthetic nature of the molecules or because the peptides are devoid of PG disaccharide units, it is true that PG fragments lack the 3D architecture present in intact sacculus, and thus cannot perfectly mimic the in vivo milieu. The fragments, as well as purified sacculus, also lack all other components present in an intact bacterial cell wall. Our largest synthetic peptide (7), however, represents a crosslinked muropeptide (stem-pentaGly-stem) which according to the structural model recently presented by Razew et al. (2023) (Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706) is large enough to cover the peptidic interaction interface between substrate and enzyme.

      Secondly, the study used only the catalytic domain of both proteins. It is known that the substrate specificity of these enzymes is regulated by their substrate-binding domains. There is no mention of other domains in the manuscript and no justification of why only the catalytic domain was studied. In short, the relevance of the results from the current study to the enzymes' actual physiological functions remains to be addressed, which attenuated the significance of the study.

      The lysostaphin catalytic domain was used for experimental simplicity and to allow direct comparison with the LytM catalytic domain. Because the lysostaphin cell-wall targeting (SH3b) domain interacts with the substrate with variable affinities depending on the substrate structure (Tossavainen et al., Structural and functional insights into lysostaphin-substrate interaction, Front. Mol. Biosci. 5, 60 (2018) and Gonzalez-Delgado et al., Two-site recognition of Staphylococcus aureus peptidoglycan by lysostaphin SH3b, Nat. Chem. Biol. 16, 24-30 (2020)), including this domain would have skewed the kinetic results.

      Catalytic domains were used also in the article by Razew et al. (Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706 (2023)). They showed that mature lysostaphin and lysostaphin catalytic domain hydrolysed the same Gly-Gly bonds.

      Moreover, full-length LytM is catalytically inactive. This is because the linker between its N-terminal and catalytic domains occludes the catalytic site (Odintsov et al. Latent LytM at 1.3 Å resolution. J. Mol. Biol. 335, 775 (2004)). The LytM catalytic domain without its N-terminal segment is active (Odintsov et al. (2004) and Firczuk et al. Crystal structure of active LytM. J. Mol. Biol. 354, 578 (2005)).

      2) Impact and novelty:

      (1) The current study provided evidence suggesting the novel function of LytM in cleaving D-Ala-Gly. The impact of this finding is unclear. The manuscript discussed Enterococcus faecalis EnpA. But how about other M23 endopeptidases? What is the biological relevance?

      EnpA was specifically mentioned because it has been reported to also cleave the D-Ala-Gly bond. Structural similarities between the enzymes could reveal the basis for this bond specificity. Moreover, the focus of the study was not to reveal the biological function of LytM but rather to understand which amino acid substitutions lead to differences in specificities in the two structurally very similar enzymes.

      (2) A very similar study published recently showed that the activity of LSS and LytM is regulated by PG cross-linking: LSS cleaves more cross-linked PG and LytM cleaves less cross-linked PG (Razew, A., Laguri, C., Vallet, A., et al. Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706 (2023)). The results of this paper are different from the current study whereby both LSS and LytM prefer cross-linked substrates (Fig. 2JKL). Moreover, no D-Ala-Gly cleavage was observed by LytM using purified PG substrate from Razew A et al. An explanation of inconsistent results is needed here. In my opinion, the knowledge generated from the current study has not been fully settled. If the results can be validated, the contribution to the field is incremental, but not substantial.

      Another point raised by the Reviewer concerned the inconsistent results between our study and the recent paper by Razew et al. (2023) regarding LytM D-Ala-Gly cleavage. The explanation might lie in the type of NMR data acquired and its interpretation. We identified all hydrolysis products using 1H, 13C multiple bond correlation NMR spectra acquired from samples dissolved in deuterated buffers. Use of C-H signals is advantageous in that they are not prone to chemical exchange phenomena and enable unambiguous chemical shift assignment. Based on the NMR spectra shown, Razew and co-workers identified cleaved muropeptide bonds by observing product glycine peaks in 1H, 15N correlation spectra, specifically amide peaks of product C-terminal glycines appearing in the 114-117 ppm 15N region of spectra of samples treated with LytM/LSS. D-Ala-Gly cleavage, however, produces an N-terminal glycine, whose signal is typically not observed in regular N,H correlation spectra because of chemical exchange. Razew and co-workers validated their observations with UPLC-MS analysis. However, to our understanding, their data analysis was based on the assumption that LytM cleaves between Gly4-Gly5 (or Gly1-Gly2 using our numbering), and accordingly only masses corresponding to potential products containing 1 to 4 glycines anchored to the lysine side chain were considered.

      (3) The authors emphasized a few times in the text that it is superior to use NMR technology. In my opinion, NMR has certain advantages, such as measuring the efficacy of cleavage, but it is not that superior. It should be complementary to other methods such as mass spectrometry. In addition, more relevant solid-state NMR using intact PG or bacterial cells was not discussed in the study. I am of the opinion that the corresponding text should be revised.

      We value and agree with the Reviewer’s opinion that NMR spectroscopy is complementary to other methods, e.g., mass spectrometry. However, in this particular case, NMR simultaneously provided information on reaction kinetics and on the scissile bonds in the substrates, which allowed us to compare rates of hydrolysis in different PG fragments and reshape the substrate specificities of LytM/LSS. We also agree that solid-state NMR is a wonderful technique. In our revised manuscript, we will edit the text accordingly.

      3) The conclusions are not fully supported by the data

      As mentioned above, the conclusions from synthesized peptide substrates may not necessarily reveal physiological functions. The conclusions need to be validated by more physiological substrates.

      As pointed out above in our response to the potential weaknesses of this study, the aim of this work was not to reveal the physiological function of LytM but to glean information on its substrate specificity, which reflects its functional role at the substrate level. Hitherto LytM has been shown to cleave amide bonds between glycines, but without detailed information about the specific scissile bonds in the established PG components of the S. aureus cell wall. The same holds true for lysostaphin. This study concomitantly provides information on the rates of hydrolysis and the scissile bonds of these two enzymes. We deduced that the substrate specificity of LytM, and especially of lysostaphin, is defined by D-Ala-Gly cross-linking, which is a structural property, whereas Razew et al. (2023) discuss “more cross-linked” and “less cross-linked PG”, which is a supramolecular asset or density.

      4) There are some issues with the presentation of the figures, text, and formatting.

      We are grateful to the Reviewer for bringing up issues in figures and text. We will address these in the revised version of the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This work investigates the enzymatic properties of lysostaphin (LSS) and LytM, two enzymes produced by Staphylococcus aureus and previously described as glycyl-glycyl endopeptidases. The authors use synthetic peptide substrates mimicking peptidoglycan fragments to determine the substrate specificity of both enzymes and identify the bonds they cleave.

      Strengths:

      • This work is addressing a real gap in our knowledge since very little information is available about the substrate specificity of peptidoglycan hydrolases.

      • The experimental strategy and its implementation are robust and provide a thorough analysis of LSS and LytM enzymatic activities. The results are very convincing and demonstrate that the enzymatic properties of the model enzymes studied need to be revisited.

      Weaknesses:

      • The manuscript is difficult to read in places and some figures are not always presented in a way that is easy to follow. This being said, the authors have made a good effort to present their experiments in an engaging manner. Some recommendations have been made to improve the current manuscript but these remain minor issues.

      We thank the Reviewer for providing positive feedback on our manuscript and for appreciating the systematic work behind this study, which aims to untangle the substrate specificity of two S. aureus PG hydrolyzing enzymes. We are grateful for the comments aiming to improve the presentation of the current version of the manuscript and we will take these into account while preparing the revised version.

    1. Author response

      eLife assessment

      Using a genetically controlled experimental setting, the authors find that the lack of Polycomb-dependent epigenetic programming in the oocyte and early embryo influences the developmental trajectory through gestation in the mouse. By showing a two-phase outcome of early growth restriction followed by enhancement, the authors address previous inconsistencies in the field. However, the link with placenta function and gene misregulation is not yet fully supported.

      We thank the Reviewers for their constructive comments. In response we have added significantly more data to the study and substantially rewritten the manuscript. New data include analyses of glucose, amino acid and metabolite levels in fetal and maternal blood samples, more highly resolved fetal growth analyses, a more detailed study of the hyperplastic placenta including IF analyses of labyrinth area, labyrinth to placenta and capillary to labyrinth ratios. We have also added analyses of placental DNA methylation state in offspring from oocytes lacking EED, which reveals a range of DNA methylation changes at imprinted and non-imprinted genes in HET-hom offspring compared to HET-het or WT-wt controls.

      Reviewer #1 (Public Review):

      Oberin, Petautschnig et al. investigated the developmental phenotypes that resulted from oocyte-specific loss of the EED (Embryonic Ectoderm Development) gene - a core component of the Polycomb repressive complex 2 (PRC2), which possesses histone methyltransferase activity and catalyses trimethylation of histone H3 at lysine 27 (H3K27). The PRC2 complex plays essential roles in regulating chromatin structure, being an important regulator of cellular differentiation and development during embryogenesis. As novel findings, the authors find that loss of PRC2-dependent programming in the oocyte, via loss of the core component EED, causes placental hyperplasia and propose that an increase in transplacental flux of nutrients leads to fetal and postnatal overgrowth. At the mechanistic level, they show altered expression of genes previously implicated in placental hyperplasia phenotypes. They also establish interesting parallelism with the placental hyperplasia phenotype that is frequently observed in cloned mice.

      Strengths:

      The mouse breeding experiments are very well designed and are powerful to exclude potential confounding genetic effects on the developmental phenotypes that resulted from the loss of EED in oocytes. Another major strength is the developmental profiling across gestation, from pre-implantation to late gestation.

      Weaknesses:

      The evidence for 'oocyte' programming is restricted to phenotypic and gene expression analysis, without measurements of epigenetic dysregulation. It would be an added value if the authors could show evidence for altered H3K27me3 or DNA methylation in the placenta, for example.

      In an earlier study we identified a large number of developmentally important genes that accumulated H3K27me3 in primary-secondary stage growing oocytes and were repressed by EED (Jarred et al., 2022 Clinical Epigenetics). However, H3K27me3 was removed from all of these genes during preimplantation development, indicating that maternal inheritance of H3K27me3 at a wide range of genes is unlikely (Jarred et al., 2022 Clinical Epigenetics). Consistent with this, only a small number of genes, including Slc38a4 and C2MC, have been shown to be functionally important in H3K27me3-dependent imprinting (Matoba et al., 2022 Genes and Development). Moreover, a related study showed that deletion of Setd2 and consequent loss of H3K36me3 in oocytes led to spreading of H3K27me3 into regions that were otherwise marked by H3K36me3 and DNA methylation (Xu et al. 2019 Nature Genetics 51:844–56). Based on these studies, we proposed that loss of EED and H3K27me3 may result in the ectopic spreading of H3K36me3 and DNA methylation in oocytes and that altered DNA methylation may then be transmitted to offspring and affect developmental outcomes (Jarred et al., 2022 Clinical Epigenetics).

      Given this hypothesis we analysed DNA methylation rather than H3K27me3 in the placenta of WT-wt, HET-het and HET-hom offspring. This revealed differentially methylated regions (DMRs) in HET-hom placentas at two H3K27me3 imprinted genes, Sfmbt2 (C2MC) and Mbnl2, at five classically imprinted genes, and at 74 DMRs not associated with imprinted loci. Together, our data support the hypothesis from Jarred et al., 2022 Clinical Epigenetics that loss of EED in oocytes results in altered DNA methylation patterning at both imprinted and non-imprinted genes in offspring and that this is likely to affect offspring growth and development. However, whether these changes result from direct alteration of DNA methylation in oocytes remains unclear.

      These new data are now included in results (Lines 387-409), Figure 6I, Supplementary File H-J and Discussion Lines 569-581.

      Reviewer Comment 1. The claim that placental hyperplasia drives offspring catch-up growth is not supported by current experimental data. The authors do not address if transplacental flux is increased in the hyperplastic placentae, measure amino acids and glucose in fetal/maternal plasma, or perform tetraploid rescue experiments to ascertain the contribution of the placenta to growth phenotypes. Furthermore, it is unclear, from the current data, if the surface area for nutrient transport is actually increased in the hyperplastic placenta and the extent to which other cell populations (i.e. spongiotrophoblasts) are affected in addition to glycogen cells. In addition, one of the supporting conclusions that the placenta is a key contributor to fetal overgrowth is based on a very crude measurement - placenta efficiency - which the authors claim is increased in the homozygous mutants compared to controls. After analysing the data carefully, I find evidence for decreased placental efficiency instead. I believe that the authors mistakenly present the data as placenta to fetal weight ratios, which led to the misinterpretation of the 'efficiency' concept.

      We thank the reviewer for pointing out our error in the placental efficiency data and we have now corrected the placental efficiency graphs (fetal/placental weight ratios) and updated the text throughout the manuscript as required (Figure 3I-K). As requested and described below, we have also added significantly more data, which support the conclusion that placental function is not enhanced in HET-hom mice and is unlikely to support fetal growth recovery.
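
      The distinction between the two ratios can be shown with a small worked example. This is an illustrative sketch, not the authors' analysis: the function name and the weights are hypothetical, and the only thing taken from the text is that placental efficiency is the fetal/placental weight ratio.

```python
def placental_efficiency(fetal_weight_mg: float, placental_weight_mg: float) -> float:
    """Fetal weight supported per unit of placental weight (fetal/placental)."""
    if placental_weight_mg <= 0:
        raise ValueError("placental weight must be positive")
    return fetal_weight_mg / placental_weight_mg


# Hypothetical weights in mg: same fetal weight, heavier (hyperplastic) placenta.
control_efficiency = placental_efficiency(900.0, 90.0)
hyperplastic_efficiency = placental_efficiency(900.0, 120.0)

# The inverse ratio (placental/fetal) moves in the opposite direction, which is
# how plotting it under the label "efficiency" can reverse the interpretation.
control_inverse = 90.0 / 900.0
hyperplastic_inverse = 120.0 / 900.0
```

      With these made-up numbers the heavier placenta gives a lower fetal/placental ratio but a higher placental/fetal ratio, so the same data read as decreased efficiency under the first definition and would appear "increased" if the second were mislabeled.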

      The new data and analyses we have added include:

      1. Further analyses of glycogen-enriched and non-glycogen-enriched cell counts in the decidua and junctional zones (Figure 4F-J)

      2. Total glycogen cell counts for male and female placentas (Figure 4 – figure supplement 1F)

      3. New analyses of fetal blood glucose levels at E17.5 and E18.5 and matching data from the mothers of each litter (Figure 4M)

      4. New analyses of the circulating amino acid levels and metabolites in fetal blood of E17.5 offspring and matching data from the mothers of each litter (Figure 8)

      5. New IF analyses of CD31 (PECAM-1), combined with machine-learning-assisted quantitative analyses of labyrinth and capillary areas using HALO (Figure 5)

      6. Separated male and female offspring and placental weights at E14.5 and E17.5 and total areas of the placenta, decidua, junctional zone and labyrinth (Figure 3 – figure supplement 1) which provide more insight into potential sex-specific differences in HET-hom offspring and placenta

      We have significantly re-written the results and discussion to reflect our new data and interpretation.

      While we did not assess transplacental flux, our new data revealed: 1. HET-hom fetuses had lower blood glucose levels at E18.5; 2. Circulating levels of amino acids and a wide range of metabolites did not differ between HET-hom and control offspring, or between the mothers of these offspring; 3. HET-hom placentas had lower total labyrinth area, labyrinth/placenta and capillary/labyrinth ratios based on analysis of total capillary and labyrinth areas, indicating that the surface area for nutrient transfer is not increased.

      Together these data strongly indicate that hyperplastic HET-hom placentas do not provide greater support to HET-hom fetuses than controls, and that increased placental function in HET-hom offspring is unlikely to explain the late gestation fetal growth recovery we observed in HET-hom offspring or how HET-hom offspring were able to attain normal weights by birth.

      While we have not directly counted the spongiotrophoblast populations, we have now included analyses of both the glycogen-enriched and non-glycogen cell populations in the junctional zone and the decidua (Figure 4H-K). This revealed an increased area of both glycogen-enriched and non-glycogen cells in the junctional zone and in the decidua of HET-hom placentas, consistent with the greater junctional zone/placenta ratio observed in HET-hom placentas (Figure 4D). Together with data in Figure 4C-F and Supp. Fig. 3, our observations demonstrate that the overall decidua and junctional zone areas were increased in HET-hom offspring, but there was a disproportionate expansion of the junctional zone that was caused by increased areas of both glycogen and non-glycogen-enriched cells.

      Tetraploid rescue experiments would require a very significant amount of time and investment and are technically very demanding. While creation of complementary tetraploid offspring would be informative, unfortunately these experiments are beyond the scope of this current study.

      Reviewer Comment 1 cont. The authors do not mention alternative explanations for the observed fetal catch-up and postnatal overgrowth. Why would oocyte epigenetic programming effects be restricted to the placenta, and not include fetal organs?

      Our intention was certainly not to convey a message that effects may be placenta specific. Indeed, our ongoing work beyond the scope of this study provides evidence for effects in other tissues (brain and bones) that will be published elsewhere. Our new data clearly show low placental efficiency, low fetal blood glucose, a low capillary/labyrinth ratio and no impact on circulating fetal amino acid or metabolite levels in HET-hom offspring. In light of these new data, we have reinterpreted the findings of this study and substantially updated the discussion.

      Given our observations that fetal growth rate markedly increased during late gestation, but placental efficiency was reduced, our data strongly indicate that the effects of altered epigenetic oocyte programming due to loss of Eed affect both the placenta and the fetus. While our findings are significant, the precise mechanism underlying this growth response in HET-hom fetuses remains unknown. Understanding this mechanism will require substantially more work that will be the subject of future studies.

      Reviewer #2 (Public Review):

      Consistent fetal growth trajectories are vital for survival and later life health. The authors utilise an elegant and novel animal model to tease apart the role of Eed protein in the female germline from the role of somatic Eed. The authors were able to experimentally attribute placental overgrowth - particularly of the endocrine region of the placenta - to the function of Eed protein in the oocyte. Loss of Eed protein in the oocyte was also associated with dynamic changes in fetal growth and prolonged gestation. It was not determined whether the reported catch-up growth apparent on the day of birth was due to enhanced fetal growth very late in gestation, a longer gestational time ie the P0 pups are effectively one day "older" compared to the controls, or the pups catching up after birth when consuming maternal milk.

      To understand if increased growth occurred in HET-hom fetuses prior to birth, we have now included analyses of offspring weight at E18.5 (Figure 2F), all pups collected with a verified E19.5 birth date (Figure 2J) and for pups from similar litter sizes (5-7 pups) at E19.5 (Figure 2K). Together with our existing data, these additional analyses provide average weights for fetuses at E14.5, E17.5, E18.5 and pups born on E19.5. This confirmed that HET-hom offspring undergo enhanced growth in the last few days of pregnancy, resulting in the progression of substantially growth and developmentally restricted HET-hom fetuses at E14.5, to pups with normal weight at birth within the 40% of pregnancies that were born on E19.5 in a normal gestational time.

      However, in addition, gestational length was increased by one to two days in 60% of pregnancies from hom oocytes, but not in control pregnancies from het or wt oocytes. As average weights were significantly greater in all surviving HET-hom offspring at P0 (i.e. surviving pups born on E19.5-E21.5; Figure 2G), it appears that this additional gestational time contributed to the offspring overgrowth. This is logical, however it does not explain how growth and developmentally delayed fetuses at E14.5 attained normal weight and developmental stage by E19.5 (Figure 2J-K).

      Together our data clearly show that HET-hom offspring undergo enhanced growth during the late stages of pregnancy, allowing them to resolve the developmental delay and growth insufficiency observed at E14.5 so that they were born at normal weight and stage at E19.5. In addition, increased gestational time contributes to weight of pups delivered on E20.5 or 21.5, partly explaining the overgrowth phenotype observed in this model.

      The idea that increased milk consumption may explain the overgrowth of HET-hom offspring is interesting. It is possible that the increased growth rate of HET-hom offspring continues after birth and contributes to overgrowth. However, examining this outcome in a tightly controlled manner is complicated given that we cannot predict the day of birth of HET-hom litters, and that these litters are generally small and would need to be fostered on the day of birth alongside control litters. Given these challenges and that our primary observation is that HET-hom offspring underwent fetal growth recovery during pregnancies of normal length and via extension of gestational length, we have not examined the possibility of increased milk consumption after birth.

      We have updated the results to reflect the new analyses and have provided relevant discussion to address these data. Our description of these data can be found in Results (lines 165-197) and in Figure 2.

      Reviewer #3 (Public Review):

      My understanding of the main claims of the paper, and how they are justified by the data are discussed below:

      Overall, loss of PRC2 function in the developing oocyte and early embryo causes:

      1) Growth restriction from at least the blastocyst stage with low cell counts and midgestational developmental delay.

      Strengths:

      • Live embryo imaging added an important dimension to this study. The authors were able to confirm an unquantified finding from a previous lab (reduced time to 2-cell stage in oocyte-deletion Eed offspring, Inoue 2018, PMID: 30463900) as well as identify developmental delay and mortality at the blastocyst-hatching transition.

      • For the weight and morphological analysis the authors are careful to provide isogenic controls for most of the experiments presented. This means that any phenotypes can be attributed to the oocyte genotype rather than any confounding effects of maternal or paternal genotype.

      • Overall, there is good evidence that oocyte deletion of Eed results in early embryonic growth restriction, consistent with previous observations (Inoue 2018, PMID: 30463900).

      Reviewer 3, Comment 1: Weaknesses: Gaps in the reporting of specific features of the methodology make it difficult to interpret/understand some of the results.

      While we are unsure exactly which methods Reviewer 3 would like expanded, we have updated parts that we thought required further detail to allow more informed interpretation of the results. These include methods for placental histology (Lines 650-669) and immunohistochemistry (Lines 671-690), and new methods for CD31 immunofluorescence (Lines 692-714), glucose and metabolomics (Lines 752-769) and DNA methylation (RRBS; Lines 734-750) analyses.

      To clarify the approach taken for histology, immunohistochemical and immunofluorescent staining, sections were cut in compound series from the centre of each placenta, ensuring that we collected representative data for each sample. QuPath was used to quantify the decidual and junctional zone areas in one complete, fully intact midline section for each placenta as close to the midline as possible. This provided data from 10 placentas for each genotype. In addition, glycogen-enriched and non-glycogen-enriched cells were identified and quantified using machine learning assisted QuPath analyses of the whole placenta, decidua and junctional zone regions. We have also added quantitative analyses of the labyrinth and labyrinth capillary network using immunofluorescent CD31 staining and machine learning assisted HALO software. This new analysis of placental morphology is included in the methods section.

      Moreover, as there were no sex-specific differences in placental morphology or weight, we combined the samples from both sexes to provide greater numbers for analysis in each genotype. For example, as described for the analyses of labyrinth and capillaries using CD31 IF, 4 placentas of each sex were used for data collection. This provided data from a total of 8 placentas (4 male and 4 female) for each genotype from a total of 17 WT-wt (9 male and 8 female), 21 HET-het (9 male and 12 female) and 24 HET-hom (16 male and 8 female) sections (2-3 sections/placenta).

      Reviewer 3, Comment 2: Placental hyperplasia with disproportionate overgrowth of the junctional trophoblast especially the glycogen trophoblast (GlyT) cells.

      Strengths: • The authors provide a comprehensive description of how placental and embryo weight is affected by the oocyte-Eed deletion through mid-to-late gestation development. The case for placentomegaly is clear.

      Weaknesses:

      • The placental efficiency data presented in Figure 3G-I is incorrect. Placental efficiency is calculated as embryo mass/placental mass, and it increases over the late gestation period. For e14.5 for example (Fig3G), WT-wt embryo mass = ~0.3g, placenta mass = 0.11g (from Fig 3D) = placental efficiency 2.7; HET-hom = 0.25/0.12 = 2.1. The paper gives values: WT-wt 0.5, HET-hom 0.7. Have the authors perhaps divided placenta weight by embryo mass? This would explain why the E17.5 efficiencies are so low (WT-wt 0.11 rather than a more usual figure of 8.88). If this is the case then the authors' conclusion that placental efficiency is improved by oocyte deletion of Eed is wrong - in fact, placental efficiency is severely compromised.

      The authors have performed cell type counting on histological sections obtained from placentas to discover which cells are contributing to the placentomegaly. This data is presented as %cell type area in the main figure, though the untransformed cross-sectional area for each cell type is shown in the supplementary data. This presentation of the data, as well as the description of it, is misleading because, while it emphasises the proportional increase in the endocrine compartment of the placenta it downplays the fact that the exchange area of the mutant placentas is vastly expanded. This is important for two reasons.

      Firstly, the whole placenta is increased in size suggesting that the mechanism is not placental lineage-specific and instead acting on the whole organ. Secondly in relation to embryonic growth, generally speaking, genetic manipulations that modify labyrinthine volume tend to have a positive correlation with fetal mass whereas the relationship between junctional zone volume and embryonic mass is more complex (discussed in Watson PMID: 15888575, for example). The authors should reconsider how they present this data in light of the previous point.

      We thank the reviewer for pointing out our error in the placental efficiency analysis and apologise for this error. We have corrected the presentation and interpretation of these data and have described this in detail in our response to Reviewer 1, Comment 1.
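
      The direction of the ratio is what drives the interpretation here, so a minimal sketch may help. It uses only the approximate E14.5 weights quoted in Reviewer 3's comment (they are read off figures, not taken from the underlying dataset), and the function and variable names are illustrative assumptions rather than anything from the manuscript.

```python
def placental_efficiency(fetal_weight_g, placental_weight_g):
    """Fetal/placental weight ratio: grams of fetus supported per gram of placenta."""
    if placental_weight_g <= 0:
        raise ValueError("placental weight must be positive")
    return fetal_weight_g / placental_weight_g


# Approximate E14.5 weights (grams) as quoted by the reviewer: (fetus, placenta).
weights = {"WT-wt": (0.30, 0.11), "HET-hom": (0.25, 0.12)}

# Conventional definition: fetal weight divided by placental weight.
efficiency = {g: placental_efficiency(f, p) for g, (f, p) in weights.items()}

# Inverted ratio (placenta/fetus), the suspected source of the original error.
inverted = {g: p / f for g, (f, p) in weights.items()}

for genotype in weights:
    print(genotype, round(efficiency[genotype], 2), round(inverted[genotype], 2))
```

      With these inputs the fetal/placental ratio is lower in HET-hom than in WT-wt, whereas the inverted ratio is higher, which is how an inverted calculation can be misread as improved efficiency.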

      As discussed in our response to Reviewer 1, Comment 1, we have added a range of analyses to determine whether placental efficiency was enhanced in HET-hom offspring. These include measuring fetal and maternal circulating glucose levels (Figure 4K), individual amino acids and an extensive range of metabolites (Figure 8) and providing CD31 immunofluorescent analyses of labyrinth area, labyrinth/placental ratio and capillary/labyrinth ratio in HET-hom and control placentas (Figure 5).

      We also added analyses of glycogen enriched and non-glycogen-enriched cell counts in the decidua and junctional zones. As suggested by Reviewer 3, both glycogen-enriched and non-enriched cell populations are significantly increased in HET-hom placentas.

      Combined, these new analyses significantly expand the study and support the conclusion that placental efficiency in HET-hom offspring was either compromised or not different from controls, depending on the analysis. We find no evidence that placental efficiency was increased in HET-hom offspring and have reworked our results and discussion sections to reflect these new data and interpretation.

      Reviewer 3, Comment 2 cont: Again, some of the methods are not clearly reported making interpretation difficult - especially how they have estimated their GlyT number.

      As outlined in our response to Reviewer 3 Comment 1, in the methods section we have added further detail of how we counted glycogen-enriched and non-enriched cells in the decidua and junctional zone regions of sections from the middle of WT-wt, WT-het, HET-het and HET-hom placentas (Lines 650-669).

      Reviewer 3, Comment 3: Perinatal embryonic/pup overgrowth.

      Strengths:

      • The overgrowth exhibited by the oocyte-Eed-deleted pups is striking and confirms the previous work by this group (Prokopuk, 2018). This is an important finding, especially in the context of understanding how PRC2-group gene mutations in humans cause overgrowth syndromes. It is also intriguing because it indicates that genetic/environmental insults in the mother that affect her gamete development can have long-term consequences on offspring physiology.

      Weaknesses:

      • Is the overgrowth intrauterine or is it caused by the increase in gestation length? The way the data is reported makes it impossible to work this out. The authors show that gestation time is consistently lengthened for mothers incubating oocyte-Eed-deleted pups by 1-2 days. In the supplementary material, the mutant embryos are not larger than WT at e19.5, the usual day of birth. Postnatal data is presented as day post-parturition. It would probably be clearer to present the embryonic and postnatal data as days post coitum. In this way, it will be obvious in which period the growth enhancement is taking place. This is information really important to determine whether the increased growth of the mutants is due to a direct effect of the intrauterine environment, or perhaps a more persistent hormonal change in the mother that can continue to promote growth beyond the gestation period.

      We have used embryonic day (E) to denote embryo and fetal age throughout the study – this is the same as using DPC (i.e. E19.5 is equivalent to 19.5 DPC). As described in the Methods “Collection of post-implantation embryos, placenta and postnatal offspring”, mice were time mated for two-four nights, with females plug checked daily. Positive plugs were noted as day E0.5.

      To make the data presentation clearer, we have shown the data for surviving HET-hom pups born on E19.5 (Figure 2J) separately from all HET-hom surviving pups born on E19.5-E21.5 (Figure 2G). As discussed in our response to Reviewer 2, we have also included growth data for pregnancies at E14.5, E17.5, E18.5 (Fig. 2C-F) and E19.5 (Figure 2J,K), as well as P0 (combined data for surviving pups born E19.5-E21.5), and P3 (combined data for surviving pups born E19.5-E21.5, Figure 2G,H).

      These data clearly show that HET-hom fetuses are substantially growth and developmentally delayed at E14.5 (Figure 2D), but HET-hom pups born on E19.5 are the same weight as WT-wt, WT-het and HET-het control pups (Figure 2J). This demonstrates that weight of HET-hom fetuses is normalised in utero between E14.5 and day of birth on E19.5.

      Importantly, as requested by Reviewer 3, we have separated average weight for all surviving pups with a day of birth of E19.5-21.5 (Figure 2G) from average weight of pups born on E19.5 only (Figure 2J). These analyses revealed that the average weight of surviving pups born between E19.5-21.5 was significantly higher than for controls (Figure 2G), but the average weight of pups born on E19.5 only was not. It is therefore clear that extended gestation also contributed to increased HET-hom pup birth weight. We have updated these additional analyses in Results (Lines 165-197) and Figure 2.

      As revealed in Figure 2H, it is also possible/likely that growth of HET-hom pups during the three days post-partum may have contributed to the offspring overgrowth we observed in this and our previous study (Prokopuk et al., 2018 Clinical Epigenetics). However, we cannot determine whether there is a contribution from a persistent maternal hormonal change that promotes post-natal offspring growth or whether there is an innate growth benefit in HET-hom pups. As this is very difficult to dissect, separating these possibilities is beyond the scope of our study.

      Reviewer 3, Comment 4: "fetal growth restriction followed by placental hyperplasia, .. drives catch-up growth that ultimately results in perinatal offspring overgrowth".

      Here the authors try to link their observations, suggesting that i) the increased perinatal growth rate is a consequence of placentomegaly, and ii) the placentomegaly/increased fetal growth is an adaptive consequence of the early growth restriction. This is an interesting idea and suggests that there is a degree of developmental plasticity that is operating to repair the early consequences of transient loss of Eed function.

      Strengths:

      • Discrepancies between earlier studies are reconciled. Here the authors show that in oocyte-Eed-deleted embryos growth is initially restricted and then the growth rate increases in late gestation with increased perinatal mass.

      Weaknesses:

      • Regarding the dependence of fetal growth increase on placental size increase, this link is far from clear since placental efficiency is in fact decreased in the mutants (see above).

      • "Catch-up growth" suggests that a higher growth rate is driven by an earlier growth restriction in order to restore homeostasis. There is no direct evidence for such a mechanism here. The loss of Eed expression in the oocyte and early embryo could have an independent impact on more than one phase of development.

      Firstly, there is growth restriction in the early phase of cell divisions. Potentially this could be due to depression of genes that restrain cell division on autosomes, or suppression of X-linked gene expression (as has been previously reported, Inoue, 2018 PMID: 30463900). The placentomegaly is explained by the misregulation of non-canonically imprinted genes, as the authors report (and in agreement with other studies, e.g. Inoue, 2020. PMID: 32358519).

      • Explaining the perinatal phase of growth enhancement is more difficult. I think it is unlikely to be due to placentomegaly. Multiple studies have shown that placentomegaly following somatic cell nuclear transfer (SCNT) is caused by non-canonically imprinted genes, and can be rescued by reducing their expression dosage. However, SCNT causes placentomegaly with normal or reduced embryonic mass (for example -Xie 2022, PMID: 35196486), not growth enhancement. Moreover, since (to my knowledge) single loss of imprinting models of non-canonically imprinted genes do not exist, it is not possible to understand if their increased expression dosage can drive perinatal overgrowth, and if this is preceded by growth restriction and thus constitutes 'catch up growth'.

      Reviewer 3 is correct in their assessment that placental efficiency was decreased in HET-hom offspring and we have corrected the placental efficiency analysis based on fetal/placental weight ratios (discussed in detail in our response to Reviewer 1 Comment 1). We have added substantially more data (glucose, amino acids, metabolites, labyrinth capillary area and density). These data support the conclusion that a placentally driven advantage for HET-hom fetal growth is unlikely, despite our observation that HET-hom fetuses are developmentally delayed and underweight at E14.5, but are born at normal weight after a normal gestational length (19.5 days) (discussed in our responses to Reviewer 3, Comment 3 and Reviewer 2).

      This demonstrates that HET-hom fetuses are able to attain normal birth weight despite being growth restricted at E14.5, and that this occurs despite low placental function. Moreover, as we compared isogenic offspring with heterozygous loss of Eed (HET-het compared to HET-hom offspring), the outcomes we observed in HET-hom offspring originate from loss of EED in the growing oocyte or loss of maternal EED in the zygote, strongly suggesting that a non-genetic mechanism is involved.

      As pointed out by Reviewer 3, the initial developmental delay in HET-hom offspring may be due to increased expression of genes that regulate cell proliferation – this could clearly explain the lower number of cells we observed in the ICM and the growth delay at later stages of embryonic and fetal development. Another possibility is that maternal PRC2 provided by the oocyte promotes cell divisions in preimplantation embryos. We have discussed these possibilities on Lines 467-476.

      In addition, Matoba et al 2022 demonstrated that deletion of maternal Xist together with Eed was able to rescue male-biased lethality in offspring from oocytes lacking Eed, revealing a clear role for X-linked genes in this phenotype (Matoba et al 2022, Genes and Development). However, deletion of maternal Xist did not properly normalise survival of offspring from Eed null oocytes (i.e. Eed/Xist double maternal null litters were smaller than litters derived from wild type oocytes), strongly suggesting other mechanisms provide the capacity for HET-hom offspring to attain normal weight at birth. We have added further discussion of the Matoba study in the context of our study in the Discussion (Lines 544-555).

      Finally, with respect to the outcomes for SCNT derived offspring, we extracted SCNT fetal growth and placental weight data from the supplementary data included in Matoba et al., 2018 Cell Stem Cell. 2018;23(3):343-54.e5 and compared it with data collected in our study (Figure 7). This analysis revealed that the weights of placentas and fetuses of offspring derived via SCNT were very similar to the HET-hom offspring in our study and we have discussed the similarities and potential differences between HET-hom and SCNT offspring in the Discussion (Lines 478-500).

      As pointed out by Reviewer 3, deletion of maternal non-canonically imprinted genes partially or fully rescued the placental hyperplasia phenotype in both SCNT derived offspring and offspring from oocytes lacking EED. However, as we have discussed, the mechanisms underlying other aspects of the offspring phenotype, such as fetal growth recovery of HET-hom offspring observed in our study, remain unknown. Moreover, the comparison we provide in Figure 7 strongly indicates that HET-hom and SCNT fetuses are similarly delayed at E14.5 and undergo similar fetal growth recovery before birth, but the mechanism also remains unknown. Together, it appears that offspring derived from either Eed-null oocytes or by SCNT have an innate ability to remediate fetal growth restriction during the late stages of pregnancy without a requirement to correct maternally inherited impacts mediated by Xist or H3K27me3-dependent imprinting.

    1. Author response

      Reviewer #1 (Public Review):

      The main contribution appears to be related to functional specialization. I suggest clarifying the major novelty of the present report and to focus the introduction on it.

      We thank this reviewer for this suggestion. We have revised the introduction to emphasize the functional specialization question. The changes are extensive; we have included a tracked-changes version of the manuscript to make these edits easy to see.

      There is a growing literature on fluctuating neural firing patterns that is not considered in this report. The scholarship appears a bit impoverished with only 19 references, many of which point to work from this group of collaborators. I suggest that the authors consider the present work in the context of the wider literature more scholarly, even if not all the relations of these different lines of work can be conclusively connected at this point. For a few examples, there is work by Kienitz and colleagues on fluctuating neural patterns in V4 evoked by competing grating stimuli. Also, the work by Engel, Moore, and colleagues on 'on' and 'off' states in the context of selective attention seems relevant, or the work by Fiebelkorn and Kastner on rhythmic perception and attention.

      We agree completely with this suggestion! We have reworded the introduction to be more inclusive of other research in this area (especially Kienitz and colleagues – exciting work that we are pleased to have had brought to our attention) and we have added about 500 words in the Discussion to cover the work on on/off states (Engel et al.), rhythmic perception (Fiebelkorn & Kastner and others), and attention more generally (e.g., Treisman & Gelade’s work on serial sampling). We are particularly pleased to add these sections because these topics are very much on our minds – we have a commentary piece under review elsewhere in which we evaluate these synergistic lines of approach in a more complete fashion. In total, we’ve added about 15 additional references.

      Reviewer #2 (Public Review):

      The description of the results would benefit from a better explanation of how low spike counts may influence the outcome of the analysis. Due to a smoothing procedure used for visualization, the spike counts for the paired stimuli (AB, black lines) shown in Figure 3a-b and Figure 4a-d go below 0. However, the actual spike count on a trial can not go below 0. The symmetric smoothing procedure may hide an underlying skewed distribution of spike counts that can only be positive. The statistical analysis is not performed on the smoothed distribution but on the actual spike counts, and the validity of the result is therefore not in question. However, the paper would benefit from 1) visualization of the unsmoothed trial counts, and 2) an explanation of how assumptions of symmetric/skewed distributions may affect the outcome.

      We thank the reviewers for noting this and making these suggestions. We now include unsmoothed raw spike counts in all the example figures (Figure 3a-b and Figure 4a-d). With regard to the symmetric/skewed distributions and the analysis methods, a Poisson distribution will be skewed at low rates and become more symmetric at higher rates, so this is already incorporated into the analysis. Indeed, the utility of Poisson distributions for fitting non-negative data is one of the reasons these distributions are so commonly used in neuroscience. We now make this point explicitly at the beginning of Methods/Data analysis: “Our method centers on modeling spike counts based on Poisson distributions, a common technique for handling non-negative count data in neuroscience and other fields.” With this edit as well as the revised example figures now making clear that no spike counts are below zero, we are optimistic that readers will better understand the analysis method and how the shape of response distributions is incorporated into it.
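
      The point about skew can be illustrated with a short sketch. It computes the skewness of a Poisson distribution numerically from its probability mass function and compares it with the standard closed form, one over the square root of the rate. The rates used are hypothetical values chosen for illustration, not spike counts from the study, and the helper names are assumptions.

```python
import math


def poisson_pmf(k, rate):
    """P(count = k) for a Poisson distribution, evaluated in log space for stability."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def numerical_skewness(rate, k_max=400):
    """Third standardized moment, summed over the non-negative counts 0..k_max."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, rate) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5


# Hypothetical mean spike counts per trial, from low to high.
for rate in (1.0, 4.0, 25.0):
    print(rate, round(numerical_skewness(rate), 3), round(rate ** -0.5, 3))
```

      Because the distribution is defined only on counts of zero and above, the skew at low rates falls out of the model itself rather than needing a separate correction.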

    1. Author Response:

      We take the liberty to thank all of you for your constructive and inspiring comments, which will help us substantially improve the final version of the paper. Before our final revision with details, I am writing this provisional letter to have a quick response to our reviewers’ comments.

      I first give a quick and short summary for your public reviews, then respond point-by-point.

      Editors:

      1. More discussion is needed.

      2. More discussion about eye fixation during adaptation. Discuss why increasing visual uncertainty by blurring the cursor in the present study produces the opposite findings of previous studies (Tsay et al., 2021; Makino et al., 2023).

      3. Discuss the broad impact of the current model.

      4. Share the codes and the metadata (instead of the current data format).

      Response: This is a concise summary of the major concerns listed in the public review. Given these concerns are easy to address, we are giving a quick but point-to-point response for now. The elaborate version will be put into our formal revision.

      Reviewer 1:

      1) More credit should be given to the PReMo model: a) The PReMo model also proposes that perceptual error drives implicit adaptation, as in a new publication in Tsay et al., 2023, which was not public at the time of the current writing; and b) The PReMo model can account for some dataset, e.g. Fig 4A.

      Response: We will add this new citation and point out that the new paper also uses the term perceptual error. We will also point out that the PReMo model has the potential to explain Fig 4A, though for now, it assumes an additional visual shift to explain the positive proprioceptive changes relative to the target. We will expand the discussion about the comparison between the two models.

      2) The present study produced an opposite finding of a previous finding, i.e., upregulating visual uncertainty (by cursor blurring here) decreases adaptation for large perturbations but less so for small perturbations, while previous studies have shown the opposite (by using a cursor cloud; Tsay et al., 2021; Makino et al., 2023). This needs explanation.

      Response: Using the cursor cloud (Tsay et al., 2021, Makino et al., 2023) to modulate visual uncertainty has inherent drawbacks that make it unsuitable for testing the sensory uncertainty effect for visuomotor rotation. For the error clamp paradigm, the error is defined as angular deviation. The cursor cloud consists of multiple cursors spanning over a range of angles, which affects both the sensory uncertainty (the intended outcome) AND the sensory estimate of angles (the error itself, the undesired outcome). In Bayesian terms, the cursor cloud aims to modulate the sigma of a distribution (sigma_v in our model), but it additionally affects the mean of the distribution (mu). This unnecessary confound is avoided by using cursor blurring, which is still a cursor with its center (mu) unchanged from an un-blurred cursor. Furthermore, as correctly pointed out in the original paper by Tsay et al., 2021, the cursor cloud often overlaps with the visual target. This “target hit” would affect adaptation, possibly via a reward learning mechanism (See Kim et al., 2019 eLife). This is a second confound that accompanies the cursor cloud. We will expand our discussion to explain the discrepancy between our findings and previous findings.
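
      A toy simulation can make the confound concrete. It assumes, purely for illustration, that each cloud is a small set of dots whose angles are drawn from a normal distribution around the clamp angle on every trial; the stimulus generation in the cited studies may differ, and the clamp angle, spread and dot count below are hypothetical.

```python
import random
import statistics

CLAMP_ANGLE_DEG = 30.0    # hypothetical clamp angle (the intended mu)
CLOUD_SPREAD_DEG = 10.0   # hypothetical SD of dot angles within one cloud
DOTS_PER_CLOUD = 5        # hypothetical number of dots per cloud


def cloud_centroid(rng):
    """Mean angle of one randomly generated cloud of dots."""
    dots = [rng.gauss(CLAMP_ANGLE_DEG, CLOUD_SPREAD_DEG) for _ in range(DOTS_PER_CLOUD)]
    return statistics.fmean(dots)


def blurred_cursor_center():
    """A blurred cursor is one cursor; blurring leaves its center at the clamp angle."""
    return CLAMP_ANGLE_DEG


rng = random.Random(0)  # fixed seed so the sketch is reproducible
centroids = [cloud_centroid(rng) for _ in range(2000)]

# Trial-to-trial scatter of the cloud's center, which a blurred cursor does not have.
centroid_sd = statistics.stdev(centroids)
print(round(centroid_sd, 2), blurred_cursor_center())
```

      Under these assumptions the center of the cloud varies from trial to trial by roughly the dot spread divided by the square root of the dot count, so the manipulation perturbs the mean as well as the uncertainty, whereas the center of the blurred cursor stays fixed.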

      3) The estimation of visual uncertainty (our exp1) required people to fixate on the target, while this might not reflect the actual scenario during adaptation where people are free to look wherever they want.

      Response: Our data show otherwise: in a typical error-clamp setting, people fixate on the target for the majority of the time. For our Exp1, the fixation on the straight line between the starting position and the target is 86%-95% (as shown in Figure S1). We also collected eye-tracking data in our Exp4, which is a typical error-clamp experiment. More than 95% of gaze falls within +/- 50 pixels of the center of the screen, slightly higher than in Exp1. We will provide this part of the data in the revision. In fact, we designed our Exp1 to mimic the eye-tracking pattern seen in typical error-clamp learning, based on carefully executed pilot experiments.

      This high percentage of fixating on the target is not surprising: the error-clamp task requires participants to use their hands to move towards the target and to ignore the cursor. In fact, we would also like to point out that the high percentage of fixation on the aiming target is also true for conventional visuomotor rotation, which involves strategic re-aiming (shown in de Brouwer et al. 2018; Bromberg et al. 2019; we have an upcoming paper to show this). This is one reason that our new theory would also apply to other types of motor adaptation.

      4) More methodology details are needed. E.g., a figure showing the visual blurring, a figure showing individual data, a table showing data from individual sessions, code sharing, and a possible new correlational analysis.

      Response: All of this additional methodological and analysis information will be provided. We limited ourselves to a short paper in the original submission, but the revision will be extended to include these details.

      Reviewer 2:

      1) More discussions are needed since the focus of this study is narrowly confined to visuomotor rotation. “A general computational principle, and its contributions to other motor learning paradigms remain to be explored”.

      Response: This is a great suggestion, since we also think our original Discussion did not sufficiently elaborate on the possible broad impact of our theory. Our model is not limited to error-clamp adaptation, where the participants were explicitly told to ignore the rotated cursor. The error-clamp paradigm is a rare example in which implicit motor learning can be isolated in a nearly ideal way. Our findings thus imply two key aspects of implicit adaptation: 1) localizing one’s effector is implicitly processed and continuously used to update the motor plan; 2) Bayesian cue combination is at the core of integrating multimodal feedback and motor-related cues (motor prediction cue in our model) when forming procedural knowledge for action control.

      We will propose that the same two principles should be applied to various kinds of motor adaptation and motor skill learning, which constitutes motor learning in general. Most of our knowledge about motor adaptation is from visuomotor rotation, prism adaptation, force field adaptation, and saccadic adaptation. The first three types all involve localizing one’s effector under the influence of perturbed sensory feedback, and they also have implicit learning. We believe they can be modeled by variants of our model, or at least we should consider using the two principles above to think of their computational nature. For skill learning, especially for de novo learning, the area still lacks a fundamental computational model that accounts for the skill acquisition process on the level of relevant movement cues. Our model suggests a promising route, i.e., repetitive movements with a Bayesian cue combination of movement-related cues might underlie the implicit process of motor skills.

      We will add more discussion on the possible broad implications of our model in the revision.

      Reviewer 3:

      1) Similar to Reviewer 1, Reviewer 3 raised the concern about whether people’s fixation in typical motor adaptation settings is similar to the fixation that we instructed in our Exp1.

      Response: see above.

      2) Similar to Reviewer 2, the concern was raised about whether our new theory is applicable to a broad context. Especially, error clamp appears to be a strange experimental manipulation that has no real-life appeal, “(i)Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world”.

      Response: about the broad impact of our model, please see responses to Reviewer 2 above. We agree that ignoring errors (and thus “trying” to suppress adaptation) should not be a movement strategy for real-world intentional tasks. However, even in real life, we constantly attend to one thing and do the other thing; that’s when implicit motor processes are in charge. Furthermore, it is this exact “ignoring” instruction that elicits the implicit adaptation that we can work on. In this sense, the error-clamp paradigm is a great vehicle to isolate implicit adaptation and allows us to unpack its cognitive mechanism.

      3) In Exp1, the 1s delay between the movement end and the presentation of the reference cursor might inflate the actual visual uncertainty.

      Response: The 1s delay of the reference cursor would not inflate the estimate of visual uncertainty. Our Exp1 used a paradigm similar to one from vision science (e.g., White, Levi, and Aitsebaomo, Vision Research, 1992), which shows that delay does not lead to an obvious increase in visual uncertainty over a broad range of values (from 0.2s to >1s, see their Figures 5-6). We will add more methodological justification in our revision.

      4) Our Fig4A used Tsay et al., 2021 data, which, in the reviewer’s view, is not an appropriate measure of proprioceptive bias. The reason is that in this dataset, “participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to.”

      Response: We agree that the Tsay et al. (2021) study used an unconventional way to measure the influence of implicit adaptation on proprioception. Moreover, their observed “proprioceptive changes” should not be called “proprioceptive bias”, which is conventionally a term reserved for the difference between the estimated hand location and the actual hand location (preferably measured with a passively moved hand). However, we think their dataset is still subject to the same Bayesian cue combination principle and thus can be modeled. Our modeling of this dataset includes all relevant cues: the implicitly perceived hand position and the proprioceptive cue (given that the hand stays at the movement end). Both cues are in extrinsic coordinates, which happened to set the target position as zero. But where to set the zero (whether it is the target or the actual hand location) does not matter for the model fitting. Note that our Exp4 is also based on PEA modeling of proprioceptive bias, and this time the data are presented relative to the actual hand location.

      In the revision, we will keep the current Fig4A but refer to the data as proprioceptive change rather than proprioceptive bias, to follow the convention.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      In no particular order:

      1. In Figs S3 and S4, can they also show gamma fit? (or rather corrected fit accounting for abundance conditioning?) The shapes look different, especially for the microbial mat.

      Author response: We have added gamma distribution fits to the rescaled AFD plots (Figs. S3, S4).

      1. Lines 170-176 seem like they should come before lines 164-166.

      Author response: In lines 166-170 we discuss empirical patterns in the data that motivate the introduction of the SLM as a model in lines 170-175. We have clarified these points in the revision.

      1. The wiggles in the gamma predictions in the occupancy-abundance plots are because occupancy depends not only on abundance but also on the shape parameter, right? Probably good to write a sentence or two explaining what's going on here.

      Author response: We agree with the reviewer that the variation in the prediction could be in-part driven by variation in the shape parameter across community members. We now include this observation in our revision (lines 209-211).

      1. In the predicted vs observed occupancy plots, it would be nice to add curves showing predicted standard deviation or similar to give a sense of how well the model is predicting the variability.

      Author response: In the revised manuscript we now include predictions for the variance of occupancy using the gamma distribution under both taxonomic and phylogenetic coarse-graining (Fig. S9; S10; lines 211-214).

      1. Covariance between sister groups: Figs S9 and S10 look very nice, but it's hard to see much because they're log-log plots over multiple decades, while even a several-fold difference from y = x would indicate a strong effect of correlations. It would be clearer if the y-axis showed the ratio of the coarse-grained variance to the sum of OTU variances and we were looking at how well it fit y = 1.

      Author response: We have included these plots in the revision (Fig. S14, S15).
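
      The ratio requested by the reviewer can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name and the toy abundance lists are hypothetical, and population variances are used for simplicity. If OTU abundances were uncorrelated, the variance of the coarse-grained (summed) abundance would equal the sum of the OTU variances and the ratio would be 1.

```python
from statistics import pvariance


def variance_ratio(otu_abundances):
    """Ratio of coarse-grained variance to the sum of per-OTU variances.

    otu_abundances: list of per-OTU lists, one value per sample
    (hypothetical input layout chosen for this sketch).
    """
    n_samples = len(otu_abundances[0])
    # Coarse-graining here means summing the member OTUs within each sample.
    coarse = [sum(otu[j] for otu in otu_abundances) for j in range(n_samples)]
    return pvariance(coarse) / sum(pvariance(otu) for otu in otu_abundances)


# Toy data (hypothetical): two OTUs observed in four samples.
uncorrelated = variance_ratio([[1.0, 2.0, 1.0, 2.0], [1.0, 1.0, 2.0, 2.0]])
positive = variance_ratio([[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]])
negative = variance_ratio([[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]])
print(uncorrelated, positive, negative)
```

      Ratios above 1 indicate net positive covariance among the grouped OTUs and ratios below 1 indicate net negative covariance.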

      1. If the sum of gammas can be well-approximated by a gamma, does that mean that the gamma is just a fairly flexible distribution and we shouldn't take the quality of the gamma fits in general as a very specific indication of what's going on?

      Author response: While the sum of random variables that are drawn from gamma distributions with different parameters is often well-approximated by another gamma, this does not tell us why the gamma distribution holds for microbial communities at the finest-grain level (i.e., OTUs/ASVs). At present, the best explanation is that the gamma is a stationary distribution for certain stochastic differential equations which have ecological interpretations (Grilli, 2020; Shoemaker et al., 2023). Furthermore, alternative two-parameter distributions have been tested alongside the gamma and have done a comparatively poor job capturing observed macroecological patterns (Grilli, 2020). These results suggest that the utility of the gamma distribution is not simply an outcome of its flexible nature; it succeeds because it has captured core ecological properties of microbial communities. In the case of the SLM, gamma-like distributions arise when a community member is subject to self-limiting growth and environmental noise. On the other hand, the stability of the gamma distribution might explain why it can be detected as the shape of the AFD, as it does not fade out across coarse-graining levels.

      1. What's going on with the variance of diversity in Fig S12? Does this suggest that some of the problem in Figure 4 could be with the analytic approximation rather than the model? I had a hard time understanding the part of the Methods explaining the simulation details (lines 587-597). It would be worth expanding this. Is there some way to explain how the correlations were simulated in terms of the SLM, e.g., correlations in the noise term across OTUs?

      Author response: We believe that deviations in the variance of diversity in Fig. S16g,h are driven by small deviations in our predictions of the second moment $$\langle (x \ln x)^{2} \mid N_{m}, \bar{x}_{i}, \beta_{i}^{2} \rangle$$ (Eq. S16). Alone these deviations are slight, but their effects become noticeable when summed over hundreds or thousands of taxa. We have included this observation in the revised manuscript (lines 268-271). However, this deviation pales in comparison with the magnitude of covariance in the empirical data, suggesting that our inability to predict the variance of richness and diversity is primarily driven by our assumption of statistical independence.

      Regarding the source of the correlations, under the SLM correlations in abundances can be introduced either by adding deterministic interaction terms or through correlated environmental noise. Determining which of these two options drives empirical correlations is an active area of research (e.g., Camacho-Mateu et al., 2023). For the purpose of this study, we remain agnostic on the cause of the correlations, opting instead to emphasize that the inclusion of correlations is necessary to reproduce observed slopes of the fine vs. coarse-grained relationship for diversity.

      1. In Figure 5ab, is the idea that the correlation in richness is primarily driven by the number of samples from the environment? Line 390 seems to say so, but it would be good to make this explicit and put it right in that section of the Results.

      Author response: Our results suggest that sampling effort (# reads) plays a larger role in determining the correlations between fine and coarse-grained measures of richness. We now clarify this point in the revised manuscript (lines 429-435).

      1. I don't totally understand the contrast in lines 369-372. If fine-scale diversity within one group begets coarse-grained diversity in another group, couldn't that show up as correlations in the AFDs? Or is the argument that only including within-group correlations in AFDs is enough to reproduce the pattern? I'm not sure I see how that could be.

      Author response: The term “begets” implies both causation and direction. If we see a positive relationship between diversity estimates at two different scales of observation, the causal mechanism cannot be determined solely from correlations between samples obtained once from different sites. So, mechanisms consistent with niche construction/"DBD" can produce correlations, though the existence of correlations does not necessarily imply DBD.

      1. The discussion of niche construction on 429-431 doesn't match very well with 440-441. Basically, niche construction is a very broad concept, not a specific one, right?

      Author response: In lines 472-576 (formerly 429-431) we discuss how the existence of correlations between fine and coarse-grained scales does not point to a single ecological mechanism. Alternatively stated, observing a non-zero slope does not mean that niche construction is driving the relationship.

      In lines 476-487 (formerly 440-441) we discuss how the mechanism of cross-feeding has been shown to generate a positive relationship between fine and coarse-grained measures of diversity. This mechanism can be interpreted as a form of “niche construction”, so it is an instance of a tested ecological mechanism that aligns with the interpretation given in Madi et al. (2020).

      1. Isn't (8) just the negative binomial distribution?

      Author response: The convolution of the stationary solution of the SLM (i.e., a gamma distribution) and the Poisson limit of a multinomial sampling distribution returns a negative binomial distribution of read counts across hosts if samples have identical sampling depths. We now include this detail in the revision (lines 593-595). Note, however, that if different samples have different sampling depths, the distribution of reads across samples is not a negative binomial.
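
      The gamma-Poisson statement above can be checked numerically. The sketch below is an illustration under assumptions, not the manuscript's Eq. 8: the gamma distribution is parameterized by a mean relative abundance `mean` and a shape `shape` (scale = mean / shape), `depth` is a single shared sampling depth, and all names and numbers are hypothetical. It compares the closed-form negative binomial with a brute-force integral of the Poisson probability over gamma-distributed abundances.

```python
import math


def neg_binomial_pmf(n, depth, mean, shape):
    """Closed-form P(n reads) for gamma abundances with Poisson sampling."""
    rate = depth * mean / shape  # sampling depth times the gamma scale
    log_p = (math.lgamma(n + shape) - math.lgamma(n + 1) - math.lgamma(shape)
             + n * math.log(rate / (1.0 + rate))
             - shape * math.log(1.0 + rate))
    return math.exp(log_p)


def mixture_pmf(n, depth, mean, shape, steps=50_000):
    """Same probability via midpoint-rule integration over abundance x."""
    scale = mean / shape
    upper = 60.0 * scale  # the gamma tail beyond this is negligible here
    h = upper / steps
    log_const = (-math.lgamma(shape) - shape * math.log(scale)
                 - math.lgamma(n + 1))
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        log_gamma_part = (shape - 1.0) * math.log(x) - x / scale
        log_poisson_part = n * math.log(depth * x) - depth * x
        total += math.exp(log_const + log_gamma_part + log_poisson_part)
    return total * h


# Hypothetical parameters: 1000 reads, mean relative abundance 0.01, shape 2.
for n in range(4):
    print(n, neg_binomial_pmf(n, 1000, 0.01, 2.0), mixture_pmf(n, 1000, 0.01, 2.0))
```

      The agreement holds for one fixed depth. Pooling samples with different depths mixes negative binomials with different parameters, which is consistent with the caveat in the response.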

      1. Missing 1/M in (9).

      Author response: We have fixed this omission in the revision.

      1. Schematic figures illustrating what the different statistics are intuitively capturing would really help this work be understandable to a broader audience, but they'd also be a ton of work.

      Author response: Richness and diversity are used in ecology to such an extent that we do not see the benefit of a conceptual diagram. Furthermore, we have included a conceptual diagram about our pipeline in our revision at the request of Reviewer 2 (Fig. S20).

      Reviewer #2:

      Major Recommendations

      If I were reviewing this manuscript for a regular journal, I believe the following issues would be important to address prior to publication.

      1. From my reading, the main points of this advance are that

      a. SLM models AFDs well at all levels of coarse-graining.

      b. This makes SLM a better null-model than UNTB for macroecological relationships.

      c. Using SLM on the EMP data, the richness slopes are well explained by SLM but not the diversity slopes. Therefore, any theory that hopes to explain the diversity slopes must include interactions. Argument B appears to be one of the key points yet is missing from the abstract, and should be made clearer. If these aren't the main points the authors intended, then other main points need to be highlighted more.

      Author response: In the revision we now explicitly mention argument b in the Abstract.

      1. The title should be more specific, so as to better reflect the content. (E.g. "UNTB is not a good null model for macroecological patterns" would seem more appropriate.)

      Author response: We would prefer to focus on the success of the SLM rather than the limitations of the UNTB in the title of this work. Therefore, we have modified our title as follows: “Investigating macroecological patterns in coarse-grained microbial communities using the stochastic logistic model of growth”.

      1. The manuscript would benefit from a clearer description of exactly what information the SLM retains about the data (perhaps even a cartoon panel in one of the figures). In particular, it is important to be explicit about the number of model parameters.

      Author response: The number of model parameters for the gamma AFD are now explicitly stated in the revision (Lines 579-580).

      1. The main point of Figures 2-4 seems to be that SLM is good at describing the data (and when it fails it is due to interactions) while UNTB fails to reproduce this behavior, in support of Argument B. This is not clear from the figure descriptions or titles, which focus on SLM's "predictive" power.

      Author response: Fig. 2a demonstrates that the gamma distribution predicted by the SLM explains the empirical distribution of abundances. This result provides motivation to predict the fraction of sites harboring a given community member (i.e., occupancy, Fig. 2c) as well as general measures of community composition including mean richness (Fig. 3a,c) and mean diversity (Fig. 3b,d) using parameters estimated from the data (not free parameters).

      This success led us to consider whether the gamma distribution could predict the variance of richness and diversity, which it could not because it does not capture covariance between community members (Fig. 4).

      In the revision we have identified opportunities to make these points clear throughout the Results. Furthermore, we have added additional detail to the legends of Figs. 2-4.

      1. The manuscript would benefit from clarifying the use of "prediction" related to the SLM. Since the gamma distributions predicted by SLM were fit to empirical data, it seems like the agreement between analytic means and empirical means (Fig. 3) is a statement on gamma distributions being a good fit for the AFD's more than SLM predicting richness and diversity. For example, from my reading, it seems like this analysis could be done numerically by shuffling species abundances across environments and seeing whether this changed the mean richness/diversity. I would not call this shuffling test a prediction, since it is more a statement on the relevance of interactions. SLM predicts gamma-distributed AFD's, but those distributions recovering the data they were trained on doesn't seem like a prediction.

      Author response: In this manuscript we identified the gamma distribution as an appropriate probability distribution to describe the distribution of relative abundances across samples over a range of coarse-grained scales. Motivated by this result, we performed a separate analysis where at each scale we estimated the mean and variance of relative abundance across sites for each community member. We then used these parameters to obtain the expected value of a community-level measure using an equation we derived by assuming that the gamma distribution was appropriate (e.g., richness, Eq. 13). We then compared the expected value of richness to the mean value from empirical data and assessed the similarity between the two values.

      The outcome of this procedure constitutes a prediction. While the mean and variance are parameters, estimating them from the empirical data has no connection with the operation of training a distribution on empirical data. We could have derived predictions such as Eq. 13 using any other probability distribution that can be parameterized using the mean and variance (e.g., Gaussian). Such a prediction would likely do a poor job even though it used the same means and variances used for our gamma predictions. This is because the choice of distribution would not have been a good descriptor of the distribution of abundances across hosts.

      To better explain this last -- perhaps the most significant -- issue, I'd like to ask the authors if the following recasting would be an accurate reflection of their conclusions, or if something is missing.

      1. "Focusing on the empirical relationship observed between diversity slopes by Madi 2020, we ask the question: does explaining these relationships require accounting for species-species correlations? Or could it be reproduced in a noninteracting model?" To address this question, one can perform a randomization test, shuffling abundances to preserve all single-OTU statistics but breaking any correlations. My reading of the authors' results is that (new result 1) the richness relationships would be preserved, while diversity relationships would not be preserved. [Note that this result 1 need not mention either SLM or UNTB.]

      Author response: The question of whether correlations between species are necessary to explain the observed slope of the fine vs. coarse-grained relationship was only one component of our research goals. Our first question was whether the SLM would prove to be a more appropriate null for evaluating the novelty of observed slopes. We believe that our results support the conclusion that the SLM is an appropriate null for this question, as it was able to capture observed slopes of the fine vs. coarse-grained relationship for estimates of richness, indicating that correlations, and the interactions ultimately responsible for them, are not necessary to explain this result.

      We then find that the SLM as a null model fails to capture observed slopes of the fine vs. coarse-grained relationship for estimates of diversity and simulate the SLM with correlations to return reasonable estimates of the slope. However, here the question about correlations is a direct follow-up from our question about a null model that excludes interactions, so it is unclear how a randomization test would relate to this result.
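
      For readers unfamiliar with the randomization test proposed in the reviewer's comment, a minimal sketch is given here. It is not part of the authors' analysis: the function name, the dictionary layout, and the toy values are hypothetical. Each OTU's abundances are permuted across samples independently, which preserves every single-OTU statistic while breaking correlations between OTUs.

```python
import random


def shuffle_within_otus(table, seed=0):
    """Permute each OTU's abundances across samples independently.

    table: dict mapping OTU name -> list of abundances, one per sample
    (hypothetical layout). Returns a new dict; the input is not modified.
    """
    rng = random.Random(seed)
    shuffled = {}
    for otu, values in table.items():
        permuted = list(values)
        rng.shuffle(permuted)
        shuffled[otu] = permuted
    return shuffled


# Toy table (hypothetical values): three OTUs across five samples.
table = {
    "otu_a": [5, 0, 3, 8, 1],
    "otu_b": [4, 1, 2, 9, 0],
    "otu_c": [0, 7, 1, 0, 6],
}
null_table = shuffle_within_otus(table, seed=42)
print(null_table)
```

      A real application would also need to decide how to treat per-sample totals, since permuting counts changes the sampling depth of each sample.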

      1. Instead of doing a randomization test (resampling the empirical distribution), one might insist on instead fitting a model to the AFD distributions, and sampling from that distribution rather than the empirical one.

      a. If doing it this way, one should of course ensure that the distribution being fit is a good description of the data.

      b. UNTB is a bad fit. SLM is a better fit, and in fact (new result 2) continues to be a good empirical fit even at coarse-grained levels.

      c. Can make statements on using SLM as a null model for these types of cross-scale relationships. Could try arguing that fitting an SLM model per-OTU (instead of resampling the empirical distribution) could offer some advantage if certain properties could be computed analytically from the fit parameters, instead of averaging over multiple computational rounds of resampling.

      Do these two points accurately summarize the manuscript? If so, this presentation avoids the confusion with "prediction". If my summary is missing some important point, the presentation should be revised to clarify the points I appear to have missed.

      Author response: In our manuscript we derive predictions from the gamma distribution, the stationary distribution of the SLM, that require parameters estimated from the data (i.e., mean and variance of relative abundance). These parameters are estimated from the data using normal procedures and then plugged into our predictions that assume the appropriateness of the gamma, returning values that are then compared to estimates from empirical data. Our estimation of the mean and variance does not assume that the empirical distribution follows a gamma distribution, but the value returned by our function derived from the gamma distribution (e.g., Eq. 13) does make that assumption.

      To address the reviewer’s broader comment, we believe that the following points summarize our manuscript:

      1. The gamma distribution as a stationary solution of the SLM captures macroecological patterns and predicts typical community-level properties (i.e., mean richness and diversity) across phylogenetic and taxonomic scales.

      2. The gamma distribution fails to predict variation in community-level properties (i.e., variance of richness and diversity) across phylogenetic and taxonomic scales. This occurs because the SLM is a mean-field model that does not explicitly include interactions between community members.

      3. Despite the inability to capture interactions, the gamma distribution succeeds at predicting the fine vs. coarse-grain slope for richness, a pattern that had previously been attributed to community member interactions. This result demonstrates that the novelty of a macroecological pattern hinges on one’s choice of null model.

      4. However, the gamma cannot capture the same relationship for diversity. Simulations of the gamma distribution that incorporate correlations between community members are capable of generating reasonable estimates of the slope.

      To address the reviewer’s comments regarding the appropriateness of fitted gamma distributions, in our revision we have added fitted gamma distributions to plots of AFDs so that the reader can visually assess the ability of the gamma to describe empirical patterns (Fig. S3, S4).

      We have also obtained predictions for the slope of the fine vs. coarse-grained relationship for community richness using the same form of UNTB used by Madi et al (2020). In our revised manuscript we establish a procedure to infer the single parameter of this model, generate predictions of richness at fine and coarse-grained scales, and then evaluate whether the UNTB is capable of predicting the slope of the fine vs. coarse-grained relationship for richness (Supplementary Information; Figs. S18, 24-28; lines 277-278; 370-380).

      Other/minor comments

      1. The manuscript would be improved with more consistent terminology ("fine vs. coarse-grained relationship"/"the relationship" vs. "diversity slope"). Also, many readers may be used to OTUs referring to the rather fine level of description, as opposed to any chosen level; and could interpret indexing over groups as being in contrast with indexing over OTU's (coarse vs fine). The authors' use is perfectly correct, but keeping a consistent terminology would help.

      Author response: We have revised our manuscript to specify the “slope” as the “slope of the fine vs. coarse-grained relationship” (e.g., Line 318). We also specify in the Results and in the Methods that we use “fine” and “coarse” as relative terms, keeping with the sliding-scale approach used in Madi et al (2020).

      1. While I appreciate this "slope" is something borrowed from other work, the clarity of the paper might benefit from a cartoon of how one goes from the raw data to the slopes at a particular coarse-graining level. (Optional).

      Author response: We have added a conceptual diagram to the revision (Fig. S20).

      1. The text often colloquially references "the gamma," "predictions of the gamma," etc. This phrasing comes across as sloppy, and the manuscript would be improved by being more specific.

      Author response: We now specify “gamma” as the “gamma distribution” throughout the manuscript.

      1. Equation 6 appears to be missing some subscripts on the x terms (included on the left of the equation).

      Author response: We thank the reviewer for noticing this error and we have corrected it in the revision.

      1. In "Simulating communities of correlated...AFDs", the acronym SAD is not defined.

      Author response: We thank the reviewer for noticing this error and we have corrected it in the revision.

      1. In Figure 2:

      a. Invariant is probably the wrong word for the title, since all the AFD's were rescaled by mean and variance before being compared. Data does support that the gamma distributions are good at describing the AFD's, but as stated in the description it's the general shape that is preserved, not the distribution itself.

      Author response: When we mention the invariance of the AFD we now specify that we mean that the shape of the distribution remained qualitatively invariant.

      b. I'd recommend changing the color coding to something with more contrast, since currently it's impossible to assess the claim that the shape of the distribution collapses.

      Author response: Our coarse-graining procedure is a sequential operation that has no intuitive point that would suggest the use of a contrasting colormap (e.g., if our scale ranged from -1 to 1 then there would be a natural point of contrast at zero).

      c. The legend is missing relevant technical details: How many OTU's were used to make plot a? How many samples?

      Author response: The number of samples was listed in the Materials and Methods (line 523). In the revision we now include a table with the average and total number of OTUs as well as the average number of reads for each environment (Table S1, S2).

      d. In plot b, is the mean relative abundance referring to "mean abundance when observed" or "mean across all samples"?

      Author response: The mean relative abundance is the mean abundance across all sites, as stated in the main text (line 204) and in the legend of Fig. 2.

      e. Since one argument here is that SLM fits these distributions better than UNTB, if possible it would be nice to see UNTB's failed fits here.

      Author response: A major feature of the UNTB is that the demographic parameters of community members are indistinguishable. Under the SLM, the variation in the mean relative abundance we observe suggests that the carrying capacities of community members vary over multiple orders of magnitude, a result that is incompatible with most forms of the UNTB (x-axis of Fig. 2b). We now mention this point in the revised manuscript (lines 110; 229; 455-471).

      1. In Figure 3:

      a. It is not clear how coarse-graining is included in model fitting. The "Deriving biodiversity measure predictions" section would benefit from including how coarse-graining is incorporated.

      Author response: We predict measures of biodiversity separately at each coarse-grained scale. We now clarify this detail in the revised manuscript (Lines 624-627).

      b. Reference Shannon Diversity in Methods.

      Author response: We now cite Shannon’s diversity.

      c. What is the blue/white color coding in plots a & c? It doesn't have any color key.

      Author response: Figs. 3-6 use a uniform light-to-dark scale for all environments, with each environment having its own color. For example, Fig. 3a contains data from the human gut microbiome. Human gut data were assigned the color aquamarine, so the shade of aquamarine for a given datapoint in Fig. 3a indicates the phylogenetic scale.

      In the revision we now clarify the colorscale in the legend of Fig. 3 and specify that the same scale is used in all subsequent figure legends.

      d. Re: earlier comments, why is richness considered a prediction? (Am I correct in my interpretation that panel b is almost a tautology - counting the number of zeros in the matrix either by rows or by columns - whereas panel d is nontrivial?)

      Author response: Mean richness as a measure of biodiversity depends on the fraction of sites where a given community member is present (i.e., occupancy). The mean relative abundance of a community member and its variation across sites (beta) is clearly related to occupancy, but those two statistics do not give you a prediction of occupancy. Obtaining a prediction of occupancy and, subsequently, richness, requires 1) a probability distribution of abundances (i.e., the gamma) and 2) a probability distribution of sampling (i.e., the Poisson). Using these two pieces of information, we derived a prediction for mean richness (Eq. 13). We then compare the value of richness obtained by plugging in the mean relative abundances, betas, and known number of reads to the observed mean richness obtained from the data.
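      The logic of this response can be sketched numerically. Assuming a gamma-distributed relative abundance with mean x̄ and shape β, and Poisson sampling of N reads, the probability that a member is absent from a sample is (1 + x̄N/β)^(-β); expected richness is then the sum over members of one minus that probability. The code below is a minimal sketch under those assumptions. The function names and toy values are hypothetical, and the exact form of Eq. 13 in the manuscript is not reproduced here and may differ.

```python
def prob_absent(mean_abundance, beta, n_reads):
    """P(zero reads) under gamma abundances with Poisson sampling.

    Assumes abundance x ~ Gamma(shape=beta, mean=mean_abundance) and
    reads | x ~ Poisson(x * n_reads), so P(0) = E[exp(-x * n_reads)].
    """
    return (1.0 + mean_abundance * n_reads / beta) ** (-beta)


def predicted_mean_richness(mean_abundances, betas, reads_per_site):
    """Expected richness per site, averaged over sites."""
    total = 0.0
    for n_reads in reads_per_site:
        total += sum(
            1.0 - prob_absent(m, b, n_reads)
            for m, b in zip(mean_abundances, betas)
        )
    return total / len(reads_per_site)


# Toy inputs (hypothetical): two community members, one site with 100 reads.
prediction = predicted_mean_richness(
    mean_abundances=[0.5, 0.01],
    betas=[1.0, 1.0],
    reads_per_site=[100],
)
```

      The prediction uses only the mean abundances, the betas, and the read depths, so comparing it with the observed mean richness tests the gamma and Poisson assumptions rather than restating the data.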

      e. The lettering of subplots in Figure 3 is not consistent with Figure 4. Figure 3 subplots are also cited incorrectly in paragraph two on page six (lines 251-254).

      Author response: We thank the reviewer for noticing the error and we have corrected it in the revision.

      f. Again, if possible show UNTB predictions in plots a & c.

      Author response: In our revised manuscript we provide extensive descriptions and predictions of mean richness and the slope of the fine vs. coarse-grained relationship for richness using the form of the UNTB used in Madi et al. (2020; Figs. S18, S24 - S29; lines 277-282; 370-380). We then compare the error of these slope predictions to those obtained from the SLM, finding that the SLM generally outperforms UNTB (Figs. S27-S29).

      1. In Figure 4:

      a. What are the color codings in plots a & b?

      Author response: The color scale used in Fig. 4 is identical to the color scale used in Fig. 3. This detail is now specified in the legend of Fig. 4.

      b. What are the two lines of empirical data in plots a & b, and why is one of them dashed?

      Author response: We now specify what the two lines mean in the key within the figure.

      c. Same comment as earlier on predictions and richness.

      Author response: We now specify what the two lines mean in the key within the figure.

      1. In Figure 5:

      a. It wasn't clear to me in the manuscript how the authors generated these plots from the raw data. The manuscript would benefit from a clear cartoon/description of the data pipeline, from raw data to empirical (and analytic) slopes.

      Author response: We have added a conceptual diagram to the revised manuscript (Fig. S20).

      b. Make the figure title more descriptive to better connect it to the figure's objective (the richness slopes relationship is not novel, but the diversity slopes relationship is).

      Author response: We have revised the figure title.

      References

      Camacho-Mateu, J., Lampo, A., Sireci, M., Muñoz, M. Á., & Cuesta, J. A. (2023). Species interactions reproduce abundance correlations patterns in microbial communities (arXiv:2305.19154). arXiv. https://doi.org/10.48550/arXiv.2305.19154

      Grilli, J. (2020). Macroecological laws describe variation and diversity in microbial communities. Nature Communications, 11(1), 4743. https://doi.org/10.1038/s41467-020-18529-y

      Madi, N., Vos, M., Murall, C. L., Legendre, P., & Shapiro, B. J. (2020). Does diversity beget diversity in microbiomes? eLife, 9, e58999. https://doi.org/10.7554/eLife.58999

      Shoemaker, W. R., Sánchez, Á., & Grilli, J. (2023). Macroecological laws in experimental microbial systems (p. 2023.07.24.550281). bioRxiv. https://doi.org/10.1101/2023.07.24.550281

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their thorough assessment of our study, their overall enthusiasm, and the helpful suggestions for clarifying the methods and results, additional analyses, and discussion points. We have made earnest efforts to address the weaknesses raised in the public review and other recommendations made by the reviewers.

      Public Reviews:

      Reviewer #1 (Public Review):

      Herein, Blaeser et al. explored the impact of migraine-related cortical spreading depression (CSD) on the calcium dynamics of meningeal afferents that are considered the putative source of migraine-related pain. Critically, previous studies have identified widespread activation of these meningeal afferents following CSD; however, most studies of this kind have been performed in anesthetized rodents. By conducting a series of technically challenging calcium imaging experiments in conscious head-fixed mice, they find, in contrast, that a much smaller proportion of meningeal afferents are persistently activated following CSD. Instead, they identify that post-CSD responses are differentially altered across a wide array of afferents, including increased and decreased responses to mechanical meningeal deformations and activation of previously non-responsive afferents following CSD. Given that migraine is characterized by worsening head pain in response to movement, the findings offer a potential mechanism that may explain this clinical phenomenon.

      Strengths:

      Using head-fixed conscious mice overcomes the limitations of anesthetized preps and the potential impact of anaesthesia on meningeal afferent function, which facilitated novel results when compared to previous anesthetized studies. Further, the authors used a closed cranial window preparation to maximize normal physiological states during recording, although the introduction of a needle prick to induce CSD will have generated a small opening in the cranial preparation, rendering it not fully closed as suggested.

      Weaknesses:

      Although this is a well-conducted, technically challenging study that has added valuable knowledge on the response of meningeal afferents, the study would have benefited from the inclusion of more female mice. Migraine is a female-dominant condition, and an attempt to compare potential sex differences in afferent responses would undoubtedly have improved the outcome.

      Our study included only two females, largely reflecting the much higher success rate of AAV-mediated meningeal afferent GCaMP expression in males than in females. The reason for the lower yield in female mice is unclear to us at present but may involve, at least partly, sex-specific differences in the mechanisms responsible for efficient transduction with this AAV vector observed in peripheral tissues (Davidoff et al. 2003). While our study did not address sex differences, a recent study (Melo-Carrillo et al. 2017) reported CSD equally activating and sensitizing second-order dorsal horn neurons that receive input from meningeal afferents in male and female rats.

      The authors imply that the current method shows clear differences when compared to older anaesthetized studies; however, many of these were conducted in rats and relied on recording from the trigeminal ganglion. Inclusion of a subgroup of anesthetized mice in the current preparation may have helped to answer the outstanding question of whether these differences are species dependent or a result of the different technical approaches.

      We have tried to address the anesthesia issue by conducting imaging sessions in several isoflurane-anesthetized mice. However, during these experiments, we observed a substantial decrease in the GCaMP fluorescence signal with a much lower signal-to-noise ratio that made the analyses of the afferents’ calcium signal unreliable. Reduced GCaMP signal in meningeal axons during anesthesia may be related to the development of respiratory acidosis, since lower pH leads to decreased GCaMP signal, as also mentioned by Reviewer #3. Of note, urethane anesthesia, which was used in all previous rat experiments, also produces respiratory acidosis.

      The authors discuss meningeal deformations as a result of locomotion; however, despite referring to their previous work (Blaeser et al., 2022), the exact method of how these deformations were measured could be clearer. It is challenging to imagine that simple locomotion would induce such deformations, and the one reference in the introduction refers to straining, such as coughing, that may induce intracranial hypertension, which is likely a more powerful stimulus than locomotion.

      As part of the revision, we now provide a better description of the methodology (“Image processing and calcium signal extraction” section) used to determine meningeal deformations, including scaling, shearing, and Z-shift. In our previous paper (Blaeser et al. 2023), we provided an extensive description of the types of meningeal deformations occurring in locomoting mice. It should also be noted that locomotion drives cerebral vasodilation and intracranial pressure increases (Gao and Drew, 2016), which likely mediate, at least in part, the movement of the meninges towards the skull (positive Z-shift) and potentially other meningeal deformation parameters. We also agree with the reviewer that sudden maneuvers such as coughing and sneezing that lead to a larger increase in intracranial pressure are likely to be even more powerful drivers of endogenous intracranial mechanical stimulation than locomotion. Thus, our finding of increased responsiveness to locomotion-related meningeal deformation post-CSD may underestimate the increased afferent responsivity post-CSD during other behaviors such as coughing. We added this point to the discussion.

      More recently, several groups have used optogenetic triggering of CSD to avoid opening the cranium for a needle prick. Given that the authors robustly highlight the benefit of the closed cranium approach, would such an approach not have been more appropriate?

      We agree with the reviewer that optogenetic methods used for CSD induction in non-craniotomized animals will further ensure accurate pressurization and, thus, will be an even better approach that avoids the burr hole used for pinprick. It should be noted, however, that the burr hole used for the pinprick likely had a minimal effect on intracranial pressure, as we minimized depressurization by plugging the burr hole throughout the experiments with a silicone elastomer. We have added this information to the revised Methods section.

      It is also worth noting that the optogenetic methodology used by others to provoke CSD was optimized only recently and relies on transgenic mice with a strong expression of YFP (Thy1.ChR2-YFP mice) within the superficial cortex that is not compatible with the afferent GCaMP imaging of meningeal afferents. Modifications using red-shifted opsins may allow the use of this strategy in the future.

      It was not clear how deformation predictors increased independent of locomotion (Figure 4D), as locomotion is essentially causing the deformations, as noted in the study.

      As noted in our previous paper (Blaeser et al., 2023), deformation variables often exhibit different time courses than locomotion, even when a deformation is initially induced by the onset of locomotion. Most notably, the scaling-related deformation ramps up slowly and often persists for tens of seconds after the onset and termination of locomotion, which may be related to the recovery dynamics of the meningeal vascular response to locomotion. Overall, while locomotion serves as a predictor of meningeal deformation, we observed previously (Blaeser et al. 2023) many afferents whose responses were more closely associated with the moment-to-moment deformations than with the state of locomotion per se, suggesting that a unique set of stimuli is responsible for the activation of this deformation-sensitive afferent population. The increased sensitivity to deformation signals we observed following CSD suggests that the afferent population sensitive to deformation has unique properties that render it most susceptible to becoming sensitized following CSD. We now discuss this possibility.

      Reviewer #2 (Public Review):

      This is an interesting study examining the question of whether CSD sensitizes meningeal afferent sensory neurons leading to spontaneous activity or whether CSD sensitizes these neurons to mechanical stimulation related to locomotion. Using two-photon in vivo calcium imaging based on viral expression of GCaMP6 in the TG, awake mice on a running wheel were imaged following CSD induction by cortical pinprick. The CSD wave evoked a rise in intracellular calcium in many sensory neurons during the propagation of the wave but several patterns of afferent activity developed after the CSD. The minority of recorded neurons (10%) showed spontaneous activity while slightly larger numbers (20%) showed depression of activity, the latter pattern developed earlier than the former. The vast majority of neurons (70%) were unaffected by the CSD. CSD decreased the time spent running and the numbers of bouts per minute but each bout was unaffected by CSD. There also was no influence of CSD on the parameters referred to as meningeal deformation including scale, shear, and Z-shift. Using GLM, the authors then determine that there is an increase in locomotion/deformation-related afferent activity in 51% of neurons, a decrease in 12% of neurons, and no change in 37%. GLM coefficients were increased for deformation-related activity but not locomotion-related activity after CSD. There also was an increase in afferents responsive to locomotion/deformation following CSD that were previously silent. This study shows that unlike prior reports, CSD does not lead to spontaneous activity in the majority of sensory neurons but that it increases sensitivity to mechanical deformation of the meninges. This has important implications for headache disorders like migraine where CSD is thought to contribute to the pathology in unclear ways with this new study suggesting that it may lead to increased mechanical sensitivity characteristic of migraine attacks.

      1) It would be helpful to know what is meant by "post-CSD" in many of the figures where a time course is not shown. The methods indicate that 4, 30 min runs were collected after CSD but this would span 2 hours and the data do not indicate whether there are differences across time following CSD nor whether data from all 4 runs are averaged.

      While we monitored time course changes in ongoing activity (see Figure 2), it was challenging to evaluate post-CSD changes in locomotion-related deformation responses at a fine temporal scale, as running bouts resumed at different time points post-CSD and occurred intermittently throughout the post-CSD analysis period. Our experiments were also not sufficiently powered to break out analyses at multiple different epochs post-CSD, partly because there was little locomotion. To allow comparisons using a sufficient number of bouts, we conducted our GLM analyses using all data collected during running bouts in the 2-hour post-CSD period (termed “post-CSD”) versus in the 1-hour pre-CSD period. We have now clarified this further in the main text and figure legends.

      2) Why is only the Z-shift data shown in Figures 4A-C? Each of the deformation values seems to contribute to the activity of neurons after CSD but only the Z-shift values are shown.

      In many afferents, only one deformation variable best predicted the activity at both the pre- and post-CSD epochs. However, at the population level, all deformation variables were equally predictive. In the examples provided, the afferent developed augmented sensitivity that could only be predicted by the Z-shift variable, and the other deformation variables were not included to keep the figure legible. This is now clarified in the figure legend.

      3) How much does the animal moving its skull against the head mount contribute to deformations of the meninges if the skull is potentially flexing during these movements? Even if mice are not locomoting, they can still attempt to move their heads thus creating pressure changes on the skull and underlying meninges. The authors mention in the methods that the strong cement used to bind the skull plates and headpost together minimize this, but how do they know it is minimized?

      We did not measure skull flexing during locomotion and its potential effect on meningeal deformation. However, we would like to point out several considerations. It is evident from numerous imaging studies across various brain regions in freely moving animals, utilizing brain motion registration, that brain motion of the same scale (a few microns) as that observed in our studies also occurs in the absence of head fixation (e.g., Glas et al., 2019; Zong et al., 2021). In our system, the head-fixed mouse is locomoting on a cantilevered (spring-like) running wheel (see also Ramesh et al., 2018), which dissipates most, albeit not all, upward and forward forces applied to the skull during locomotion. Furthermore, the position of the headpost, anterior to where the mouse's paws touch the wheel, makes it hard for the mouse to push straight up and apply forces to the skull. We have updated the text in the methods section (Running wheel habituation) to address this. In our previous work (see Figure 2B in Blaeser et al. 2023), we found a substantial subset of afferents showing an increase in calcium activity that began after each bout of locomotion had terminated, and that lasted for many seconds, suggesting that skull flexing during locomotion may not play a leading role. Finally, we proposed in that study that meningeal deformations play a major role in the afferent response, given our findings of (i) sigmoidal stimulus-response curves between afferent activity and meningeal deformation and (ii) different afferents that track scaling deformations along different axes. It is unlikely that all of these are related to any residual forces generated from skull deformations.

      4) What is the mechanism by which afferents initiate the calcium wave during the CSD itself? Is this mechanical pressure due to swelling of the cortex during the wave? If so, why does the CSD have no impact on the deformation parameters? It seems that this cortical swelling would have some influence on these values unless the measurements of these values are taken well after cortical swelling subsides. Related to point 1 above, it is not clear when these measurements are taken post-CSD.

      We provide, for the first time, evidence that CSD evokes local calcium elevation in meningeal afferent fibers in a manner that is incongruent with action potential propagation, as the activity gradually advances along individual afferents across many seconds during the wave. As indicated in Figure 1H, we measured these changes during the first 2 minutes post-CSD. Based on the reviewer’s question, we have now addressed whether mechanical changes occurring in the cortex in the wake of CSD might be responsible for the acute afferent activation we observed. We now include new data (Results, “Acute afferent activation is not related to CSD-evoked meningeal deformation” and Figure S2) showing an acute phase of meningeal deformation (as expected given the changes in extracellular fluid volume) lasting 40-80 seconds following the induction of CSD. Our data suggest, however, that these meningeal deformations are unlikely to be the main driver of the acute afferent calcium response. We propose that, based on the speed of the afferent calcium wave propagation and the distinct dynamics of calcium activity as compared to the dynamics of the deformations, the acute afferent response is more likely to be mediated by the spread of algesic mediators (e.g., glutamate, K+, ATP) and their diffusion into the overlying meninges.

      Because the peri-CSD meningeal deformations return to baseline soon after the cessation of the CSD wave, they are unlikely to affect our analyses of post-CSD changes in afferent sensitivity in the following 2 hours. This is also supported by our data (see Figure 3F-H) showing similar locomotion-related deformations pre- and post-CSD, which were measured after the deformations related to the CSD itself had subsided.

      5) How does CSD cause suppression of afferent activity? This is not discussed. It is probably a good idea in this discussion to reinforce that suppression in this case is suppression of the calcium response and not necessarily suppression of all neuronal activity.

      The mechanism underlying the suppression of afferent activity remains unclear. We now discuss the following points:

      First, the pattern of afferent responses resembles the rapid loss of cortical activity in the wake of a CSD, but its faster recovery points to a mechanism distinct from the pre- and post-synaptic changes responsible for the silencing of cortical activity (Sawant-Pokam et al., 2017; Kucharz and Lauritzen, 2018). Whether CSD drives the local release of mediators capable of reducing afferent excitability and spiking dynamics will require further studies.

      Second, the reviewer proposes that the suppressed calcium activity we observed in ~20% of the afferents immediately following CSD may reflect a decreased calcium response independent of afferent spiking activity. Such a process could theoretically involve factors influencing the GCaMP fluorescence (see also our response to Reviewer #3) and/or factors modifying the afferents’ spiking-to-calcium coupling. We note that if a CSD-related factor could modify the calcium response independent of afferent spiking, one would expect a more consistent effect across axons, reflected as a reduced signal in a larger proportion of the afferents, which we did not observe.

      6) How do the authors interpret the influence of CSD on locomotor activity? There was a decrease in bouts but the bouts themselves showed similar patterns after CSD. Is CSD merely inhibiting the initiation of bouts? Is this consistent with what CSD is known to do to motor activity? And again related to point 1, how long after CSD were these measurements taken? Were there changes in locomotor activity during the actual CSD compared to post-CSD?

      To the best of our knowledge, there is very little data on the effect of CSD on motor activity, making it challenging to engage in further speculation regarding the mechanisms underlying the preservation of running bout patterns post-CSD. Houben et al. (2017) described a similar reduction in locomotion in mice, corresponding to decreased motor cortex (M1) activity, and preservation of intermittent locomotion bouts. In the revised Results section, we now provide information about the cessation of locomotor activity during the CSD wave and have added information regarding the measurement of locomotion following CSD.

      7) The authors mention the caveats of prior work where the skull is open and is thus depressurized. Is this not also the case here given there is a hole in the skull needed to induce CSD?

      Unlike previous electrophysiological studies, which involved several large openings (~2x2 mm), including at the site of the afferents’ receptive field, our study involved only a small burr hole located remotely (1.5 mm) from the frontal edge of our imaging window. As noted in our response to Reviewer #1, this burr hole (~0.5 mm diameter) was unlikely to produce inflammation at the imaging site or cause depressurization as it was sealed with a silicone plug throughout the experiment.

      8) The authors should check the %'s and the numbers in the pie chart for Figure 4. Line 224 says 53 is 22% but it does not look this way from the chart.

      The 22% reported is the percentage of afferents that developed sensitivity post-CSD among all the non-sensitive ones pre-CSD. The pie chart illustrates only afferents that were deemed sensitive before and/or after the CSD. We removed the % to clarify.

      9) Line 319 mentions that CSD causes "powerful calcium transients" in sensory neurons but it is not clear what is meant by powerful if there are no downstream effects of these transients being measured. The speculation is that these calcium transients could cause transmitter release, which would be an important observation in the absence of AP firing, but there are no data evaluating whether this is the case.

      We changed the term to “robust”.

      Reviewer #3 (Public Review):

      Summary:

      Blaeser et al. set out to explore the link between CSD and headache pain. How does an electrochemical wave in the brain parenchyma, which lacks nociceptors, result in pain and allodynia in the V1-3 distribution? Prior work had established that CSD increased the firing rate of trigeminal neurons, measured electrophysiologically at the level of the peripheral ganglion. Here, Blaeser et al. focus on the fine afferent processes of the trigeminal neurons, resolving Ca2+ activity of individual fibers within the meninges. To accomplish these experiments, the authors injected AAV encoding the Ca2+ sensitive fluorophore GCamp6s into the trigeminal ganglion, and 8 weeks later imaged fluorescence signals from the afferent terminals within the meninges through a closed cranial window. They captured activity patterns at rest, with locomotion, and in response to CSD. They found that mechanical forces due to meningeal deformations during locomotion (shearing, scaling, and Z-shifts) drove non-spreading Ca2+ signals throughout the imaging field, whereas CSD caused propagating Ca2+ signals in the trigeminal afferent fibers, moving at the expected speed of CSD (3.8 mm/min). Following CSD, there were variable changes in basal GCamp6s signals: these signals decreased in the majority of fibers, signals increased (after a 25 min delay) in other fibers, and signals remained unchanged in the remainder of fibers. Bouts of locomotion were less frequent following CSD, but when they did occur, they elicited more robust GCamp6s signals than pre-CSD. These findings advance the field, suggesting that headache pain following CSD can be explained on the basis of peripheral cranial nerve activity, without invoking central sensitization at the brain stem/thalamic level. This insight could open new pathways for targeting the parenchymal-meningeal interface to develop novel abortive or preventive migraine treatments.

      Strengths:

      The manuscript is well-written. The studies are broadly relevant to neuroscientists and physiologists, as well as neurologists, pain clinicians, and patients with migraine with aura and acephalgic migraine. The studies are well-conceived and appear to be technically well-executed.

      Weaknesses:

      1) Lack of anatomic confirmation that the dura were intact in these studies: it is notoriously challenging to create a cranial window in mouse skull without disrupting or even removing the dura. It was unclear which meningeal layers were captured in the imaging plane. Did the visualized trigeminal afferents terminate in the dura, subarachnoid space, or pia (as suggested by Supplemental Fig 1, capturing a pial artery in the imaging plane)? Were z-stacks obtained, to maintain the imaging plane, or to follow visualized afferents when they migrated out of the imaging plane during meningeal deformations?

      We agree that avoiding disruption of the dura is challenging. Indeed, it took many months of practice before conducting the experiments in this manuscript to master methods for a craniotomy that spared the dura.

      We addressed the issue of meningeal irritation due to cranial window surgery in our previous work (Blaeser et al., 2023). In brief, we conducted vascular imaging using the same cranial window approach and showed no leakage of macromolecules from dural or pial vessels anywhere within the imaging window at 2-6 weeks after the surgery (Figure S1D in Blaeser et al. 2022). This data suggested no ongoing meningeal inflammation below the window. The very low level of ongoing activity we observed at baseline also suggests a lack of an inflammatory response that could lead to afferent sensitization before CSD. This is now mentioned in the Discussion.

      We conducted volumetric imaging for three main reasons: 1) To capture the activity of afferents throughout the meningeal volume. In our volumetric imaging approach, including in this work, we observed afferent calcium signals throughout the meningeal thickness (see Figure 5 in Blaeser et al. 2022). However, the majority of afferents were localized to the most superficial 20 microns (Figure S1E in Blaeser et al. 2022), suggesting that we mostly recorded the activity of dural afferents; 2) to enable simultaneous quantification of three-dimensional deformation and the activity of afferents throughout the thickness of the meninges. This allowed us to determine whether changes in mechanosensitivity could involve augmented activity to intracranial mechanical forces that produced meningeal deformation along the Z-axis of the meninges (e.g., increased intracranial pressure); 3) to provide a direct means to confirm that the afferent GCaMP fluorescent changes we observed were not due to artifacts related to meningeal motion along the Z-axis. We have now added this information to the “Two-photon imaging” section of the Methods.

      2) Findings here, from mice with chronic closed cranial windows, failed to fully replicate prior findings from rats with acute open cranial windows. While the species, differing levels of inflammation and intracranial pressure in these two preparations may contribute, as the authors suggested, the modality of measuring neuronal activity could also contribute to the discrepancy. In the present study, conclusions are based entirely on fluorescence signals from GCamp6s, whereas prior rat studies relied upon multiunit recordings/local field potentials from tungsten electrodes inserted in the trigeminal ganglion.

      As a family, GCamp6 fluorophores are strongly pH dependent, with decreased signal at acidic pH values (at matched Ca2+ concentration). CSD induces an impressive acidosis transient, at least in the brain parenchyma, so one wonders whether the suppression of activity reported in the wake of CSD (Figure 2) in fact reflects decreased sensitivity of the GCamp6 reporter, rather than decreased activity in the fibers. If intracellular pH in trigeminal afferent fibers acidifies in the wake of CSD, GCamp6s fluorescence may underestimate the actual neuronal activity.

      Previous in vivo rodent studies observed a tissue acidosis transient that peaks during the DC shift corresponding to the wavefront of the spreading depolarization and lasts for ~10 min (Mutch and Hansen, 1984). Since we observed a massive increase in afferent calcium activity with a propagation pattern resembling the cortical wave, it is unlikely that the cortical acidosis during the CSD wave strongly affected the GCaMP signal in the overlying meninges. Furthermore, if cortical acidosis indiscriminately affects the GCaMP signal, one would expect a more consistent effect across axons, reflected as a reduced calcium signal in a larger proportion of the afferents, which we did not observe. Finally, the finding that, in affected afferents, decreased calcium activity lasted for > 20 min, a time point when cortical acidosis has fully recovered, points to a distinct underlying mechanism. We also note that any residual acidosis would not confound our main finding of increased calcium responses to meningeal deformation at later periods post-CSD, as acidosis should, if anything, decrease calcium-related fluorescence.

      The authors might consider injecting an AAV encoding a pHi sensor to the trigeminal ganglion, and evaluating pHi during and after CSD, to assess how much this might be an issue for the interpretation of GCamp6s signals. Alternatively, experiments assessing trigeminal fiber (or nerve/ganglion) activity by electrophysiology or some other orthologous method would strengthen the conclusions.

      Please see our comment above regarding the short duration of the pH changes post-CSD.

      N's are generally reported as # of afferents, obscuring the number of technical/biological replicates (# of imaging sessions, # of locomotion bouts, # of CSDs induced, # of animals).

      We now report the number of replicates (# of afferents, # of CSD events, and # of mice).

      The trace over the heatmap in Fig. 1F is not explained in the figure legend. Is this the speed of the running wheel? Is it the apparent propagation rate of the GCaMP6s transient through the imaging field?

      We have added to the legend of Figure 1 that the trace in panel F depicts locomotion speed.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This valuable paper examines gene expression differences between male and female individuals over the course of flower development in the dioecious angiosperm Trichosanthes pilosa. Male-biased genes evolve faster than female-biased and unbiased genes, which is frequently observed in animals, but this is the first report of such a pattern in plants. In spite of the limited sample size, the evidence is mostly solid and the methods appropriate for a non-model organism. The resources produced will be used by researchers working in the Cucurbitaceae, and the results obtained advance our understanding of the mechanisms of plant sexual reproduction and its evolutionary implications: as such they will broadly appeal to evolutionary biologists and plant biologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      The evolution of dioecy in angiosperms has significant implications for plant reproductive efficiency, adaptation, evolutionary potential, and resilience to environmental changes. Dioecy allows for the specialization and division of labor between male and female plants, where each sex can focus on specific aspects of reproduction and allocate resources accordingly. This division of labor creates an opportunity for sexual selection to act and can drive the evolution of sexual dimorphism.

      In the present study, the authors investigate sex-biased gene expression patterns in juvenile and mature dioecious flowers to gain insights into the molecular basis of sexual dimorphism. They find that a large proportion of the plant transcriptome is differentially regulated between males and females with the number of sex-biased genes in floral buds being approximately 15 times higher than in mature flowers. The functional analysis of sex-biased genes reveals that chemical defense pathways against herbivores are up-regulated in the female buds along with genes involved in the acquisition of resources such as carbon for fruit and seed production, whereas male buds are enriched in genes related to signaling, inflorescence development and senescence of male flowers. Furthermore, the authors implement sophisticated maximum likelihood methods to understand the forces driving the evolution of sex-biased genes. They highlight the influence of positive and relaxed purifying selection on the evolution of male-biased genes, which show significantly higher rates of non-synonymous to synonymous substitutions than female or unbiased genes. This is the first report (to my knowledge) highlighting the occurrence of this pattern in plants. Overall, this study provides important insights into the genetic basis of sexual dimorphism and the evolution of reproductive genes in Cucurbitaceae.

      Reviewer #2 (Public Review):

      Summary:

      This study uses transcriptome sequence from a dioecious plant to compare evolutionary rates between genes with male- and female-biased expression and distinguish between relaxed selection and positive selection as causes for more rapid evolution. These questions have been explored in animals and algae, but few studies have investigated this in dioecious angiosperms, and none have so far identified faster rates of evolution in male-biased genes (though see Hough et al. 2014 https://doi.org/10.1073/pnas.1319227111).

      Strengths:

      The methods are appropriate to the questions asked. Both the sample size and the depth of sequencing are sufficient, and the methods used to estimate evolutionary rates and the strength of selection are appropriate. The data presented are consistent with faster evolution of genes with male-biased expression, due to both positive and relaxed selection.

      This is a useful contribution to understanding the effect of sex-biased expression in genetic evolution in plants. It demonstrates the range of variation in evolutionary rates and selective mechanisms, and provides further context to connect these patterns to potential explanatory factors in plant diversity such as the age of sex chromosomes and the developmental trajectories of male and female flowers.

      Weaknesses:

      The presence of sex chromosomes is a potential confounding factor, since there are different evolutionary expectations for X-linked, Y-linked, and autosomal genes. Attempting to distinguish transcripts on the sex chromosomes from autosomal transcripts could provide additional insight into the relative contributions of positive and relaxed selection.

      Reviewer #3 (Public Review):

      The potential for sexual selection and the extent of sexual dimorphism in gene expression have been studied in great detail in animals, but hardly examined in plants so far. In this context, the study by Zhao, Zhou et al. represents a welcome addition to the literature.

      Relative to the previous studies in Angiosperms, the dataset is interesting in that it focuses on reproductive rather than somatic tissues (which makes sense to investigate sexual selection), and includes more than a single developmental stage (buds + mature flowers).

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      I have reviewed this new version and find that it now addresses some of the shortcomings of the previous manuscript. However, several important limitations still remain:

      1) The conclusion that sex-linked genes contribute relatively little to the patterns described is important and would be worth including in the manuscript briefly (not just the response letter), focusing for instance on the overall comparable proportions of sex-linked genes among male-biased (3/343=0.87%), female-biased (19/1145=1.66%) and unbiased genes (36/2378=1.51%).

      Authors’ response: Thank you for your advice. We have added these sentences to the “Discussion” section (Lines 492-499).
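
      As a quick check of the arithmetic in point 1, the proportions can be recomputed directly from the quoted counts. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, and only the counts themselves come from the review text.

```python
# Counts of sex-linked genes (numerator) out of all genes in each expression
# class (denominator), as quoted in point 1 of the review.
counts = {
    "male-biased": (3, 343),
    "female-biased": (19, 1145),
    "unbiased": (36, 2378),
}

# Convert each count pair to a percentage.
percentages = {label: 100 * k / n for label, (k, n) in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct:.2f}%")
```

      All three classes come out at roughly one to two percent or below, which is the sense in which the reviewer calls the proportions comparable.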

      2) The new sentence included in the results "we also found that most of them were members of different gene families generated by gene duplication" is too vague. The motivation of this analysis is not explained, leaving the intended message unclear.

      Authors’ response: In the previous revision, as stressed by reviewer #1 “(2) Paragraph (407-416) describes the analysis of duplicated genes under relaxed selection but there is no mention of this in the results”, we added the sentence “we also found that most of them were members of different gene families generated by gene duplication” in “Relaxed selection” paragraph of the results. Accordingly, in “Discussion” section, we discussed the associations between gene duplication and relaxed selection (Lines 461-473).

      Following your suggestion, we revised the results (Lines 304-307) to “Using the RELAX model, we detected that 18 out of 343 OGs (5.23%) showed significant evidence of relaxed selection (K = 0.0184–0.6497) (Table S9). Most of the 18 OGs are members of different gene families generated by gene duplication (Table S13)”. This makes the results more coherent with the discussion.

      3) The sentences "given that dN/dS values of sex-biased genes were higher due to codon usage bias..." are very confusing. I do not understand the argument being made here. I do not see why "lower dS rates would be expected in sex-biased genes ..."

      Authors’ response: We respectfully argue that codon usage bias is positively related to synonymous substitution rates. That is, stronger codon usage bias may be associated with higher synonymous substitution rates (Parvathy et al., 2022). Lower ENC values represent stronger codon usage bias. So, if ω (dN/dS) values of sex-biased genes are higher due to codon usage bias, we expect lower dS rates (that is, higher ENC values). Please refer to the relevant papers (e.g., Darolti et al., 2018; Catalan et al., 2018; Schrader et al., 2021, cited in the references of the paper).

      4) The manuscript now reports the proportion of unitigs annotated by similarity with a number of species. While this is an interesting observation, the reviewer was actually asking for a comparison between the number of unitigs (59,051) and the number of genes annotated in a typical Cucurbitaceae genome. This would give an indication of the level of redundancy of the de novo assembled transcriptome.

      Authors’ response: We admit that the number of transcripts in the final assembly may be overestimated. We respectfully suggest that it may be inappropriate to assess the redundancy of the de novo assembled transcriptome by comparing the transcriptome sequences with genomic sequences. A more appropriate approach is to compare transcriptome assemblies across species. For example, Hu et al., 2020 (reference cited in the paper) obtained 145,975 non-redundant unigenes from flower buds of female and male plants in Trichosanthes kirilowii. Mohanty et al. (2017) obtained 71,823 non-redundant unigenes from flower buds of female and male plants in Coccinia grandis.

      Reference:

      Mohanty JN, Nayak S, Jha S, Joshi RK. 2017. Transcriptome profiling of the floral buds and discovery of genes related to sex-differentiation in the dioecious cucurbit Coccinia grandis (L.) Voigt. Gene. 626: 395-406.

      5) From reading the text I could not understand the extent to which the permutation test actually agreed with the Wilcoxon rank sum test. The text says that the results were "almost consistent", which is too vague. This paragraph should be clarified.

      Authors’ response: We performed a permutation test for sex-biased genes in floral buds and in flowers at anthesis. However, the results of both tests (permutation test and Wilcoxon rank sum test) were significant only in floral buds. Taking your suggestions into consideration, we have revised the text to “Additionally, we found that only in floral buds, there were significant differences in ω values in the results of ‘free-ratio’ model (female-biased versus male-biased genes, P = 0.04282 and male-biased versus unbiased genes, P = 0.01114) and ‘two-ratio’ model (female-biased versus male-biased genes, P = 0.01992 and male-biased versus unbiased genes, P = 0.02127, respectively) by permutation t test, which is consistent with the results of Wilcoxon rank sum test” (Lines 273-280).
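
      For readers unfamiliar with the approach, the kind of comparison described in this response can be sketched as a two-sided permutation test on the difference in group means. This is a minimal illustration under stated assumptions, not the authors' analysis: the function name `permutation_test`, the number of permutations, and the ω values are all hypothetical, and the exact statistic and software behind the "permutation t test" are not specified in the response.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical omega (dN/dS) values, for illustration only (not study data).
male_biased = [0.42, 0.55, 0.61, 0.48, 0.70, 0.66, 0.59, 0.52]
unbiased = [0.21, 0.30, 0.25, 0.33, 0.28, 0.19, 0.35, 0.27]

p_value = permutation_test(male_biased, unbiased, n_perm=2000)
print(f"permutation p-value: {p_value:.4f}")
```

      Because the test makes no distributional assumption beyond exchangeability of labels under the null hypothesis, it serves as a complementary check on a rank-based test such as the Wilcoxon rank sum test.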

      6) The paragraph on the link between codon usage and dN/dS is very unclear and quite unnecessary. I would suggest to simply remove lines 312-323.

      Authors’ response: We respectfully argue that codon usage bias is one of the most important factors for higher rates of sequence evolution. Please refer to Darolti et al. (2018), Catalan et al. (2018) and Schrader et al. (2021) (cited in the references of the paper). We retain these lines here.

      7) The discussion contains many unnecessary repeats from the introduction and results section. I suggest shortening drastically at several places, including:

      • remove lines 367-369

      Authors’ response: Thank you for your suggestion. We revised these lines to “In this study, we compared the expression profiles of sex-biased genes between sexes and two tissue types, investigated whether sex-biased genes exhibited evidence of rapid evolutionary rates of protein sequences and identified the evolutionary forces responsible for the observed patterns in the dioecious Trichosanthes pilosa (Lines 369-373)”.

      We removed the sentence “We compared the expression profiles of sex-biased genes between sexes and two tissue types and examined the signatures of rapid sequence evolution for sex-biased genes, as well as the contributions of potential evolutionary forces. (Lines 374-376)”.

      • remove lines 395-410

      Authors’ response: Here we mainly discussed the possible associations between sex-biased genes, adaptation and sexual dimorphic traits. We retain them here for clarity.

      • remove lines 449-483, as they are almost entirely repetitions of elements already made clear in the results section.

      Authors’ response: In these paragraphs, we discussed reasons that lead to relaxed purifying selection for sex-biased genes. They are coherent with the results section. We retain them to make it clearer.

      Minor comments:

      • line 146: remove "However"

      Authors’ response: We have revised it.

      • line 187: "female flower buds tend to masculinize": the meaning is obscure

      Authors’ response: We revised the text to “Using hierarchical clustering analysis, we evaluated different levels of gene expression across sexes and tissues (Fig. 2C). Gene expression for female floral buds clustered most distantly from expression in female flowers at anthesis. However, expression in male floral buds clustered with expression in female flowers at anthesis, suggesting that male floral buds may tend toward feminization in the early stages of floral development.”

      • line 226: "we sequenced transcriptomes of T. pilosa": rather say "we used the transcriptomes described above for T. pilosa"

      Authors’ response: We have revised it.

      • line 279: the meaning of "branch-site model A and branch site model null" is still not made clear.

      Authors’ response: We have revised it.

      • line 324: change to: "we also analysed whether female-biased and unbiased genes underwent... "

      Authors’ response: We have revised it.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The apicoplast, a non-photosynthetic vestigial chloroplast, is a key metabolic organelle for the synthesis of certain lipids in apicomplexan parasites. Although it is clear metabolite exchange between the parasite cytosol and the apicoplast must occur, very few transporters associated with the apicoplast have been identified. The current study combines data from previous studies with new data from biotin proximity labeling to identify new apicoplast resident proteins including two putative monocarboxylate transporters termed MCT1 and MCT2. The authors conduct a thorough molecular phylogenetic analysis of the newly identified apicoplast proteins and they provide compelling evidence that MCT1 and MCT2 are necessary for normal growth and plaque formation in vitro along with maintenance of the apicoplast itself. They also provide indirect evidence for a possible need for these transporters in isoprenoid biosynthesis and fatty acid biosynthesis within the apicoplast. Finally, mouse infection experiments suggest that MCT1 and MCT2 are required for normal virulence, with MCT2 completely lacking at the administered dose. Overall, this study is generally of high quality, includes extensive quantitative data, and significantly advances the field by identifying several novel apicoplast proteins together with establishing a critical role for two putative transporters in the parasite. The study, however, could be further strengthened by addressing the following aspects:

      Response: We thank the reviewer very much for his/her positive evaluation of our work. To address the detailed function of the transporters, over the past three months we re-constructed plasmids (with codon-optimized DNA sequences of the genes) to express the transporters in a regular expression E. coli strain (BL21(DE3)) and in a pyruvate import knockout E. coli strain (a gift from Prof. Kirsten Jung), in order to examine their transport capability in vitro. We also constructed a new plasmid containing a new leading peptide to target the pyruvate sensor PyronicSF to the apicoplast in the parasite, in order to test pyruvate as a possible substrate. However, we did not observe expression of the transporters in the above E. coli strains, and we were unable to target the sensor to the correct location (the apicoplast) in the parasite. As a result, the functional identification of the transporters remains as presented in the current version of the manuscript. We will keep working on this aspect, attempting to dissect the exact transport function of the transporters in the future. In the current manuscript, we discuss the limitations of our study in the last part of the manuscript.

      Main comments

      1) The conclusion that conditional depletion of AMT1 and/or AMT2 affects apicoplast synthesis of IPP is only supported by indirect measurements (effects on host GFP uptake or trafficking, possibly due to effects on IPP-dependent proteins such as Rabs, and mitochondrial membrane potential, possibly due to effects on IPP-dependent ubiquinone). This conclusion would be more strongly supported by directly measuring levels of IPP. If there are technical limitations that prevent direct measurement of IPP, then the authors should note such limitations and acknowledge in the discussion that the conclusion is based on indirect evidence.

      Response: We thank the reviewer very much for the suggestions. In recent months we tried to establish IPP measurement through a commercial company, but we have not been able to make the assay work. Because our evidence is therefore indirect, we have noted this limitation in the Discussion.

      2) The conclusion that conditional depletion of AMT1 and/or AMT2 affects apicoplast synthesis of fatty acids is also poorly supported by the data. The authors do not distinguish between the lower fatty acid levels being due to reduced synthesis of fatty acids, reduced salvage of host fatty acids, or both. Indeed, the authors provide evidence that parasite endocytosis of GFP is dependent on AMT1 and AMT2. Host GFP likely enters the parasite within a membrane-bound vesicle derived from the PVM. The PVM is known to harbor host-derived lipids. Hence, it is possible that some of the decrease in fatty acid levels could be due to reduced lipid salvage from the host. Experiments should be conducted to measure the synthesis and salvage of fatty acids (e.g., by metabolic flux analysis), or the authors should acknowledge that both could be affected.

      Response: We thank the reviewer very much for the comments and suggestions. We partially agree that depletion of the transporters could affect lipids scavenged from the host cells, as endocytic vesicles are indeed derived from the parasite plasma membrane at the micropore and potentially from the host cell endomembrane system, as demonstrated for micropore endocytosis in our previous study (pmid: 36813769). Our latest study addressed this by showing that the endocytic trafficking of GFP vesicles is regulated by prenylation of proteins (e.g. Rab1B and YKT6.1), depletion of which resulted in diffusion of GFP vesicles, but not their disappearance, in the parasites (pmid: 37548452), indicating that the vesicles (containing lipids) still enter the parasites. In the current manuscript, the percentage of parasites containing GFP foci was significantly reduced in AMT1/AMT2-depleted parasites; instead, parasites with diffuse GFP appeared, at a percentage almost equal to the reduction in parasites with GFP foci. These results suggest that endocytic vesicles (e.g. GFP vesicles) were continuously generated by the micropore in parasites depleted of AMT1/AMT2, and that vesicle trafficking was regulated by proteins modified by IPP derivatives originating from the apicoplast. Based on these observations, we consider that lipids in endocytic vesicles should not contribute to the reduced levels of fatty acids and other lipids in parasites depleted of AMT1/AMT2. We have added a short discussion of the reduced fatty acids and lipids in the parasites.

      Reviewer #2 (Public Review):

      In this study Hui Dong et al. identified and characterized two transporters of the monocarboxylate family, which they called Apicomplexan monocarboxylate 1 and 2 (AMC1/2), that the authors suggest are involved in the trafficking of metabolites in the non-photosynthetic plastid (apicoplast) of Toxoplasma gondii (the parasitic agent of human toxoplasmosis) to maintain parasite survival. To do so they first identified novel apicoplast transporters by conducting proximity-dependent protein labeling (TurboID), using the sole known apicoplast transporter (TgAPT) as a bait. They chose two out of the three MFS transporters identified by their screen based on protein sequence similarity and confirmed apicoplast localisation. They generated inducible knock-down parasite strains for both AMC1 and AMC2, and confirmed that both transporters are essential for parasite intracellular survival, replication, and for the proper activity of key apicoplast pathways requiring pyruvate as a carbon source (FASII and MEP/DOXP). Then they show that deletion of each protein induces a loss of the apicoplast, more marked for AMC2, and affects its morphology both at the level of its four surrounding membranes and through accumulation of material in the apicoplast stroma. This study is very timely, as the apicoplast holds several important metabolic functions (FASII, IPP, LPA, Heme, Fe-S clusters...), which have been revealed and studied in depth, but no further respective transporters have been identified thus far. Hence, new studies that could reveal how the apicoplast can acquire and deliver all the key metabolites it deals with will have a strong impact for the parasitology community as well as for the plastid evolution communities. The current study is well initiated, with appropriate approaches to identify two new putatively important apicoplast transporters and to show how essential they are for parasite intracellular development and survival.

      However, in its current state, this is all the study provides at this point (i.e. essential apicoplast transporters whose disruption affects apicoplast integrity and, indirectly, its major functions, FASII and IPP, as disruption of any essential apicoplast protein does). The study fails to deliver a further message or function regarding AMC1 and 2, and thus to validate the study. Currently, the manuscript just describes how AMC1/2 deletion impacts parasite survival without answering the key question about them: what do they transport? The authors have yet to perform key experiments that would reveal their metabolic function. I would thus recommend that the authors work further and determine the function of AMC1 and 2.

      Response: We thank the reviewer very much for his/her positive evaluation of our work. To address the detailed function of the transporters, over the past three months we re-constructed plasmids (with codon-optimized DNA sequences of the genes) to express the transporters in a regular expression E. coli strain (BL21(DE3)) and in a pyruvate import knockout E. coli strain (a gift from Prof. Kirsten Jung), in order to examine their transport capability in vitro. We also constructed a new plasmid containing a new leading peptide to target the pyruvate sensor PyronicSF to the apicoplast in the parasite, in order to test pyruvate as a possible substrate. However, we were unable to observe expression of the transporters in the above E. coli strains, and we were unable to target the sensor to the correct location (the apicoplast) in the parasite. As a result, the functional identification of the transporters remains as presented in the current version of the manuscript. We will keep working on this aspect, attempting to dissect the exact transport function of the transporters in the near future. In the current manuscript, we discuss the limitations of our study in the last part of the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Minor comments

      Line 35: ...appears to have evolved...

      Line 67: remove first comma

      Line 105: thereafter or therefore?

      Line 130: define ACP

      Line 131: define TMD

      Response: We thank the reviewer very much for the suggestions, and we have revised these points in the current manuscript.

      Figure 1: more information on APT1 would be helpful for readers to interpret the results from turboID e.g., consider showing an illustration showing, according to Karnataki et al 2007 that APT1 likely occupies all 4 membranes of the apicoplast. Also, according to DeRocher et al 2012, APT1 N-term and C-term are both cytosolically exposed, at least in the outermost membrane. The orientation in the other membranes is not known.

      Response: We thank the reviewer very much for the suggestions. We analyzed the localization information for APT1 in T. gondii, based on the studies the reviewer mentioned (Karnataki et al., 2007; DeRocher et al., 2012). The HA tag at the C-terminus of APT1 was distributed across the four membranes of the apicoplast, indicating that the topology of APT1 at these membranes might be difficult to define. Considering this, we were hesitant to depict the topology of APT1 in a schematic diagram. Nevertheless, TurboID tagging at the C-terminus of APT1 was an excellent model for identifying potential transporters localized at the membranes of the apicoplast. We have added more information about the topology of APT1 to the manuscript, thus providing a better understanding of the proteomic results.

      Figure 2: add a space between "T." and "gondii"

      Figure 2: remove period between "Fitness" and "scores"

      Figure 2: different fonts are used within the figure. Consider using only one font such as arial. Same for Figure 4.

      Figure 2: "Fitness scores" is not bold in panel A but is bold in panel B.

      Response: We thank the reviewer very much for the suggestions. We have revised these points in the current version of the manuscript.

      Line 187: superscript -7

      Line 249: Caution should be used in interpreting two bands as being a precursor and mature product without additional experiments to establish such a relationship. Consider using the term "might" rather than "appear to". The presence of multiple bands could be due to phenomena other than proteolytic processing e.g., alternative splicing, alternative initiator codons, etc.

      Response: We thank the reviewer very much for the suggestions. We have revised the sentences in the current version of the manuscript.

      Line 291: define IPP

      Figure 3E. The data points for KD strains appear to be positioned above the zero value on the y-axis. Is this correct?

      Response: We thank the reviewer very much for the suggestions. We have rechecked the figure and replaced it with the correct one.

      Figure 3 G/H legend. Please describe what a single data point represents e.g., the average of one field of view, the average of a certain number of fields of view, or something else? Are the data combined from three experiments or from a representative experiment?

      Response: We thank the reviewer very much for the suggestions. Three independent experiments were performed, each with at least three replicates. At least 150 vacuoles were scored in each replicate, resulting in at least 9 data points in total. Each data point shows the result from one replicate.

      Line 325: define MEP and explain how it is connected to IPP

      Response: We thank the reviewer very much for the suggestions. We have provided the information in the current version of the manuscript.

      Lines 351-355: The authors refer to Figure 4D to support this statement, but presumably they mean 4E. Also, the authors use the terms C14, C16, and C18. They should more precisely use the terms myristic acid, palmitoleic acid, and trans_oleic acid if this is what they are referring to. Finally, the authors should determine if there is a statistically significant difference between levels of these fatty acids between AMT1 KD and AMT2 KD. If not, they should suggest there is an overall trend toward lower levels of these fatty acids in AMT2 KD parasites compared to AMT1 KD parasites.

      Response: We thank the reviewer very much for the suggestions. We have revised the information in the current version of the manuscript.

      Lines 363-364: The basis of this comment is unclear. Please clarify.

      Lines 369-370: the authors have not shown that the observed lower levels of fatty acids are due to synthesis, as noted above

      Response: We thank the reviewer very much for the suggestions. We have revised the information accordingly in the current version of the manuscript.

      Line 383: Should be Figure S6D

      Line 386: An entire section of the results is used to describe data that are entirely in a supplemental figure. Consider moving this data to a main figure.

      Response: We thank the reviewer very much for the suggestions. We have moved the data to a main figure in the current version of the manuscript.

      Line 391: Consider using the term virulence instead of growth, since no experiments were performed to specifically assess parasite growth in the infected mice.

      Response: We thank the reviewer very much for the suggestions. We have revised the terms in the Results section.

      Line 427: Perhaps the authors mean "...strong growth defect..." or "...strong growth impairment..."

      Line 460-461: This statement is unclear. Please explain how strong backgrounds in proteomics have made it difficult to identify apicoplast transporters. Because they are low abundance? Because they are membrane proteins?

      Response: We thank the reviewer very much for the suggestions. We have revised the corresponding sentences in the current version. The strong background in the proteomics resulted from the high activity and nonspecific labeling of the biotin ligase fused to the apicoplast proteins.

      518-521: It would be helpful for non-specialists if the authors explained how pyruvate is connected to IPP biosynthesis.

      523: delete period after "Escherichia"

      548-549: "We observed similar decreases in level of the MEP biosynthesis activity upon depletion of AMT1 and AMT2..." Reword this since no experiments were done to measure MEP biosynthesis activity.

      Response: We thank the reviewer very much for the suggestions. We have revised the relevant sentences in the manuscript accordingly.

      Reviewer #2 (Recommendations For The Authors):

      Major points:

      • The metabolomic data on fatty acid synthesis and isoprenoid levels is relevant but cannot inform about the function of the transporter, since any protein causing loss of the apicoplast would behave in such a manner, i.e. block the apicoplast pathways.

      Response: We thank the reviewer very much for the comment, with which we agree. We have therefore discussed these points in a subsection of the Discussion, pointing out some of the limitations of the study.

      • Currently, the manuscript fails to directly prove what AMC1 and AMC2 transport, potentially pyruvate, as suggested, to fuel FASII and MEP/DOXP. Further experimental approaches using exogenous complementation and/or metabolomic analyses using stable isotope labelling (for example) could shed light on the putative functions of AMC1/2.

      Response: We thank the reviewer very much for the comments. As described above, we attempted several approaches to identify the substrates that AMT1 and AMT2 transport. However, we could not express the proteins in E. coli strains, and we did not succeed in generating a T. gondii strain in which a pyruvate sensor was properly targeted to the apicoplast. At the end of the Discussion, we have a subsection that discusses the limitations of this study. We hope that our future approaches will be able to overcome these difficulties in substrate identification.

      Furthermore, the authors have not considered other pathways of interest, like heme or lysophosphatidic acid (LPA) synthesis, which are two other key pathways that may be related to AMC1/2 function. Those proposed experiments represent an important body of work, required to bring light to their metabolic functions.

      Response: We thank the reviewer very much for the comments. We considered this, but we finally decided to mainly discuss two of the pathways that the transporters might participate in, since the protein sequences of the transporters contain specific domains that are potentially associated with pyruvate.

      Further, the authors might have partially missed some referencing and data about the apicoplast in their introduction (and could potentially address other facets of the apicoplast's metabolic functions/capacities with regard to AMC1/2 function). The referencing and explanations in the introduction are not fully precise for the apicoplast and its pathways: the references on the discovery and origin of the apicoplast do not cite the original work (that should be Wilson et al. 1996, McFadden et al. 1996, Kohler et al. 1997), and the same holds for the discovery of FASII and MEP/DOXP (Waller 1998, Jomaa et al...). The introduction (and the study?) lacks information about other key functions of the apicoplast: heme synthesis and lysophosphatidic acid synthesis (using FASII products). The explanations about the roles of FASII/DOXP are partial and do not cite important references: Krishnan et al. 2020 and Amiar et al. 2020 are also key to understanding how the role of FASII is metabolically flexible depending on nutrient content. A whole part on the fact that FASII is not only dispensable but can also become essential under conditions of metabolic adaptation is missing (Botté et al. 2013, Amiar et al. 2020, Primo et al. 2021). These novel important facets of parasite biology should be mentioned as well as directly linked to the authors' topic. This is more minor but could bring new ideas to the authors.

      Response: We thank the reviewer very much for the suggestions. We have revised the relevant part of the introduction.

      We are grateful for the suggestions to improve the manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      The evolution of transporter specificity is currently unclear. Did solute carrier systems evolve independently in response to a cellular need to transport a specific metabolite in combination with a specific ion or counter metabolite, or did they evolve specificity from an ancestral protein that could transport and counter-transport most metabolites? The present study addresses this question by applying selective pressure to Saccharomyces cerevisiae and studying the mutational landscape of two well-characterised amino acid transporters. The data suggest that AA transporters likely evolved from an ancestral transporter and then specific sub-families evolved specificity depending on specific evolutionary pressure.

      Strengths:

      The work is based on sound logic and the experimental methodology is well thought through. The data appear accurate, and where ambiguity is observed (as in the case of citrulline uptake by AGP1), in vitro transport assays are carried out to verify transport function.

      Weaknesses:

      Although the data and findings are well described, the study lacked additional contextual information that would support a clear take-home message.

      We appreciate the reviewer’s positive assessment of the work, and the helpful comment to summarize the findings into a short take-home message. We chose not to discuss protein evolution theories in detail to keep the text as concise as possible. However, we do acknowledge the fact that the reader might want to see our results embedded in more context. In a revised version, we will integrate our findings more with the pertinent literature, which will show how our results align with theoretical models for protein evolution towards novel functions. We will also discuss in more detail how our laboratory results could be translated into a “natural” setting of evolution.

      Reviewer #2 (Public Review):

      Summary:

      This paper describes evolution experiments performed on yeast amino acid transporters aiming at the enlargement of the substrate range of these proteins. Yeast cells lacking 10 endogenous amino acid transporters and thus being strongly impaired to feed on amino acids were again complemented with amino acid transporters from yeast and grown on media with amino acids as the sole nitrogen source.

      In the first set of experiments, complementation was done with seven different yeast amino acid transporters, followed by measuring growth rates. Although most of them have been described before in other experimental contexts, the authors could show that many of them have a broader substrate range than initially thought.

      Moving to the evolution experiments, the authors used the OrthoRep system to perform random mutagenesis of the transporter gene while it is actively expressed in yeast. The evolution experiments were conducted such that the medium would allow for poor/slow growth of cells expressing the wt transporters, but much better/faster growth if the amino acid transporter would mutate to efficiently take up a poorly transported (as in the case of citrulline and AGP1) or non-transported (as in case of Asp/Glu and PUT4) amino acid.

      This way and using Sanger sequencing of plasmids isolated from faster-growing clones, the authors identified a number of mutations that were repeatedly present in biological replicates. When these mutations were re-introduced into the transporter using site-directed mutagenesis, faster growth on the said amino acids was confirmed. Growth phenotype data were attempted to be confirmed by uptake experiments using radioactive amino acids; however, the radioactive uptake data and growth-dependent analyses do not fully match, hinting at the existence of further parameters than only amino acid uptake alone to impact the growth rates.

      When mapped to Alphafold prediction models on the transporters, the mutations mapped to the substrate permeation site, which suggests that the changes allow for more favourable molecular interactions with the newly transported amino acids.

      Finally, the authors compared the growth rates of the evolved transporter variants with those of the wt transporter and found that some variants exhibit a somewhat diminished capacity to transport their original range of amino acids, while other variants were as fit as the wt transporter in terms of uptake of its original range of amino acids.

      Based on these findings, the authors conclude that transporters can evolve novel substrates through generalist intermediates, either by increasing a weak activity or by establishing a new one.

      Strengths:

      The study provides evidence in favour of an evolutionary model, wherein a transporter can "learn" to translocate novel substrates without "forgetting" what it used to transport before. This evolutionary concept has been proposed for enzymes before, and this study shows that it also can be applied to transporters. The concept behind the study is easy to understand, i.e. improving growth by uptake of more amino acids as nitrogen source. In addition, the study contains a large and extensive characterization of the transporter variants, including growth assays and radioactive uptake measurements.

      Weaknesses:

      The authors took a genetic gain-of-function approach based on random mutagenesis of the transporter. While this has worked out for two transporters/substrate combinations, I wonder how comprehensive and general the insights are. In such approaches, it is difficult to know which mutation space is finally covered/tested. And information that can be gained from loss-of-function analyses is missed. The entire conclusions are grounded on a handful of variants analyzed. Accordingly, the outcome is somewhat anecdotal; in some cases, the fitness of the variants was changed and in others not. Highlighting the amino acid changes in the context of the structural models is interesting, but does not fully explain why the variants exhibit changed substrate ranges. Two important technical elements have not been studied in detail by the authors, but may well play a certain role in the interpretation of the results. Firstly, the authors did not quantify the amount of transporter being present on the cell surface; altered surface expression can impact uptake rates and thus growth rates. Secondly, the authors have not assessed whether overexpressing wt versus variant transporters has an impact on the growth rate per se. Overexpressing transporters from plasmids is quite a burden for the cells and often impacts growth rates. Variants may be more or less of a burden, an effect that may (or may also not) go hand in hand with increased/decreased surface production levels.

      And finally, I was somewhat missing an evolutionary analysis of these transporters to gain insights into whether the identified substitutions also occurred during natural evolution under real-life conditions.

      First of all, we thank the reviewer for the attention to detail with which they have read the manuscript, and the very helpful comments on how to improve it. We will indeed take on some of the suggestions in a revised version of the text:

      Regarding the match of growth rate and uptake rate measurements, we plan to plot their correlation in a graph.
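      As a rough illustration of the planned comparison, the agreement between growth rates and uptake rates could be summarized with a Pearson correlation coefficient before plotting. The sketch below is only an assumption about how such a check might look: the function name `pearson_r` and all numbers are hypothetical and are not data or methods from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, made up for illustration only (not data from the study)
growth_rates = [0.05, 0.10, 0.15, 0.20]  # one value per transporter variant
uptake_rates = [1.0, 2.0, 3.0, 4.0]      # matching uptake measurements
r = pearson_r(growth_rates, uptake_rates)
```

      A coefficient near 1 would indicate that growth and uptake rank the variants similarly, whereas a low value would be consistent with the reviewer's point that parameters other than uptake also affect growth.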

      Regarding the amount of transporter on the plasma membrane, we acknowledge that the visual representation of the fluorescence micrographs already in the text might not be enough. We therefore will quantify expression levels from said micrographs and include the information in the manuscript.

      On a similar note, we had already measured the growth rates of all transporter variant cultures in the absence of selection for amino acid uptake (i.e., in medium with ammonium as the nitrogen source; Figure 4 - Supplement figure 1). We will include the measured growth rates in the text to give an indication of what the impact of transporter overexpression is on the growth rate per se.

      Regarding the proposed analysis of natural transporter sequences, we do see the possible value in such an analysis. However, it is currently out of scope for the present study. The reasons are 1) that preliminary analyses show that the sequence similarity of functionally verified/annotated transporters is too low to reliably pinpoint a phenotype to a single residue, and 2) that we do not envision that the variants that we discovered are necessarily beneficial in a natural setting, where fine-grained regulation of amino acid transport may be more important than a broad substrate range.

      Regarding the generality of the insights, we do agree with the reviewer’s comment that we “only” analyzed a relatively small number of variants. However, the target of the study was not to generate high-throughput data on a large set of variants (e.g., by NGS of the whole culture) but to provide in-depth data for characterized and verified variants in a clean genetic background (i.e., verified phenotype and fitness measurements on all native and novel substrates).

      As to the mutation space, we will include an estimate in a revised version of the text. We estimate that a majority of all possible single mutants is covered in the first and second passages of the selection experiment, which is corroborated by the fact that we repeatedly find the same mutants in biological replicates.

      Regarding the mentioned loss-of-function analyses, we are unsure about what the reviewer intends with this statement at this point. To briefly summarize, we feel that our results are a good indication that transporters can evolve new functions analogously to enzymes. We explicitly do not imply that this is the only way to evolve novelty.

      Reviewer #3 (Public Review):

      The goal of the current manuscript is to investigate how changes in transporter substrate specificity emerge through experimental evolution. The authors investigate the APC family of amino acid transporters, a large family with many related transporters that together cover the spectrum of amino acid uptake in yeast.

      The authors use a clever approach for their experimental evolutions. By deleting 10 amino acid uptake transporters in yeast, they develop a strain that relies on amino acid import by introducing APC transporters under nitrogen-limiting conditions. They can thus evolve transporters towards the transport of new substrates if no other nitrogen source is available. The main takeaway from the paper is that it is relatively easy for the spectrum of substrates in a particular transporter of this family to shift, as a number of single mutants are identified that modulate substrate specificity. In general, transporters evolved towards gain-of-function mutations (better or new activities) and also confer transport promiscuity, expanding the range of amino acids transported.

      The data in the paper support the conclusions, in general, and the outcomes (evolution towards promiscuity) agree with the literature available for soluble enzymes. However, it is also a possibility that the design of these experiments selects for promiscuity among amino acids. The selections were designed such that yeast had access to amino acids that were already transported, with a greater abundance of the amino acid that was the target of selection. Under these conditions, it seems probable that the fittest variants will provide the yeast access to all amino acid substrates in the media, and unlikely that a specificity swap would occur, limiting the yeast to only the new amino acid.

      The authors also examine the fitness costs of mutants, but only in the narrow context of growth on a single (original) amino acid under conditions of nitrogen limitation. Amino acid uptake is typically tightly controlled because some amino acids (or their carbon degradation products) are toxic in excess. This paper does not address or discuss whether there might be a fitness cost to promiscuous mutants in conditions where nitrogen is not limiting.

      We are grateful for the reviewer’s insightful comments on the paper.

      Regarding the design of our experiments, we followed the concept of directed evolution as described by pioneers of the field, in which the starting point for evolving a protein is to have a basic level of that activity. In the case of AGP1, the promiscuous activity is Cit uptake. We recognize that elimination of all the already transported amino acids from the evolution media could also yield very insightful results. However, we aimed to simulate the effect of the evolutionary pressure acting in a “natural” environment, where the uptake of the specific amino acid is not initially crucial for its survival. In the case of PUT4, the experimental design was chosen to ensure the initial survival of the culture (since neither Glu nor Asp support the growth of the strain) by providing a low level of already transported amino acids. In the revised manuscript, we will state this more clearly.

      Regarding the second point, we agree that a short discussion about the potentially detrimental effects of promiscuous transporters would be beneficial for the reader. We will touch on this aspect in the revised version of the text. Indeed, our system is intentionally simplified, as we try to take regulation of transport out of the equation (e.g., by using the constitutive ADH1 promoter as opposed to a nitrogen-regulated one). In a natural setting, microorganisms encounter fluctuations of nutrient availability, necessitating tight control of nutrient transport. This is probably a major reason why microorganisms typically encode transporters with redundant specificities (i.e., promiscuous and specific ones). Otherwise, one very broad-range nutrient transporter would suffice. In our system, we artificially select for broad-range transport, which is reflected in the observed phenotypes of the evolved transporters. We expect that in a natural setting, a broad-range transporter would be a stepping stone to evolve a narrow-range transporter with a new specificity (which is actually what we see in the double-mutant AGP1-NV, with lowered fitness in original substrates and increased fitness in Cit).

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study advances our understanding of the ways in which different types of communication signals differentially affect mouse behaviors and amygdala cholinergic/dopaminergic neuromodulation. Researchers interested in the complex interaction between prior experience, sex, behavior, hormonal status, and neuromodulation should benefit from this study. Nevertheless, the data analysis is incomplete at this stage, requiring additional analysis and description, justification, and - potentially - power to support the conclusions fully. With the analytical part strengthened, this paper will be of interest to neuroscientists and ethologists.

      GENERAL COMMENTS ON REVIEWS AND REVISIONS

      Experimental design

      Here we address questions from several reviewers regarding our periods of neuromodulator and behavioral analysis. First, we recognize that the text would benefit from an overview of the experimental structure different from the narrative we provide in the first paragraphs of the Results. We now include this near the beginning of the Materials and Methods (page 17). We further articulate that the 10-minute time periods were dictated by the sampling duration required to perform accurate neurochemical analyses (and to reserve half of the sample in the event of a catastrophic failure of batch-processing samples). Since neurochemical release may display multiple temporal components (e.g., ACh: Aitta-aho et al., 2018) during playback stimulation, and since these could differ across neurochemicals of interest, we decided to collect, analyze, and report in two stimulus periods as well as one Pre-Stim control. We now clarify this in additional text in the Materials and Methods (p. 24, lines 20-22; p. 26, lines 17-19). We decided not to include analyses of the post-stimulus period because this is subject to wider individual and neuromodulator-specific effects and because it weakens statistical power in addressing the core question: the change in neuromodulator release DURING vocal playback.

      We also sought to clarify the meaning of the periods “Stim 1” and “Stim 2”; they are two data collection periods, using the same exemplar sequences in the same order. We have added statements in the Materials and Methods (p. 18, lines 4-7; Fig. caption, p. 39, lines 11-13) to clarify these periods.

      For behavioral analyses, observation periods were much shorter than 10 mins, but the main purpose of behavioral analyses in this report is to relate to the neurochemical data. As a result, we matched the temporal features of the behavioral and neurochemical analyses (p. 22, lines 17-22). We plan a separate report, focused exclusively on a broader set of behavioral responses to playback, that may examine behaviors at a more granular level.

      Data and statistical analyses

      Reviewers 1 and 3 expressed concerns about our normalization of neurochemical data, suggesting that it diminishes statistical power or is not transparent. We note that normalization is a very common form of data transformation that does not diminish statistical power. It is particularly useful for data forms in which the absolute value of the measurement across experiments may be uninformative. Normalization is routine in microdialysis studies, because data can be affected by probe placement and factors affecting neurochemical recovery and processing. Recent examples include:

      Li, Chaoqun, Tianping Sun, Yimu Zhang, Yan Gao, Zhou Sun, Wei Li, Heping Cheng, Yu Gu, and Nashat Abumaria. "A neural circuit for regulating a behavioral switch in response to prolonged uncontrollability in mice." Neuron (2023).

      Gálvez-Márquez, Donovan K., Mildred Salgado-Ménez, Perla Moreno-Castilla, Luis Rodríguez-Durán, Martha L. Escobar, Fatuel Tecuapetla, and Federico Bermudez-Rattoni. "Spatial contextual recognition memory updating is modulated by dopamine release in the dorsal hippocampus from the locus coeruleus." Proceedings of the National Academy of Sciences 119, no. 49 (2022): e2208254119.

      Holly, Elizabeth N., Christopher O. Boyson, Sandra Montagud-Romero, Dirson J. Stein, Kyle L. Gobrogge, Joseph F. DeBold, and Klaus A. Miczek. "Episodic social stress-escalated cocaine self-administration: role of phasic and tonic corticotropin releasing factor in the anterior and posterior ventral tegmental area." Journal of Neuroscience 36, no. 14 (2016): 4093-4105.

      Bagley, Elena E., Jennifer Hacker, Vladimir I. Chefer, Christophe Mallet, Gavan P. McNally, Billy CH Chieng, Julie Perroud, Toni S. Shippenberg, and MacDonald J. Christie. "Drug-induced GABA transporter currents enhance GABA release to induce opioid withdrawal behaviors." Nature neuroscience 14, no. 12 (2011): 1548-1554.

      However, since all reviewers requested raw values of neurochemicals, we provide these in supplementary tables 1-3. The manuscript references these tables early in the Results (p. 6, lines 18-19) and in the Materials and Methods (p. 27, lines 3-4).
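      For readers unfamiliar with this kind of normalization, a minimal sketch follows. It assumes the common percent-change-from-baseline form, in which the pre-stimulus sample maps to 0; the exact formula used in the manuscript is not restated here, so the function name `percent_change_from_baseline` and the concentrations are hypothetical.

```python
def percent_change_from_baseline(baseline, samples):
    """Express each sample as a percent change relative to the baseline value.

    Assumed form: 100 * (sample - baseline) / baseline, so that the
    baseline sample itself maps to 0.
    """
    if baseline <= 0:
        raise ValueError("baseline concentration must be positive")
    return [100.0 * (s - baseline) / baseline for s in samples]


# Hypothetical concentrations for one animal (illustration only, not study data)
pre_stim = 2.0           # Pre-Stim sample
stim_1, stim_2 = 3.0, 2.5  # Stim 1 and Stim 2 samples
normalized = percent_change_from_baseline(pre_stim, [pre_stim, stim_1, stim_2])
```

      Under this assumed form, each animal serves as its own reference, which is why differences in probe placement or recovery between animals do not carry over into the normalized values.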

      All reviewers commented on correlation analyses that we presented, with different perspectives. Reviewer 2 questioned the validity of such analyses, performed across experimental groups, while Reviewer 1 pointed out that the analyses were redundant with the GLM. We agree with these criticisms, and note the challenges associated with correlations involving behaviors for which there is a “floor” in the number of observations. As a result, we have removed most correlation analyses from the manuscript. The text and figures have been modified accordingly. Due to these changes, we have to decline the requests of Reviewer 3 to include many more such analyses. While correlation analyses could still be performed between neurochemicals and behaviors for each group, the statistical power would be very low given the relatively small size of each experimental group, the large number of groups, and the even larger number of pairings between neurochemicals and behaviors. The only correlations we utilize in the manuscript concern the interpretation of our increased acetylcholine levels.

      As part of this revision, we re-ran our statistical analyses on neuromodulators because of a calculation error in 3 animals (regarding baseline values). In a few instances, a significance level changed, but none of these changed a conclusion regarding neuromodulator changes under our experimental conditions.

      Other revisions

      INTRODUCTION: We modified the Introduction to provide both a more general framework and specific gaps in our understanding relating neuromodulators with vocal communication.

      DISCUSSION: We have added material in the first two pages of the Discussion to provide more framework to our conclusions, to address the issues of the temporal aspects of neurochemical release and behavioral observations, and to identify limitations that should be addressed in future studies.

      FIGURES: All figures are now in the main part of the manuscript. We modified most figures in response to reviewer comments. We removed neuromodulator – behavior correlations from several figures. We modified all box plots to ensure that all data points are visible. The visible data points match the numbers reported in figure captions. We brought 5-HIAA data into the main figures reporting on neuromodulator results.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript addresses a fundamental question about how different types of communication signals differentially affect brain states and neurochemistry. In addition, the manuscript highlights the various processes that modulate brain responses to communication signals, including prior experience, sex, and hormonal status. Overall, the manuscript is well-written and the research is appropriately contextualized. The authors are thoughtful about their quantitative approaches and interpretations of the data.

      That being said, the authors need to work on justifying some of their analytical approaches (e.g., normalization of neurochemical data, dividing the experimental period into two periods (as opposed to just analyzing the entire experimental period as a whole)) and should provide a greater discussion of how their data also demonstrate dissociations between neurochemical release in the basolateral amygdala and behavior (e.g., neurochemical differences during both of the experimental periods but behavioral differences only during the first half of the experimental period). The normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis and could be problematic; by normalizing all data to the baseline data (p. 24), one artificially creates a baseline period with minimal variation (all are "0"; Figures 2, 3 & 5) that could inflate statistical power.

      Please see our general responses to structure of observation periods and normalization of neuromodulator data. Normalization is a common and appropriate procedure in microdialysis studies that does not alter statistical power.

      We have included a section in the Discussion concerning the temporal relationship between behavioral responses and neurochemical changes in response to vocal playback (p. 12, lines 3-17). We note where the linkage is particularly strong (e.g., ACh release and flinching). This points to a need to examine these phenomena with finer temporal resolution, but also with the recognition that the brain circuits driving a behavioral response may extend beyond the BLA.

      The Introduction could benefit from a priori predictions about the differential release of specific neuromodulators based on previous literature.

      We added some material to the Introduction to provide additional rationale for the study. However, we did not attempt to develop predictions for the range of neuromodulators that we sought to test. The literature can lead to opposite predictions for a given neuromodulator. For example, acetylcholine could be associated with both positive and negative valence. Instead, we note in the Introduction the association of both DA and ACh with vocalizations.

      The manuscript would also benefit from a description of space use and locomotion in response to different valence vocalizations.

      We have provided additional descriptions of space use and video tracking data in Materials and Methods (p. 23, lines 1-6). We now report a few correlations based on these data in the Results to demonstrate that increased ACh in Restraint males and Mating estrus females was not related to the amount of locomotion (p. 9, lines 8-14).

      Nevertheless, the current manuscript seems to provide some compelling support for how positive and negative valence vocalizations differentially affect behavior and the release of acetylcholine and dopamine in the basolateral amygdala. The research is relevant to broad fields of neuroscience and has implications for the neural circuits underlying social behavior.

      Reviewer #2 (Public Review):

      Ghasemahmad et al. report findings on the influence of salient vocalization playback, sex, and previous experience, on mice behaviors, and on cholinergic and dopaminergic neuromodulation within the basolateral amygdala (BLA). Specifically, the authors played back mice vocalizations recorded during two behaviors of opposite valence (mating and restraint) and measured the behaviors and release of acetylcholine (ACh), dopamine (DA), and serotonin in the BLA triggered in response to those sounds.

      Strength: The authors identified that mating and restraint sounds have a differential impact on cholinergic and dopaminergic release. In male mice, these two distinct vocalizations exert an opposite effect on the release of ACh and DA. Mating sounds elicited a decrease of ACh release and an increase of DA release. Conversely, restraint sounds induced an increase in ACh release and a trend toward a decrease in DA. These neurotransmission changes were different in estrus females, for whom the mating vocalization resulted in an increase of both DA and ACh release.

      Weaknesses: The behavioral analysis and results remain elusive, and although addressing interesting questions, the study contains major flaws, and the interpretations are overstating the findings.

      Although Reviewer 2 raises several valid issues that we have addressed in our response and revision, we believe that none represent “major flaws” in the study that challenge the validity of our central conclusions. In brief, we will:

      --provide enhanced description of behaviors (pp. 22-23 and Table 1)

      --clarify / modify box-plot representations of data (p. 28, lines 3-9)

      --point to our methods that describe corrections for multiple comparisons (p. 27, lines 15-16)

      --revise figures to clarify sample size (Figs. 3-6)

      Reviewer #3 (Public Review):

      Ghasemahmad et al. examined behavioral and neurochemical responses of male and female mice to vocalizations associated with mating and restraint. The authors made two significant and exciting discoveries. They revealed that the affective content of vocalizations modulated both behavioral responses and the release of acetylcholine (ACh) and dopamine (DA) but not serotonin (5-HIAA) in the basolateral amygdala (BLA) of male and female mice. Moreover, the results show sex-based differences in behavioral responses to vocalizations associated with mating. The authors conclude that behavior and neurochemical responses in male and female mice are experience-dependent and are altered by vocalizations associated with restraint and mating. The findings suggest that ACh and DA release may shape behavioral responses to context-dependent vocalizations. The study has the potential to significantly advance our understanding of how neuromodulators provide internal-state signals to the BLA while an animal listens to social vocalizations; however, multiple concerns must be addressed to substantiate their conclusions.

      Major concerns:

      1) The authors normalized all neurochemical data to the background level obtained from a single pre-stimulus sample immediately preceding playback. The percentage change from the background level was calculated based on a formula, and the underlying concentrations were not reported. The authors should report the sample and background concentrations to make the results and analyses more transparent. The authors stated that NE and 5-HT had low recovery from the mouse brain and hence could not be tracked in the experiment. The authors could be more specific here by relating the concentrations to ACh, DA, and 5-HIAA included in the analyses.

      Please see our general statement regarding normalization of neurochemical data. We have added supplemental tables that show concentrations of dopamine, acetylcholine, and 5-HIAA. We do not report serotonin or noradrenaline since these were below the detection threshold.

      2) For the EXP group, the authors stated that each animal underwent 90-min sessions on two consecutive days that provided mating and restraint experiences. Did the authors record mating or copulation during these experiments? If yes, what was the frequency of copulation? What other behaviors were recorded during these experiences? Did the experiment encompass other courtship behaviors along with mating experiences? Was the female mouse in estrus during the experience sessions?

      In the mating experience, mounting or attempted mounting was required for the animal to be included in subsequent testing. Since the session lasted 90 minutes, more general courtship behavior was likely. However, we did not record detailed behaviors or track estrous stage for the mating experience. See p. 21, lines 20-22.

      3) For the mating playback, the authors stated that the mating stimulus blocks contained five exemplars of vocal sequences emitted during mating interactions. The authors should clarify whether the vocal sequences were emitted while animals were mating/copulating or when the male and female mice were inside the test box. If the latter was the case, it might be better to call the playback "courtship playback" instead of "mating playback".

      We have modified the Results (p. 5, lines 18-20) and Materials and Methods (p. 21, lines 8-15) to clarify our meaning. We continue to use the term “mating” because this refers to a specific set of behaviors associated with mounting and copulation, rather than the more general term “courtship”. We also indicate that we based these behaviors on previous work (e.g., Gaub et al., 2016).

      4) Since most differences that the authors reported in Figure 3 were observed in Stim 1 and not in Stim 2, it might be better to perform a temporal analysis - looking at behaviors and neurochemicals over time instead of dividing them into two 10-minute bins. The temporal analysis will provide a more accurate representation of changes in behavior and neurochemicals over time.

      Please see our general response to the structuring of experimental periods. The 10-min periods are the minimum for the neurochemical analyses, and we adopted the same periods for behavioral analyses to match the two types of observations. Our repeated measures analysis is a form of temporal analysis, since it compares values in three observation periods.

      5) In Figures 2 and 3, the authors show the correlation between Flinching behavior and ACh concentration. The authors should report correlations between concentrations of all neurochemicals (not just ACh) and all behaviors recorded (not just Flinching), even if they are insignificant. The analyses performed for the stim 1 data should also be performed on the stim 2 data. Reporting these findings would benefit the field.

      Please see general comments regarding correlation analyses. We removed almost all such analyses and references to them from the manuscript based on concerns of the other reviewers.

      6) The mice used in the study were between p90 - p180. The mice were old, and the range of ages was considerable. Are the findings correlated with age? The authors should also discuss how age might affect the experiment's results.

      Our p90-p180 mice are not “old”. CBA/CaJ mice display normal hearing for at least 1 year (Ohlemiller, Dahl, and Gagnon, JARO 11: 605-623, 2010) and adult sexual and social behavior throughout our observation period. They are sexually mature adults, appropriate for this study. We decline to perform correlation analyses with age, both because this was not a question for this study and because the very large number of correlations, for each experimental group (as requested by reviewer #2), render this approach statistically problematic.

      7) The authors reported neurochemical levels estimated as the animals listened to the sounds played back. What about the sustained effects of changes in neurochemicals? Are there any potential long-term effects of social vocalizations on behavior and neurochemical levels? The authors might consider discussing long-term effects.

      We have not included discussion of long term effects of neuromodulatory release, both because our data analysis doesn’t address it (see response to Comment #10) and because we desired to keep the Discussion focused on topics more closely related to the results.

      8) Histology from a single recording was shown in supplementary figure 1. It would benefit the readers if additional histology was shown for all the animals, not just the colored schematics summarizing the recording probe locations. Further explanation of the track location is also needed to help the readers. Make it clear for the readers which dextran-fluorescein labeling image is associated with which track in the schematic.

      Based on the recent publications cited in our overall response to reviewer comments about statistical methods, our reporting of histological location of microdialysis exceeds the standard. We believe that the inclusion of all histology is unnecessary and not particularly helpful. Raw photomicrographs do not always illustrate boundaries, so interpretation is required. However, we added a second photomicrograph example and we identified which tracks correspond to these photomicrographs (see Figure 2; now in main body of manuscript).

      9) The authors did not control for the sounds being played back with a speaker. This control may be necessary since the effects are more pronounced in Stim 1 than in Stim 2. Playing white noise rather than restraint or courtship vocalizations would be an excellent control. However, the authors could perform a permutation analysis and computationally break the relationship between what sound is playing and the neurochemical data. This control would allow the authors to show that the actual neurochemical levels are above or below chance.

      We considered a potential “control” stimulus in our experimental design. We concluded, based on our previous work (e.g., Grimsley et al., 2013; Gadziola et al., 2016), that white noise is not necessarily a neutral stimulus and therefore the results would not clarify the responses to the two vocal stimuli. Instead, we opted to use experience as a type of control. This control shows very clearly that temporal patterns and across-group differences in neurochemical response to playback disappear in the absence of experience with the associated behavior.
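
      The permutation analysis suggested by the reviewer was not performed in this study. Purely to illustrate the idea of computationally breaking the link between stimulus type and neurochemical response, a minimal label-shuffling sketch is given below. The function name, the test statistic (absolute difference in group means), and all values are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values reassigns stimulus labels at random.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical percentage changes for two playback groups (7 animals each).
mating = [10, 11, 12, 13, 14, 15, 16]
restraint = [-5, -4, -3, -2, -1, 0, 1]
p_separated = permutation_p_value(mating, restraint)
p_identical = permutation_p_value([1, 1, 1], [1, 1, 1], n_permutations=200)
```

      A small p value would indicate that the observed group difference is rarely matched when stimulus labels are assigned at random; with small groups, the number of distinct relabelings limits how small the p value can be.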

      10) The authors indicated that each animal's post-vocalization session was also recorded. No data in the manuscript related to the post-vocalization playback period was included. This omission was a missed opportunity to show that the neurochemical levels returned to baseline, and the results were not dependent on the normalization process described in major concern #1. The data should be included in the manuscript and analyzed. It would add further support for the model described in Figure 6.

      We decided not to include analyses of the post-stimulus period because this period is subject to wider individual and neuromodulator-specific effects and because it weakens statistical power in addressing the core question—the change in neuromodulator release DURING vocal playback. We agree that the general question is of interest to the field, but we don’t think our study is best designed to answer that question.

      11) The authors could use a predictive model, such as a binary classifier trained on the CSF sampling data, to predict the type of vocalizations played back. The predictive model could support the conclusions and provide additional support for the model in Figure 6.

      We recognize that a binary classifier could provide an interesting approach to support conclusions. However, we do not believe that the sample size per group is sufficient to both create and test the classifier.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      • Introduction: It would be useful to set up an experimental framework before delving into the results. What are the predictions about specific neuromodulators based on previous literature?

      Because this narrative is laid out in the first two paragraphs of the Results, which immediately follow the Introduction, we believe that additional text in the Introduction on the experimental framework is redundant. As stated above, detailing predictions for a range of neuromodulators would make for a long and not particularly illuminating Introduction. We instead have related our findings to more general understanding of DA and ACh in the Discussion.

      • There really isn't a major difference in stimuli during the "Stim 1" and "Stim 2" phases, and it's not clear why the authors divided the experimental period into two phases. Therefore, the authors need to justify their experimental approach. For example, the authors could first anecdotally mention that behavioral responses to playbacks seem to be larger in the first half of the playbacks than during the second half, therefore they individually analyzed each half of the experimental period. Or adopt a different approach to justify their design. Overall, the analytical approach is reasonable but it is currently not justified.

      See general comment for analysis periods. As noted, we clarified these issues in several locations within Materials and Methods (p. 24, lines 20-22; p. 26, lines 17-19). We also sought to clarify the meaning of the periods “Stim 1” and “Stim 2”; they are two data collection periods, using the same exemplar sequences in the same order. We have added statements in the Materials and Methods (p. 18, lines 4-7; Fig. caption, p. 39, lines 11-13).

      • The normalization of neurochemical data seems problematic and unnecessary. By normalizing all data to the baseline data (p. 24), one artificially creates a baseline period with minimal variation (all are "0"; Figures 2, 3 & 5) and this has implications for statistical power. Because the analysis is a within-subjects analysis, this normalization is not necessary for the analysis itself. It can be useful to normalize data for visualization purposes, but raw data should be analyzed. Indeed, behavioral data are qualitatively similar to the neurochemical data, and those data are not normalized to baseline values.

      Please see our general comment on this issue. We believe normalization does not affect statistical power and is both the standard way and an appropriate way to analyze microdialysis results. We include concentrations of ACh, DA, and 5-HIAA in supplementary tables.

      • The authors should include a discussion (in the Discussion section) of how behavior and neurochemical release are associated during the first half of the experimental session but not in the second half (e.g., differences in ACh and DA release between mating and restraint groups during stim 1 and 2, but behavioral differences only during stim 1).

      We have included a section in the Discussion concerning the temporal relationship between behavioral responses and neurochemical changes in response to vocal playback. We note that the linkage is particularly strong in some cases (e.g., ACh release and flinching). This points to a need to examine these phenomena with finer temporal resolution, but also with the recognition that the brain circuits driving a behavioral response may extend beyond the BLA.

      Minor comments:

      • Keywords: add "serotonin" (even though there are no significant differences on 5-HIAA, people interested in serotonin would find this interesting).

      Added to keywords list.

      • Do the authors collect data on the vocalizations of mice in response to these playbacks?

      We monitored vocalizations during playback, noting that vocalizations, especially “Noisy” vocalizations, were common. However, we did not record vocalizations and are therefore unable to quantify our observations.

      • First line of page 7: readers do not know about "stim 1" and "stim 2". Therefore, the authors need to describe their approach to analyzing behavior and neurochemical release.

      We first introduce these terms earlier, citing Figure 1D,E. We have added some additional wording for further clarification (p. 7, lines 4-5).

      • Make sure citations are uniformly formatted (e.g., Inconsistencies in: "As male and female mice emit different vocalizations during mating (Finton et al., 2017; J. M. S. Grimsley et al., 2013; Neunuebel et al., 2015; Sales (née Sewell), 1972)").

      We have reviewed and corrected citations throughout the manuscript.

      • Last paragraph of page 7: "attending behavior" has not been defined yet.

      Table 1 contains our description of the behaviors analyzed in this study. We have now inserted a reference to Table 1 earlier in the Results (p. 6, line 12).

      • Figure 2E and 3G: I find these correlations to be redundant with the GLMs. This is because the significant relationship is likely to be driven by group differences in behavior and in neurochemical release.

      Please see general comments regarding correlation analyses. We removed such analyses and references to them from the manuscript.

      • Page 2, 2nd paragraph, 2nd sentence: this paragraph seems to be rooted in comparing and contrasting experienced and inexperienced mice, so there should be explicit comparisons in each sentence. For example, the 2nd sentence should read: "Whereas EXP estrus females demonstrated increased flinching behaviors in response to mating vocalizations, INEXP ....". This paragraph overall could use some refining.

      We believe this refers to page 9. We have revised the paragraph to clarify our findings (Beginning p. 9, line 23).

      • Page 9: "Further, there were no significant differences across groups during Stim 1 or Stim 2 periods. These results contrast sharply with those from all EXP groups, in which both ACh and DA release changed significantly during playback (Figs. 2C, 2D, 3E, 3F)." While I understand their perspective, this is misleading because changes were only observed during the Stim 1 period.

      We have slightly revised the wording in this paragraph, because the restraint males did not show significant ACh decreases. However, we do not believe our statements mislead readers just because some changes are observed in only one of the stimulation periods (p 10, lines 13-16).

      • Last paragraph of page 14: it would be useful to mention the increase in flinching in experienced females in response to mating vocalizations.

      We have added a sentence in this paragraph relating flinching in estrus females to increased ACh (p. 15, lines 18-20).

      • Was there a full analysis of locomotion in response to playbacks? I see that locomotion was correlated with neurochemical release but was it different in response to different stimuli? Were there changes to the part of the arena that mice occupied in response to restraint vs. mating vocalizations? Given their methods section, it would be useful for the authors to mention the results of the analyses of these aspects of movement.

      We have provided additional descriptions of space use and video tracking data in Materials and Methods (p. 23, lines 1-6). We now report additional results associated with these analyses (p. 8, lines 13-15; p. 9, lines 8-14).

      • I believe that each experimental mouse only heard one of the stimuli (given the analytical approach). Because it is plausible to measure neurochemical release in response to both types of stimuli, I encourage the authors to be more explicit about this aspect of the experimental design (e.g., mention in Results section).

      Sentence modified to read: “Each mouse received playback of either the mating or restraint stimuli, but not both: same-day presentation of both stimuli would require excessively long playback sessions, the condition of the same probe would likely change on subsequent days, and quality of a second implanted probe on a subsequent day was uncertain.” (p. 7, lines 5-9).

      • Figure 1A and 1B: add labels to the panels so readers don't have to read the legend to know what spectrogram is associated with what context.

      We added these labels to Figure 1.

      • Table 1: in the definition of "still and alert", should this mention "abrupt attending" instead of "abrupt freezing"? The latter isn't described.

      Yes, we intended “abrupt attending”, and have now indicated that in Table 1.

      Reviewer #2 (Recommendations For The Authors):

      Major comments:

      • The authors report they performed manual behavioral analysis, and provide a table defining the different behaviors. However, it remains unclear how some of these behaviors were detected (such as still-and-alert events). A thorough description of the criteria used to define these events needs to be provided.

      We have modified some descriptions of manually analyzed behaviors in Table 1, and have added additional description of how we developed this set of behaviors for analysis in the study (pp. 22-23).

      • The box plots do not appear to represent the "minimum, first quartile, median, third quartile, and maximum values." as specified on page 24 (Methods). Indeed, the individual data points sometimes do not reach the max or min of the bar plot, and sometimes are way beyond them.

      We used the “inclusive median” function in Excel to generate final boxplots. These boxplots will sometimes result in a data point being placed outside of the whiskers. SPSS considers these to be “outliers”, but our GLM analysis includes these values. We describe this in the Data Analysis section of Materials and Methods (p. 28, lines 3-9).
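
      To illustrate how a data point can fall outside the whiskers of a boxplot while still being retained in the analysis, the sketch below computes quartiles with Python's "inclusive" quantile method and flags points beyond 1.5 times the interquartile range. Both choices are assumptions made for illustration: Python's method stands in for the Excel setting and may not reproduce Excel's output exactly, and the 1.5 x IQR whisker rule is a common convention rather than a detail stated in the manuscript. The data values are hypothetical.

```python
import statistics


def box_summary(values):
    """Return inclusive-method quartiles and the points beyond 1.5 x IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points beyond the fences are only flagged here, not removed.
    beyond = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "beyond_whiskers": beyond}


# Hypothetical percentage-change values; 150 lies far from the other points.
summary = box_summary([-10, -5, 0, 5, 10, 15, 150])
print(summary)
```

      The flagged value is drawn outside the whiskers but remains part of the data set, which mirrors the distinction drawn above between plotting conventions and the values entered into the GLM.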

      • Some of the data are replicated in different Figures: Figure 2A and Figure 3C. While this is acceptable, the authors did not correct for multiple comparisons (dividing the p value by the number of comparisons).

      Our analysis included corrections for multiple comparisons, as we have indicated on p. 27, lines 15-16.
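
      The specific correction method is described in the manuscript rather than in this response. As a generic illustration of the kind of adjustment the reviewer describes, the sketch below applies a Bonferroni adjustment, which is equivalent to dividing the significance threshold by the number of comparisons. The choice of Bonferroni, the function name, and the p values are assumptions for illustration.

```python
def bonferroni_adjust(p_values):
    """Multiply each p value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p values from three pairwise comparisons.
raw_p = [0.01, 0.04, 0.5]
adjusted = bonferroni_adjust(raw_p)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

      In this example only the first comparison remains below 0.05 after adjustment, even though two raw p values were below that threshold.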

      • Overall, the sample sizes are too small (for example in Figure 3, non-estrus females are at n=3), and are different in experiments where they should be equal (Figure 2B: mating stim 1 is at n=5 and mating stim 2 is at n=3).

      We apologize that sample sizes were not properly displayed in figures. Please note that sample sizes are identified in the figure captions. For neuromodulator data, all sample sizes are at least 7. For behavioral data, the minimum sample size is 5. We have revised Figures 3-6 to ensure that all data points are visible.

      • It remains unclear why the impact of mating vocalizations has been tested only in males.

      We assume the reviewer meant that only males were tested in restraint. We now indicate that our preliminary evidence indicated no difference in behavioral responses to restraint vocalization between males and females, so we opted to perform the neurochemical analysis for restraint only in males (p. 22, lines 4-5). If there were no limitations to time and cost, we would have preferred to test responses to restraint in females as well. We note that such inclusion would have added up to 4 experimental groups (estrus and non-estrus groups in both EXP and INEXP groups).

      • The correlation between the number of flinching and ACh release changes (Figure 2E) visually appears to be opposite between mating and restraint playbacks. The authors should perform independent correlations for these 2 playbacks.

      Please see general comments regarding correlation analyses. We removed such analyses and references to them from the manuscript.

      • The authors state that their findings "indicate that behavioral responses to salient vocalizations result from interactions between sex of the listener or context of vocal stimuli with the previous behavioral experience associated with these vocalizations.". However, in male mice, they do not report any difference in previous experience on flinching for both restraint and mating sounds, as well as no difference in rearing for the restraint sounds (Figure 4A-B). Thus, the discussion of these results should be completely revisited.

      We revised the paragraph in question (p. 9, line 22 through p. 10, line 9). For instance, we note that significant differences between EXP male-mating and male-restraint flinching do not exist between the INEXP groups. We believe that the last sentence correctly summarizes findings described in this paragraph.

      • For serotonin experiments in Figure S2 there are strong outliers (150% increase in 5HIAA release). Did the authors correlate these levels with the behavior of the animals?

      Outliers are identified by the Excel function that generated the boxplots, but we have no reason to consider these as outliers and exclude them. As noted above, we have clarified that these “outliers” are the result of the Excel function in the Materials and Methods (p. 28, lines 3-9) and we have revised the plotting of data points.

      Minor comments:

      • Mating vocalization playback is mainly emitted by males, thus, instead of a positive valence signal, this could also be interpreted as a competitive signal to other males.

      There is support in the literature for viewing our mating stimulus as having positive valence. Gaub et al., 2016 describe the emission of stepped calls, lower frequency harmonics, and increased sound level as indicators of “positive emotion”. We have shown (Grimsley et al., 2013) that the female LFH vocalization can be highly attractive to male mice, under the right conditions, indicating something like “sex is happening”. The inclusion of both the male and female vocalizations in our stimuli was a key piece of our experimental design, based on our understanding of the contributions of both vocalizations to the meaning of the overall acoustic experience.

      • Figure 1 should include panel titles.

      No change. This information is available in the Figure caption.

      • n=31 should be indicated in the EXP group.

      We are not sure which part of the manuscript the reviewer is referring to for this value.

      • The color legend of Figure 1E is absent, making the Figure not understandable.

      We added text in the Figure 1 caption to indicate that each color represents a different exemplar. We don’t think a legend provides additional useful information.

      • The point of making two blocks (stim 1 and stim2) should be stated more clearly.

      Please see general statement regarding experimental blocks. We have modified our description of these in an Experimental overview section in the Materials and Methods.

      • Including raw data of micro-dialysis in the supplementary figures would allow assessment of the variability and quality of the measurements.

      We have added concentrations of neurochemicals in supplemental tables 1-3.

      • Baseline (prestimulus) number of flinch and rearing should systematically be indicated (missing in Figure 4).

      The focus in this figure is on the differences that occur in Stim 1 values. There are no differences between EXP and INEXP animals of any group during the Pre-Stim period. We now state that in the Figure 4 caption.

      • Discussion: "increase in AMPA/NMDA currents". We believe the authors are referring to the ratio of AMPA to NMDA currents. This sentence should be reformulated.

      These are modified to refer to “… the AMPA/NMDA current ratio…” in two locations in the Discussion (p. 14, lines 8-9; p. 15, line 4).

      • Overall the discussion is very speculative and should rely more on the data.

      We believe that the Discussion provides appropriate speculation that is based on our experimental data and previous literature. We have added a paragraph to identify limitations of our findings and recommendations of future experiments to resolve some issues (p. 12, lines 3-17).

      Reviewer #3 (Recommendations For The Authors):

      Minor concerns:

      1) The authors stated that USVs are most likely to be emitted by males, and LFH are likely to be emitted by females. However, Oliveira-Stahl et al. 2023, Matsumoto et al. 2022, Warren et al. 2018, Heckman et al. 2017, Neunuebel et al., 2015 showed that females also emit USVs. The authors should mention that USVs are emitted by both males and females and discuss how the sex of the vocalizing animal (both males and females) can influence neuromodulator release.

      The reviewer slightly misstated the wording of our text, changing the meaning significantly. Our wording is “These sequences included ultrasonic vocalizations (USVs) with harmonics, steps, and complex structure, mostly emitted by males, and low frequency harmonic calls (LFHs) emitted by females (Fig. 1A,C)…” This phrasing is correct and carefully chosen. The Discussion in Oliveira-Stahl et al., 2023 (p. 10-11) supports our statement: “The exact fraction of USVs emitted by females as concluded in all previous studies on dyadic courtship has varied, ranging from 18%, 17.5%, and 16% to 10.5% in the present study…”.

      2) The authors should explain why ECF from BLA was collected unilaterally from the left hemisphere.

      p. 23, lines 9-11: We inserted a sentence to explain why we targeted the BLA unilaterally. “Since both left and right amygdala are responsive to vocal stimuli in human and experimental animal studies (Wenstrup et al., 2020), we implanted microdialysis probes into the left amygdala to maintain consistency with other studies in our laboratory.” Beyond that, the choice was arbitrary.

      3) The authors said each animal recovered in its home cage for four days before the playback experiment. A 4-day period may not be sufficient for every animal to recover from surgery, so the authors should describe how a mouse's recovery was assessed.

      p. 23, lines 20-23: We provide more description about the recovery and how it was assessed. Except for a few animals that were not included in the experiments, all animals recovered within 4 days.

      4) The authors stated that each animal was exposed to 90-min sessions with mating and restraint behaviors in a counterbalanced design. This description for Figure 1D should also include the duration of the mating and restraint experience.

      The Results that immediately precede citation to this figure include this information.

      5) The authors stated, "Data are reported only from mice with more than 75% of the microdialysis probe implanted within the BLA". What are the implications of having 25% of the probe outside the BLA? The authors should shed more light on this by discussing this issue as it relates to the findings and commenting on where the other 25% of the probe was located.

      We inserted a sentence to explain the rationale for this inclusion criterion. “We verified placement of microdialysis probes to minimize variability that could arise because regions surrounding BLA receive neurochemical inputs from different sources (e.g., cholinergic inputs to putamen and central amygdala).” (p. 25, lines 21-23).

      All brain regions that surround BLA, dorsal, medial, ventral, or lateral, could have been sampled by the “other” 25%. Some of these, e.g., the central amygdala or caudate-putamen, have different sources of cholinergic input that may not have the same release pattern. We do not think it is worthy of further speculation in the Discussion. Due to the high cost of the neurochemical analysis, we often did not process the neurochemistry data if histology indicated that a probe missed the BLA target.

      6) The authors confirmed that the estrus stage did not change during the experiment day by evaluating and comparing estrus prior to and after data collection. This strategy was a fantastic experimental approach, but the authors should have discussed the results. How did the results the authors included change when the females were in estrus before but not after data collection? What percentage of females started in estrus but ended in metestrus? Assuming that some females changed estrus state, were these animals excluded from the analyses?

      All animals were in the same estrus state at the beginning and end of the playback session.

      7) The authors cite Neunuebel et al., 2015 for the sentence "As male and female mice emit different vocalizations during mating". However, Neunuebel et al., 2015 showed vocalizations emitted during chasing--not mating. If mating is a general term for courtship, then this reference is appropriate, but see major concern #3.

      In the Results (p. 8, line 5), we changed the phrasing to “courtship and mating” to include the Neunuebel et al. study.

      As we indicate in our response to Public Comment #3, we have modified the Results (p. 5, lines 18-20) and Materials and Methods (p. 21, lines 8-15) to clarify our meaning. We continue to use the term “mating” because this refers to a specific set of behaviors associated with mounting and copulation, rather than the more general term “courtship”. We also indicate that we based these behaviors on previous work (e.g., Gaub et al., 2016).

      8) Authors interpret Figure 3F as DA release showed a "consistent" increase during mating playback across all three experimental groups. However, the increase in the estrus female group is inconsistent, as seen in the graph. This verbiage should be reworded to describe the data more accurately.

      p. 8, line 23 “consistent” was deleted.

      9) In all the box plots, multiple data points overlay each other. A more transparent way of showing the data would be adding some jitter to the x value to make each data point visible. The mean (X's) in Figure 3D (pre-stim mating and mating estrus) are difficult to see, as are all the data points in mating non-estrus. Adding all the symbols to the figure legend or a key in the figure instead of the method section would aid the reader and make the plots easier to interpret

      We have revised the boxplots to ensure that all data points are visible.

      10) Some verbiage used in the discussion should be toned down. For example, "intense" experiences and "emotionally charged" vocalizations should be removed.

      We have not changed these terms, which we believe are appropriate to describe these experiences and vocalizations.

      11) The authors include "Emotional Vocalizations" in the title. It would be beneficial if the authors included more detail and references in the introduction to help set up the emotional content of vocalizations. It may benefit a broader readership as typically targeted by eLife.

      We now cite Darwin and some more recent publications that articulate the general understanding that social vocalizations carry emotional content.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #2 (Recommendations For The Authors):

      While the details are mostly well-explained, I think that the authors could better bring forth the goals and potential usages of hippocampome.org overall.

      I think that this is a great and helpful tool that can leverage various and detailed cellular experimental studies that are out there in the literature to garner potential insights, direct future experimental studies, observe/classify experimental 'differences' (e.g., the deep and superficial pyramidal studies they mention) and so on. Say that one gets some mechanistic insight from more abstract theoretical models, hippocampome can be used to determine whether the experimental data where available is supportive of the theory. They also describe CA3 model and grid cells. While I am not suggesting that the authors completely re-organize the manuscript, I did feel that the last section 'potential applications...' could have perhaps been brought forth earlier (in a summarized form) for the reader/user to better appreciate hippocampome - indeed it is line 288 that should be near the beginning of the paper I thought.

      We thank the Reviewer for the suggestion. We have now included a summary of the simulation readiness of Hippocampome.org in the Introduction.

      I thought the 'application' paragraph (starting line 288) needed expansion to appreciate - I did not have a chance to look at the cited papers in that section - but maybe 2 paragraphs, one on CA3 and the other on grid cells, with a few more sentences of goal/context and tool usage details could be provided?

      We thank the Reviewer for the suggestion. We have added expanded paragraphs describing the simulation work on CA3 and grid cells.

      The authors start their Discussion by mentioning other resources (e.g. blue brain) in comparison. I thought that this was not too helpful without a bit more expansion about these other resources and what in particular is comparable. For example, the blue brain project is different in that it does not mine the literature per se (I think)? But then I am not sure of the extent of the comparison that the authors intend with blue brain and the other mentioned resources.

      Thank you for the helpful suggestion. We have now expanded upon the paragraph to draw more explicit parallels and contrasts among the various projects, in particular between the Blue Brain Project and Hippocampome.org.

      Minor comments

      • Fig 3D caption missing

      Thank you for pointing this out. We have now amended the figure caption.

      • Fig 5A line 211-12 refers to v2.0 but Fig 5 caption says v1.0?

      We apologize for the confusion. We have now added text clarifying the V1.X relevant descriptions around Figure 5.

      • Fig 6A confusing with thin and thick arrows and direction?

      We apologize for the confusion. We have re-colored the thick arrows orange to emphasize the fact that they are feeding directly into the spiking neural simulations.

      • Line 260 - not sure what this means - how is importance defined?

      We apologize for the confusion. We have now added text clarifying that “importance” refers to the role the neuron type plays in the functioning circuitry of the hippocampal formation.

      • CARLsim vs Brian/NEST in choosing - maybe a sentence or two for rationale

      Thank you for the suggestion. We have now added a sentence explaining the selection of CARLsim. CARLsim was selected due to its ability to run on collections of GPUs. CARLsim was the only simulator with this capability at the time the simulation work was being planned, and the power of a GPU supercomputer was needed to simulate the millions of neurons that comprise a full simulation of the complete hippocampal formation.

      • Fig 9 mv should be mV, and the voltage values specified there refer to which dash?

      Thank you for pointing these situations out. We have amended the millivolts label and have made changes to the figure to help clarify which specific tick marks are being labeled.

      Reviewer #3 (Recommendations For The Authors):

      Compliments to the authors on this nicely organized and structured presentation of V 2.0 of hippocampome.org. The paper is well prepared giving a useful short summary of the history of hippocampome for the newcomers and refreshing the memory of users, switching to highlighting the new data additions, why these are relevant and how these complement the existing database, and opening up to new applications. The added potential is well illustrated and in addition, the authors provide numerical information on the usage of this amazing resource. I enjoyed roaming around in the new version, which was made available for reviewers, and although it has been a while since I worked with the system, the new version is easy to work with. I have not had the time to use it extensively so cannot comment in detail but based on the long experience of the authors and their support team, I trust that version 2 will be almost, if not completely, flawless; however, that will for sure become clear when it is released.

      One could always wish for more, disagree, or even criticize choices made to cluster neurons, divide areas, and so forth, though in my view that does not contribute to what the resource has to offer. Having said this, the authors might consider addressing briefly issues about differences in the nomenclature used in original descriptions and how they handled the translation into their nomenclature. To mention one that is constantly being debated: how does one define the border between SMo and SMi.

      Thank you for the suggestion. We have added text to the Introduction that addresses the nomenclature issue, as presented in Hamilton et al. (2017), and provide a definition for SMo and SMi.

      Another confusing issue is presented by layers in the entorhinal cortex or its subdivisions (how many and how are these defined). So, some remarks for newcomers in the field who might use the database without spending too much energy to read the original data, might be useful.

      Thank you for the suggestion to clarify this situation pertaining to the entorhinal cortex. Often, we have assumed the authors’ own definitions of the layers and subdivisions (medial and lateral), when naming neuron types. When our name is a hybrid of two published names that include both medial and lateral neurons, our name is prefixed by a simple EC, rather than by MEC or LEC.

      As noted, the authors present version 2 nicely and comprehensibly and I have only a few additional comments, meant to further improve the already high quality of the paper.

      1) The figures, nice as they are, are incredibly information-dense, so they require serious study to get the details; the legends do help, but the many abbreviations coming from totally different fields make it challenging to keep track of them while reading. This is a pity since there is a lot of new information in this version of the dataset, compared to previous versions and the authors overall succeed in emphasizing what is new and why this might be of use/importance.

      So a few suggestions: i) add relevant/most important abbreviations to the legends of the individual figures; ii) introduce all abbreviations upon first use and do not simply refer to the table in the methods. Interestingly, even the authors lose track in the introduction where they use BICCN in line 43 and refer to the abbreviation list, though the full name is given two lines below.

      We apologize for the confusion. We have amended the main text to clarify abbreviations. We have added the abbreviation definitions to the captions of the figures, and in some instances, removed the abbreviations from the figures altogether where space allowed.

      2) Figure 3 and even more so figure 5 depend strongly on the color differences red/green; please change since generally red/green is no longer used for obvious reasons.

      Thank you for pointing this out. We have switched the fonts in Figure 3 to black (excitatory) and gray (inhibitory) to match our previous publication. We have also changed the color schemes in Figure 5 to avoid red and green.

      Reviewer #3 commented on the complexity of our figures and how the figures are information dense. To partially address this, we have decided to remove panel A2 of Figure 3. It was originally meant to emphasize where the information came from to add new axonal projections to two v1.0 neuron types; however, it is not necessary to make the point in the illustration. Thus, we have removed the panel and amended the caption for Figure 3A to include the cited reference.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations for The Authors):

      1) While the specificity of the observed muscle phenotypes seems clear, the subsequent molecular analysis of Numb protein interactors does not seem to consider the potential involvement of Numb-like. The authors should demonstrate the relative expression levels of Numb and Numb-like in the models used, and establish the specificity of the antibodies used in IP, western and staining experiments.

      Response: Perhaps the most convincing evidence that the anti-Numb antibody did not pull down Numb-like is that this protein was not detected among immunoprecipitated protein complexes pulled down by the anti-Numb antibody used. The antibody used in the immunoprecipitation was validated by the supplier and was previously reported to immunoprecipitate Numb [1, 2]. We previously demonstrated that a morpholino against Numb mRNA almost completely eliminated the band detected by this antibody and that this band was at the expected molecular weight [ref]. In our hands, mRNA levels for Numb-like in skeletal muscle are 5-10-fold lower than those for Numb [3]. We have been unable to detect Numb-like protein in healthy adult skeletal muscle by immunoblotting or immunofluorescence staining. Taking all of these findings together, it seems unlikely that the antibodies used for immunoprecipitating Numb-protein complexes pull down Numb-like.

      2) The authors use PCR to investigate Numb isoform expression and conclude that p65 is likely the dominant protein isoform expressed. While this agrees with the single band observed in Supp Figure 4A, a positive control for exon 9 excluded and included isoforms in the PCR reactions would strengthen this conclusion.

      Response: The amplicons shown in Supplemental Figure 4 were sequenced. The clones corresponded to the isoforms with exon 3 present or removed. No amplicons containing exon 9 were detected. The following sentence was added to the Analysis of Splice Variants section of Methods to address this point: “PCR products were cloned using the TOPO TA cloning system (ThermoFisher) and multiple resulting clones were sequenced to confirm that the expected products were generated.”

      3) PCR analysis of total Numb and Numb-like expression levels is not shown. This is important given that the specificity of the Numb antibodies used for AP-MS experiments is not described and some Numb antibodies are well known to also recognize Numb-like. Two different Numb antibodies were used for Western and immunoprecipitation but the specificity for Numb and Numb-like is not described. In particular, does the antibody used in the AP-MS experiment recognize both Numb and Numb-like? Supplementary Table 1 does not list Numb or Numb-like, but presumably peptides were identified?

      Response: As noted above, the specificity of anti-Numb antibodies was confirmed in previous studies [3]. Importantly, Numb-like mRNA levels are 5-10-fold lower than Numb mRNA, and NumbL protein is undetectable in healthy adult skeletal muscle by Western. The physiology data reported in this manuscript support the conclusion that a single KO of Numb is sufficient to recapitulate the physiological phenotype of Numb/Numb-like KO. We therefore reason that the majority, if not all, of the physiological contribution of these proteins to muscle contractility is due to Numb (Fig. 1).

      4) The validation experiment used the same Numb antibody for immunoprecipitation, immunoblotted with Septin 7. A reciprocal IP of Septin 7 and blotted with Numb should be performed. In addition, a Numb-like IP or immunoblot would also be useful to demonstrate the specificity of the interaction. Efforts to map the interaction between Numb and Septin 7 would be useful to demonstrate specificity of the interaction and strategies to establish the biological relevance of the interaction.

      Response: We agree with the reviewer and attempted several IPs with anti-Septin7 antibodies. These were unsuccessful. In a new collaboration, Dr. Italo Cavini (University of Sao Paulo) has used machine-learning-based approaches to model binding between Numb and several septins, including Septin 7. The analysis suggests that binding of Numb with septins involves a domain of Numb that has not yet been ascribed a function in protein-protein interactions. These computational predictions require experimental validation but provide a rational starting point for experiments to define the domains responsible for these interactions. Such experiments were included in our recent NIH R01 renewal application. We hope to be able to report on results of confirmatory experiments of these computational models in the future.

      5) Other septins were identified in the AP-MS experiment and might have been anticipated to also be disrupted by Numb/Numb-like deletion. Are these septins known to interact in a complex?

      Response: This is an excellent question. Septins have conserved motifs providing a clear reason to imagine that many different mammalian septins could directly interact with Numb. Septins form heterooligomers consisting of complexes formed by 3, 6 or 8 septins [4]. It is likely that when Numb binds to one septin, antibodies against Numb pull down other septins present in the septin oligomer to which Numb is bound. The following paragraph was added to the discussion: “Our findings suggest that Numb may also interact with other septins such as septins 2, 9 and 10, which were also identified with a high level of confidence as Numb interacting proteins by our LC/MS/MS analysis. Our data do not allow us to determine if Numb binds directly to these septins. Septins contain highly conserved regions, and, consequently, if one such region of septin 7 interacts with Numb, then many septins would be expected to directly bind Numb through the same domain. However, because septins self-oligomerize, it is possible that when Numb binds to one septin, antibodies against Numb could also pull down other septins present in the septin oligomer to which Numb is bound regardless of whether or not they are also bound by Numb.”

      6) The text for Figure 5 describes analysis of Septin localization in inducible Numb/Numb-like cKO muscle, but the figure indicates only Numb is knocked out. Please clarify.

      Response: We apologize for this oversight on our part. The Legend to Figure 5 has been corrected.

      7) Supplementary Figure 2 seems to show that TAM treatment increases Numb expression. Please clarify. Also, please correct reference 9.

      Response: The figure was incorrectly labeled. We apologize for this oversight and have corrected the figure in the revised manuscript.

      Reviewer #2 (Recommendations for The Authors):

      Overall, the manuscript is well written. I do have a few minor issues/concerns, which are detailed below.

      Abstract: Please be a little more specific regarding where the tissue came from (i.e. humans, mice, cell) when referring to your previous studies.

      Response: The abstract has been revised as requested.

      Introduction: Please be more specific regarding the technique used for detecting ultrastructural changes. I assume it was done with TEM, but the reference is listed as an "invalid citation" in your reference list.

      Response: The introduction was revised as requested and the citation was updated to reference a valid citation.

      Methods / Numb Co-Immunoprecipitation: Please indicate the level of confluency of the C2C12 cells as this will alter gene expression.

      Response: As indicated in the updated Methods section, confluent C2C12 cells were switched to differentiation media (low serum) for seven days. When harvested, the cells had differentiated and fused into myotubes.

      Methods / Immunohistochemical Staining: The first sentence needs to be edited regarding plurality and grammar.

      Response: Thank you for this comment. The text was revised accordingly.

      Results / GWAS and WGS Identify...: Please spell out phosphodiesterase (I assume) for PDE4D

      Response: This change was incorporated in the text.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study reports jAspSnFR3, a biosensor that enables high spatiotemporal resolution of aspartate levels in living cells. To develop this sensor, the authors used a structurally guided amino acid substitution in a glutamate/aspartate periplasmic binding protein to switch its specificity towards aspartate. The in vitro and in cellulo functional characterization of the biosensor is convincing, but evidence of the sensor's effectiveness in detecting small perturbations of aspartate levels and information on its behavior in response to acute aspartate elevations in the cytosol are still lacking.

      We thank the reviewers and editors for the detailed assessment of our work and for their constructive feedback. Most comments have now been experimentally addressed in the revised manuscript, which we feel is substantially improved from the initial draft.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, Davidsen and coworkers describe the development of a novel aspartate biosensor, jAspSnFR3. This collaborative work supports and complements what was reported in a recent preprint by Hellweg et al. (bioRxiv; doi: 10.1101/2023.05.04.537313). In both studies, the newly engineered aspartate sensor was developed from the same glutamate biosensor previously developed by the authors of this manuscript. This coincidence is not casual but is the result of the need to find tools capable of measuring aspartate levels in vivo. Therefore, it is undoubtedly a relevant and timely work carried out by groups experienced in aspartate metabolism and in the generation of metabolite biosensors.

      Reviewer #2 (Public Review):

      In this work the iGluSnFR3 sensor, recently developed by Marvin et al (2023), is mutated at position S72, which was previously reported to switch the specificity from Glu to Asp. They made 3 mutations at this position, selected a S72P mutant, then made a second mutation at S27 to generate an Asp-specific version of the sensor. This was then characterized thoroughly and used on some test experiments, where it was shown to detect and allow visualization of aspartate concentration changes over time. It is an incremental advance on the iGluSnFR3 study, where 2 predictable mutations are used to generate a sensor that works on a close analog of Glu, Asp. It is shown to have utility and will be useful in the field of Asp-mediated biological effects.

      Reviewer #3 (Public Review):

      In this manuscript, Davidsen and collaborators introduce jAspSnFR3, a new version of aspartate biosensor derived from iGluSnFR3, that allows monitoring in real-time aspartate levels in cultured cells. A selective amino acid substitution was applied in a key region of the template to switch its specificity from glutamate to aspartate. The jAspSnFR3 does not respond to other tested metabolites and performs well, is not toxic for cultured cells, and is not affected by temperature ensuring the possibility of using this tool in tissues physiologically more relevant. The high affinity for aspartate (KD=50 uM) allowed the authors to measure fluctuations of this amino acid in the physiological range. Different strategies were used to bring aspartate to the minimal level. Finally, the authors used jAspSnFR3 to estimate the intracellular aspartate concentration. One of the highlights of the manuscript was a treatment with asparagine during glutamine starvation. Although it didn't corroborate the essentiality of asparagine in glutamine depletion, the measurement of aspartate during this supplementation is a glimpse of how useful this sensor can be.

      Reviewer #1 (Recommendations For The Authors):

      The authors should evaluate the effectiveness of the sensor in detecting small perturbations of aspartate levels and its behavior in response to acute aspartate elevations in the cytosol. In vivo aspartate determinations were performed exclusively in conditions that cause aspartate depletion. By means of mitochondrial respiratory inhibitors or aspartate withdrawal, the reliability of the sensor was determined over relatively long reading periods, until a steady state of aspartate depletion was reached 12-60 hours later. Although Hellweg and coworkers demonstrated that a related aspartate sensor could detect increases in aspartate in cells overexpressing the aspartate-glutamate GLAST transporter, the differences reported here between both sensors advise testing whether this aspect is also improved, or not, using jAspSnFR3.

      Similarly, Davidsen et al. did not test whether the sensor is able to detect transient variations in cytosolic aspartate levels. In proliferative cells aspartate synthesis is linked to NAD+ regeneration by ETC (Sullivan et al., 2015, Cell); indeed, the authors deplete aspartate using CI or CIII inhibitors but do not analyze whether aspartate levels recover, and increase, after inhibitor removal. Furthermore, the sequential addition of oligomycin and uncouplers could generate measurable fluctuations of aspartate in the cytosol.

      We agree with the reviewer that only including situations of aspartate depletion in our cell culture experiments provided an incomplete evaluation of the utility of this biosensor. In the revised manuscript we provide three additional experiments using secondary treatments that restore aspartate synthesis to conditions that initially caused aspartate depletion. First, we conducted experiments where cells expressing jAspSnFR3/NucRFP were changed into media without glutamine, inducing aspartate depletion, with glutamine being replenished at various time points to observe if GFP/RFP measurements recover. As expected, glutamine withdrawal caused a decay in the GFP/RFP signal and we found that restoring glutamine caused a subsequent restoration of the GFP/RFP signal at all time points, with each fully recovering the GFP/RFP signal over time (Revised Manuscript Figure 2E). Next, we conducted the experiment suggested by the reviewer, testing whether the published finding, that oligomycin-induced aspartate limitation can be remedied by co-treatment with electron transport chain uncouplers, could be visualized using jAspSnFR3 measurements of GFP/RFP. Indeed, after 24 hours of oligomycin-induced aspartate depletion, treatment with the ETC uncoupler BAM15 dose-dependently restored GFP/RFP signal (Revised Manuscript Figure 2G). Finally, we also measured whether the ability of pyruvate to mitigate the decrease in aspartate upon co-treatment with rotenone (Figure 2B) could also be detected in a sequential treatment protocol after aspartate depletion. Indeed, after 24 hours of aspartate depletion by rotenone treatment, the GFP/RFP signal was rapidly restored by additional treatment with pyruvate (Revised Manuscript Figure 2, figure supplement 1C).
      Collectively, these results provide support for the utility of jAspSnFR3 to measure transient changes in aspartate levels in diverse metabolic situations, including conditions that restore aspartate to cells that had been experiencing aspartate depletion.

      Reviewer #2 (Recommendations For The Authors):

      Weaknesses: Sensor basically identical to iGluSnFR3, but nevertheless useful and specific. The results support the conclusions, and the paper is very straightforward. I think the work will be useful to people working on the effects of free aspartate in biology and given it is basically iGluSnFR3, which is widely used, should be very reproducible and reliable.

      We appreciate the reviewer’s comment that the sensor is useful for specific detection of aspartate. We agree that the advance of the paper is primarily in demonstrating its utility to measure aspartate, rather than any fundamental innovation on the biosensor approach. We hope the fact that jAspSnFR3 derives from a well validated biosensor (iGluSnFR3) will support its adoption.

      Reviewer #3 (Recommendations For The Authors):

      Although this is a well-performed study, I have some comments for the authors to address:

      1) A red tag version of the sensor (jAspSnFR3-mRuby3) was generated for normalization purposes, with this the authors plan to correct GFP signal from expression and movement artifacts. I naturally interpret "movement artifacts" as those generated by variations in cell volume and focal plane during time-lapse experiments. However, it was mentioned that jAspSnFR3-mRuby3 included a histidine tag that may induce a non-specific effect (responses to the treatment with some amino acids). This suggests that a version without the tag needs to be generated and that an alternative design needs to be set for normalization purposes. A nuclear-localized RFP was expressed in a second attempt to incorporate RFP as a normalization signal. Here the cell lines that express both signals (sensor and RFP) were generated by independent lentiviral transductions (insertions). Unless the number of insertions for each construct is known, this approach will not ensure an equimolar expression of both proteins (sensor and RFP). In this scenario it is not clear how the nuclear expression of RFP will help the correction by expression or monitor changes in cell volume. The authors may be interested in attempting a bicistronic system to express both the sensor and RFP.

      The reviewer noted several potential issues concerning the use of RFP for normalization, which will be separated into sections below:

      Movement artifacts:

      We are glad the reviewer raised this issue since we see how it was confusingly worded. We have deleted the text “and movement artefacts” from the sentence.

      His-tag and non-specific responses to some amino acids:

      We also found it concerning that non-specific responses to amino acids could potentially contribute to our RFP normalization signal, and so we conducted additional experiments to address whether this was likely to be an issue in intracellular measurements. We first tested whether the non-specific signal was related to the histidine tag, or was intrinsic to the mRuby3 protein itself, by comparing the fluorescence response to a titration of histidine (which showed the largest effect on red fluorescence), aspartate, and GABA (structurally related to glutamate and aspartate, but lacking a carboxylate group) across a group of mRuby-containing variants, with or without histidine tags. We replicated the non-specific signal originally observed in jAspSnFR3-mRuby3-His and found that another biosensor with a histidine tag on the C terminus of mRuby3 had a similar response (iGlucoSnFR2.mRuby3-His), as did mRuby3-His alone, indicating that being fused with jAspSnFR3 or another binding protein was not required for this effect. Additionally, we also compared the fluorescence response of lysates expressing mRuby2 and mRuby3 without histidine tags and found that the non-specific signal was essentially absent (Revised Manuscript Figure 1, figure supplement 4B-D). Collectively, these data support our original hypothesis that the histidine tag was responsible for the non-specific signal, alleviating concerns about more substantial protein design issues or with using nuc-RFP for normalization. Since we also found that measuring aspartate signal using GFP/RFP ratios from cells with the linked jAspSnFR3-mRuby3-His agreed with measurements from cells separately expressing jAspSnFR3 and nucRFP (without a His tag), and the amino acid concentrations needed to significantly alter His-tagged mRuby3 signal are above those typically found in cells, we conclude that this is unlikely to be a significant factor in cells.
      Nonetheless, we have added all the relevant data to the manuscript to allow readers to make their own decision about which construct would be best for their purposes.

      Original text:

      "Surprisingly, the mRuby3 component responds to some amino acids at high millimolar concentrations, indicating a non-specific effect, potentially interactions with the C-terminal histidine tag (Figure 1—figure Supplement 2, panel B). Notably, this increase in fluorescence is still an order of magnitude lower than the green fluorescence response and it occurs at amino acid concentrations that are unlikely to be achieved in most cell types."

      Revised text:

      "Surprisingly, the mRuby3 fluorescence of affinity-purified jAspSnFR3.mRuby3 responds to some amino acids at high millimolar concentrations, indicating a non-specific effect (Figure 1—figure Supplement 4, panel A). This was determined to be due to an unexpected interaction with the C-terminal histidine tag and could be reproduced with other proteins containing mRuby3 and purified via the same C-terminal histidine tag (Figure 1—figure Supplement 4, panel B and C). Interestingly, a structurally related, non-amino acid compound, GABA, does not elicit a change in red fluorescence; indicating, that only amino acids are interacting with the histidine tag (Figure 1—figure Supplement 4, panel D). Nevertheless, most of our cell culture experiments were performed with nuclear localized mRuby2, which lacks a C-terminal histidine tag, and these measurements correlated with those using the histidine tagged jAspSnFR3-mRuby3 construct (Figure 1—figure Supplement 1 panel D)."

      Lentiviral transductions

      We agree that splitting the two fluorescent proteins across two expression constructs and infections effectively guarantees that there will not be equimolar expression of jAspSnFR3 and RFP; however, we do not think equimolar expression is necessary in this context. The primary goal of RFP measurements in these experiments (and in experiments using the jAspSnFR3-mRuby3 fused construct) is to control for global alterations in protein expression that might confound the interpretation that a change in GFP fluorescence corresponds to a change in aspartate levels. While a bicistronic system is arguably a better approach to improve the similarity of expression of jAspSnFR3 and nuc-RFP in a cell, we only require that the cells have consistent expression of both proteins across all cells in the population, not that the expression of one necessarily be a similar molarity to the other. We accomplish consistent expression of proteins by single cell cloning after expression of jAspSnFR3 and nucRFP (or jAspSnFR3-mRuby3), and screening for clones that have high enough expression of both proteins such that they are well detected by standard Incucyte conditions. Given that our data do not identify an obvious downside to separate expression of jAspSnFR3 and nuc-RFP compared to the fused jAspSnFR3-mRuby3 construct (where the fluorescent proteins are truly equimolar) (Figure 2, Figure Supplement 1C), we elected to prioritize the separate jAspSnFR3 and nuc-RFP combination, which provides additional opportunities to measure cell number in the same experiment (see below).
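
      The normalization argument above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the toy intensity values, and the extra step of dividing by a pre-treatment baseline are all assumptions made for illustration. It shows that a GFP/RFP ratio cancels a global change in expression, and that once the ratio is expressed relative to its own baseline, the sensor:RFP stoichiometry of a given clone drops out as well.

```python
from statistics import mean


def gfp_rfp_ratio(gfp, rfp):
    """Per-time-point GFP/RFP ratio (hypothetical helper, not from the manuscript)."""
    if len(gfp) != len(rfp):
        raise ValueError("GFP and RFP traces must have the same length")
    return [g / r for g, r in zip(gfp, rfp)]


def normalize_to_baseline(ratios, n_baseline=1):
    """Divide each ratio by the mean of the first n_baseline points (assumed step)."""
    baseline = mean(ratios[:n_baseline])
    return [r / baseline for r in ratios]


# Toy traces (arbitrary units): sensor signal falls as aspartate is depleted.
clone_a = normalize_to_baseline(gfp_rfp_ratio([200, 150, 100], [100, 100, 100]))

# Same clone with globally doubled expression of both proteins.
clone_a_2x = normalize_to_baseline(gfp_rfp_ratio([400, 300, 200], [200, 200, 200]))

# A different clone with three-fold more sensor per unit of RFP (non-equimolar).
clone_b = normalize_to_baseline(gfp_rfp_ratio([600, 450, 300], [100, 100, 100]))

print(clone_a, clone_a_2x, clone_b)
```

      In this toy case all three traces coincide, which is the sense in which consistent, rather than equimolar, expression is sufficient for the ratio to track the sensor response.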

      2) The authors were interested in establishing the temporal dynamics of aspartate depletion by genetic and pharmaceutical means. For the inhibition of mitochondrial complex I rotenone and metformin were used. Although the assays are clearly showing aspartate depletion the report of cell viability is missing. Considering that glutamine deprivation induces arrest in cell proliferation, I think it will be important to know the conditions of the cell cultures after 60 hours of treatment with such inhibitors.

      We agree that ensuring that cells are still viable in conditions where aspartate is depleted, as determined by GFP/RFP in jAspSnFR3 expressing cells, is an important goal. To this end, we added a new experiment investigating the restoration of glutamine on the GFP/RFP signal at different time points after glutamine depletion (Revised Manuscript Figure 2E, see response to reviewer 1). One advantage of using the nuclear RFP as a normalization marker is that it also enables measurements of nuclei counts, a surrogate measurement for cell number. In the same glutamine depletion experiment we therefore measured cell counts using nuclear RFP incidences and confluency as measurements of cell proliferation/growth. In both cases, the arrest in cell proliferation upon glutamine withdrawal was obvious, as was the restoration of cell proliferation following glutamine replenishment, with the amount of growth delay corresponding to the length of glutamine withdrawal (Revised Manuscript Figure 2, Figure Supplement 2A-B). Nonetheless, there were no obvious lasting defects in restarting cell proliferation even after 12 hours of glutamine withdrawal, indicating that cell viability is preserved. In the case of mitochondrial inhibitors, we also observe that even after 24 hours of treatment with oligomycin or rotenone, restoration of aspartate synthesis from BAM15 or pyruvate, respectively, can also restore GFP/RFP signal, supporting the conclusion that cellular metabolism is still active in these conditions (Revised Manuscript Figure 2G; Revised Manuscript Figure 2, figure supplement 1C).

      3) The pH sensitivity was checked in vitro with jAspSnFR3-mRuby3 and the sensor reported suitable for measurements at physiological pH. It would be an opportunity to revisit the analysis for pH sensitivity in cultured cells using an untagged version of jAspSnFR3 coupled, for example, to a sensor for pH.

      We thank the reviewer for the suggestion and agree that pH effects on sensor signal could be a confounding factor in some conditions. Unfortunately, measuring intracellular pH is not trivial and using multiple fluorescent sensors that change simultaneously would be complex to interpret, particularly in the absence of controls to unambiguously control intracellular pH and aspartate concentrations. Thus, we believe that proper investigation of the variable of pH is beyond the scope of this study. Nonetheless, we agree that measuring the contribution of pH to sensor signal is an important goal for future work, particularly if deploying it in conditions likely to cause substantial pH differences, such as comparing compartmentalized signal of jAspSnFR3 in the cytosol and mitochondria. We have added the following italicized text to the conclusions section to underscore this point:

      “Another potential use for this sensor would be to dissect compartmentalized metabolism, with mitochondria being a critical target, although incorporating the influence of pH on sensor fluorescence will be an important consideration in this context.”

      4) While the authors take an interesting approach to measuring intracellular aspartate concentration, it will be highly desirable if a calibration protocol can be designed for this sensor. Clearly, glutamine depletion grants a minimal ("zero") aspartate concentration. However, having a more dynamic way for calibration will facilitate the introduction of this tool for metabolism studies. This may be achieved by incorporating a cultured cell that already expresses the transporter or by ectopic expression in the cells that have already been used.

      We appreciate the suggestion and would similarly desire a calibration protocol to serve as a quantitative readout of aspartate levels from fluorescence signal, if possible. While we do calibrate jAspSnFR3 fluorescence in purified settings, conducting an analogous experiment intracellularly is currently difficult, if not impossible. While we have several methods to constrain the production rate of aspartate (glutamine withdrawal, mitochondrial inhibitors, and genetic knockouts of GOT1 and GOT2), we cannot prevent cells from decreasing aspartate consumption and so cannot get a true intracellular zero to aid in calibration. Additionally, the impermeability of aspartate to cell membranes makes it challenging to specifically control intracellular concentrations using environmental aspartate, and the best-known aspartate transporter (SLC1A3) is concentrative and so has the reciprocal problem. Considering these issues, we are wary of implying to readers that any specific fluorescence measurement can be used to directly interpret aspartate concentration given the many variables that can impact its signal, both related to the biosensor system itself (expression of jAspSnFR3, expression of Nuc-RFP, sensitivity and settings of the fluorescence detector) and based on cell intrinsic variability (differences in basal ASP levels, different sensitivity to treatments, influence of pH, etc.). We maintain that jAspSnFR3 has utility to measure relative changes in aspartate within a cell line across treatment conditions and over time, but absolute quantitation of aspartate still will require complementary approaches, like mass spectrometry, enzymatic assays, or NMR.

      5) jAspSnFR3 seems to have the potential to be incorporated easily for several research groups as a main tool. In general, a minor correction to replace F/F with ΔF/F in the text.

      Thank you for catching this error; the text has been edited accordingly.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, the authors provide evidence to show that an increase in Kv7 channels in hilar mossy cells of Fmr1 knockout mice results in a marked decrease in their excitability. The reduction in excitatory drive onto local hilar interneurons produces an increased excitation/inhibition ratio in granule cells. Inhibiting Kv7 channels can help normalize the excitatory drive in this circuit, suggesting that they may represent a viable target for therapeutics for Fragile X syndrome.

      Strengths:

      The work is supported by a compelling and thorough set of electrophysiological studies. The authors do an excellent job of analysing their data and present a very complete data set.

      We thank the Reviewer for the positive comments.

      Weaknesses:

      There are no significant weaknesses in the experimental work, however the complexity of the data presentation and the lack of a schematic showing the organizational framework of this circuit make the data less accessible to non-experts in the field. I highly encourage a graphical abstract and network diagram to help individuals understand the implications of this work.

      We thank the Reviewer for the suggestion, and added a schematic of the dentate network organization (Figure 1A).

      The work is important as it identifies a unique regional and cell-specific abnormality in Fmr1 KO mice, showing how the loss of one gene can result in region-specific changes in brain circuits.

      Reviewer #2 (Public Review):

      Summary:

      Deng et al. investigate, for the first time to my knowledge, the role that hippocampal dentate gyrus mossy cells play in Fragile X Syndrome. They provide strong evidence that, in slice preparations from Fmr1 knockout mice, mossy cells are hypoactive due to increased Kv7 function whereas granule cells are hyperactive compared to slices from wild-type mice. They provide indirect evidence that the weakness of mossy cell-interneuron connections contributes to granule cell hyperexcitability, despite converse adaptations to mossy cell inputs. The authors show that application of the Kv7 inhibitor XE991 is able to rescue granule cell hyperexcitability back to wild-type baseline, supporting the overall conclusion that inhibition of Kv7 in the dentate may be a potential therapeutic approach for Fragile X Syndrome. However, any claims regarding specific circuit-based intervention or analysis are limited by the exclusively pharmacological approach of the manipulations.

      Strengths:

      Thorough electrophysiological characterization of mossy cells in Fmr1 knockout mice, a novel finding.

      Their electrophysiological approach is quite rigorous: patched different neuron types (GC, MC, INs) one at a time within the dentate gyrus in FMR1 KO and WT, with and without 'circuit blockade' by pharmacologically inhibiting neurotransmission. This allows the most detailed characterization possible of passive membrane/intrinsic cell differences in the dentate gyrus of Fmr1 knockout mice.

      Provide several examples showing the use of Kv7 inhibitor XE991 is able to rescue excitability of granule cell circuit in Fmr1 knockout mice (AP firing in the intact circuit, postsynaptic current recordings, theta-gamma coupling stimulation).

      We thank the Reviewer for the positive comments.

      Weaknesses:

      The implications for these findings and the applicability of the potential treatment for the disorder in a whole animal are limited due to the fact that all experiments were done in slices.

      We appreciate the Reviewer’s point and agree. To address this concern, we have revised the Discussion to state that “the applicability of a circuit-wide approach as a potential treatment in vivo will require extensive future behavioral analyses, which are beyond the scope of the current study”. We also now emphasize in Discussion that “these findings provide a proof-of-principle demonstration that a circuit-based intervention can normalize dynamic E/I balance and restore dentate circuit output in vitro”.

      The authors' interpretation of the word 'circuit-based' is problematic - there are no truly circuit-specific manipulations in this study due to the reliance on pharmacology for their manipulations. While the application of the Kv7 inhibitor may have a predominant effect on the circuit through changes to mossy cell excitability, this manipulation would affect many other cells within the dentate and adjacent brain regions that connect to the dentate that express Kv7 as well.

      We appreciate the reviewer’s point but would like to clarify that by using the term ‘circuit-based’ we did not intend to imply that it is a ‘circuit-specific’ intervention. Our intended interpretation of the term ‘circuit-based’ stems from the following reasoning: the dentate circuit has two types of excitatory neurons which show opposite excitability defects in FXS mice, thus presenting an irreconcilable conflict to correct pharmacologically for each cell type individually. Instead, we sought an approach to correct the overall dentate circuit output, rather than to restore excitability defects of individual cell types. Notably, when we pharmacologically isolated granule cells from the circuit, inhibition of Kv7 failed to restore their excitability, suggesting that normalization of the dentate output depends on the circuit activity. Since we focused on correcting dentate output using such a circuit-dependent approach, we used the term ‘circuit-based intervention’ to emphasize this notion.

      Reviewer #3 (Public Review):

      The paper by Deng, Kumar, Cavalli, Klyachko describes that, unlike in other cell types, loss of Fmr1 decreases the excitability of hippocampal mossy cells due to up-regulation of Kv7 currents. They also show evidence that while muting mossy cells appears to be a compensatory mechanism, it contributes to the higher activity of the dentate gyrus, because the removal of mossy cell output alleviates the inhibition of dentate principal cells. This may be important for the patho-mechanism in Fragile X syndrome caused by the loss of Fmr1.

      These experiments were carefully designed, and the results are presented in a very logical, insightful, and self-explanatory way. Therefore, this paper represents strong evidence for the claims of the authors. In the current state of the manuscript, there are only a few points that need additional explanation.

      We thank the Reviewer for the positive comments.

      One of the results, which is shown in the supplementary dataset, does not fit the main conclusions. Changes in the mEPSC frequency suggest that in addition to the proposed network effects, there are additional changes in the synaptic machinery or synapse number that are independent of the actual activity of the neurons. Since the differences of the mEPSC and sEPSC frequencies are similar and because only the latter can signal network effects, while the former is typically interpreted as a presynaptic change, it cannot be claimed that sEPSC frequency changes are due to the hypo-excitability of mossy cells.

      We thank the Reviewer for this important point and agree. To address this concern, we now state in Results that “We note that changes in the excitatory drive onto interneurons include both mEPSC and sEPSC frequencies, which reflect not only potential deficits in excitability of their input cells, such as MCs, but also changes in synaptic connectivity/function, that may arise from homeostatic circuit reorganization/compensation (see Discussion)”.

      We also now emphasize this point in Discussion by stating that “alterations in excitatory drives, including both mEPSC and sEPSC frequencies onto interneurons, suggest changes in the excitatory synapse number and/or function. Together with alterations in inhibitory drives these changes may reflect compensatory circuit reorganization of both excitatory and inhibitory connections, including mossy cell synapses”.

      We also note in Discussion that “Such circuit reorganization can explain the balanced E/I drive onto granule cells in Fmr1 KO mice we observed in the basal state, which can result from reorganization of excitatory and inhibitory axonal terminals”.

      Notably, our findings that Kv7 blocker acting by increasing MC excitability is sufficient to correct dentate output, supports the notion that hypo-excitability of mossy cells is a major factor contributing to dentate circuit E/I imbalance. This does not exclude the presence of additional mechanisms contributing to E/I imbalance, such as changes of synaptic connectivity or release machinery. To reflect this point, we revised the Results to temper the initial claim that “this analysis supports the notion that the hypo-excitability of MCs in Fmr1 KO mice caused (now replaced with “is a major factor contributing to”) the reduction of excitatory drive onto hilar interneurons, which ultimately results in reduced local inhibition”.

      An apparent technical issue may imply a second weak point in the interpretation of the results. Because the IPSCs in the PP stimulation experiments (Fig 8) start within a few milliseconds, it is unlikely that its first components originate from the PP-GC-MC-IN feedforward inhibitory circuit. The involvement of this circuit and MCs in the Kv7-dependent excitability changes is the main implication of the results of this paper. But this feedforward inhibition requires three consecutive synaptic steps and EPSP-AP couplings, each of them lasting for at least 1ms + 2-5ms. Therefore, the inhibition via the PP-GC-MC-IN circuit can be only seen from 10-20ms after PP stimulation. The earlier components of the cPSCs should originate from other circuit elements that are not related to the rest of the paper. Therefore, more isolated measurements on the cPSC recordings are needed which consider only the later phase of the IPSCs. This can be either a measurement of the decay phase or a pharmacological manipulation that selectively enhances/inhibits a specific component of the proposed circuit.

      We appreciate the Reviewer’s point. As we mentioned in Results: “The EPSP measured in granule cells in response to the PP stimulation integrates both excitatory and inhibitory synaptic inputs onto granule cells, including the direct synaptic input from the PP and all the PP stimulation-associated feedforward and feedback synaptic inputs. In other words, the EPSP in granule cells integrates all dentate circuit ‘operations’.” As the Reviewer pointed out, this is also the case in the measurements of cPSCs, which comprise all of PP stimulation-associated feedforward and feedback inhibition. We thank the Reviewer for the suggestion to isolate specific components of IPSC. However, we did not attempt to do it in this study for three reasons. First, activity of all of these circuit components likely overlaps extensively in time and it is difficult to identify the specific time point that can separate contributions from earlier canonical feed-forward and feed-back components from the contribution of the later MC-dependent PP-GC-MC-IN feed-forward component. Notably the tri-synaptic PP-GC-MC-IN component differs temporally from the canonical di-synaptic (PP-GC-IN) feed-back inhibition only by a single synaptic activation step, resulting in only a few milliseconds difference. Moreover, the temporal differences in the contributions of these components vary widely among different recordings making a uniform analysis very difficult. Second, we used three different metrics to assess E/I changes in cPSC measurements, which capture a wide range of temporal processes and their integration, including peak-to-peak measurements, the charge transfer, and the excitation window metrics. Third, the principal readout in our study was the overall dentate output (i.e., granule cell firing), which reflects the integration of all dentate circuit ‘operations’ thus making the overall cPSC measurements appropriate, in our view, for this readout.
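
      The timing argument in this exchange can be made concrete with a rough latency budget. The sketch below is only an illustration: it takes the Reviewer's per-step estimate (1 ms synaptic delay plus 2-5 ms EPSP-AP coupling) at face value, counts three steps for PP-GC-MC-IN and two for PP-GC-IN, and assumes the steps simply add. The function name `onset_window_ms` and the constants are hypothetical and are not taken from the manuscript.

```python
# Rough latency budget for multi-synaptic pathways after PP stimulation.
# Per-step values are the Reviewer's estimates, not measurements from the study.
SYNAPTIC_DELAY_MS = 1.0
COUPLING_MS = (2.0, 5.0)  # (fastest, slowest) EPSP-to-AP coupling per step


def onset_window_ms(n_steps):
    """Return (earliest, latest) onset in ms for a chain of n_steps synaptic steps."""
    fastest = n_steps * (SYNAPTIC_DELAY_MS + COUPLING_MS[0])
    slowest = n_steps * (SYNAPTIC_DELAY_MS + COUPLING_MS[1])
    return fastest, slowest


tri_synaptic = onset_window_ms(3)  # PP-GC-MC-IN
di_synaptic = onset_window_ms(2)   # PP-GC-IN
print(tri_synaptic, di_synaptic)   # (9.0, 18.0) (6.0, 12.0)
```

      Under these assumptions the tri-synaptic window (9-18 ms) is close to the Reviewer's 10-20 ms estimate, and it overlaps the di-synaptic window (6-12 ms) with an offset of one step (3-6 ms), which is in line with the authors' remark that the two components differ by only a few milliseconds.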

      I suggest refraining from the conclusions saying "MCs provide at least ~51% of the excitatory drive onto interneurons in WT and ~41% in KO mice", because too many factors (e.g. IN cell types, slice condition, synaptic reliability) are not accounted for in these actual numbers, and these values are not necessary for the general observation of the paper.

      We thank the reviewer for this suggestion, and have revised the manuscript accordingly.

      There are additional minor issues about the presentation of the results.

      We have carefully checked and corrected the minor errors that the reviewer pointed out.

      Recommendations for the authors:

      Revisions that are considered essential for improved assessment regarding the strengths of support of the claims:

      • Temper claims regarding circuit-based effects

      • Temper claims regarding very specific quantitative assessments of synaptic drives

      • Differentiate between monosynaptic inputs and inputs arriving through multiple synaptic contacts with proper analytical techniques.

      We appreciate these suggestions and have revised the manuscript to address the concerns raised by the reviewers.

      Reviewer #1 (Recommendations For The Authors):

      The authors do an outstanding job of reviewing and presenting all of their data. This is a paper I will recommend all of my trainees read, as it is an excellent example of a complete research project. While I am impressed with the effort involved, I also wondered if the complexity and thoroughness of their presentations could make the story less accessible to non-expert readers. My comments are simply intended to help them present a more coherent and succinct story to a wider audience, though I am not sure I really provide any meaningful changes. This is simply a very thorough and complete body of work that the authors should be commended for. After reading it I felt they had gone above and beyond what most authors would provide in terms of data to support their story, and thus I had no doubt that a change in Kv7 plays a role in changing the excitability of the network.

      We thank the Reviewer for the positive comments and great suggestions. We have made numerous changes to present our work in a more coherent and succinct way, in part by re-plotting some of the figures, as well as by adding a schematic of the dentate circuit in Figure 1.

      Figure 1. A visual of mossy cells and the local circuit they are studying would be a useful addition to Figure. 1. I also feel this is important for conveying the story of how hypo-excitability can impact the E/I of the network. I think it has to be more of a cell structure/circuit-based figure than is presented in Supplementary Figure 8.

      We thank the reviewer for this suggestion. We have added a schematic of the dentate circuit with all major cell types involved in Figure 1A.

      Figure 1. A, B, and C tell a coherent story and are easy to understand. The interpretation of the phase plot in D is harder to access. Perhaps having this as a separate figure and providing a clearer presentation of the way the phaseplot was created (see Figure 3 Bove et al., 2019, Neuroscience 418; DOI: 10.1016/j.neuroscience.2019.08.048)

      We appreciate the Reviewer’s point and agree. In order to keep Figure 1 more concise and readable, we removed the phase plot in the revised version. This change did not negatively impact the result presentation because the primary aim of this plot was to visualize changes in voltage threshold in an alternative way, but it was already clearly shown by the ramp-evoked AP traces (revised Figure 1D, insert), and thus was not essential to show.

      Figure 1 E-N might be better situated in a supplementary graph as the characteristics of the AP aren't changing.

      We understand the Reviewer’s point, but we feel it would be better to keep all action potential metrics together in one figure, to show that only a specific subset of parameters was affected in Fmr1 KO mice.

      Figure 2: (A-D) I am not sure having so many figures is required given the focus is on having a small change in IR at one membrane potential. I do worry that the significance appears to be due to 2 cells with an IR of over 100 in the WT group and 2 with an IR of around 62 in the KO group. All other cells are between 75-100 in both groups. I also worry a bit because in the literature IRs between 55 and 125 seem to be commonly reported by groups that do this work normally (Buzsáki, Westbrook, etc.). I would be cautious about making too much out of this result.

      We thank the Reviewer for these comments. We have performed additional analyses of these data, as also suggested by Reviewer 3 (Point #1), and improved presentation of the data in Figure 2D-F by showing the effect of XE991 on increasing input resistance in WT vs KO. We also plotted other panels in a similar way to show the comparisons between WT and KO, as well as comparisons within genotype +/- XE991, which makes the results easy to follow. For more details, please also see the response to Reviewer 3, Point 1.

      Figure 2D-E: As in the text, this result is really pointing towards there being a Kv7 issue. Worries about the data in D aside, I think these two figures alone tell a clearer story. Figure 3 on the other hand tells a story of the effects of blocking Kv7 on membrane potential. Is this central to the story the authors are trying to tell?

      We thank the reviewer for this point. We believe that Figure 2, Figure 3 and Figure 4—figure supplement 1 together provide strong and multifaceted evidence to support changes in Kv7 function in Fmr1 KO mossy cells.

      Figure 3. This is an interesting finding that shows how detailed their analysis was. Showing that the change in holding current in KO animals is greater than in WT is the first solid piece of evidence that there is a change in Kv7 in these cells that affects their excitability.

      We appreciate the reviewer’s comment. As mentioned above, we believe that Figure 2, Figure 3 and Figure 4—figure supplement 1 together provide strong and multifaceted evidence to support changes in Kv7 function in Fmr1 KO mossy cells.

      Figures 4 and 5 provide additional detail to support the idea that Kv7 changes by showing how the E/I ratio and spontaneous minis are shifted in KO animals.

      We thank the Reviewer for the comments.

      Figures 6-8 build a compelling story for the reduction in excitatory drive in mossy cells affecting the network dynamics in excitatory/inhibitory interactions in DG cells.

      We appreciate the Reviewer’s comment.

      Reviewer #2 (Recommendations For The Authors):

      1) Other than location and characteristic morphology, the other parameters that were used to identify mossy cells and granule cells were also parameters used to find differences in cellular properties between wild-type and Fmr1 KO mice (RMP, sEPSC frequency, etc.), which would confound the results shown. The use of available transgenic mouse lines would provide for a more unbiased screen of these cells. Afterhyperpolarization was also used as a parameter while screening cells, yet none of the data on this measurement is shown.

      We thank the reviewer for this point and agree that transgenic mouse lines provide a more unbiased way to identify various types of neurons. However, since the present study involves analyses of at least three different types of neurons, establishing multiple transgenic lines labeling different types of dentate neurons in the Fmr1 KO mouse model would be very time consuming and beyond the current resources of the lab. We would also like to clarify that the three types of dentate neurons are easily distinguished according to the large differences in location, morphology and basal electrophysiological properties, none of which were essential in defining differences between genotypes. Specifically, granule cells are located in the granule cell layer, have a small cell body (<10 µm), RMP around -80 mV, capacitance ~20 pF, and infrequent sEPSCs (<20 events/min); mossy cells are located in the hilus, have a large cell body (>15 µm), RMP around -65 mV, capacitance >100 pF, and fast afterhyperpolarization less than -10 mV (WT -5.1 ± 0.7 mV, KO -5.8 ± 0.5 mV); interneurons are located in the hilus or border of granule cell layer, have a relatively smaller cell body (10-15 µm), RMP around -55 mV, capacitance <60 pF, and afterhyperpolarization larger than -15 mV (WT -20.4 ± 1.3 mV, KO -19.8 ± 1.4 mV). We note that the cells that could not be definitively classified into the three categories were not included in analyses, and we have now clarified this further in the Methods. To address the reviewer’s second concern regarding AHP, we have now provided the corresponding values in the Methods.

      2) A definitive way to test the cell-autonomous nature of the Kv7 changes would be to use female mice, who will have a mosaic of cells affected by the fragile X chromosome, and the Fmr1 KO cells could be engineered to express GFP to help identify them from wild-type cells.

      We agree and appreciate this suggestion. This could be an interesting follow up study to further verify the cell-autonomous nature of Kv7 changes.

      3) The authors heavily rely on XE991 as a selective Kv7 blocker. Is it blocking all Kv7 channels at the concentration used? If so, given the significant expression of Kv7 in the dentate as shown by Western blot, is it surprising that there is no effect of this inhibitor on wild-type slices in most cases?

      We thank the reviewer for this important point. We used a concentration of 10x the IC50 in the present study, suggesting that more than 80% of Kv7 should be blocked. Notably, we observed several effects of XE991 in WT mice: it significantly increased input resistance (new Figure 2D-F), and strongly enhanced AP firing evoked by step depolarization (Figure 7E-H), although we did not observe an effect of XE991 in WT in the analyses of spiking evoked by theta-gamma stimulation in Figure 8. However, this is not surprising. If a parameter we measured is predominantly cell-autonomous (for example, input resistance), the effects of XE991 are easy to observe. However, if a parameter reflects integration of all dentate circuit operations (for example, AP probability in response to theta-gamma stimulation), it is difficult to detect the effect of XE991 in WT mice because the dentate circuit of WT mice has a greater capacity to maintain E/I balance in response to XE991.
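
      As a rough check on the "more than 80%" figure, fractional block at a given multiple of the IC50 can be estimated from the Hill equation. This is a minimal sketch under an assumption not stated in the response, namely simple equilibrium binding with a Hill coefficient of 1; the function name `fraction_blocked` is hypothetical, and block by a real channel inhibitor may deviate from this idealized curve.

```python
def fraction_blocked(conc_over_ic50, hill=1.0):
    """Fractional block from the Hill equation, with concentration in IC50 units.

    hill=1.0 is an assumption made for illustration, not a reported value.
    """
    x = conc_over_ic50 ** hill
    return x / (1.0 + x)


print(round(fraction_blocked(10.0), 3))  # 0.909
```

      With these assumptions, 10 times the IC50 corresponds to about 91% block, which is consistent with the authors' statement that more than 80% of Kv7 should be blocked.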

      4) E/I ratio is a helpful concept, and it is heavily relied upon in the results text, but statistically shaky, especially for sEPSC:sIPSCs since you are combining uncertainty in the sEPSC and sIPSC to make one very uncertain ratio that doesn't undergo any subsequent statistical confirmation (such as in Fig 4I).

      We appreciate the reviewer’s point and apologize for the confusion in presentation of Fig 4I (and 5I), due to lack of detailed explanation. The E/I ratio shown in Figs. 4I (and 5I) is a single data-point estimate calculated from the mean values of independent sEPSC and sIPSC measurements (Figs. 4G-H and 5G-H, respectively). This ratio was used only as an estimate/illustration of the changes, rather than a precise determination of the shift in E/I balance. Because there is only one data-point for this ratio, statistical analysis is not possible. For this reason we performed extensive additional analyses in Figures 7 and 8, in which the EPSC and IPSC were measured from the same cells and at the same time to define the actual E/I ratio with the corresponding statistical analyses (i.e., a real matched and dynamic E/I ratio).

      5) Is this mGluR2/CB1 specificity to PP/granule and MC axons, respectively, true in the Fmr1 KO mice? It is possible that mGluR2 and CB1 expression patterns are altered in Fmr1 KO, thus the assumption used to isolate these distinct inputs may not hold true.

      This is a very good point. We do assume that the specificity of Group II mGluR and CB1 is similar between Fmr1 KO and WT mice, but this is an assumption that we have not directly verified. However, our results in Figures 7 and 8 strongly support this assumption, because if it were not true, then our intervention would be unlikely to correct the excessive dentate output.

      6) XE991 only normalized GC firing when other cells were not pharmacologically blocked. The authors suggest this means blockage of MC Kv7 reduces GC excitability back to normal...presumably by increasing MC --> IN --> GC firing. This is a conclusion from many indirect comparisons (comparing XE991 effect on GC with/without GABA and glutamate blockers; comparing MC firing rates with/without XE991, and using CB1 agonist versus mGluR2 agonist to say it is mossy cells that are mostly controlling INs) - a clincher experiment would be to acutely knockdown Kv7 in mossy cells specifically and measure GC and IN firing.

      Thank you, this is a great suggestion. Indeed, as an expansion of this project, in future studies we are planning to manipulate excitability of mossy cells through manipulating Kv7, or using chemogenetic or optogenetic approaches.

      7) The reasoning behind the FMRP-Kv7 connection is quite weak, citing the paper Darnell 2011 as "translational target", but FMRP has myriad translational targets.

      We agree, and attempted to define the mechanism of increased Kv7 function using a co-immunoprecipitation approach, as well as immunostaining to look at cell-type specific expression changes. However, the results of both approaches were difficult to interpret due to technical limitations of the available antibodies. We also note that “We did not further investigate the precise mechanisms underlying enhancement of Kv7 function in the absence of FMRP, since the present study primarily focuses on the functional consequences of abnormal cellular and circuit excitability”. To address this concern, we extensively discussed the potential mechanisms of FMRP-Kv7 connection, acknowledged in Discussion that “further studies will be needed to elucidate the precise mechanism responsible for the increased Kv7 function in Fmr1 KO mice”, and will continue to investigate it in future studies.

      8) The authors attempt to look for changes in Kv7 expression with Western blot, but since they hypothesize that Kv7 changes are mainly in the mossy cells, it is perhaps not surprising that they would not be able to see any changes when they look at dentate as a whole. Staining for Kv7 subunits to look at expression on a cellular level would be beneficial.

      We appreciate the reviewer’s suggestion. We attempted to perform the suggested experiments using immunostaining for KCNQ2, KCNQ3 and KCNQ5 in different subtypes of dentate neurons. However, these experiments failed to produce interpretable results due to technical limitations of the available antibodies.

      9) Is Kv7 localization or splice/composition different in FMR1 KO mice?

      This is a very good point. As we mentioned in Point 8 above, we were not able to perform these experiments and do not have the answer at this point.

      10) Regarding the 3 subtypes of interneurons in the dentate, the authors are pooling data based on similar intrinsic properties, but this conclusion may be affected by the low number of recorded neurons for the regular-spiking type. In addition, it is unclear whether these different interneuron types have differential circuit connectivity (most likely) which would make it imperative to keep circuit analysis for interneurons segregated into these cell types.

      We appreciate the reviewer’s point. Indeed, these different interneuron types may have distinct circuit connectivity and contributions to circuit activity. However, identification of these 3 types of interneurons and determination of their respective functions is in itself a very extensive set of experiments which is beyond the scope of the current manuscript. We also note that the functional readout of circuit activity in our measurements was the AP firing and EPSPs evoked in granule cells by PP stimulation, which integrate all dentate circuit operations, including all of the feedforward and feedback loops which are mediated by all of these different types of interneurons. For simplicity, we thus pooled all interneuron data for the purposes of this study. But we fully agree that extensive future work is required to elucidate interneuron-type specific changes in Fmr1 KO mice and their contributions to the dentate circuit dysfunction.

      11) To do statistics treating each cell individually, and therefore assuming each cell is independent of one another, is not correct. Two cells from the same mouse will be more similar than two cells from different mice, therefore they are not independent data points. Nested statistical methods (n cells from o slices from p mice) will be important in future work, as discussed by (Aarts et al., Nat. Neurosci. 2014).

      We agree with the Reviewer’s point and appreciate this suggestion. In the present study, the cells tested in electrophysiological experiments were from at least 3 different mice for each condition, which helps minimize this kind of error.
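
      One simple way to respect the non-independence the reviewer describes is to collapse the cells to a single value per animal before comparing conditions. The sketch below illustrates that idea only: it is not the analysis used in the study, the function name `per_mouse_means`, the mouse labels, and the values are all hypothetical, and a fully nested model (cells within slices within mice) remains the approach the reviewer recommends.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(records):
    """Collapse (mouse_id, value) cell measurements to one mean per mouse."""
    by_mouse = defaultdict(list)
    for mouse_id, value in records:
        by_mouse[mouse_id].append(value)
    # One summary value per animal, so each animal counts once downstream.
    return {mouse_id: mean(values) for mouse_id, values in by_mouse.items()}


# Hypothetical input-resistance values (in MOhm); NOT data from the study.
cells = [
    ("m1", 100.0), ("m1", 110.0),
    ("m2", 90.0),
    ("m3", 120.0), ("m3", 130.0), ("m3", 125.0),
]
animal_means = per_mouse_means(cells)
```

      With this summary the unit of analysis is the mouse rather than the cell, at the cost of discarding within-animal variability.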

      Reviewer #3 (Recommendations For The Authors):

      Is there a difference in the Rin at -45mV of the control cell after the application of XE991? This is important to appreciate whether the XE991-sensitive conductances contribute to the basal excitability of MCs. Furthermore, the statistical comparison of the Rin at -45mV of the FXS animals in the control solution and in the presence of XE991 would be also important. Actually, the most accurate measurement would be to show a difference in the acute Kv7-blockade between control and FXS animals, if that is possible with this blocker. Additionally, it would be also informative if the bar graphs in Fig.2 D & E were merged for this purpose, similarly as in the later figures.

      We thank the Reviewer for this suggestion and agree. Following this suggestion, we have re-plotted the data in Figure 2 accordingly. Specifically, we now show that XE991 significantly increased input resistance in both WT and KO mossy cells, and the effect of XE991 on increasing input resistance was markedly larger in KO than WT mossy cells. For other figures, we have plotted data in a similar way to show the comparisons between WT and KO, as well as comparisons within genotype +/- XE991.

      Because of the cell-to-cell variability of the voltage responses, it would be more informative and representative if the average of traces from all cells were shown in Fig.2 D & E.

      We agree with the Reviewer’s point. For clarity of presentation, we presented the cell-to-cell variability of the data as scatter points of input resistance values in the bar graph (Figure 2E), together with the representative traces (Figure 2D). Plotting the average traces from all cells would result in a total of 30 traces for all the WT and KO mice, which is difficult to visually assess clearly.

      On page 7, please clarify the recorded cell type in this sentence: "In contrast, WIN markedly reduced the number of sEPSCs in both WT and KO mice...".

      We thank the Reviewer for pointing out this omission and have clarified it in the revised version.

      In Figures 6 C, F, and I, the title of the Y-axis should be normalized frequency. Please also correct the figure legend accordingly because the current sentence can be also interpreted as the absolute or total number of events that were compared, irrespective of the duration of the recordings.

      We thank the Reviewer for this point and have corrected the revised version accordingly.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      I highly appreciate this study and found the paper to be very well-written and easy to follow. However, a more extensive discussion of what I summarized under "weakness" would strengthen the paper. This may include a broader discussion of the canopy effect itself and the most relevant literature on its extent in rainforest settings in general and primate foods in particular, as well as more details on the dietary behavior of modern orangutans (stratigraphy of orangutan foods) and how seasonal their diet is. The extreme seasonality in orangutan plant food availability should be discussed. Now there are only 2 sentences in the discussion (lines 304-312) and I find the word "plant" only twice overall, though variation in plant food d18O is what drives variation in orangutan dental d18O values.

      We very much appreciate the support of this reviewer, and their feedback about the clarity of the paper. As noted in the provisional reply to reviewers, we are happy to add additional context about the issue of isotopic enrichment within forest canopies, and have expanded the original paragraph in the discussion devoted to this subject. We made reference to the fact that orangutan diets vary by season and site in the original submission, and have now acknowledged that seasonal diet variation may also contribute to variation in enamel isotope values.

      Also, I'd like to note that there has been only one recent study so far that made some level of an attempt to find a breastfeeding effect in orangutans using fecal isotope data. Tsutaya et al. 2022 (AJBA) report some seasonality in adult orangutan fecal isotope values, which could be relevant here as well. But also they reported some data from 2 to 7-year-old orangutan offspring and did not see any breastfeeding pattern in isotope values here either. Probably not too surprising at this older age, but still worth noting in the context of this study.

      There is a 2019 study that sampled fecal isotopes in 43 mother-infant orangutan pairs and found a different pattern than Tsutaya et al. (2022), although these data have not been published in full (Knott et al. (2019) AJBA 168, S68, 128-129). Given these contradictions, the fact that neither study serially sampled the first two years of life, and caveats to fecal isotope sampling of wild primates reviewed in Bădescu et al. (2023: American Journal of Primatology 2023;e235), introducing these nitrogen isotope studies does not aid in the interpretation of oxygen isotope data during intensive nursing, and thus is beyond the scope of this paper. The seasonality Tsutaya et al. (2022) reported in adult fecal samples was for carbon isotopes rather than nitrogen isotopes, and its relevance to the current study is unclear given that the orangutan plant foods measured did not show seasonal variation in carbon isotopes. As requested above, we have noted that orangutans’ dietary seasonality might influence the variation of oxygen isotope values.

      Reviewer #2 (Recommendations For The Authors):

      First, the manuscript offers upfront flashy numbers with respect to the number of samples, but what the reader really needs to know upfront is the number of individuals and the number of teeth per individual. These facts are buried and make the reader work too hard to keep track. While the specimen ID numbers are valuable in the table, perhaps a different ID could be used in the text, such as individuals modern Borneo A and B, fossil Sumatra A and B, etc.? Similarly, it would be helpful to remind readers of each locality - Borneo or Sumatra, modern or fossil.

      Tables 1 and 2 and the first sentence of the results and the materials and methods stated that we measured 18 teeth in this study. It is likely that the placement of the tables at the very end of the manuscript in the submitted version made the sample sizes and specimen information less evident to the reviewer. In response to this critique we have now added the number of teeth to the abstract, and trust that when the tables are placed within the text as indicated it will be easier to follow textual references to particular individuals. Museum identification codes have been provided in two previous publications of these teeth, and we retain them here for consistency.

      Second, the manuscript mentions some climate change in Sumatra, but what about Borneo?

      The results on the Bornean fossil teeth stated: “The range of values from these two fossil molars (14.2–24.8 ‰) markedly exceeds the range of modern Bornean orangutans (12.7–20.0 ‰) (Figure 4), with the mean δ18O value at least 2‰ heavier, suggesting possibly drier conditions with greater seasonality during their formation.” In the final section of the discussion, we devoted two paragraphs to discussing evidence for climate change at Niah Cave in Borneo - more than we devote to discussing such data from Sumatra.

      The most valuable figure in the manuscript is Figure 3 showing the serial sampling of modern teeth. It would be incredibly useful to see a similar graph for the fossils and a graph of the modern and fossils together for each island. The violin plots demonstrate a range of values but fail to provide the important seasonality signals. The manuscript is promising but as written is difficult to follow, and the results and conclusions with regard to climate change need more demonstration. On a minor note, I found myself wanting to know about the dates of fossils before knowing the isotopic values. You might wish to move the dating section to precede the isotopes.

      As requested, we have added an additional Supplemental figure making the comparisons of seasonality between fossil and modern individuals more evident.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study addressed an alternative hypothesis to temporal binding phenomena. In temporal binding, two events that are separated in time are "pulled" towards one another, such that they appear more coincidental. Previous research has shown evidence of temporal binding events in the context of actions and multisensory events. In this context, the author revisits the well-known Libet clock paradigm, in which subjects view a moving clock face, press a button at a time of their choosing to stop the clock, a tone is played (after some delay), and then subjects move the clock dial to the point where the tone occurred (or when the action occurred). Classically, the reported clock time is a combination of the action and sound times. The author here suggests that attention can explain this by a mechanism in which the clock dial leads to a roving window of spatiotemporal attention (that is, it extends in both space and time around the dial). To test this, the author conducted a number of experiments where subjects performed the Libet clock experiment, but with a variety of different stimulus combinations. Crucially, a visual detection task was introduced by flashing a disc at different positions along the clock face. The results showed that detection performance was also "pulled" towards the action event or sensory event, depending on the condition. A model of roving spatiotemporal attention replicated these effects, providing further evidence of the attentional window.

      Strengths:

      The study provides a novel explanation for temporal binding phenomena, with clear and cleverly designed experiments. The results provide a nice fit to the proposed model, and the model itself is able to recapitulate the observed effects.

      Weaknesses:

      Despite the above, the paper could be clearer on why these effects are occurring. In particular, the control experiment introduced in Experiment 3 is not well justified. Why should a tactile stimulus not lead to a similar effect? There are possibilities here, but the author could do well to lay them out. Further, from a perspective related to the attentional explanation, other alternatives are not explored. The author cites and considers work suggesting that temporal binding relies on a Bayesian cue combination mechanism, in which the estimate is pulled towards the stimulus with the lowest variance, but this is not discussed. None of this necessarily detracts from the findings, but otherwise makes the case for attention less clear.

      I would like to thank the reviewer for the helpful comments and recommendations. Regarding Experiment 3, the rationale is this. We showed in Experiments 1 and 2 that, for outcome binding, there were two types of difference between the Action Sound condition and the Sound Only condition: the reported time of sound onset (i.e. the reported clock hand location at the sound onset) and the attention distribution. To experimentally test the relevance of the attention difference to the difference in reported time, we created a situation where the attention difference could be minimised and then checked the difference in reported time. We found that when the attention difference was controlled for between the two conditions, the difference in reported time also disappeared, thus providing further evidence for a close link between attention and time report in the current testing paradigm. Therefore, Experiment 3 primarily targeted the experimental evidence for the claim of the current study. What we needed in Experiment 3 was a condition whose attention difference from the Action Sound condition was smaller than the attention difference between the Sound Only and Action Sound conditions in Experiments 1 and 2. We expected that a tactile stimulus before the sound onset could work, without a clear prediction of the strength of the tactile stimulus in shifting attention, which was also not necessary. This experimental manipulation was a good fit for the purpose of Experiment 3, as we could empirically measure the effectiveness of the tactile stimulus in shifting attention and then relate it to the changes in outcome binding.

      As the reviewer correctly suggested, the Bayesian framework has been applied in several studies to explain the time judgement distortion in sensorimotor situations (e.g. the temporal binding effect studied here). However, the current study asked what temporal binding is really about when it is measured with the Libet clock method. Is it really about a distortion in time perception (which the Bayesian account tries to explain)? Or is it also about attention? The results showed that the spatiotemporal attention distribution is at least a confound in measuring the perceived time of an event using the Libet clock method. Therefore, the Bayesian account raised in previous studies is relevant when explaining the distortion in time perception, given that it really exists. We here asked if the distortion really exists, and to what extent.

      Reviewer #2 (Public Review):

      Summary:

      Temporal binding, generally considered a timing illusion, results from actions triggering outcomes after a brief delay, distorting perceived timing. The present study investigates the relationship between attention and the perception of timing by employing a series of tasks involving auditory and visual stimuli. The results highlight the role of attention in event timing and the functional relevance of attention in outcome binding.

      Strengths:

      • Experimental Design: The manuscript details a well-structured sequence of experiments investigating the attention effect in outcome binding. Thoughtful variations in manipulation conditions and stimuli contribute to a thorough and meaningful investigation of the phenomenon.

      • Statistical Analysis: The manuscript employs a diverse set of statistical tests, demonstrating careful selection and execution. This statistical approach enhances the reliability of the reported findings.

      • Narrative Clarity: Both in-text descriptions and figures provide clear insights into the experiments and their results, facilitating readers in following the logic of the study.

      Weaknesses:

      • Conceptual Clarity: The manuscript aims to integrate key concepts in human cognitive functions, including attention, timing perception, and sensorimotor processes. However, before introducing experiments, there's a need for clearer definitions and explanations of these concepts and their known and unknown interrelationships. Given the complexity of attention, a more detailed discussion, including specific types and properties, would enhance reader comprehension.

      • Computational Modeling: The manuscript lacks clarity in explaining the model architecture and setup, and it's unclear if control comparisons were conducted. These details are critical for readers to properly interpret attention-related findings in the modeling section. Providing a clearer overview of these aspects will improve the overall understanding of the computational models used.

      I would like to thank the reviewer for the helpful comments and recommendations. The attention in the current study, which has been made clearer in the revised manuscript, refers specifically to visuospatial attention. It is presented as a key factor shaping the results of timing report obtained with the clock method, thereby contributing to the explanation of temporal binding. Indeed, attention has been mentioned previously in a similar context, but was treated vaguely as a kind of general cognitive resource. The current study specifically tested and verified that the visuospatial attention paid to the clock face influenced the timing reports. This point has been discussed in a dedicated paragraph in the discussion section of the revised manuscript.

      The modelling of the timing report using the attention data was based on a very simple idea: The clock hand location receiving more attention should be given more weight when participants made the timing report (i.e. reporting the clock hand position). The weight for each location was calculated using the detection rate at each location. The relevant methods section has been extensively revised to provide a step-by-step implementation of the modelling, with rationales and pitfalls in the interpretation of the modelling results given (also in the discussion section).
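
      The weighting idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation in the revised methods section: it assumes the weights are the detection rates normalised to sum to one and that the predicted report is the weighted mean of the probed clock hand positions. The function name `predicted_report` and the example numbers are hypothetical.

```python
def predicted_report(positions, detection_rates):
    """Attention-weighted mean of clock hand positions.

    positions: probed clock hand positions (e.g. offsets relative to the
        true event position, in any consistent unit).
    detection_rates: detection rate measured at each position, used here
        as an unnormalised attention weight (an assumption of this sketch).
    """
    if len(positions) != len(detection_rates):
        raise ValueError("positions and detection_rates must have equal length")
    total = sum(detection_rates)
    if total <= 0:
        raise ValueError("detection rates must sum to a positive value")
    # Normalise detection rates so that the weights sum to one.
    weights = [rate / total for rate in detection_rates]
    return sum(w * p for w, p in zip(weights, positions))


# Hypothetical example: attention skewed towards the later position.
example = predicted_report([-100.0, 0.0, 100.0], [0.2, 0.3, 0.5])
```

      In this toy case, uniform detection rates would leave the predicted report at the centre of the probed range, whereas rates skewed towards later positions pull it later.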

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers and the editors for their constructive and critical comments/suggestions regarding our paper. We have since extensively revised the manuscript accordingly, including the addition of new experimental data. We hope the readers, reviewers, and editors are now satisfied with the quality and significance of the revised paper.

      Our responses to the eLife assessment and the reviewers’ comments, as well as the details of the revisions, are described below.

      Wang et al present a useful manuscript that builds modestly on the group's previous publication on KLF1 (EKLF) K74R mice focused on understanding how Eklf mutation confers anticancer and longevity advantages in vivo (Shyu et al., Adv Sci (Weinh). 2022). The data demonstrates that Eklf (K74R) imparts these advantages in a background, age, and gender independent manner, not the consequence of the specific amino acid substitution, and transferable by BMT. However, the authors overstate the meaning of these results and the strength of evidence is incomplete, since only a melanoma model of cancer is used, it is unclear why only homozygous mutation is needed when only a small fraction of cells during BMT confer benefit, they do not show EKLF expression in any cells analyzed, and the PD-1 and PDL-1 experiments are not conclusive. The definitive mechanism relative to the prior publication from this group on this topic remains unclear.

      The issues raised in the editor’s assessment of our paper were also brought up by the reviewers. We have addressed them by carrying out new experiments and by rewriting the paper to highlight the rationales and novel aspects of the current study, as described below in our responses to the three reviewers.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors Wang et al. present a study of a mouse model K74R that they claim can extend the life span of mice, and also has some anti-cancer properties. Importantly, this mechanism seems to be mediated by the hematopoietic system, and protective effects can be transferred with bone marrow transplantation.

      The authors need to be more specific in the title and abstract as to what is actually novel in this manuscript (a single tumor model), and what relies on previously published data (lifespan). Because many of these claims derive from previously published data, and the current manuscript is an extension of previously published work. The authors need to be more specific as to the actual data they present (they only use the B16 melanoma model) and the actual novelty of this manuscript.

      Especially experiments on life span are published and not sufficiently addressed in this actual paper, as the title would suggest.

      It is indeed important to point out the novelty of this paper in comparison to the previous paper. First, we have modified the title, the abstract, and the text so as to emphasize that the extended lifespan as well as the tumor resistance could be transferred from Eklf(K74R) mice to WT mice by a single transplantation of Eklf(K74R) bone marrow mononuclear cells (BMT) into the WT mice at a young age (2 months).

      We now also provide several new experimental results, including data demonstrating that Eklf(K74R) mice are resistant to tumorigenesis of hepatocellular carcinoma as well (new Fig. 1E). These points are elaborated in more detail below in our responses to the reviewers’ comments/suggestions.

      Reviewer #2 (Public Review):

      The manuscript by Wang et al. follows up on the group's previous publication on KLF1 (EKLF) K74R mice and reduced susceptibility to tumorigenesis and increased life span (Shyu et al., Adv Sci (Weinh). Sep 2022;9(25):e2201409. doi:10.1002/advs.202201409). In the current manuscript, the authors have described the dependence of these phenotypes on age, gender, genetic background, and hematopoietic translation of bone marrow mononuclear cells. Considering the current study is centered on the phenotypes described in the previous study, the novelty is diminished. Further, there are significant conceptual concerns in the study that make the inferences in the manuscript far less convincing. Major concerns are listed below:

      1) The authors mention more than once in the manuscript that KLF1 is expressed in a range of blood cells including hematopoietic stem cells, megakaryocytes, T cells and NK cells. In the case of megakaryocytes, studies from multiple labs have shown that while EKLF is expressed in megakaryocyte-erythroid progenitors, EKLF is important for the bipotential lineage decision of these progenitors, and its high expression promotes erythropoiesis, while its expression is antagonized during megakaryopoiesis. In the case of HSCs, the authors refer to their previous publication for KLF1's expression in these cells; however, neither in that study nor in the current study is there a western blot documented to convincingly show that KLF1 protein is expressed at detectable levels in these cells. For T cells, the authors have referenced a study which is based on ectopic expression of KLF1. For NK cells, the authors reference bioGPS: however, upon inspection, this is also questionable.

      2) The current study rests on the premise that KLF1 is expressed in HSCs, NK cells and leukocytes, and the references cited are not sufficient to make this assumption, for the reasons mentioned in the first point. Therefore, the authors will have to show both KLF1 mRNA and protein levels in these cells, and also compare them to the expression levels seen in KLF1 wild type erythroid cells along with knockout erythroid cells as controls, for context and specificity.

      Regarding the novelties of the current study: besides the demonstration that the healthy longevity characteristics, as exemplified by the tumor resistance, are independent of age, gender, and genetic background, another novelty is that these characteristics, in particular the tumor resistance and extended lifespan, could be transferred by one-time long-term transplantation of bone marrow mononuclear cells from young Eklf(K74R) mice to young WT mice. Also, since submission of the last version of the paper, we have carried out new experiments, including the characterization of the anti-cancer capability of NK cells (new Fig. 6) and an assay of the resistance of Eklf(K74R) mice to hepatocellular carcinoma (new Fig. 1E).

      We have also modified the title, Abstract, and different parts of the text to highlight the novelties of the current study.

      As to the expression of EKLF in different hematopoietic blood cell types, we have now added a paragraph in the Results (p.6 and p.7) describing what is known in the literature in relation to the data presented in the paper. Importantly, following the reviewer’s comments, we have since carried out Western blot analysis of EKLF expression in NK, T, and B cells (p.6, p.7 and new Fig. S4B). Note also that the level of EKLF in B cells is very low and could only be detected by RT-qPCR (Fig. S4C) and RNA-Seq (Bio-GPS database).

      3) To get to the mechanism driving the reduced susceptibility to tumorigenesis and increased life span phenotypes in EKLF K74R mice, the authors report some observations- However, how these observations are connected to the phenotypes is unclear.

      a. For example, in Figure S3, they report that the frequency of NK1.1+ cells is higher in the mutant mice. The significance of this in relation to EKLF expression in these cells and the tumorigenesis and life span related phenotypes are not described. Again, as mentioned in the second point, KLF1 protein levels are not shown in these cells.

      b. In Figure 4, the authors show mRNA levels of immune check point genes, PD-1 and PD-l1 are lower in EKLF K74R mice in PB, CD3+ T cells and B220+ B cells. Again, the questions remain on how these genes are regulated by EKLF, and whether and at what levels EKLF protein is expressed in T cells and B cells relative to erythroid cells. Further, while the study they reference for EKLF's role in T cells is based on ectopic expression of EKLF in CD4+ T cells, in the current study, CD3+ T cells are used. Also, there are no references for the status of EKLF in B cells. These details are not discussed in the manuscript.

      Regarding this part of the reviewer’s questions and comments:

      First, we have since assayed the effect of the K74R substitution of EKLF on the in vitro cancer cell-killing ability of NK cells (termed NK1.1 cells in the previous version). The data showed that NK(K74R) cells have a higher killing ability than WT NK cells (new Fig. 6). This property, together with the higher level of NK(K74R) cells in 24-month-old Eklf(K74R) mice than of NK cells in 24-month-old WT mice, would contribute to the higher tumor resistance of the Eklf(K74R) mice. This point is also addressed on p.8 and p.9.

      Second, as stated in previous sections, we have since carried out comparative Western blot analysis of the expression of EKLF protein in NK, CD3 T, and B cells of the WT and Eklf(K74R) mice, respectively (please see the new Fig. S4B). Also, a description of what is known in the literature in relation to our data on the expression of EKLF protein/Eklf mRNA in different types of hematopoietic blood cells is now included in the Results (please see p.6 and p.7). Notably, though, the level of EKLF protein in B cells was too low to be detected by WB (Fig. S4B).

      4) The authors perform comparative proteomics in the leukocytes of EKLF K74R and WT mice as shown in Figure S5. What is the status of EKLF levels in the mutant lysate vs wild type lysates based on this analysis? More clarity needs to be provided on what cells were used for this analysis and how they were isolated since leukocytes is a very broad term.

      The leukocytes we used were isolated from the peripheral blood after removal of red blood cells, as described in the Materials and Methods.

      Also, the Western blot analysis of EKLF expression in the lysates of leukocytes/white blood cells (WBC) was shown previously and is now presented in the new Figure S4A.

      5) In the discussion the authors make broad inferences that go beyond the data shown in the manuscript. They mention that the tumorigenesis resistance and long lifespan is most likely due to changes in transcription regulatory properties and changes in global gene expression profile of the mutant protein relative to WT leukocytes. And based on reduced mRNA levels of Pd-1 Pd-l1 genes in the CD3+ T cells and B220+ B cells from mutant mice, they "assert" that EKLF is an upstream regulator of these genes and regulates the transcriptomes of a diverse range of hematopoietic cells. The lack of a ChIP assay to show binding of WT EKLF on genes in these cells and whether this binding is reduced or abolished in the mutant cells, make the above statements unsubstantiated.

      We have since carried out ChIP-PCR analysis of EKLF-binding in the Pd-1 promoter (new Fig. S5). The data showed that EKLF was bound on the CACCC box at -103 of the promoter in WT CD3+T as well as in CD3+T(K74R) cells. This result is discussed on p.7.

      6) Where westerns are shown, the authors need to show the molecular weight ladder, and where qPCR data are shown for EKLF, it will be helpful to show the absolute levels and compare these levels to those in erythroid cells, along the corresponding EKLF knock out cells as controls.

      We have since included molecular weight markers at the side of the Western blots in Fig. S4. Also, we have added a new figure (Fig. S4C) comparing the expression levels of Eklf mRNA in B cells and CD3+ T cells with that in mouse erythroleukemia (MEL) cells, as analyzed by RT-qPCR.

      Also, as now indicated in the Materials and Methods section, the specificity of the primers used for RT-qPCR quantitation of mouse Eklf mRNA has been validated before by comparative analysis of wild type and EKLF-knockout mouse erythroid cells (Hung et al., IJMS, 2020).

      7) Figure S1D does not have a figure legend. Therefore, it is unclear what the blot in this figure is showing. In the text of the manuscript where they reference this figure, they mention that the levels of the mutant EKLF vs WT EKLF does not change in peripheral blood, while in the figure they have labeled WBCs for the blot, and the mRNA levels shown do seem to decrease in the mutant compared to WT peripheral blood.

      We apologize for this oversight. The data shown in the original Fig. S1D (new Fig. S4A) are from Western blot analysis of EKLF protein and RT-qPCR analysis of Eklf mRNA in leukocytes/white blood cells (WBC) isolated from peripheral blood samples. We have now restored the figure legend and rewritten the corresponding description in the text on p.6.

      Reviewer #3 (Public Review):

      Hung et al provide a well-written manuscript focused on understanding how Eklf mutation confers anticancer and longevity advantages in vivo. The work is fundamental and the data is convincing although several details remain incompletely elucidated. The major strengths of the manuscript include the clarity of the effect and the appropriate controls. For instance, the authors query whether Eklf (K74R) imparts these advantages in a background, age, and gender dependent manner, demonstrating that the findings are independent. In addition, the authors demonstrate that the effect is not the consequence of the specific amino acid substitution, with a similar effect on anticancer activity. Furthermore, the authors provide some evidence that PD-1 and PDL-1 are altered in Eklf (K74R) mice.

      We thank the reviewer for these encouraging comments.

      Finally, they demonstrate that the effects are transferrable with BMT. Several weaknesses are also evident. For instance, only melanoma is tested as a model of cancer such that a broad claim of "anti-cancer activity" may be somewhat of an overreach.

      We have now included new data showing that the Eklf(K74R) mice also carry a higher anti-cancer ability against hepatocellular carcinoma than the WT mice (new Fig. 1E).

      It is also unclear why a homozygous mutation is needed when only a small fraction of cells during BMT can confer benefit. It is also difficult to explain how transplanted donor Eklf (K74R) HSCs confer anti-melanoma effect 7 and 14 days after BMT.

      First, these two observations do not necessarily conflict with each other. It is likely that homozygosity, but not heterozygosity, of the K74R substitution in EKLF allows one or more types of hematopoietic blood cells to gain new functions, e.g. the higher cancer cell-killing capability of NK(K74R) cells (new Fig. 6), that help the mice to live long and healthy. Also, the data in Fig. 2D indicated that as few as 20% of the blood cells carrying homozygous Eklf(K74R) alleles in the recipient mice upon BMT could be sufficient to confer on the mice a higher anti-cancer capability, likely in part due to cells such as NK(K74R). These points are now clarified in Discussion (p.9 and p.10).

      Second, we think the NK(K74R) cells contributed a significant part to the anti-cancer capability of the transplanted Eklf(K74R) blood in the recipient WT mice. As documented in some literature, e.g. Ferreira et al., Journal of Molecular Medicine (2019), the hematopoietic lineage of the NK cells would be fully reconstituted as early as 2 weeks after BMT. Of course, there could be other still unknown factors/cells that also contribute to the tumor resistance of the recipient mice at 7 days following BMT. This point is now touched upon on p.8 and p.9.

      Furthermore, it would be useful to see whether there are virulence marker alterations in the melanoma loci in WT vs Eklf (K74R) mice.

      As noted in our response to the Public Reviews, we will analyze this in the future together with other types of tumors in a separate study.

      Finally, the data in Fig 4c is difficult to interpret as decreased PD-1 and PDL-1 after knockdown of EKLF in vitro is not a useful experiment to corroborate how mutation without changing EKLF expression impacts immune cells. The work is impactful as it provides evidence that healthspan and lifespan may be modulated by specific hematological mutation but the mechanism by which this occurs is not completely elucidated by this work.

      As described in a previous section, we have since also carried out ChIP-qPCR analysis of the binding of WT EKLF and EKLF (K74R) on the Pd-1 promoter (new Fig. S5).

      Reviewer #1 (Recommendations For The Authors):

      The authors present interesting melanoma model data but need to tone down their claim of multiple effects of their model system. It needs to be clear what is new and what is previously known.

      As noted in our response to the Public Reviews, we have since added new data on the tumor resistance of the Eklf(K74R) mice to hepatocellular carcinoma (new Fig. 1E). We have also modified the title as well as highlighted the novel points in the Abstract and text of the revised draft.

      Reviewer #2 (Recommendations For The Authors):

      In addition to the major concerns listed in the public review, the minor concerns that the authors could address are listed below:

      1) It would be helpful to describe why the pulmonary melanoma focus assay was chosen as the metastasis assay.

      We now describe on p. 4 the rationale behind the initial choice of this assay for analysis of the anti-cancer capability of the Eklf(K74R) mice. Also, we have since included data from an experiment using the subcutaneous cancer cell inoculation assay for comparative analysis of the anti-hepatocellular carcinoma capability of Eklf(K74R) and WT mice (Fig. 1E and p.5).

      2) Reference #61 for B16-F10-luc cells cited in the methods does not have details on the generation of these cells. What these cells are and why this model was chosen needs to be described.

      We apologize for not providing this information before. We now describe the generation of B16-F10-luc cells in the Materials and Methods section (p.13). The rationale for choosing the B16-F10 cells for the pulmonary foci assay is also added on p.4.

      3) The DNA binding consensus site for EKLF needs to be expanded in the introduction.

      This part has been taken care of now on p.13.

      Reviewer #3 (Recommendations For The Authors):

      Hung et al provide a well-written manuscript focused on understanding how Eklf mutation confers anticancer and longevity advantages in vivo. The work is fundamental and the data is convincing although several details remain incompletely elucidated.

      1) Only melanoma is tested as a model of cancer such that a broad claim of "anti-cancer activity" may be somewhat of an overreach. The authors, therefore, need to provide evidence of a second type of malignancy to which Eklf mutation confers anticancer and longevity advantages or temper the claims in the discussion that the effect still needs to be tested in non-melanoma cancer models to determine the broad anti-cancer effect.

      As noted in our response to the Public Reviews, we have since shown that Eklf(K74R) mice also exhibited a higher resistance to the carcinogenesis of hepatocellular carcinoma (new Fig. 1E).

      2) Why is a homozygous mutation needed when only a small fraction of cells during BMT can confer benefit of Eklf mutation? Is there evidence that the cellular effect is binary but only a few such cells are needed? This is confusing and requires further clarification.

      As noted in our response to the Public Reviews, these two observations do not necessarily conflict with each other. It is likely that homozygosity, but not heterozygosity, of the K74R substitution in EKLF allows one or more types of hematopoietic blood cells to gain new functions, e.g. the higher cancer cell-killing capability of NK(K74R) cells (new Fig. 6), that help the mice to live long and healthy. Also, the data in Fig. 2D indicated that as few as 20% of the blood cells carrying homozygous Eklf(K74R) alleles in the recipient mice upon BMT could be sufficient to confer on the mice a higher anti-cancer capability, likely in part due to cells such as NK(K74R). This point is now clarified in Discussion (p.9).

      3) BMT typically requires at least 3-4 weeks to reconstitute the marrow compartment but the authors are able to see effects of Eklf mutation as early as 7 days following BMT. This is surprising and brings into question the mechanism of effect.

      As noted in our response to the Public Reviews, we think the NK(K74R) cells contributed a significant part to the anti-cancer capability of the transplanted Eklf(K74R) blood in the recipient WT mice. As documented in some literature, e.g. Ferreira et al., Journal of Molecular Medicine (2019), the hematopoietic lineage of the NK cells would be fully reconstituted as early as 2 weeks after BMT. Of course, there could be other still unknown factors/cells that also contribute to the tumor resistance of the recipient mice at 7 days following BMT (please see discussion of this point on p. 9).

      4) It would be useful to see whether there are virulence marker alterations in the melanoma loci in WT vs Eklf (K74R) mice.

      As noted in our response to the Public Reviews, we will analyze this in the future together with other types of tumors in a separate study.

      5) The data in Fig 4c is difficult to interpret as decreased PD-1 and PDL-1 after knockdown of EKLF in vitro is not a useful experiment to corroborate how mutation WITHOUT changing EKLF expression impacts immune cells.

      Indeed, the RNAi knockdown experiment only demonstrated a positive regulatory role of EKLF in Pd-1/Pd-l1 gene expression. We have followed the reviewer’s suggestion and carried out ChIP-qPCR analysis, which shows that the factor is bound on the Pd-1 promoter in both WT CD3+ T cells and CD3+ T(K74R) cells (new Fig. S5). We briefly discuss these data on p.7 in relation to the possible effect of the K74R substitution of EKLF on Pd-1 expression.

      We have now further clarified this point on p. 7.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Congratulations on the very nice structure! In my opinion, which you can feel free to take or leave, this would work better as a short report focused on the improvement of the structure relative to the current published model. To my mind, while the functional and dimerization studies are supportive of the cryo-EM studies (specifically, the purified protein is functional, and does tend to dimerize in various membrane mimetics), these experiments don't provide a lot of new mechanistic insight on their own. The dimerization, in particular, could be developed further.

      Response: Thank you for the comments. We have chosen to stick with the current article format. That the protein is dimeric is exciting in our view and we are working to further define the functional significance of this formation.

      Reviewer #2 (Recommendations For The Authors):

      Ln 48. Abstract. "highlighting feature of the complex interface" sounds a bit vague. I was wondering if the authors considered including more specific findings here.

      Response: This sentence has been removed.

      Ln 149 and elsewhere. The authors refer to the previously published structure of HiSiaQM as "low resolution". It may just be me and likely not the intention of the authors, but this comes across as an attempt to diminish the validity of this previous work from another group, which is not necessary. I would recommend rewording these parts slightly, even if it is just to say "lower resolution" instead of "low resolution".

      Response: It was not our intention to diminish the excellent work published by another group; we have changed “low resolution” to “lower resolution” throughout.

      Ln 160. The authors state that the inward-open conformation is likely "the resting state of the transporter". I think this statement should be modified slightly to acknowledge that this is only true under these conditions, i.e. in the absence of the bilayer, membrane potential and chemical gradients.

      Response: We have edited this as follows “That we observe the inward-open conformation without either a bound P-subunit or fiducial marker, suggests that this is the resting state of the transporter under experimental conditions (in the absence of a membrane bilayer, membrane potential and chemical gradients).”

      Ln 202. I'm not convinced that the use of the word "probable" is appropriate here; "possible" would likely fit better in the absence of compelling evidence that this dimer forms in a bacterial cell membrane with physiological levels of HiSiaQM expression.

      Response: We have changed “probable” to “possible”.

      The authors show an SEC trace for DDM solubilised protein, which is a single peak, whereas the LMNG extracted protein has 2 distinctly different elution profiles depending on the LMNG concentration. Was the same phenomenon observed when varying the DDM concentration?

      Response: We observed significantly more aggregation with DDM than L-MNG, so it was infrequently used and considerably less well characterised. In one purification, moderately higher DDM shifted the elution peak to be slightly later but retained a similar profile. Overall, we did not observe the same phenomenon of distinctly different elution profiles with DDM, but we have limited data.

      Ln 245. The two positions cited as important for the elevator-type mechanism are the fusion helix and the dimer interface. However, there is no evidence that the dimer interface observed in this work has any relevance to the transport mechanism. To make this statement, the interface would need to be disrupted and the effects on transport evaluated.

      Response: This has been edited as follows. “Evident in our cryo-EM maps are well-defined phospholipid densities associated with areas of HiSiaQM that may be important for the function of an elevator-type mechanism (Figure 4), but require further testing.”

      Ln 257. The authors state that the lipids form "specific and strong interactions" with the protein, but without knowing the identity of the lipids present, it is difficult to say anything about the specificity of this interaction. I think the authors could consider rewording this.

      Response: We have edited this by removing the term “specific” and describing the lipid interactions only as strong interactions.

      Ln 270. The authors identify a lipid-binding site and residues that likely interact with the headgroup. It would be interesting if the authors could speculate on the purpose of this lipid binding site and how it could affect transport. The residues are not conserved, which the authors suggest reflects the variety of lipid compositions in different bacteria. Are the authors suggesting that this lipid binding site is a general feature for all fused TRAP transporters and that the identity of the lipid changes depending on the species?

      Response: Yes, we speculate that the lipid binding site may be a general feature for fused TRAP transporters. We have added speculation about this binding site, specifically that “the fusion helix and concomitant lipid molecule may provide a more structurally rigid scaffold than a Q-M heterodimer, i.e., PpSiaQM, although how this impacts the elevator transition requires further testing” at Line 283.

      Though we believe that a binding pocket is likely found in a number of fused TRAPs (based on sequence and Alphafold predictions, e.g., FnSiaQM and AaSiaQM), we have now acknowledged that some fusions may not necessarily bind a lipid molecule here, by stating “While this binding pocket is likely found in a number of fused TRAPs (based on sequence predictions, e.g., FnSiaQM and AaSiaQM in Supplementary Figure 8), it is not clear whether they also bind lipids here without experimental data” at Line 290.

      Ln 306. The authors state that the HiSiaPQM has a 10-fold higher transport activity than PpSiaPQM. Unless the transport assays were performed in parallel (to mitigate small changes in experimental set-up) and the reconstitution efficiency for each proteoliposome preparation was carefully analysed, it is very difficult for this to be a meaningful comparison. Even if the amount of protein incorporated into the proteoliposomes is quantified (e.g. by evaluating protein band intensity when the proteoliposomes are analysed using SDS-PAGE), this does not account for an inactive protein that was incorporated, nor the proportion of the protein that was incorporated in the inside-out orientation, which would be functionally silent in these assays. I'm not suggesting these assays actually need to be performed, but I think the text should be modified to reflect what can actually be compared.

      Response: We agree with the reviewer that a meaningful comparison is difficult to make without a careful analysis of the reconstitution efficiency and have modified the text to reflect this. We have altered the paragraph beginning at Line 319 to the following: “The fused HiSiaPQM system appears to have a higher transport activity than the non-fused PpSiaPQM system. With the same experimental setup used for PpSiaPQM (5 µM Neu5Ac, 50 µM SiaP) (33), the accumulation of [3H]-Neu5Ac by the fused HiSiaPQM is ~10-fold greater. Although this difference may reflect the reconstitution efficiency of each proteoliposome preparation, it is possible that it has evolved as a result of the origins of each transporter system—P. profundum is a deep-sea bacterium and as such the transporter is required to be functional at low temperatures and high pressures…”

      Ln 335. "S298A did not show an effect on growth when mutated to alanine previously." Suggest changing "S298A" here to "S298".

      Response: This has been changed.

      Ln 340. In addition to PpSiaQM, the large cavity was also presumably observed in the lower resolution structure of HiSiaQM?

      Response: The cavity is detectable in the lower resolution structure (7qe5), though very poorly defined by the density. Furthermore, the AlphaFold model fitted to this density has positioned sidechains inside the cavity, which we consider very likely to be an error (in comparison to our structures, VcINDY and our estimates of the volume required to house sialic acid). The cavity is generally much better defined by the structures we have referenced.

      Ln 345. Reference missing after "previously reported"?

      Response: This has been added.

      Measuring the affinity for the P-to-QM interaction is very useful, but it would have enhanced the study if some of the residues identified as important for this interaction (detailed on p.13) had been tested for their contributions to binding using this approach.

      Response: We do aim to perform this assay with these mutants in the future, but are also developing parallel assays to further test this interaction in different membrane mimetics.

      Ln 436. As stated previously, it is more accurate to say that "this is the most stable conformation" under these conditions.

      Response: We have edited this to say “The ‘elevator down’ (inward-facing) conformation is preferred in experimental conditions”. We have also changed the last sentence of this paragraph to say “However, the dimeric structures we have presented have no other proteins bound, yet exist stably in the elevator down state, suggesting this is the most stable conformation in experimental conditions, where there is no membrane bilayer, membrane potential, or chemical gradient present.”

      Ln 438. "Lipids associated with HiSiaQM are structurally and mechanistically important." This conclusion is not supported by the data presented; there is no evidence that the bound lipids influence the mechanism at all. The lipids observed are certainly interestingly placed and one could speculate about their relevance, but this statement of fact is not supported. Therefore, their importance to the mechanism needs to be tested or this conclusion needs to be substantially softened.

      Response: We have softened this statement by changing it to “Lipids have strong interactions with HiSiaQM and are likely to be important for the transport mechanism.”

      Reviewer #3 (Recommendations For The Authors):

      The fact that HiSiaQM samples consist of a mixture of compact monomer and dimer is clear, from Fig. S5 and S6. However, the analysis displayed in Fig 3 and Fig S4 would require more explanation. To my understanding, it requires the values of the sedimentation and diffusion coefficients. It could be good to provide the experimental values of D, and explain a little more about the method in the material and method section.

      Response: Yes, the analysis requires the experimental diffusion coefficients. These have been added to the Figure 3 and S4 legends and more detail has been added to the method section.

      In addition, I am puzzled when reading, in the legend of Fig 3, considerations that peak 2 could not correspond to a monomer or trimer: do these sentences correspond to other mathematical solutions, or is a given frictional ratio considered, or do they refer to Fig. S5 analysis?

      Response: We can see where this confusion could arise. These sentences do not correspond to a given frictional ratio or to the Fig. S5 analysis (which is a separate, complementary analysis). Ruling out a monomer for peak 2 is a strictly physical argument: with pure protein and an observed peak smaller than peak 2, peak 2 cannot be a monomer. Ruling out a trimer for peak 2 follows from a mathematical solution using the s and D coefficients. The solutions show that an unreasonably low amount of detergent would have to be bound to a trimer (32 molecules for L-MNG or 0 for DDM) for it to exist at those s and D values, so we have ruled the trimer out. Reassuringly, the complementary analysis in Fig. S5/S6 agrees with the monomer-dimer outputs from the s and D analysis. We have adjusted the text in the legends of Fig. 3 and S4 to better convey these points.
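
      The kind of s and D reasoning described above can be sketched as follows. This is only a minimal illustration, assuming the standard Svedberg relation (buoyant molar mass = sRT/D) and a simple additive protein-plus-detergent model; it is not necessarily the exact procedure used for Fig. 3 and S4. The function names and every numerical input (s, D, protomer mass, detergent mass, partial specific volumes) are hypothetical placeholders, not values from the manuscript.

```python
GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def buoyant_molar_mass(s, D, temperature_k=293.15):
    """Svedberg relation: M * (1 - vbar * rho) = s * R * T / D.

    s in seconds, D in m^2/s; the result is in kg/mol.
    """
    return s * GAS_CONSTANT * temperature_k / D


def bound_detergent_count(s, D, n_protomers, protomer_mass, vbar_protein,
                          detergent_mass, vbar_detergent,
                          solvent_density=1000.0, temperature_k=293.15):
    """Detergent molecules needed for an n-mer to match the measured s and D.

    Assumes (hypothetically) that the buoyant mass of the complex is the sum
    of the protein and detergent contributions. Masses are in kg/mol, partial
    specific volumes in m^3/kg, solvent density in kg/m^3.
    """
    complex_buoyant = buoyant_molar_mass(s, D, temperature_k)
    protein_buoyant = n_protomers * protomer_mass * (1.0 - vbar_protein * solvent_density)
    detergent_buoyant = detergent_mass * (1.0 - vbar_detergent * solvent_density)
    return (complex_buoyant - protein_buoyant) / detergent_buoyant


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the shape of the calculation.
    example = dict(
        s=1.1e-12,              # 11 S, expressed in seconds
        D=4.0e-11,              # m^2/s
        protomer_mass=70.0,     # kg/mol (i.e. 70 kDa), placeholder
        vbar_protein=0.73e-3,   # m^3/kg, typical protein value
        detergent_mass=1.0,     # kg/mol, placeholder
        vbar_detergent=0.80e-3, # m^3/kg, placeholder
    )
    for n in (1, 2, 3):
        count = bound_detergent_count(n_protomers=n, **example)
        print(f"{n}-mer would require about {count:.0f} detergent molecules")
```

      In such a calculation, an oligomeric state that requires a very small (or negative) number of bound detergent molecules can be set aside as physically implausible, which mirrors the argument used to rule out the trimer.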

    1. Author Response

      eLife assessment

      This useful study uses a mouse model of pancreatic cancer to examine mitochondrial mass and structure in atrophying muscle along with aspects of mitochondrial metabolism in the same tissue. Most relevant are the solid transcriptomics and proteomics approaches to map out related changes in gene expression networks in muscle during cancer cachexia.

      Response: We very much appreciate the positive feedback from the editors on our article and are delighted to have it published in eLife. Our sincere thanks to the Reviewers for their positive feedback on our work, and for their insightful and constructive comments.

      Reviewer #1 (Public Review):

      Summary:

      This important study provides a comprehensive evaluation of skeletal muscle mitochondrial function and remodeling in a genetically engineered mouse model of pancreatic cancer cachexia. The study builds upon and extends previous findings that implicate mitochondrial defects in the pathophysiology of cancer cachexia. The authors demonstrate that while the total quantity of mitochondria from skeletal muscles of mice with pancreatic cancer cachexia is similar to controls, mitochondria were elongated with disorganized cristae, and had reduced oxidative capacity. The mitochondrial dysfunction was not associated with exercise-induced metabolic stress (insufficient ATP production), suggesting compensation by glycolysis or other metabolic pathways. However, mitochondrial dysfunction can lead to increased production of ROS/oxidative stress and would be expected to interfere with carbohydrate and lipid metabolism, events that are linked to cancer-induced muscle loss. The data are convincing and were collected and analyzed using state-of-the-art techniques, with unbiased proteomics and transcriptomics analyses supporting most of their conclusions.

      Additional Strengths:

      The authors utilize a genetically engineered mouse model of pancreatic cancer which recapitulates key aspects of human PDAC including the development of cachexia, making the model highly appropriate and translational.

      The authors perform transcriptomic and proteomics analyses on the same tissue, providing a comprehensive analysis of the transcriptional networks and protein networks changed in the context of PDAC cachexia.

      Weaknesses:

      The authors refer to skeletal muscle wasting induced by PDAC as sarcopenia. However, the term sarcopenia is typically reserved for the loss of skeletal muscle mass associated with aging.

      Response: We agree that the term sarcopenia originally referred to aged muscle, but its use has spread to other fields, including oncology (for example, in this article, which we cite: Mintziras I et al. Sarcopenia and sarcopenic obesity are significantly associated with poorer overall survival in patients with pancreatic cancer: Systematic review and meta-analysis. Int J Surg 2018;59:19-26). Indeed, the term sarcopenia is now widely used in the literature and in the clinic to describe the loss of muscle mass and strength in cancer patients (see, for example, this recent review: Papadopetraki A. et al. The Role of Exercise in Cancer-Related Sarcopenia and Sarcopenic Obesity. Cancers 2023;15:5856).

      In Figure 2, the MuRF1 IHC staining appears localized to the extracellular space surrounding blood vessels and myofibers-which causes concern as to the specificity of the antibody staining. MuRF1, as a muscle-specific E3 ubiquitin ligase that degrades myofibrillar proteins, would be expected to be expressed in the cytosol of muscle fibers.

      Response: We agree that MuRF1 IHC staining was also observed in the extracellular space. This was a surprise, and we have no explanation for it to date.

      Disruptions to skeletal muscle metabolism in PDAC mice are predicted based on mitochondrial dysfunction and the transcriptomic and proteomics data. The manuscript could therefore be strengthened by additional measures looking at skeletal muscle metabolites, or linking the findings to previous work that has looked at the skeletal muscle metabolome in related models of PDAC cachexia (Neyroud et al., 2023).

      Response: We agree that our omics data could be strengthened by additional measures looking at skeletal muscle metabolites. It is an excellent suggestion to compare the transcriptomic and proteomic data we obtained on the gastrocnemius muscle with the metabolomic data obtained by Neyroud et al. on the same muscle. These authors used a different mouse model of PDAC from our KIC GEMM model, namely the allograft model in which KPC cells (derived from the pancreatic tumor of KPC mice, another PDAC GEMM model) are implanted into syngeneic recipient mice. They carried out a proteomic study on the tibialis anterior muscle and a metabolomic study on the gastrocnemius muscle. The proteomics data identified in particular a KPC-induced reduction in the relative abundance of proteins annotating to oxidative phosphorylation, consistent with our data showing reduced mitochondrial activity pathways. The metabolomic data showed reduced abundance of many amino acids, as expected, and of intermediates of the mitochondrial TCA cycle (malate and fumarate) in KPC-atrophied muscle, consistent with the reduced mitochondrial metabolic pathways that we illustrated. In contrast, metabolites that were increased in abundance included those related to oxidative stress and redox homeostasis, which is not surprising given the profound oxidative stress affecting atrophied muscle. Finally, we noted in Neyroud's metabolomic data the dysregulation of certain lipids and nucleotides in atrophied muscle, which is very interesting to relate to our study describing alterations in lipid and nucleotide metabolic pathways.

      Reviewer #2 (Public Review):

      The present work analyzed the mitochondrial function and bioenergetics in the context of cancer cachexia induced by pancreatic cancer (PDAC). The authors used the KIC transgenic mice that spontaneously develop PDAC within 9-11 weeks of age. They deeply characterize bioenergetics in living mice by magnetic resonance (MR) and mitochondrial function/morphology mainly by oxygraphy and imaging on ex vivo muscles. By MR they found that phosphocreatine resynthesis and maximal oxidative capacity were reduced in the gastrocnemius muscle of tumor-bearing mice during the recovery phase after 6 minutes of 1 Hz electrical stimulation while pH was reduced in muscle during the stimulation time. By oxygraphy, the authors showed a decrease in basal respiration, proton leak, and maximal respiration in tumor-bearing mice that was associated with the decrease of complex I, II, and IV activity, a reduction of OXPHOS proteins, mitochondrial mass, mtDNA, and to several morphological alterations of mitochondrial shape. The authors performed transcriptomic and proteomic analyses to get insights into mitochondrial defects in the muscles of PDAC mice. By IPA analyses on transcriptomics, they found an increase in the signature of protein degradation, atrophy, and glycolysis and a downregulation of muscle function. Focusing on mitochondria they showed a downregulation mainly in OXPHOS, TCA cycle, and mitochondrial dynamics genes and upregulation of glycolysis, ROS defense, mitophagy, and amino acid metabolism. IPA analysis on proteomics revealed major changes in muscle contraction and metabolic pathways related to lipids, protein, nucleotide, and DNA metabolism. Focusing on mitochondria, the protein changes mainly were related to OXPHOS, TCA cycle, translation, and amino acid metabolism.

      The major strength of the paper is the bioenergetics and mitochondrial characterization associated with the transcriptomic and proteomic analyses in PDAC mice that confirmed some published data of mitochondrial dysfunction but underlined some novel metabolic insights such as nucleotide metabolism.

      There are minor weaknesses related to some analyses on mitochondrial proteins and to the fact that proteomic and transcriptomic comparison may be problematic in catabolic conditions because some gene expression is required to maintain or re-establish enzymes/proteins that are destroyed by the proteolytic systems (including the autophagy proteins and ubiquitin ligases). The authors should consider the following points.

      Point 1. The authors used the name sarcopenia as synonymous with muscle atrophy. However, sarcopenia clearly defines the disease state (disease code: ICD-10-CM (M62.84)) of excessive muscle loss and force drop during ageing (Ref: Anker SD et al. J Cachexia Sarcopenia Muscle 2016 Dec;7(5):512-514.). Therefore, the word sarcopenia must be used only when pathological age-related muscle loss is the subject of study. Sarcopenia can be present in cancer patients who also experience cachexia, however since the age of tumor-bearing mice in this study is 7-9 weeks old, the authors should refrain from using sarcopenia and instead replace it with the words muscle atrophy/ muscle wasting/muscle loss.

      Response: This issue has also been raised by Reviewer #1. We agree that the term sarcopenia historically refers to aged muscle, but it is also used in oncology (for example, in this article, which we cite: Mintziras I et al. Sarcopenia and sarcopenic obesity are significantly associated with poorer overall survival in patients with pancreatic cancer: Systematic review and meta-analysis. Int J Surg 2018;59:19-26). Indeed, the term sarcopenia is now widely used in the literature and in the clinic to describe the loss of muscle mass and strength in cancer patients (see, for example, this recent review: Papadopetraki A. et al. The Role of Exercise in Cancer-Related Sarcopenia and Sarcopenic Obesity. Cancers 2023;15:5856).

      Point 2. Most of the analyses of mitochondrial function are appropriate. However, the methodological approach to determining mitochondrial fusion and fission machinery shown in Fig. 5F is wrong. The correct way is to normalize OPA1 and MFN1/2 to mitochondrial proteins such as VDAC/porin. In fact, by loading the same amount of total protein (see actin in panel 5F) the difference between a normal muscle and a muscle with enhanced protein breakdown is lost. We should expect a decrease in actin level in tumor-bearing mice with muscle atrophy, while the blots clearly show the same level due to the normalization of protein content. Moreover, by loading the same amount of protein in the gel, the atrophying muscle lysates become enriched in the proteins/organelles that are less affected by the proteolysis, resulting in an artefactual increase. The correct way should be to lyse the whole muscle of control and tumor-bearing mice in an identical volume and to load in western blot the same volume for control and cachectic muscles. Alternatively, expressing the abundance of mitochondrial shaping proteins relative to mitochondrial transmembrane or matrix proteins (mito mass) should compensate for the loading normalization. Because the authors showed elongated mitochondria despite mitophagy genes being up, fragmentation may be altered. Moreover, the DNM1l gene is suppressed and therefore DRP1 protein must be analyzed. Finally, OPA1 protein has different isoforms due to the action of proteases like OMA1 and YME1L, and these isoforms have different functions: the long one is pro-fusion while the short ones are not. The authors must quantify the long and short isoforms of OPA1.

      Response: We acknowledge that our analysis of a minor set of proteins involved in mitochondrial dynamics by Western blotting (Figure 5F) is basic and could have been improved. We thank the Reviewer for all the suggestions, which will be very useful in future projects studying the subject in greater depth and according to the molecular characteristics of each player in mitochondrial fusion, fission, mitophagy and biogenesis.

      Point 3. The comparison of proteomic and transcriptomic profiles to identify concordance or not is problematic when atrophy programs are induced. In fact, most of the transcription-dependent upregulation serves to preserve, maintain, or reestablish enzymes that are consumed during enhanced protein breakdown. For instance, the ubiquitin ligases, when activated, undergo autoubiquitination and proteasomal degradation. The same happens for several autophagy-related genes belonging to the conjugation system (LC3, Gabarap), the cargo recognition pathways (e.g. Ubiquitin, p62/SQSTM1), the selective autophagy system (e.g. BNIP3, PINK/PARKIN), and metabolic enzymes (e.g. GAPDH, lipin). Finally, if identical amounts of protein were loaded for mass spectrometry, the issue of selective enrichment raised in point 2 should be considered. Therefore, when comparing proteomic and transcriptomic data, these issues should be considered in the discussion.

      Response: We fully agree with the Reviewer that seeking concordance between transcriptomic and proteomic data in the case of an organ affected by a high level of proteolysis is a difficult business. Another major difficulty we discussed in the Discussion section of the article is the fact that there is no concordance between RNA and protein level for a good proportion of proteins, for multiple reasons, so each level of omics has to be interpreted independently to give information on the pathophysiology of the organ studied.

    1. Author Response

      We thank the editors and reviewers for taking the time to provide a critical assessment of our manuscript. We are delighted our work was found to have merit, and will revise the manuscript based on their valuable input.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations for The Authors):

      Major comments:

      1) The immunolabeling data in Figure S4 shows no change in puncta number but reduced puncta size in Kit KO. sIPSC data show reduced frequency but little change in amplitude. These data would seem contradictory in that one suggests reduced synaptic strength, but not number, and the other suggests reduced synapse number, but not strength. How do the authors reconcile these results?

      Regarding the synaptic puncta, in Kit KO (or KL KO) we have not detected an overt reduction in the average VGAT/Gephyrin/Calbindin positive puncta density or puncta size per animal. With respect to puncta size, only in the Kit KO condition, and only when individual puncta are assessed, does this modest (~10%) difference in size become statistically significant. In the revision, we eliminate this figure and focus on the per animal averages.

      We interpret that the reduction in sIPSC and mIPSC frequency likely stems from a decreased proportion of functional synapse sites. The number of MLIs, their action potential generation, the density of synaptic puncta, and the ability of direct stimulation to evoke release and equivalent postsynaptic currents are all similar in Control vs Kit KO. It is therefore feasible that a reduced frequency of postsynaptic inhibitory events is due to a reduced ability of MLI action potentials to invade the axon terminal, and/or an impaired ability for depolarization to drive (e.g. coordinated calcium flux) transmitter release. That is, while the number of MLIs and their synapses appear similar, the reduced mIPSC frequency suggests that a reduced proportion of Kit KO synapse sites function properly, or that each site is less likely to do so.

      2) Related to point 1, it would be helpful to see immunolabeling data from Kit ligand KO mice? Do these show the same pattern of reduced puncta size but no change in number?

      Although we have not added a figure, we have now added experiments and a corresponding analysis in the manuscript. As we had done previously for Kit KO, we have now conducted IHC for VGAT, Gephyrin, and Calbindin for KL KO, and we analyzed triple-positive synaptic puncta in the molecular layer of Pcp2 Cre KL KO mice and Control (Pcp2 Cre negative, KL floxed homozygous) mice. We did not find a gross reduction in the average synaptic puncta size or density, or in the PSD-95 pinceau size. From this initial analysis, it appears that the presynaptic hypotrophy is more notable in the receptor than in the ligand knockout. We speculate that this is perhaps because the Kit receptor may have basal activity in the absence of Kit ligand, because Kit may serve a presynaptic scaffolding role that is lost in the receptor (but not the ligand) knockout, or simply because the embryonic timing of the Pax2 Cre vs Pcp2 Cre recombination events is more relevant to pinceaux development, especially as basket cells are born primarily prenatally.

      3) The data using KL overexpression in PC (figure 4E,F) are intriguing, but puzzling. The reduction in sIPSC frequency and amplitude in the control PC is much greater than seen in the Kit or KL KO. The interpretation of these data, "Thus, KL-Kit levels may not set the number of MLI:PC release sites, but may instead influence the proportion of synapses that are functional for neurotransmission (Figure 4G)" is not clear and the reasoning here should be explained in more detail, perhaps in the discussion.

      We have attempted to clarify this portion of the manuscript by eliminating the cartoon of the proposed model, and by revising and adding to the discussion. Either MLI Kit KO or PC KL KO seems to preserve the absolute number of MLI:PC anatomical synapse sites (IHC) but to reduce the proportion of those synapses that are contributing to neurotransmission (mIPSC). We speculate that sparse PC KL overexpression (OX) may either 1) weaken inhibition to surrounding control PCs by diminishing KL OX PC to KL Control PC inhibition, and/or 2) act retrogradely through MLI Kit to potentiate MLI:MLI inhibition, reducing the MLI:PC inhibition at neighboring Control PCs.

      Minor comments:

      1) In the first sentence of the results, should "Figure 1A, B" be "Figure C, D"?

      Yes, corrected.

      2) The top of page 6 states "the mean mIPSC amplitude was ~10% greater in PC KL KO than in control", this does not appear to be the case in Figure 3E. control and KL KO look very similar here.

      In this portion of the text citing the modest 10% increase in mIPSC amplitude, we are referring to the average amplitude of all individual mIPSC events in the PC KL KO condition; in the figure referred to by the reviewer (3E), we are instead referring to the average of all mIPSC event amplitudes per KL KO PC. Because of the dramatic difference in sample size for individual events vs cells, this modest difference rises to statistical, if not biological, significance. We include this individual event analysis only to suggest that, since we in fact saw a slightly higher event amplitude in the KL KO condition, it is unlikely that a reduced amplitude would have been a technical reason that we detected a lower event frequency.
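
      The distinction between per-event and per-cell averaging can be sketched in a few lines of Python. This is only an illustration: the function name, the cell labels, and the amplitude values are hypothetical and are not taken from the authors' data or analysis code.

```python
from statistics import mean


def per_event_and_per_cell_means(events_by_cell):
    """Return (pooled per-event mean, mean of per-cell means, n_events, n_cells).

    events_by_cell maps a cell label to a list of event amplitudes (pA).
    """
    # Pooling treats every event as an independent sample (large n).
    all_events = [a for events in events_by_cell.values() for a in events]
    # Per-cell analysis first averages within each cell (n = number of cells).
    cell_means = [mean(events) for events in events_by_cell.values() if events]
    return mean(all_events), mean(cell_means), len(all_events), len(cell_means)


# Hypothetical amplitudes (pA) for three cells; values are illustrative only.
example = {
    "cell_1": [40.0, 60.0],
    "cell_2": [50.0, 50.0, 50.0, 50.0],
    "cell_3": [80.0],
}
pooled, per_cell, n_events, n_cells = per_event_and_per_cell_means(example)
print(pooled, per_cell, n_events, n_cells)
```

      In this toy case the two averages differ because cells contribute unequal numbers of events, and the sample size used for any statistical test differs as well (7 events versus 3 cells), which is why a small difference can reach statistical significance only in the pooled analysis.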

      3) Figure 3 D, duration, y-axis should be labelled "ms"

      Event duration is no longer graphed or referenced. This has been replaced with total inhibitory charge.

      Reviewer #2 (Recommendations For The Authors):

      Methods:

      • Pax2-Cre line: embryonal Cre lines sometimes suffer from germline recombination. Was this evaluated, and if yes, how?

      The global loss of Kit signaling is incompatible with life, as seen from perinatal lethality in other Kit Ligand or Kit mutant mouse lines or other conditional approaches. Furthermore, a loss of Kit signaling in germ cells impedes fertility. Thus, while not explicitly ruled out, since conditional Pax2 Cre mediated Kit KO animals were born, survived, and produced offspring in normal ratios, we do not suspect that germline recombination was a major issue in this specific study.

      • Include rationale for using different virus types in different studies (AAV vs. Lenti).

      This rationale is now included and reflects the intention to achieve infection sparsity in the smaller and less dense tissue of perinatal mouse brains.

      • How, if at all, was blinding performed for histological and electrophysiological experiments?

      It was not possible for electrophysiology to be conducted blinded for the Kit KO experiments, owing to the subjects’ hypopigmentation. However, whenever feasible, resultant microscopy images or electrophysiological data sets were analyzed with subjects identified only by Transnetyx Animal ID, and the genotypes were unmasked after analysis.

      • Provide justification for limiting electrophysiology recordings to lobule IV/V and why MLIs in the middle third of the molecular layer were prioritized when inhibition of PCs is dominated by large IPSCs from basket cells. Why were 2 different internals used for recording IPSCs and EPSCs in PCs and MLIs? While that choice is justified for action potential recordings, it provides poor voltage control in PC voltage clamp. Both IPSCs and EPSCs could have been isolated pharmacologically using a CsCl internal.

      The rationale for regional focus has been added to the text. For MLI action potential recordings, we opted to sample the middle third of the molecular layer so that we would not be completely biased to either classic distal stellate vs proximal basket subtypes. It is our hope, in future optogenetic interrogations, to simultaneously record the dynamics of all MLI subtypes in a more unbiased way. With respect to internal solutions, we initially utilized a cesium chloride internal to maximize our ability to resolve differences in GABAA mediated currents, which was the hypothesis-driven focus of our study. While we agree that utilizing a single internal and changing the voltage clamp to arrive at per-cell analysis of Excitatory/Inhibitory input would have been most informative, our decision to utilize pharmacological methods was driven by our experience that achieving adequate voltage clamp across large Purkinje cells was often problematic, particularly in adult animals.

      Introduction:

      In the introduction, the authors state that inactivating Kit contributes to neurological dysfunction - their examples highlight neurological, psychiatric, and neurodevelopmental conditions.

      The language has been changed.

      General:

      Using violin plots illustrates the data distribution better than bar graphs/SEM.

      We have included violin plots throughout, and we have changed p values to numeric values, both in the interest of presenting the totality of the data more clearly.

      Synapses 'onto' PCs sounds more common than 'upon' PCs.

      We have changed the wording throughout.

      Figure 1:

      1F - there seems to be an antero-posterior gradient of Kit expression.

      Though not explicitly pursued in the manuscript, it is possible that such a gradient may reflect differences in the timing of the genesis and maturation of the cerebellum along the AP axis. Regional variability is however now briefly addressed as a motivator for focused studies within lobules IV/V.

      E doesn't show male/female ratios but only hypopigmentation.

      This language has been corrected.

      Figure 2 and associated supplementary figures:

      2A/B: The frequency of sIPSCs is very high in PCs, making the detection of single events challenging. How was this accomplished? Please add strategy to the methods.

      We have added methodological detail for electrophysiology analysis.

      How were multi-peak events detected and analyzed? 'Duration' is not specified - do the authors refer to kinetics? If so, report rise and decay. It is likely impossible to show individual aligned sIPSCs with averages superimposed, given that sIPSCs strongly overlap. Alternatively, since no clear baseline can be determined in between events, and therefore frequency, amplitude, and kinetics quantification is near-impossible, consider plotting inhibitory charge.

      Given the heterogeneity of events, we now do not refer to individual event kinetics. As suggested, we have now included an analysis of the total inhibitory charge transferred by all events during the recording epoch.
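
      As a rough illustration of what total inhibitory charge means computationally, the sketch below integrates a baseline-subtracted current trace over the recording epoch. The function name, the rectangle-rule integration, and the sample values are assumptions made for illustration; the authors' actual analysis procedure is described in their methods and may differ.

```python
def total_charge_pC(current_pA, dt_s, baseline_pA=0.0):
    """Rectangle-rule integral of a baseline-subtracted current trace.

    current_pA: sampled current values in pA
    dt_s: sampling interval in seconds
    Returns charge in pC (1 pA x 1 s = 1 pC).
    """
    return sum(i - baseline_pA for i in current_pA) * dt_s


# Hypothetical trace: two samples of -100 pA at 1 kHz sampling.
trace_pA = [0.0, -100.0, -100.0, 0.0]
charge = total_charge_pC(trace_pA, dt_s=0.001)
print(charge)
```

      Because the integral does not require separating overlapping events, it avoids the event detection and baseline problems raised by the reviewer, at the cost of no longer distinguishing frequency from amplitude.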

      S2: Specify how density, distribution, and ML thickness were determined in methods. How many animals/cells/lobules?

      For consistency with viral injections and electrophysiology, the immunohistochemical analysis was restricted to lobule IV/V. This is clearer in the revision and detail is added in the methods.

      S3:

      S3B: the labels of Capacitance and Input resistance are switched.

      This has been corrected.

      How were these parameters determined? Add to methods.

      Added

      In the previous figure the authors refer to 'frequency', in this figure to 'rate' - make consistent

      This has been corrected.

      D: example does not seem representative. Add amplitude of current pulse underneath traces.

      We added new traces from nearer the group means and we now include the current trace.

      F/G example traces (aligned individual events + average) are necessary.

      We added example traces near the relevant group means for each condition.

      Statement based on evoked IPSCs that 'synapses function normally' is a bit sweeping and can only be fully justified with paired recordings. Closer to the data would be that the release probability of individual synapses is similar between control and Kit KO.

      Paired recordings in both Kit Ligand and Kit receptor conditional knockout conditions are indeed an informative aim of future studies, should support permit. For now, we have clarified the language to be more in line with the reviewer’s welcome suggestion.

      S4:

      Histological strategy cannot unambiguously distinguish MLI-PC and PC-PC synapses. Consider adding this confound to the text.

      We have added this confound to the discussion.

      The observation that the pinceau is decreased in size could have important implications for ephaptic coupling of MLI and PC and could be mentioned.

      We agree and have added this notion to the discussion.

      Y-label is missing in B.

      Corrected.

      Figure 3 and associated supplementary figures:

      In the text, change PC-Cre to L7-Cre or Pcp2-Cre.

      Changed

      How do the authors explain a reduction in frequency, amplitude, and duration of sIPSCs in the KL KO but not in the Kit KO? Add to the discussion

      We now address this apparent discordance in the discussion. Pax2 Cre mediates recombination weeks ahead of Pcp2 Cre. We therefore suspect that postnatal PC KL KO may be more phenotypic than embryonic MLI Kit KO because there is less time for developmental compensation. A future evaluation of the impact of postnatal Kit KO would be informative to this end.

      As in Figure 2, plotting the charge might be more accurate.

      We now plot total charge transfer.

      Are the intrinsic properties in KL KO PCs altered? (Spontaneous firing, capacitance, input resistance).

      We have added to the text that we found no difference in capacitance or input resistance between Purkinje cells from KL floxed homozygous Control animals versus those from KL floxed homozygous, PCP2 Cre positive KL KO animals. We plan to characterize both basal and MLI modulated PC firing in a future manuscript. Since Pcp2 Cre mediated KL KO seems more phenotypic than Pax2 Cre mediated Kit KO, we agree that it seems a better testbed for investigating differences in both basal PC firing and its MLI-mediated modulation.

      3D-F - Example traces would be desirable (see above, analogous to Fig. 2).

      More example traces have been added.

      Figure 4: 'In vivo mixtures' sounds unusual. Consider revision (e.g., 'to sparsely delete KL').

      Changed

      The observation that control PC sIPSC frequency is lower in KL OX PCs than in sham is interesting. This observation would be consistent with overall inhibitory synapse density being preserved. This could be evaluated with immunohistochemistry. For how far away from the injection area does this observation hold true?

      Because we have now analyzed and failed to find an overt (per animal average) change in synaptic puncta size or density in the whole animal Control vs PCP2 Cre mediated KL KO conditions, we do not have confidence that it is feasible to pursue this IHC strategy in the sparse viral-mediated KL KO or OX conditions. To the reviewer’s valid point however, we intend to probe the spatial extent/specificity of the sparse phenomenon when we are resourced to complement the KL/Kit manipulations with transgenic methods for evaluating MLI-PC synapses specifically, potentially by GRASP or related methods that would not be confounded by PC-PC synapses. Transgenic MLI access would also facilitate determining the spatial extent to which opto-genetically activated MLIs evoke equivalent responses in Control vs KL manipulated PCs.

      Y-legend in D clipped.

      Corrected

      Existing literature suggests that MLI inhibition regulates the regularity of PC firing - this could be tested in Kit and KL mutants.

      For now, based upon transgenic animal availability, we have included an evaluation of PC firing in the (Pax2 Cre mediated) Kit KO condition. PC average firing frequency, mean ISI, and ISI CV2 were not significantly different across genotypes. A KS test of individual ISI durations for Control vs Kit KO did reveal a difference (p<0.0001). We have added a supplementary figure (S6) with these data. It is possible that we may find a difference in these PC spiking patterns in the more phenotypic PC KL KO condition; however, we are also eager to test in future studies whether postnatal KL or Kit KO impairs the ability of MLI activation to produce pauses or other alterations in PC firing or in PF-PC mediated plasticity.
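
      For readers unfamiliar with the regularity measures named above, a minimal sketch follows. It assumes the commonly used local definition of CV2, the mean of 2|ISI(n+1) - ISI(n)| / (ISI(n+1) + ISI(n)) over adjacent interval pairs. The function names and spike times are hypothetical, and the authors' own implementation is not described here.

```python
from statistics import mean


def inter_spike_intervals(spike_times):
    """Differences between consecutive spike times."""
    return [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]


def cv2(intervals):
    """Mean CV2 over adjacent ISI pairs (assumed definition; needs >= 2 ISIs)."""
    pair_values = [
        2 * abs(b - a) / (a + b) for a, b in zip(intervals, intervals[1:])
    ]
    return mean(pair_values)


# Hypothetical spike times in arbitrary time units.
regular_isis = inter_spike_intervals([0, 1, 2, 3])     # perfectly regular
irregular_isis = inter_spike_intervals([0, 1, 4, 5])   # alternating short/long
print(cv2(regular_isis), cv2(irregular_isis))
```

      A perfectly regular train gives CV2 of 0, and values grow as adjacent intervals become more dissimilar, so the measure is sensitive to local irregularity rather than to slow drifts in firing rate.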

      Reviewer #3 (Recommendations For The Authors):

      Reference to Figure 1A in the Results section is slightly inaccurate. Kit gene modifications are illustrated in Figures 1A, B. Where Figure 1A shows Kit distribution. Please rephrase. Relatedly, the reference to Figs 1B - D are shifted in the results section, and 1E is skipped.

      We have changed the text.

      Please show cumulative histograms for frequency too for consistency with amplitude (e.g. Fig 2).

      We have instead, for reasons outlined by other reviewers, documented total charge transfer for both Kit KO and KL KO experiments where sIPSC events were analyzed.

      Fig S3: include example traces of PPR.

      This is now included.

      Include quantifications of GABAergic synapse density in Fig S4.

      This is now included.

      Include inset examples of KO in Fig S4A.

      This is now included.

      Add average puncta size graphs along Figure S4B. The effect apparent in the histogram of S4B is small and statistics using individual puncta as n values (in the 20,000s) therefore misleading.

      Per animal analysis is now instead included in the figure and text.

      Figure S4B y axis label blocked.

      Corrected

      Include quantification referenced in "As PSD95 immunoreactivity faithfully follows multiple markers of pinceaux size 40, we quantified PSD95 immunoreactive pinceau area and determined that pinceaux area was decreased by ~50% in Kit KO (n 26 Control vs 43 Kit KO, p<0.0001, two-tailed t-test)."

      We added a graph of per animal averages, instead of reporting individual pinceau areas in the text.

      Include antibody dilutions in the methods.

      Added.

      It's unclear from the text where the Mirow lab code comes from.

      Detail has now been added in text.

      Typo in methods "The Kit tm1c alle was bred...".

      Corrected

      Typo in Figure S4 legend "POSD-95 immuno-reactivity".

      Corrected

    1. Author Response

      The following is the authors’ response to the original reviews.

      First of all, we'd like to thank the three reviewers for their meticulous work, which has enabled us to present an improved manuscript. Substantial changes were made to the article following the reviewers' and editors' recommendations. We read all their comments and suggestions very carefully. Apart from a few misunderstandings, all comments were very pertinent. We responded positively to almost all the comments and suggestions, and as a result, we have made extensive changes to the document and the figures. This manuscript now contains 16 principal figures and 15 figure supplements.

      The number of principal figures is now 16 (1 new figure), and additional panels have been added to certain figures. On the other hand, we have added 7 additional figures (supplement figures) to answer the reviewers' questions and/or comments.

      Main figures

      ▪ Figures 1, 4, 5, 10, 11, 12, 13, 14: unchanged

      ▪ Figures 7 and 8 were switched.

      ▪ Figure 2: we added panel F in response to reviewer 3's request for sperm defect statistics

      ▪ Figure 3: the contrast in panel B has been reworked to homogenize colors

      ▪ Figure 6: This figure was recomposed. The WB on testicular extract was removed and we present a new WB allowing comparison of the presence of CCDC146 in the flagella fraction. Using an anti-HA Ab, we demonstrate that the protein is localized in the flagella in epididymal sperm. Requested by all 3 reviewers.

      ▪ Figure 7 (old 8): to avoid the issue of the non-specificity of secondary antibodies, we performed a new set of IF experiments using an HA Tag Alexa Fluor® 488-conjugated Antibody (anti-HA-AF488-C Ab) on WT and HA-CCDC146 sperm. These results are now presented in figure 7 panel A (new). The specificity of the signal obtained with the anti-HA-AF488-C Ab on mouse spermatozoa was evaluated by performing a statistical study of the density of dots in the principal piece of the flagellum from HA-CCDC146 and WT sperm. These results are now presented in figure 7 panel B (new). This study was carried out by analyzing 58 WT spermatozoa and 65 CCDC146 spermatozoa coming from 3 WT and 3 KI males. We found a highly significant difference, with a p-value <0.0001, showing that the signal obtained on spermatozoa expressing the tagged protein is highly specific. We have added a paragraph in the MM section to describe the process of image analysis. We finally present new images obtained by ExM showing no staining in the midpiece (figure 7C new). Altogether, these results demonstrate unequivocally the presence of the protein in the flagellum. Moreover, the WB was removed and is now presented in figure 6 (improved as requested).

      ▪ Figure 8: formerly figure 7

      ▪ Figure 9: this figure was recomposed and improved for increased clarity, as suggested by reviewers 2 and 3.

      ▪ Figure 16: formerly appendix 11

      Figure supplements and supplementary files

      ▪ Figure 1-Figure supplement 1 New. Sperm parameters of the 2 patients. Requested by the editor (remark #1) and by reviewer 1 (Note #3)

      ▪ Figure 2-Figure supplement 1 new. Sperm parameters of the line 2 (KO animals) requested by the reviewer 1 (Note #5)

      ▪ Figure 4-Figure supplement 1 New. Experiment to evaluate the specificity of the human CCDC146 antibody. Minimal revision request and reviewer 1 note #8

      ▪ Figure 6-Figure supplement 1 New. Figure recomposed; Asked by reviewer 2 note #4 and reviewer 3

      ▪ Figure 8-Figure supplement 1 New. We now provide new images to show the non-specific staining of the midpiece of human sperm by secondary Abs in ExM experiments; Asked by reviewer 2

      ▪ Figure 10-Figure supplement 1 New. We added new images to show the non-specific staining of the midpiece of mouse sperm by secondary Abs in IF (panel B). Reviewer 1 note #9 and reviewer 2 note #5

      ▪ Figure 12-Figure supplement 1 New. Control requested by reviewer 3 Note #23

      ▪ Figure 13-Figure supplement 1 New. We provide a graph and a statistical analysis demonstrating the increase of the length of the manchette in the Ccdc146 KO. Requested by editor and reviewer 3 Note 24

      ▪ Figure 15-Figure supplement 1 New. Control requested by reviewer 2. Minor comments

      ▪ Figure supplementary 1 New. Answer to question requested by reviewer 2 note #1

      All the reviewers' and editors’ comments have been answered (see our point-by-point response) and we resubmit what we believe to be a significantly improved manuscript. We strongly hope that we meet all your expectations and that our manuscript will be suitable for publication in "eLife". We look forward to your feedback.

      Point by point answer

      Please note that there has been active discussion of the manuscript, and the points summarized below are the minimal revision request that the reviewers think the authors should address even under this new review model system. It was the reviewers' consensus that the manuscript was prepared with a lot of oversights - please see all the minor points to improve your manuscript.

      All minimal revision requests have been addressed

      Minimal revision request

      1) Clinical report/evaluation of the two patients should be given as it was not described even in their previous study as well as full description of CCDC146.

      We now provide a new Figure 1-figure supplement 1 describing the patients' sperm parameters.

      2) Antibody specificity should be provided, especially given two of the reviewers were not convinced that the mid piece signal is non-specific as the authors claim. As both KO and KI model in their hands, this should be straightforward.

      To validate the specificity of the antibody, we transfected HEK cells with a human DDK-tagged CCDC146 plasmid and performed a double immunostaining with a DDK antibody and the CCDC146 antibody. We show that both stainings are superimposable, strongly suggesting that the CCDC146 Ab specifically targets CCDC146. This experiment is now presented in Figure 4-Figure supplement 1. Next, to avoid the issue of the non-specificity of secondary antibodies, we performed a new set of IF experiments using an HA Tag Alexa Fluor® 488-conjugated Antibody (anti-HA-AF488-C Ab) on WT and HA-CCDC146 sperm. These results are now presented in figure 7 panel A (new). The specificity of the signal obtained with the anti-HA-AF488-C Ab on mouse spermatozoa was evaluated by performing a statistical study of the density of dots in the principal piece of the flagellum from HA-CCDC146 and WT sperm. These results are now presented in figure 7 panel B (new). This study was carried out by analyzing 58 WT spermatozoa and 65 CCDC146 spermatozoa coming from 3 WT and 3 KI males. We found a highly significant difference, with a p-value <0.0001, showing that the signal obtained on spermatozoa expressing the tagged protein is highly specific. We have added a paragraph in the MM section to describe the process of image analysis. We finally present new images obtained by ExM showing no staining in the midpiece (figure 7C new). Altogether, these results demonstrate unequivocally the presence of the protein in the flagellum.

      3) The authors should improve the statistical analysis supporting their experimental results so that the reader can make a fair assessment. Combined with the need for a clear demonstration of Ab specificity, this lack of statistical analysis, with very few samples, is a major driver of dampened enthusiasm for the current study.

      Several statistical analyses were carried out and are now included:

      1) distribution of the HA signal in mouse sperm cells (see point 2 Figure 7 panel B)

      2) quantification and statistical analyses of the defect observed in Ccdc146 KO sperm (figure 2 panel E)

      3) Quantification and statistical analyses of the length of the manchette in spermatids 13-15 steps (Figure 13-Figure supplement 1 new)

      4) The authors need to clarify (peri-centriolar vs. centriole)

      In figure 4A, we have clearly shown that the protein colocalizes with centrin, a centriolar core protein in somatic cells. This colocalization strongly suggests that CCDC146 is a centriolar protein, and this is now clearly indicated in lines 211-212. However, its localization is not restricted to the centrioles and a clear staining was also observed in the pericentriolar material (PCM). The presence of a protein in both PCM and centriole has already been described, and the best example is perhaps gamma-tubulin (PMID: 8749391).

      or tone down (CCDC146 to be a MIP) of their claim/description.

      Concerning its localization in sperm, we agree with the reviewer that demonstrating that CCDC146 is a MIP would require more results. Because of that, we have toned down the MIP hypothesis throughout the manuscript. See lines 491-495.

      Testis-specific expression of CCDC146 as it is not consistent with their data.

      We have also modified our claim concerning the testis-expression of CCDC146. Line 176

      Reviewer #1 (Recommendations For The Authors):

      Major comments

      1) As described in general comments, this study limits how the CCDC146 deficiency impairs abnormal centriole and manchette formation. The authors should explain their relationship in developing germ cells.

      In fact, there is limited information about the relationship between the manchette and the centriole. However, a few articles have highlighted that both organelles share molecular components. For instance, WDR62 is required for centriole duplication in spermatogenesis and manchette removal in spermiogenesis (Commun Biol. 2021; 4: 645. doi: 10.1038/s42003-021-02171-5). Another study demonstrates that CCDC42 localizes to the manchette, the connecting piece and the tail (Front. Cell Dev. Biol. 2019 https://doi.org/10.3389/fcell.2019.00151). These articles underline that centrosomal proteins are involved in manchette formation and removal during spermiogenesis and support our results showing the impact of the lack of CCDC146 on centriole and manchette biogenesis. This information is now discussed. See lines 596-603.

      2) The authors generated a knock-in mouse model. If so, can the transgene rescue the MMAF phenotype in CCDC146-null mice? This reviewer strongly suggests testing this to clearly support the pathogenicity of CCDC146.

      We indeed wrote that we created “transgenic mice”, which was misleading. We actually created a CCDC146 knock-in expressing a tagged protein. The strain was made by CRISPR-Cas9, and a sequence coding for the HA-tag was inserted just before the first amino acid in exon 2, leading to the translation of an endogenous HA-tagged CCDC146 protein. We have removed the word transgenic from the text and made changes accordingly (see lines 250-253). We therefore cannot use this strain to rescue the MMAF phenotype as suggested by the reviewer.

      3) Although the authors cite the previous study (Coutton et al., 2019), the study does not describe any information for CCDC146 and clinical information for the patients. The authors must show the results for clinical analysis to clarify the attended patients are MMAF patients without other phenotypic defects.

      We have now inserted a table indicating all sperm parameters for the patients harboring a mutation in the CCDC146 gene (Figure 1-Figure supplement 1), and this is now indicated in lines 159-160.

      4) The authors describe CCDC146 expression as dominant in testes. However, the level in testis is only moderate in human (Supp Figure 1). Thus, this description is not suitable.

      In Figure 1-figure supplement 2 (old FigS1), the median of expression in testis is around 12 in human, a value considered as high expression by the analysis software from Genevestigator. However, for mouse, it is true that the level of expression is medium. We assumed that reviewer’s comment concerned testis expression in mouse. To take into account this remark, we changed the text accordingly. See line 176.

      5) Although the authors mentioned that two mice lines are generated, only one line information is provided. Authors must include information for another line and provide basic characterization results to support the shared phenotype within the lines.

      We now provide a revised Figure 2-figure supplement 1CD, presenting the second line and the corresponding text in the main text is found lines 178-183.

      6) In somatic cells, the CCDC146 localizes at both peri-centriole and microtubule but its intracellular localization in sperm is distinguished. The authors should explain this discrepancy.

      The multi-localization of a centriolar protein is already discussed in detail in discussion lines 520-526. We have written:

      “Despite its broad cellular distribution, the association of CCDC146 with tubulin-dependent structures is remarkable. However, centrosomal and axonemal localizations in somatic and germ cells, respectively, have also been reported for CFAP58 [37, 55], thus the re-use of centrosomal proteins in the sperm flagellar axoneme is not unheard of. In addition, 80% of all proteins identified as centrosomal are found in multiple localizations (https://www.proteinatlas.org/humanproteome/subcellular/centrosome). The ability of a protein to home to several locations depending on its cellular environment has been widely described, in particular for MAP. The different localizations are linked to the presence of distinct binding sites on the protein…. “

      7) Authors mention CCDC146 is a centriolar protein in the title and results subtitle. However, the description in results part depicts CCDC146 is a peri-centriolar protein, which makes confusion. Do the authors claim CCDC146 is centrosomal protein?

      In figure 4A, we have clearly shown that the protein colocalizes with centrin, a centriolar core protein. This colocalization strongly suggests that CCDC146 is a centriolar protein in somatic cells, which is now clearly indicated at lines 211-212. However, its localization is not restricted to the centrioles, and clear staining was also observed in the pericentriolar material (PCM). The presence of a protein in both the PCM and the centriole has already been described; perhaps the best example is gamma-tubulin (PMID: 8749391).

      8) Verification of the antibody against CCDC146 must be performed and shown to support the observed signal are correct. 2nd antibody only signal is not proper negative control.

      This is a very important remark. The commercial antibody raised against human CCDC146 was validated in HEK293 cells expressing a DDK-tagged CCDC146 protein. Cells were co-stained with anti-DDK and anti-CCDC146 antibodies, and we observed a perfect colocalization of the two signals. This experiment is now presented in Figure 4-figure supplement 1 and described in the text (lines 206-208).

      9) In human sperm, conventional immunostaining reveals CCDC146 is detected from acrosome head and midpiece. However, in ExM, the signal at acrosome is not detected. How is this discrepancy explained? The major concern for the ExM could be physical (dimension) and biochemical (properties) distortion of the sample. Without clear positive and negative control, current conclusion is not clearly understood. Furthermore, it is unclear why the authors conclude the midpiece signal is non-specific. The authors must provide experimental evidence.

      Acrosomal staining in sperm should always be interpreted with caution. Indeed, numerous glycosylated proteins involved in sperm attachment are present at the surface of the plasma membrane facing the outer acrosomal membrane, and they are a frequent source of nonspecific staining. Moreover, this acrosomal staining was not observed in mouse sperm, strongly suggesting that it is not specific.

      Concerning the staining in the midpiece observed in both conventional and Expansion microscopy, it also seems to be nonspecific and associated with secondary Abs.

      For IF, we now provide new images showing clearly the nonspecific staining of the midpiece when secondary Ab were used alone (see Figure 10-figure supplement 1B).

      For ExM, we provide new images in Figure 8-figure supplement 1B (POC5 staining) showing a staining of the midpiece (likely mitochondria), although POC5 was never described to be present in the midpiece. Both experiments (CCDC146 and POC5 staining by ExM) shared the same secondary Ab and the midpiece signal was likely due to it.

      Moreover, we now provide new ExM images of mouse sperm (figure 7C) showing no staining in the midpiece and demonstrating that the punctate signal is present all along the flagellum. Finally, we would like to underline that we now provide new IF results, obtained with an anti-HA antibody conjugated with Alexa Fluor 488, that confirm the ExM results.

      These points are now discussed lines 498-502 for acrosome and lines 503-511 for midpiece staining.

      10) For intracellular localization of the CCDC146 in mouse sperm, the authors should provide clear negative control using WT sperm which do not carry the transgene.

      This experiment was performed.

      To avoid the issue of the non-specificity of secondary antibodies, we performed a new set of IF experiments using an HA Tag Alexa Fluor® 488-conjugated Antibody (anti-HA-AF488-C Ab) on WT and HA-CCDC146 sperm. These results are now presented in figure 7 panel A (new). The specificity of the signal obtained with the anti-HA-AF488-C Ab on mouse spermatozoa was evaluated by performing a statistical study of the density of dots in the principal piece of the flagellum from HA-CCDC146 and WT sperm. These results are now presented in figure 7 panel B (new). This study was carried out by analyzing 58 WT spermatozoa and 65 CCDC146 spermatozoa coming from 3 WT and 3 KI males. We found a highly significant difference, with a p-value <0.0001, showing that the signal obtained on spermatozoa expressing the tagged protein is highly specific. We have added a paragraph in the MM section to describe the process of image analysis. We finally present new images obtained by ExM showing no staining in the midpiece (figure 7C new). Altogether, these results demonstrate unequivocally the presence of the protein in the flagellum.
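      The comparison of dot densities between HA-CCDC146 and WT flagella described above can be illustrated with a two-sample permutation test. This is only a sketch: the response does not state which statistical test was used, the function name `permutation_p_value` is hypothetical, and the density values in the usage note are invented placeholders, not the measured data.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Hypothetical helper: the test actually used by the authors is not
    stated in the response, so this is an assumption for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # small tolerance guards against floating-point rounding
        if diff >= observed - 1e-12:
            extreme += 1
    # add-one correction keeps the estimate strictly above zero
    return (extreme + 1) / (n_permutations + 1)
```

      For example, calling `permutation_p_value(ki_densities, wt_densities)` with one dot-density value per spermatozoon (here, 65 HA-CCDC146 and 58 WT values would be expected) returns an estimated p-value; a small value indicates that the difference between groups is unlikely to arise from random labelling.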

      11) Current imaging data do not clearly support the intracellular localization of the CCDC146. Although western blot imaging reveal that CCDC146 is detected from sperm flagella, this is crude approach. Thus, this reviewer highly recommends the authors provide more clear experimental evidence, such as immuno EM.

      We now provide a WB comparing the presence of the protein in the flagellum and head fractions; see new figure 6. We show that CCDC146 is present only in the flagellum fraction. The band was detected very quickly during visualization and became very strong after a few minutes, demonstrating that the protein is abundant in the flagella. It is important to note that epididymal sperm do not have centrioles, and therefore this signal is not a centriolar signal. We also now provide new statistical analyses showing that the immunostaining observed in the principal piece is very specific (Figure 7B). Altogether, these results demonstrate unequivocally the intracellular localization of CCDC146 in the flagellum. This point is now discussed at lines 480-489.

      12) Although sarkosyl is known to dissociate tubulin, it is not well understood and accepted that the enhanced detection of CCDC146 by the detergent indicates its microtubule inner space. Sperm axoneme to carry microtubule is also wrapped peri-axonemal components with structural proteins, which are even not well solubilized by high concentration of the ionic detergent like SDS.

      We agree with the reviewer that the solubilization of the protein by sarkosyl is not proof of the presence of the protein inside microtubules. Taking this point into account, we have toned down the MIP hypothesis and now discuss alternative hypotheses concerning these results. See discussion lines 490-497.

      13) SEM image is not suitable to explain internal structure (line 317-323).

      We agree with the reviewer, and changes were made accordingly. See lines 354-357.

      Minor comments

      1) In main text, supplementary figures are cited "Supp Figure". And the corresponding legends are written in "Appendix - Figure". Please unify them.

      Done. These are now labelled “Figure X-figure supplement Y”.

      2) Line 159, "exon 9/19" is not clear.

      We have now written “exon 9” and indicated earlier that the gene contains 19 exons.

      3) Line 188, "positive cells" are vague.

      “Positive” was changed to “fluorescent”.

      4) Representative TUNEL assay image for knockout testes were not shown in Supp Figure 3B.

      This was a mistake; the image is now shown in Figure 2-figure supplement 2C.

      5) Please provide full description for "IF" and "AB" when described first.

      Done

      6) Line 262, It is unclear what is "main piece".

      Changed to principal piece

      7) Line 340, Although the "stage" information might be applicable, this is information for "seminiferous tubule" rather than "spermatid". This reviewer suggests to provide step information rather than stage information.

      We agree with the reviewer that there was confusion between “stage” and “step”. We changed the text to “step” spermatids.

      8) Line 342, Step 1 is not correct in here.

      Corrected; the text now reads “steps 13-15 spermatids”.

      9) Line 803, "C." is duplicated.

      Removed

      10) Figure 3A, it will be good to mark the defective nuclei which are described in figure legends.

      These cells are now indicated by white arrow heads

      11) Figure 5, Please provide what MT stands for.

      Now explained in the legend of figure 5

      12) Figure 6. Author requires clear blot images for C. In addition, Panel B information is not correct. If the blot was performed using HA antibody, then how "WT" lane shows bands rather than "HA" bands?

      The reviewer is correct. It was a mistake. The figure was recomposed and improved.

      Reviewer #2 (Recommendations For The Authors):

      Overall, editing oversights are present throughout the manuscript, which has made the review process quite difficult. Some repetitive figures can be removed to streamline to grasp the overall story easier. Some claims are not fully supported by evidence that need to tone down. Some figures not referenced in the main text need to be mentioned at least once.

      All figures are now referenced in the text

      Major comments:

      1) 163-164 - Please clarify the claim that there is going to be an absence of the protein or nonfunctional protein, especially for the patient with a deletion that could generate a truncated protein at two third size of the full-length protein. Similarly, 35% of the protein level is present for the patient with a nonsense mutation. Some in silico structural analysis or analysis of conserved domains would be beneficial to support these claims.

      Both mutations are predicted to produce a premature stop codon, p.Arg362Ter and p.Arg704SerfsTer7, leading either to the complete absence of the protein in the case of nonsense-mediated mRNA decay or to the production of a truncated protein missing almost two thirds or one fourth of the protein, respectively. CCDC146 is very well conserved throughout evolution (Figure supplementary 1), including the C-terminal end of the protein, which contains a large coiled-coil domain (Figure 1B). In view of the very high degree of conservation, it is most likely that the C-terminal end of the protein, absent in both subjects, is critical for CCDC146 function and hence that both mutations are deleterious. This explanation is now added to the discussion. See lines 439-448.
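      The fractions quoted for the truncated proteins can be checked with simple arithmetic. The sketch below assumes a full-length human CCDC146 of 955 amino acids, a value that is not stated in this response and should be verified against the reference sequence; the function name is hypothetical. Only native residues upstream of the first altered position are counted as retained.

```python
def fraction_missing(first_lost_residue: int, protein_length: int) -> float:
    """Fraction of the native protein sequence lost after truncation.

    first_lost_residue: position of the first residue that is replaced by a
    stop codon or altered by a frameshift (e.g. 362 for p.Arg362Ter).
    protein_length: full length in amino acids (955 is an ASSUMPTION here).
    """
    if not 1 <= first_lost_residue <= protein_length:
        raise ValueError("residue position outside the protein")
    retained = first_lost_residue - 1  # native residues kept upstream
    return (protein_length - retained) / protein_length


ASSUMED_LENGTH = 955  # assumption, not taken from the response

for label, position in (("p.Arg362Ter", 362), ("p.Arg704SerfsTer7", 704)):
    lost = fraction_missing(position, ASSUMED_LENGTH)
    print(f"{label}: {lost:.1%} of the native sequence missing")
```

      Under this assumed length, the two variants remove roughly 62% and 26% of the native sequence, which is in line with the "almost two thirds" and "one fourth" quoted above.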

      2) 173, 423 - Please clearly state a rationale of your mouse model design (i.e., why a mouse model that recapitulate human mutation is not generated) as the truncations identified in human patients are located further towards the C-terminus, and it is not clear whether truncated proteins are present, and if so, they could still be functional. Basically, the current mouse model supports the causality of the human mutations.

      This is an important question, which goes beyond the scope of this article and raises the question of how to confirm the pathogenicity of mutations identified by high-throughput sequencing. The production of KO or KI animals is an important tool to help confirm one’s suspicions, but the first element to take into consideration is the nature of the genetic data.

      Here we had two patients with homozygous truncating variants. In humans, it is well established that the presence of a premature stop codon usually induces nonsense-mediated mRNA decay (NMD), leading to the complete absence of the protein or a strong reduction in protein production. In the unlikely absence of NMD in our two patients, the identified variants would induce the production of proteins missing 60% and 30% of their C-terminal part. Often (and this is particularly true for structural proteins), the production of abnormal proteins is more deleterious than the complete absence of the protein (and limiting the production of abnormal “toxic” proteins is most likely the purpose of NMD). For these reasons, to try to recapitulate the most likely consequences of the human variants without risking an even more severe effect, we decided to introduce a stop codon in the first exon in order to remove the totality of the protein in the KO mice.

      The second element is to interpret the phenotype of the KO animals. Here, the human sperm phenotype is perfectly recapitulated in the KO mice.

      Overall, we have strong genetic arguments in human and the reproduction of the phenotype in KO mice confirming the pathogenicity of the variants identified in men.

      This point is now discussed; see lines 433-438.

      3) Figure 6A - the labelling is misleading as it seems to suggest that the specific cells were isolated from the testes for RT-PCR.

      We have modified the labelling to avoid any confusion.

      Figure 6B -Signal of HA-tag is shown in WT, not in transgenic. Please check the order of the labels. Figure 6C - This blot is NOT a publication-quality figure. The bands are very difficult to observe, especially in lane D18. Because it is one of the important data of this study, replacing this figure is a must.

      The figure has been completely remade, including new results. See new figure 6. Figure 6C was removed.

      4) Supplementary fig 6 is also not a publication-level figure, and the top part seems largely unnecessary (already in the figure legend).

      The figure has been completely remade as well (now Figure 6-Figure Supplement 1).

      5) 261/267- The conclusion that mitochondrial staining in the flagellum (in both mice and humans) is non-specific is not convincing. Supplementary fig 8 shows that the signal from secondary only IF possibly extends beyond the midpiece - but it is hard to determine as no mitochondrial-specific staining is present. Either need to tone down the conclusion or provide supporting experimental evidence.

      First, to avoid the issue of the non-specificity of secondary antibodies, we performed a new set of IF experiments using an HA Tag Alexa Fluor® 488-conjugated Antibody (anti-HA-AF488-C Ab) on WT and HA-CCDC146 sperm. These results are now presented in figure 7 panel A (new). The specificity of the signal obtained with the anti-HA-AF488-C Ab on mouse spermatozoa was evaluated by performing a statistical study of the density of dots in the principal piece of the flagellum from HA-CCDC146 and WT sperm. These results are now presented in figure 7 panel B (new). This study was carried out by analyzing 58 WT spermatozoa and 65 CCDC146 spermatozoa coming from 3 WT and 3 KI males. We found a highly significant difference, with a p-value <0.0001, showing that the signal obtained on spermatozoa expressing the tagged protein is highly specific. We have added a paragraph in the MM section to describe the process of image analysis. We finally present new images obtained by ExM showing no staining in the midpiece (figure 7C new). Altogether, these results demonstrate unequivocally the presence of the protein in the flagellum. These experiments are now described lines 271-279

      Second, we provide new images of the signal obtained with secondary Abs only, which show more clearly that the secondary Ab gives a non-specific staining (Figure 10-Figure supplement 1B). This point is discussed at lines 503-511.

      6) Figure 9 A - Please relate the white line to Fig. 9B label in X-axis. The information from Fig 9A+D and 9E+F are redundant. The main text nor the figure legends indicate why these specific two sperm were chosen for quantification and demonstrating the outcomes. One of them could be moved to supplementary information or removed, or the two could be combined.

      As suggested by the reviewer, we have combined the two sperm to demonstrate that CCDC146 staining is mostly located on microtubule doublets. Moreover, the figure was recomposed to make it clearer.

      Minor comments:

      All of the supplementary figures are referred to as Supp Fig X in the text, however, they are actually titled Appendix - Figure X. This needs to be consistent.

      The figures are now referred to as “figure supplement x” in both text and figures.

      Line 125 - edit spacing.

      We think this issue (long internet link) will be curated later and more efficiently by the journal, during the step of formatting necessary for publication.

      144 - With which to study → with which we studied?

      We made the change as suggested.

      151 - Supp Fig 1 - the text says that the gene is highly transcribed in human and mouse testes, but the information in the figure states that the level in mouse tissues is "medium"

      We have corrected this mistake in the text. See line 176.

      165 - The two mutations are most likely deleterious. Please specifically mention what analyses done to predict the deleterious nature to support these claims.

      Both variants, c.1084C>T and c.2112del, are extremely rare in the general population, with reported allele frequencies of 6.5 × 10⁻⁵ and 6.5 × 10⁻⁶, respectively, in gnomAD v3. Moreover, these variants are annotated as having a high impact on the protein structure (MoBiDiC prioritization algorithm (MPA) score = 10, DOI: 10.1016/j.jmoldx.2018.03.009), and each is predicted to induce a premature termination codon, p.(Arg362Ter) and p.(Arg704SerfsTer7), respectively, leading to the production of a truncated protein. This information is now given at lines 164-169.

      196-200/Figure 4 - As serum starved cells/basal body (B) are not mentioned in the main text, as is, Fig 4A would be sufficient/is relevant to the text. Please make the text reflect the contents of the whole figure, or re/move to supplement.

      We agree with the reviewer that the full description of the figure should be in the text. We added two sentences to describe figure 4B; see lines 217-218.

      224 - spermatozoa (plural) fits better here, not spermatozoon

      OK changed accordingly

      236 - According to the figure legend, 6B is only showing data from the epididymal sperm, not postnatal time points; should be referencing 6C. Alignment of Marker label

      As indicated above, the figure has been completely remade, including new results. See new figure 6. Figure 6C was removed. The corresponding text was changed accordingly; see lines 249-266.

      255-256 - Referenced figure 7B3, however, 7B3 only shows tubulin staining, so no CCDC146 can be observed. Did authors mean to reference fig 7B as a whole?

      Sorry for this mistake. We agree, and the text now refers to figure 8B6 (figures 7 and 8 were switched).

      305 - "of tubules" - I presume it is meant to be microtubules?

      Yes. The text was changed as suggested.

      317-321 - a diagram of HTCA would be useful here

      We have added a reference in which a diagram of the HTCA is available; see line 363. Moreover, a TEM view of the HTCA is presented in figure 12A.

      322/Fig 11A - an arrow denoting the damage might be useful, as A1 and A3 look similar. The size of the marker bar is missing. Please update the information on figure legend.

      Concerning the comparison between A1 and A3, the take-home message is that there is great variability in the morphological damage. This point is now underlined in the corresponding text. We updated the size of the marker bar as suggested (200 nm). See lines 365-367.

      323 - Please mark where capitulum is in the figure

      “Capitulum” was changed to “nucleus”.

      Since Fig 11B2 is not referenced in the main text, it does not seem to add anything to the data, and could be removed/moved to supplement.

      We added a sentence to describe figure 11B2 line 370

      342-343 - manchette in step I is not seen clearly - the figure needs to be annotated better. However, DPY19L2 is absent in step I in the KO, but the main text does not reflect that - why is that?

      We do not understand the remark of the reviewer “manchette in step I is not seen clearly”. The figure shows clearly the manchette (red signal) in both WT and KO (Figure 13 D1/D2).

      In steps 13-15 WT spermatids, the size of the manchette decreases and it becomes undetectable. In KO spermatids, the shrinkage of the manchette is hampered and the manchette instead continues to expand (Figure 13D2). We also provide a new Figure 13-figure supplement 1 with other illustrations of very long manchettes and a statistical analysis. In the meantime, the acrosome is strongly remodeled, as shown in figure 16-new, with a detached acrosome (panel H). This morphological defect may induce a loss of the DPY19L2 staining (Figure 13 D2 stage I-III). This explanation is now inserted in the text at lines 396-399.

      Figure 15B and 15C only show KO, corresponding images from the WT should be present for comparison.

      WT images are now provided in Figure 1-figure supplement 1 new

      Figure 12 - Figure 12 - JM?.

      JM was removed. It does not mean anything

      Figure 12C and Supplementary Fig 10 - structures need to be labelled, as it is unclear what is where

      Done

      338 - text mentions step III, but only sperm from step VII are shown in Figure 13

      As suggested by reviewer 3, we changed “stage” to “step”. The text was modified to take this remark into account; see lines 388-396.

      360 - This is likely supposed to say Supp Figure 11E-G, not 13??

      Yes, it is a mistake. Corrected

      388 Typo "in a in a".

      Yes, it is a mistake. Corrected

      820 - Fig 3 legend - in KO spermatid nuclei were elongated - could this be labelled by arrows? I am not convinced this phenotype is that different from the WT.

      In fact, the nuclei of elongating KO spermatids are elongated and also very thin, a shape not observed in the WT. We have added arrowheads and modified the text to indicate this point at line 200.

      836 - Figure 5 legend says that in yellow is centrin, but that is not true for 5A, where the figure shows labelling for y-tubulin (presumably, according to the figure itself).

      We have modified the text of the legend to take into account the remark

      837- 5A supposedly corresponds to synchronized HEK293T cells, but the reasoning behind using synchronized cells is not mentioned at all in the main text; furthermore, how this synchronization is achieved is not explained in materials and methods (serum starvation? Thymidine block?).

      Yes, figure 5A was obtained with synchronized cells. We have added one paragraph in the MM section. For cell synchronization experiments, cells underwent S-phase blockade with thymidine (5 mM, Sigma-Aldrich) for 17 h followed by incubation in a control culture medium for 5 h, then a second blockade at the G2-M transition with nocodazole (200 nM, Sigma-Aldrich) for 12 h. Cells were then fixed with cold methanol at different times for IF labelling. See line 224 for changes made in the result section and lines 700-704 for changes made in the MM section.

      845- figure legend says that the RT-PCR was done on CCDC146-HA tagged mice, but the main text does not reflect that.

      We made changes and the description of the KI is now presented before (line 240) the RT-PCR experiment (line 257).

      949 - it is likely supposed to say A2, not B1 (B1 does not exist in Fig 15)

      Yes, it is a mistake. Corrected

      971 - Appendix Fig 3 legend - I believe that the description for B and C are swapped.

      Yes, it is a mistake. Corrected

      Furthermore, some questions to address in A would be: Which cross sections were from which animal/points? How many per animal? Were they always in the same location?

      Yes, we have a protocol for arranging and orienting all testes in the same way during the paraffin embedding phase. The cross-sections are therefore not taken at random, and we can compare sections from the same part of the testis. The number of animals was already indicated in the figure legend (see line 1128)

      Reviewer #3 (Recommendations For The Authors):

      1) There are a number of grammatical and orthographical errors in the text. Careful proofreading should be performed.

      We have sent the manuscript to a professional proofreader

      2) The author should also check for redundancies between the introduction and the discussion.

      The discussion has been modified to take the reviewers’ remarks into account. In doing so, we did our best to avoid redundancies between the introduction and the discussion.

      3) Can the authors provide a rationale why they have chosen to tag their gene with an HA tag for localisation? One would rather think of fluorescent proteins or a Halo tag.

      Because the functional domains of the protein are unknown, adding a fluorescent protein of 24 kDa may interfere with both the localization and the function of CCDC146. For this reason, we chose a small tag of only 1.1 kDa, to limit as much as possible the risk of interfering with the structure of the protein. This rationale is now indicated in the manuscript at lines 251-254. It is worth noting that the tagged strain shows no sperm defect, demonstrating that the HA-tag does not interfere with CCDC146 function.

      4) In the abstract, line 53, "provide evidence" is not the right term for something that is just suggestive. The term "suggests" would be more appropriate.

      The text was modified to take into account this remark

      5) Line 74: "genetic deficiency" sounds strange here, do the authors mean simply "mutation"?

      Infertility may be due to several genetic deficiencies, such as chromosomal defects (XXY (Klinefelter syndrome)), microdeletion of the Y chromosome or mutations in a single gene. Therefore, “mutation” is too restrictive. Nevertheless, we modified the sentence, which now reads “…or a genetic disorder including chromosomal or single gene deficiencies”.

      6) Lines 163-164: the authors describe the mutations (premature stop mutations) and say that they could either lead to complete absence of the gene product, or the expression of a truncated protein. Did they test this, for example, with some immuno blot analyses?

      As stated above, unfortunately, we were unable to verify the presence of RNA-decay in these patients for lack of biological material.

      7) Line 184 and Fig 2E: the sperm head morphologies should be quantitatively assessed.

      We now provide a full statistical analysis of the observed defects; see the new panel in Figure 2F.

      8) Fig 3: The annotation should be more precise - KO certainly means CDCC146-KO. The colours of the IH panels is different, which attracts attention but is clearly a colour-adjustment artefact. Colours should be adjusted for the panels to look comparable. It would be also helpful to add arrowheads into the figure to point at the phenotypes that are highlighted in the text.

      We have added Ccdc146 KO in all figures. We have added arrow heads to point out the spermatids showing a thin and elongated nucleus. Concerning adjustment of colors, we attempted to make images of panel B comparable. See new figure 3.

      9) Fig 6A: the authors use RT PCR to determine expression dynamics of their gene of interested, and use actin (apparently) as control. However, actin and CDCC146 expression levels follow the same trend. How is the interpreted?

      The reviewer did not understand the figure. The orange bars do not represent actin expression, nor the grey bars Ccdc146 expression; rather, both sets of bars represent the mRNA expression level of Ccdc146, relative to Actb (orange) or Hprt (grey) expression, in the testes of CCDC146-HA mouse pups. We tested two housekeeping genes as references to be sure that our results were not distorted by the unstable expression of one housekeeping gene. We did not see a significant difference between the two housekeeping genes. Actin was not used.
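      The idea of expressing one target gene relative to two independent reference genes can be sketched as follows. This assumes the widely used 2^-ΔCt calculation, which the response does not explicitly name as the method used; the function name and the Ct values are hypothetical placeholders, not data from the study.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target expression relative to a reference gene by 2^-(dCt).

    dCt = Ct(target) - Ct(reference). Assumes ~100% amplification
    efficiency for both genes; the 2^-dCt method itself is an assumption.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for a single sample (placeholders only).
ct_ccdc146 = 25.0
reference_cts = {"Actb": 20.0, "Hprt": 23.0}

# One relative value per reference gene, i.e. one bar per reference.
for gene, ct_ref in reference_cts.items():
    value = relative_expression(ct_ccdc146, ct_ref)
    print(f"Ccdc146 relative to {gene}: {value:.4f}")
```

      The absolute values differ between reference genes because their own expression levels differ; what matters for the interpretation is that the trend across time points is the same with both references.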

      10) In line 235, the authors suggest posttranslational modifications of their protein as potential cause for a slightly different migration in SDS PAGE as predicted from the theoretical molecular weight. This is not necessarily the case, some proteins do migrate just differently as predicted.

      We have changed the text accordingly and now provide an alternative explanation for the slightly different migration. See lines 258-259.

      11) The annotation of Fig 6 panels is problematic. First, why do the authors write "Laemmli" as description of the gel? It would be more helpful to write what is loaded on the gel, such as "sperm". Second, in panels B and C it would be helpful to add the antibodies used. It is not clear why there is a signal in the WT lane of panel B, but not in the HA lane (supposing an anti-HA antibody is used: why has WT a specific HA band?). In panel C, it is not clear why the blot that has so beautifully shown a single band in panel B suddenly gives such a bad labelling. Can the authors explain this? Also, they cut off the blot, likely because to too much background, but this is bad practice as full blots should be shown. In the current state, the panel C does not allow any clear conclusion. To make it conclusive, it must be repeated.

      Several mistakes were present in this figure, which has been recomposed. The WB on testicular extract was removed, and we now present a new WB that compares the presence of CCDC146 in the flagella and head fractions from WT and HA-CCDC146 sperm. Using an anti-HA Ab, we demonstrate that in epididymal sperm the protein is localized in the flagella only. See new figure 6. The corresponding text was changed accordingly.

      12) The authors have raised an HA-knockin mouse for CDCC146, which they explained by the unavailability of specific antibodies. However, in Fig 7, they use a CDCC146 antibody. Can they clarify?

      The commercial Ab works for human CCDC146 but not for mouse CCDC146. To make the situation clearer, we have added the following information: “the commercial Ab works for human CCDC146 only”. See line 240.

      13) In Fig 7A (line 258), the authors hypothesise that they stain mitochondria - why not test this directly by co-staining with mitochondria markers?

      We chose another solution to resolve this question:

      To avoid the issue of the non-specificity of secondary antibodies, we performed a new set of IF experiments using an HA Tag Alexa Fluor® 488-conjugated Antibody (anti-HA-AF488-C Ab) on WT and HA-CCDC146 sperm. These results are now presented in figure 7 panel A (new). The specificity of the signal obtained with the anti-HA-AF488-C Ab on mouse spermatozoa was evaluated by performing a statistical study of the density of dots in the principal piece of the flagellum from HA-CCDC146 and WT sperm. These results are now presented in figure 7 panel B (new). This study was carried out by analyzing 58 WT spermatozoa and 65 CCDC146 spermatozoa coming from 3 WT and 3 KI males. We found a highly significant difference, with a p-value <0.0001, showing that the signal obtained on spermatozoa expressing the tagged protein is highly specific. We have added a paragraph in the MM section to describe the process of image analysis. We finally present new images obtained by ExM showing no staining in the midpiece (figure 7C new). Altogether, these results demonstrate unequivocally the presence of the protein in the whole flagellum.

      14) It seems that in both Fig 7 and Fig 8, the authors use expansion microscopy to localise CCDC146 in sperm tails. However, the staining differs substantially between the two figures. How is this explained?

      In figure 8 we used the commercial Ab in human sperm, whereas in figure 7 we used the anti-HA Abs in mouse sperm. Because the antibodies do not target the same part of the CCDC146 protein (the tag is placed at the N-terminus of the protein, and the HPA020082 Ab targets the last 130 amino acids of the C-terminus), their accessibility to the antigenic site could be different. However, it is important to note that both antibodies target the flagellum. This explanation is now inserted; see lines 304-312.

      15) Fig 8D and line 274: the authors do a fractionation, but only show the flagella fraction. Why?

      Showing all fractions of their experiment would have underpinned the specific enrichment of CCDC146 in the flagella fraction, which is what they aim to show. Actually, given the absence of control proteins and the fact that the band in the flagellar fraction appears to be weaker than in total sperm, one could even conclude that there is more CCDC146 in another (not analysed) fraction of this experiment. Thus, the experiment as it stands is incomplete and does not, as the authors claim, confirm the flagellar localisation of the protein.

      We agree with the reviewer’s remark. We now provide new results showing both flagella and nuclei fractions in new figure 6A. This experiment is presented in lines 253-256.

      16) Line 283, Fig 9D,F: The description of the microtubules in this experiment is not easy to understand. Do the authors mean to say that the labelling shows that the protein is associated with doublet microtubules, but not with the two central microtubules? They should try to find a clearer way to explain their result.

      As suggested by reviewer 2, we have changed the figure to make it clearer. The text was changed accordingly. See new figure 9 and the new corresponding legend (line 1006).

      17) Fig 9G - how often could the authors observe this? Why is the axoneme frayed? Does this happen randomly, or did the authors apply a specific treatment?

      Yes, it happens randomly during the fixation process.

      18) Line 300 and Fig 10A - the authors talk about the 90-kDa band, but do not say anything about what they think this band represents.

      We have now added the following sentence (lines 340-342): “This band may correspond to proteolytic fragment of CCDC146, the solubilization of microtubules by sarkosyl may have made CCDC146 more accessible to endogenous proteases.”

      19) Fig 11A, lines 321-322: the authors write that the connecting piece is severely damaged. This is not obvious for somebody who does not work in sperm. Perhaps the authors could add some arrow heads to point out the defects, and briefly describe them in the text.

      We realized from your remark that our message was not clear. In fact, there is great variability in the morphological damage to the HTCA. For instance, the HTCA of Ccdc146 KO sperm presented in figure 10A2 is quite normal, whereas that in figure 10A4 is completely distorted. This point is now underlined in the corresponding text. See lines 367-369.

      We also added the size of the marker bar (200 nm), which was missing from the figure legend.

      20) Line 323: it will be important to name which tubulin antibody has been used to identify centrioles, as they are heavily posttranslationally modified.

      The different types of anti-tubulin Abs are described in the corresponding figure legend.

      21) Fig 11B - phenotypes must be quantified to make these observations meaningful.

      We agree that a quantification would improve the message. However, testicular sperm are obtained by enzymatic separation of spermatogenic cells, and the number of testicular sperm is very low. Moreover, not all sperm are stained. Taking these two points into account, it seems to us that a quantification would be difficult to interpret. For this reason, the quantification was not done; however, it is important to note that these defects were not observed in WT sperm, demonstrating that these defects are caused by the lack of CCDC146. We have added a sentence to underline this point; see lines 374-375.

      22) Line 329: Figure 12AB - is this a typo - should it read Figure 12B?

      We have split panel A into A1 and A2 and changed the text accordingly. See line 378.

      23) Why are there not wildtype controls in Fig 12B, C?

      We now provide, as Figure 12-figure supplement 1, a control image for Fig 12B. For Fig 12C, the emergence of the flagellum from the distal centriole in WT is already shown in Fig 12A1.

      24) Fig 13: the authors write that the manchette is "clearly longer and wider than in WT cells" (lines 342-343). How can they claim this without quantitative data?

      We now provide a statistical analysis of the length of the manchette. See Figure 13-figure supplement 1A. We also provide a new image illustrating the length of the manchette in Ccdc146 KO spermatids; see Figure 13-figure supplement 1B.

    1. Author Response

      We appreciate the insightful and constructive feedback from the reviewers regarding our manuscript, "Gain neuromodulation mediates perceptual switches: evidence from pupillometry, fMRI, and RNN Modelling." The comments have provided us with a number of valuable perspectives that will undoubtedly strengthen the impact and clarity of our work.

      We recognize the need for a more detailed and comparative analysis of the perceptual tasks used in our pupil and fMRI experiments. To address these points directly: the jittered intertrial intervals (ITIs) in the fMRI work were deemed necessary to effectively deconvolve the BOLD response (see Stottinger et al., 2018). In our fMRI work, each image was randomly preceded and followed by varying ITIs (2, 4, 6, and 8 seconds), ensuring an equitable distribution across sets and subjects. Importantly, our analysis of both fMRI and behavioral studies, including eye tracking data, indicates that perceptual switch behavior – the point at which switches occur – is consistent across modalities. If more predictive or preparatory activity were present in the fMRI version of the task, we would expect earlier switches or choices and altered reaction time distributions – neither of these signatures was observed in the original study (Stottinger et al., 2018). Importantly, this suggests that the additional time available in the fMRI experiments did not significantly alter behavioral outcomes. Thus, our findings suggest that despite the differences in timing and task structure, the behavioural responses remain consistent across both experimental setups. We will clarify this in the revised manuscript.

      In response to the reviewer's comments on our computational model, particularly regarding the modelling of noradrenaline (NA) effects in the RNN, we agree that modelling gain as stationary is a substantial approximation. However, given the slow ramping of pupil diameter, which served as our proxy for gain, it is an approximation that we believe is justified: in the revised manuscript, we will run additional simulations to ensure the validity of this approximation. In addition, whilst we agree that the model is more complicated than is needed for the task, we opted for RNN modelling, in lieu of a simpler modelling approach, because we wanted to use RNN modelling as a method for both hypothesis testing and generation. To build the RNN, the only key elements of model structure we had to specify in advance were the inputs and the target outputs of the network. The solution the RNN arrived at, although involving many more parameters than a simpler model, was entirely determined by optimisation (i.e., not our a priori hypotheses). We feel that this strengthens the result considerably. Importantly, this approach also allowed us to be surprised by the results of the model: for instance, we did not anticipate that the effect of gain on the energy landscape would be primarily mediated by inhibitory gain. In the revised manuscript, we will integrate this line of thinking into the paper. We are also sensitive to the fact that this result is both counterintuitive and difficult to study in high-dimensional dynamical systems like RNNs. In revisions, we will provide further analysis of the RNN and build a 2D approximation to the RNN that can be studied on the phase plane to better conceptually illuminate the mechanisms at play.

      Furthermore, we agree with the suggestion to consider alternative mechanisms that might contribute to perceptual switches, such as attention and top-down processing. While our study primarily focuses on LC-mediated gain modulation, we acknowledge the complexity of neural processes involved in perception and will expand our discussion to include these potential mechanisms. We also note the importance of moderating the causal language used in our manuscript, and we will revise our wording to more accurately reflect the correlational nature of our findings and ensure that our conclusions are firmly grounded in the data presented.

      In conclusion, we are enthusiastic about the opportunity to refine our manuscript based on these valuable comments. In an updated version, we will address the overall points by providing clearer explanations of our methods, refining our figures for better readability, and ensuring that our conclusions are supported by robust analysis. We believe that these revisions will not only address the concerns raised but also significantly enhance the overall quality of our research. We thank the reviewers for their thorough and thoughtful critiques and look forward to submitting our revised manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this manuscript, the authors explore the effects of DNA methylation on the strength of regulatory activity using massively parallel reporter assays in cell lines on a genome-wide level. This is a follow-up of their first paper from 2018 that describes this method for the first time. In addition to adding more in-depth information on sequences that are explored by many researchers using two main methods, reduced representation bisulfite sequencing and sites represented on the Illumina EPIC array, they now also show that DNA methylation can influence changes in regulatory activity following a specific stimulation, even in the absence of baseline effects of DNA methylation on activity. In this manuscript, the authors explore the effects of DNA methylation on the response to interferon alpha (IFNA) and a glucocorticoid receptor agonist (dexamethasone). The authors validate their baseline findings using additional datasets, including RNAseq data, and show convergences across two cell lines. The authors then map the methylation x environmental challenge (IFNA and dex) sequences identified in vitro to explore whether their methylation status is also predictive of regulatory activity in vivo. This is very convincingly shown for IFNA response sequences, where baseline methylation is predictive of the transcriptional response to flu infection in human macrophages, an infection that triggers the IFN pathways.

      Thank you for your strong assessment of our work!

      The extension of the functional validity of the dex-response altering sequences is less convincing.

      We agree. We note that genes close to dex-specific mSTARR-seq enhancers tend to be more strongly upregulated after dex stimulation than those near shared enhancers, which parallels our results for IFNA (lines 341-344). However, there is unfortunately no comparable data set to the human flu data set (i.e., with population-based whole genome-bisulfite sequencing data before and after dex challenge), so we could not perform a parallel in vivo validation step. We have added this caveat to the revised manuscript (lines 555-557).

      Sequences altering the response to glucocorticoids, however, were not enriched in DNA methylation sites associated with exposure to early adversity. The authors interpret that "they are not links on the causal pathway between early life disadvantage and later life health outcomes, but rather passive biomarkers". However, this approach does not seem an optimal model to explore this relationship in vivo. This is because exposure to early adversity and its consequences is not directly correlated with glucocorticoid release and changes in DNA methylation levels following early adversity could be related to many physiological mechanisms, and overall, large datasets and meta-analyses do not show robust associations of exposure to early adversity and DNA methylation changes. Here, other datasets, such as from Cushing patients may be of more interest.

      Thank you for making these important points. We have expanded the set of caveats regarding the lack of enrichment of early adversity-reported sites in the mSTARR-data set (lines 527-533). Specifically, we note that the relationship between early adversity and glucocorticoid physiology is complex (e.g., Eisenberger and Cole, 2012; Koss and Gunnar, 2018) and that dex challenge models one aspect of glucocorticoid signaling but not others (e.g., glucocorticoid resistance). Nevertheless, we also see little evidence for enrichment of early adversity-associated sites in the mSTARR data set at baseline, independently of the dex challenge experiment (lines 483-485; Figure 4).

      We also agree that large data sets (e.g., Houtepen et al., 2018; Marzi et al., 2018) and reviews (e.g., Cecil et al., 2020) of early adversity and DNA methylation in humans show limited evidence of associations between early adversity and DNA methylation levels. However, the idea that early adversity impacts downstream outcomes remains pervasive in the literature and popular science (see Dubois et al., 2019), which we believe makes tests like ours important to pursue. We also hope that our data set (and others generated through these methods) will be useful in interpreting other settings in which differential methylation is of interest as well—in line with your comment below. We have clarified both of these points in the revised manuscript (lines 520-522; 536-539).

      Overall, the authors provide a great resource of DNA methylation-sensitive enhancers that can now be used for functional interpretation of large-scale datasets (that are widely generated in the research community), given the focus on sites included in RRBS and the Illumina EPIC array. In addition, their data lend support to the idea that differences in DNA methylation can alter responses to environmental stimuli, and thus to the possibility that environmental exposures that alter DNA methylation can also alter the subsequent response to this exposure, in line with the theory of epigenetic embedding of prior stimuli/experiences. The conclusions related to the early adversity data should be reconsidered in light of the comments above.

      Thank you! And yes, we have revised our discussion of early life adversity effects as discussed above.

      Reviewer #1 (Recommendations For The Authors):

      While the paper has a lot of strengths and provides new insight into the epigenomic regulation of enhancers as well as being a great resource, there are some aspects that would benefit from clarification.

      a. It would be great to have a clearer description of how many sequences actually pass QC in the different datasets and what the respective overlaps are in bps or 600bp windows. Currently, often only percentages are given. Maybe a table/Venn diagram giving an overview of the experiments and assessed sequences would help here. This concerns the different experiments in the K562, A549, and HepG2 cell lines, including stimulations.

      We now provide a supplementary figure and supplementary table providing, for each dataset, the number of 600 bp windows passing each filter (Figure 2-figure supplement 1; Supplementary File 9), as well as a supplementary figure providing an upset plot to show the number of assessed sequences shared across the experiments (Figure 2-figure supplement 2).

      b. It would also be helpful to have a brief description of the main differences in assessed sequences and their coverage of the old (2018) and new libraries in the main text to be able better interpret the validation experiments.

      We now provide information on the following characteristics for the 2018 data set versus the data set presented for the first time here: mean (± SD) number of CpGs per fragment; mean (± SD) DNA sequencing depth; and mean (± SD) RNA sequencing depth (lines 169-170 provide values for the new data set; in line 194, we reference Supplementary File 5, which provides the same values for the old data set). Notably, the coverage characteristics of analyzed windows in both data sets are quite high (mean DNA-seq read coverage = 94x and mean RNA-seq read coverage = 165x in the new data set at baseline; mean DNA-seq read coverage = 22x and mean RNA-seq read coverage = 54x in Lea et al. 2018).

      c. Statements of genome-wide analyses in the abstract and discussion should be a bit tempered, as quite a number of tested sites do not pass QC and do not enter the analysis. From the results it seems like from over 4.5 million sequences, only 200,000 are entering the analysis.

      The reason why many of the windows are not taken forward into our formal modeling analysis is that they fail our filter for RNA reads because they are never (or almost never) transcribed, not because there was no opportunity for transcription (i.e., the region was indeed assessed in our DNA library, and did not show output transcription, as now shown in Figure 2-figure supplement 1). We have added a rarefaction analysis (lines 715-722 in Materials and Methods) of the DNA fragment reads to the revised manuscript which supports this point. Specifically, it shows that we are saturated for representation of unique genomic windows (i.e., we are above the stage in the curve where the proportion of active windows would increase with more sequencing: Figure 1-figure supplement 4). Similarly, a parallel rarefaction curve for the mSTARR-seq RNA-seq data (Figure 1-figure supplement 4) shows that we would gain minimal additional evidence for regulatory activity with more sequencing depth. We now reference these analyses in revised lines 179-184 and point to the supporting figure in line 182.

      In other words, our analysis is truly genome-wide, based on the input sequences we tested. Most of the genome just doesn’t have regulatory activity in this assay, despite the potential for it to be detected given that the relevant sequences were successfully transfected into the cells.

      d. Could the authors comment on the validity of the analysis if only one copy is present (cut-off for QC)?

      We think this question reflects a misunderstanding of our filtering criteria due to lack of clarity on our part, which we have modified in the revision. We now specify that the mean DNA-seq sequencing depth per sample for the windows we subjected to formal modeling was quite high: 93.91 ± 10.09 SD (range = 74.5-113.5x) (see revised lines 169-170). In other words, we never analyze windows in which there is scant evidence that plasmids containing the relevant sequence were successfully transfected (lines 170-172).

      Our minimal RNA-seq criteria require non-zero counts in at least 3 replicate samples within either the methylated condition or the unmethylated condition, or both (lines 166-168). Because we know that multiple plasmids containing the corresponding sequence are present for all of these windows, even those that just cross the minimal RNA-seq filtering threshold, we believe our results provide valid evidence that all analyzed windows present the opportunity to detect enhancer activity, but many do not act as enhancers (i.e., do not result in transcribed RNA). Notably, we observe a negligible correlation between DNA sequencing depth for a fragment, among analyzed windows, and mSTARR-seq enhancer activity (R² = 0.029; now reported in lines 183-184). We also now report reproducibility between replicates, in which all replicate pairs have r > 0.89, on par with previously published STARR-seq datasets (e.g., Klein et al., 2020; Figure 1-figure supplement 6, pointed to in line 193).
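
      As a minimal sketch of the RNA-seq filter described above (illustrative only, not the authors' code): a window is kept if at least three replicates have non-zero counts in the methylated condition, in the unmethylated condition, or in both. The function name and the per-window lists of counts are hypothetical.

```python
def passes_rna_filter(meth_counts, unmeth_counts, min_nonzero=3):
    """Return True if a window passes the minimal RNA-seq filter.

    meth_counts / unmeth_counts: RNA read counts for one window, one value
    per replicate, in the methylated and unmethylated conditions.
    (Hypothetical input layout, assumed for illustration.)
    """
    def enough(counts):
        # Count the replicates with at least one RNA read for this window.
        return sum(c > 0 for c in counts) >= min_nonzero

    # Passing in either condition (or both) is sufficient.
    return enough(meth_counts) or enough(unmeth_counts)


# Three non-zero replicates in the methylated condition: the window is kept.
print(passes_rna_filter([0, 0, 5, 2, 1, 0], [0, 0, 0, 0, 0, 0]))
# Only two non-zero replicates in each condition: the window is dropped.
print(passes_rna_filter([1, 1, 0, 0, 0, 0], [0, 2, 3, 0, 0, 0]))
```

      Note that the non-zero replicates are counted within each condition separately, so two in one condition plus two in the other does not pass.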

      e. While the authors state that almost all of the control sequences contain CpGs sites, could the authors also give information on the total number of CpG sites in the different subsets? Was the number of CpGs in a 600 bp window related to the effects of DNA methylation on enhancer activity?

      We now provide the number of CpG sites per window in the different subsets in lines 282-284. As expected, they are higher for EPIC array sites and for RRBS sites because the EPIC array is biased towards CpG-rich promoter regions, and the enzyme typically used in the starting step of RRBS digests DNA at CpG motifs (but control sequences still contain an average of ~13 CpG sites per fragment). We also now model the magnitude of the effects of DNA methylation on regulatory activity as a function of number of CpG sites within the 600 bp windows. Consistent with our previous work in Lea et al., 2018, we find that mSTARR-seq enhancers with more CpGs tend to be repressed by DNA methylation (now reported in lines 216-219 and Figure 1-figure supplement 11).

      f. In the discussion, a statement on the underrepresented regions, likely regulatory elements with lower CG content, that nonetheless can be highly relevant for gene regulation would be important to put the data in perspective.

      Thanks for this suggestion. We agree that regulatory regions, independent of CpG methylation, can be highly relevant, and now clarify in the main text that the “unmethylated” condition of mSTARR-seq is essentially akin to a conventional STARR-seq experiment, in that it assesses regulatory activity regardless of CpG content or methylation status (lines 128-130).

      Consequently, our study is well-designed to detect enhancer-like activity, even in windows with low GC content. We now show with additional analyses that we generated adequate DNA-seq coverage on the transfected plasmids to analyze 90.2% of the human genome, including target regions with no or low CpG content (lines 148-149; 153-156; Supplementary file 2). As noted above, we also now clarify that regions dropped out of our formal analysis because we had little to no evidence that any transcription was occurring at those loci, not because sequences for those regions were not successfully transfected into cells (see responses above and new Figure 1-figure supplement 4 and Figure 2-figure supplement 1).

      g. To control for differences in methylation of the two libraries, the authors sequence a single CpG in the vector. Could the authors look at DNA methylation of the 600 bp windows at the end of the experiment? Could DNA methylation of these windows be differently affected according to sequence? 48 hours could be enough for de-methylation or re-methylation.

      We agree that variation in demethylation or remethylation depending on fragment sequence is possible. We now state this caveat in the main text (lines 158-159), and specify that genomic coverage of our bisulfite sequencing data across replicates is (unfortunately) too variable to perform reliable site-by-site analysis of DNA methylation levels before and after the 48 hour experiment (lines 1182-1185). Instead, we focus on a CpG site contained in the adapter sequence (and thus included in all plasmids) to generate a global estimate of per replicate methylation levels. We also now note that any de-methylation or re-methylation would reduce our power to detect methylation-dependent activity, rather than leading to false positives (lines 163-165).

      h. The section on the method for correction for multiple testing should be more detailed as it is very difficult to follow. Why were only 100 permutations used, given that the empirical p-value could then only be <0.01? The description of a subsample of the N windows with positive Betas is unclear: should the permutation not include the actual values and thus all windows, or were there no negative Betas? Was FDR accounting for all elements and pairs?

      We have now expanded the text in the Materials and Methods section to clarify the FDR calculation (lines 691, 695-699, 702, 706). We clarify that the 100 permutations were used to generate a null distribution of p-values for the data set (e.g., 100 x 17,461 p-values for the baseline data set), which we used to derive a false discovery rate. Because we base our evidence on FDRs, we compare the distribution of observed p-values to the distribution of p-values obtained via permutation; we do not calculate individual p-values by comparing an observed test statistic against the test statistics for permuted data for that individual window.

      We compare the data to permutations with only positive betas because the observed data contain many negative betas. These correspond to windows that have no regulatory activity (i.e., they have many more input DNA reads than RNA-seq reads) and thus have very small p-values in a model testing for DNA-RNA abundance differences. However, we are interested in controlling the false discovery rate of windows that do have regulatory activity (positive betas). In the permuted data, by contrast, and because of the randomization we impose, test statistics are centered around 0 and essentially symmetrical (approximately equally likely to be positive or negative). Retaining all p-values to construct the null therefore leads to highly miscalibrated false discovery rates, because the distribution of observed values is skewed towards smaller values (owing to windows with “significantly” no regulatory activity) compared to the permuted data. We address that problem by using only positive betas from the permutations.
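
      The logic of a permutation-based FDR restricted to positive betas can be sketched as follows. This is a hedged illustration, not the authors' implementation: the exact estimator is not given in this response, so the sketch assumes a simple one in which the fraction of pooled null p-values at or below a threshold is scaled to the number of observed tests. The function name and the `(beta, p_value)` input layout are hypothetical.

```python
from bisect import bisect_right


def empirical_fdr(observed, permutations, threshold):
    """Estimate the FDR at a p-value threshold from permuted data.

    observed:     list of (beta, p_value) pairs, one per window.
    permutations: list of permuted data sets, each a list of (beta, p_value).
    Only positive betas are used, in both the observed and permuted data.
    (Assumed estimator; the real pipeline may differ.)
    """
    obs_p = [p for beta, p in observed if beta > 0]
    null_p = sorted(p for perm in permutations for beta, p in perm if beta > 0)
    if not null_p:
        raise ValueError("no positive-beta p-values in the permutations")

    n_discoveries = sum(p <= threshold for p in obs_p)
    if n_discoveries == 0:
        return 0.0  # nothing is called significant, so nothing can be false

    # Fraction of null p-values at or below the threshold ...
    null_fraction = bisect_right(null_p, threshold) / len(null_p)
    # ... scaled to the number of observed tests gives expected false hits.
    expected_false = null_fraction * len(obs_p)
    return min(1.0, expected_false / n_discoveries)


observed = [(1.0, 0.001), (0.5, 0.01), (2.0, 0.2), (-3.0, 1e-6)]
permutations = [
    [(0.1, 0.5), (-0.2, 0.001), (0.3, 0.04)],
    [(0.2, 0.9), (0.4, 0.3), (-0.1, 0.02)],
]
print(empirical_fdr(observed, permutations, threshold=0.05))
```

      In this toy input, the window with a strongly negative beta and a tiny p-value is excluded, which is the miscalibration the positive-beta restriction is meant to avoid.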

      i. The interpretation of the overlap of Dex-response windows with CpGs sites associated with early adversity should be revisited according to the points also mentioned in the public review and the authors may want to consider exploring additional datasets with other challenges.

      Thank you, see our responses to the public review above and our revisions (lines 555-559). We agree that comparisons with more data sets and generation of more mSTARR-seq data in other challenge conditions would be of interest. While beyond the scope of this manuscript, we hope the resource we have developed and our methods set the stage for just such analyses.

      Reviewer #2 (Public Review):

      This work presents a remarkably extensive set of experiments, assaying the interaction between methylation and expression across most CpG positions in the genome in two cell types. To this end, the authors use mSTARR-seq, a high-throughput method, which they have previously developed, where sequences are tested for their regulatory activity in two conditions (methylated and unmethylated) using a reporter gene. The authors use these data to study two aspects of DNA methylation: 1. Its effect on expression, and 2. Its interaction with the environment. Overall, they identify a small number of 600 bp windows that show regulatory potential, and a relatively large fraction of these show an effect of methylation on expression. In addition, the authors find regions exhibiting methylation-dependent responses to two environmental stimuli (interferon alpha and glucocorticoid dexamethasone).

      The questions the authors address represent some of the most central in functional genomics, and the method utilized is currently the best method to do so. The scope of this study is very impressive and I am certain that these data will become an important resource for the community. The authors are also able to report several important findings, including that pre-existing DNA methylation patterns can influence the response to subsequent environmental exposures.

      Thank you for this generous summary!

      The main weaknesses of the study are: 1. The large number of regions tested seems to have come at the expense of the depth of coverage per region (1 DNA read per region per replicate). I have not been convinced that the study has sufficient statistical power to detect regulatory activity, and differential regulatory activity to the extent needed. This is likely reflected in the extremely low number of regions showing significant activity.

      We apologize for our lack of clarity in the previous version of the manuscript. Nonzero coverage for half the plasmid-derived DNA-seq replicates is a minimum criterion, but for the baseline dataset, the mean depth of DNA coverage per replicate for windows passing the DNA filter is quite high: 12.723 ± 41.696 s.d. overall, and 93.907 ± 10.091 s.d. in the windows we subjected to full analysis (i.e., windows that also passed the RNA read filter). We now provide these summary statistics in lines 148-149 and 169-170 and Supplementary file 5 (see also our responses to Reviewer 1 above). We also now show, using a rarefaction analysis, that our data set saturates the ability to detect regulatory windows based on DNA and RNA sequencing depth (new Figure 1-figure supplement 4; lines 179-184; 715-722).

      2. Due to the position of the tested sequence at the 3' end of the construct, the mSTARR-seq approach cannot detect the effect of methylation on promoter activity, which is perhaps the most central role of methylation in gene regulation, and where the link between methylation and expression is the strongest. This limitation is evident in Fig. 1C and Figure 1-figure supplement 5C, where even active promoters have activity lower than 1. Considering these two points, I suspect that most effects of methylation on expression have been missed.

      Thank you for pointing this out. We agree that we have not exhaustively detected methylation-dependent activity in all promoter regions, given that not all promoter regions are active in STARR-seq. However, there is good evidence that some promoter regions can function like enhancers and thus be detected in STARR-seq-type assays (Klein et al., 2020). This important point is now noted in lines 187-189; an example promoter showing methylation-dependent regulatory activity in our dataset is shown in Figure 3E.

      We also now clarify that Figure 1C shows significant enrichment of regulatory activity in windows that overlap promoter sequence (line 239). The y-axis is not a measure of activity, but rather the log-transformed odds ratio, with positive values corresponding to overrepresentation of promoter sequences in regions of mSTARR-seq regulatory activity. Active promoters are 1.640 times more likely to be detected with regulatory activity than expected by chance (p = 1.560 × 10⁻¹⁸), which we now report in a table that presents enrichment statistics for all ENCODE elements shown in Figure 1C for clarity (Supplementary file 4). Moreover, 74.1% of active promoters that show regulatory activity have methylation-dependent activity, also now reported in Supplementary file 4.
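
      To illustrate why a log-transformed odds ratio is positive under enrichment and zero under no enrichment, here is a small sketch computed from a 2x2 contingency table. The counts are invented for illustration and are not taken from the study, and the natural logarithm is an assumption, since the base is not stated above.

```python
import math


def log_odds_ratio(active_in, active_out, inactive_in, inactive_out):
    """Natural-log odds ratio from a 2x2 table of window counts.

    active_in:    windows with regulatory activity that overlap the annotation
    active_out:   windows with regulatory activity that do not overlap it
    inactive_in:  windows without activity that overlap the annotation
    inactive_out: windows without activity that do not overlap it
    (All counts must be non-zero; hypothetical inputs for illustration.)
    """
    odds_ratio = (active_in * inactive_out) / (active_out * inactive_in)
    return math.log(odds_ratio)


# Invented counts: the annotation is over-represented among active windows.
print(log_odds_ratio(20, 80, 10, 90))   # positive: enrichment
# Same proportion in both groups: odds ratio of 1, log odds ratio of 0.
print(log_odds_ratio(10, 90, 10, 90))
```

      On this scale a value below 1 can still indicate enrichment, because any odds ratio above 1 maps to a positive log value.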

      Overall, the combination of an extensive resource addressing key questions in functional genomics, together with the findings regarding the relationship between methylation and environmental stimuli makes this a key study in the field of DNA methylation.

      Thank you again for the positive assessment!

      Reviewer #2 (Recommendations For The Authors):

      I suggest the authors conduct several tests to estimate and/or increase the power of the study:

      1) To estimate the potential contribution of additional sequencing depth, I suggest the authors conduct a downsampling analysis. If the results are not saturated (e.g., the number of active windows is not saturated or the number of differentially active windows is not saturated), then additional sequencing is called for.

      We appreciate the suggestion. We have now performed a downsampling/rarefaction curve analysis in which we downsampled the number of DNA reads, and separately, the number of RNA reads. We show that for both DNA-seq depth and RNA-seq depth, we are within the range of sequencing depth in which additional sequencing would add minimal new analysis windows in the dataset (Figure 1-figure supplement 4; lines 179-184; 715-722).
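
      The logic of such a rarefaction analysis can be sketched as follows. This is an illustrative toy, not the code used for Figure 1-figure supplement 4: the function name, the `min_reads` threshold, the random seed, and the toy read assignments are all assumptions.

```python
import random
from collections import Counter

def rarefaction_curve(read_windows, fractions, min_reads=1, seed=0):
    """Count windows that pass a read threshold at each downsampling fraction.

    read_windows : list with one window identifier per sequenced read
    fractions    : fractions of reads to keep (e.g. 0.25, 0.5, 1.0)
    min_reads    : minimum reads required for a window to be counted
    """
    rng = random.Random(seed)
    curve = []
    for frac in fractions:
        n_keep = round(frac * len(read_windows))
        sample = rng.sample(read_windows, n_keep)  # sample without replacement
        counts = Counter(sample)
        n_windows = sum(1 for c in counts.values() if c >= min_reads)
        curve.append((frac, n_windows))
    return curve

# Toy data: 50 windows, each covered by 1 to 5 reads.
reads = [w for w in range(50) for _ in range(w % 5 + 1)]
curve = rarefaction_curve(reads, [0.25, 0.5, 0.75, 1.0], min_reads=2)
for frac, n_windows in curve:
    print(frac, n_windows)
```

      If the number of windows retained barely rises between the largest fractions, additional sequencing would be expected to add few new windows.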

      2) Correlation between replicates should be reported and displayed in a figure because low correlations might also point to too few reads. The authors mention: "This difference likely stems from lower variance between replicates in the present study, which increases power", but I couldn't find the data.

      We now report the correlations between RNA and DNA replicates within the current dataset and within the Lea et al., 2018 dataset (Figure 1-figure supplement 6). The between-replicate correlations in both our RNA libraries and DNA libraries are consistently high (r ≥ 0.89).

      3) The correlation between the previous and current K562 datasets is surprisingly low. Given that these datasets were generated in the same cell type, in the same lab, and using the same protocol, I expected a higher correlation, as seen in other massively parallel reporter assays. The fact that the correlations are almost identical for a comparison of the same cell and a comparison of very different cell types is also suspicious.

      Thanks for raising this point. We think it is in reference to our original Figure 1-Figure supplement 6, for which we now provide Pearson correlations in addition to R^2 values (now Figure 1-Figure supplement 8). We note that this is not a correlation in raw data, but rather the correlation in estimated effect sizes from a statistical model for methylation-dependent activity. We now provide Pearson correlations for the raw data between replicates within each dataset (Figure 1-Figure supplement 6), which for the baseline dataset are all r > 0.89 for RNA replicates and r > 0.98 for DNA replicates, showing that replicate reproducibility in this study is on par with other published studies (e.g., Klein et al., 2020 report r > 0.89 for RNA replicates and r > 0.91 for DNA replicates).

      We do not know of any comparable reports in other MPRAs for effect size correlations between two separately constructed libraries, so it’s unclear to us what the expectation should be. However, we note that all effect sizes are estimated with uncertainty, so it would be surprising to us to observe a very high correlation for effect sizes in two experiments, with two independently constructed libraries (i.e., with different DNA fragments), run several years apart, especially given the importance of winner’s curse effects and other phenomena that affect point estimates of effect sizes. Nevertheless, we find that regions we identify as regulatory elements in this study are 74-fold more likely to have been identified as regulatory elements in Lea et al., 2018 (p < 1 x 10^-300).

      4) The authors cite Johnson et al. 2018 to support their finding that merely 0.073% of the human genome shows activity (1.7% of 4.3%), but:

      a. the percent cited is incorrect: this study found that 27,498 out of 560 million regions (0.005%) were active, and not 0.165% as the authors report.

      We have modified the text to clarify the numerator and denominator used for the 0.165% estimate from Johnson et al 2018 (lines 175-176). The numerator is their union set of all basepairs showing regulatory activity in unstimulated cells, which is 5,547,090 basepairs. The denominator is the total length of the hg38 human genome, which is 3,298,912,062 basepairs.

      Notably, the denominator (the total human genome) is not 560 million—while Johnson et al (2018) tested 560 million unique ~400 basepair fragments, these fragments were overlapping, such that the 560 million fragments covered the human genome 59 times (i.e., 59x coverage).

      b. other studies that used massively parallel reporter assays report substantially higher percentages, suggesting that the current study is possibly underpowered. Indeed, the previous mSTARR-seq found a substantially larger percentage of regions showing regulatory activity (8%). The current study should be compared against other studies (preferably those that did not filter for putatively active sequences, or at least to the random genomic sequences used in these studies).

      We appreciate this point and have double-checked comparisons to Johnson et al., 2018 and Lea et al., 2018. Our numbers are not unusual relative to Johnson et al., 2018 (0.165%), which surveyed the whole genome. Also, in comparing to the data from Lea et al., 2018, when processed in an identical manner (our criteria are more stringent here), our values of the percent of the tested genome showing significant regulatory activity are also similar: 0.108% in the Lea et al., 2018 dataset versus 0.082% in the baseline dataset. Finally, our rarefaction analyses (see our responses above) indicate that we are not underpowered based on sequencing depth for RNA or DNA samples. We also note that there are several differences in our analysis pipeline from other studies: we use more technical replicates than is typical (compare to 2-5 replicates in Arnold et al., 2013; Johnson et al., 2018; Muerdter et al., 2018), we measure DNA library composition based on DNA extracted from each replicate post-transfection (as opposed to basing it on the pre-transfection library; Johnson et al., 2018), and we use linear mixed models to identify regulatory activity as opposed to binomial tests (Johnson et al., 2018; Arnold et al., 2013; Muerdter et al., 2018).

      I find it confusing that the four sets of CpG positions used (EPIC, RRBS, NR3C1, and random control loci) add up to 27.3M CpG positions. Are the 600 bp windows around each of these positions sufficient to result in whole-genome coverage? If so, a clear explanation of how this is achieved should be added.

      Thanks for this comment. Although our sequencing data are enriched for reads that cover these targeted sites, the original capture to create the input library included some off-target reads (as is typical of most capture experiments, which are rarely 100% efficient). We then sequenced at such high depth that we ultimately obtained sequencing coverage that encompassed nearly the whole genome. We now clarify in the main text that our protocol assesses 27.3 million CpG sites by assessing 600 bp windows encompassing 93.5% of all genomic CpG sites (line 89), which includes off-target sites (line 149).

      A scatter plot showing the RNA to DNA ratios of the methylated (x-axis) vs unmethylated (y-axis) library would be informative. I expect to see a shift up from the x=y diagonal in the unmethylated values.

      We have added a supplementary figure showing this information, which shows the expected shift upwards (Figure 1-figure supplement 9).

      Another important figure missing is a histogram showing the ratios between the unmethylated and methylated libraries for all active windows, with the significantly differentially active windows marked.

      We have added a supplementary figure showing this information (Figure 1-figure supplement 10).

      Perhaps I missed it, but what is the distribution of effect sizes (differential activity) following the various stimuli?

      This information is provided in table form in Supplementary Files 3, 10, and 11, which we now reference in the Figure 2 legend (lines 365-366).

      Minor changes

      It is unclear what the lines connecting the two groups in Fig.3C represent, as these are two separate groups of regions.

      We now clarify in the figure legend that values connected by a line are the same regions, not two different sets of regions. They show the correlation between DNA methylation and gene expression at mSTARR-seq-identified enhancers in individuals before and after IAV stimulation, separately for enhancers that are shared between conditions (left) versus those that are IFNA-specific (right). The two plots therefore do show two different sets of regions, which we have depicted to visualize the contrast in the effect of stimulation on the correlation on IFNA-specific enhancers versus shared enhancers. We have revised the figure legend to clarify these points (lines 458-460).

      L235-242 are unclear. Specifically - isn't the same filter mentioned in L241-242 applied to all regions?

      Yes, the same filter for minimal RNA transcription was applied to all regions. We have modified the text (lines 264-265, 271, 275-277) to clarify that the enrichment analyses were performed twice, to test whether the target types were: 1) enriched in the dataset passing the RNA filter (i.e., the dataset showing plasmid-derived RNA reads in at least half the sham or methylated replicates; n = 216,091 windows) and 2) enriched in the set of windows showing significant regulatory activity (at FDR < 1%; n = 3,721 windows).

      To improve cohesiveness, the section about most CpG sites associated with early life adversity not showing regulatory activity in K562s can be moved to the supplementary in my opinion.

      Thank you for this suggestion. Because ELA and the biological embedding hypothesis (via DNA methylation) were major motivations for our analysis (see Introduction lines 42-48; 75-79), and we also discuss these results in the Discussion (lines 518-520), we have respectfully elected to retain this section in the main manuscript. We have added text in the Discussion explaining why we think experimental tests of methylation effects on regulation are relevant to the literature on early life adversity (lines 520-522), and have added discussion on limits to these analyses (lines 527-533).

      References:

      Arnold CD, Gerlach D, Stelzer C, Boryń ŁM, Rath M, Stark A (2013) Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science, 339, 1074-1077.

      Cecil CA, Zhang Y, Nolte T (2020) Childhood maltreatment and DNA methylation: A systematic review. Neuroscience & Biobehavioral Reviews, 112, 392-409.

      Dubois M, Louvel S, Le Goff A, Guaspare C, Allard P (2019) Epigenetics in the public sphere: interdisciplinary perspectives. Environmental Epigenetics, 5, dvz019.

      Eisenberger NI, Cole SW (2012) Social neuroscience and health: neurophysiological mechanisms linking social ties with physical health. Nature neuroscience, 15, 669-674.

      Houtepen L, Hardy R, Maddock J, Kuh D, Anderson E, Relton C, Suderman M, Howe L (2018) Childhood adversity and DNA methylation in two population-based cohorts. Translational Psychiatry, 8, 1-12.

      Johnson GD, Barrera A, McDowell IC, D’Ippolito AM, Majoros WH, Vockley CM, Wang X, Allen AS, Reddy TE (2018) Human genome-wide measurement of drug-responsive regulatory activity. Nature communications, 9, 1-9.

      Klein JC, Agarwal V, Inoue F, Keith A, Martin B, Kircher M, Ahituv N, Shendure J (2020) A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nature Methods, 17, 1083-1091.

      Koss KJ, Gunnar MR (2018) Annual research review: Early adversity, the hypothalamic–pituitary– adrenocortical axis, and child psychopathology. Journal of Child Psychology and Psychiatry, 59, 327-346.

      Marzi SJ, Sugden K, Arseneault L, Belsky DW, Burrage J, Corcoran DL, Danese A, Fisher HL, Hannon E, Moffitt TE (2018) Analysis of DNA methylation in young people: limited evidence for an association between victimization stress and epigenetic variation in blood. American journal of psychiatry, 175, 517-529.

      Muerdter F, Boryń ŁM, Woodfin AR, Neumayr C, Rath M, Zabidi MA, Pagani M, Haberle V, Kazmar T, Catarino RR (2018) Resolving systematic errors in widely used enhancer activity assays in human cells. Nature methods, 15, 141-149.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      1) Can the authors statistically define the egg-laying classes? In some parts of the manuscript, the division between the different classes could be more ambiguous. I understand that the class III strains are divided by the kcnl-1 genotype, but given the different results for diverse traits, it could be more clear to keep them as one class. Also, overall, the authors choose a collection of 15 strains across the different classes to phenotype for many traits and perform genome edits. It is understandable that they cannot test all strains, but given the variation across traits and classes, it might be good to add a few more caveats about how these strains might not be representative of all strains across the species.

      Response: The egg-laying classes were defined as in Figure 1A by arbitrarily chosen cut-offs (at 10, 10-25, and 25 eggs in utero) to simplify subsequent analyses. We added this explanation to the first paragraph of the results section. Nevertheless, average egg retention differs significantly between the four defined classes in the 15 selected strains (Fig. 2A).

      We think that the distinction between Class IIIA and IIIB strains is important and justified because the two Classes differ significantly in mean egg retention (Fig. 2A) and because Class IIIB strains harbour the large-effect variant KCNL-1 V530L whereas Class IIIA strains do not.

      We agree that the 15 selected strains are not necessarily representative of all strains across the species. We have added a note of caution regarding this point to the first paragraph of the section “Temporal progression of egg retention and internal hatching”: “Note that this strain selection, especially concerning the largest Class II, is unlikely to reflect the overall strain diversity observed across the species". In addition, we have reworded the first sentence of this paragraph as follows: “ To better characterize natural variation in C. elegans egg retention, we focused on a subset of 15 strains from divergent phenotypic Classes I-III, with an emphasis on Class III strains exhibiting strong egg retention (at mid-L4 + 30h) (Fig. 2A and 2B).”
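
      The class definitions described in this response can be written down as a small decision rule. This is a sketch under stated assumptions, not the authors' code: the function and argument names are hypothetical, and the handling of values exactly at 10 or 25 eggs is an assumption because the response does not say which class the boundaries belong to.

```python
def egg_laying_class(mean_eggs_in_utero, has_kcnl1_v530l=False):
    """Assign a phenotypic class from mean egg retention.

    Cut-offs follow the response (10 and 25 eggs in utero). Treating exactly
    10 and exactly 25 as Class II is an assumption made for this sketch.
    Class III is split by the KCNL-1 V530L variant, as stated in the response.
    """
    if mean_eggs_in_utero < 10:
        return "I"
    if mean_eggs_in_utero <= 25:
        return "II"
    return "IIIB" if has_kcnl1_v530l else "IIIA"

# Hypothetical strain means, for illustration only.
print(egg_laying_class(7.5))                         # I
print(egg_laying_class(15.0))                        # II
print(egg_laying_class(32.0))                        # IIIA
print(egg_laying_class(40.0, has_kcnl1_v530l=True))  # IIIB
```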

      2) For the GWAS experiments, the authors should describe if any of the QTL overlap with hyper-divergent regions in the strain set. The QTL could be driven by these less well defined regions.

      Response: We have added the following sentence: “The three QTLs do not align with any of the recently identified hyper-divergent regions of the genome (Lee et al., 2021).”

      3) The authors should look at correlations between the mod-5(n822) edit phenotypes and the exogenous 5-HT and SSRI phenotypes to demonstrate how the traits can differ. Some correlation plots might help that point as well.

      Response: We examined all possible correlations as suggested: none are significant and strain effects on trait differences are idiosyncratic, as written in our results section. The correlational analyses remain of limited value due to small samples: N=10 for mean strain values for measured phenotypes. We therefore feel that these analyses do not provide any additional insights beyond our figures (4C, 4D, 5C, 5D, S5A-C ) and our statement on page 15: “As in previous experiments (Fig. 4C and 5C), we find again that strains sharing the same egg retention phenotype may differ strongly in egg-laying behaviour in response to modulation of both exo- and endogenous serotonin levels (Class IIIA: ED3005 and JU2829) (Fig. 5D and S5C).”

      4) Figure 6D, was there any censoring of the data? Normally, these types of studies are plagued by an increase in censored animals that can decrease significance. The effects among the classes seem large, but statistical comparisons might help as well.

      Response: There was no censoring of animals (censoring of animals in lifespan studies is usually done by removing “bags of worms”, which here was our study phenotype). We now mention this in the corresponding figure legend. We also added a statistical analysis showing that mean survival was significantly different between all Classes.

      5) Many of the traits, edits, and deeper analyses are performed on the JU751 genetic background. This choice is sensible, otherwise, the work can increase exponentially. However, the authors should add a caveat about how these results might be limited to JU751 and other strains might respond differently.

      Response: For certain experiments, it was not feasible to include multiple strains from all phenotypic classes, so we selected JU751 (Class IIIB) and JU1200 (Class II), for which we had established CRISPR-engineered lines to modulate the egg retention phenotype by a single amino acid change in KCNL-1. To emphasize that these experimental observations cannot be generalized, we added the following statement in the relevant results section: “These experimental results offer preliminary evidence (bearing in mind that our analysis was primarily centered on a single genetic background) that laying of advanced-stage embryos may enhance intraspecific competitive ability, particularly in scenarios where multiple genotypes compete for colonization and exploitation of limited, patchily distributed resources.”

      6) The authors argue that evolution could be acting on specific parts of the egg-laying machinery (e.g., muscle-directed signaling components). It might be useful to look at levels of standing variation and selection at groups of loci compared to genomic controls to see if this conclusion can be strengthened.

      Response: This is a good idea but how to select pertinent candidate loci is unclear (there are over 300 genes with effects on egg laying, www.wormbase.org). In addition, the genetics of muscle-directed signalling components in egg laying is only starting to be explored, with no specific candidate genes having been identified (Medrano & Collins, 2023, Curr Biol). We therefore think that such an analysis is currently not possible.

      7) Completely optional: The authors present a compelling and interesting case for transitions and trade-offs between oviparity and viviparity. The C. vivipara species has a different egg-laying mode than other Caenorhabditis species. The authors could add a short section describing their expectations about the neuronal morphology, 5-HT circuits, and muscle function in this species given their results. What genes or circuits should be the focus of future studies to address this question in Caenorhabditis? Also, Loer and Rivard present some similar ideas based on the differences in 5-HT staining neurons across diverse nematodes. Those results can be incorporated and discussed as well.

      Response: Our current research focuses on the evolution of egg laying in different Caenorhabditis species. So far, however, it remains difficult to provide specific hypotheses on how the egg-laying circuit has changed in C. vivipara. We rephrased the final paragraph of the discussion to incorporate some of the reviewer’s suggestions: “Nematodes display frequent transitions from oviparity to obligate viviparity in many distinct genera (Sudhaus, 1976; Ostrovsky et al., 2015), including in the genus Caenorhabditis, with at least one viviparous species, C. vivipara (Stevens et al., 2019). Although evidence exists for the evolution of egg-laying circuitry across oviparous Caenorhabditis species (Loer and Rivard, 2007), the specific cellular and genetic changes responsible for the transition to obligate viviparity in C. vivipara have yet to be examined. Resolving the genetic basis of intraspecific variation in C. elegans egg retention, including partial or facultative viviparity, may thus shed light on the molecular changes underlying the initial steps of evolutionary transitions from oviparity to obligate viviparity in invertebrates.”

      Specific edits:

      1) Perhaps a silly point, but "parity" (to my knowledge) does not have a biological meaning on its own. I suggest "egg-laying mode" or "birth mode".

      Response: This term has been used previously in the literature (e.g., https://onlinelibrary.wiley.com/doi/10.1111/jeb.13886 or https://doi.org/10.1101/2023.10.22.563505). However, as the referee rightly points out, this is not a standard term. We therefore replaced “parity mode” with “egg-laying mode”.

      2) "Against fluctuating environmental fluctuations" is a bit strange

      Response: Corrected.

      3) The first publications of Egl mutants were by the Horvitz lab so some citations are not in all of the first descriptions of the trait (early in Results)

      Response: We have added the relevant work (Trent 1982, Trent 1983, Desai & Horvitz 1989) to this paragraph in the early results section.

      4) "Strong egg retention usually strongly..." is a bit strange

      Response: Corrected.

      5) Figure 8G font looks smaller than the others.

      Response: Corrected.

      Reviewer #2:

      1) In Figure 1A, I infer that in the graph class I measurements are represented by dark blue dots and class II by purple dots. I am having a really hard time distinguishing between these two colors in the graph. In the pie chart I have no problem, but in the graph the black lines around the colored dots seem to obscure the colors. Not sure how to fix this graphical problem, but it is preventing the graph from communicating the results effectively.

      Response: We have changed the colours, spacing and format of this figure to resolve this problem.

      2) The behavioral analysis of Figure 3B-3F is problematic. The experimental methods used and the interpretation of the results each have issues. This is cause for concern since this is the most direct analysis of the actual variations in egg-laying behavior across strains presented in this paper.

      This experiment is modeled after the work of Waggoner et al. 1998, who recorded egg laying events of individual worms on video over several hours and noted the exact time of individual egg laying events. Waggoner et al. found in the reference C. elegans strain N2 that egg-laying events occurred in ~2 minute clusters ("active phases") separated by ~20 minute silent periods ("inactive phases"). Mignerot et al. did not take continuous videos of animals, but rather examined plates bearing a single worm only every 5 minutes and noted the number of new eggs that appeared on the plate in each 5-minute interval. From these data, the authors claim they have measured the intervals between "egg-laying phases" (the term used in the Figure 3 legend). In the Results, the authors explicitly claim they are measuring the timing and frequency of actual active and inactive egg-laying phases. Apparently, all the eggs laid within one 5-minute interval are considered to have been laid in a single active phase, and the time between 5-minute intervals containing egg laying events is considered an "inactive phase" and is measured only with a resolution of 5 minutes. It is not explained anywhere how the authors handle the situation of seeing eggs laid in two consecutive 5-minute intervals. Is that one active phase that is 10 minutes long, or is that two separate active phases with a 5-minute active phase in between? Because of this ambiguity in how they define active and inactive phases, I find it impossible to understand and judge the data presented in Fig. 3D-3F. The authors in the results state that "Class I and Class IIIB displayed significantly accelerated and reduced egg laying activity respectively (Fig. 3C to 3E)". I assume they are referring to the statistical analysis described in the figure legend, which is quite difficult to understand. Frankly, just looking at the graphs in Fig. 3D-3F, it is hard for the reader to identify which specific features shown in the graphs can explain why, for example, Class I strains have fewer retained eggs than Class III strains. So, I found this analysis very unsatisfying.

      I also feel the authors are making an unwarranted assumption that their non-N2 strains will have distinguishable active and inactive phases of egg-laying behavior analogous to those seen in the N2 strain. Given the possibly large variations in egg-laying behavior in the various strains examined, that assumption should be questioned. Thus, framing the entire analysis of behavior patterns in terms of the length of active and inactive phases might not be appropriate.

      Response: This comment validly highlights important problems and limitations of our scan-sampling method to quantify strain differences in egg-laying behaviour. We acknowledge that we failed to present the data with due diligence and with clarity regarding terminology and interpretation. However, we think that some of these results are still of value after revised presentation. Our biggest mistake was to use the terms “active and inactive phase”, as coined by Waggoner et al. 1998. We are aware that our measures are not equivalent to these previously defined measures but have been sloppy with terminology. We therefore carefully reworded this entire results section, using clear definitions to indicate differences between the Waggoner assay and our assay (including a graphical representation of our assay design in the revised Fig. 3B). In brief, our simplified assay is useful to estimate the frequency and approximate duration of prolonged inactive periods of egg laying because we can unambiguously determine intervals in which eggs were laid or not. In contrast, as pointed out by the reviewer, we cannot determine if multiple active phases occurred within a 5-min interval, nor can we estimate the duration of an active “phase”. We now state this limitation explicitly in the manuscript. What our results do show is that the number of intervals during which egg laying occurred is significantly different between strains and Classes: Class I (low retention) have a higher number of intervals with egg-laying events, whereas Class IIIB showed a reduced number of such events (Fig. 3D). We can therefore also roughly estimate the mean time (per individual) between two egg-laying intervals, giving us a proxy for prolonged periods when egg-laying is inactive (Fig. 3E); we note that our estimate for N2 is very close to what has been previously measured (~20 min). Therefore, we can confidently conclude that there are natural strains which have both shorter (Class I) and longer (Class IIIB) inactive periods of egg laying. These results partly align with observed variation in egg retention. However, we agree with the reviewer – as we had stated both in results and discussion sections – that these behavioural differences act together with differences in the sensing of egg accumulation in utero (as suggested by results shown in Fig. 3G and 3H). We also agree that it seems very plausible that the observed behavioural differences, as revealed by scan-sampling, may only have a secondary role in accounting for natural variation in egg retention. We will be testing these hypotheses specifically in our future research.
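
      The two scan-sampling measures described above (the number of intervals with egg laying, and the mean time between such intervals) can be sketched for a single individual as follows. This is an illustration only, not the authors' analysis: the function name and the example counts are hypothetical, and treating two consecutive egg-laying intervals as separated by one interval length is an assumption, since that case is exactly the ambiguity the reviewer raised.

```python
def scan_sampling_summary(eggs_per_interval, interval_min=5):
    """Summarise one individual's scan-sampling record.

    eggs_per_interval : number of new eggs counted at each scan
    interval_min      : minutes between scans (5 in the assay described)

    Returns (number of intervals with at least one egg,
             mean minutes between successive egg-laying intervals or None).
    """
    laying = [i for i, n in enumerate(eggs_per_interval) if n > 0]
    if len(laying) < 2:
        return len(laying), None  # no gap can be measured
    gaps = [(b - a) * interval_min for a, b in zip(laying, laying[1:])]
    return len(laying), sum(gaps) / len(gaps)

# Hypothetical one-hour record: eggs seen at scans 1, 5, 10 and 11.
counts = [0, 3, 0, 0, 0, 2, 0, 0, 0, 0, 4, 1]
n_laying, mean_gap = scan_sampling_summary(counts)
print(n_laying, mean_gap)
```

      Because each scan only records whether eggs appeared since the previous scan, the measured gap has a resolution of one interval and says nothing about how many bouts occurred within an interval.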

      Note: The statistical analyses are nested ANOVAs to ask (a) does the value differ between strains within a given class and (b) does the value differ between Classes? Classes labelled with different letters in the figures therefore significantly differ in their mean values, demonstrating that measured behavioural phenotypes consistently differ between some (but not all) phenotypic classes, yet largely in line with their egg retention phenotypes (Fig. 3D and 3E).

      3) Figure 4A is a schematic diagram of how the egg-laying circuit works based on previous literature, and the authors cite Collins et al. 2015 and Kopchock et al. 2021 as their sources. One feature of this figure seems unwarranted, namely the part indicating that egg accumulation acts on the UM muscles, and the statement in the legend that "mechanical excitation of uterine muscles (UM) in response to egg accumulation favours exit from the inactive state (Collins et al., 2016)". I believe Collins et al. 2016 showed that egg accumulation favors egg laying and may have speculated that it does so by stretching the um muscles, but this idea remains speculative and has not been established by any experimental data. I point out this issue, in particular, because it may bear on the nice data the authors of this manuscript show in Figure 3G and 3H, which show that some strains accumulate many eggs in the uterus before they initiate egg laying.

      Also, in Figure 4A and 4B, the legend does not explain the logic of the green areas labeled "egg-laying active phase" and the yellow area labeled "egg-laying inactive state". I was not sure how to interpret these features of the graphics.

      Response: The input from uterine muscles remains indeed hypothetical, and we have corrected the figure accordingly, now simply referring to the feedback of egg accumulation on egg laying activity, as recently characterized in more detail by Medrano & Collins (2023, Curr Biol).

      The green/yellow backgrounds shown in figures 4A (and 4B) are not useful and we have removed them.

      4) Results, page 11: "We used standard assays, in which animals are reared in liquid M9 buffer without bacterial food." In the standard assays, animals are reared on NGM agar plates with bacterial food, and then at the start of the egg-laying assay, are transferred to liquid M9 buffer without bacterial food. I assume that is what these authors did, and they should correct the language of the text to make it more accurate.

      Response: The reviewer is correct. We have incorporated this change to improve accuracy.

      5) The authors note that "serotonin induced a much stronger egg-laying response in the Class IIIA strain ED3005 than in other strains (Fig. 4C)". I would like to point out to the authors that strains such as ED3005 that have a very large number of unlaid eggs in their uterus are prone to lay a very large number of eggs when treated with exogenous serotonin, simply for the trivial reason that they have more eggs to release. This was previously seen, for example, in Desai and Horvitz (1989) in certain egg-laying defective mutants.

      Response: This is an important point and our comparison of ED3005 to ALL other strains is problematic. We changed this result description by stating that ED3005 shows possible serotonin hypersensitivity compared to strains with similar levels of egg retention (Class IIIA): “In addition, serotonin induced a much stronger egg-laying response in the strain ED3005 than in other Class IIIA strains with similar levels of egg retention (Fig. 4B). ED3005 may thus exhibit serotonin hypersensitivity, which has been observed in certain egg-laying mutants where perturbed synaptic transmission impacts serotonin signalling (Schafer and Kenyon, 1995; Schafer et al., 1996).”

      6) In Figure 4 the authors show that all strains lay eggs in response to fluoxetine and imipramine, but some strains (Class IIIB) do not lay eggs in response to serotonin. They then cite a series of papers, starting with Trent et al. 1983, that they claim show that this specific phenotype demonstrates that the HSN neurons are functionally releasing serotonin (bottom of page 11). This statement needs to be removed - it is incorrect. It is true that egg laying in response to fluoxetine and/or imipramine AS WELL AS egg laying in response to serotonin has been interpreted as indicating the presence of HSN neurons that functionally release serotonin to stimulate egg laying (these were referred to as Category C by Trent et al., 1983). However, the mutants that Mignerot et al. are talking about (those that don't respond to serotonin but do respond to imipramine/fluoxetine) were called Category D by Trent et al., 1983, and to my knowledge these have never been interpreted as necessarily having functionally intact HSN neurons. Mutants such as these that can lay eggs in some circumstances but cannot lay eggs in response to exogenous serotonin have usually been interpreted as having egg-laying muscles that are defective in responding to serotonin.

      How can we interpret strains that respond to imipramine/fluoxetine and not serotonin? Mignerot et al. cite some of the papers (Kullyev et al. 2010; Weinshenker et al., 1999; Yue et al., 2018) showing that imipramine and fluoxetine have off-target effects and can stimulate egg laying by acting through proteins other than the serotonin-reuptake inhibitor. The authors later in their discussion at the top of Page 24 also cite Dempsey et al 2005, a paper that also argues that imipramine and fluoxetine act via off-target effects. However, currently in Figure 4B Mignerot et al. emphasize that the serotonin reuptake inhibitor is the target of these drugs. Since the results presented for Class IIIB strains are not in accord with this interpretation, this seems misleading to me. The bottom line for me is that class IIIB strains cannot respond to exogenous serotonin, but can lay eggs in other conditions, so perhaps there is something specifically wrong with their ability to respond to serotonin.

      Response: We thank the reviewer for this important comment – we misinterpreted some of these past findings and our statements were either inexact or incorrect. We have revised this section accordingly: “Both drugs also stimulated egg laying in the Class IIIB strains and the Class IIIA strain JU2829 for which exogenous serotonin either inhibited egg laying or had no effect on it (Fig. 4B). In the past, mutants unresponsive to serotonin yet responsive to other drugs, including fluoxetine and imipramine, have been interpreted as being defective in the serotonin response of vulval muscles (Trent et al., 1983; Reiner et al., 1995; Weinshenker et al., 1995). This is indeed the likely case of Class IIIB strains carrying the KCNL-1 V530L variant thought to specifically reduce excitability of vulval muscles (Vigne et al., 2021). Our results therefore suggest that JU2829 (Class IIIA) may exhibit a similar defect in vulval muscle activation via serotonin caused by an alternative genetic change. Overall, these pharmacological assays do not allow us to conclude if and how HSN function has diverged among strains because the mode of action and targets of the tested drugs have not been fully resolved. Nevertheless, our results are consistent with previous models proposing that these drugs do not simply block serotonin reuptake but can stimulate egg laying, to some extent, through mechanisms independent of serotonergic signaling (Trent et al., 1983; Desai and Horvitz, 1989; Reiner et al., 1995; Weinshenker et al., 1995, 1999; Dempsey et al., 2005; Kullyev et al., 2010; Branicky et al., 2014; Yue et al., 2018).”

      We removed the oversimplified Fig. 4B to avoid any misinterpretation.

      8) In Figures 7B and 7C, the authors should add some type of error bars to the graphs to give the readers an idea of whether the differences between strains that they write about are statistically significant or not.

      Response: These are frequency data to describe temporal dynamics of hatching (N=45-72 eggs per strain) (Fig. 7B) and development in single cohorts (N=48-177 eggs per strain) (Fig. 7C), hence, the absence of error bars.

      We agree that this representation of the data is not very telling. We therefore changed the data representation in these two figures to show that there are clear, statistically significant, negative correlations between egg retention and time to hatching / egg-to-adult developmental time.

      9) When the authors reference a list of papers in a single list, e.g. "(Burton et al., 2021; Fausett et al., 2021; Garsin et al., 2001; Padilla et al., 2002; Van Voorhies and Ward, 2000)" they seem to do so in alphabetical order by the first author's last name. I believe the usual practice is to list references by year of publication, with the earliest first.

      Response: We corrected the citation style according to the eLife format.

      10) At the top of page 24, the authors write "It seems unlikely, however, that any of these variants strongly alter central function of HSN and HSN-mediated signalling because fluoxetine and imipramine, known to act via HSN (Dempsey et al., 2005; Trent et al., 1983; Weinshenker et al., 1995), triggered a robust stimulatory effect on egg laying in all examined strains (Fig. 4C)." I believe that the Weinshenker paper in fact showed that imipramine does not act via the HSN, and the Dempsey paper suggested that both drugs can act at least in part independently of the HSN. Therefore, the authors should revise their statement.

      Response: We have removed the sentence.

      Reviewing Editor:

      Minor suggestions:

      1) p. 2, fifth line from bottom: "lead" instead of "leads";

      2) p. 2, last line: "muscle" instead of "muscles";

      3) p. 3, first full paragraph, 17th line: "populations" instead of "population";

      4) p. 5, fourth line from bottom: Delete first comma;

      5) p. 6, Figure 1D: "of" instead of "off";

      6) p. 7, fifth line: "KCNL-1";

      7) p. 9, third paragraph, second line: please clarify "late mid-L4";

      8) p. 16, first line: "exogenous";

      9) p 20, first paragraph, beginning of second sentence: "Whether" instead of "If";

      10) p. 22, ninth line from bottom: delete "shaped by";

      11) p. 23, last paragraph, third and eighth lines from bottom: change "between" to "among"

      Response: Thank you. All corrected.

      Additional changes:

      Figure 5A: We removed Figure 5A, which showed a cartoon of mod-5/SERT and its effects on serotonin signalling. This figure incorrectly showed that MOD-5 is expressed in HSN (Jafari et al 2011 J. Neuroscience, Hammarlund et al 2018 Neuron).

      Abstract: We reworded the abstract to reduce its length.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This work describes new validated conditional double KO (cDKO) mice for LRRK1 and LRRK2 that will be useful for the field, given that LRRK2 is widely expressed in the brain and periphery, and many divergent phenotypes have been attributed previously to LRRK2 expression. The manuscript presents solid data demonstrating that it is the loss of LRRK1 and LRRK2 expression within the SNpc DA cells that is not well tolerated, as it was previously unclear from past work whether neurodegeneration in the LRRK double Knock Out (DKO) was cell autonomous or the result of loss of LRRK1/LRRK2 expression in other types of cells. Future studies may pursue the biochemical mechanisms underlying the reason for the apoptotic cells noted in this study, as here, the LRRK1/LRRK2 KO mice did not replicate the dramatic increase in the number of autophagic vacuoles previously noted in germline global LRRK1/LRRK2 KO mice.

      We thank the editors for handling our manuscript and for the succinct summary that recognizes the significance of our findings and points out interesting directions for future studies. We also thank the reviewers for their helpful comments and positive evaluation of our work. Below, we have provided point-by-point responses to the reviewers’ comments.

      Reviewer #1 (Public Review):

      Summary:

      This is an important work showing that loss of LRRK function causes late-onset dopaminergic neurodegeneration in a cell-autonomous manner. One of the LRRK members, LRRK2, is of significant translational importance as mutations in LRRK2 cause late-onset autosomal dominant Parkinson's disease (PD). While many in the field assume that LRRK2 mutations cause PD via increased LRRK2 activity (i.e., kinase activity), it is not a settled issue as not all disease-causing LRRK2 mutants exhibit increased activity. Further, while LRRK2 inhibitors are in clinical trials for PD, the consequence of chronic, long-term LRRK2 inhibition is unknown. Thus, studies evaluating the long-term impact of LRRK deficit have important translational implications. Moreover, because LRRK proteins, particularly LRRK2, are known to modulate immune response and intracellular membrane trafficking, the study's results and the reagents will be valuable for others interested in LRRK function.

      Strengths:

      This report describes a mouse model where the LRRK1 and LRRK2 genes are conditionally deleted in dopaminergic neurons. Previously, this group showed that while loss of LRRK2 expression does not cause a brain phenotype, loss of both LRRK1 and LRRK2 causes a later onset, progressive degeneration of catecholaminergic neurons and dopaminergic (DAergic) neurons in the substantia nigra (SN), and noradrenergic neurons in the locus coeruleus (LC). However, because LRRK genes are widely expressed with some peripheral phenotypes, it was unknown if the neurodegeneration in the LRRK double knockout (DKO) was cell autonomous. To rigorously test this question, the authors have generated a double conditional (cDKO) allele where both LRRK1 and LRRK2 genes were targeted to contain loxP sites. In my view, this was beyond what is usually required, as most investigators might combine one KO allele with another floxed allele. The authors provide a rigorous validation showing that the driver (DAT-Cre) is expressed in most DAergic neurons in the SN and that LRRK levels are decreased selectively in the ventral midbrain. Using these mice, the authors show that the number of DAergic neurons is normal at 15 months but significantly decreased at 20 months of age. Moreover, the authors show that the number of apoptotic neurons is increased by ~2X in aged SN, demonstrating increased ongoing cell death, as well as an increase in activated microglia. The degeneration is limited to DAergic neurons as LC neurons are not lost as this population does not express DAT. Overall, the mouse genetics and experimental analysis were performed rigorously, and the results were statistically sound and compelling.

      Weaknesses:

      I only have a few minor comments. First is that in PD and other degenerative conditions, loss of axons and terminals occurs prior to cell bodies. It might be beneficial to show the status of DAergic markers in the striatum. Second, previous studies indicate that very little, if any, LRRK1 is expressed in SN DAergic neurons. This is also the case with the Allen Brain Atlas profile. Thus, the authors should discuss the discrepancy, as they seem to imply significant LRRK1 expression in DA neurons.

      We appreciate the reviewer’s recognition of the importance of the study as well as our rigorous experimental approaches and compelling results. Our responses to the reviewer's two minor comments are below.

      1) DAergic markers in the striatum: We performed TH immunostaining in the striatum and quantified TH+ DA terminals in the striatum of DA neuron-specific LRRK cDKO and littermate control mice at the ages of 15 and 24 months. We found similar levels of TH immunoreactivity in the striatum of LRRK cDKO and littermate control mice at the age of 15 months (p = 0.6565, unpaired Student’s t-test) and significantly reduced levels of TH immunoreactivity in the striatum of LRRK cDKO, compared to control mice at the age of 24 months (~19%, p = 0.0215), suggesting an age-dependent loss of dopaminergic terminals in the striatum of DA neuron-specific LRRK cDKO mice. These results are now included as Figure 5 of the revised manuscript.

      2) LRRK1 expression in the SNpc: It is shown in the Mouse brain RNA-seq dataset and the Allen Mouse brain ISH dataset (https://www.proteinatlas.org/ENSG00000154237-LRRK1/brain) that LRRK1 is broadly expressed in the mouse brain and is expressed at modest levels in the midbrain, comparable to the cerebral cortex. Indeed, our Western analysis also showed that levels of LRRK1 detected in the dissected ventral midbrain and the cerebral cortex of control mice are similar (40µg total protein loaded per lane; Figure 2E). Furthermore, we previously demonstrated that deletion of LRRK2 (or LRRK1) alone does not cause age-dependent loss of DA neurons in the SNpc, but deletions of both LRRK1 and LRRK2 result in age-dependent loss of DA neurons in LRRK DKO mice, indicating the functional importance of LRRK1 in the protection of DA neuron survival in the aging mouse brain (Tong et al., PNAS 2010, 107: 9879-9884, Giaime et al., Neuron 2017, 96: 796-807).

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Shen and collaborators described the generation of cDKO mice lacking LRRK1 and LRRK2 selectively in DAT-positive DAergic neurons. The Authors asked whether selective deletion of both LRRK isoforms could lead to a Parkinsonian phenotype, as previously reported by the same group in germline double LRRK1 and LRRK2 knockout mice (PMID: 29056298). Indeed, cDKO mice developed a late reduction of TH+ neurons in SNpc that partially correlated with the reduction of NeuN+ cells. This was associated with increased apoptotic cell and microglial cell numbers in SNpc.

      Unlike the constitutive DKO mice described earlier, however, cDKO mice did not replicate the dramatic increase in the number of autophagic vacuoles. The study supports the authors' hypothesis that loss of function rather than gain of function of LRRK2 leads to PD.

      Strengths:

      The study described for the first time a model where both the PD-associated gene LRRK2 and its homolog LRRK1 are deleted selectively in DAergic neurons, offering a new tool to understand the physiopathological role of LRRK2 and the compensating role of LRRK1 in modulating DAergic cell function.

      Weaknesses:

      The model has no construct validity since loss of function mutations of LRRK2 are well-tolerated in humans and do not lead to PD. The evidence of a Parkinsonian phenotype in these cDKO mice is limited and should be considered preliminary.

      We thank the reviewer for commenting on the usefulness of this new PD mouse model.

      The reviewer did not include a reference citation for the statement "loss of function mutations of LRRK2 are well-tolerated in humans and do not lead to PD." It is possible that the reviewer was referring to a human population study (Whiffin et al., Nat Med 2020, 26: 869-877), entitled "The effect of LRRK2 loss-of-function variants in humans." In this study, the authors analyzed 141,456 individuals sequenced in the Genome Aggregation Database, 49,960 exome-sequenced individuals from the UK Biobank, and more than 4 million participants in the 23andMe genotyped dataset, and they looked for human genetic variants predicted to cause loss-of-function of protein-coding genes (pLoF variants). The reported findings were interesting, and the authors were careful in stating their conclusions. However, this is not a linkage study of large pedigrees carrying a single, clear-cut loss-of-function mutation (e.g. large deletions of most exons and coding sequences). Therefore, the experimental evidence is not compelling enough to conclude whether loss-of-function mutations in LRRK2 cause PD or do not cause PD.

      The current report is an unbiased genetic study in an effort to reveal the normal physiological role of LRRK in dopaminergic neurons. It was not intended to produce Parkinsonian phenotypes in LRRK cDKO mice, which would be a biased effort. However, the unequivocal discovery of the cell intrinsic role of LRRK in the protection of DA neurons from age-dependent degeneration and apoptotic cell death should be considered seriously, while we contemplate the disease mechanism and how LRRK2 mutations may cause DA neuron loss and PD.

      Reviewer #3 (Public Review):

      Kang, Huang, and colleagues investigated the impact of LRRK1 and LRRK2 deletion, specifically in dopaminergic neurons, using a novel cDKO mouse model. They observed a significant reduction in DAergic neurons in the substantia nigra in their conditional LRRK1 and LRRK2 KO mice and a corresponding increase in markers of apoptosis and gliosis. This work set out to address a longstanding question within the field around the role and importance of LRRK1 and LRRK2 in DAergic neurons and suggests that the loss of both proteins triggers some neurodegeneration and glial activation.

      The studies included in this work are carefully performed and clearly communicated, but additional studies are needed to strengthen further the authors' claims around the consequences of LRRK2 deletion in DAergic neurons.

      1) In Figures 2E and F, the authors assess the protein levels of LRRK1 and LRRK2 in their cDKO mouse model to confirm the deletion of both proteins. They observe a mild loss of LRRK1 and LRRK2 signals in the ventral midbrain compared to wild-type animals. While this is not surprising given other cell types that still express LRRK1 and LRRK2 would be present in their dissected ventral midbrain samples, it does not sufficiently confirm that LRRK1 and LRRK2 are not expressed in DAergic neurons. Additional data is needed to more directly demonstrate that LRRK1 and LRRK2 protein levels are reduced in DAergic neurons, including analysis of LRRK1 and LRRK2 protein levels via immunohistochemistry or FACS-based analysis of TH+ neurons.

      We thank the reviewer for highlighting this incredibly important but often overlooked issue. We agree that the data in Figure 2E, F alone would be inadequate to validate DA neuron-specific LRRK cDKO mice.

      Cell type-specific conditional knockouts are a mosaic with KO cells mixed with other cell types expressing the gene normally. DA neuron-specific cDKO is particularly challenging, as DA neurons are a subset of cells embedded in the ventral midbrain. Rather than using immunostaining, which relies upon specific, good LRRK1 and LRRK2 antibodies for IHC, or FACS sorting of TH+ neurons followed by Western blotting (few cells, mixed cell populations, etc.), we chose a clean genetic approach by generating germline mutant mice carrying the deleted LRRK1 and LRRK2 alleles in all cells from the floxed LRRK1 and LRRK2 alleles. This approach permits characterization of these deletion mutations in germline mutant mice using molecular approaches that yield unambiguous results.

      We crossed CMV-Cre deleter mice with floxed LRRK1 and LRRK2 mice to generate respective germline LRRK1 KO and LRRK2 KO mice, in which all cells carry the LRRK1 or LRRK2 deleted alleles that are identical to those in DA neurons of cDKO mice. We then performed Northern, extensive RT-PCR followed by sequencing, and Western analyses to show the absence of the full-length LRRK1 and LRRK2 mRNA (Figure 1G, H, Figure 1-figure supplement 8 and 10), the expected truncation of LRRK1 and LRRK2 mRNA (Figure 1-figure supplement 9 and 11), and the absence of LRRK1 and LRRK2 proteins (Figure 1I). These analyses together demonstrate that in the presence of Cre, either CMV-Cre expressed in all cells or DAT-Cre expressed selectively in DA neurons, the floxed LRRK1 and LRRK2 exons are deleted, resulting in null alleles. We further demonstrated the specificity of DAT-Cre-mediated recombination (deletion) by crossing DAT-Cre mice with a GFP reporter, showing that 99% of TH+ DA neurons in the SNpc are also GFP+ (Figure 2A, B), indicating that DAT-Cre-mediated recombination of the floxed alleles occurs in essentially all TH+ DA neurons in the SNpc.

      2) The authors observed a significant but modest effect of LRRK1 and LRRK2 deletion on the number of TH+ neurons in the substantia nigra (12-15% loss at 20-24 months of age). It is unclear whether this extent of neuron loss is functionally relevant. To strengthen the impact of these data, additional studies are warranted to determine whether this translates into any PD-relevant deficits in the mice, including motor deficits or alterations in alpha-synuclein accumulation/aggregation.

      Yes, the reduction of DA neurons in the SNpc of cDKO mice at the age of 20-24 months is modest. At 15 months of age, the number of TH+ DA neurons in the SNpc is similar between LRRK cDKO mice (10,000 ± 141) and littermate controls (10,077 ± 310, p > 0.9999). At 20 months of age, the number of DA neurons in the SNpc of LRRK cDKO mice (8,948 ± 273) is significantly reduced (-12.7%), compared to control mice (10,244 ± 220, F1,46 = 16.59, p = 0.0002, two-way ANOVA with Bonferroni’s post hoc multiple comparisons, p = 0.0041). By 24 months of age, the number of DA neurons in the SNpc of LRRK cDKO mice (8,188 ± 452) relative to controls (9,675 ± 232, p = 0.0010) is further reduced (-15.4%).
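
      The percentages quoted above follow directly from the reported group means. As a minimal sketch (the function name `percent_change` and the dictionary name are ours and purely illustrative, and only the mean counts stated in this response are used, not the underlying per-animal data), the arithmetic can be checked as follows:

```python
def percent_change(control_mean, test_mean):
    """Relative change of test_mean versus control_mean, in percent."""
    if control_mean == 0:
        raise ValueError("control_mean must be non-zero")
    return (test_mean - control_mean) / control_mean * 100.0


# Mean TH+ neuron counts in the SNpc as quoted in the response:
# age in months -> (littermate control mean, LRRK cDKO mean)
SNPC_TH_COUNTS = {
    15: (10077, 10000),
    20: (10244, 8948),
    24: (9675, 8188),
}

for age, (control, cdko) in SNPC_TH_COUNTS.items():
    print(f"{age} months: {percent_change(control, cdko):+.1f}%")
```

      Rounded to one decimal place, this gives the 12.7% and 15.4% reductions at 20 and 24 months, and a difference of under 1% at 15 months.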

      Similar results were obtained by an independent quantification by another investigator, also conducted in a genotype-blind manner, using the fractionator and optical dissector method, by which TH+ cells were quantified in 25% of the area. These results are included as Figure 3-figure supplement 1 in the revised manuscript. Because of the more limited sampling, the quantification data are more variable, compared to quantification of TH+ cells in all areas of the SNpc, shown in Figure 3. With both methods, we quantified TH+ cells in every 10th section encompassing the entire SNpc (3D structure), as sampling using every 5th or every 10th section yielded similar results.
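
      For readers unfamiliar with the fractionator principle, a total count is estimated by scaling the cells actually counted by the inverse of each sampling fraction. The sketch below assumes the 25% area sampling and the every-10th-section interval mentioned above. The raw count of 250 is a made-up number for illustration, not data from the study, and the thickness sampling fraction is assumed to be 1 because it is not stated here.

```python
def fractionator_estimate(cells_counted, section_interval,
                          area_fraction, thickness_fraction=1.0):
    """Estimate a total cell number from a systematic sample.

    cells_counted      -- cells counted across all sampled sections
    section_interval   -- every n-th section is sampled (e.g. 10)
    area_fraction      -- fraction of each section's area counted (0-1]
    thickness_fraction -- fraction of section thickness counted (0-1]
    """
    if section_interval < 1:
        raise ValueError("section_interval must be >= 1")
    for name, value in (("area_fraction", area_fraction),
                        ("thickness_fraction", thickness_fraction)):
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be in (0, 1]")
    return cells_counted * section_interval / (area_fraction * thickness_fraction)


# Hypothetical example: 250 cells counted in 25% of the area
# of every 10th section.
print(fractionator_estimate(250, section_interval=10, area_fraction=0.25))
```

      Because each counted cell stands in for 40 cells under these assumptions, small differences in the raw count are amplified, which is consistent with the greater variability noted above for the more limited sampling.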

      We also performed behavioral analysis of LRRK cDKO mice and littermate controls at the ages of 10 and 25 months using the beam walk test (10 mm and 20 mm beam) and the pole test, which are sensitive to impairment of motor coordination. We found that LRRK cDKO mice at 10 months of age showed significantly more hindlimb errors (p = 0.0005, unpaired two-tailed Student’s t-test) and longer traversal time (p = 0.0075) in the 10 mm beam walk test, compared to control mice, though their performance is similar in the 20 mm beam walk (hindlimb slips: p = 0.0733, traversal time: p = 0.9796) and in the pole test. At 22 months of age, the performance of LRRK cDKO mice and littermate controls is more variable and worse, compared to the younger mice, and is not significantly different between the genotypic groups. These results are now included as Figure 9 of the revised manuscript.

      3) The authors demonstrate that, unlike in the germline LRRK DKO mice, they do not observe any alterations in electron-dense vacuoles via EM. Given their data showing increased apoptosis and gliosis, it remains unclear how the loss of LRRK proteins leads to DAergic neuronal cell loss. Mechanistic studies would be insightful to understand better potential explanations for how the loss of LRRK1 and LRRK2 may impair cellular survival, and additional text should be added to the discussion to discuss potential hypotheses for how this might occur.

      We agree that this phenotypic difference between germline DKO and DA neuron-specific cDKO mice is intriguing, suggesting a non-cell autonomous contribution of LRRK in age-dependent accumulation of autophagic and lysosomal vacuoles in SNpc neurons of germline LRRK DKO mice. We will discuss the phenotypic difference further in the revised manuscript. We are generating microglia-specific LRRK cDKO mice to investigate the role of LRRK in microglia and whether microglia contribute in a cell extrinsic manner to the regulation of the autophagy-lysosomal pathway in DA neurons.

      4) The authors discuss the potential implications of the neuronal cell loss observed in cDKO mice for LRRK1 and LRRK2 for therapeutic approaches targeting LRRK2 and suggest this argues that LRRK2 variants may exert their effects through a loss-of-protein function. However, all of the data generated in this work focus on a mouse in which both LRRK1 and LRRK2 have been deleted, and it is therefore difficult to make any definitive conclusions about the consequences of specifically targeting LRRK2. The authors note potential redundancy between the two LRRK proteins, and they should soften some of their conclusions in the discussion section around implications for the effects of LRRK2 variants. Human subjects that carry LRRK2 loss-of-function alleles do not have an increased risk for developing PD, which argues against the authors' conclusions that LRRK2 variants associated with PD are loss-of-function. Additional text should be included in their discussion to better address these nuances and caution should be used in terms of extrapolating their data to effects observed with PD-linked variants in LRRK2.

      We will modify the discussion accordingly in the revised manuscript.

    1. Author Response

      eLife assessment

      This valuable paper presents a thoroughly detailed methodology for mesoscale-imaging of extensive areas of the cortex, either from a top or lateral perspective, in behaving mice. While the examples of scientific results to be derived with this method are in the preliminary stages, they offer promising and stimulating insights. Overall, the method and results presented are convincing and will be of interest to neuroscientists focused on cortical processing in rodents.

      Authors’ Response: We thank the reviewers for the helpful and constructive comments. They have helped us plan for significant improvements to our manuscript. Our preliminary response and plans for revision are indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors introduce two preparations for observing large-scale cortical activity in mice during behavior. Alongside this, they present intriguing preliminary findings utilizing these methods. This paper is poised to be an invaluable resource for researchers engaged in extensive cortical recording in behaving mice.

      Strengths:

      -Comprehensive methodological detailing:

      The paper excels in providing an exceptionally detailed description of the methods used. This meticulous documentation includes a step-by-step workflow, complemented by thorough protocols and a list of materials in the supplementary materials.

      -Minimal movement artifacts:

      A notable strength of this study is the remarkably low movement artifacts. To further underscore this achievement, a more robust quantification across all subjects, coupled with benchmarking against established tools (such as those from suite2p), would be beneficial.

      Authors’ Response: This is a good suggestion. Since we used suite2p for our data analysis, and have records of the fast-z correction applied by the microscope, we can supply these as quantifications of movement corrections that were applied across our sample of mice. We hope to supply this information as a supplement in the revised manuscript.

      Currently, we have chosen to show that the corrected, post-suite2p registration movement artifacts are very close to zero. We will revise the manuscript with clear descriptions of methods that we have found important, such as fully tightening all mounting devices, utilizing the air table properly, implanting the cranial window with proper, even pressure across its entire extent, and mounting the mouse so that it is not too close or far from the surface of the running wheel.

      -Insightful preliminary data and analysis:

      The preliminary data unveiled in the study reveal interesting heterogeneity in the relationships between neural activity and detailed behavioral features, particularly notable in the lateral cortex. This aspect of the findings is intriguing and suggests avenues for further exploration.

      Weaknesses:

      -Clarification about the extent of the method in the title and text:

      The title of the paper, using the term "pan-cortical," along with certain phrases in the text, may inadvertently suggest that both the top and lateral view preparations are utilized in the same set of mice. To avoid confusion, it should be explicitly stated that the authors employ either the dorsal view (which offers limited access to the lateral ventral regions) or the lateral view (which restricts access to the opposite side of the cortex). For instance, in line 545, the phrase "lateral cortex with our dorsal and side mount preparations" should be revised to "lateral cortex with our dorsal or side mount preparations" for greater clarity.

      Authors’ Response: We will revise the manuscript so that it is clear that we made use of two imaging configurations for the 2-photon mesoscope data and the benefits and limitations of these two preparations. The dorsal mount and the side mount each have their advantages and disadvantages, but together form a powerful tool for imaging much of the dorsal and lateral cortex in awake, behaving mice.

      -Comparison with existing methods:

      A more detailed contrast between this method and other published techniques would add value to the paper. Specifically, the lateral view appears somewhat narrower than that described in Esmaeili et al., 2021; a discussion of this comparison would be useful.

      Authors’ Response: We will modify the manuscript so that a more detailed comparison with other published techniques is included. The preparation by Esmaeili et al. 2021 has some similarities, but also differences, from our preparation. Our preliminary reading is that their through-the-skull field of view is approximately the same as our through-the-skull field of view that exists between our first (headpost implantation) and second (window implantation) surgeries, although our preparation appears to include more anterior areas both near to and on the contralateral side of the midline. We will compare these preparations more accurately in the revised manuscript.

      If you compare the imageable extent of our cranial window for mesoscale 2-photon imaging to that of their through-the-skull widefield preparation, which is a bit of an “apples to oranges” comparison, then you are likely correct that their field of view is larger than ours, if you are referring to our 10 mm radius-bend glass. However, use of our 9 mm radius bend glass (i.e. a tighter bend) allows us to image additional ventral auditory areas. We could show an example of this, perhaps, although we did not make as much use of this alternative window in the large FOV experiments, because the increased curvature of the glass relative to the 10 mm radius bend window prevents imaging of the entire preparation in a single 2-photon z-plane. With the 9 mm radius bend glass we mostly imaged in the multiple, small FOV configuration (see Fig. S2).

      Furthermore, the number of neurons analyzed seems modest compared to recent papers (50k) - elaborating on this aspect could provide important context for the readers.

      Authors’ Response: With respect to the “modest” number of neurons analyzed (between 2000 and 8000 neurons per session for our dorsal and side mount preparations with medians near 4500; See Fig. S2e) we would like to point out that factors such as use of dual-plane imaging or multiple imaging planes, different mouse lines, use of different duration recording sessions (see our Fig S2c), use of different imaging speeds and resolutions (see our Fig S2d), use of different Suite2p run-time parameters, and inclusion of areas with blood vessels and different neuron cell densities, may all impact the count of total analyzed neurons. We could provide additional documentation of these issues, but we would like to point out that, in our case, we were not trying to maximize neuron count at the expense of other factors such as imaging speed and total spatial FOV extent.

      -Discussion of methodological limitations:

      The limitations inherent to the method, such as the potential behavioral effects of tilting the mouse's head, are not thoroughly examined. A more comprehensive discussion of these limitations would enhance the paper's balance and depth.

      Authors’ Response: Our mice readily adapted to the 22.5 degree head tilt and learned to perform 2-alternative forced choice (2-AFC) auditory and visual tasks in this situation (Hulsey et al, 2024; Cell Reports). The advantages and limitations of such a rotation of the mouse, and possible ways to alleviate these limitations, as detailed in the following paragraphs, will be discussed more thoroughly in the revised manuscript.

      One can look at Supplementary Movie 1 for examples of the relatively similar behavior between the dorsal mount (not rotated) and side mount (rotated) preparations. We do not have behavioral data from mice that were placed in both configurations. Our preliminary comparison across mice indicates that side and dorsal mount mice show similar behavioral variability.

      It was in general important to make sure that the distance between the wheel and all four limbs was similar for both preparations. In particular, careful attention must be paid to the positioning of the front limbs in the side mount mice so that they are not too high off the wheel. This can be accomplished by a slight forward angling of the left support arm for side mount mice.

      Although it would in principle be nearly possible to image the side mount preparation in the same optical configuration that we do without rotating the mouse, by rotating the objective to 20 degrees to the right, we found that the last 2-3 degrees of missing rotation (our preparation is rotated 22.5 degrees left, which is more than the full available 20 degrees rotation of the objective), along with several other factors, made this undesirable. First, it was very difficult to image auditory areas without the additional flexibility to rotate the objective more laterally. Second, it was difficult or impossible to attach the horizontal light shield and to establish a water meniscus with the objective fully rotated. One could use gel instead (which we found to be optically inferior to water), but without the horizontal light shield, the UV and IR LEDs can reach the PMTs via the objective and contaminate the image or cause tripping of the PMT. Third, imaging the right pupil and face of the mouse is difficult to impossible under these conditions because the camera would need the same optical access angle as the objective, or would need to be moved down toward the air table and rotated up 20 degrees, in which case its view would be blocked by the running wheel and other objects mounted on the air table.

      -Preliminary nature of results:

      The results are at a preliminary stage; for example, the B-SOiD analysis is based on a single mouse, and the validation data are derived from the training data set. The discrepancy between the maps in Figures 5e and 6e might indicate that a significant portion of the map represents noise. An analysis of variability across mice and a method to assign significance to these maps would be beneficial.

      Authors’ Response: In this methods paper, we have chosen to supply proof-of-principle examples, without a complete analysis of animal-to-animal variance. The dataset for this paper contains both neural and behavioral data for 91 sessions across 18 mice from both dorsal and side mount preparations. The complete analysis of this dataset exceeds the capacity of the present study. We will include more individual examples in the revised version, along with data showing the amount of between-session and across-mouse variance. We will include in the revised manuscript a comparison of the stability of B-SOiD measures across sessions, as a demonstration of what may be expected with this method.

      -Analysis details:

      More comprehensive details on the analysis would be beneficial for replicability and deeper understanding. For instance, the statement "Rigid and non-rigid motion correction were performed in Suite2p" could be expanded with a brief explanation of the underlying principles, such as phase correlation, to provide readers with a better grasp of the methodologies employed.

      Authors’ Response: We are revising the manuscript to give more detail without reducing readability, so as to increase clarity of presentation. Since this is a methods paper, we are modifying the manuscript to include more details and clear explanations so that the reader may replicate our methods and results.
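
      To illustrate the phase correlation principle that the reviewer mentions, the following is a minimal one-dimensional sketch, not the Suite2p implementation. It assumes a purely circular (wrap-around) shift and uses a naive DFT so that it needs only the standard library; the function names `dft` and `phase_correlation_shift` and the example signal are hypothetical. Rigid registration of imaging frames applies the same idea in two dimensions.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out

def phase_correlation_shift(reference, moved):
    """Estimate the circular shift s such that moved[t] == reference[t - s]."""
    f_ref = dft(reference)
    f_mov = dft(moved)
    cross = []
    for a, b in zip(f_mov, f_ref):
        c = a * b.conjugate()
        mag = abs(c)
        # Keep only the phase of the cross-power spectrum.
        cross.append(c / mag if mag > 1e-12 else 0j)
    # The inverse transform of a pure phase ramp peaks at the shift.
    corr = [v.real for v in dft(cross, inverse=True)]
    n = len(corr)
    peak = max(range(n), key=corr.__getitem__)
    # Map peaks in the upper half of the array to negative shifts.
    return peak if peak <= n // 2 else peak - n

# Hypothetical example: a small bump, shifted right by 3 samples.
reference = [0, 0, 1, 3, 7, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
moved = reference[-3:] + reference[:-3]
estimated_shift = phase_correlation_shift(reference, moved)
```

      Because only the phase of the cross-power spectrum is retained, the estimate depends on where structure sits in the frame rather than on its overall brightness, which is one reason the approach is commonly used for motion correction.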

      Reviewer #2 (Public Review):

      Summary:

      The authors present a comprehensive technical overview of the challenging acquisition of large-scale cortical activity, including surgical procedures and custom 3D-printed headbar designs to obtain neural activity from large parts of the dorsal or lateral neocortex. They then describe technical adjustments for stable head fixation, light shielding, and noise insulation in a 2-photon mesoscope and provide a workflow for multisensory mapping and alignment of the obtained large-scale neural data sets in the Allen CCF framework. Lastly, they show different analytical approaches to relate single-cell activity from various cortical areas to spontaneous activity by using visualization and clustering tools, such as Rastermap, PCA-based cell sorting, and B-SOiD behavioral motif detection.

      Authors’ Response: Thank you for this excellent summary of the scope of our paper.

      The study contains a lot of useful technical information that should be of interest to the field. It tackles a timely problem that an increasing number of labs will be facing as recent technical advances allow the activity measurement of an increasing number of neurons across multiple areas in awake mice. Since the acquisition of cortical data with a large field of view in awake animals poses unique experimental challenges, the provided information could be very helpful to promote standard workflows for data acquisition and analysis and push the field forward.

      Authors’ Response: We very much support the idea that our work here will contribute to the development of standard workflows across the field including multiple approaches to large-scale neural recordings.

      Strengths:

      The proposed methodology is technically sound and the authors provide convincing data to suggest that they successfully solved various problems, such as motion artifacts or high-frequency noise emissions, during 2-photon imaging. Overall, the authors achieved their goal of demonstrating a comprehensive approach for the imaging of neural data across many cortical areas and providing several examples that demonstrate the validity of their methods and recapitulate and further extend some recent findings in the field.

      Weaknesses:

      Most of the descriptions are quite focused on a specific acquisition system, the Thorlabs Mesoscope, and the manuscript is in part highly technical making it harder to understand the motivation and reasoning behind some of the proposed implementations. A revised version would benefit from a more general description of common problems and the thought process behind the proposed solutions to broaden the impact of the work and make it more accessible for labs that do not have access to a Thorlabs mesoscope. A better introduction of some of the specific issues would also promote the development of other solutions in labs that are just starting to use similar tools.

      Authors’ Response: We will re-write the motivation behind the study to clarify the general problems that are being addressed. As the 2-photon imaging component of these experiments was performed on a Thorlabs mesoscope, the imaging details will necessarily deal specifically with this system. We will briefly compare the methods and results from our Thorlabs system to those of other systems, based on what we are able to glean from the literature on their strengths and weaknesses.

      Reviewer #3 (Public Review):

      Summary

      In their manuscript, Vickers and McCormick have demonstrated the potential of leveraging mesoscale two-photon calcium imaging data to unravel complex behavioural motifs in mice. Particularly commendable is their dedication to providing detailed surgical preparations and corresponding design files, a contribution that will greatly benefit the broader neuroscience community. The quality of the data is high, but it is not clear whether it is available to the community; some datasets should be deposited. More importantly, the authors have acquired activity-clustered neural ensembles at an unprecedented spatial scale to further correlate with high-level behaviour motifs identified by B-SOiD. Such an advancement marks a significant contribution to the field. While the manuscript is comprehensive and the analytical strategy proposed is promising, some technical aspects warrant further clarification. Overall, the authors have presented an invaluable and innovative approach, effectively laying a solid foundation for future research in correlating large-scale neural ensembles with behaviour. The implementation of a custom sound insulator for the scanner is a great idea and should be something implemented by others.

      Authors’ Response: Thank you for the kind words.

      We intend to make the data set used in making our main figures available to the public, perhaps using FigShare, so that they may check the validity of the methods and analysis. We intend to release a complete data set to the public as a Dandiset on the DANDI archive in conjunction with a second in-depth analysis paper that is currently in preparation.

      This is a methods paper, but there is no large diagram that shows how all the parts are connected, communicating, and triggering each other. This is described in the methods, but a visual representation would greatly benefit the readers looking to implement something similar.

      Authors’ Response: This is an excellent suggestion. We will include a workflow diagram in the revised manuscript for the methods, data collection, and analysis.

      The authors should cite sources for the claims stated in lines 449-453 and cite the claim of the mouse's hearing threshold mentioned in line 463.

      Authors’ Response: For the claim stated in lines 449-453, “The unattenuated or native high-frequency background noise generated by the resonant scanner causes stress to both mice and experimenters, and can prevent mice from achieving maximum performance in auditory mapping, spontaneous activity sessions, auditory stimulus detection, and auditory discrimination sessions/tasks,” we can provide the following references: (i) for mice: Sadananda et al, 2008 (“Playback of 22-kHz and 50-kHz ultrasonic vocalizations induces differential c-fos expression in rat brain”, Neuroscience Letters, Vol 435, Issue 1, p 17-23), and (ii) for humans: Fletcher et al, 2018 (“Effects of very high-frequency sound and ultrasound on humans. Part I: Adverse symptoms after exposure to audible very-high frequency sound”, J Acoust Soc Am, 144, 2511-2520). We will include these references in the revised paper.

      For line 463, “i.e. below the mouse hearing threshold at 12.5 kHz of roughly 15 dB”, we can provide the following reference: Zheng et al, 1999 (“Assessment of hearing in 80 inbred strains of mice by ABR threshold analyses”, Vol 130, Issues 1-2, p 94-107). We will also include this reference in the paper. Thank you for identifying these citation omissions.

      No stats are given for the results shown in Figure 6e; it would be useful to know which of these neural densities for all areas show a clear statistical significance across all the behaviors.

      Authors’ Response: There are two statistical comparisons that we feel may be useful to add to the single session data displayed in this figure, in order to address the point that you raise. The first would allow us to assess whether for each Rastermap group, the distribution of neuron densities across CCF areas differs from a null, uniform distribution. The second would allow us to examine differences between Rastermap groups associated with different qualitative behaviors in order to know with which patterns of neural activity they are reliably associated.

      For the first comparison, we could provide a statistic similar to what we provide for Fig. S6c and f, in which for each CCF area we compare the observed mean correlation values to a null of 0, or, in this case, the population densities of each Rastermap group for each CCF area to a null value equal to the total number of recorded neurons for that group divided by the total number of CCF areas (i.e. a Rastermap group with 500 neurons evenly distributed across ~30 CCF areas would contain ~17 neurons, or ~3% density, per CCF area). Our current figure legend states that the maximum of the scale bar look-up value (reds) for each group ranges from ~8% to 32%. So indeed, adding these significances would be informative in this case.
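
      A minimal sketch of this first comparison, assuming the null is that a group's neurons fall uniformly across CCF areas: it computes a chi-square goodness-of-fit statistic and a Monte Carlo p-value using only the standard library. The function names and the example counts are hypothetical and are not taken from the dataset; a real analysis might instead weight the null by the number of neurons recorded in each area.

```python
import random

def chi_square_stat(counts):
    """Chi-square statistic of per-area counts against a uniform expectation."""
    n_total = sum(counts)
    expected = n_total / len(counts)  # neurons per area under the null
    return sum((c - expected) ** 2 / expected for c in counts)

def uniform_null_p_value(counts, n_sim=200, seed=0):
    """Monte Carlo p-value: how often a uniform assignment of the same
    number of neurons is at least as uneven as the observed counts."""
    rng = random.Random(seed)
    n_total, n_areas = sum(counts), len(counts)
    observed = chi_square_stat(counts)
    hits = 0
    for _ in range(n_sim):
        sim = [0] * n_areas
        for _ in range(n_total):
            sim[rng.randrange(n_areas)] += 1
        if chi_square_stat(sim) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)

# Hypothetical group of 500 neurons over 30 areas, concentrated in one area.
skewed_counts = [210] + [10] * 29
p_skewed = uniform_null_p_value(skewed_counts)
```

      A small p-value here says only that the group is not spread evenly across areas; identifying which areas drive the departure would need per-area tests with correction for multiple comparisons.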

      For the second comparison, we could compare the density of neurons for each CCF area across Rastermap groups for this session. For example, it may be the case that the density of neurons in primary and secondary visual areas belonging to Rastermap groups that predominate during the “walk” behavior is higher than in the Rastermap group that predominates during the “whisk” behavior, or that the density of neurons in the “whisk” and “twitch” Rastermap groups in primary and secondary motor areas is higher than in the Rastermap groups that are active during the “walk” and “oscillate” behaviors.

      Such a comparison should in fact be robust to Rastermap group variability across sessions and mice, as long as the same qualitative behaviors recur. However, our current qualitative methods for discretization of the Rastermap groups likely limit our ability to extend such an analysis accurately across our entire dataset. We are pursuing more rigorous analysis methods in this vein for our second, results-oriented paper.

      While I understand that this is a methods paper, it seems like the authors are aware of the literature surrounding large neuronal recordings during mouse behavior. Indeed, in lines 178-179, the authors mention how a significant portion of the variance in neural activity can be attributed to changes in "arousal or self-directed movement even during spontaneous behavior." Why then did the authors not make an attempt at a simple linear model that tries to predict the activity of their many thousands of neurons by employing the multitude of regressors at their disposal (pupil, saccades, stimuli, movements, facial changes, etc). These models are straightforward to implement, and indeed it would benefit this work if the model extracts information on par with what is known from the literature.

      Authors’ Response: This is an excellent suggestion, but beyond the scope of the current methods paper. We are following up this methods paper with an in-depth analysis of neural activity and corresponding behavior across the cortex during spontaneous and trained behaviors. Here, we prefer to present examples of the types of results that can be expected to be obtained using our methods, and how these results compare with those obtained by others in the field.

      Specific strengths and weaknesses with areas to improve:

      The paper should include an overall cartoon diagram that indicates how the various modules are linked together for the sampling of both behaviour and mesoscale GCAMP. This is a methods paper, but there is no large diagram that shows how all the parts are connected, communicating, and triggering each other.

      Authors’ Response: This is an excellent suggestion and will be included in the revised manuscript, so that readers can more readily follow our workflow, data collection, and analysis.

      The paper contains many important results regarding correlations between behaviour and activity motifs on both the cellular and regional scales. There is a lot of data and it is difficult to draw out new concepts. It might be useful for readers to have an overall figure discussing various results and how they are linked to pupil movement and brain activity. A simple linear model that tries to predict the activity of their many thousands of neurons by employing the multitude of regressors at their disposal (pupil, saccades, stimuli, movements, facial changes, etc) may help in this regard.

      Authors’ Response: This is an excellent suggestion, but beyond the scope of the present methods paper. Such an analysis is a significant undertaking with such large and heterogeneous datasets, and we provide proof-of-principle data here so that the reader can understand the type of data to be expected using our methods. We hope to provide a more complete analysis of data obtained using our methodology in the near future in a second manuscript.

      However, we may be amenable to including preliminary linear model fit results, as supplementary material, for the two example sessions highlighted in this paper (i.e. the one dorsal mount session in Fig. 4, and the one side mount session shown in Figs. 5 and 6).
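
      The kind of linear model the reviewer describes can be sketched as follows, assuming one neuron's activity is predicted from a few behavioral regressors by ridge regression (ordinary least squares when the penalty is zero). This is an illustration only, not the authors' analysis: the function names, the regressors (an intercept, pupil, running), and the synthetic data are hypothetical, and the penalty in this sketch also applies to the intercept.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ridge(X, y, lam=0.0):
    """Weights minimizing ||X w - y||^2 + lam * ||w||^2 (normal equations)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yv for row, yv in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def r_squared(X, y, w):
    """Fraction of variance in y explained by the fitted model."""
    pred = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic, noise-free example: activity = 0.5 + 2*pupil - 1*running.
rng = random.Random(0)
X, y = [], []
for _ in range(50):
    pupil, running = rng.random(), rng.random()
    X.append([1.0, pupil, running])
    y.append(0.5 + 2.0 * pupil - 1.0 * running)
weights = fit_ridge(X, y, lam=0.0)
```

      With real recordings one would fit each neuron separately, choose the penalty by cross-validation, and report explained variance on held-out time points rather than on the training data.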

      Previously, widefield imaging methods have been employed to describe regional activity motifs that correlate with known intracortical projections. Within the authors' data it would be interesting to perhaps describe how these two different methods are interrelated -they do collect both datasets. Surprisingly, such macroscale patterns are not immediately obvious from the authors' data. Some of this may be related to the scaling of correlation patterns or other factors. Perhaps there still isn't enough data to readily see these and it is too sparse.

      Authors’ Response: Unfortunately, we are unable to directly compare widefield GCaMP6s activity with mesoscope 2-photon GCaMP6s activity. During widefield data acquisition, animals were stimulated with visual, auditory, or somatosensory stimuli, while 2-photon mesoscope data collection occurred during spontaneous changes in behavioral state, without sensory stimulation. The suggested comparison is, indeed, an interesting project for the future.

      In lines 71-71, the authors described some disadvantages of one-photon widefield imaging including the inability to achieve single-cell resolution. However, this is not true. In recent years, the combination of better surgical preparations, camera sensors, and genetically encoded calcium indicators has enabled the acquisition of single-cell data even using one-photon widefield imaging methods. These methods include miniscopes (Cai et al., 2016), multi-camera arrays (Hope et al., 2023), and spinning disks (Xie et al., 2023).

      Cai, Denise J., et al. "A shared neural ensemble links distinct contextual memories encoded close in time." Nature 534.7605 (2016): 115-118.

      Hope, James, et al. "Brain-wide neural recordings in mice navigating physical spaces enabled by a cranial exoskeleton." bioRxiv (2023).

      Xie, Hao, et al. "Multifocal fluorescence video-rate imaging of centimetre-wide arbitrarily shaped brain surfaces at micrometric resolution." Nature Biomedical Engineering (2023): 1-14.

      Authors’ Response: We will correct these statements and incorporate these, and other relevant, references. There are advantages and disadvantages to each chosen technique, such as ease of use, field of view, accuracy, speed, etc., and we will highlight a few of these without an extensive literature review.

      Even the best one-photon imaging techniques typically have ~10-20 micrometer resolution in xy (we image at 5 micrometer resolution for our large FOV configuration, but the xy point-spread function for the Thorlabs mesoscope is 0.61 x 0.61 micrometers with 970 nm excitation) and undefined z-resolution (4.25 micrometers for the Thorlabs mesoscope). A coarser resolution increases the likelihood that activity data from neighboring cells may contaminate the fluorescence observed from imaged neurons. Reducing the FOV and using sparse expression of the indicator lessens this overlap problem.

      We do appreciate these recent advances, however, particularly for use in cases where more rapid imaging is desired over a large field of view (CCD acquisition can be much faster than that of standard 2-photon galvo-galvo or even galvo-resonant scanning, the latter being what the Thorlabs mesoscope uses). That said, there are few currently available genetically encoded Ca2+ sensors that are able to measure fluctuations faster than ~10 Hz, which is a speed achievable on the Thorlabs 2-photon mesoscope with our techniques using the “small, multiple FOV” method (Fig. S2d, e).

      The authors' claim of achieving optical clarity for up to 150 days post-surgery with their modified crystal skull approach is significantly longer than the 8 weeks (approximately 56 days) reported in the original study by Kim et al. (2016). Since surgical preparations are an integral part of the manuscript, it may be helpful to provide more details to address the feasibility and reliability of the preparation in chronic studies. A series of images documenting the progression of the optical quality of the window would offer valuable insight.

      Authors’ Response: As you suggest, we will include images and data demonstrating the average changes in the window preparation, as well as the degree of variability and a range of outcome scenarios that we observed over the prolonged time periods of our study. We will also include methodological details that we found were useful for facilitating long term use of these preparations.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study addresses how protein synthesis in activated lymphocytes keeps up with their rapid division, with important findings that are of significance to cell biologists and immunologists endeavouring to understand the 'economy' of the immune system. The work is supported by solid data but because it proposes non-conventional mechanisms, it requires additional explanation and justification to align with the current understanding in the field.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors examine the fascinating question of how T lymphocytes regulate proteome expression during the dramatic cell state change that accompanies the transition from the resting quiescent state to the activated, dividing state. Orthogonal, complementary assays for translation (RPM/RTA, metabolic labeling) are combined with polyribosome profiling and quantitative, biochemical determinations of protein and ribosome content to explore this question, primarily in the OT-I T lymphocyte model system. The authors conclude that the ratio of protein levels to ribosomes/protein synthesis capacity is insufficient to support activation-coupled T cell division and cell size expansion. The authors hint at cellular mechanisms to explain this apparent paradox, focusing on protein acquisition strategies, including emperipolesis and entosis, though these remain topic areas for future study.

      The strengths of the paper include the focus on a fundamental biological question - the transcriptional/translational control mechanisms that support the rapid, dramatic cell state change that accompanies lymphocyte activation from the quiescent to activated state, the use of orthogonal approaches to validate the primary findings, and the creative proposal for how this state change is achieved.

      The weakness of the work is that several cellular regulatory processes that could explain the apparent paradox are not explored, though they are accessible for experimental analysis. In the accounting narrative that the authors highlight, a thorough accounting of the cellular process inventory that could support the cell state change should be further explored before committing to the proposal, provocative as it is, that protein acquisition provides a principal mechanism for supporting lymphocyte activation cell state change.

      Appraisal and Discussion:

      1) relating to the points raised above, two recent review articles explore this topic area and highlight important areas of study in RNA biology and translational control that likely contribute to the paradox noted by the authors: Choi et al. 2022, doi.org/10.4110/in.2022.22.e39 ("RNA metabolism in T lymphocytes") and Turner 2023, DOI: 10.1002/bies.202200236 ("Regulation and function of poised mRNAs in lymphocytes"). These should be cited, and the broader areas of RNA biology discussed by these authors integrated into the current manuscript.

      Good suggestion. We have added these references with a short discussion.

      2) The authors cite the Wolf et al. study from the Geiger lab (doi.org/10.1038/s41590-020-07145, ref. 41) though largely to compare determined values for ribosome number. Many other elements of the Wolf paper seem quite relevant, for example, the very high abundance of glycolytic enzymes (and whose mRNAs are quite abundant as well), where (and as others have reported) there is a dramatic activation of glycolytic flux upon T cell activation that is largely independent of transcription and translation, the evidence for "pre-existing, idle ribosomes", the changes in mRNA copy number and protein synthesis rate Spearman correlation that accompanies activation, and that the efficiencies of mRNA translation are heterogeneous. These data suggest that more accounting needs to be done to establish that there is a paradox.

      As one example, what if glycolytic enzyme protein levels in the resting cell are in substantial excess of what's needed to support glycolysis (likely true) and so translational upregulation can be directed to other mRNAs whose products are necessary for function of the activated cell? In this scenario, the dilution of glycolytic enzyme concentration that would come with cell division would not necessarily have a functional consequence. And the idle ribosomes could be recruited to key subsets of mRNAs (transcriptionally or post-transcriptionally upregulated) and with that a substantial remodeling of the proteome (authors ref. 44). The study of Ricciardi et al. 2018 ("The translational machinery of human CD4+ T cells is poised for activation and controls the switch from quiescence to metabolic remodeling", doi.org/10.1016/j.cmet.2018.08.009) is consistent with this possibility. That study, and the short reviews noted above, are useful in highlighting the contributions of selective translational remodeling and the signaling pathways that contribute to the cell state change of T cell activation.

      Our study focuses on the central issue of whether measured ribosome translation rates support rapid division. The abundance of glycolytic enzymes, mRNA copy numbers etc., are clearly interesting and critical to cell metabolism, but are irrelevant to measuring the overall translation rate and capacity of T cells.
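
      The accounting at issue can be sketched as simple arithmetic: the fastest a cell could duplicate its proteome is its total protein content (in amino acids) divided by the product of ribosome number, the fraction of ribosomes actively translating, and the elongation rate. The function name and all input values below are hypothetical round-number placeholders chosen only to show the calculation; they are not measurements from the manuscript, and the sketch ignores protein degradation, secretion, and any protein acquired from other cells.

```python
def min_proteome_doubling_hours(protein_aa, n_ribosomes, active_fraction,
                                aa_per_sec):
    """Lower bound on the time (hours) needed to synthesize one proteome.

    protein_aa      -- total amino acids in cellular protein (placeholder)
    n_ribosomes     -- ribosomes per cell (placeholder)
    active_fraction -- fraction of ribosomes elongating, between 0 and 1
    aa_per_sec      -- elongation rate per active ribosome
    """
    synthesis_rate = n_ribosomes * active_fraction * aa_per_sec  # aa per second
    return protein_aa / synthesis_rate / 3600.0

# Placeholder inputs, not data: 3.6e10 aa of protein, 1e6 ribosomes, 5 aa/s.
all_active = min_proteome_doubling_hours(3.6e10, 1e6, 1.0, 5.0)
half_stalled = min_proteome_doubling_hours(3.6e10, 1e6, 0.5, 5.0)
```

      If the bound computed from measured values exceeds the observed division time, translation by the cell's own ribosomes cannot account for the full proteome of each daughter cell, which is the form of the paradox discussed here.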

      From this perspective, an alternative view can be posited, where the quiescent state is biologically poised to support activation, where subsets of proteins and mRNAs are present in far higher levels than that necessary to support basal function of the quiescent lymphocyte. In such a model, the early stages of lymphocyte activation and cell division are supported by this surplus inventory, with transcriptional activation, including ribosomal genes, primarily contributing at later stages of the activation process. An obvious analogy is the developing Drosophila embryo, where maternal inheritance supports early-stage development and zygotic transcriptional contributions subsequently assume primary control (e.g. DOI: 10.1002/1873-3468.13183, DOI: 10.1126/science.abq4835). To pursue that biological logic would require quantifying individual mRNAs and their ribosome loading states, mRNA-specific elongation rates, existing individual protein levels, turnover rates of both mRNAs and proteins, ribosome levels, mean ribosome occupancy state, and how each of these parameters is altered in response to activation. Such accounting could go far to unveil the paradox. This is a considerable undertaking, though, and outside the scope of the current paper.

      The reviewer is essentially proposing RiboSeq analysis of pre- and post-activation T cells, whereby individual mRNAs can be queried for ribosome occupancy, and where translation inhibitors could be used to quantify mRNA-specific transit rates. This is important information but would not provide a more accurate accounting of protein synthesis rates than our much more direct measurement. We note that other labs have begun to work on this exact topic, however – see both PMID: 36002234 and PMID: 32330465.

      Reviewer #2 (Public Review):

      This paper takes a novel look at the protein economy of primary human and mouse T-cells - in both resting and activated state. Their findings in primary human T-cells are that:

      1) A large fraction of ribosomes are stalled in resting cultured primary human lymphocytes, and these stalled ribosomes are likely to be monosomes.

      2) Elongation occurs at similar rates for HeLa cells and lymphocytes, with the active ribosomes in resting lymphocytes translating at a similar rate as fully activated lymphocytes.

      They then turn their attention to mouse OT-1 lymphocytes, looking at translation rates both in vitro and in vivo. Day 1 resting T-cells also show stalling - which curiously wasn't seen on freshly purified cells - I didn't understand these differences.

      This is clarified and discussed starting in the third paragraph of “Protein synthesis in mouse lymphocytes ex vivo” section. Cells cultured ex vivo for 1 day with no activation show signs of stalling, as we observed in isolated human cells. But cells immediately out of an animal show a measurable decay rate since they are obviously synthesizing proteins in vivo and are processed rapidly.

      In vivo, they show that it is possible to monitor accurate translation and measure rates. Perhaps most interestingly they note a paradoxically high ratio of cellular protein to ribosomes insufficient to support their rapid in vivo division, suggesting that the activated lymphocyte proteome in vivo may be generated in an unusual manner.

      This was an interesting and provocative paper. Lots of interesting techniques and throwing down challenges to the community - it manages to address a number of important issues without necessarily providing answers.

      Reviewer #3 (Public Review):

      This manuscript provides a more or less quantitative analysis of protein synthesis in lymphocytes. I have no issue with the data as presented, as I'm sure all measurements have been expertly done. I see no need for additional experimental work, although it would be helpful if the authors could comment on the possibility of measuring the rate of synthesis of a defined protein, say a histone, in cells prior to and after activation. The conclusion the authors leave us with is the idea that the rates of protein synthesis recorded here are incompatible with observed rates of T cell division in vivo. Indeed, in the final paragraph of the discussion, the authors note the mismatch between what they consider a requirement for cell division, and the observed rates of protein synthesis. They then invoke unconventional mechanisms to make up for the shortfall, without -in this reviewer's opinion- discussing in adequate detail the technical limitations of the methodology used.

      Points #1-3 in the Discussion relate to potential pitfalls of our analyses; in point #3 we now add further limitations of RTA based on non-random detection of nascent chains due to bias in either puromycylation or antibody detection of puromycylated nascent chains.

      A key question is the broad interest, novelty, and extension of current knowledge, in comparison with Argüello's (reference 27) 'SunRise' method. It would be helpful for the authors to stake out a clear position as to the similarities and differences with reference 27: what have we learned that is new? The authors could cite reference 27 in the introduction of their manuscript, given the similarity in approach. That said, the findings reported here will generate further discussion.

      We did cite this reference (27) in the section “Flow RPM measures ribosome elongation rate in live cells” giving credit where credit is due. We independently devised the method in 2014, and uniquely, to our knowledge, have applied it in vivo. We now further discuss the importance of our CHX modification to limit dissociation and increase the accuracy of RTA (second and third paragraphs of “Protein synthesis in mouse lymphocytes and innate immune cells in vivo”).

      The manuscript would increase in impact if the authors were to clearly define why a particular measurement is important and then show the actual experiment/result. As an example, it would be helpful to explain to the non-expert why the distinction between monosomes, polysomes, and stalled versions of the same is important, and then explain the rationale of the actual experiment: how can these distinctions be made with confidence, and what are confounding variables?

      We believe this is addressed in the section “Resting human lymphocytes have a dominant monosome population”.

      The initial use of human cells, later abandoned in favor of the OT-1 in vitro and in vivo models, requires contextualization. If the goal is to address the relationship between rates of translation and cell division of antigen-activated T cells in vivo, then a lot of the work on the human model and the in vitro experiments becomes more of a distraction, unless properly contextualized. Is there any reason to assume that antigen-specific activation in vivo will impact translation differently than the use of the PMA/ionomycin/IL2 cocktail? The way the work is presented leaves me with the impression that everything that was done is included, regardless of whether it goes to the core of the question(s) of interest.

      Donor PBMCs are clearly the more relevant model for understanding human T cell biology, which is why we started our studies with this model. Had the manuscript strictly described mouse studies it is likely that we would be criticized for not studying human cells: Catch 22! However, as we state in the manuscript, the human cell model has a variety of technical downsides, including donor heterogeneity. PMA/ionomycin activation is also physiologically questionable, and while we could deliver a defined TCR to redirect their specificity, this is typically done after cells have been activated, since lentiviral delivery is poor in resting lymphocytes. A main point we try to make from this work is that cells derived from human blood donors show signs of ribosomal stalling by the time they are isolated and put into culture. This may limit the usefulness of studying them preactivation, although based on our mouse data, some level of stalled ribosomes may be a feature as well – to prime T cells to be ready for their massive expansion. The move to the OT-I system gave us complete control over the system, including in vivo delivery of translation inhibitors.

      It would be helpful if the authors made explicit some of the assumptions that underlie their quantitative comparisons. Likewise, the authors should discuss the limitations of their methods and provide alternative interpretations where possible, even if they consider them less/not plausible, with justification. As they themselves note, improvements in the RPM protocols raised the increase in translating ribosomes upon activation from 10-fold to 15-fold. Who's to say that is the best achievable result? What about the reliability/optimization of the other measurements?

      We expanded discussion of potential pitfalls of the RPM techniques and others in the Discussion section. Regarding RPM per se, we use it as a readout of ribosome time decay, so even if further optimizations can be made, the decay rates we have made should still be accurate. In addition, for our cell accounting measurements in Figure 6, we do not use RPM data and rather calculate based on the assumption that every ribosome is used for protein synthesis at a “maximal” rate of mRNA transit.
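The accounting logic described in this response can be sketched in a few lines. This is a minimal illustration, assuming every ribosome elongates continuously at one fixed rate and ignoring protein degradation and initiation time; the function name and all numeric inputs are hypothetical placeholders, not values or methods taken from the manuscript.

```python
def proteome_doubling_time_hours(total_amino_acids, ribosomes, aa_per_sec):
    """Lower bound on the time needed to synthesize one full proteome copy,
    assuming every ribosome elongates nonstop at aa_per_sec."""
    seconds = total_amino_acids / (ribosomes * aa_per_sec)
    return seconds / 3600.0


# Hypothetical inputs for illustration only (not values from the manuscript).
hours = proteome_doubling_time_hours(
    total_amino_acids=4.32e11,  # amino acids contained in the cell's proteome
    ribosomes=1.0e6,            # ribosomes per cell
    aa_per_sec=6.0,             # elongation rate; ~6 aa/s is the figure quoted by the reviewer
)
print(f"Minimum proteome duplication time: {hours:.1f} h")
```

Comparing such a lower bound with an observed division time is the kind of check the response alludes to; any real estimate would depend on measured protein and ribosome numbers.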

      The composition of the set of proteins produced upon activation will differ from cell to cell (CD4, CD8, B, resting vs. dividing). Even if analyses are performed on fixed cells, the ability of the monoclonal anti-puromycin antibody to penetrate the matrix of the various fixed cell types may not be equal for all of them, depending on protein composition, susceptibility to fixation etc. Is it possible for puromycin to occupy the ribosome's A site and terminate translation without forming a covalent bond with the nascent chain? This could affect the staining with anti-puromycin antibodies and also underestimate the number of nascent chains.

      Yes, the method (like every other one) is imperfect. Harringtonine run-off experiments show that RPM staining only detects nascent chains. Note that reference 47 reports that 75% of translation in activated T cells is devoted to synthesizing ~250 housekeeping proteins, which are likely to be highly similar between lymphocyte subsets.

      I believe that the concept of FACS-based quantitation also requires an explanation for the nonexpert. For the FACS plots shown, the differences between the highest and lowest RPM scores for cells that divided and that have a similar CFSE score is at least 10-fold. Does that mean that divided cells can differ by that margin in terms of the number of nascent chains present? If I make the assumption that cells stimulated with PMA/ionomycin/IL2 respond more or less synchronously, why would there be a 10-fold difference in absolute fluorescence intensity (anti-puromycin) for randomly chosen cells with similar CFSE values? While the use of MFI values is standard practice in cytofluorimetry, the authors should devote some comments to such variation at the population level.

      We believe that the referee is referring to Sup Fig. 1B. In this experiment the T cells are polyclonal and represent the full range of naïve to potentially exhausted differentiation states. Our initial in vivo RPM study (reference 22) shows, when Figure 2 (OT-Is) is compared with Figure 3 (endogenous CD4s or CD8s), more spread in the RPM values of polyclonal vs. monoclonal T cells (now clarified in the third paragraph of “Protein synthesis in mouse lymphocytes and innate immune cells in vivo”). Flow cytometry is by far the most accurate method for measuring fluorescence in individual cells. It is likely to be an accurate measure of the variation of nascent chains in cells in the same division cohort but likely represents the diversity of T cell activation profiles in blood of healthy donors.

      It is assumed that for cells to complete division, they must have produced a full and complete copy of their proteome and only then divide. What if cells can proceed to divide even when expressing a subset of the proteome of departure (=the threshold set required for initiation of division), only to complete synthesis of the 'missing' portion once cell division is complete? Would this obviate the requirement for an unusual mechanism of protein acquisition (trogocytosis; other)?

      There must be a steady state level of translation and proteome replenishment, though. If a cell can divide when it affords daughter cells with 90% of its G0 proteome (as an example), that daughter cell would either 1) be 10% smaller, or 2) require extra translation to make up for the missing proteome during its own division cycle. Though T cells do typically shrink slightly after an initial activation, cell size stabilizes over time. Requiring each daughter cell to make more and more missing proteome could be plausible, considering that initial bursts of division do take longer over time, but still, even in vitro activated T cells divide rapidly for weeks without large decreases in their division rates.

      Translation is estimated to proceed at a rate of ~6 amino acids per second, but surely there is variability in this number attributable to inaccuracies of the methods used, in addition to biological variability. Were these so-called standard values determined for a range of different tissues? It stands to reason that there might be variation depending on the availability of initiation/elongation factors, NTPs, aminoacyl tRNAs etc. What is the margin of error in calculating chain elongation rates based on the results shown here?

      We refer to all relevant studies we know of, including new in vivo estimates of elongation rates (reference 40).

      Reviewer #1 (Recommendations For The Authors):

      A "limitations of study" section would be a helpful way to detail potential contributing mechanisms that were not explored in the current study.

      We have expanded the methodological limitations in the Discussion section.

      Major:

      1) Broaden the scope of biological models that could explain the paradox.

      In the Discussion, we suggest that T cells acquire some fraction of their proteome through external sources and highlight some examples of this occurring.

      Minor:

      1) Include Mr markers for Fig. 2C.

      Done.

      2) Though commonly used interchangeably, historically the term protein synthesis was the consequence of mRNA translation. In other words, proteins are not translated.

      Good point! We have changed the text accordingly.

      3) Include more meaningful X-axis legend in polysome gradient panels i.e., Fig. S2, e.g., fraction number.

      In most experiments, fractions were not collected. Rather, the x-axis refers to time that the sample took to be queried by the detector.

      4) Figure 3A does not report polysome profiles as described in the text, pg. 5, though this is reported in Fig S2D.

      The figure callouts were correct but confusing. We now refer to each result separately to clarify.

      5) In Fig 5A, SDS-PAGE/anti-Puro blots would be more convincing and contain more information. The dot-blot is difficult to interpret.

      Disagree. To quantitate total anti-puromycin signal a dot blot is far better than immunoblotting, which is compromised by unequal transfer of different protein species.

      6) It's not clear why a degree of monosome translation is necessarily surprising (pg. 7).

      It’s surprising since for many decades it was believed that translation by monosomes is a tiny fraction of translation. But separately, with this particular mode of activation, activated T cells displayed a preponderance of monosomes during their burst of division. When the activation method was improved, polysomes dominated. But monosome translation clearly supported T cell division during activation without cognate peptide, which was interesting.

      Reviewer #2 (Recommendations For The Authors):

      1) One concern is the dose of puromycin used. My understanding is that puromycin acts as a chain termination inhibitor - but is being used here predominantly as a label for nascent polypeptide chains. My concern, therefore, is the dose being used - here at 50ug/ml - which seems high and I would be concerned that at this dose it would act as a translational inhibitor rather than just labelling nascent chains, and is therefore resulting in a lower signal/background ratio than expected. In human cell lines 0.1ug/ml is optimal and doses published (in cell lines) range between 1 and 10ug/ml so it will be interesting to understand why this high dose was used.

      Do they have a dose-response curve - is this high dose necessary because these are primary T cells? Can the authors show that 50 µg/mL of puromycin is optimal for studying protein translation in primary human T cells? A titration curve will help answer this question and could be included in Suppl Figure 1. This experiment is critical as the authors use a higher dose than previous studies (commonly between 1 and 10 µg/mL).

      The reviewer is referencing puromycin concentrations typically used in the selection of cells – for the RPM assay, puromycin is used at saturating doses to label the maximal number of nascent chains stalled by CHX or EME pretreatment.

      2) None of the figures show statistical significance.

      Statistics on relevant comparisons are now indicated on figures and in legends.

      3) The authors mention: "We performed RPM on cells labelled with CFSE to track cell division by dye dilution (Supplemental Figure 1B). On day 2, activated cells exhibited multiple populations, with nearly all divided cells showing a high RPM signal.". However, on day 2 it is hard to see any dividing cells in the dot plot included in the supplemental figure. Dividing cells only appear on day 5? Their statements make the subsequent paragraphs also difficult to follow.

      We modified the text to clarify this data – there is likely activation-induced cell death occurring which is why there are relatively few CFSE-low cells at this timepoint, and they do exhibit a fairly wide range of RPM staining. The main point is that by day 5, nearly all divided cells exhibit high RPM.

      4) "Many divided cells exhibited near baseline RPM signals, however, consistent with their return to the resting state. Interestingly, although non-activated cells did not divide, ~50% demonstrated increased RPM staining.". Again, it is hard to see the ~50% of cells with increased RPM the authors refer to in the provided supplemental figure.

      This is from quantification of the flow data and is described more fully later when we discuss ribosome stalling.

      5) The authors say "Thus, we cannot attribute the persistence of flow RPM staining in translation initiation inhibitor-treated cells to incomplete inhibition of protein synthesis." - but it's unclear what this refers to as in the previous paragraph they also say: "Initiation inhibitors, however, clearly discriminated between day 1 resting and activated cells. RPM signal was diminished by up to 80-90% on day 5 post-activation." - this is all somewhat confusing. It would be helpful to have this clarified and to make more liberal use of references to specific figures in the text.

      Figure 1B shows that RPM is maintained at fairly high levels during treatment with EME or CHX (in contrast to the initiation inhibitors HAR/PA). To rule out that the drugs were simply not active, tritiated leucine labeling was conducted to confirm that incorporation of the radiolabeled amino acid dropped to near-baseline (Figure 1C). Therefore, we can conclude that the drugs are indeed working as intended, but EME/CHX does not decrease RPM signal to the same extent that they prevent leucine incorporation.

      6) Page 5 Fig 3A - I don't understand the difference between freshly isolated OT-1 cells - which don't stall and day 1 OT-1 cells which do. Why are freshly isolated cells not behaving like the naïve cells- isn't this what they would predict? Also - I accept that there is a move from monosome to polysome population between day 1 and 2 - the effect isn't huge - it would be helpful/interesting to know what has happened by day 5 - is the effect much more significant?

      Freshly isolated cells are harvested from animals and immediately queried, whereas day 1 cells are cultured for 24h in the absence of any activation. Presumably, the ex vivo culture without any activation causes the mouse T cell ribosomes to stall, just as we observed in cells obtained from human donors that took hours to collect and bring to the bench. The appearance of polysomes is really related to how the activation of the cells is done… refer to Figure 5B to see how significant the polysome buildup can be!

      7) Fig S3C - I don't understand how they reach the conclusion from this figure that: '~15-fold increase in translating ribosomes in activated OT-I T cells in vivo (Supplemental Figure 3C) as compared to the 10-fold increase we previously reported using the original protocol.' It would very much help the reader if these calculations could be better explained.

      These are simply quantifications of the RPM staining done in Supplemental Figure 3C compared to experiments done in the absence of the CHX-modified method.

      8) Page 7 - They conclude that the Tan paper has superior lymphocyte activation - but presumably this depends on the signal as to whether there is more activation and how this affects the shift from monosome to polysome -ie maybe a stronger activation signal affects the distribution more - perhaps their method is the more physiological? Is their conclusion fair - that 'These findings indicate that monosomes make a major contribution to translation in resting T cells but are likely to make a minor contribution in fully activated cells.'

      Yes, we believe that their published method would be more physiological with the use of the natural OT-I peptide. We conclude that although monosome translation is present (as others have published), there are relatively few monosomes in fully activated T cells. Therefore, the monosome contribution to overall translation in activated T cells appears to be minor.

      9) Contrary to observations in vitro, ribosomes are not stalled in naïve mouse T cells in vivo, as we show via RTA analysis of non-activated T cells. - yes - this seems somewhat surprising - what is the explanation?

      We presume this is due to the stress/non-native environment that ex vivo cultured cells are subjected to.

      10) Whilst I understand the point that the authors are trying to make in Figure 1D about resting T cells having high background RPM staining due to stalled ribosomes, it is intriguing that there is almost no difference (no statistical significance provided) after 2 or 5 days of activation. Isn't this finding contrary to the one provided in Figure 1A and Suppl Figure 1B?

      Figure 1A is showing the difference between no activation and activation conditions. Figure 1D is predominantly meant to show that the increase in RPM from activated cells at day 1 and day 5 are not as different as one might predict. The reason, as we describe in further experiments, is likely that cells exhibiting ribosomal stalling can incorporate puromycin, damping the “fold change” we calculate (unlike what we observe in metabolic labeling experiments in the same figure panel). Statistics have now been displayed on the graphs in Figure 1D for further clarification.

      11) "Including EME with HAR prevented decay of the RPM signal, as predicted, since EME blocks elongation while enabling (even enhancing) puromycylation (refs. 21, 26)." I find this very confusing. I understand that emetine blocks protein elongation whilst enabling puromycylation, but why does it block the effect of the protein initiation inhibitor harringtonine? Do they compete with each other?

      When ribosomes are frozen with emetine, they cannot transit mRNA and “fall off”. Therefore, the inclusion of EME in these experiments is a control to ensure that we are looking at true transit and runoff of ribosomes with harringtonine treatment (explanation in the second paragraph of the “Flow RPM measures ribosome elongation rates in live cells” section).

      12) Can the authors explain why the RPM signal of activated OT-I cells (PMA/Iono) increases 20-fold compared to resting cells, but there is only a ~2-fold increase in signal in human cells? The authors previously mentioned: "We noted that the RPM signal in activated cells was only 2- to 5-fold higher than in non-activated cells. This increase is modest compared to the ~15-fold activation-induced increase in protein synthesis in original studies (refs. 10, 11). To examine this discrepancy, we incubated cells for 15 min with harringtonine (HAR) or pactamycin (PA) to block translation initiation or emetine (EME) or cycloheximide (CHX) to block elongation." Would the authors have followed the same path if they had started the paper with OT-I cells?

      Human cells are not as well activated as OT-I in our study. The last question is beyond the scope of our reasoning as empirical evidence-based scientists, but we have applied for funding from the HG Wells Foundation for a time machine to answer this question.

      13) The authors should include representative raw flow cytometry data used to perform the "Ribosome Transit Assay" (RTA) in Figures 2 and 3 as supplemental data.

      Done; now included in Supplemental Figures 1 and 3.

      14) It would be interesting to compare RPM in T cells activated with a more physiological stimulus, such as beads anti-CD3 anti-CD28 vs PMA/Iono. Particularly after showing that peptide-specific stimulation (with SIINFEKL) is more effective than PMA/Iono in activating OT-I cells and inducing polysome formation (Figures 5B and Suppl Figure 4A).

      We tried plate-bound anti-CD3 and anti-CD28 early in these studies, and they didn’t induce as much early activation.

      15) Can the authors include the gating strategy to call "activated OT-I cells" to the cells shown in Suppl Figure 3c?

      A new Supplemental Figure 3D has been added showing the exact gating strategy for the OT-I cell RTA assays described in Supplemental Figure 3C and elsewhere.

      16) In Figure 6B, the authors mention an increase in the volume of the cells based on the assumption of spherical morphology but then show an increase in diameter. It would be more consistent to show both parameters in the same graph.

      The graph now shows calculated volume instead of diameter for clarity. The two are linked, since the volume of a sphere scales with the cube of its radius.
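The relationship between diameter and volume can be made concrete with a short sketch. It assumes the spherical morphology mentioned by the reviewer; the function names and the example diameters are hypothetical and are not measurements from the manuscript.

```python
import math


def sphere_volume(diameter):
    """Volume of a sphere from its diameter: V = (4/3) * pi * (d / 2) ** 3."""
    return (4.0 / 3.0) * math.pi * (diameter / 2.0) ** 3


def volume_fold_change(d_before, d_after):
    """Fold change in volume, which equals the cube of the diameter ratio."""
    return sphere_volume(d_after) / sphere_volume(d_before)


# Hypothetical diameters in arbitrary units, for illustration only.
print(volume_fold_change(1.0, 2.0))  # doubling the diameter gives roughly 8x the volume
```

Because of the cubic dependence, a modest change in measured diameter corresponds to a much larger change in volume, which is why plotting volume can make the size increase easier to judge.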

      17) The paper's main conclusion (i.e., that the ratio of proteins to ribosomes in T cells activated in-vivo does not support their doubling time) is exciting. They conclude this after measuring cell volume, protein abundance, and ribosomes per cell. As no changes in cell volume and protein abundance between T cells activated in vitro vs in vivo were observed (Figures 6B and 6C), the difference is exclusively attributable to a reduced number of ribosomes per cell in T cells activated in vivo (Figure 6F). Critically, the measurement of ribosomes per cell in T cells activated in vivo (Figure 6F, "ex vivo day 2") includes only two data points. It is hard to understand how the authors calculated this figure's means and standard deviations as it is not described in the figure legend. From the dispersion observed for "day 1" and "day 2" in vitro-activated T cells, it seems that the variability of the assay to measure ribosome content could explain part of the phenotype. Additionally, there are several missing data points in Figure 6H. As this figure is just a transformation of Figures 6D and 6G, it isn't easy to understand why. Can I suggest that they include more data points for Figures 6F, G, and H in the 'ex vivo day 2' category, as the two data points shown with little variability are out of keeping with the rest of the data, and may be skewing their data?

      Figure 6F does not have the same number of data points as other panels because it required measurement of both protein content and ribosome number. Since the ribosome quantification method described here was developed later than some of our earlier protein measurements, not all experiments had both sets of data to properly calculate the proteins per ribosome. All data that had both values are included, though.

      Reviewer #3 (Recommendations For The Authors):

      Minor points:

      If an increase in cell diameter is recorded upon activation, why not also provide the value for the increase in volume?

      Done

      Regarding the writing, the erratic punctuation/hyphenation - or lack thereof - doesn't improve readability. One example: "....consistent with the idea that the flow RPM signal in day 1 resting lymphocytes...." Perhaps better: "... consistent with the idea that the RPM signal, obtained by flow cytometry for lymphocytes analyzed on day 1 and maintained in the absence of any activating agent,..." I understand that this can make for longer sentences, but I object to the use of 'flow' as shorthand for 'flow cytometry', and to the use of day 1 as an adverb or adjective. That works as lab jargon, it's less effective in a written text. The abbreviation 'DRiPs' is not defined. Words like 'notably', and 'surprisingly' can be eliminated.

      This work would benefit from the inclusion of a section describing 'Limitations of the study'.

      This is now expanded in the Discussion, as described above.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      The association of vitamin D supplementation in reducing Asthma risk is well studied, although the mechanistic basis for this remains unanswered. In the presented study, Kilic and co-authors aim to dissect the pathway of Vitamin D-mediated amelioration of allergic airway inflammation. They use initial leads from bioinformatic approaches, which they then associate with results from a clinical trial (VDAART) and then validate them using experimental approaches in murine models. The authors identify a role of VDR in inducing the expression of the key regulator Ikzf3, which possibly suppresses the IL-2/STAT5 axis, consequently blunting the Th2 response and mitigating allergic airway inflammation.

      The major strength of the paper lies in its interdisciplinary approach, right from hypothesis generation, and linkage with clinical data, as well as in the use of extensive ex vivo experiments and in vivo approaches using knock-out mice. The study presents some interesting findings including an inducible baseline absence/minimal expression of VDR in lymphocytes, which could have physiological implications and needs to be explored in future studies. However, the study presents a potential for further dissection of relevant pathophysiological parameters using additional techniques, to explain certain seemingly associative results, and allow for a more effective translation.

      Several results in the study suggest multiple factors and pathways influencing the phenotype seen, which remain unexplored. The inferences of this study also need to be read in the context of the different sub-phenotypes and endotypes of Asthma, where the Th2 response may not be predominant. While this does not undermine the importance of this elegant study, it is essential to emphasize a holistic picture while interpreting the results.

      Reviewer #2 (Public Review):

      Summary:

      This study seeks to advance our knowledge of how vitamin D may be protective in allergic airway disease in both adult and neonatal mouse models. The rationale and starting point are important human clinical, genetic/bioinformatic data, with a proposed role for vitamin D regulation of 2 human chromosomal loci (Chr17q12-21.1 and Chr17q21.2) linked to the risk of immune-mediated/inflammatory disease. The authors have made significant contributions to this work specifically in airway disease/asthma. They link these data to propose a role for vitamin D in regulating IL-2 in Th2 cells implicating genes associated with these loci in this process.

      Strengths:

      Here the authors draw together evidence from multiple lines of investigation to propose that amongst murine CD4+ T cell populations, Th2 cells express high levels of VDR, and that vitamin D regulates many of the genes on the chromosomal loci identified to be of interest, in these cells. The bottom line is the proposal that vitamin D, via Ikzf3/Aiolos, suppresses IL-2 signalling and reduces IL-2 signalling in Th2 cells. This is a novel concept and whilst the availability of IL-2 and the control of IL-2 signalling is generally thought to play a role in the capacity of vitamin D to modulate both effector and especially regulatory T cell populations, this study provides new data.

      Weaknesses:

      Overall, this is a highly complicated paper with numerous strands of investigation, methodologies etc. It is not "easy" reading to follow the logic between each series of experiments and also frequently fine detail of many of the experimental systems used (too numerous to list), which will likely frustrate immunologists interested in this. There is already extensive scientific literature on many aspects of the work presented, much of which is not acknowledged and largely ignored. For example, reports on the effects of vitamin D on Th2 cells are highly contradictory, especially in vitro, even though most studies agree that in vivo effects are largely protective. Similarly other reports on adult and neonatal models of vitamin D and modulation of allergic airway disease are not referenced. In summary, the data presentation is unwieldy, with numerous supplementary additions, that makes the data difficult to evaluate and the central message lost. Whilst there are novel data of interest to the vitamin D and wider community, this manuscript would benefit from editing to make it much more readily accessible to the reader.

      Wider impact: Strategies to target the IL-2 pathway have long been considered and there is a wealth of knowledge here in autoimmune disease, transplantation, GvHD etc - with some great messages pertinent to the current study. This includes the use of IL-2, including low dose IL-2 to boost Treg but not effector T cell populations, to engineered molecules to target IL-2/IL-2R.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In the revised manuscript, the authors have addressed a significant number of concerns raised. The restructuring and incorporation of a number of discussion points have improved the readability. Moreover, the authors have also incorporated some more figures to address certain questions raised.

      However, the authors could reconsider a few more points which would improve the readability of the manuscript.

      For e.g.

      1) While it is appreciated that the authors have provided the schematic of the study design for the VDAART trial, the visualization for the RNA-seq analysis may be helpful.

      We have created a visualization of the workflow for the RNA seq analysis as part of Figure 1 – figure supplement 1C.

      2) Quantification of images would not require any additional experiments, yet can reinforce the results with objectivity.

      We appreciate this comment. We chose to display histology images to allow a glimpse at the inflammatory condition in the lung tissue. For histological quantification, lung tissue should have been harvested and analyzed in a systematic and randomized way as well as in sufficient animal numbers to allow statistical analyses. This has not been done for these mouse models since the focus was in analyzing cytokine production by lung tissue CD4+ T cells as the driver of inflammation.

      3) The authors have not addressed the discrepancy of the sample sizes in the experiments. Some dot plots still don't match the legends, and there is a wide variation in the numbers chosen for different experiments and different groups in the same experiments.

      We appreciate the thorough screening of our manuscript and apologize for this oversight. We corrected the errors in the respective figure legends.

      The in vivo experiments comprise studies performed in (A) VDR-KO mice and (B) WT mice fed with vit-D supplemented chow.

      Sample size calculations for the mouse models of allergic airway inflammation based on BAL cell numbers revealed a minimum of n=8 per group for correct statistical analysis. In both experimental settings, the respective mouse lines were bred in the mouse facilities of MGH (A) and BWH (B). Depending on the litter sizes, additional mice were added in the HDM group, since bigger variability was expected in this group than the saline group.

      Intracellular CD4+ cytokine staining was performed for all mice, however some stainings failed and could not be reliably interpreted and were therefore excluded.

      Reviewer #2 (Recommendations For The Authors):

      The authors have largely replied to the reviewer comments, amended some noted typos & figure legend issues, as well as discussed the reviewers' concerns in the text and in their rebuttal.

      The data presented are novel and of significant interest, conceptually moving this field forward, but in this reviewer's opinion reflect one pathway, of likely several, linked to protective effects of vitamin D on airway disease. This reviewer recommends a further slight editing of the text to present this broader scenario.

      i) Treg cells are highly dependent on IL-2 (both Foxp3+ and IL-10+ cells, not always the same population), constitutively express the IL-2R, and there is already a significant literature regarding vitamin D and IL-10/Treg in control of immune-mediated conditions. A simple statement acknowledging this, and that there is likely more than one mechanism by which vitamin D may regulate allergic airway disease (directly or indirectly), would be appreciated - this in no way detracts from the novelty and contribution of the current findings.

      We thank the reviewer for this suggestion. We have added the following statement to the manuscript (lines 623-625):

      “Additional pathways, including the induction of IL-10 production by CD4+ T cells as well as a direct induction of Foxp3+ T reg cells could have further contributed to the observed protective effect of vitamin D supplementation (PMID: 21047796; 22529297).”

      ii) More comprehensive referencing of earlier papers proposing effects of vitamin D in controlling Treg/IL-10 and dampening Th2 responses in mouse (and human) models

      (e.g. Taher, Y. A., van Esch, B. C. A. M., Hofman, G. A., Henricks, P. A. J. & van Oosterhout, A. J. M. 1alpha,25-dihydroxyvitamin D3 potentiates the beneficial effects of allergen immunotherapy in a mouse model of allergic asthma: role for IL-10 and TGF-beta. J. Immunol. 180, 5211-21 (2008). Vassiliou JE et al, 2014. Vitamin D deficiency induces Th2 skewing and eosinophilia in neonatal allergic airways disease. Allergy. DOI: 10.1111/all.12465).

      We have included the references in the discussion section of our manuscript in lines 617-619:

      “Similar findings regarding the effects of vitamin D in controlling Treg/IL-10 and dampening Th2 responses have been reported, e.g., in (PMID: 18390702) and in offspring of mice that had been subjected to vitamin D deficiency in the third trimester of their pregnancy (PMID: 24943330).”

    1. Author Response

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      The authors of the manuscript "High-resolution kinetics of herbivore-induced plant volatile transfer reveal tightly clocked responses in neighboring plants" assessed the effects of herbivory-induced maize volatiles on receiver plants over a period of time in order to assess the dynamics of the responses of receiver plants. Different volatile compound classes were measured over a period of time using PTR-ToF-MS and GC-MS, under both natural light:dark conditions, and continuous light. They also measured gene expression of related genes as well as defense-related phytohormones. The effects of a secondary exposure to GLVs on primed receiver plants were also measured.

      The paper addresses some interesting points, however some questions arise regarding some of the methods employed. Firstly, I am wondering why VOCs (as measured by GC-MS) were not quantified. While I understand that quantification is time consuming and requires more work, it allows for comparisons to be made between lines of the same species, as well as across other literature on the subject. Simply relying on the area under the curve and presenting results using arbitrary units is not enough for analyses like these. AU values do not allow for conclusions regarding total quantities, and while I understand that this is not the main focus of this paper, it raises a lot of uncertainty for readers (for example, the references cited show that TMTT has been found to accumulate at similar levels of caryophyllene, however the AU values reported are an order of magnitude higher for TMTT. Again, without actual quantification this is meaningless, but for readers it is confusing).

      With regards to the correlation analyses shown in figure 6, the results presented in many of the correlation plots are not actually informative. While there is a trend, I do not think that this is an appropriate way to show the data, as there are clearly other relationships at play. The comparison between plants under continuous light and normal light:dark conditions is interesting.

      This paper addresses a very interesting idea and I look forward to seeing further work that builds on these ideas.

      As mentioned in our previous response, we have added the quantification of GLVs in order to increase the comparability of our work to other studies.

      Regarding the comment about TMTT (only measured as internal pools), the purpose of including these internal pool data was simply to determine whether terpenes were accumulating in leaf tissue during the night when emissions are hindered (likely due to closed stomata). The data clearly show that internal terpene pools do not accumulate above daytime levels during darkness – this is further supported by gene expression data that show downregulation of terpene synthase genes during darkness. While quantification would certainly increase the ability to compare internal pools, it would not change the interpretation of our results. Also note that absolute quantification is challenging for compounds such as TMTT, which are not readily available.

      Regarding the comment on Figure 6, while we agree there may be interesting patterns beyond linear relationships, as stated in our previous response, the purpose of our analysis was to determine if the higher terpene burst in receiver plants on the second day may be explained by sender plants emitting more GLVs on the second day. Figure 6 shows that this is not the case. Further analyses would not provide additional significant insights into the hypothesis that we tested here.

      We thank the reviewer for their overall positive outlook on our paper and for the constructive comments.

      Reviewer #2 (Public Review):

      The exact dynamics of responses to volatiles from herbivore-attacked neighbouring plants have been little studied so far. Also, we still lack evidence whether herbivore-induced plant volatiles (HIPVs) induce or prime plant defences of neighbours. The authors investigated the volatile emission patterns of receiver plants that respond to the volatile emission of neighbouring sender plants which are fed upon by herbivorous caterpillars. They applied a very elegant approach (more rigorous than the current state-of-the-art) to monitor temporal response patterns of neighbouring plants to HIPVs by measuring volatile emissions of senders and receivers, senders only and receivers only. Different terpenoids were produced within 2 h of such exposure in receiver plants, but not during the dark phase. Once the light turned on again, large amounts of terpenoids were released from the receiver plants. This may indicate a delayed terpene burst, but terpenoids may also be induced by the sudden change in light. As one contrasting control, the authors also studied the time-delay in volatile emission when plants were just kept under continuous light. Here they also found a delayed terpenoid production, but this seemed to be lower compared to the plants exposed to the day-night-cycle. Another helpful control was now performed for the revision in which the herbivory treatment was started in the evening hours and lights were left on. This experiment revealed that the burst of terpenoid emission indeed shifted somewhat. Circadian and diurnal processes must thus interact.

      Interestingly, internal terpene pools of one of the leaves tested here remained more comparable between night and day, indicating that their pools stay higher in plants exposed to HIPVs. In contrast, terpene synthases were only induced during the light-phase, not in the dark-phase. Moreover, jasmonates were only significantly induced 22 h after onset of the volatile exposure and thus parallel with the burst of terpene release.

      An additional experiment exposing plants to the green leaf volatile (glv) (Z)-3-hexenyl acetate revealed that plants can be primed by this glv, leading to a stronger terpene burst. The results are discussed with nice logic and considering potential ecological consequences. All data are now well discussed.

      Overall, this study provides intriguing insights into the potential interplay between priming and induction, which may co-occur, enhancing (indirect and direct) plant defence. Follow-up studies are suggested that may provide additional evidence.

      We thank the reviewer for their positive outlook on our paper and for their constructive comments.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      The authors did a great job with the revision. The additional experiments strengthened their conclusions. Thanks also for performing the suggested test for potential differences in induction capacity at different times of day, the new data are very interesting.

      Thank you very much.

      Line 49-52: The newly added sentence could be clarified in wording.

      We will clarify the sentence.

      Line 254-255: The newly added sentence needs to be corrected. This is no full sentence and it is not clear what the authors wanted to say here.

      We will clarify this sentence.

      Figure 6: In those instances, in which the correlation is not significant, the line should not be shown.

      We will remove the lines when correlations are not significant.

      The names of chemical compounds and terpene synthases should be written in lower case letters (see legend Fig 6, e.g. hexenal, not Hexenal; legend fig. 2: terpene synthase, not Terpene synthase)

      In the last round of revisions, I commented on Line 23: consequences on community dynamics are not investigated here, so this is a bit misleading. ... Your response was "We have deleted the sentence about community dynamics ..." which, however, in fact was not done! Please change!

      Apologies for that, we will delete mention of community dynamics in that sentence (for real).


      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study examines the effects of herbivory-induced maize volatiles on neighboring plants and their responses over time. Measurements of volatile compound classes and gene expression in receiver plants exposed to these volatiles led to the conclusion that the delayed emission of certain terpenes in receiver plants after the onset of light may be a result of stress memory, highlighting the role of priming and induction in plant defenses triggered by herbivore-induced plant volatiles (HIPVs). Most experimental data are compelling but additional experiments and accurate quantifications of the compounds would be required to confirm some of the main claims.

      Response: We thank the editors for their overall positive feedback on our MS. We have added additional experiments to quantify green leaf volatile emissions in both sender plants and synthetic dispensers (Reviewer 1) and address the importance of the precise time of day plants are induced (Reviewer 2). These additions strengthen the main conclusions of our study.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors of the manuscript "High-resolution kinetics of herbivore-induced plant volatile transfer reveal tightly clocked responses in neighboring plants" assessed the effects of herbivory-induced maize volatiles on receiver plants over a period of time in order to assess the dynamics of the responses of receiver plants. Different volatile compound classes were measured over a period of time using PTR-ToF-MS and GC-MS, under both natural light:dark conditions, and continuous light. They also measured gene expression of related genes as well as defence-related phytohormones. The effects of a secondary exposure to GLVs on primed receiver plants were also measured.

      The paper addresses some interesting points, however, some questions arise regarding some of the methods employed. Firstly, I am wondering why VOCs (as measured by GC-MS) were not quantified. While I understand that quantification is time-consuming and requires more work, it allows for comparisons to be made between lines of the same species, as well as across other literature on the subject. As experiments with VOC dispensers were also used in this experiment, I find it even more baffling that the authors didn't confirm the concentration of the emission from the plants they used to make sure they matched. The references cited justifying the concentration used (saying it was within the range of GLVs emitted by their plants) to prepare the dispenser were for either a different variety of maize (delprim versus B73) or arabidopsis. Simply relying on the area under the curve and presenting results using arbitrary units is not enough for analyses like these.

      Response: We thank the reviewer for their comment. We have now quantified the emissions of both the dispensers and maize seedlings infested with three fourth-instar Spodoptera exigua larvae. Averaged across 1 h, HAC dispensers emitted roughly 2x higher molar concentrations than total GLV molar concentrations emitted by plants infested by three caterpillars. Of note, GLV emissions induced by caterpillars vary over time, and can be more than 2-fold higher than the average during times of strong active feeding (Supplemental Fig 4). Thus, the release rate of the dispensers is well within the plant’s physiological range.

      Note that the references cited were included to support the claim of the biological activity of all three GLVs rather than to justify the concentration of our dispensers. We have rephrased this sentence to reflect this (see L330-333).

      With regards to the correlation analyses shown in Figure 6, the results presented in many of the correlation plots are not actually informative. By blindly reporting the correlation coefficient important trends are being ignored, as there are clearly either bimodal relationships (e.g. upper left panel, HAC/TMTT, HAC/MNT) or even stranger relationships (e.g. upper left panel, IND/SQT, IND/MNT) that are not being well explained by a correlation plot. It is not appropriate to discuss the correlation factors presented here and to draw such strong conclusions on emission kinetics. The comparison between plants under continuous light and normal light:dark conditions is interesting, but I think there are better ways to examine these relationships, for example, multivariate analysis might reveal some patterns.

      Response: We thank the reviewer for their comment. With our analysis we aimed at testing specifically whether the high release of known bioactive volatiles (GLVs and indole) by sender plants on the second day can explain the higher terpene emissions in the receiver plants. We explicitly mention this in the text (L176-L186). Indeed, under normal light conditions (light and dark phase), there are clear positive correlations between the GLV release of sender plants and the terpene release of receiver plants over time (see also Fig 1 and Fig 5). However, under continuous light conditions, GLV emissions in sender plants no longer correlate with terpene emissions in receiver plants (also apparent by comparison of Fig 4 and Fig 5). This shows that temporal variation in GLV emissions is insufficient to explain the delayed terpene burst. This is the relevant conclusion we draw from this analysis. As presented, we find the data to provide strong evidence that the delayed burst in receiver plant terpene emissions cannot be solely explained by higher availability of active signals on the second day. The priming experiment in Figure 7 then provides a direct additional test for this concept. While more complex analyses could indeed reveal additional patterns, these would not be particularly informative for the question at hand.
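      For readers unfamiliar with the type of analysis discussed here, the following is a minimal sketch of a Pearson correlation between a sender-plant GLV time series and a receiver-plant terpene time series. The function name and all numbers are hypothetical and purely illustrative; they are not data from the study, and the statistical software used for Figure 6 is not specified in this response.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical signal intensities (arbitrary units) at matched time points.
    sender_glv = [0.2, 1.5, 3.1, 2.4, 0.3, 0.1]
    receiver_terpene = [0.1, 0.9, 2.2, 1.8, 0.4, 0.2]
    print(f"r = {pearson_r(sender_glv, receiver_terpene):.2f}")
```

      A single coefficient of this kind captures only linear association, which is why it can miss bimodal or otherwise non-linear relationships of the sort the reviewer describes.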

      In Figure 2, the elevated concentrations of beta-caryophyllene found in the control plants at 8h and 16.75h measurement timepoints are curious. Is this something that is commonly seen in B73?

      Response: We thank the reviewer for this comment. A small number of untreated plants indeed accumulated β-caryophyllene at night, which is likely the result of biological variability between samples. Our plants were soil-grown, and it is for instance possible that variation in soil biota may account for this variability. Alternatively, some plants may have been slightly stressed during handling. Note that this variability does not affect any of the conclusions in our manuscript.

      While there can be discrepancies between emissions and compounds actually present within leaf tissue, it is a little bit odd that such high levels of b-caryophyllene were found at these timepoints, however, this is not reflected in the PTR-ToF-MS measurements of sesquiterpenes. It would be beneficial to include an overview of the mechanism of production and storage of sesquiterpenes in maize leaves, which would clarify why high amounts were found only in the GC-MS analysis and not the PTR-ToF-MS analysis, which is a more sensitive analytical tool. It is possible that the amounts of b-caryophyllene present in the leaf are actually extremely low, however as the values are not given as a concentration but rather arbitrary units, it is not possible to tell. I would include a line explaining what is seen with b-caryophyllene.

      Response: Thank you for this comment. It is important to note that accumulation in maize leaves can differ substantially from emission, especially at night when stomata are closed. This has been observed before in maize leaves (Seidl-Adams et al., 2015). As the reviewer suspects, earlier work indeed found that β-caryophyllene is a minor sesquiterpene compared to β-farnesene and α-bergamotene in B73 (Block et al., 2018). The PTR-ToF-MS does not discriminate between terpenes with the same m/z and thus measures total sesquiterpene emissions. Given that sesquiterpene emissions are strongly regulated by stomatal aperture and that overall sesquiterpene accumulation in control plants is low, it is not surprising that we measure only minor amounts of sesquiterpene emissions in general, and in control plants in particular. We have now added text to the manuscript to explain these aspects (L116-L122).

      Block, A.K., Hunter, C.T., Rering, C. et al. Contrasting insect attraction and herbivore-induced plant volatile production in maize. Planta 248, 105–116 (2018).

      Seidl-Adams I, Richter A, Boomer KB, Yoshinaga N, Degenhardt J, Tumlinson JH. Emission of herbivore elicitor-induced sesquiterpenes is regulated by stomatal aperture in maize (Zea mays) seedlings. Plant Cell Environ. 38, 23-34 (2015).

      Additionally, it seems like the amounts of TMTT within the leaf are extraordinarily high (judging only by the au values given for scale), far higher than one would expect from maize.

      Response: We are unsure about the reviewer’s interpretation here. The AU values do not allow for conclusions regarding total quantities. An earlier study found that TMTT in induced B73 plants accumulates to similar amounts as β-caryophyllene (Block et al., 2018), thus it is not surprising to detect significant TMTT pools in induced maize leaves. It is important to note that the aim of the experiment here was to test the hypothesis that plants may be hyperaccumulating volatiles when the stomata are closed at night, which could potentially explain the delayed terpene burst on the second day. We do not observe such a hyperaccumulation, thus ruling out this as the primary factor responsible for the observed phenomenon. This is further supported by the continuous light experiments, where the delayed burst in terpene emission is not hindered by the lack of a dark phase.

      Block, A.K., Hunter, C.T., Rering, C. et al. Contrasting insect attraction and herbivore-induced plant volatile production in maize. Planta 248, 105–116 (2018).

      Reviewer #2 (Public Review):

      The exact dynamics of responses to volatiles from herbivore-attacked neighbouring plants have been little studied so far. Also, we still lack evidence of whether herbivore-induced plant volatiles (HIPVs) induce or prime plant defences of neighbours. The authors investigated the volatile emission patterns of receiver plants that respond to the volatile emission of neighbouring sender plants which are fed upon by herbivorous caterpillars. They applied a very elegant approach (more rigorous than the current state-of-the-art) to monitor temporal response patterns of neighbouring plants to HIPVs by measuring volatile emissions of senders and receivers, senders only and receivers only. Different terpenoids were produced within 2 h of such exposure in receiver plants, but not during the dark phase. Once the light turned on again, large amounts of terpenoids were released from the receiver plants. This may indicate a delayed terpene burst, but terpenoids may also be induced by the sudden change in light. A potential caveat exists with respect to the exact timing and the day-night cycle. The timing may be critical, i.e. at which time-point after onset of light herbivores were placed on the plants and how long the terpene emission lasted before the light was turned off. If the rhythm or a potential internal clock matters, then this information should also be highly relevant. Moreover, light on/off is a rather arbitrary treatment that is practical for experiments in the laboratory but which is not a very realistic setting. Particularly with regard to terpene emission, the sudden turning on of light instead of a smooth and continuous change to lighter conditions may trigger emission responses that are not found in nature.

      Response: We thank the reviewer for their comment. Although we did not explicitly mention it in the initial draft of the MS, we employed 15 min transition periods for light and dark phase transitions with a light intensity of 60 µmol m-2 s-1 (compared to 300 µmol m-2 s-1 at full light) to achieve a more gradual transition. We have now included this information in the manuscript (L291-L292).

      As one contrasting control, the authors also studied the time-delay in volatile emission when plants were just kept under continuous light (just for the experiment or continuously?). Here they also found a delayed terpenoid production, but this seemed to be lower compared to the plants exposed to the day-night-cycle. Another helpful control would be to start the herbivory treatment in the evening hours and leave the light on. If then again plants only release volatiles after a 17 h delay, the response is indeed independent of the diurnal clock of the plant.

      Response: This is a very interesting point raised by the reviewer. We have now conducted an additional experiment under continuous light in which we started the herbivory treatment just before the start of the dark phase (ca. 20:00). We found a similar pattern: a distinct delay in the highest burst. Interestingly, however, the burst was shifted from 12-18 hr to 10-12 hr (Supplemental Fig 1). This burst aligned reasonably well with the point at which lights would normally be turned on again. In light of this, and as the herbivore additions typically started ca. 5 hr after the onset of light following a dark period (Figures 1-7), we wanted to rule out the possibility that the lack of a burst on the first day was simply due to a difference in induction capacity depending on how shortly after the onset of light plants became exposed to GLVs. We therefore designed an additional experiment to examine whether exposure to GLVs immediately after the lights come on induces higher terpene emissions than exposure to GLVs ca. 5 hr after the lights come on (Supplemental Fig 2). Interestingly, emissions across the terpenes were similar, regardless of how long after the onset of light plants were exposed to GLVs. This suggests that the delayed burst is not due to the fact that, on the second day, plants are exposed to GLVs immediately after the lights come on, whereas on the first day they are only exposed 5 hr after the lights come on. Both continuous light experiments (normal timing and shifted timing) show bursts that occur slightly earlier than we observe under normal light:dark conditions (L159-L166 and L207-L211), suggesting an interaction between circadian and diurnal processes. For instance, it is possible that plants would start producing volatiles slightly earlier than the onset of the day, but that light and stomatal opening limit the exact timing of the burst under normal light:dark transitions. The additional data provide further evidence for the delayed burst as a timed response in maize plants.

      Additionally, we have added an explanation to the continuous light figure legends that plants were grown under normal conditions and lights were only left on following treatment.

      Interestingly, internal terpene pools of one of the leaves tested here remained more comparable between night and day, indicating that their pools stay higher in plants exposed to HIPVs. In contrast, terpene synthases were only induced during the light-phase, not in the dark-phase. Moreover, jasmonates were only significantly induced 22 h after the onset of the volatile exposure and thus parallel with the burst of terpene release. An additional experiment exposing plants to the green leaf volatile (glv) (Z)-3-hexenyl acetate revealed that plants can be primed by this glv, leading to a stronger terpene burst. The results are discussed with nice logic and considering potential ecological consequences. Some data are not discussed, e.g. the jasmonate and gene induction pattern.

      Response: Thanks for this comment. We have added a sentence regarding the jasmonate data. In addition to providing a further layer of evidence for the observed delay, these data suggest that other JA-dependent defenses in maize may follow similar temporal patterns (L254-L257).

      Overall, this study provides intriguing insights into the potential interplay between priming and induction, which may co-occur, enhancing (indirect and direct) plant defence. Follow-up studies are suggested that may provide additional evidence.

      Reviewer #1 (Recommendations For The Authors):

      Could the authors please explain why they chose not to calculate concentrations for VOCs? Perhaps it is that B73 is a very unique variety in that it contains very high levels of TMTT, even in control plants? This should be clarified by the authors.

      Response: We address this comment in the public review portion.

      For the legend within Figure 2, I would move it to be in the upper left or right corners of the figure. It is not easy to see in its current position.

      Response: We have moved the figure legend based on the reviewer's recommendation.

      Figures depicting PTR-ToF-MS data: add m/z values to either the figures themselves and/or the legends.

      Response: We have added m/z values to the legends and added molecular formulas of protonated compounds to each panel.

      Overall, here are some other suggestions: I am slightly wary of the term "clocked response". I'm not sure this is the correct fit for what you are trying to convey. I think "regulated" is a better term than "clocked". I understand that it is likely a stylistic choice to use this word, however, I advise reconsidering for the sake of clarity of the results.

      Response: Thank you. We find clocked to be an appropriate term, as it highlights the temporal aspect of the burst, and have thus left the title as is.

      Have another look at the references as some are not in the correct format (i.e., species not in italics).

      Response: We have checked and corrected the references

      Reviewer #2 (Recommendations For The Authors):

      Line 23: consequences on community dynamics are not investigated here, so this is a bit misleading.

      Last sentence of the abstract: It would be nice to read the answer to this long-standing question here.

      Response: We have deleted the sentence about community dynamics and provided a more concrete final sentence (L38-L40).

      Lines 48-50: The example does not fit so well with the first sentence and is not entirely clear (relation to temporal dynamics; similar to what?).

      Response: We have reworded the sentence for clarity (L49-L52)

      Line 56: "volatiles" should be plural.

      Response: Changed (L58)

      Line 58: "to be produced" rather than "to produce"

      Response: This seems to be a stylistic choice, and we have left it as is.

      End of abstract: Did you have any hypotheses? These should be stated here.

      Response: The listing of hypotheses is also a stylistic choice, which is in some cases required by journals, but not eLife. As such we have not included a discrete list of hypotheses and instead describe what we aimed to investigate and what we found.

      Line 93: "This response disappeared at night." Does this mean: "No volatiles were emitted during night"? Or was this a gradual disappearance? How many hours after the onset of light did the herbivore treatment start and how many hours after the first emission of volatiles was the light turned off?

      Response: We have added when herbivory began (L92-L93) and changed the text to ‘as soon as light was restored’ (L97-L98).

      Line 93: "as soon as the night was over" means practically rather "as soon as the light was switched on".

      Response: See above

      Line 91: "small induction" - do you mean "low amounts of xxx"?

      Response: We mean a small induction. Terpene emission is relatively low (hence small), but still induced relative to controls.

      Line 91: which mono- and sesquiterpenes were monitored?

      Response: These are PTR-ToF-MS measurements, and thus we cannot identify individual sesquiterpenes and monoterpenes (as they all have the same mass); we therefore group them generally.

      Figure 1: What exactly is the "control"? And what does the vertical hatched line in the beginning represent?

      Response: We have defined the control and added a sentence describing the vertical hatched line

      "Black points represent the same but with undamaged sender plants" - what is "the same" here? I find that a bit confusing!

      Response: We have rephrased

      Line 104: how do you define an "overaccumulation"?

      Response: We have added ‘above daytime levels’ to clarify that we mean over daytime levels (L106)

      Why was the oldest developing leaf chosen? Is this the largest one when plants are two weeks old? How many leaves do they have then? Is this the leaf with the highest biomass?

      Response: We chose this leaf as it is the largest and also highly responsive to HIPVs. We have added this sentence (with a reference) in the methods section (L369-L370)

      Line 107: "started increasing after 3 hours" - they may already have started before. The following description also sounds like the dynamics were investigated here. However, instead the authors measured samples at four distinct time-points and cannot say whether something "began" or "remained" etc. The wording should be changed to a more appropriate description, describing the differences at a given time-point.

      Response: We changed the wording to ‘were marginally induced after 3 hr’ (see L110).

      Line 113: What do you mean by "delete BELOW NIGHTTIME levels"?

      Response: The word we used was ‘deplete’; we have changed it to ‘drop’ (L116)

      Line 114: "the expression of terpene synthases" add "in the receiver plants exposed to HIPVs."

      Response: Added

      Figure 2ff: The situation of receiver plants exposed to control plant volatiles is not explained in the method section and also not depicted in the Suppl. Fig. 1. Here, the sender plants seem to always have been induced (if the red star-like structure should resemble an induction - a legend may be helpful here).

      Response: We have changed to ‘connected to undamaged sender plants’. We additionally added a sentence to the methods section describing controls L300

      Line 140: This treatment is not described in the methods section. Were the plants only kept under constant conditions for the 2 experimental days? Compared to the induction shown in Fig. 1, the amount of released volatiles seems less here.

      Response: We have added explanation of this to the figure legends, explaining that plants were grown under normal conditions and lights were only left on following treatment

      Another helpful control would be to start the herbivory treatment in the evening hours and leave the light on. If then again plants only release volatiles after a 17 h delay, the response is indeed independent of the diurnal clock of the plant.

      Response: See public review comment. We have added this experiment and discuss it accordingly in the MS (L159-L166 and L207-L211)

      Line 157: Check sentence/grammar

      Response: Checked and modified

      Figure 5: I suggest using a different colour for volatiles released from the sender plants, not again the green also used in the other figures for the receiver plants. This would help the reader to quickly see which plants are in focus in each figure.

      Response: We have changed the color of the figures for clarity

      Figure 6 legend: check grammar in several sentences (use of singular vs. plural)

      Response: We have made the tense uniform

      The diurnal rhythm of jasmonates (and potentially also terpene synthases?) is not considered in the discussion.

      Response: See above, and we have added a sentence to the discussion mentioning the jasmonates (L254-L257)

      Line 230-231: check grammar. Given the complexity, the response pattern may not be so predictable.

      Response: We do not understand this comment, but have checked the grammar throughout the manuscript.

      Line 235: I like the discussion on potential ecological consequences.

      While some interpretation for each experiment is already given in the results section, not all results are discussed in the discussion section. For example, the jasmonate data are not discussed. This should be added.

      Response: See above

      Line 266: To get an idea about the plant size: How many leaves do the plants have in that stage?

      Response: Added a sentence describing the size L287-L288

      Line 321: change to "as in the greenhouse"

      Response: Changed

      Line 334: How were the terpenoids identified and, in particular, quantified?

      Response: Added (L379-L380)

      Line 354: Maybe rather change to: "Plant treatments and tissue collection for phytohormone sampling were identical as described above for terpene and gene expression analysis."

      Response: Changed

      Line 357: add "material" or "leaf tissue" after "flash frozen"

      Response: Added

      Line 359: What was the source of the isotopically labelled phytohormones?

      Response: Added (L400-L403)

      Line 360: The phytohormones are "analyzed" using UPLC. The "quantification" is then done afterward. Please correct.

      Response: Corrected (L404)

      Overall: a great approach and a wonderful idea!

      Thanks

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer 1

      Strengths:

      The major strength of this paper is the series of laser cutting experiments supporting that asters position via pushing forces acting both on the boundary (see below for a relevant comment) and between asters. The combination of imaging, data analysis and mathematical modeling is also powerful.

      Author Response: We thank the Reviewer for the positive comments, especially in recognising the power of our quantitative approaches.

      Weaknesses:

      This paper has weaknesses, mainly in the presentation but also in the quality of the data which do not always support the conclusions satisfactorily (this might in part be a presentation issue).

      Author Response: We address these concerns below.

      My overall suggestion for the authors is to explain better the motivation and interpretation of their experiments and also to remove some of the observations which seem to be there because they could be done rather than because they add to the main message of the paper, which I find straightforward, valuable and supported by the data in Figure 4.

      Author Response: We have extended the motivation of the study in the Introduction, and at the beginning of appropriate Results sections. We better motivate the force potential and especially the key results from Figure 4. We outline specific changes below.

      In Figure 2, it is difficult for me to understand what is being tracked. I believe that the authors track the yolk granules (visible as large green blobs) and not lipid droplets. There is some confusion between the text, legends and methods so I could not tell. If the authors are tracking yolk granules as a proxy for hydrodynamics flows it seems appropriate to cite previous papers that have used and verified these methods. More notably, this figure is somewhat disconnected with the rest of the paper. I find the analysis interesting in principle but would urge the authors to propose some interpretation of the experiments in the context of their big-picture message. At this point, I cannot understand what the Figure adds.

      Author Response: Indeed, we track the yolk droplets that move around the aster. In the extraction protocol, we likely get a mixture of lipid droplets and yolk granules; this is due to the extraction procedure involving shear forces within the pipette. We are not certain about the exact nature of these droplets, but they are likely to a large extent yolk. We have clarified the terminology in the text, the legend and methods section. In this figure, we now show that the droplets do not move towards the aster center as the hydrodynamic pulling model would suggest. Instead, they appear to passively respond to a repulsive force that results in them streaming around the aster. We have added additional panels to the figure that illustrate the directionality of yolk granule movements (lines 159-164). We agree with the Reviewer that the context could have been clarified. The role of fluid flows in biological systems is, as the Reviewer highlights, well studied. We have added additional contextualisation in the text (lines 140-146). We also motivate the figure more clearly, as it provides evidence that the asters generate forces over 20µm scale (lines 159-164). This is highly relevant for one of the paper’s main conclusions – that the Drosophila blastocyst asters generate pushing forces that enable regular packing.

      In Figure 3, it is not surprising that the aster-aster interactions are different from interactions with the boundary which is likely more rigid. It is also hard to understand why the force and thus velocity should scale as microtubule length. This Figure should be better conceptualized. I think that it becomes clear at the end of the paper that the authors are trying to derive an effective potential to use in a mathematical model in Figure 5 to test their hypotheses. I think that should be told from the start, so a reader understands why these experiments are being shown.

      Author Response: We don’t claim that the force scales with microtubule length on a single microtubule. However, at larger distances from the aster, the microtubule density decreases, and hence the effective force decreases.

      The Reviewer is correct that we use these results to motivate our effective potential. We have brought this motivation forward in the manuscript to guide the reader (lines 169-171) and included a further note at the end of the section (lines 216-218).

      The experiments in Figure 4 are very nice in supporting a pushing model. However, it would help if the authors could speculate what the single aster is pushing against in this experiment. The experiments reported in Figure 1 seemed to suggest that the aster mainly pushed against the boundary. In the experiments in Figure 4 do the individual asters touch the boundary on both sides? I think that readers need more information on what the extract looks like for those experiments.

      Author Response: We now include an additional panel B in Figure 4 that shows an example of an explant during aster ablation. The distance between asters is typically less than the distance to the explant boundary. Boundary effects likely play a small role in the aster-aster separation, in terms of potentially determining the axis of separation. However, the separation of asters occurs along a straight line for a substantial period (>1 min) of separation; if boundary effects were more dominant, we may expect to see curving of the aster-aster separation trajectories as they also receive feedback from the boundary.

      Figure 4F could use some statistics. I doubt that the acceleration in the pink curves would be significant. I believe that the deceleration is and that is probably the most crucial result. Since the authors present only 3 aster pairs it is important to be sure that these conclusions are solid.

      Author Response: We agree with the Reviewer. These experiments are challenging to do, as they require carefully controlled conditions. In two out of three experiments we see a significant increase in acceleration in the pink curves. Of course, the interpretation of this must be caveated as our experimental number is low. These details are now provided in the revision (lines 263-267).

      Reviewer 2

      Strengths:

      This study reveals a unique aster positioning mechanics in the syncytial embryo explant, which leads to an understanding of the mechanism underlying the positioning of multiple asters associated with nuclei in the embryo. The use of explants enabled accurate measurement of aster motility and, therefore, the construction of a quantitative model. This is a notable achievement.

      Author Response: We thank the Reviewer for their review, and in highlighting how our quantitative model is a clear step forward in our understanding of aster dynamics.

      Weaknesses:

      The main conclusion that aster repulsion predominates in this system has already been drawn by the same authors in their recent study (de-Carvalho et al., Development, 2022). As the present work provides additional support to the previous study using a different experimental system, the authors should emphasize that the present manuscript adds to it (but the conceptual novelty is limited).

      Author Response: While this study is related to the previous work, there are major differences. First, here we quantitatively assess aster dynamics within a “clean” system. Such accurate measurements are not possible in vivo currently. Further, experiments like laser ablation are much better defined within the explant system. We do recognise more clearly the previous work in the Introduction and lines 291-293, 299-300. Combined, with the different perspectives provided in these papers on the problem of aster positioning in syncytia, we believe these papers provide new and well-supported insights.

      The molecular mechanisms underlying aster repulsion remain unexplored since the authors were unable to identify specific factor(s) responsible for aster repulsion in the explant.

      Author Response: Given that the nature of the aster dynamics was not previously characterised, our work presents a major step forward. We show compelling evidence that an effective pushing force potential plays a role in aster interactions. With this critical knowledge, we can now explore the potential molecular mechanisms – but such information lies beyond the current manuscript scope. This is particularly challenging due to the lack of specific microtubule drug inhibitors in Drosophila. We highlight related issues in the Discussion: paragraph starting on line 340 and lines 367-370.

      Specific suggestions:

      Microtubules should be visualized more clearly (either in live or fixed samples). This is particularly important in Figure 4E and Video 4 (laser ablation experiment to create asymmetric asters).

      Author Response: This is similar to Reviewer 1’s final comment above. These experiments are very challenging and being able to see the microtubules with sufficient clarity is not straightforward. Given our controls and previous experience, we are confident we are ablating the microtubules.

      Minor points:

      1) The authors explain the roles of microtubule asters in several model systems in the first paragraph of the introduction part. Please specify the species and/or cell types in each description.

      Author Response: We have provided these as suggested.

      2) In lines 164 and 172, the citing figure numbers should be modified to Supplementary Fig. 1A and 1B, respectively.

      Author Response: We thank the Reviewer for spotting this error. It has now been corrected.

      3) The authors showed in the previous study that the boundary in the explant does not have an intact cell cortex and f-actin compartments (de-Carvalho et al., Development, 2022). This important information should also be described in the current manuscript. It is also valuable to mention whether the pulling force mechanism operates in embryos where the intact cell cortex is present.

      Author Response: This is an interesting point. We have added text with this information to the Discussion (lines 324-327).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      It is somewhat speculative that the structure represents the EIIa-bound regulatory state. There's a strong enough case that it should be analyzed in the discussion, but I don't think it is firmly established. Therefore, the title of the paper should be changed.

      Our answer: Thank you for the comment. We have changed the title to “Mobile barrier mechanisms for Na+-coupled symport in an MFS sugar transporter”

      Reading through the manuscript, it was challenging to distinguish what is new in the current manuscript and what has been done previously. There were a lot of parts where it was hard for me to identify the main point of the current study among all the details of previous studies. It would also benefit from shortening. For example:

      -Page 6: Nb725 binding has already been characterized extensively in the very nice JBC paper earlier this year. It's important to test 725-4 for binding, but since it doesn't change the binding interaction, and probably wouldn't be expected to, the entire section could be written more succinctly. The main point, which is that 725-4 behaves like 725, is lost among all the details

      Our answer: Thanks for this instructive suggestion. We have shortened the description in this section.

      -Page 9-10. I don't understand what summarizing all of the results from the previous D59C studies adds to the current story. It's important because it provides an indication of the substrate binding site, but its mechanism of action does not seem relevant to the current work.

      Our answer: We have shortened the description of the sugar-binding site and moved the previous Fig. 3b to supplementary figure sFig. 11. According to your comment about showing the location of the binding sites, which is also suggested by Reviewer #2, we modified Fig. 3 and added two panels to map the location of the bound Na+ in the inward-facing structure and the bound sugar in the outward-facing structure.

      The sugar-binding site identified in the published structure is critical to construct the mobile barrier mechanism. The sugar-binding residues identified in the published structure provided essential data to support the conclusion that the sugar-binding pocket is broken in the inward-facing structure. Thus, this published structure is mechanistically relevant to the current study.

      -Page 12. Too much summary of the previous outward structure. Since this is already part of the literature, it would be more efficient to reference the previous data when it is important to interpret the new data (or show as a figure).

      Our answer: The introduction of the previous sugar-binding site is important for the detailed comparison between the two states as discussed above, but we agree with this reviewer and have significantly shortened the paragraph by moving the detailed description into the legend to sFig. 11.

      -Instead of providing the PDB ID in figures of the current structure, just say "current work" or similar. Then it is obvious you are not citing a previous structure.

      Our answer: To distinguish clearly between the new data and published results, the citation of the cryoEM structure [PDB ID 8T60] has been completely removed from the main text but kept in sTable 1.

      -An entire panel of Figure 3 is dedicated to ligand binding in a previous outward-facing structure.

      Showing it in the overlay would be sufficient.

      Our answer: This is the first time we show a structure with a bound Na+. Fig. 3 also illustrates the spatial relationship between the sugar-binding pocket and the cation-binding pocket, since both binding sites are now determined. As stated above, according to the two reviewers’ comments, we have modified the figures, and Fig. 3d is the overlay.

      Please increase the size of the font in all figures. It should be 6-8 point when printed on a standard sheet of paper. Labels in Figure 3, distances in Figure 4, and everything in Figure 5 is hard to see.

      Our answer: Thank you for the comments. We have enlarged the figure size and label font in all figures.

      Figure 2: would be helpful to show Figure S8 in the main text, orienting the reader to the approximate location of substrate binding. What is known about the EIIA-Glc binding interface? Has anyone probed this by mutagenesis? Where are these residues on the overall structure, and are they somewhere other than the nanobody interface?

      Our answer: Thank you for this comment. We have added a panel in Figure 3c to orient readers to the substrate location in MelB. sFig. 8 actually focuses on the details of Nb interactions with MelB. Our current data strongly support the notion that the Nb-bound MelBSt structure mimics the EIIAGlc-bound MelB, but that complex has not been structurally resolved, so we have toned down our statement on EIIAGlc. There is one study suggesting the C-terminal tail helix may be involved in the EIIAGlc binding, which has been added to the discussion.

      Can Figure 5 be split into 2 figures and simplified?

      Our answer: Thanks for the suggestion. We have split it into Figs. 5b and 6 and also moved the peptide mapping to Fig. 5a.

      What is the difference between cartoon and ribbon rendering?

      Our answer: Ribbon: illustrating the structure; cartoon: highlighting the positions with statistically significant protection or deprotection. The statistically significant changes are implied by the ribbon representation; Sphere: not covered by labeled peptides.

      Can the panels showing the kinetic data be enlarged? I don't think they need to surround the molecule. An array underneath would be fine.

      Our answer: We have enlarged all figures and labels. The placement of selected plots around the model could clearly show the difference in deuterium uptake rates between the transmembrane domain and extra-membrane regions. We will maintain this arrangement.

      Do colors in panel A correspond with colors in panel B?

      Our answer: The color usage differs between the two panels. The two panels have now been separated.

      Do I understand correctly that in the HDX experiments, negative values indicate positions that exchange more quickly in the nanobody-free protein relative to the nanobody-bound protein?

      Our answer: Your understanding is correct.

      I assume some of this is due to the protein changing conformation, but some of it might be due to burial at the nanobody-binding interface. Can those peptides be indicated?

      Our answer: Thank you for this comment. We have marked the peptides carrying the Nb-binding residues on uptake plots in Fig. 6 and Extended Fig. 1. Only three Nb-binding residues are covered by many overlapping peptides. Most are not covered, either not carried by the labeled peptides (Tyr205, Ser206, and Ser207) or with insignificant changes (Pro132 and Thr133), except for Asp137, Lys138, and Arg141, which are present in 8 labeled peptides.

      A few positions buried in the outward-facing state are expected to be solvent-exposed in the inward-facing state; unfortunately, in the inward-facing state they are buried by Nb binding.

      Make figure legends easier to interpret by removing non-essential methods details (like buffer conditions).

      Our answer: We removed the detailed method descriptions in most figure legends. Thank you.

      Check throughout for typos.

      ie page 9 Lue Leu

      Page 9 like likely

      Our answer: We have corrected them. Thank you!

      Reviewer #2 (Recommendations For The Authors):

      I have mostly minor questions/remarks.

      • Why not do the hdx-ms experiments in the presence of sugar? That would give a proper distinction between two conformational states, instead of an ensemble of states vs one state.

      Our answer: The MelB conformation induced by sugar is also an ensemble of multiple states, most of which are likely outward-facing and occluded intermediate states. This is also supported by the new finding of an inward state with low sugar affinity. The ideal design would compare one inward and one outward state to understand the inward-outward transition. We have not identified an outward-facing mutant, while we can obtain the inward state with the Nb. WT MelBSt with bound Na+ favors the outward-facing state. Although our design is not ideal, we do have one state vs a predominantly outward-facing WT with bound Na+.

      Minor comments:

      • Fig 5 is misleading as the peptide number does not match with the amino acid sequence. I would suggest putting a heat map with coverage on top. Or showing deuterium uptake per peptide. See examples below.

      Our answer: The peptide number is not expected to match the sequence number. We have 155 overlapping peptides that cover the entire amino acid sequence including the 10-His tag, and there are 60 residues with no data because they are not covered by a labeled peptide. The residue positions that are covered by peptides are indicated approximately by bars on the top. The cylinder length does not correspond to the length of the transmembrane helix and is just for mapping purposes.

      • Can the authors explain how they found that the Nbs bind to the cytoplasmic side (before obtaining the structure)?

      Our answer: Our in vivo two-hybrid assay between the Nb and MelBSt indicated their interaction on the cytoplasmic surface of MelBSt, which is further confirmed by the melibiose fermentation and transport assay, where the transport activities were completely inhibited by intracellularly coexpressed Nb and MelBSt. Thanks for raising this question.

      • The authors use the word "substrate" indifferently for sugar and Na+ binding, which is a bit confusing. Technically, only sugar is the substrate and Na+ is a ligand, or cotransported-ion, that powers the reaction of transport. This might sound like nit-picking but it can lead to misunderstandings (at some point I thought two sugars were transported, and then I was looking for the second Na+ binding site).

      Our answer: We used to call the sugar and Na+ co-substrates, but we agree with this comment.

      We now use “substrate” for the cargo sugar and “coupling cation” for the driving cation.

      • Abstract "only the inner barrier" - the is missing.

      Thanks. We have corrected this.

      • p.3 intro "and identified that the positive cooperativity of cation and melibiose, " something is missing.

      Thanks again. We missed the “as the core symport mechanism”.

      • P.6 Nb275_4 instead of Nb725_4

      Thank you very much for your careful reading.

      • P.7. Also, affinity affinities

      Thank you very much. We changed to “; and also, the -NPG affinity decreased by 21~32-fold for both Nbs”

      • P.8 "contains 417 MelBSt residues (positions 2-210, 219-355, and 364-432)." This does not sum up to 417 residues.

      Thanks for your critical reading. We changed 364-432 to 262-432.
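
      The residue count raised in this comment can be checked with simple arithmetic. The sketch below is an illustration only and is not part of the authors' response: the function names are hypothetical, and the third segment list (starting at 362) is an assumption chosen solely because it sums to 417; the source does not state it.

```python
def count_residues(segments):
    """Total residues in a list of inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


def overlapping_pairs(segments):
    """Return adjacent segment pairs (after sorting) that overlap."""
    ordered = sorted(segments)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] <= a[1]]


# Segments as quoted by the reviewer from the original manuscript.
original = [(2, 210), (219, 355), (364, 432)]
# Segments as stated in the authors' reply (364-432 changed to 262-432).
as_replied = [(2, 210), (219, 355), (262, 432)]
# Hypothetical segments (assumption, not from the source) that sum to 417.
hypothetical = [(2, 210), (219, 355), (362, 432)]

print(count_residues(original))       # 209 + 137 + 69
print(overlapping_pairs(as_replied))  # 262-432 overlaps 219-355
print(count_residues(hypothetical))   # 209 + 137 + 71
```

      Under these assumptions, the originally quoted segments give 415 residues, the range written in the reply overlaps the second segment, and only the hypothetical 362-432 segment reproduces the stated total of 417.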

      • p.9 Lue 54

      We have corrected it to Leu54.

      • I find fig.3 hard to read. Can the authors show the Na+ binding pockets and sugar binding pockets within the structure? Especially figure 3b. why are the residues in different colors?

      Our answer: We have moved Fig. 3b into sFig. 11. We colored the residues in the previous Fig. 3b to match the hosting helices. We have added two panels to show the location of both sugar and Na+ in the molecule. Thank you for your comments.

      • Fig4 bcef. Colored circles at the end of the helices. What are they for?

      Our answer: We revised the legend. “The paired helices involved in either barrier formation were highlighted in the same colored circles.”

      • 86% coverage includes the his-tag - it would be good to clarify that.

      Our answer: Yes, it includes the 10-His tag.

      • Fig.7 - anti clockwise cycle of transport is counter-intuitive.

      Our answer: We have re-arranged it. Our model was originally constructed to explain efflux because of the limited information available at that earlier stage. More data are now available, allowing us to explain influx and active transport.

      • Where are all the uptake plots per peptide for the HDX-MS data?

      Our answer: We have added the course raw data and prepared all uptake plots for all 71 peptides with statistically significant changes as an Extended Fig. 1.

      • P.22 protein was concentrated to 50 mg/mL. Really? That is a lot.

      This is correct. We can even concentrate MelBSt protein to greater than 50 mg/ml.

      • Have the authors looked into the potential role of lipids in regulating the conformational transition? Since the structure was obtained in nanodiscs, have they observed some unexplained densities? The role of lipid-protein interactions in regulating such transitions was observed for several transporters including MFS (Gupta K, et al. The role of interfacial lipids in stabilizing membrane protein oligomers. Nature. 2017 10.1038/nature20820. Martens C, et al. Direct protein-lipid interactions shape the conformational landscape of secondary transporters. Nat Commun. 2018 10.1038/s41467-018-06704-1.). Furthermore, I see the authors have already observed lipid specific functional regulation of MelB (ref: Hariharan, P., et al BMC Biol 16, 85 (2018). https://doi.org/10.1186/s12915-018-0553-0). A few words about this previous work, and even commenting on the absence of lipid-protein interactions in this current work is worthwhile.

      Our answer: Thanks for this very relevant comment. We paid attention to the unmodelled densities. There is one with potential but it is challenging to model it. We have added a sentence “There is no unexplained density that can be clearly modeled by lipids.” in the method to address this concern.

      Reviewer #3 (Recommendations For The Authors):

      1) In the following sentence, the authors report high errors for the Kd value. The anti-Fab Nb binding to NabFab was two-fold poorer than Nb725_4 at a Kd value of 0.11 ± 0.16 μM. The figure however indicates that the error value is 0.016 µM. Pls correct.

      Our answer: Thank you. You are correct. The error has been corrected: 0.16 ± 0.02 µM. In this revised manuscript, we present the data in nM units.

      2) Is the stoichiometry of the MelB:Na+ symport clearly known in this transporter? It can be mentioned in the discussion with appropriate references.

      Our answer: Yes, the stoichiometry of unity has been clearly determined, which was included in the second paragraph of the previous version.

      3) In the last section of results, the authors seem to suggest a greater movement within their C-terminal helical bundle compared to N-terminal helices. Is there evidence to suggest an asymmetry in the rocker switch between the two states of the transporter?

      Our answer: Our structural data revealed that the C-terminal bundle is more dynamic than the N-terminal bundle, which hosts the residues for specific binding of galactoside and Na+. The HDX data showed that the most dynamic regions are the C-terminal tail, which is structurally unresolved by either method, the conserved tail helix and the middle-loop helix. Transmembrane helices are relatively less dynamic, with similar distributions on both transmembrane bundles. Since the most dynamic regions are peripheral elements associated with the C-terminal domain, this might give a wrong impression. With regard to symmetric or asymmetric movement, which will certainly affect the dynamic interactions between the transporter and the lipids, we favor the notion that MelBSt performs symmetric movement during the rocker switch between inward and outward states at the least cost for the protein-lipid interactions.

      4) Figure 1. Are the thermograms exothermic or endothermic? clarify

      Our answer: In our thermograms, all positive peaks are exothermic due to the direct detection of the heat release by the TA instrument. We clarified this in the Methods and now stress it in the figure legends to avoid confusion.

      5) Figure 4a,d. Please put in a membrane bilayer and depict cytosolic and extracellular compartments for clarity.

      Thank you. We have added a bilayer and labeled the sidedness in this figure and other related figures.

      6) Fig 7. Melibiose symport cannot be referred to as Melibiose efflux transport in the legend as the latter refers to antiport. Pls rectify.

      Our answer: Influx and efflux are conventionally used to describe the direction of movement of a substrate. The use of symport and antiport indicates the directions of the coupling reaction for the cargo and cation. For the symporter MelB, melibiose efflux means that sugar with the coupled cation moves out, which is driven by the melibiose concentration. During the steady state of melibiose active transport, efflux rate = influx rate.

      7) Page 11 "A common feature of carrier transporters". The authors can use either carriers or transporters. Need not use both simultaneously.

      Sorry for overlooking this. We have deleted carriers. Thank you very much for your time.

      8) Several typos were noticed in this manuscript. some are listed below. pls correct.

      Page 4- last paragraph "Furthermore"

      We have corrected it. Thank you again!

      Page 7 - second para one repharse "affinity reduced by 21~32 fold/units.." pls clarify

      Added 21~32 fold.

      Page 9 - "so it is highly likely that inward-open conformation" pls correct.

      We have corrected to “likely”.

      Fig. S9c - correct the spelling "Distance".

      We have corrected to “Distance”

    1. Author Response

      eLife assessment

      In this valuable study, the authors investigate the transcriptional landscape of tuberculous meningitis, revealing key molecular differences contributed by HIV co-infection. Whilst some of the evidence presented is compelling, the bioinformatics analysis is limited to a descriptive narrative of gene-level functional annotations, which are somewhat basic and fail to define aspects of biology very precisely. Whilst the work will be of broad interest to the infectious disease community, validation of the data is critical for future utility.

      Response: We appreciate eLife’s positive assessment, although we challenge the conclusion that we ‘fail to define aspects of biology very precisely’. Our stated objective was to use bioinformatics tools to identify the biological pathways and hub genes associated with TBM pathogenesis, and the eLife assessment affirms we have investigated ‘the transcriptional landscape of tuberculous meningitis’. To define aspects of the biology more precisely will require another study with a different design and methods. Therefore, the criticism seems unnecessarily harsh given the limitations of our stated objective.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Tuberculous meningitis (TBM) is one of the most severe forms of extrapulmonary TB. TBM is especially prevalent in people who are immunocompromised (e.g. HIV-positive). Delays in diagnosis and treatment could lead to severe disease or mortality. In this study, the authors performed the largest-ever host whole blood transcriptomics analysis on a cohort of 606 Vietnamese participants. The results indicated that TBM mortality is associated with increased neutrophil activation and decreased T and B cell activation pathways. Furthermore, increased angiogenesis was also observed in HIV-positive patients who died from TBM, whereas activated TNF signaling and down-regulated extracellular matrix organisation were seen in the HIV-negative group. Despite similarities in transcriptional profiles between PTB and TBM compared to healthy controls, inflammatory genes were more active in HIV-positive TBM. Finally, 4 hub genes (MCEMP1, NELL2, ZNF354C, and CD4) were identified as strong predictors of death from TBM.

      Strengths:

      This is a really impressive piece of work, both in terms of the size of the cohort which took years of effort to recruit, sample, and analyse, and also the meticulous bioinformatics performed. The biggest advantage of obtaining a whole blood signature is that it allows an easier translational development into a test that can be used in the clinical with a minimally invasive sample. Furthermore, the data from this study has also revealed important insights into the mechanisms associated with mortality and the differences in pathogenesis between HIV-positive and HIV-negative patients, which would have diagnostic and therapeutic implications.

      Weaknesses:

      The data on blood neutrophil count is really intriguing and seems to provide a very powerful yet easy-to-measure method to differentiate survival vs. death in TBM patients. It would be quite useful in this case to perform predictive analysis to see if neutrophil count alone, or in combination with gene signature, can predict (or better predict) mortality, as it would be far easier for clinical implementation than the RNA-based method. Moreover, genes associated with increased neutrophil activation and decreased T cell activation both have significantly higher enrichment scores in TBM (Figure 9) and in mortality (Figure 8). While I understand the basis of selecting hub genes in the significant modules, they often do not represent these biological pathways (at least not directly associated in most cases). If genes were selected based on these biologically relevant pathways, would they have better predictive values?

      Response: Blood neutrophil count was not found to be a predictor of TBM mortality in our previous studies. We agree it could be useful to perform predictive analysis with neutrophil count, as suggested by the reviewer. Regarding hub genes versus genes representative of the biological pathways, we cannot know which have better predictive value without performing variable selection on the set of all genes, including both hub genes and pathway-representative genes. This is an additional analysis that we will undertake.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript describes the analysis of blood transcriptomic data from patients with TB meningitis, with and without HIV infection, with some comparison to those of patients with pulmonary tuberculosis and healthy volunteers. The objectives were to describe the comparative biological differences represented by the blood transcriptome in TBM associated with HIV co-infection or survival/mortality outcomes and to identify a blood transcriptional signature to predict these outcomes. The authors report an association between mortality and increased levels of acute inflammation and neutrophil activation, but decreased levels of adaptive immunity and T/B cell activation. They propose a 4-gene prognostic signature to predict mortality.

      Strengths:

      -Biological evaluations of blood transcriptomes in TB meningitis and their relationship to outcomes have not been extensively reported previously.

      -The size of the data set is a major strength and is likely to be used extensively for secondary analyses in this field of research.

      Weaknesses:

      The bioinformatic analysis is limited to a descriptive narrative of gene-level functional annotations curated in GO and KEGG databases. This analysis can not be used to make causal inferences. In addition, the functional annotations are limited to 'high-level' terms that fail to define biology very precisely. At best, they require independent validation for a given context. As a result, the conclusions are not adequately substantiated. The identification of a prognostic blood transcriptomic signature uses an unusual discovery approach that leverages the weighted gene network analysis that underpins the bioinformatic analyses. However, the main problem is that the authors seem to use all the data for discovery and do not undertake any true external validation of their gene signature. As a result, the proposed gene signature is likely to be overfitted to these data and not generalisable. Even this does not achieve significantly better prognostic discrimination than the existing clinical scoring.

      Response: As explained in response to the eLife assessment, our objective was to use bioinformatics tools to identify the biological pathways and hub genes associated with TBM pathogenesis. We agree that ‘This analysis can not be used to make causal inferences’: that would require a different study design and different approaches. The proposed gene signature has higher AUC values than the existing clinical model. We agree that validation of the gene signature in an independent sample set will be a crucial next step.
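
      For readers unfamiliar with the metric, the AUC comparison mentioned in this response can be illustrated with a minimal sketch. This is not the study's analysis code: the function name `auc`, the outcome vector and both score vectors are invented toy values, and the rank-based (Mann-Whitney) formulation is only one standard way of computing an AUC. A higher AUC on the discovery data does not by itself show that a signature generalises, which is why independent validation matters.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = died, 0 = survived (hypothetical, not from the study).
died = [1, 1, 1, 0, 0, 0, 0, 0]
clinical_score = [0.8, 0.4, 0.6, 0.5, 0.3, 0.2, 0.7, 0.1]
gene_score = [0.9, 0.7, 0.45, 0.5, 0.3, 0.2, 0.4, 0.1]

auc_clinical = auc(clinical_score, died)
auc_gene = auc(gene_score, died)
print(f"clinical model AUC: {auc_clinical:.3f}")
print(f"gene signature AUC: {auc_gene:.3f}")
```

      In this toy example the second score ranks the cases better than the first; whether such a difference holds in new patients can only be judged on data not used for discovery.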

    1. Author Response

      Author responses to the original review:

      The data we produce are not criticized as such and thus do not require revision; the criticisms concern our interpretation of them. General themes of the reviews are that i) genetic signatures do not matter for defining neuronal types (here sympathetic versus parasympathetic); ii) that a cholinergic postganglionic autonomic neuron must be parasympathetic; and iii) that some physiology of the pelvic region would deserve the label “parasympathetic”. We answered the last argument in (Espinosa-Medina et al., 2018), to which we refer the interested reader; and we fully disagree with the first two. Of note, part of the last sentence of the eLife assessment is misleading and does not reflect the referees’ comments. Our paper analyses genetic differences between the cranial and sacral outflows and uses them to argue that they cannot both be parasympathetic. The eLife assessment acknowledges the “genetic differences” but concludes that, somehow, they don’t detract from a common parasympathetic identity. We take issue with this paradox, of course, but it is consistent with the referees’ comments. On the other hand, the eLife assessment alone pushes the paradox one step further by stating that “functional differences” between the cranial and sacral outflows cannot prevent them from both being parasympathetic either. We would also object to this, but the only “functional differences” used by the referees to dismiss our diagnosis of a sympathetic-like (rather than parasympathetic) character for the sacral outflow are between noradrenergic and cholinergic, and between sympathetic and parasympathetic (and we also disagree with those, see above and below), not between cranial and sacral.

      We will thus use the opportunity offered by eLife to keep the paper as it is (with a few minor stylistic changes). We respond below to the referees’ detailed remarks and hope that the publication, as per eLife’s new model, of the paper, the referees’ comments and our response will help move the field forward.

      Public review by Referee #1

      “Consistently, the P3 cluster of neurons is located close to sympathetic neuron clusters on the map, echoing the conventional understanding that the pelvic ganglia are mixed, containing both sympathetic and parasympathetic neurons”.

      The greater closeness of P3 than of P1/2/4 to the sympathetic cluster can be used to judge P1/2/4 less sympathetic than P3 (and more… something else), but not more parasympathetic. There is no echo of the “conventional understanding” here.

      “A closer look at the expression showed that some genes are expressed at higher levels in sympathetic neurons and in P2 cluster neurons” [We assume that the referee means “in sympathetic neurons and in P3 cluster neurons”] but much weaker in P1, P2, and P4 neurons such as Islet1 and GATA2, and the opposite is true for SST. Another set of genes is expressed weakly across clusters, like HoxC6, HoxD4, GM30648, SHISA9, and TBX20.

      These statements are inaccurate. On the one hand, the classification is not based on a visual impression of the heatmap but on calculations using thresholds. Admittedly, the thresholds have an arbitrary aspect, but the referee can verify (by visual inspection of the heatmap) that genes which we calculate as being at “higher levels in sympathetic neurons and in P3 cluster neurons, but much weaker in P1, P2, and P4 neurons” or vice versa, i.e. in noradrenergic or cholinergic neurons (genes from groups V and VI, respectively), show a much bigger difference than those cited by the referee, and indeed are quasi-absent from the weaker clusters or ganglia. In addition, even by subjective visual inspection:

      Islet is equally expressed in P4 and sympathetics.

      SST is equally expressed in P1 and sympathetics.

      Tbx20 is equally expressed in P2 and sympathetics.

      HoxC6, HoxD4, GM30648, SHISA9 are equally expressed in all clusters and all sympathetic ganglia.
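
      The kind of threshold-based classification referred to above (as opposed to visual inspection of a heatmap) can be sketched as follows. This is only an illustration under stated assumptions: the thresholds `ON` and `OFF`, the function `classify`, the gene names and all expression fractions are hypothetical, and only the sympathetic ganglia and the four pelvic clusters are represented. The thresholds and calculations actually used are those described in the paper.

```python
# Hypothetical thresholds on the fraction of cells expressing a gene.
ON = 0.5   # "expressed" in a group if at least this fraction of cells is positive
OFF = 0.1  # "quasi-absent" from a group if at most this fraction is positive

# Grouping assumed from the text: sympathetic ganglia and pelvic cluster P3
# on one side, pelvic clusters P1, P2 and P4 on the other.
NORADRENERGIC = ["sympathetic", "P3"]
CHOLINERGIC = ["P1", "P2", "P4"]


def classify(fractions):
    """Assign a gene to a group from its per-cluster expression fractions."""
    nor = [fractions[g] for g in NORADRENERGIC]
    chol = [fractions[g] for g in CHOLINERGIC]
    if min(nor) >= ON and max(chol) <= OFF:
        return "noradrenergic group"
    if min(chol) >= ON and max(nor) <= OFF:
        return "cholinergic group"
    if min(nor + chol) >= ON:
        return "shared"
    return "unclassified"


# Invented example values (not taken from the data set).
toy = {
    "gene_A": {"sympathetic": 0.9, "P3": 0.8, "P1": 0.05, "P2": 0.02, "P4": 0.0},
    "gene_B": {"sympathetic": 0.03, "P3": 0.1, "P1": 0.7, "P2": 0.9, "P4": 0.6},
    "gene_C": {"sympathetic": 0.8, "P3": 0.7, "P1": 0.6, "P2": 0.9, "P4": 0.75},
    "gene_D": {"sympathetic": 0.6, "P3": 0.4, "P1": 0.3, "P2": 0.2, "P4": 0.5},
}
results = {gene: classify(f) for gene, f in toy.items()}
for gene, label in results.items():
    print(gene, "->", label)
```

      With such a rule, a gene that merely looks somewhat stronger in one group on a heatmap (like the hypothetical `gene_D`) is not assigned to either group, whereas a gene that is quasi-absent from one side is.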

      “Since the pelvic ganglia are in a caudal body part, it is not surprising to have genes expressed in pelvic ganglia, but not in rostral sphenopalatine ganglia, and vice versa (to have genes expressed in sphenopalatine ganglia, but not in pelvic ganglia), according to well recognized rostro-caudal body patterning, such as nested expression of hox genes.”

      We do not simply show “genes expressed in pelvic ganglia, but not in rostral sphenopalatine ganglia, and vice versa”, i.e. a genetic distance between pelvic and sphenopalatine, but many genes expressed in all pelvic cells and sympathetic ones, i.e. a genetic proximity between pelvic and sympathetic. This situation can be deemed “unsurprising”, but it can only be used to question the parasympathetic nature of pelvic cells (as we do), or considered irrelevant (as the referee does, because genes would not define cell types, see our response to an equivalent stance by Referee#2). Concerning Hox genes, we do take them into account, and speculate in the discussion that their nested expression is key to the structure of the autonomic nervous system, including its division into sympathetic and parasympathetic outflows.

      It is much simpler and easier to divide the autonomic nervous system into sympathetic neurons that release noradrenaline versus parasympathetic neurons that release acetylcholine, and these two systems often act in antagonistic manners, though in some cases, these two systems can work synergistically. It also does not matter whether or not pelvic cholinergic neurons could receive inputs from thoracic-lumbar preganglionic neurons (PGNs), not just sacral PGNs; such occurrence only represents a minor revision of the anatomy. In fact, it makes much more sense to call those cholinergic neurons located in the sympathetic chain ganglia parasympathetic.

      This “minor revision of the anatomy” would make spinal preganglionic neurons that are universally considered sympathetic (in the thoraco-lumbar cord) synapse onto large numbers of parasympathetic neurons (in the paravertebral chains for sweat glands and periosteum, and in the pelvic ganglion), robbing these terms of any meaning.

      Thus, from the functionality point of view, it is not justified to claim that "pelvic organs receive no parasympathetic innervation".

      There never was any general or rigorous functional definition of the sympathetic and parasympathetic nervous systems: it is striking, almost ironic, that Langley, creator of the term parasympathetic and the ultimate physiologist, provides an exclusively anatomic definition in his Autonomic Nervous System, Part I. Hence, our definition cannot clash with any “functionality point of view”. In fact, as we briefly say in the discussion and explore in (Espinosa-Medina et al., 2018), it is the “sacral parasympathetic” paradigm which is unjustified from a functionality point of view, for implying a functional antagonism across the lumbo-sacral gap, which has been disproven repeatedly. It remains to be determined which neurons are antagonistic to which on the blood vessels of the external genitals; antagonism within one division of the autonomic nervous system would not be without precedent (e.g. there exist both vasoconstrictor and vasodilator sympathetic neurons, and both inhibitory and excitatory enteric motoneurons). The way to answering this question is finally open to research, and as Referee #2 says, “it is early days”.

      Public review by Referee #2

      This work further documents differences between the cranial and sacral parasympathetic outflows that have been known since the time of Langley - 100 years ago.

      We assume that the referee means that it is the “cranial and sacral parasympathetic outflows” which “have been known since the time of Langley”, not their differences (that we would “further document”): the differences were explicitly negated by Langley. As a matter of fact, the sacral and cranial outflows were first likened to each other by Gaskell, 140 years ago (Gaskell, 1886). This anatomic parallel (which is deeply flawed (Espinosa-Medina et al., 2018)) was inherited wholesale by Langley, who added one physiological argument (Langley and Anderson, 1895) (which has been contested many times (Espinosa-Medina et al., 2018) and references within).

      In addition, the sphenopalatine and other cranial ganglia develop from placodes and the neural crest, while sympathetic and sacral ganglia develop from the neural crest alone.

      Contrary to what the referee says, the sphenopalatine has no placodal contribution. There is no placodal contribution to any autonomic ganglion, sympathetic or parasympathetic (except an isolated claim concerning the ciliary ganglion (Lee et al., 2003)). All autonomic ganglia derive from the neural crest as determined a long time ago in chicken. For the sphenopalatine in mouse, see our own work (Espinosa-Medina et al., 2016).

      One feature that seems to set the pelvic ganglion apart is […] the convergence of preganglionic sympathetic and parasympathetic synapses on individual ganglion cells (Figure 3). This unusual organization has been reported before using microelectrode recordings (see Crowcroft and Szurszewski, J Physiol (1971) and Janig and McLachlan, Physiol Rev (1987)). Anatomical evidence of convergence in the pelvic ganglion has been reported by Keast, Neuroscience (1995).

      Contrary to what the referee says, we do not provide in Figure 3 any evidence for anatomic convergence, i.e. for individual pelvic ganglion cells receiving dual lumbar and sacral inputs. We simply show that cholinergic neurons figure prominently among targets of the lumbar pathway. This said, the convergence of both pathways on the same pelvic neurons, described in the references cited by the referee, is another major problem in the theory of the “sacral parasympathetic” (as we discussed previously (Espinosa-Medina et al., 2018)).

      It should also be noted that the anatomy of the pelvic ganglion in male rodents is unique. Unlike other species where the ganglion forms a distributed plexus of mini-ganglia, in male rodents the ganglion coalesces into one structure that is easier to find and study. Interestingly the image in Figure 3A appears to show a clustering of Chat-positive and Th-positive neurons. Does this result from the developmental fusion of mini ganglia having distinct sympathetic and parasympathetic origins?

      The clustering of Chat-positive and Th-positive cells could arise from a number of developmental mechanisms, of which we have no idea at the moment. This has no bearing on the sympathetic versus parasympathetic question.

      In addition, Brunet et al. dismiss the cholinergic and noradrenergic phenotypes as a basis for defining sympathetic and parasympathetic neurons. However, see the bottom of Figure S4 and further counterarguments in Horn (Clin Auton Res (2018)).

      The bottom of Figure S4 simply indicates which cells are cholinergic and which are noradrenergic. We have already expounded many times that noradrenergic and cholinergic do not coincide with sympathetic and parasympathetic. Henry Dale (Nobel Prize 1936) demonstrated this. Langley himself devoted several pages of his final treatise to this exception to his “Theory on the relation of drugs to nerve system” (Langley, 1921) (p43) (which was actually a bigger problem for him than it is for us, for reasons which are too long to recount here; it is as if the theoretical difficulties experienced by Langley had been internalized to this day in the form of a dismissal of the cholinergic sympathetic neurons as a slightly scandalous but altogether forgettable oddity). (Horn, 2018) reviews the evidence that the thoracic cholinergic sympathetic phenotype is brought about by a secondary switch upon interaction with the target and argues that this would be a fundamental difference with the sacral “parasympathetic”. But in fact the secondary switch is preceded by co-expression of ChAT and VAChT with Th in most sympathetic neurons (reviewed in (Ernsberger and Rohrer, 2018)); and we have no idea of the dynamic in the pelvic ganglion. It may also be mentioned in this context that target-dependent specification of neuronal identity has also been demonstrated for other types of sympathetic neurons (Furlan et al., 2016).

      What then about neuropeptides, whose expression pattern is incompatible with the revised nomenclature proposed by Brunet et al.?

      There was never any neuropeptide-inspired criterion for a nomenclature of the autonomic nervous system.

      Figure 1B indicates that VIP is expressed by sacral and cranial ganglion cells, but not thoracolumbar ganglion cells.

      Contrary to what the referee says, there are VIP-positive cells in our sympathetic data set and even strongly positive ones, except they are scattered and few (red bars on the UMAP). They correspond to cholinergic sympathetics, likely sudomotor, which are known to contain VIP (e.g. Anderson et al., 2006; Stanke et al., 2006). In other words, VIP is probably part of what we call the cholinergic synexpression group (but was not placed in it by our calculations, probably because of a low expression level in sympathetic noradrenergic cells).

      The authors do not mention neuropeptide Y (NPY). The immunocytochemistry literature indicates that NPY is expressed by a large subpopulation of sympathetic neurons but never by sacral or cranial parasympathetic neurons.

      Contrary to what the referee says, Keast (Keast, 1995) finds 3.7% of pelvic neurons double stained for NPY and VIP in male rats, and says (Keast, 2006) that in females “co-expression of NPY and VIP is common” (thus in cholinergic neurons that the referee calls “parasympathetic”). Single cell transcriptomics is probably more sensitive than immunochemistry, and in our dichotomized data set (table S1), NPY is expressed in all pelvic clusters and all sympathetic ganglia. In other words, it is one more argument for their kinship. It does not appear in the heatmap because it ranks below the 100 top genes.

      Answer to the original recommendations by Referee #2

      Introduction - the use of the words 'consensual' and 'promiscuity' are not clear and rather loaded in the context of the pelvic ganglia. Pick alternative words.

      There is no sexual innuendo inherent in “promiscuity”: “condition of elements of different kinds grouped or massed together without order” (Oxford English Dictionary). We replaced “never consensual” by “never generally accepted”.

      Results - Page 2 - what sex were the mice? Previous works indicate significant sexual dimorphism in the pelvic ganglion.

      The mice included both males and females, and male and female cells are represented in all ganglia and clusters. This is now mentioned in the Materials and Methods. Thus, however unsuited to analyzing sexual dimorphism, our data set ensures that all the cell types we describe are qualitatively present in both sexes.

      Results line 3 - the celiac and mesenteric ganglia are prevertebral ganglia and not part of the sympathetic chain. The chain refers to the paravertebral ganglia.

      We replaced “part of the prevertebral chain” by “belonging to prevertebral ganglia”. This said, there are precedents for “prevertebral chain ganglia” to designate the rostro-caudal series of prevertebral ganglia. Rita Levi-Montalcini, for example, who devoted her glorious career to sympathetic ganglia, wrote in 1972: “The nerve cell population of para- and prevertebral chain ganglia is reduced to 3–5% of that of controls” (doi: 10.1016/0006-8993(72)90405-2).

      Page 3 - "as the current dogma implies". Dogma often refers to opinion or church doctrine. The current nomenclature is neither. Pick another word.

      There is little in science that is proven to the point of eliminating any element of opinion. “Dogma” refers to “that which is held as a principle or tenet […], especially a tenet authoritatively laid down by […] a school of thought” (OED). And “dogma” is used in science to designate tenets better experimentally supported than the “sacral parasympathetic”, such as the “central dogma of molecular biology”.

      Page 3 - "To give justice" implies the classical notion is unjust. How about, 'to further explore previous evidence indicating that ....'

      The term is indeed not proper English for the meaning intended, and the right expression is “to do justice”, to mean: “to treat [a subject or thing] in a manner showing due appreciation, to deal with [it] as is right or fitting” (OED). We have corrected the paper accordingly.

      Page 4 top - the convergence indicated by Figure 3 does not justify excluding cholinergic and noradrenergic genes from the analysis.

      Contrary to what the referee says, Figure 3 does not show any “convergence”, see our answer to Referee#1. What Figure 3 shows is that cells that are targeted by the lumbar pathway (a pathway universally deemed “sympathetic”) are cholinergic in massive proportion. Therefore, by an uncontroversial criterion, the pelvic ganglion contains lots of sympathetic cholinergic neurons. The only other option is to declare that sympathetic preganglionic neurons synapse onto parasympathetic postganglionic ones (which is what Referee#1 proposes, and considers “much simpler”. We beg to differ).

      Our justification for excluding cholinergic and noradrenergic genes from the definition of “sympathetic” and “parasympathetic” is simply that sympathetic neurons can be cholinergic (those to sweat glands and periosteum and, as we show in Figure 3, many targets of the lumbar pathway). One can also note that anywhere else in the nervous system, classifying cell types as a function of neurotransmitter phenotype would lead to nonsensical descriptions, such as putting together pyramidal cells and cerebellar granule cells, or motor neurons and basal forebrain cholinergic neurons. Indeed, Referee#1 proposes such a revolutionary revision by calling all cholinergic autonomic neurons “parasympathetic” (see our answer above).

      Keast (1995) did similar experiments and used presynaptic lesions to draw a different conclusion indicating preferential innervation of pelvic subpopulations.

      Keast found “preferential” innervation of pelvic subpopulations based on lesion experiments. Nevertheless, she concluded (at the time) that “the correct definition of these two components of the nervous system is based on neuroanatomy rather than chemistry” (Keast, 2006).

      Page 4 - "In the aggregate, the pelvic ganglion is best described as a divergent sympathetic ganglion devoid of parasympathetic neurons" The notion of a divergent ganglion is completely unclear!

      We take “divergent” in a developmental or evolutionary meaning: related to sympathetic ganglia, yet somewhat differing from them. Elsewhere we use the word “modified”. Importantly (and as cited in the paper), a similar situation emerges from the single cell transcriptomic analysis of the lumbar and sacral preganglionics (by other research groups).

      Granted, it is devoid of neurons having the signature of cranial parasympathetics, but that is insufficient to conclude that they are not parasympathetics.

      If a genetic signature which is not only un-parasympathetic, but sympathetic-like remains compatible with some version of the label “parasympathetic”, we get dangerously close to dismissing the molecular make-up of a neuron as a definition of its type. This goes against any contemporary understanding of neuron types (take (Zeisel et al., 2018) among hundreds of other examples).

      Page 4 - "the entire taxonomy of autonomic ganglia could be a developmental readout of Hox genes." This reader completely agrees! We appreciate this would be difficult to test but it helps to explain possible differences along the rostro-caudal axis. Consider making this a key implication of the study!

      If the reader agrees, then his/her previous points become mysterious: we speculate that the Hox code determines the structure of the autonomic nervous system, i.e. the array, along the rostrocaudal axis, of a bulbar parasympathetic, a thoracolumbar sympathetic and a lumbo-sacral “pelvo-sympathetic”. The existence of caudal parasympathetic neurons, on the contrary, would subvert any role for Hox genes: similar neurons (similar enough to be called by the same name) would arise at completely different rostro-caudal levels, i.e. with a different Hox code.

      Page 5 - "It is thus remarkable ...that we uncover in no way contradicts the physiology." Not really. The 'classical' sympathetic system innervates the limbs, and the skin and it participates in thermoregulation and in cardiovascular adjustments to exercise. The parasympathetic system does none of these things. Reclassing the pelvic outflow as pseudo-sympathetic contradicts this physiology.

      We do not say that the sacral outflow is classically sympathetic; we go all the way to proposing the special name “pelvo-sympathetic”; and we insist that these special sympathetic-like neurons have special targets (detrusor muscle, helicine arteries…): there is no contradiction. Not only is there no contradiction, but we remove the mind-twister of an anatomical/genetic/cell type-based “sacral parasympathetic” combined with a lack of physiological lumbosacral antagonism (we provide a short history of this dissonance in (Espinosa-Medina et al., 2018)), which led Wilfrid Jänig to write (Jänig, 2006) (p. 357): “Thus, functions assumed to be primarily associated with sacral (parasympathetic) are well duplicated by thoracolumbar (sympathetic) pathways. This shows that the division of the spinal autonomic systems into sympathetic and parasympathetic with respect to sexual functions is questionable”. We could not agree more: this division is questionable in terms of physiology and nonexistent in terms of cell types. In other words, we reconcile cell types with physiology (but “it is early days”).

      Answer to the novel recommendations by Referee #2

      In addition to my original comments, important anatomical and functional distinctions are not explained by the data in this paper. ANATOMY- Sympathetic ganglia are located in close proximity to major branches of the aorta. Cranial and sacral parasympathetic ganglia are located next to or within the structures they innervate (e.g. eye, lung, heart, bladder).

      The pelvic ganglion, including some of its cholinergic neurons that the referee insists are parasympathetic, is further removed from one of its major targets (the helicine arteries of the external genitals) than the sympathetic prevertebral ganglia are from some of theirs (like the gut or kidney). We discussed this issue in (Espinosa-Medina et al., 2018).

      FUNCTION- The sympathetic system controls state variables (e.g. body temperature, blood pressure, serum electrolytes and fluid balance), parasympathetic neurons do not.

      Even in the classical view, the sympathetic system controls the blood vessels of the external genitals or the size of the pupil, for example, which are not state variables.

      […] The data in the paper are a useful next step in defining the genetic diversity of autonomic neurons but do not justify or improve upon existing nomenclature. The future challenge is to understand distinctions between subsets of autonomic ganglion cells that innervate different targets and the principles that govern the integrative function of the autonomic motor system that controls behavior.

      We thank the referee for finding our data useful; and we fully agree with the latter statement. However, neurons, like many other cell types, are hierarchically organized (Zeng and Sanes, 2017), i.e. subsets of neurons belong to sets, with defining traits. Our data argue that there is no parasympathetic neuronal set that includes any pelvic ganglionic neuron. In contrast, there is a ganglionic sympathetic set (defined by our analysis of gene expression) which includes all of them — as there is a preganglionic sympathetic set that includes sacral preganglionics (Alkaslasi et al., 2021; Blum et al., 2021)(although the direct comparison with cranial preganglionics is yet to be made).

      References

      Anderson, C. R., Bergner, A. and Murphy, S. M. (2006). How many types of cholinergic sympathetic neuron are there in the rat stellate ganglion? Neuroscience 140, 567–576.

      Alkaslasi, M. R., Piccus, Z. E., Hareendran, S., Silberberg, H., Chen, L., Zhang, Y., Petros, T. J. and Le Pichon, C. E. (2021). Single nucleus RNA-sequencing defines unexpected diversity of cholinergic neuron types in the adult mouse spinal cord. Nat Commun 12, 2471.

      Blum, J. A., Klemm, S., Shadrach, J. L., Guttenplan, K. A., Nakayama, L., Kathiria, A., Hoang, P. T., Gautier, O., Kaltschmidt, J. A., Greenleaf, W. J., et al. (2021). Single-cell transcriptomic analysis of the adult mouse spinal cord reveals molecular diversity of autonomic and skeletal motor neurons. Nat Neurosci 24, 572–583.

      Ernsberger, U. and Rohrer, H. (2018). Sympathetic tales: subdivisions of the autonomic nervous system and the impact of developmental studies. Neural Dev 13, 20.

      Espinosa-Medina, I., Saha, O., Boismoreau, F., Chettouh, Z., Rossi, F., Richardson, W. D. and Brunet, J.-F. (2016). The sacral autonomic outflow is sympathetic. Science 354, 893–897.

      Espinosa-Medina, I., Saha, O., Boismoreau, F. and Brunet, J.-F. (2018). The “sacral parasympathetic”: ontogeny and anatomy of a myth. Clin Auton Res 28, 13–21.

      Furlan, A., La Manno, G., Lübke, M., Häring, M., Abdo, H., Hochgerner, H., Kupari, J., Usoskin, D., Airaksinen, M. S., Oliver, G., et al. (2016). Visceral motor neuron diversity delineates a cellular basis for nipple- and pilo-erection muscle control. Nat Neurosci 19, 1331–1340.

      Gaskell, W. H. (1886). On the Structure, Distribution and Function of the Nerves which innervate the Visceral and Vascular Systems. J Physiol 7, 1-80.9.

      Horn, J. P. (2018). The sacral autonomic outflow is parasympathetic: Langley got it right. Clin Auton Res 28, 181–185.

      Jänig, W. (2006). The Integrative Action of the Autonomic Nervous System: Neurobiology of Homeostasis. Cambridge: Cambridge University Press.

      Keast, J. R. (1995). Visualization and immunohistochemical characterization of sympathetic and parasympathetic neurons in the male rat major pelvic ganglion. Neuroscience 66, 655–662.

      Keast, J. R. (2006). Plasticity of pelvic autonomic ganglia and urogenital innervation. International Review of Cytology 248, 141 ff.

      Langley, J. N. (1921). The Autonomic Nervous System (Pt. I). Cambridge: Heffer & Sons Ltd.

      Langley, J. N. and Anderson, H. K. (1895). The Innervation of the Pelvic and adjoining Viscera: Part II. The Bladder. Part III. The External Generative Organs. Part IV. The Internal Generative Organs. Part V. Position of the Nerve Cells on the Course of the Efferent Nerve Fibres. J Physiol 19, 71–139.

      Lee, V. M., Sechrist, J. W., Luetolf, S. and Bronner-Fraser, M. (2003). Both neural crest and placode contribute to the ciliary ganglion and oculomotor nerve. Developmental biology 263, 176–190.

      Stanke, M., Duong, C. V., Pape, M., Geissen, M., Burbach, G., Deller, T., Gascan, H., Parlato, R., Schütz, G. and Rohrer, H. (2006). Target-dependent specification of the neurotransmitter phenotype: cholinergic differentiation of sympathetic neurons is mediated in vivo by gp130 signaling. Development 133, 141–150.

      Zeisel, A., Hochgerner, H., Lönnerberg, P., Johnsson, A., Memic, F., van der Zwan, J., Häring, M., Braun, E., Borm, L. E., La Manno, G., et al. (2018). Molecular Architecture of the Mouse Nervous System. Cell 174, 999-1014.e22.

      Zeng, H. and Sanes, J. R. (2017). Neuronal cell-type classification: challenges, opportunities and the path forward. Nat Rev Neurosci 18, 530–546.

    1. Author Response

      Reviewer #2 (Public Review):

      Manassaro et al. present an extensive three-session study in which they aimed to change defensive responses (skin conductance; SCR) to an aversively conditioned stimulus by targeting medial prefrontal cortex (their words) using repetitive TMS prior to retrieval. They report that stimulating mPFC using TMS abolishes SCR responses to the conditioned stimulus, and that this effect is specific for the stimulated region and the specific CS-US association, given that SCR responses to a different modality US are not changed.

      I like how the authors have clearly attempted to control for several potential confounds by including multiple stimulation sites, measured SCR responses to several unconditioned stimuli, and applied the experiment in multiple contexts. However, several conceptual and practical issues remain that I think limit the value of potential conclusions drawn from this work.

      The first issue that I have with this study concerns the relationship between the TMS manipulation and the theoretical background the authors present in their rationale. In the introduction the authors sketch that what they call 'mPFC' is involved in regulation of threat responses. They make a convincing case; however, almost all of the evidence they present concerns the ventromedial part of the prefrontal cortex (refs 18-25). The authors then mention that no one has ever studied the effects of 'mPFC'-TMS on threat memories. That is not surprising given that stimulating vmPFC with TMS is very difficult, if not impossible. Simulation of the electrical field that develops as a consequence of the authors' manipulation (using the same TMS coil and positioning the authors use) shows that vmPFC (or mPFC for that matter) is not stimulated. The authors then continue in the methods section stating that the region they aimed for was BA10. This region they presumably do stimulate; however, that does not follow logically from their argument. BA10 is anatomically, cytoarchitectonically and functionally a wholly different area than vmPFC and I wonder if their rationale would hold given that they stimulate BA10.

      We would like to thank the Reviewer for highlighting this very important point. The Reviewer is right in stating that the Brodmann area 10 (BA 10) is anatomically, cytoarchitectonically, and functionally distinct from the ventromedial PFC. As we reported in the Methods section, the coil placement over the frontopolar midline electrode (Fpz) according to the international 10‒20 EEG coordinate system directly focused the stimulation over the medial portion of the BA 10. In the literature, the anterior prefrontal cortex (aPFC) is also known as the “frontopolar cortex” or the “rostral frontal cortex” and encompasses the most anterior portion of the prefrontal cortex, which corresponds to the BA 10. In line with this observation, we have replaced “medial prefrontal cortex” (mPFC) with “medial anterior prefrontal cortex” (aPFC) throughout the manuscript. We have also revised the theoretical background and the rationale in the Introduction section by mentioning several studies that: i) Reported the involvement of the aPFC in emotional down-regulation (Volman et al., 2013; Koch et al., 2018; Bramson et al., 2020). ii) Traced anatomical connections between the medial/lateral aPFC and the amygdala (Peng et al., 2018; Folloni et al., 2019; Bramson et al., 2020). iii) Detected functional connections between the aPFC and the vmPFC during fear down-regulation (Klumpers et al., 2010). iv) Found hypoactivation, reduced connectivity, and altered thickness of aPFC in PTSD patients (Lanius et al., 2005; Morey et al., 2008; Sadeh et al., 2015; Sadeh et al., 2016). v) Revealed that strong activation of the aPFC may promote a higher resilience against PTSD onset (Kaldewaij et al., 2021) and that enhanced aPFC activity and potentiated aPFC-vmPFC connectivity are detectable after effective therapy in PTSD patients (Fonzo et al., 2017). Furthermore, we discussed our results in light of this evidence in the Discussion section. We sincerely thank the Reviewer for this key improvement to our study.

      The second concern I have is that although I think the authors should be praised for including both sham and active control regions, the controls might not be optimally chosen to control for the potential confounds of their condition of interest (mPFC-TMS). Namely, TMS on the forehead can be unpleasant, if not painful, whereas sham-TMS or TMS applied to the back of the head or even over dlPFC is not (or less so at the very least). Given that the SCR results after mPFC TMS show exactly the same temporal pattern as the sham-TMS but with a lower starting point, one could wonder whether a painful stimulation prior to the retrieval might have already caused habituation to painful stimulation observed in SCR in consequent CS presentations. A control region that would have been more obvious to take is the lateral part of BA10, by moving the TMS coil several centimeters to the left or right, circumventing all things potentially called medial but giving similar unpleasant sensations (pain etc).

      We would also like to thank the Reviewer for bringing this issue to light and allowing us to strengthen our results. The Reviewer is right in pointing out that rTMS application over the forehead can be subjectively perceived as unpleasant, relative to other head coordinates or sham stimulation. The question of whether an unpleasant stimulation prior to the retrieval might provoke habituation to discomfort sensations and lead to weaker SCRs in the subsequent CS presentations is valid and reasonable. We also thank the Reviewer for advising us to stimulate the lateral part of BA 10 as an active control site. However, given the potential involvement of the lateral BA 10 in the fear network (see previous point) and the potential risks due to the anatomical proximity of lateral BA 10 to the temporal lobe, we decided to adopt an alternative approach to investigate whether “a painful stimulation prior to the retrieval might have already caused habituation to painful stimulation observed in SCR in consequent CS presentations”. We repeated the entire experiment in one further group (ctrl discomfort, n = 10) by replacing the rTMS procedure with a 10-min discomfort-inducing procedure over the same site of the forehead (Fpz) to mimic the rTMS-evoked unpleasant sensations in the absence of neural stimulation effects (see the new version of the Methods section). The electrical stimulation intensity was individually calibrated through a staircase procedure (0 = no discomfort; 10 = high discomfort). The shock amplitude was set at the current level corresponding to the mean rating of ‘4’ on the subjective scale because, in the new experiments that we performed targeting the aPFC with rTMS (n = 9), we collected participants’ rTMS-induced discomfort ratings, obtaining a mean rating of 3.833 ± 0.589 SEM on the same scale. We found that CS-evoked SCR levels were not significantly different from those of the sham group during the test session as well as during the follow-up session, suggesting that the discomfort experienced during the rTMS procedure did not contribute to the reduction of electrodermal responses observed in the aPFC group. We reported the results of this experiment in the Results section and Figure 2-figure supplement 2.

      My final concern is that the main analyses are performed on single trials of SCR responses, which is a relatively noisy measure to use on single trials. This is also done in relatively small groups (n=21). I would have liked to see both the raw or at least averaged timeseries SCR data plotted, and a rationale explaining how the authors decided on the current sample sizes; if that was based on a power analysis, one must have expected quite strong effects.

      Following the Reviewer’s suggestion, we decided to remove the analysis on single trials, and we apologize for not including SCR timeseries. To quantify the size of the effect induced by the rTMS protocol, we have now added within-group comparisons (through 2 × 2 mixed ANOVAs) that show, for each group, the amount of change in CS-evoked SCRs from the conditioning phase to the test phase, as well as from the conditioning phase to the follow-up phase. Furthermore, to depict these changes directly and simply, in addition to dot plots, we have also represented them with line charts (Figs. 2C, 2H, 4C, 4H, 5C, 5H). To estimate the sample size, we had previously performed a power analysis through G*Power 3.1.9.2, which had resulted in n = 21 per experimental group. However, after correcting the data pre-processing procedures (in accordance with Reviewer 1), we obtained data that were not normally distributed. Thus, we decided to increase our sample size by re-performing a power analysis (with the newly suggested statistical analyses) and then repeating the experiments. For the main statistics, i.e. mixed ANOVA (within-between interaction) with two groups and two measurements, with the following input parameters: α equal to 0.05, power (1-β) equal to 0.95, and a hypothesized effect size (f) equal to 0.25, the new estimated sample size was n = 30 per experimental group.
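      As a rough illustration of how the inputs listed above enter this kind of calculation, the sketch below computes the noncentrality parameter and the degrees of freedom of the F test for a within-between interaction, following the convention documented for G*Power (λ = f² · N · m / (1 − ρ)). The correlation among repeated measures (rho = 0.5) is an assumption and is not stated in the response, as are the function names; the sketch does not reproduce the reported sample size, which also depends on settings not given here.

```python
def interaction_noncentrality(f, n_total, n_measurements, rho):
    """Noncentrality parameter for a within-between interaction,
    lambda = f^2 * N * m / (1 - rho), assuming sphericity (epsilon = 1)."""
    return f ** 2 * n_total * n_measurements / (1.0 - rho)


def interaction_dfs(n_total, n_groups, n_measurements):
    """Numerator and denominator degrees of freedom of the interaction F test."""
    df1 = (n_groups - 1) * (n_measurements - 1)
    df2 = (n_total - n_groups) * (n_measurements - 1)
    return df1, df2


# Inputs taken from the response
f = 0.25              # hypothesized effect size
n_groups = 2
n_measurements = 2
n_per_group = 30

# ASSUMPTION: correlation among repeated measures (not stated in the response)
rho = 0.5

n_total = n_groups * n_per_group
lam = interaction_noncentrality(f, n_total, n_measurements, rho)
df1, df2 = interaction_dfs(n_total, n_groups, n_measurements)
print(f"lambda = {lam}, df1 = {df1}, df2 = {df2}")
```

      Power is then the probability that a noncentral F variable with these parameters exceeds the critical value at the chosen α; a larger assumed correlation raises λ and lowers the required sample size.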

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, the Authors implement a delayed feedback control method and use it for the first time in biological neuronal networks. They extend a well-established computational theory and expand it into the biological realm. With this, they obtain novel evidence, never considered before, that showcases the difference between simulated neuronal networks and biological ones. Furthermore, they optimize the DFC method to achieve optimal results in the control of cell excitability in the context of biological neuronal networks, taking advantage of a closed-loop stimulation setup that, by itself, is not trivial to build and operate and that will certainly have a positive impact on the fields of cellular and network electrophysiology.

      Regarding the results, it would be very constructive if the Authors could share the code for the quasi-real-time interface with the Multichannel Systems software (current and older hardware versions), as this likely represents a bottleneck preventing more researchers from implementing such an experimental paradigm.

      On the data focusing on the effects of the DFC algorithms on neuronal behavior, the evidence is very compelling, although more care should be devoted to the statistical analyses, since some of the applied statistical tests are not appropriate. In a more biological sense, further discussion and clarification of the experimental details would improve this manuscript, making it more accessible and clearer for researchers across disciplines (i.e., ranging from computational to experimental Neuroscience) and increasing the impact of this research.

      In summary, this work represents a necessary bridge between recent advances in computational neuroscience and the biological implementation of neuronal control mechanisms.

      Regarding sharing the control code, our application for closed-loop stimulation using aDFC, DFC and Poisson is now available in GitHub (https://github.com/NCN-Lab/aDFC). This was, in fact, our initial intention following the reviewing process. With this application, the user can run the developed algorithms with the MEA2100-256 System from Multi Channel Systems MCS GmbH.

      Same with the data. The dataset with the spike data from all experiments is also now publicly available in Zenodo. The data can be found in https://doi.org/10.5281/zenodo.10138446.

      Regarding the improvements in the statistical analysis, the tests are now performed following Reviewer #1's suggestions. It is important to emphasize that this did not change the results or conclusions of the work.

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Grove and colleagues analyzes the role of TEAD1 transcription factors in all events regulating PNS myelin formation and maintenance and regeneration. Throughout the manuscript, the authors compare the results obtained to those they previously described in YAP/TAZ double knockout mice. Strengths of the manuscript are the combined in vivo analyses, generating mutants constitutively lacking TEAD1 expression in myelinating Schwann cells (P0Cre//TEAD1f/f mice: cKO) and mutants in which TEAD1 expression can be ablated after tamoxifen-mediated recombination in myelinating Schwann cells (PlpCreER//TEAD1f/f mice: iKO). Using this approach the authors were able to assess the role of TEAD1 in all aspects related to PNS myelin: formation as well as maintenance and remyelination after injury. By exploiting these models, they were able to define the role of TEAD1 in regulating Schwann cell proliferation as well as in the cholesterol biosynthetic pathway. Collectively, their data indicate that TEAD1 has a composite role in PNS myelination, being required for developmental myelination, but dispensable for myelin maintenance. Further, they also describe a role for TEAD1 in promoting PNS remyelination after an injury event.

      Despite these strengths, there are some weaknesses that should be addressed by the authors:

      1) The manuscript would benefit from better and more detailed analysis of the role of the other TEAD transcription factors, as they are likely redundant in function to TEAD1. For example, since in cKO mice some fibers can escape the sorting defect and eventually myelinate, albeit at a lower level, could they determine whether TEAD2-4 transcription factors might compensate for TEAD1 absence in this setting?

      We speculate that other TEADs, most likely both TEAD2 and TEAD3, compensate for TEAD1 in myelinating some developing axons. We also speculate that TEAD4 counteracts TEAD1, resulting in excessive proliferation of Schwann cells in Tead1 cKO. Unfortunately, because, unlike for TEAD1, floxed/congenic alleles and IHC-compatible antibodies are not yet available for TEAD2-4, it is difficult to determine their roles. We attempted to knock down TEAD2-4 by injecting AAV-shRNAs into the sciatic nerves of WT and Tead1 iKO, but this intervention was not successful. Our future studies will determine compensatory and/or opposing roles of other TEADs during development and homeostasis and after nerve injury.

      2) A striking result of the study is the morphological defects observed in the process of axonal sorting and in the Remak fibers formation of TEAD1 cKO mice. To explain the sorting defect, the authors correctly analyze Schwann cell proliferation. However, since axonal sorting is mediated by the interaction between the extracellular matrix and intracellular cytoskeleton rearrangement, they should address also these two aspects. As per the Remak bundles and the poly-axonal myelination they observe, it is difficult to reconcile this "abnormal" myelination with the fact that TEAD1 cKO mice have a very severe myelinating phenotype, which is persistent in adulthood.

      It is noteworthy that we found radial sorting to be delayed, but not blocked, in Tead1 cKO, as we had previously reported for Yap/Taz cDKO mice in our earlier publication (Grove et al., eLife 2017). The primary reason that myelin development fails in Schwann cells lacking YAP/TAZ (or TEAD1 in the present report) is that they do not initiate myelination of sorted axons, not that radial sorting is defective. We showed that radial sorting was delayed in Schwann cells lacking YAP/TAZ because of their late S phase entry (Figure 4 in Grove et al., eLife 2017). In addition, our earlier report demonstrated that the key laminin receptor, integrin α6, is strongly downregulated but axons are nevertheless sorted out by Schwann cells in Yap/Taz cDKO (Figure 4-figure supplement 2 in Grove et al., eLife 2017). Our current view, therefore, is that extracellular matrix may contribute to reducing Schwann cell proliferation (Berti et al., 2011; Pellegatta et al., 2013; Yu, Feltri, Wrabetz, Strickland, & Chen, 2005), which helps to delay radial sorting, but that it is not required for Schwann cells lacking YAP/TAZ (or TEAD1) to sort axons (see the author response #2 in Grove et al., eLife 2017). Based on this information, we disagree with the reviewer that it is essential for us to address the role of extracellular matrix in delaying radial sorting in Tead1 cKO.

      Regarding Remak bundles, ‘thinly’ myelinated Remak bundles are only ‘occasionally’ observed in Tead1 cKO mice. Given that some large axons are still myelinated in Tead1 cKO mice, likely due to compensation by other TEADs, we speculate that Remak bundles are occasionally myelinated by other TEADs in Tead1 cKO. We have clarified our description and expanded our discussion of TEAD1 regulation of Remak bundles, including abnormal polyaxonal myelination.

      3) In the analyses of the cholesterol biosynthetic pathway, TEAD1 seems to be only partly involved. Again, which is the role of any of the other TEADs?

      Examining cholesterol biosynthesis pathways (SREBP1 and 2) and their target enzymes (SCD1, HMGCR, FDPS, IDI1) in Tead1 cKO and Yap/Taz cDKO, we showed that TEAD1 is required for upregulating FDPS and IDI1. These data suggest that TEAD1 plays a major role in mediating YAP/TAZ-driven cholesterol synthesis by upregulating FDPS and IDI1. It is also important to note that FDPS and IDI1 levels are reduced in Tead1 cKO as greatly as in Yap/Taz cDKO (Figure 5). We therefore speculate that other TEADs compensate for TEAD1 modestly, if at all, in upregulating FDPS and IDI1. We do not rule out the possibility, however, that other TEADs fully compensate for TEAD1 in ‘maintaining’ cholesterol synthesis in adult Schwann cells. We will address these important questions in the future when the key resources mentioned above become available to study TEAD2-4.

      4) Why do cKO mice die before P60?

      In accordance with IACUC guidelines, we humanely euthanized Tead1 cKO mice before P60 because, like Yap/Taz cKO mice, they develop severe peripheral neuropathy.

    1. Author Response

      Reviewer #2 (Public Review):

      In this paper, the authors discover that postsynaptic mitochondria in C. elegans govern glutamate receptor trafficking dynamics. The core results are two-fold. For one, they find that loss or inhibition of mcu-1 - the C. elegans mitochondrial calcium uniporter - increases GLR-1 glutamate receptor accumulation at the postsynaptic dendritic sites and enhances its trafficking dynamics. The authors hypothesize that this effect on glutamate receptors may have something to do with mitochondrial ROS production. This is because ROS is a by-product of normal oxidative phosphorylation, downstream of calcium import. Indeed, the generation of artificially high amounts of mitochondrial ROS has the opposite effect of mcu-1 loss: decreased glutamate receptor subunit accumulation. Collectively, the results support the idea that mitochondrial function can control receptor dynamics at synaptic sites. This is interesting because tight control of synaptic function likely combines several mitochondrial functions: energy production, calcium buffering, and (here) ROS signaling.

      STRENGTHS

      • The C. elegans genetic model is a strength because the authors are able to make refined conclusions by classical loss-of-function mutants (e.g., mcu-1) along with an impressive cytological toolkit to examine GLR-1 dynamics.

      • The use of pharmacology as a second means to test those genetic conclusions is a strength.

      • The authors' careful reagent verification of reporters (Ca2+, ROS, etc.) is a strength.

      • The ability to link fundamental mitochondrial processes to GLR-1 exocytosis will expand how the field thinks about mitochondrial synapse function.

      WEAKNESSES

      For the most part, the data in the paper support the conclusions, and the authors were careful to try experiments in multiple ways. But please see below:

      • (Main Point) The data are good, but they fall short of mechanism (e.g., Line 322). Figure 6 is accurate as drawn. But calcium and ROS are not abstract signals. They are likely exerting affirmative actions on specific targets. The Discussion does acknowledge this in terms of ROS and it speculates on possible targets.

      We thank the reviewer for their analytical review of our manuscript. We agree that all molecular players involved in the proposed mechanism were not identified by the data presented, so we modified the text to remove overstatements. We also agree that Ca2+ and ROS signaling is not abstract. Rather, there are specific and diverse targets of both Ca2+ and ROS signaling. Follow-up experiments are underway to identify and provide evidence for the necessity of potential ROS/Ca2+ targets in this proposed mechanism. For the current manuscript, we have modified our verbiage in an attempt to not mislead or overstate what our results suggest (e.g., changes/additions to the beginning of the ‘Discussion’, lines 365-377 and 385-388) and updated the illustration of the proposed model to include dashed lines that, as mentioned in the figure legend, indicate indirect action by ROS and Ca2+ (see revised Figure 7).

      The general idea seems to be that mitochondria import calcium through MCU-1 (and interacting factors). As a result, oxidative phosphorylation successfully occurs and mitochondrial ROS is a signaling by-product that signals glutamate receptors not to undergo exocytosis. But there are other interpretations of what might happen in between. In fact, if OXPHOS is disrupted, it is known that this can generate a lot more mitochondrial ROS than the normal by-product levels.

      We do agree that an alternative explanation could be that genetic or pharmacological inhibition of mitochondrial Ca2+ uptake disrupts oxidative phosphorylation, and as a result, inefficiencies or uncoupling in the electron transport chain would lead to an even greater increase in mitochondrial ROS production. Although oxidative phosphorylation was not directly measured, one of our post hoc analyses of GLR-1 transport suggests ATP levels are comparable among controls, mcu-1 mutants, and Ru360-treated animals: the velocity of GLR-1 transport is unchanged between these experimental groups. The processivity of molecular motors (which dictates transport velocity) is highly sensitive to relative ATP abundance. Thus, if ATP levels were dramatically decreased in mcu-1 mutants or following Ru360 treatment, then one would expect a detectable change in GLR-1 transport velocities, but we observed no change (see revised Figure S2E and related discussion at lines 183-190). Although these results do not directly indicate whether ATP production is altered with loss or inhibition of MCU-1, they do suggest that basal ATP levels remain sufficient to support the metabolic demands of GLR-1 transport.

      This reviewer wonders if excess ROS would cause an extreme response. Or alternatively, if scavenging ROS via pharmacological scavengers or SOD expression would reverse the effects.

      These are good points, and we have previously published experiments that address each of them. First, we have seen that globally increasing ROS with various concentrations of H2O2 within the physiological range (<100 nM) decreased GLR-1 transport to a similar extent (PMID: 32847966) indicating that there is not a dose-dependent decrease in GLR-1 transport. We have also assessed GLR-1 transport after treatment with concentrations of H2O2 well above the physiological range (e.g., 500 nM), but these high concentrations obliterated all GLR-1 transport. Contrary to what one may expect, we showed that decreasing ROS via pharmacological or genetic means (probably below physiological range) decreased GLR-1 transport (PMID: 35622512) via a Ca2+ independent mechanism. In other words, ROS scavenging did not have the opposite effect on GLR-1 transport, but we have not combined ROS scavenging with optical induction of ROS production (e.g., via KillerRed) nor have we assessed the potential influence of ROS scavenging on synaptic recruitment. Although we agree that these are important follow-up experiments, they will require a more sensitive ROS indicator because current genetically encoded in vivo ROS sensors cannot detect decreases in ROS levels below the physiological range (< 10 nM) (PMID: 31586057).

      Small Points

      • 33.3 mHz - just making sure, do the authors mean once every 30 seconds? That would be more straightforward.

      Yes, we do mean a 1-second pulse of light every 30 seconds. We have clarified this in the manuscript text (line 115).
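      The equivalence between the two ways of stating the stimulation rate is simple arithmetic (period = 1 / frequency). A minimal sketch follows; the function names are illustrative only, and the 1-second pulse width is taken from the response above.

```python
def period_seconds(frequency_mhz):
    """Convert a repetition rate in millihertz to a period in seconds."""
    return 1.0 / (frequency_mhz * 1e-3)


def duty_cycle(pulse_s, period_s):
    """Fraction of each cycle during which the light is on."""
    return pulse_s / period_s


period = period_seconds(33.3)   # roughly 30 s between pulse onsets
duty = duty_cycle(1.0, period)  # a 1-second pulse once per period
print(f"period = {period:.2f} s, duty cycle = {duty:.1%}")
```

      In other words, 33.3 mHz corresponds to one pulse roughly every 30 seconds, with the light on for about 3% of the time.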

      • Figure 2 is confusing. The text says that the mcu-1 mutants have a GLR-1::GFP FRAP rate that is comparable to controls (Lines 165-167). But Figure 2E suggests that it is markedly less, which is the opposite result of the slight increase in rate resulting from Ru360 treatment. And is the explanation why the GLR-1::GFP results differ from the SEP::GLR-1 results a difference between total GFP vs. surface GFP?

      The confusion is due to an incorrect statement in the results text. We have corrected this error and thank the reviewer for bringing it to our attention (lines 173-174).

      • I could not watch Video 2 (not sure if it is the file or just the copy I downloaded).

      We thank the reviewer for bringing this to our attention and we believe we have remedied the issue.

      • It is good that the authors tried both optical stimulation and mechanical stimulation (dropping culture plates to stimulate the worms, Figure 3). Why was the mechanical stimulation set aside for further tests in the paper?

      Mechanical stimulation consisted of dropping culture plates containing 2-3 C. elegans onto a lab bench every 30 seconds for 5 or 10 minutes. This mechanical stimulation paradigm was technically cumbersome and was less effective at inducing changes in mito-roGFP fluorescence than optical stimulation. This is likely due to habituation to the mechanical stimulus, which has been well characterized in C. elegans. The optical stimulation was therefore used as it is a more reliable and repeatable method for stimulating the AVA neuron.

      • Does this process affect all kinds of transport, or is it just the glutamate receptors? Was anything else examined?

      Transport of other proteins has not been examined in the context of mitoROS signaling. Our attempts at visualizing and quantifying the transport, synaptic delivery and exocytosis of other synaptic proteins in vivo have proven more technically challenging, likely due to relatively lower expression in the C. elegans neurons suitable for transport analysis.

      Reviewer #3 (Public Review):

      Reactive oxygen species (ROS) have been previously shown to regulate glutamate receptor phosphorylation, long-distance transport, and delivery of glutamate receptors to synapses, however, the source of ROS is unclear. In this study, the authors test if mitochondria act as a signaling hub and produce ROS in response to neuronal activity in order to regulate glutamate receptor trafficking. The authors use a variety of optogenetic tools including the calcium reporter mitoGCaMP and the ROS reporter mito-roGFP to monitor changes in calcium and ROS, respectively, in mitochondria after activating neurons with ChRimson in the genetic model organism C. elegans. Repeated stimulation of interneurons called AVA with ChRimson leads to increased calcium uptake into mitochondria in dendrites and increased mitochondrial ROS production. The mitochondrial calcium uniporter mcu-1 is required for these effects because mcu-1 genetic loss of function or treatment with Ru360, a drug that inhibits mcu-1, inhibits the uptake of calcium into mitochondria and ROS production after neuronal activation. Mcu-1 genetic loss of function is correlated with an increase in exocytosis of glutamate receptors but a decrease in glutamate receptor transport and delivery to dendrites. This study suggests that mitochondria monitor neuronal activity by taking up calcium and downregulating glutamate receptor trafficking via ROS, as a means to negatively regulate excitatory synapse function.

      Strengths

      -The use of multiple optogenetic tools and approaches to monitor mitochondrial calcium, reactive oxygen species, and glutamate receptor trafficking in live organisms.

      -Identifying a novel signaling role for dendritic mitochondria which is to monitor neuronal activity (via calcium uptake into mitochondria) and generate a signal (reactive oxygen species) that regulates glutamate receptors at synapses.

      Weaknesses

      -Although the use of KillerRed to generate ROS downstream of mcu-1 is a clever approach, the fact that activation of KillerRed results in reduced GLR-1 exocytosis, delivery, and transport raises the concern that KillerRed is generating a high level of ROS that might be toxic to cellular processes. Experiments showing that other cellular processes are not affected by KillerRed activation and testing if reduced ROS production mimics the effects of blocking mcu-1 would strengthen the conclusions in this study.

      We thank the reviewer for their careful analyses of our findings. It is plausible that KillerRed could cause toxic levels of ROS; in fact, it was originally used to instigate oxidative stress-induced apoptosis to achieve cell-specific ablation. These cell ablation protocols required 20+ minutes of KillerRed activation with substantially higher levels of irradiation (e.g., 3.8 mW/mm2 [PMID: 24209746] vs. our light dosage of 25 µW/mm2). Additionally, our transgenic C. elegans strains expressing KillerRed were designed to have a relatively low KillerRed expression and were screened for low expression based on KillerRed’s fluorescence. Using these strains, we were able to minimally activate KillerRed in the AVA neuron, resulting in ROS elevations at mitochondria that were comparable to neuronal activity-induced increases in mitochondrial ROS as measured by mito-roGFP. Specifically, we found that 10 minutes of mechano-stimulation and 5 minutes of ChRimson stimulation increased the fluorescence ratio (Fratio) of mito-roGFP nearly two-fold (Figure 4A-B and 4C-E). A 15-second pulse of light focused on a small region activating mitoKR in the AVA neurite also caused a similar two-fold increase in the mito-roGFP Fratio (Figure 4C-E), comparable to what neuronal activity induced. Our 5-minute global KillerRed activation increased the mito-roGFP Fratio at mitochondria in the AVA neurite less effectively than neuronal activity did (revised Figure 4B and 4H) but was sufficient to decrease GLR-1 transport (revised Figure 5G-H). We therefore decided to do all experiments with 5 minutes of global KillerRed activation, since lower activation levels of KillerRed were more likely to achieve non-toxic, signaling levels of ROS. Since we strongly agree that these data are important for tool validation, we have reorganized the manuscript such that they are now a primary figure (see revised Figure 4 and new results sub-section starting at line 252).

      Additionally, we added supplemental transport velocity data. This data shows that local photoactivation as well as whole-cell activation of KillerRed does not alter transport velocity of GLR-1 vesicles within the neurite (revised Figure S4A and S4B and lines 272-276 and 287-289), which would be the case if ATP, microtubules, or actin dynamics were affected. This supports that our local and whole-cell activation protocol does not cause toxic levels of ROS production.

      Lastly, the reviewer questions whether decreasing ROS alters GLR-1 transport, synaptic delivery and exocytosis in a similar fashion to loss or inhibition of mcu-1, and if so, would further support the proposed mechanism. We have decreased ROS via genetic (catalase overexpression) and pharmacological (using the mitochondria-targeted antioxidant MitoTEMPO) means and seen that diminished ROS levels decrease GLR-1 transport albeit to a lesser degree than that caused by loss/inhibition of mcu-1 (PMID: 35622512). To determine if decreased GLR-1 transport during diminished ROS levels involves mcu-1, we would need to assess GLR-1 transport in mcu-1 mutants while ROS is decreased (e.g., using MitoTEMPO treatment) to see if their combined effect phenocopies the effect of mcu-1(lf) or decreased ROS alone. However, as mentioned previously, we are unable to measure ROS levels below the sensitivity of roGFP but within physiological range so we cannot currently calibrate or validate our methods for scavenging ROS in vivo. This is why we have not yet analyzed synaptic delivery or exocytosis rates of GLR-1 in the context of decreased ROS, but these would be interesting follow-up experiments that may further support our model once more sensitive ROS sensors are available.

      Reviewer #4 (Public Review):

      Using optogenetic stimulation, the authors presented compelling evidence that neuronal activity increases mitochondrial calcium levels, facilitated by the mitochondrial uniporter MCU-1. Through ratiometric measurements, they showed that mitochondrial ROS levels also increase due to neuronal activity via MCU-1. Subsequent FRAP studies were employed to investigate the trafficking of the AMPA receptor, GLR-1. By integrating genetic and pharmacological methodologies, the recovery rate of GLR-1 was assessed. The authors concluded that increased mitochondrial ROS due to neuronal activity reduces the trafficking and exocytosis of AMPA receptors. They proposed that mitochondrial ROS serves as a homeostatic mechanism regulating AMPA receptor trafficking and abundance, thus maintaining synaptic strength. This research is crucial as it provides a direct link between mitochondrial signaling and AMPA receptor trafficking.

      However, there are several significant concerns regarding the methodologies and quantifications employed in this manuscript. The authors utilized GLR-SEP to label surface AMPA receptors and relied on the "FRAP rate" as an indicator of the exocytosis rate. The absence of direct visualization of exocytosis using GLR-SEP, and the lack of direct measurements of exocytosis events, casts doubt on the conclusions about ROS's impact on AMPA receptor exocytosis. Furthermore, the "FRAP rate" determined in this study is a combination of recovery rates (incorporating both endosomal trafficking and diffusion) with the mobile fractions of AMPA receptors, potentially weakening interpretations of the findings. A more comprehensive discussion addressing the conflicting effects of MCU-1 and ROS on GLR-GFP FRAP recovery and dendritic trafficking would enable readers to grasp the intricate roles of mitochondrial calcium and ROS in modulating synaptic receptors.

      We appreciate the reviewer’s attention to detail while reviewing our article. Their major concern about directly visualizing exocytosis events is valid since changes in exocytosis and endocytosis would dictate the amount of SEP::GLR-1 at the synaptic membrane. However, streaming imaging of SEP in vivo is technically difficult, shows only a few exocytosis events, and provides short “snapshots” (1-2 minutes; longer streaming imaging causes photobleaching and photo-toxicity) which must be extrapolated to longer time frames. Our 16-minute SEP::GLR-1 FRAP protocol allows us to capture all plasma membrane recruitment and quantify the relative balance between exo- and endocytosis. It also allows for longer observational periods during which we can detect changes in GLR-1 recruitment to and retention at the synaptic membrane in genetic mutants and with drug treatments. In addition, our photobleaching approach involves photobleaching a ~40-60 µm region proximally and distally to the imaging region, which limits the influence of receptor diffusion on the FRAP rate. The reviewer makes a valid point that receptor endocytosis rates would also influence the SEP::GLR-1 FRAP rate. We have now changed the text in the results and discussion to include this information (lines 155-161, and changing “exocytosis” to “synaptic recruitment” throughout the manuscript when discussing SEP::GLR-1 FRAP results [e.g., at lines 169, 208, and 321]).

    1. Author Response

      Reviewer #1 (Public Review):

      Payne et al. have investigated the neural basis of VOR adaptation with the goal of constraining sites and mechanisms of plasticity supporting cerebellar learning. This has been an area of intense debate for decades; previous competing models have argued extensively about the sites of plasticity, and the strength of eye velocity feedback/efference copy signals to Purkinje cells has been central to the debate. This paper nicely explores the consequences of varying the strength of this feedback and in so doing, provides a potential explanation for why Purkinje cell responses during VOR cancellation could exhibit stronger responses following learning, despite net depression of the strength of their vestibular inputs. In that sense it provides some reconciliation of existing models. The work appears to be well done and the paper is well written. The manuscript could be improved and the significance of the work clarified and enhanced by contextualizing the work more appropriately within the existing literature in this area.

      We thank the reviewer for the nice summary of this work’s contribution to the long-standing debate regarding sites and mechanisms of plasticity underlying cerebellar learning.

      We have revised the manuscript to address several key points raised by the reviewer. We now emphasize that the main evidence for weak feedback arises from interpreting our model in the context of the existing experimental evidence for plasticity rules in the cerebellar cortex, and we have clarified the commonalities and differences from the Miles-Lisberger model. Several missing references are now included. Additionally, we clarify the comparison of our model to data after learning, and explain how altered signaling through the visual pathways drives paradoxical changes in neural activity without requiring plasticity in the visual pathways. We hope that these changes better situate the work to be interpreted appropriately in the context of the existing literature.

      Reviewer #2 (Public Review):

      Payne et al. use a computational approach to predict the sites and directions of plasticity within the vestibular cerebellum that explain an unresolved controversy regarding the basis of VOR learning. Specifically, Miles and Lisberger (1981) concluded that vestibular inputs onto Purkinje cells (PCs) must potentiate, rather than depress (as in the Marr/Albus/Ito model), following gain-increase learning because, when the VOR is cancelled, PC firing increases rather than decreases. Payne et al. provide a novel model solution that recapitulates the results of Miles and Lisberger but, paradoxically, uses plasticity in the cerebellar cortex that weakens PC output rather than strengthens it. However, the model only succeeds when efference copy feedback to the cerebellar cortex is relatively weak, thereby allowing a second feedback pathway to drive PC activity during VOR cancellation to counteract the learned change in gain. Because the model is biologically constrained, the findings are well supported. This work will likely benefit the field by providing a number of potentially experimentally testable conclusions. The findings will be of interest to a wider audience if the results can be extrapolated to other cerebellar-dependent learning behaviors rather than just VOR gain-increase learning. Overall, the manuscript is very well written with clearly delineated results and conclusions.

      We appreciate the reviewer’s comments that the model is well-constrained and provides a solution to the long-standing debate surrounding sites and directions of plasticity underlying VOR learning.

      The reviewer raises an important question: do our results generalize across the cerebellum? We note first that we are studying the cerebellum to illustrate a core problem in modeling systems throughout the brain, namely, how to disambiguate plasticity in the face of ubiquitous feedback loops, both within the brain and between the brain and the environment. Within the cerebellum, we focused on VOR learning due to the wealth of experimental data available. While the specific effect of feedback strength on plasticity will depend on the details of the relevant cerebellar circuit, our general approach can be applied to other areas, given sufficient data, in order to determine how plasticity is distributed in the face of potential feedback loops. Importantly, error-driven LTD of the parallel fiber-Purkinje cell synapse is a fundamental hypothesized mechanism for cerebellar learning which has been generally accepted elsewhere in the cerebellum, but was called into question for VOR learning in the flocculus by the Miles-Lisberger model. Thus, our study of VOR learning has broad implications for reconciling plasticity mechanisms across the cerebellum.

      We also note that, even within the VOR circuit, the direction of plasticity and the relative dependence on plasticity at each site may depend on the timescale of learning. On longer timescales, there is thought to be consolidation of learning from a cerebellar cortical site to a brainstem site. Such consolidation from a faster-learning site to a slower-learning site is known as systems consolidation and has been shown theoretically to mitigate the ‘plasticity-stability dilemma’ of having fast learning without over-writing longer-term learning. Our model is compatible with both error-driven plasticity in the cerebellar cortex and a site of plasticity in the brainstem, with brainstem plasticity potentially mediating consolidation of earlier learned changes in the cerebellar cortex. We have now updated the text significantly to discuss the broader implications of the results and to address the reviewer’s specific comments.

      Reviewer #3 (Public Review):

      Summary: In this study, the authors attempt to determine what is the role (and strength) of feedback in a closed-loop (cerebellar) system.

      Strengths:

      1) By combining extensive data fitting of cerebellar experimental observations this study provides deep insights into existing questions and more broadly on the role of feedback and what are the limitations when inferring feedback in (plastic) neural circuits.

      2) Another strength of this study is the gradual build-up of evidence by using models of different complexities to help build the argument that weak feedback is sufficient to explain experimental observations.

      3) The paper is well-written and structured.

      Weaknesses:

      1) In principle, feedback can (i) drive dynamics and/or (ii) drive learning directly. Throughout the paper, the authors refer to only the first case (i.e. dynamics). However, the role of feedback in learning is already implicitly assumed by the authors when jointly fitting the model before and after learning. Note that the general conclusion that feedback (in general) is weak may apply to the first view (i.e. dynamics), but not the second. Given that a key conclusion of the paper is that no feedback is sufficient to explain the data, this suggests that feedback may instead be used for learning/plasticity.

      We fully agree with the reviewer that our conclusions do not preclude an important role for many other types of feedback, including as an instructive signal for learning. Instead of explicitly considering feedback for learning in our model, we consider static snapshots before and after learning to infer plasticity, while remaining agnostic to the neural algorithm used to achieve such plasticity. A widely held hypothesis is that motor error signals carried by climbing fibers instruct LTD at co-active parallel fiber inputs to Purkinje cells; this is indeed a form of feedback, operating on a slower timescale than “feedback for dynamics.” This “feedback for learning” is not modeled here but is fully consistent with our results, as discussed in a new paragraph of our Discussion (end of Section 3.4.1 “Pathways undergoing plasticity”).

      2) There are some potential limitations of the conclusions drawn due to the model inference methods used. The methods used (fmincon) can easily get stuck in local minima and more importantly they do not provide an overview of the likelihood of parameters given the data. A few studies have now shown that it is important to apply more powerful inference techniques both to infer plasticity (Bykowska et al. Frontiers 2019) and neural dynamics (Gonçalves et al. eLife 2020). As highlighted by Costa et al. Frontiers 2013 using more standard fitting methods can lead to misleading interpretations. Given the large range of experimental data used to constrain the model, this may not be an issue, but it is not explicitly shown.

      The reviewer correctly points out that we used a deterministic model-fitting procedure. To address this concern, we complemented the full dynamic model with a simple analytic model (Figure 5) for which we could fully derive the cost function landscape and analytically show that there is a line of parameters corresponding to a perfect degeneracy in the model. Thus, the challenge in the model we analyze is that there are too many solutions, rather than it being difficult to find a solution. Given this degeneracy, we chose to fix the level of efference copy feedback, find the (now non-degenerate) solutions, and then compare these different solutions with regard to their implications for the correlated strengths and changes in strengths of different pathways. We have edited the relevant section of the Discussion for clarity on this topic, and have added references to the additional strategies for model inference mentioned above, in Section 3.3 “Relation to other sloppy models”.
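      The kind of degeneracy described above can be illustrated with a minimal sketch. This is not the model from the manuscript: it assumes a hypothetical one-loop linear system in which a feedforward weight `w` and a positive feedback strength `f` (with `f < 1`) produce a steady-state gain of `w / (1 - f)`, and the "data" is a single made-up observed gain. Under these assumptions, every feedback level has a feedforward weight that fits perfectly, so the cost is zero along a whole line of parameters, and fixing `f` leaves a unique `w`.

```python
# Toy illustration of a degenerate cost landscape (hypothetical model,
# not the authors' cerebellar model).

def closed_loop_gain(w, f):
    """Steady-state gain of a toy linear loop with feedforward weight w
    and positive feedback strength f (requires f < 1)."""
    return w / (1.0 - f)

def cost(w, f, observed_gain):
    """Squared error between the toy model's gain and the observed gain."""
    return (closed_loop_gain(w, f) - observed_gain) ** 2

observed_gain = 2.0  # made-up value, for illustration only

# Fix the feedback level, then solve for the (now unique) feedforward weight.
feedback_levels = [0.0, 0.25, 0.5, 0.75]
fitted_weights = [observed_gain * (1.0 - f) for f in feedback_levels]

# Every (w, f) pair on this line fits the observation equally well.
costs = [cost(w, f, observed_gain)
         for w, f in zip(fitted_weights, feedback_levels)]

for f, w, c in zip(feedback_levels, fitted_weights, costs):
    print(f"f={f:.2f}  w={w:.2f}  cost={c:.3g}")
```

      In this toy, weaker feedback implies a larger fitted feedforward weight, which is why comparing solutions across fixed feedback levels is informative even though the data alone cannot select one of them.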

      3) There is some lack of clarity on how the feedback pathways as currently presented should be interpreted in the brain.

      We interpret this comment as referring to the questions of (1) whether our model includes a pathway for learning through feedback, (2) what is the anatomical implementation of the efference copy feedback pathway and visual pathways, and (3) how should the positive weights on the efference copy feedback pathway k_PE be interpreted. We address these below.

      (1) Feedback for learning was discussed in point 1 above.

      (2) Anatomical implementation of efference copy pathway: We have edited the Discussion to clarify that there is anatomical evidence for efference copy input to the cerebellum, but that a key aspect of ‘feedback’ is that activity functionally loops back onto itself. Instead, neurons carrying eye movement commands (such as in the vestibular nucleus) could send signals to the cerebellum, without receiving output from the same cerebellar neurons – this would correspond to a ‘spiraling’ pathway that does not form a closed feedback loop (Figure 8). Thus we argue that the existence of the gross anatomical pathways does not necessitate a role for strong, functional, efference copy feedback (Discussion, Section 3.1, lines 481-491).

      Anatomical implementation of visual pathway: The visual feedback pathways considered here are those that would receive visual motion information from the environment. This visual feedback is itself changed by eye movements, thus providing a net overall negative feedback loop that helps to stabilize gaze. This pathway has been proposed to involve cortical regions such as MST (discussed in Materials and Methods, Model Implementation, lines 769-774).

      (3) Interpretation of positive feedback loop: In our model, the efference copy feedback filter, k_PE, has positive weight. This corresponds to the positive net sign of the Purkinje cell to brainstem to Purkinje cell feedback loop. Specifically, the Purkinje cell to brainstem pathway is inhibitory (because Purkinje cells are inhibitory), the brainstem to eye velocity command pathway is inhibitory (to achieve counter-rotation of the eyes in response to head turns), and the feedback of this eye velocity command back to Purkinje cells (k_PE) is positive. The two inhibitory stages cancel in sign, so this loop in our model represents positive feedback. This is now clarified in Materials and Methods, Model Implementation, line 748.

      4) The functional benefits of having (or not) feedback could be better discussed (related to point 1 above).

      Related to point 1 above, it is certainly the case that feedback is necessary for learning. We do not explicitly model the climbing fiber feedback thought to be involved in learning/plasticity of the parallel fiber pathway.

      We instead focus on the role of efference copy feedback, and how it functionally impacts the required sites and signs of plasticity in the circuit. As shown in the paper, if the efference copy pathway is strong, then this is most consistent with learned changes in eye movements being driven primarily by plasticity in the brainstem pathway (as in the Miles-Lisberger hypothesis), whereas if the efference copy pathway is weak, then this is most consistent with learned changes in eye movements being driven by net depression in the parallel fiber to Purkinje cell pathway (as in the classic Marr-Albus-Ito model and as suggested by most cellular and molecular studies of parallel fiber-Purkinje cell plasticity), in addition to a role of plasticity in the brainstem pathway. We also note that, in the ‘Strong Feedback’ model, the feedback is so strong that the system is on the brink of instability – this has been argued to have the functional benefit of providing ‘inertia’ to eye movements that could help to maintain eye movements during smooth pursuit when a target goes behind an occluder, but it also has the disadvantage of placing the system at a level of positive feedback near the brink of instability. We also note that the visual feedback pathway through the environment, emphasized in this work, serves as a negative feedback loop that reduces deviations between the eye and target velocity. We have extensively re-written the first section of the Discussion (Section 3.1), in order to more clearly lay out the implications of each model for circuit plasticity and feedback.

      5) Some of the key conclusions of the work are not described in the abstract, namely that feedback is weak in the cerebellar system.

      Thank you for raising this point. We have added this key conclusion to the end of the abstract: “Our results address a long-standing debate regarding cerebellum-dependent motor learning, suggesting a reconciliation in which error-driven plasticity of synaptic inputs to Purkinje cells is compatible with seemingly oppositely directed changes in Purkinje cell activity. More broadly, the results demonstrate how learning-related changes in neural activity can appear to contradict the sign of the underlying plasticity when either internal feedback or feedback through the environment is present.”

      Claims:

      The argument is well-built throughout the paper, but there are some potential caveats with the general interpretation (see weaknesses).

      Impact:

      This work has the potential to bring important messages on how best to interpret and infer the role of feedback in neural systems. For the field of the cerebellum, it also proposes solutions to long-standing problems.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      Cyclic Nucleotide Binding (CNB) domains are pervasive structural components involved in signaling pathways across eukaryotes and prokaryotes. Despite their similar structures, CNB domains exhibit distinct ligand-sensing capabilities. The manuscript offers a thorough and convincing investigation that clarifies numerous puzzling aspects of nucleotide binding in Trypanosoma.

      Strengths:

      One of the strengths of this study is its multifaceted methodology, which includes a range of techniques including crystallography, ITC (Isothermal Titration Calorimetry), fluorimetry, CD (Circular Dichroism) spectroscopy, mass spectrometry, and computational analysis. This interdisciplinary approach not only enhances the depth of the investigation but also offers a robust cross-validation of the results.

      Weaknesses:

      None noticed.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript clearly shows that Trypanosoma PKA is controlled by nucleoside analogues rather than cyclic nucleotides, which are the primary allosteric effectors of human PKA and PKG. The authors demonstrate that the inosine, guanosine, and adenosine nucleosides bind with high affinity and activate PKA in the tropical pathogens T. brucei, T. cruzi and Leishmania. The underlying determinants of nucleoside binding and selectivity are dissected by solving the crystal structure of T. cruzi PKAR(200-503) and T. brucei PKAR(199-499) bound to inosine at 1.4 Å and 2.1 Å resolution and through comparative mutational analyses. Of particular interest is the identification of a minimal subset of 2-3 residues that controls nucleoside vs. cyclic nucleotide specificity.

      Strengths:

      The significance of this study lies not only in the structure-activity relationships revealed for important targets in several parasite pathogens but also in the understanding of CNB's evolutionary role.

      Weaknesses:

      The main missing piece is the model for activation of the kinetoplastid PKA which remains speculative in the absence of a structure for the trypanosomatid PKA holoenzyme complex. However, this appears to be beyond the scope of this manuscript, which is already quite dense.

      We fully agree that insight into the activation mechanism and its possible deviation from the mammalian paradigm requires a holoenzyme structure revealing the details of R-C interaction. We have attempted Cryo-EM from LEXSY-produced holoenzyme, yet upscaling the purification procedures described in this manuscript has repeatedly failed in spite of numerous protocol changes and optimizations. Much more work is required to achieve this.

      Reviewer #2 (Recommendations For The Authors):

      Some minor points to consider for enhancing the impact of this interesting manuscript:

      1) The nucleoside affinities measured are mainly for the regulatory subunits unbound to the kinase domain. How would nucleoside affinities change when the regulatory subunits are bound to the kinase domain, which is presumably the case under resting conditions? An estimation of this change in affinity is important because it more closely relates to the variations in cellular nucleoside concentrations needed for activation.

      This is an important question, and we have given an indirect answer in the manuscript, although not very explicitly. The EC50 values for kinase activation of the purified holoenzyme complexes are very similar or almost identical to the kD values measured by ITC with free regulatory subunits. By inference, the binding kD for the holoenzyme and for the free R-subunit cannot be very different. In addition, we have recently determined the EC50 for PKA activation in vivo in trypanosomes using a bioluminescence complementation reporter assay. The values fit perfectly to the values obtained with purified holoenzyme (Wu et al. in preparation). A sentence in Results (lines 201-203) has been added.

      2) The authors should point out that a major implication of nucleoside vs. cyclic nucleotide activation is in terms of signal termination. If phosphodiesterases (PDEs) are responsible for cAMP/cGMP signal termination, what terminates nucleoside-dependent signaling? Although the answer to this question may not be known at this stage, it is important to highlight this critical implication of the authors' study.

      The mechanism of signal termination is indeed unknown so far. We speculate that some enzymes of the purine salvage pathways are differentially localized in subcellular compartments and thereby able to establish microdomains that enable nucleoside signaling. In addition, PKA subunit phosphorylations/dephosphorylations and/or protein turnover may also regulate signal termination. As an example, free PKAC1 is rapidly degraded upon depletion of the PKAR subunit by RNAi. We have now mentioned signal termination in Discussion and have revised the last part of Discussion (lines 567-602). A possible approach to monitor compartmentalized signaling would be using the FluoSTEPs technology (Tenner et al., Sci. Adv. 2021; 7: eabe4091), but adapting this to the trypanosome system will not be a short-term task.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      The investigators have performed a state-of-the-art systematic review and meta-analysis of studies that may help to answer the research question: does simultaneous administration of multiple antibiotics prevent antibiotic resistance development in individuals? The number of studies eligible for analysis is very low, and within that low number, there is huge variability in the bug-drug combinations studied, and most studies had a high risk of bias, further limiting the capability of meta-analysis to answer the research question. In addition, based on I2 values there is also huge statistical heterogeneity between outcomes of the studies compared, further limiting the predictive value of meta-analysis. In fact, the only 2 studies meeting all eligibility criteria addressed the treatment of Mycobacterium tuberculosis, for which the research question is hardly applicable. The authors, therefore, conclude that "our analysis could not identify any benefit or harm of using a higher or a lower number of antibiotics regarding within-patient resistance development." Apart from articulating this knowledge gap, the findings will not have consequences for patient care, but may stimulate the scientific community to better address this research question in future studies.
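      For readers unfamiliar with the I2 statistic mentioned above, the following is a minimal sketch of how it is conventionally derived from Cochran's Q under inverse-variance weighting: I2 = max(0, (Q - df) / Q) x 100, with df equal to the number of studies minus one. The effect sizes and variances in the example are made-up numbers for illustration only; they are not taken from the reviewed meta-analysis, and the function name is hypothetical.

```python
# Sketch of Cochran's Q and the I2 heterogeneity statistic
# (inverse-variance weights; inputs below are illustrative, not real data).

def cochran_q_and_i2(effects, variances):
    """Return (Q, I2 in percent) for per-study effects and their variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variation attributed to heterogeneity
    # rather than chance; it is floored at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Two hypothetical studies with equal precision but different effects.
q, i2 = cochran_q_and_i2(effects=[0.0, 2.0], variances=[0.5, 0.5])
print(f"Q = {q:.2f}, I2 = {i2:.0f}%")
```

      With very few studies, as in the situation described here, Q has little power and I2 is imprecise, so a high value is better read as a warning sign than as a precise estimate.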

      Strengths:

      The systematic and rigorous approach for the review and meta-analysis.

      Weaknesses:

      None identified.

      We thank the reviewer for this thoughtful and positive appraisal of our work.

      Reviewer #2 (Public Review):

      Summary:

      The authors performed a systematic review and meta-analysis to investigate whether the frequency of emergence of resistance is different if combination antibiotic therapy is used compared to fewer antibiotics. The review shows that there is currently insufficient evidence to reach a conclusion due to the limited sample size. High-quality studies evaluating appropriate antimicrobial resistance endpoints are needed.

      Strengths:

      The strengths of the manuscript are that the article addresses a relevant research question that is often debated. The article is well-written and the methodology used is valid. The review shows that there is currently insufficient evidence to reach a conclusion due to the limited sample size. High-quality studies evaluating appropriate antimicrobial resistance endpoints are needed. I have several comments and suggestions for the manuscript.

      Weaknesses:

      Weaknesses of the manuscript are the large clinical and statistical heterogeneity and the lack of clear definitions of acquisition of resistance. Both these weaknesses complicate the interpretation of the study results.

      We thank the reviewer for the positive comments and pointing out where our work can be improved.

      Major comments:

      My main concern about the manuscript is the extent of both clinical and statistical heterogeneity, which complicates the interpretation of the results. I don't understand some of the antibiotic comparisons that are included in the systematic review. For instance the study by Paul et al (50), where vancomycin (as monotherapy) is compared to co-trimoxazole (as combination therapy). Emergence (or selection) of co-trimoxazole in S. aureus is in itself much more common than vancomycin resistance. It is logical and expected to have more resistance in the co-trimoxazole group compared to the vancomycin group, however, this difference is due to the drug itself and not due to co-trimoxazole being a combination therapy. It is therefore unfair to attribute the difference in resistance to combination therapy. Another example is the study by Walsh (71) where rifampin + novobiocin is compared to rifampin + co-trimoxazole. There is more emergence of resistance in the rifampin + co-trimoxazole group but this could be attributed to novobiocin being a different type of antibiotic than co-trimoxazole instead of the difference being attributed to combination therapy. To improve interpretation and reduce heterogeneity my suggestion would be to limit the primary analyses to regimens where the antibiotics compared are the same but in one group one or more antibiotic(s) are added (i.e. A versus A+B). The other analyses are problematic in their interpretation and should be clearly labeled as secondary and their interpretation discussed.

      We acknowledge the presence of statistical and clinical heterogeneity in our overall analysis. The decision to pursue this comprehensive examination was predefined in our previously published study protocol (PROSPERO CRD42020187257) and driven by our interest in whether, despite some differences, we could either identify an overarching effect of combination therapy on resistance or identify factors that explain potential differences of the effect of combination therapy across pathogens/drugs. We indeed find that heterogeneity is high; however, identifying the driving factors of this heterogeneity is difficult as evidence is limited.

      We carried out several subgroup analyses, e.g. explicitly focusing on specific pathogen groups and medical conditions or exploring heterogeneity in treatment arms (figure 3, supplementary materials section 6). However, it is important to highlight that the number of studies available for these subgroup analyses was low. Additionally, recognizing the high heterogeneity within treatment arms, we performed a subgroup analysis focusing solely on resistances of antibiotics common to both arms (supplementary material section 6.1.8; which would avoid comparisons such as the one between vancomycin and co-trimoxazole raised by the reviewer). Unfortunately, this also revealed substantial heterogeneity. While we aimed to address heterogeneity through these subgroup analyses, limitations arose due to the number of studies meeting specific criteria and the nature of data provided by these studies.

      Moreover, regarding the concern about the interpretation of co-trimoxazole as combination therapy, we acknowledge the confusion surrounding its classification as one or two antibiotics. Despite the common contemporary view of co-trimoxazole as a single antibiotic, we chose to consider it as two antibiotics due to historical practices, as observed in Black et al. (1982), where trimethoprim was compared to trimethoprim and sulfamethoxazole. We recognize that this decision may lead to confusion, and we are considering a further sensitivity analysis in the future version of this manuscript that treats co-trimoxazole as a single antibiotic. We agree that the slight trend of fewer antibiotics performing better, observed for MRSA, should not be overinterpreted, as it is driven by the two studies Walsh et al. 1993 and Paul et al. 2015, as pointed out by the reviewer. In lines 183-186 we discuss that a better evaluation of antibiotic combination therapy requires more studies that use identical antibiotics (i.e. A versus A+B). We will try to clarify and highlight this in the future version of the manuscript.

      Another concern is about the definition of acquisition of resistance, which is unclear to me. If for example meropenem is administered and the follow-up cultures show Enterococcus species (which is intrinsically resistant to meropenem), does this constitute acquisition of resistance? If so, it would be misleading to determine this as an acquisition of resistance, as many people are colonized with Enterococci and selection of Enterococci under therapy is very common. If this is not considered as the acquisition of resistance please include how the acquisition of resistance is defined per included study.

      Thank you for pointing out this potential ambiguity. Our definition of “acquisition of resistance” is agnostic to bacterial species and hence intrinsically resistant species can be included if they were only detected during the follow-up culture by the studies. We will clarify this in the definition of “acquisition of resistance” in the manuscript (see l. 259-260). However, it was not always clear from the studies which pathogens were acquired or whether intrinsically resistant species were not reported. Therefore, we rely on the studies' specifications of resistant and non-resistant without further classifying data into intrinsic and non-intrinsic resistance. The outcome “acquisition of resistance” can be seen more as a risk assessment for having any resistant bacterium during or after treatment. In contrast, the outcome “emergence of resistance” is more rigorous, demanding that the same species be measured as more resistant during or after treatment.

      Table S1 is not sufficiently clear because it often only contains how susceptibility testing was done but not which antibiotics were tested and how a strain was classified as resistant or susceptible.

      In Table S1, we omitted the listing of antibiotics for which susceptibility testing was performed, as this information is already presented in the main text (Table 1). However, we agree that linking this information better in a future version would benefit the understanding. Given the variability in methods used to assess resistance and the variability in drugs, the comparability of breakpoints is limited. Hence, we decided not to provide further details on this aspect so far.

      Line 85: "Even though within-patient antibiotic resistance development is rare, it may contribute to the emergence and spread of resistance."

      Depending on the bug-drug combination, there is great variation in the propensity to develop within-patient antibiotic resistance. For example: within-patient development of ciprofloxacin resistance in Pseudomonas is fairly common while within-patient development of methicillin resistance in S. aureus is rare. Based on these differences, large clinical heterogeneity is expected and it is questionable whether these studies should be pooled.

      We agree that our formulation neglects differences in prevalence of within-host resistance emergence depending on bug-drug combinations. We will correct this in our upcoming version. (i.e. we will correct our statement to: “Within-patient antibiotic resistance development, even if rare, can contribute to the emergence and spread of resistance.”)

      Line 114: "The overall pooled OR for acquisition of resistance comparing a lower number of antibiotics versus a higher one was 1.23 (95% CI 0.68 - 2.25), with substantial heterogeneity between studies (I2=77.4%)"

      What consequential measures did the authors take after determining this high heterogeneity? Did they explore the source of this large heterogeneity? Considering this large heterogeneity, do the authors consider it appropriate to pool these studies?

      Thank you for highlighting this lack of clarity. In our upcoming version, we will emphasize the sub-analyses conducted to explore heterogeneity (i.e., figure 3 and supplementary materials section 6). Nevertheless, these analyses faced limitations due to the scarcity of evidence and the data provided by the studies. Given the lack of appropriate evidence, it is hard to identify the source of heterogeneity. The decision to pool all studies was pre-specified in our previously published study protocol (PROSPERO CRD42020187257) and was motivated by the question of whether there is a general effect of combination therapy on resistance development, or whether factors can be identified that explain potential differences of the effect of combination therapy across bug-drug combinations.
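
      For readers unfamiliar with the quantities discussed above, the following is a minimal sketch of how a pooled odds ratio and the I2 heterogeneity statistic can be computed from per-study 2x2 counts. It assumes a DerSimonian-Laird random-effects model with a 0.5 continuity correction, which is not necessarily the model used in the manuscript; the function name and the input layout are hypothetical, and no study data from the review are reproduced here.

```python
import math


def pooled_or_random_effects(tables):
    """Pool odds ratios with a DerSimonian-Laird random-effects model.

    tables: list of (a, b, c, d) counts per study, where a/b are
    events/non-events in arm 1 and c/d are events/non-events in arm 2.
    (Hypothetical input layout, chosen for illustration only.)
    """
    log_ors, variances = [], []
    for a, b, c, d in tables:
        # Assumption: add 0.5 to every cell when any cell is zero.
        if 0 in (a, b, c, d):
            a, b, c, d = (x + 0.5 for x in (a, b, c, d))
        log_ors.append(math.log((a * d) / (b * c)))
        variances.append(1 / a + 1 / b + 1 / c + 1 / d)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = len(tables) - 1

    # I2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance tau^2 (DerSimonian-Laird moment estimator).
    c_dl = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c_dl) if c_dl > 0 else 0.0

    # Random-effects weights and pooled estimate with a 95% Wald interval.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "or": math.exp(mu),
        "ci": (math.exp(mu - 1.96 * se), math.exp(mu + 1.96 * se)),
        "i2": i2,
        "tau2": tau2,
    }
```

      A high I2, as reported above, means that most of the observed spread in study-level odds ratios exceeds what sampling error alone would produce, which is why the pooled interval should be read with caution.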

    1. Author Response

      We are grateful to the reviewers for their positive feedback and for their comments and suggestions on the manuscript. Reviewer 1 has indicated two weaknesses and Reviewer 2 has none. With this provisional reply, we address the two concerns of Reviewer 1:

      1) Data obtained from a single aminoacyl-tRNA (D-Tyr-tRNATyr) have been generalized to imply that what is relevant to this model substrate is true for all other D-aa-tRNAs. This is not a risk-free extrapolation. Why do the authors believe that the length of the amino acid side chain will not matter in the activity of DTD2?

      We thank the reviewer for bringing up this important point. We wish to clarify that only a few of the aminoacyl-tRNA synthetases are known to charge D-amino acids and only D-Leu (Yeast), D-Asp (Bacteria, Yeast), D-Tyr (Bacteria, Cyanobacteria, Yeast) and D-Trp (Bacteria) show toxicity in vivo in the absence of known DTD (Soutourina J. et al., JBC, 2000; Soutourina O. et al., JBC, 2004; Wydau S. et al., JBC, 2009). D-Tyr-tRNATyr is used as a model substrate to test the DTD activity in the field because of the conserved toxicity of D-Tyr in various organisms. DTD2 has been shown to recycle D-Asp-tRNAAsp and D-Tyr-tRNATyr with the same efficiency both in vitro and in vivo (Wydau S. et al., NAR, 2007). Moreover, we have previously shown that it recycles acetaldehyde-modified D-Phe-tRNAPhe and D-Tyr-tRNATyr in vitro (Mazeed M. et al., Science Advances, 2021). We have earlier shown that DTD1, another conserved chiral proofreader across bacteria and eukaryotes, acts via a side chain-independent mechanism (Ahmad S. et al., eLife, 2013). Considering the action on multiple side chains with different chemistry and size, it can be proposed with reasonable confidence that DTD2 also operates in a side chain-independent manner.

      2) While the use of EFTu supports that the ternary complex formation by the elongation factor can resist modifications of L-Tyr-tRNATyr by the aldehydes or other agents, in the context of the present work on the role of DTD2 in plants, one would want to see the data using eEF1alpha. This is particularly relevant because there are likely to be differences in the way EFTu and eEF1alpha may protect aminoacyl-tRNAs (for example see description in the latter half of the article by Wolfson and Knight 2005, FEBS Letters 579, 3467-3472).

      We thank the reviewer for bringing up another important point. We analysed the aa-tRNA bound elongation factor structures from both bacteria (PDB id: 1TTT) and mammal (PDB id: 5LZS) and found that the amino acid binding site is highly conserved, with the side chain of the amino acid projected outside. Modelling of a D-amino acid in the same site shows serious clashes, indicating D-chiral rejection during aa-tRNA binding by elongation factor. In addition, the amino group of the amino acid is tightly selected by the main chain atoms of elongation factor, leaving no space for aldehydes to enter and then modify the L-aa-tRNAs and Gly-tRNAs. Minor differences near the amino acid side chain binding site (as indicated in Wolfson and Knight, FEBS Letters, 2005) might induce the amino acid specific binding differences. However, those changes will have no influence when the D-chiral amino acid enters the pocket, as the whole side chain would clash with the active site. We will present a sequence and structural conservation analysis to clarify this important point in our revised manuscript. Overall, our structural analysis suggests a conserved mode of aa-tRNA selection by elongation factor across life forms and therefore, our biochemical results with bacterial elongation factor Tu (EF-Tu) reflect the protective role of elongation factor in general across species.

      In our revised manuscript, we will provide a thorough point-by-point response to the above as well as all the specific reviewer comments. We also intend to include new analysis with updated data that would address the key questions raised by the reviewers.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This nice study by Miyano combines slice electrophysiology and superresolution microscopy to address the role of RBP2 in Ca2+ channel clustering and neurotransmitter release at hippocampal mossy fiber terminals. While a number of studies demonstrated a critical role for RBPs in clustering Ca2+ channels at other synapses and some provided evidence for a role of the protein in molecular coupling of Ca2+ channels and release sites, the present study targets another key synapse that is an important model for presynaptic studies and offers access to a microdomain controlled synaptic vesicle (SV) release mechanism with low initial release probability.

      Summarizing a large body of high-quality work, the authors demonstrate reduced Ca2+ currents and a reduced release probability. They attribute the latter to the reduced Ca2+ influx and can restore release by increasing Ca2+ influx. Moreover, they propose an altered fusion competence of the SVs, which is not so strongly supported by the data in my view.

      The effects are relatively small, but I think the careful analysis of the RBP role at the mossy fiber synapse is an important contribution.

      We thank the reviewer for the careful assessment of the paper. We agree that while reduced Ca influx in KO is relatively straightforward, the evidence for impaired priming is somewhat indirect and remains a suggestion. We also noted that Moser and colleagues have analyzed the function of RIM-BP2 at hair cell synapses and also showed reduced Ca influx. In cortical synapses, there has been no study using direct presynaptic recording. In the revision, we carefully cited previous studies and tried to be fair. We hope that the current revision is much improved.

      Reviewer #2 (Public Review):

      The proper expression and organization of CaV channels at the presynaptic release sites are subject to coordinative and redundant control of many active zone-specific molecules including RIM-BPs. Previous studies have demonstrated that ablation of RIM-BPs in various mammalian synapses causes significant impairment of synaptic transmission, either by reducing CaV expression or decoupling CaV from synaptic vesicles. The mechanisms remain unknown.

      In the manuscript, Sakaba and colleagues aimed to examine the specific role of RIM-BP2 at the hippocampal mossy fiber-CA3 pyramidal cell synapse, which is well-characterized by low initial release probability and strong facilitation during repetitive stimulation. By directly recording Ca2+ currents and capacitance jumps from the MF boutons, which is very challenging but feasible, they showed that depolarization-evoked Ca2+ influx was reduced significantly (~39%) by KO of RIM-BP2, but no impacts on Ca-induced exocytosis and RRP (measured by capacitance change). They used STED microscopy to image the spatial distribution of the CaV2.1 cluster but found no change in the cluster number with a slight decrease in cluster intensity (~20%). They concluded that RIM-BP2 functions in tonic synapses by reducing CaV expression and thus differentially from phasic synapses by decoupling CaV-SV.

      In general, they provide solid data showing that RIM-BP2 KO reduces Ca influx at MF-CA3 synapse, but the phenotype is not new as Moser and colleagues have also used presynaptic recording and capacitance measurement and shown that RIM-BP2 KO reduces Ca2+ influx at hair cell active zone (Krinner et al., 2017), although at different synapse model expressing CaV1.3 instead of CaV2.1. Further, the concept that RIM-BP2 plays diverse functions in transmitter release at different central synapses has also been proposed with solid evidence (Brockmann et al., 2019).

      We thank the reviewer for the careful reading of the ms. We agree that previous studies have shown reduced Ca influx at hair cells, and that diverse functions of RIM-BP2 in different central synapses have been proposed by Brockmann et al. The new point of this study is that we firmly and quantitatively show the reduced Ca currents using direct presynaptic recording, which has not been done in mossy fiber synapses or cortical synapses in general. Quantitative and time-resolved measurements of the presynaptic currents cannot be done by other methods, so far. In this revision, we point this out carefully.

      Reviewer #1 (Recommendations For The Authors):

      The MS is overall carefully prepared and I have only a few minor comments to help with further improving the manuscript.

      Abstract:

      I think the notion of different RBP function at tonic and phasic synapses is not so well founded. The reduced number of Ca2+ channels and their altered topography have been shown in multiple synapses that also include those with phasic release. Quantitative structural and functional analysis of presynaptic Ca2+ channels of RBP-2 and RBP1-2 DKO deficient AZs closely related to the present study has e.g. been provided for auditory synapses (e.g. hair cells, endbulb/calyx of Held synapses) that provide both phasic and sustained release.

      In the abstract, we have omitted the description of phasic vs tonic synapses, because it is not well founded, as the reviewer pointed out. Specifically, the abstract (Line 13~) now reads:

      “Synaptic vesicles dock and fuse at the presynaptic active zone (AZ), the specialized site for transmitter release. AZ proteins play multiple roles such as recruitment of Ca2+ channels as well as synaptic vesicle docking, priming and fusion. However, the precise role of each AZ protein type remains unknown. In order to dissect the role of RIM-BP2 at mammalian cortical synapses having low release probability, we applied direct electrophysiological recording and super-resolution imaging to hippocampal mossy fiber terminals of RIM-BP2 KO mice. By using direct presynaptic recording, we found the reduced Ca2+ currents. The measurements of EPSCs and presynaptic capacitance suggested that the initial release probability was lowered because of the reduced Ca2+ influx and impaired fusion competence in RIM-BP2 KO. Nevertheless, larger Ca2+ influx restored release partially. Consistent with presynaptic recording, STED microscopy suggested less abundance of P/Q-type Ca2+ channels at AZs deficient in RIM-BP2. Our results suggest that the RIM-BP2 regulates both Ca2+ channel abundance and transmitter release at mossy fiber synapses.”

      Intro:

      Line 48: consider adding Butola et al., 2021 /endbulb of Held to reference which concurs on the notion made for Calyx. However, a contrasting finding was made for another synapse with tight coupling: RBP2 deletion did not alter tight coupling in hair cells (Krinner et al., 2017). Line 51: RBP-DKO/lack of additional effect of RBP1 deletion: suggest adding Krinner et al., 2021 to reference, which concurs with the notion made for hair cells.

      We cited Butola et al., 2021 (Line 49) and Krinner et al., 2021 (Line 52), as the reviewer suggested.

      Results:

      STED microscopy: I am concerned with two aspects of the analysis/presentation. I) I recommend replacing density with abundance as the authors do not resolve single channels. II) I appreciate the note of caution about the fact that STED nanoscopy due to the non-linear nature of the depletion process should/could not be easily used to quantify copy numbers based on immunofluorescence. I would recommend the authors perform 2D Gaussian fitting to at least the Cav2.1 immunofluorescent spots neighboring Munc13-1 spots and report the short and long axis estimates as well as potentially the area. Should the authors have confocal Cav2.1 and Cav2.2 immunofluorescent data co-acquired with STED of Munc13-1, this would be very valuable additional information, but I do not think the experiment is essential for the sake of publication if it was not done already, given the large body of high-quality physiology data.

      I) We have changed the term from density to abundance as the reviewer suggested throughout the manuscript.

      II) As the reviewer suggested, we have carried out 2D Gaussian fitting of Cav2.1 spots. The length, width, and area of Cav2.1 clusters in the AZ were not different between WT and RIM-BP2 KO terminals (Line 431-433, Figure 7-figure supplement 4). The spatial resolution of STED, especially at mossy fiber synapses in the tissue, and a small difference between WT and KO (~30 % expected from electrophysiology) could prevent detection of the difference, unlike ribbon synapses and fly NMJ where release sites and Ca channel clusters are well defined. We should also note that the intensity was calculated similar to previous studies (integral of signal intensity, Krinner et al., 2017), and not absolute peak intensity.  

      As the reviewer suggested, we have added confocal data (Line 434-436, Figure 7-figure supplement 5). We have determined the AZ area from the Munc13-1 STED data, and Munc13-1, Cav2.1 and Cav2.2 intensities were quantified. As shown in the figure, only Cav2.1 intensity was reduced in KO, consistent with the STED data.

      Nevertheless, we should be cautious about interpretation of the intensity as the reviewer suggested, and are aware that the data are just consistent with electrophysiology. From imaging, we only see a qualitative rather than quantitative difference between WT and KO.
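
      To illustrate the kind of spot measurement discussed above (long axis, short axis and area of a fluorescent spot), the following is a simplified sketch. It does not perform an iterative least-squares 2D Gaussian fit; instead it swaps in an intensity-weighted moment estimate, which gives the same widths for a clean, background-free Gaussian spot. The function name, the input layout (a list of pixel rows) and the choice to report full width at half maximum are assumptions made for illustration and are not taken from the manuscript.

```python
import math


def gaussian_spot_axes(image, pixel_size=1.0):
    """Estimate long/short axis FWHM and area of one spot from image moments.

    image: list of rows of non-negative, background-subtracted intensities.
    pixel_size: physical size of one pixel (hypothetical parameter).
    """
    total = sum(sum(row) for row in image)
    if total <= 0:
        raise ValueError("image has no signal")

    # Intensity-weighted centroid.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(image) for v in row) / total

    # Intensity-weighted second central moments (covariance matrix entries).
    sxx = syy = sxy = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            sxx += v * (x - cx) ** 2
            syy += v * (y - cy) ** 2
            sxy += v * (x - cx) * (y - cy)
    sxx, syy, sxy = sxx / total, syy / total, sxy / total

    # Eigenvalues of the 2x2 covariance give the variances along the axes.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    var_long = half_trace + spread
    var_short = max(half_trace - spread, 0.0)

    # Convert standard deviations to full width at half maximum.
    k = 2 * math.sqrt(2 * math.log(2))
    fwhm_long = k * math.sqrt(var_long) * pixel_size
    fwhm_short = k * math.sqrt(var_short) * pixel_size

    # Area of the ellipse traced at half-maximum intensity.
    area = math.pi / 4 * fwhm_long * fwhm_short
    return {"fwhm_long": fwhm_long, "fwhm_short": fwhm_short, "area": area}
```

      On real STED images, background and neighbouring spots bias moment estimates, so a proper fit with an offset term would normally be preferred; the sketch only shows which quantities are being compared between genotypes.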

      Discussion:

      I think the focus on alterations of presynaptic Ca channels could be further strengthened along with the discussion of the relevant previous studies.

      Thank you for the suggestion. We have added a paragraph as shown below in the discussion (Line 531~).

      “By using direct presynaptic patch clamp recordings, we here observed a decrease of Ca2+ current amplitudes (~30%) in RIM-BP2 KO mice (Fig. 1). Consistently, STED microscopy supported reduced abundance of P/Q-type Ca2+ channels (Cav2.1) in the mutant mossy fiber terminal (Fig. 7). Interestingly, this observation is similar to that at Drosophila NMJ and hair cell synapses (Liu et al., 2011; Krinner et al., 2017), but not that at other synapses (Acuna et al., 2015; Grauel et al., 2016; Butola et al., 2021), suggesting that the functional role of RIM-BP2 in recruiting Ca2+ channels differs among synapse types.”

      Reviewer #2 (Recommendations For The Authors):

      Minor questions:

      1) The title is misleading as it only shows RIM-BP2 regulates CaV expression but not clustering.

      This has been pointed out by the 1st reviewer, too. We have adopted the term “abundance” as suggested by the 1st reviewer and changed to “RIM-BP2 regulates Ca2+ channel abundance and neurotransmitter release at hippocampal mossy fiber terminals.”

      2) Figure 7 legend. Again, RIM-BP2 only changes the intensity of CaV2.1 clusters but not the density.

      Changed Figure 7 title from “RIM-BP2 deletion alters the density …” to “RIM-BP2 deletion alters the signal intensity …”.

      3) Line 31: "Ca2+ influx through voltage-gated Ca2+ channels triggers neurotransmitter release from synaptic vesicles within a millisecond" is not correct. Ca-evoked transmitter release can only occur with such fast speed at very specialized synapses such as the calyx of Held but not at general chemical synapses.

      We changed “within a millisecond” to “within milliseconds” (Line 30).

      4) Line 44-46: In Drosophila NMJs and at Drosophila NMJs are redundant.

      We eliminated “at Drosophila NMJs”.

      5) The authors should use the verb tense consistently throughout the manuscript such as "In RIM-BP1,2 DKO mice, the coupling between Ca2+ channels and synaptic vesicles became loose, and action potential-evoked neurotransmitter release was reduced at the calyx of Held synapse (Acuna et al., 2015). At hippocampal CA3-CA1 synapses, RIM-BP2 deletion alters Ca2+ channel localization at the AZs without altering total Ca2+ influx. Besides, RIM-BP1,2 DKO has no additional effect...".

      We changed verb tenses in Line 46-49, Line 55-58, and Line 62-67. We also checked the ms once more. Thank you for pointing this out.

      6) Line 59: technically difficulty should be technical difficulty.

      Fixed.

      7) Figure 4A-B are representative traces of 0.5 mM EGTA (black) or 5 mM EGTA (red) recorded from the same terminals or from different terminals but simply superimposed?

      Representative traces were recorded from different terminals. We describe this point in the figure legend (Fig 4A). We are very sorry for the confusion.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      Receptor tyrosine kinases such as ALK play critical roles during appropriate development and behaviour and are nodal in many disease conditions, through molecular mechanisms that weren't completely understood. This manuscript identifies a previously unknown neuropeptide precursor as a downstream transcriptional target of Alk signalling in Clock neurons in the Drosophila brain. The experiments are well designed with attention to detail, the data are solid and the findings will be useful to those interested in events downstream of signalling by receptor tyrosine kinases.

      Authors response: We thank the reviewers for this assessment of our Manuscript. We are happy to accept the current eLife assessment of our manuscript. In our revised manuscript we have addressed all of the major reviewer comments, including additional experiments suggested by the reviewers, which have significantly strengthened the revised version.

      Reviewer #1 (Public Review):

      Sukumar et al build on a body of work from the Palmer lab that seeks to unravel the transcriptional targets of Alk signaling (a receptor tyrosine kinase). Having uncovered its targets in the mesoderm in an earlier study, they seek to determine its targets in the central nervous system. To do this, they use Targeted DamID (TaDa) in the wild-type and Alk dominant negative background and identify about 1700 genes that might be under the control of Alk signalling. Using their earlier data and applying a set of criteria - upregulated in gain-of-Alk, downregulated in loss-of-Alk, and co-expressed with Alk positive cells in single cell datasets - they arrive upon a single gene, Sparkly, which is predicted to be a neuropeptide precursor.

      They generate antibodies and mutants for Sparkly and determine that it is responsive to Alk signalling and is expressed in many neuroendocrine cells, as well as in clock neurons. Though the mutants survive, they have reduced lifespans and are hyperactive. In summary, the authors identify a previously unidentified transcriptional target of Alk signalling, which is likely cleaved into a neuropeptide and is involved in regulating circadian activity.

      The data support claims made, are generally well presented and the manuscript clearly written. The link between circadian control of Alk signalling in Clock neurons > Spar expression > ultimately controlling circadian activity, however, was not clear.

      Authors response: We thank the reviewer for this thorough reading of our manuscript and for kindly highlighting the important takeaways from the study. The role of Alk signalling in activity, circadian rhythm and sleep has previously been reported by other groups in the following studies – (Bai and Sehgal, 2015; Weiss et al, 2017; Gouzi, Bouraimi et al 2018), which we have discussed in our manuscript. We also have identified a hyperactivity phenotype in our Alk CNS specific loss-of-function allele, AlkRA, which is similar to the Spar loss-of-function mutant phenotype. We hypothesize that one of the ways in which Alk signalling regulates fly activity is through regulating Spar gene expression in neuroendocrine cells. This is supported by our data which shows Alk expression in Clock neurons, as well as by the new experimental data showing an activity phenotype in flies expressing Spar RNAi driven by the Clk678-Gal4 driver.

      Reviewer #2 (Public Review):

      This manuscript illustrates the power of "combined" research, incorporating a range of tools, both old and new to answer a question. This thorough approach identifies a novel target in a well-established signalling pathway and characterises a new player in Drosophila CNS development.

      Largely, the experiments are carried out with precision, meeting the aims of the project, and setting new targets for future research in the field. It was particularly refreshing to see the use of multi-omics data integration and Targeted DamID (TaDa) findings to triage scRNA-seq data. Some of the TaDa methodology was unorthodox (and should be justified/caveats mentioned in the main text), however, this does not affect the main finding of the study.

      Their discovery of Spar as a neuropeptide precursor downstream of Alk is novel, as well as its ability to regulate activity and circadian clock function in the fly. Spar was just one of the downstream factors identified from this study, therefore, the potential impact goes beyond this one Alk downstream effector.

      Authors response: We thank the reviewer for the positive comments highlighting the strengths of our study. TaDa was used as a semi-quantitative readout of the transcriptional activity in an Alk loss-of-function background with an emphasis on relative differences in peaks close to GATC sites, providing an important dataset for integration with bulk and single cell RNAseq. As the reviewer points out, there are important considerations when interpreting this data and we have now added sentences in the discussion to inform readers of possible caveats of our TaDa dataset.

      Reviewer #3 (Public Review):

      Summary:

      The receptor tyrosine kinase Anaplastic Lymphoma Kinase (ALK) in humans is nervous system expressed and plays an important role as an oncogene. A number of groups have been studying ALK signalling in flies to gain mechanistic insight into its various roles. In flies, ALK plays a critical role in development, particularly embryonic development and axon targeting. In addition, ALK was also shown to regulate adult functions including sleep and memory. In this manuscript, Sukumar et al., used a suite of molecular techniques to identify downstream targets of ALK signalling. They first used targeted DamID, a technique that involves fusing a DNA methylase to RNA polymerase II, so that GATC sites in close proximity to PolII binding sites are marked. They performed these experiments in wild-type and ALK loss of function mutants (using an Alk dominant negative AlkDN), to identify Alk responsive loci. Comparing these loci with a larval single-cell RNAseq dataset identified neuroendocrine cells as an important site of Alk action. They further combined these TaDa hits with data from RNA seq in Alk Loss and Gain of Function manipulations to identify a single novel target of Alk signalling - a neuropeptide precursor they named Sparkly (Spar) for its expression pattern. They generated a mutant allele of Spar, raised an antibody against Spar, and characterised its expression pattern and mutant behavioural phenotypes including defects in sleep and circadian function.

      Strengths:

      The molecular biology experiments using TaDa and RNAseq were elegant and very convincing. The authors identified a novel gene they named Spar. They also generated a mutant allele of Spar (using CrisprCas technology) and raised an antibody against Spar. These experiments are lovely, and the reagents will be useful to the community. The paper is also well written, and the figures are very nicely laid out making the manuscript a pleasure to read.

      Weaknesses:

      My main concerns were around the genetics and behavioural characterisation which is incomplete. The authors generated a novel allele of Spar - Spar ΔExon1 and examined sleep and circadian phenotypes of this allele. However, they have only one mutant allele of Spar, and it doesn't appear as if this mutant was outcrossed, making it very difficult to rule out off-target effects. To make this data convincing, it would be better if the authors had a second allele, perhaps they could try RNAi?

      Further, the sleep and circadian characterisation could be substantially improved. In Fig 8 E-F it appears as if sleep was averaged over 30 days! This is a little bizarre. They then bin the data as day 1 - 12 and 12-30. This is not terribly helpful either. Sleep in flies, as in humans, undergoes ontogenetic changes - sleep is high in young flies, stabilises between day 3-12, and shows defects by around 3 weeks of age (cf Shaw et al., 2000 PMID 10710313). The standard in the sleep field is to average over 3 days or show one representative day. The authors should reanalyse their data as per this standard, and perhaps show data from 3-10 day old flies, and if they like from 20-30 day old flies. Further, sleep data is usually analysed and presented from lights on to lights on. This allows one to quantify important metrics of sleep consolidation including bout lengths in day and night, and sleep latency. These metrics are of great interest to the community and should be included.
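
      As an illustration of the consolidation metrics mentioned above (bout number, bout length and sleep latency), the following minimal sketch scores sleep from per-minute activity counts. It assumes the widely used Drosophila convention that a sleep bout is at least 5 consecutive minutes without activity, and that the input series starts at the reference time point (for example lights off); the function names and input format are hypothetical and are not taken from the manuscript or from any specific analysis package.

```python
def sleep_bouts(counts_per_min, min_bout=5):
    """Return (start_minute, length_min) for each run of inactivity.

    A run counts as sleep only if it lasts at least `min_bout` minutes
    (assumption: the common 5-minute criterion for fly sleep).
    """
    bouts = []
    run_start = None
    # A non-zero sentinel at the end closes a run that reaches the last minute.
    for i, count in enumerate(list(counts_per_min) + [1]):
        if count == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_bout:
                bouts.append((run_start, i - run_start))
            run_start = None
    return bouts


def sleep_metrics(counts_per_min, min_bout=5):
    """Summarise total sleep, bout number, mean bout length and latency."""
    bouts = sleep_bouts(counts_per_min, min_bout)
    total = sum(length for _, length in bouts)
    return {
        "total_sleep_min": total,
        "n_bouts": len(bouts),
        "mean_bout_min": total / len(bouts) if bouts else 0.0,
        # Latency: minutes from the start of the series to the first bout.
        "latency_min": bouts[0][0] if bouts else None,
    }
```

      Applied separately to the light and dark phases of each day and then averaged over about three days per fly, metrics of this kind would give the day and night bout lengths and latencies the reviewer asks for.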

      The authors also claim there are defects in circadian anticipatory activity. However, these data, as presented are not solid to me. The standard in the field is to perform eduction analyses and quantify anticipatory activity e.g. using the method of Harrisingh et al. (PMID: 18003827). Further, circadian period could also be evaluated. There are several free software packages to perform these analyses so it should not be hard to do.

      Authors response: We thank the reviewer for the thorough reading of our manuscript and for generously praising the positives as well as pointing out the weaknesses of our study. We have now addressed the highlighted weaknesses in the behavioural experiments. In particular, we have reanalysed our data according to the reviewer’s suggestions. In addition, we provide experimental data, driving Spar RNAi in Clock neurons, that support our Spar mutant analysis.

      Point-by-point response to the reviewers’ concerns:

      Point 1. “My main concerns were around the genetics and behavioural characterisation which is incomplete. The authors generated a novel allele of Spar - Spar ΔExon1 and examined sleep and circadian phenotypes of this allele. However, they have only one mutant allele of Spar, and it doesn't appear as if this mutant was outcrossed, making it very difficult to rule out off-target effects. To make this data convincing, it would be better if the authors had a second allele, perhaps they could try RNAi?”

      Authors response: As per the reviewer's suggestion, we conducted a targeted knockdown of Sparkly specifically in clock neurons (Clk-Gal4 > Spar-RNAi) and assessed the circadian phenotypes. Flies were monitored for 5 days in LD followed by a shift to DD, similar to our previous LD-DD experiments. The results revealed a significant disruption in both activity and sleep during the DD transition period upon knockdown of Spar in circadian clock neurons. These findings strongly align with the expression pattern of Spar in clock neurons (Figure 7i-l’’). We have now included a new main figure (Figure 9) together with several supplementary figures (Figure 9 – figure supplements 1 and 2) and discussed these experiments on pages 17-18 of the results section of the revised manuscript.

      Point 2. “Further, the sleep and circadian characterisation could be substantially improved. In Fig 8 E-F it appears as if sleep was averaged over 30 days! This is a little bizarre. They then bin the data as day 1 - 12 and 12-30. This is not terribly helpful either. Sleep in flies, as in humans, undergoes ontogenetic changes - sleep is high in young flies, stabilises between day 3-12, and shows defects by around 3 weeks of age (cf Shaw et al., 2000 PMID 10710313). The standard in the sleep field is to average over 3 days or show one representative day. The authors should reanalyse their data as per this standard, and perhaps show data from 3–10-day old flies, and if they like from 20–30-day old flies.”

      Authors response: We have reanalysed these data according to the reviewer's suggestions and revised the sleep data presented. Specifically, we have focused on two 3-day periods, days 5-7 as well as days 20-22. By averaging the mean sleep during these time points, we observed a significant decrease in average sleep duration in the SparΔExon1 and AlkΔRA mutant flies at a younger age (Figure 8h-h’, Figure 8 – figure supplement 2). However, no significant effect was observed in older flies (Figure 8h-h’, Figure 8 – figure supplement 2). We have incorporated these new data into Figure 8 and provided a detailed description in the results section (page 16) of the revised manuscript.

      Point 3. “Further, sleep data is usually analysed and presented from lights on to lights on. This allows one to quantify important metrics of sleep consolidation including bout lengths in day and night, and sleep latency. These metrics are of great interest to the community and should be included.”

      Authors response: We have now reanalysed these data as per the reviewer's suggestion. From the raw data collected over a span of 3 days, we specifically selected the lights on-lights on data and examined the average sleep duration. Notably, we observed a significant downregulation of average sleep in SparΔExon1 and AlkΔRA flies, but only at a younger age (Figure 8h-h’, Figure 8 – figure supplement 2). Furthermore, we assessed the number of sleep bouts using this data and found a significant increase in the number of bouts in younger SparΔExon1 and AlkΔRA flies, with no changes observed at an older age (Figure 8 – figure supplement 2). Additionally, we evaluated the number of bouts in flies that were initially monitored in LD and then shifted to DD, observing a significant decrease in the number of sleep bouts in SparΔExon1 flies following the transition to DD (Figure 9d). This new data is described in detail in the results section (pages 16-18) of the revised manuscript.
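
      The bout metrics mentioned in this response can be illustrated with a short sketch. It is not the authors' analysis pipeline: it assumes one-minute activity counts per fly and the widely used criterion that a sleep bout is at least 5 consecutive minutes without activity. The function name, parameters and example counts are hypothetical.

```python
import numpy as np

def sleep_bouts(counts, min_len=5):
    """Return (start_index, length) for each run of zero activity lasting >= min_len bins.

    Assumption: one bin = one minute, and sleep = at least 5 inactive minutes.
    """
    inactive = np.asarray(counts) == 0
    # Pad with False on both sides so every run has a detectable start and end.
    padded = np.concatenate(([False], inactive, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)    # first inactive bin of each run
    ends = np.flatnonzero(edges == -1)     # first active bin after each run
    return [(int(s), int(e - s)) for s, e in zip(starts, ends) if e - s >= min_len]

# Hypothetical one-minute activity counts for a single fly.
counts = [3, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0]
bouts = sleep_bouts(counts)
n_bouts = len(bouts)                                   # number of sleep bouts
total_sleep = sum(length for _, length in bouts)       # minutes asleep
mean_bout = total_sleep / n_bouts if n_bouts else 0.0  # mean bout length
```

      Splitting the recording at lights-on and lights-off before calling the function would give the separate day and night bout counts and lengths that the reviewer asked for.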

      Point 4. “The authors also claim there are defects in circadian anticipatory activity. However, these data, as presented are not solid to me. The standard in the field is to perform eduction analyses and quantify anticipatory activity e.g. using the method of Harrisingh et al. (PMID: 18003827).”

      Authors response: We appreciate the valuable suggestion provided by the reviewer. In accordance with the referenced paper by Harrisingh et al. (2007), we calculated the "anticipation score", defined as the percentage of activity in the 6-hour period preceding the lights-on or lights-off transition that occurs in the 3-hour window just before the transition. To analyse the mean activity of the flies, we selected the data corresponding to the 6 hours before lights-on and the 6 hours before lights-off, averaged over a 14-day period under normal LD conditions. Interestingly, we observed a significant increase in the mean activity of SparΔExon1 flies during both morning anticipation (a.m. anticipation) and evening anticipation (p.m. anticipation) (Figure 8f). Furthermore, we analysed this parameter for flies entrained in DD and found that SparΔExon1 flies exhibited lower mean activity during both morning and evening anticipation (Figure 8g). We have incorporated these new data into Figure 8 and provided a detailed description in the results section (pages 16-18) of the revised manuscript.
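
      The anticipation score as defined above reduces to a simple ratio. The sketch below is a minimal illustration under stated assumptions: activity is binned at a fixed rate (30-minute bins here), and the function name, arguments and example values are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def anticipation_score(activity, transition_idx, bins_per_hour=2):
    """Percentage of the activity in the 6 h before a transition that falls in the final 3 h.

    activity: binned activity counts; transition_idx: index of the first bin after
    lights-on or lights-off. Both the binning and the indexing are assumptions.
    """
    activity = np.asarray(activity, dtype=float)
    start = transition_idx - 6 * bins_per_hour
    if start < 0:
        raise ValueError("need at least 6 h of data before the transition")
    six_h = activity[start:transition_idx]
    three_h = activity[transition_idx - 3 * bins_per_hour:transition_idx]
    total = six_h.sum()
    return float("nan") if total == 0 else 100.0 * three_h.sum() / total

# Hypothetical 30-min bins: low activity 6-3 h before lights-on, rising in the last 3 h.
activity = [10] * 6 + [30] * 6
score = anticipation_score(activity, transition_idx=12)
```

      A score of 50 means activity is spread evenly over the 6 hours, and higher values indicate that activity ramps up ahead of the transition.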

      Point 5. Further, circadian period could also be evaluated. There are several free software packages to perform these analyses so it should not be hard to do.

      Authors response: We have now evaluated the circadian period as suggested by the reviewer. We generated a chi-square periodogram for each fly to calculate the free-running period, both for flies kept under normal LD conditions and for flies entrained in DD. We calculated the percentage of flies that had a shorter or longer period than 1440 min (24 h) and observed that w1118 and SparΔExon1 flies have a longer circadian period (Figure 8 – figure supplement 4) but, following the shift to DD, they tend to have a shorter circadian period (Figure 9 – figure supplement 3). These new data are described in the results (pages 16-18).
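
      For readers unfamiliar with the chi-square periodogram, the sketch below shows one common formulation (the Sokolove-Bushell Qp statistic). It is an illustration only: the response does not say which software or settings were used, and the hourly bins, the 18-30 h search range and the synthetic signal are all assumptions.

```python
import numpy as np

def chi_square_periodogram(x, periods):
    """Qp statistic for each candidate period, with periods given in samples."""
    x = np.asarray(x, dtype=float)
    qp = []
    for p in periods:
        k = len(x) // p                # number of complete cycles of length p
        n = k * p                      # samples used after trimming the remainder
        seg = x[:n]
        col_means = seg.reshape(k, p).mean(axis=0)   # mean activity at each phase
        total_ss = ((seg - seg.mean()) ** 2).sum()
        between_ss = ((col_means - seg.mean()) ** 2).sum()
        qp.append(k * n * between_ss / total_ss if total_ss > 0 else 0.0)
    return np.array(qp)

# Synthetic example: 10 days of hourly bins with an exact 24 h rhythm.
t = np.arange(240)
signal = 1.0 + np.cos(2 * np.pi * t / 24)
periods = np.arange(18, 31)            # candidate periods of 18-30 h
qp = chi_square_periodogram(signal, periods)
best_period = int(periods[np.argmax(qp)])
```

      In practice each Qp value is compared against a chi-square threshold with P - 1 degrees of freedom, and the search range is kept narrow because multiples of the true period score equally well.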

      Recommendations for the authors:

      There are two major concerns that we recommend the authors address:

      1) The behaviour: There are a number of unconventional representations of the behavioural data in this manuscript. We recommend that the authors revisit their data representation to adhere to conventions in the field - specific suggestions are in the reviews. We also suggest an additional experiment - an RNAi/different allele/rescue experiment to ensure that the phenotypes the authors observe are not due to off-target effects of the mutant they have generated.

      Authors response: In the revised manuscript, we have reanalysed the behavioural data according to the reviewers’ recommendations (included in Figures 8 and 9 of the revised version). In addition, we have performed a targeted Spar RNAi experiment in clock neurons (included in Figure 9 of the revised version), identifying a hyperactive behavioural phenotype similar to that of Spar mutants. The inclusion of these new analyses and data strengthens the manuscript and supports the conclusion that Spar plays a role in regulation of behaviour.

      2) TaDa analyses: We were concerned that the authors might be picking up false positives with the way they have analysed their data. While this may not matter for this study, it will be useful to reason out their approach and keep this in mind for any other targets they choose from these data for further studies.

      Authors response: In line with the reviewers’ concerns, we have now highlighted the potential caveats and drawbacks of our TaDa dataset in the discussion section of the revised manuscript (detailed in response to Reviewer #2 below).

      Reviewer #1 (Recommendations For The Authors):

      Though generally well written, I felt that some sections could be written in more detail. For example, the text around Figure 5 was not very informative. Many of the other approaches to the analyses and details of datasets used were glossed over. Since the manuscript uses a lot of previously published data, it would be nice to give more details about them in the context of the results.

      Authors response: We thank the reviewer for this recommendation. We have now added additional information about the peptidomics analysis in the results and in the legend of Figure 5. We have also included a table in the Methods that summarises the datasets used in this study, including the dataset name, a brief description and the reference.

      In the panels where co-localisations have been represented, it would be nice to include enlarged insets depicting the co-labelling. It is not always obvious in the way the figures have currently been represented. For example, in Fig 2G, Alk stain appears to be everywhere, but the authors make the point that it is enriched in neuroendocrine cells (as labelled by dimmed), but the co-localisation isn't evident. Similar issues come up with the sparkly colocalisations.

      Authors response: As suggested by the reviewer, we have now added additional panels to complement the stainings in Figure 2G. These new data are included as Figure 2 – figure supplement 1 (Alk/Dimm-Gal4>UAS-GFPcaax staining) and as Figure 4 – figure supplement 1 (Alk/Spar staining), which indicate colocalization in the central brain and ventral nerve cord prosecretory cells with enlarged panels.

      Supplementary figures S3C and 3F appear garbled to me? Maybe it didn't upload properly?

      Authors response: Unfortunately, this issue is not apparent to us. However, we have now re-uploaded these Figures.

      Sparkly's responsiveness to Alk signalling: Visually, there does not seem to be an increase or decrease in spar levels in the images in Fig 4F-H. How was the quantification done? I would suggest a more detailed interpretation of their results related to spar's responsiveness to Alk signalling - at the mRNA vs protein levels and the GOF vs LOF conditions.

      Authors response: We thank the reviewer for this constructive recommendation. In the revised manuscript, we have now repeated this experiment with increased numbers of larval CNS followed by blinded image analysis. These results also show an increased fluorescence intensity as measured by corrected total cell fluorescence (CTCF), confirming our previous observation of increased Spar protein expression in Alk gain-of-function conditions compared to controls. In this analysis, changes in Spar levels in Alk loss-of-function remained non-significant compared to control, in agreement with our previous data. As suggested by the reviewer, we have now included several additional sentences discussing the possible reasons for these observations. The following text is now included on Page 11 of the results section:

      “While our bulk RNA-seq and TaDa datasets show a reduction in Spar transcript levels in Alk loss-of-function conditions, this reduction is not reflected at the protein level. This observation may reflect additional uncharacterised pathways that regulate Spar mRNA levels as well as translation and protein stability. Taken together, these observations confirm that Spar expression is responsive to Alk signaling in CNS, although Alk is not critically required to maintain Spar protein levels.” We have also added an additional Image analysis method section explaining the methodology of the CTCF fluorescent intensity quantification on Page 28.
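
      For reference, CTCF is conventionally computed as the integrated density of a region minus the region's area multiplied by the mean background fluorescence. The sketch below shows that arithmetic; the function name and the example measurements are hypothetical and are not values from the manuscript.

```python
def ctcf(integrated_density, area, background_means):
    """Corrected total cell fluorescence for one region of interest.

    integrated_density: summed pixel intensity in the region
    area: region area in pixels
    background_means: mean intensities of nearby background regions
    """
    mean_background = sum(background_means) / len(background_means)
    return integrated_density - area * mean_background

# Hypothetical measurements for one cell and three background regions.
value = ctcf(integrated_density=50_000, area=400, background_means=[20, 22, 18])
```

      Subtracting area times background, rather than a fixed offset, keeps regions of different sizes comparable.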

      Reviewer #2 (Recommendations For The Authors):

      It was surprising to see that the authors did not use Dam-only controls. This is to control for background methylation by Dam (i.e. accessible chromatin). This does not invalidate the main results of the manuscript, however, there could be false positives in the dataset for genes that are seen to be up-regulated in the mutant condition (e.g. if accessibility is increased in the mutant but not transcription, then it would look like increased Pol II binding, when it isn't). As the study was focusing on genes down-regulated in the mutant, this is less of an issue, as it is very unlikely to see an increase in transcription with a decrease in accessibility (that could provide a false positive). The authors should explain their rationale for not using Dam-only controls, and the associated caveats, in the manuscript.

      Authors response: We agree with the reviewer’s comment on the possibility of identifying false positive candidates from our TaDa dataset, especially if one is seeking to find a gene with increased Pol II occupancy in an Alk dominant negative condition. However, our analysis only focuses on genes which are responsive to Alk manipulation, namely, genes which are downregulated in the Alk dominant negative condition. One of the rationales for not using a Dam-only control was that in our previous Mendoza-Garcia et al, 2021 study, we employed a similar method and were able to successfully identify already known and novel targets of Alk signalling in embryonic mesoderm by comparing the Dam-Pol II versus Dam-Pol II; Alk dominant negative conditions. In the current version of the manuscript, we have expanded our discussion of these caveats as follows (Discussion, Pages 19-20):

      “A potential drawback of our TaDa dataset is the identification of false positives, due to non-specific methylation of GATC sites at accessible regions in the genome by Dam protein. Hence, our experimental approach likely more reliably identifies candidates which are downregulated upon Alk inhibition. In our analysis, we have limited this drawback by focusing on genes downregulated upon Alk inhibition and integrating our analysis with additional datasets, followed by experimental validation. This approach is supported by the identification of numerous previously identified Alk targets in our TaDa candidate list.”

      Related to this, could the authors make it clear/justify why they chose to use peak-based analysis of the Dam-Pol II data rather than looking at signals across whole transcripts? For example, this could result in false positives if a gene switches from having no Pol II to having paused Pol II.

      Authors response: In our opinion, a peak-based analysis is dependable in this context. We chose to prioritize peaks close (+/- 1 kb) to transcription start sites (TSS) to increase the chances of finding true Pol II occupancy peaks. Also, during bioinformatics analysis using the Damid-seq pipeline (Maksimov et al, 2016), fragments not aligning to GATC borders are excluded. Therefore, a whole-transcript Pol II occupancy analysis may not always be feasible. We agree with the reviewer that a paused Pol II will result in false positives; however, it will only result in an increase of a specific peak, and in our case we are seeking to identify peaks with lower Pol II occupancy as a result of Alk knockdown. Furthermore, we depend on integration with additional relevant datasets to minimise false positive candidates for detailed analysis. In the current version of the manuscript these caveats have been mentioned and discussed (see point above).
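
      The filtering logic described in this response (peaks within 1 kb of a TSS, then only those with reduced occupancy) can be sketched as below. This is an assumption-laden illustration, not the authors' pipeline: the tuple layout, the use of the peak midpoint, the log2 fold-change column and all coordinates are hypothetical.

```python
from bisect import bisect_left

def peaks_near_tss(peaks, tss_positions, window=1000):
    """Keep peaks whose midpoint lies within `window` bp of a TSS on the same chromosome.

    peaks: iterable of (chrom, start, end, log2_fold_change)  [hypothetical layout]
    tss_positions: dict mapping chrom -> list of TSS coordinates
    """
    sorted_tss = {chrom: sorted(pos) for chrom, pos in tss_positions.items()}
    kept = []
    for chrom, start, end, log2fc in peaks:
        sites = sorted_tss.get(chrom, [])
        mid = (start + end) // 2
        # First TSS at or beyond the left edge of the window, if any.
        i = bisect_left(sites, mid - window)
        if i < len(sites) and sites[i] <= mid + window:
            kept.append((chrom, start, end, log2fc))
    return kept

# Hypothetical peaks and TSS coordinates.
peaks = [
    ("chr2L", 10_200, 10_600, -1.4),   # near a TSS, reduced occupancy
    ("chr2L", 50_000, 50_400, -0.9),   # reduced occupancy but far from any TSS
    ("chr3R", 7_900, 8_300, 0.7),      # near a TSS, increased occupancy
]
tss = {"chr2L": [10_000], "chr3R": [8_000]}
near = peaks_near_tss(peaks, tss)
downregulated = [p for p in near if p[3] < 0]
```

      Restricting to negative fold changes mirrors the argument above: a gene that gains paused Pol II would raise a peak, so it cannot enter a list built only from peaks that decrease.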

      Do the authors have any theories about the mode of action of Spar? Or ideas about how this might be followed up? If so, that could be included in the Discussion.

      Authors response: Other than identifying modified Spar-derived peptides, which suggest a target receptor, possibly a GPCR, we currently have no other data that allow us to speculate further on the mode of action of Spar. We are working hard to identify a receptor, but this is a challenging and ongoing process. In the discussion we speculate regarding the identity of the Spar receptor, as well as its location, which is likely in the CNS and body muscle; however, these are open questions that we hope to answer in a future study.

      Reviewer #3 (Recommendations For The Authors):

      Spar protein expression was unchanged in Alk loss of function. This is a curious result as the authors used RNA seq data from Alk loss of function to identify Spar. This could be commented on in the discussion.

      Authors response: We thank the reviewer for this comment, and they are correct in noticing this. We have also thought about this, and reviewer #1 commented on it as well. To confirm this result, we repeated this experiment with increased numbers of larval CNS followed by blinded image analysis for the revised version. These results also show an increased fluorescence intensity as measured by corrected total cell fluorescence (CTCF), confirming our previous observation of increased Spar protein expression in Alk gain-of-function conditions compared to controls. In this analysis, changes in Spar levels in Alk loss-of-function remained non-significant compared to control, in agreement with our previous data. As suggested by reviewer #1, we have now included several additional sentences discussing the possible reasons for these observations. The following text is now included on Page 11 of the results section:

      “While our bulk RNA-seq and TaDa datasets show a reduction in Spar transcript levels in Alk loss-of-function conditions, this reduction is not reflected at the protein level. This observation may reflect additional uncharacterised pathways that regulate Spar mRNA levels as well as translation and protein stability. Taken together, these observations confirm that Spar expression is responsive to Alk signaling in CNS, although Alk is not critically required to maintain Spar protein levels.”

      Pg 19: Spar is expressed in the Mushroom Bodies (MBs). Do they mean in Kenyon Cells (KCs)? I don't see this expression in the figures. Maybe this could be highlighted in the figure. It would definitely be of interest if this were true.

      Authors response: We agree with the reviewer that this would be interesting. We have not performed detailed staining of the mushroom bodies at this point, however, Spar mRNA expression in a transcriptomics analysis performed by Crocker et al, 2016, identifies Spar in all cell types, including Kenyon cells. We have now included this and cited this reference in the discussion.

      Spar is also expressed in multiple potential sleep regulatory sites including clock neurons, the PI, AstA cells and so on. Some of these might be arousal-promoting and some sleep-promoting. Taking out Spar in both sleep and arousal-promoting subsets might have complex effects. The authors might want to knock down Alk in different subsets of neurons to make more targeted manipulations.

      Authors response: We thank the reviewer for this suggestion regarding interesting experiments to further investigate Spar function. We are planning to follow up and study the role of Alk signalling in different neuronal subsets, with a specific interest in neuroendocrine/prosecretory cells.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer No.1 (public)

      The authors present a study focused on addressing the key challenge in drug discovery, which is the optimization of absorption and affinity properties of small molecules through in silico methods. They propose active learning as a strategy for optimizing these properties and describe the development of two novel active learning batch selection methods. The methods are tested on various public datasets with different optimization goals and sizes, and new affinity datasets are curated to provide up-to-date experimental information. The authors claim that their active learning methods outperform existing batch selection methods, potentially reducing the number of experiments required to achieve the same model performance. They also emphasize the general applicability of their methods, including compatibility with popular packages like DeepChem.

      Strengths:

      Relevance and Importance: The study addresses a significant challenge in the field of drug discovery, highlighting the importance of optimizing the absorption and affinity properties of small molecules through in silico methods. This topic is of great interest to researchers and pharmaceutical industries.

      Novelty: The development of two novel active learning batch selection methods is a commendable contribution. The study also adds value by curating new affinity datasets that provide chronological information on state-of-the-art experimental strategies.

      Comprehensive Evaluation: Testing the proposed methods on multiple public datasets with varying optimization goals and sizes enhances the credibility and generalizability of the findings. The focus on comparing the performance of the new methods against existing batch selection methods further strengthens the evaluation.

      Weaknesses:

      Lack of Technical Details: The feedback lacks specific technical details regarding the developed active learning batch selection methods. Information such as the underlying algorithms, implementation specifics, and key design choices should be provided to enable readers to understand and evaluate the methods thoroughly.

      Evaluation Metrics: The feedback does not mention the specific evaluation metrics used to assess the performance of the proposed methods. The authors should clarify the criteria employed to compare their methods against existing batch selection methods and demonstrate the statistical significance of the observed improvements.

      Reproducibility: While the authors claim that their methods can be used with any package, including DeepChem, no mention is made of providing the necessary code or resources to reproduce the experiments. Including code repositories or detailed instructions would enhance the reproducibility and practical utility of the study.

      Suggestion 1:

      Elaborate on the Methodology: Provide an in-depth explanation of the two active learning batch selection methods, including algorithmic details, implementation considerations, and any specific assumptions made. This will enable readers to better comprehend and evaluate the proposed techniques.

      Answer: We thank the reviewer for this suggestion. Following this comment, we have extended the text in Methods (in Section: Batch selection via determinant maximization and Section: Approximation of the posterior distribution) and in Supporting Methods (Section: Toy example). We have also included the pseudocode for the batch optimization method.
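
      To give a flavour of what batch selection by determinant maximization involves, the sketch below greedily picks the candidates whose joint covariance submatrix has the largest log-determinant. It is a generic illustration and not the pseudocode from the manuscript: the greedy strategy, the function name and the toy covariance matrix are all assumptions.

```python
import numpy as np

def greedy_logdet_batch(cov, batch_size):
    """Greedily choose indices that maximize log det of the covariance submatrix."""
    chosen = []
    for _ in range(batch_size):
        best, best_val = None, -np.inf
        for i in range(cov.shape[0]):
            if i in chosen:
                continue
            idx = chosen + [i]
            sign, logdet = np.linalg.slogdet(cov[np.ix_(idx, idx)])
            val = logdet if sign > 0 else -np.inf   # ignore singular submatrices
            if val > best_val:
                best, best_val = i, val
        if best is None:        # no candidate gives a positive determinant
            break
        chosen.append(best)
    return chosen

# Toy covariance: candidates 0 and 1 are uncertain but almost redundant,
# candidate 2 is slightly less uncertain but independent of both.
cov = np.array([
    [1.00, 0.95, 0.00],
    [0.95, 1.00, 0.00],
    [0.00, 0.00, 0.80],
])
batch = greedy_logdet_batch(cov, batch_size=2)
```

      The toy case shows why a determinant criterion differs from ranking by variance alone: the second pick is the independent candidate, because the two highly correlated candidates add little joint information.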

      Suggestion 2:

      Clarify Evaluation Metrics: Clearly specify the evaluation metrics employed in the study to measure the performance of the active learning methods. Additionally, conduct statistical tests to establish the significance of the improvements observed over existing batch selection methods.

      Answer: Following this comment we added to Table 1 details about the way we computed the cutoff times for the different methods. We also provide more details on the statistics we performed to determine the significance of these differences.

      Suggestion 3:

      Enhance Reproducibility: To facilitate the reproducibility of the study, consider sharing the code, data, and resources necessary for readers to replicate the experiments. This will allow researchers in the field to validate and build upon your work more effectively.

      Answer: This is something we already included with the original submission. The code is publicly available. In fact, we provide a Python library, ALIEN (Active Learning in data Exploration), which is published on the Sanofi GitHub (https://github.com/Sanofi-Public/Alien). We also provide details on the public data used and expect to provide the internal data as well. We included a small paragraph on code and data availability.

      Reviewer No.2 (public)

      Suggestion 1:

      The authors presented a well-written manuscript describing the comparison of active-learning methods with state-of-the-art methods for several datasets of pharmaceutical interest. This is a very important topic since active learning is similar to a cyclic drug design campaign, such as testing compounds followed by designing new ones, which could be used for further tests and a new design cycle, and so on. The experimental design is comprehensive and adequate for the proposed comparisons. However, I would expect to see a comparison regarding other regression metrics and considering the applicability domain of models, which are two essential topics for the drug design modelers community.

      Answer: We want to thank the reviewer for these comments. We provide a detailed response to the specific comments below. 

      Reviewer No.1 (Recommendations For The Authors)

      Recommendation 1:

      The description provided regarding the data collection process and the benchmark datasets used in the study raises some concerns. The comment specifically addresses the use of both private (Sanofi-owned) and public datasets to benchmark the various batch selection methods. Lack of Transparency: The comment lacks transparency regarding the specific sources and origins of the private datasets. It would be crucial to disclose whether these datasets were obtained from external sources or if they were generated internally within Sanofi. Without this information, it becomes difficult to assess the potential biases or conflicts of interest associated with the data.

      Answer: We would like to thank the reviewer for this comment. As mentioned in the paper, the public GitHub page contains links to all the public data, and we expect it to link to the internal Sanofi data as well. We also now provide more information on the specific experiments that were done internally by Sanofi to collect that data.

      Potential Data Accessibility Issues: The utilization of private datasets, particularly those owned by Sanofi, may raise concerns about data accessibility. The lack of availability of these datasets to the wider scientific community may limit the ability of other researchers to replicate and validate the study’s findings. It is essential to ensure that the data used in research is openly accessible to foster transparency and encourage collaboration.

      Answer: Again, as stated above we expect to release the data collected internally on the github page.

      Limited Information on Dataset Properties: The comment briefly mentions that the benchmark datasets cover properties related to absorption, distribution, pharmacokinetic processes, and affinity of small drug molecules to target proteins. However, it does not provide any specific details about the properties included in the datasets or how they were curated. Providing more comprehensive information about the properties covered and the methods used for curation would enhance the transparency and reliability of the study.

      To address these concerns, it is crucial for the authors to provide more detailed information about the data sources, dataset composition, representativeness, and curation methods employed. Transparency and accessibility of data are fundamental principles in scientific research, and addressing these issues will strengthen the credibility and impact of the study.

      Answer: We agree with this comment and believe that it is important to be explicit about each of the datasets and to provide information on the new data. We note that we already discuss the details of each of the experiments in Methods and, of course, provide links to the original papers for the public data. We have now added text to Supporting Methods that describes the experiments in more detail and provides literature references for the experimental protocols used. As noted above, we expect to provide our new internal data on the public git page.

      Recommendation 2:

      Some comments on the modeling example Approximation of the posterior distribution. Lack of Methodological Transparency: The comment fails to provide any information regarding the specific method or approach used for approximating the posterior distribution. Without understanding the methodology employed, it is impossible to evaluate the quality or rigor of the approximation. This lack of transparency undermines the credibility of the study.

      Answer: We want to thank the reviewer for pointing this out. Based on this comment we added more information to Section: Approximation of the posterior distribution. Moreover, we now provide details on the posterior approximation in Section: Two approximations for computing the epistemic covariance.

      Questionable Assumptions: The comment does not mention any of the assumptions made during the approximation process. The validity of any approximation heavily depends on the underlying assumptions, and their omission suggests a lack of thorough analysis. Failing to acknowledge these assumptions leaves room for doubt regarding the accuracy and relevance of the approximation.

      Answer: We are not entirely sure which assumptions the reviewer is referring to here. The main assumption we can think of that we have used is the fact that getting within X% of the optimal model is a good enough approximation. We have specifically discussed this assumption and tested multiple values of X. While it would have been great to have X = 0 this is unrealistic for retrospective studies. For Active Learning the main question is how many experiments can be saved to obtain similar results and the assumptions we used are basically ’what is the definition of similar’. We now added this to Discussion.
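
      The "within X% of the optimal model" criterion can be made concrete with a short sketch. It assumes that each method yields a learning curve of (number of labelled samples, test error) pairs and that "optimal" means the error of the model trained on all available data. These definitions, the function name and the example curves are hypothetical and may differ from the cutoff computation used in the manuscript.

```python
def samples_to_reach(curve, optimal_error, x_percent):
    """First sample count at which the error is within x_percent of the optimal error.

    curve: list of (n_samples, error) pairs in increasing n_samples order.
    Returns None if the threshold is never reached.
    """
    threshold = optimal_error * (1 + x_percent / 100)
    for n_samples, error in curve:
        if error <= threshold:
            return n_samples
    return None

# Hypothetical learning curves (error could be RMSE on a held-out set).
random_curve = [(100, 1.00), (200, 0.80), (300, 0.70), (400, 0.60)]
active_curve = [(100, 0.95), (200, 0.65), (300, 0.61), (400, 0.60)]
optimal = 0.60

n_random = samples_to_reach(random_curve, optimal, x_percent=10)
n_active = samples_to_reach(active_curve, optimal, x_percent=10)
experiments_saved = n_random - n_active
```

      Repeating the calculation for several values of X, as the response describes, shows how sensitive the claimed savings are to the chosen definition of "similar".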

      Inadequate Validation: There is no mention of any validation measures or techniques used to assess the accuracy and reliability of the approximated posterior distribution. Without proper validation, it is impossible to determine whether the approximation provides a reasonable representation of the true posterior. The absence of validation raises concerns about the potential biases or errors introduced by the approximation process.

      Answer: We sincerely appreciate your concern regarding the validation of the approximated posterior distribution. We acknowledge that our initial submission might not have clearly highlighted our validation strategy. It is, of course, very hard to determine the accuracy of the distribution our model learns, since such a distribution cannot be directly inferred from experiments (there is no ’ground truth’). Instead, we use an indirect method to determine the accuracy. Specifically, we conducted retrospective experiments using the learned distribution. In these experiments, we indirectly validated our approximation by measuring the error with the respective method. The results from these retrospective experiments provided evidence for the accuracy and reliability of our approximation in representing the true posterior distribution. We now emphasize this in Methods.

      Uncertainty Quantification: The comment does not discuss the quantification of uncertainty associated with the approximated posterior distribution. Properly characterizing the uncertainty is crucial in statistical inference and decision-making. Neglecting this aspect undermines the usefulness and applicability of the approximation results.

      Answer: Thank you for pointing out the importance of characterizing uncertainty in statistical inference and decision-making, a sentiment with which we wholeheartedly agree. In our work, we have indeed addressed the quantification of uncertainty associated with the approximated posterior distribution. Specifically, we utilized Monte Carlo Dropout (MC Dropout) as our method of choice. MC Dropout is a widely recognized and employed technique in the neural networks domain to approximate the posterior distribution, and it offers an efficient way to estimate model uncertainty without requiring any changes to the existing network architecture [1, 2]. In the revised version, we provide a more detailed discussion on the use of Monte Carlo Dropout in our methodology and its implications for characterizing uncertainty.
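
      To make the idea behind MC Dropout concrete, here is a minimal sketch in plain Python: dropout is kept active at prediction time, the same input is passed through the network many times, and the spread of the outputs serves as the uncertainty estimate. The tiny one-hidden-layer regressor, its fixed weights and the function names are assumptions for illustration only; they are not the model or code used in the study.

```python
import random
import statistics


def forward(x, w_hidden, w_out, p_drop, rng):
    """One stochastic forward pass of a tiny one-hidden-layer regressor.

    Dropout stays active: each hidden unit is zeroed with probability p_drop
    and the surviving units are rescaled by 1 / (1 - p_drop).
    """
    keep = 1.0 - p_drop
    out = 0.0
    for wh, wo in zip(w_hidden, w_out):
        h = max(0.0, wh * x)          # ReLU hidden unit
        if rng.random() < keep:       # unit survives this pass
            out += wo * h / keep
    return out


def mc_dropout_predict(x, w_hidden, w_out, p_drop=0.2, n_passes=200, seed=0):
    """Mean and standard deviation over repeated dropout-perturbed passes."""
    rng = random.Random(seed)
    samples = [forward(x, w_hidden, w_out, p_drop, rng) for _ in range(n_passes)]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical fixed weights standing in for a trained network.
W_HIDDEN = [0.5, -0.3, 0.8, 0.1]
W_OUT = [1.0, 0.7, -0.4, 2.0]

mean, std = mc_dropout_predict(2.0, W_HIDDEN, W_OUT)
print(f"prediction = {mean:.3f}, uncertainty (std) = {std:.3f}")
```

      In a batch active learning setting, a per-sample spread of this kind is the sort of quantity that can be used to rank candidate experiments; how it enters the selection criterion is specific to each method.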

      Comparison with Gold Standard: There is no mention of comparing the approximated posterior distribution with a gold standard or benchmark. Failing to provide such a comparison leaves doubts about the performance and accuracy of the approximation method. A lack of benchmarking makes it difficult to ascertain the superiority or inferiority of the approximation technique employed.

      Answer: As noted above, it is impossible to find gold standard information for the uncertainty distribution. It is not even clear to us how such a gold standard could be experimentally determined, since it is a function of a specific model and data. If the reviewer is aware of such a gold standard we would be happy to test it. Instead, in our study, we opted to benchmark our results against state-of-the-art batch active learning methods, which also rely on uncertainty prediction (such uncertainty prediction is at the heart of any active learning method, as we discuss). Results clearly indicate that our method outperforms prior methods, though we agree that this is only an indirect way to validate the uncertainty approximation.

      Reviewer No.2 (Recommendations For The Authors)

      Recommendation 1:

      The text is kind of messy: there are two results sections, for example. It seems that part of the text was duplicated. Please correct it.

      Answer: We want to thank the reviewer for pointing this out. These were typos and we fixed them accordingly.

      Recommendation 2:

      Text in figures is very small and difficult to read. Please redraw the figures, increasing the font size: 10-12pt is ideal in comparison with the main text.

      Answer: We want to thank the reviewer for this comment and we have made the graphics larger.

      Recommendation 3: Please, include specific links to data availability instead of just stating it is available at the Sanofi-Public repository.

      Answer: We want to thank the reviewer for this comment and added the links and data to the Sanofi Github page listed in the paper.

      Recommendation 4:

      What are the descriptors used to train the models?

      Answer: We represented the molecules as molecular graphs using the MolGraphConvFeaturizer from the DeepChem library. We now explicitly mention this in Methods.

      Recommendation 5:

      Regarding the quality of the models, I strongly suggest two approaches instead of using only RMSE as the metric of model performance. I recommend using as many metrics as possible, as reported by Gramatica (https://doi.org/10.1021/acs.jcim.6b00088). I also recommend somehow comparing the increase in dataset diversity according to the employed descriptors (applicability domain) as a measure for further applications to unseen molecules.

      Answer: We want to thank the reviewer for these great suggestions. As suggested, we added new comparison metrics to the Supplement.

      • Distribution plot for the range of the Y values (Figure 8)

      • Clustering of the data sets represented as fingerprints (Supplementary material, Figures 5 and 6)

      • Retrospective experiments with the Spearman correlation coefficient (Supplementary material, Figures 2, 3 and 4)
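
      For readers unfamiliar with the Spearman correlation coefficient mentioned in the list above, the following sketch computes it as the Pearson correlation of the rank vectors, with tied values sharing their average rank. The measured and predicted values are hypothetical and only illustrate the calculation; the function names are introduced here and do not come from the study.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical measured vs. predicted values (assumption for illustration).
measured = [5.1, 6.3, 4.8, 7.2, 6.0]
predicted = [5.4, 6.0, 5.0, 6.9, 6.1]
print(spearman(measured, predicted))
```

      Unlike RMSE, this metric depends only on the ordering of the predictions, which is often what matters when compounds are ranked for follow-up.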

      I suggest also a better characterization of datasets including the nature and range of the Y variable, the source of data in terms of experimentation, and chemical (structural and physicochemical) comparison of samples within each dataset.

      Answer: As noted above in response to a similar comment by Reviewer 1, we have added more detailed information about the different experiments we tested to Supporting Methods.

      References

      [1] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In Maria Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 1050–1059, New York, New York, USA, 20–22 Jun 2016. PMLR.

      [2] N.D. Lawrence. Variational Inference in Probabilistic Models. University of Cambridge, 2001.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for their work, and the very useful comments.

      Public reviews:

      Reviewer #2

      1) The authors discussed possible reasons for the different results of the RRP sizes between this study and Alten et al., 2021. One of them is how the hypertonic solution is applied. The authors thought that the long application of hypertonic solution in Alten et al., 2021 caused an overlapping release of RRP and upstream vesicle pools because Alten et al., 2021 measured 10-fold larger RRP size than what was measured in this study. However, Alten et al., 2021 measured RRP from IPSCs and a single inhibitory vesicle fusion causes larger charge transfer than an excitatory vesicle. The authors need to take this into consideration and 10-fold is likely an overestimate.

      Answer: Thank you for pointing out this important difference. We have modified the text in the Discussion accordingly and we no longer refer to the 10-fold difference.

      2) Statistical tests should be performed for protein expression levels (Fig 2A and Fig 10A) and in vitro fusion assays (Fig 8D,E and Fig 9 B,C).

      Answer: We inserted new panels B and C in Fig. 2 and Fig. 10 showing all the Western Blot data and performed statistical tests (none were significant). For the in vitro fusion assays, we have inserted statistical tests in panels 8E and 9C. The quantities in those panels (subdivided into “Pre Ca2+”, “post Ca2+” and “end fusion”) are based on the data in Figure 8D and 9B. We have therefore not inserted separate statistical tests in Figures 8D and 9B.

      Reviewer #1 (Recommendations For The Authors):

      It would be quite interesting for future studies to address how these three mutations in SNAP-25 behave in the Syt1 null background in their electrophysiological experiments. Does the I167N allele block the enhanced spontaneous release in the Syt1 null? Do the V48F and D1667 alleles synergize with Syt1 to enhance spontaneous release to even higher levels? By examining how different components interact to shape the energy landscape for priming and fusion, these types of approaches should be quite revealing.

      Answer: We agree with the reviewer that these future studies would be interesting. Unfortunately, they are beyond our current capacities.

      Reviewer #2 (Recommendations For The Authors):

      1) In the introduction, when discussing that haploinsufficiency of Munc18-1 causes a decrease in release, additional references should be included, for example, the studies in flies (Wu et al., 1998, EMBO), human neurons (Patzke et al., 2015 JCI), and mouse neurons (Toonen et al., 2006 PNAS; Chen et al., 2020 eLife).

      Answer: Thank you for the suggestion. We have rewritten the text and added additional references.

      2) The authors may consider introducing additional motivations and significance of this study. For example, the evoked EPSCs could not be properly measured in the cultures of Alten et al., 2021, but were properly studied here.

      Answer: We agree and have added additional motivations in the Introduction.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weinberger et al. use different fate-mapping models, the FIRE model and PLX-diet to follow and target different macrophage populations and combine them with single-cell data to understand their contribution to heart regeneration after I/R injury. This question has already been addressed by other groups in the field using different models. However, the major strength of this manuscript is the usage of the FIRE mouse model that, for the first time, allows specific targeting of only fetal-derived macrophages. The data show that the absence of resident macrophages is not influencing infarct size but instead is altering the immune cell crosstalk in response to injury, which is in line with the current idea in the field that macrophages of different origins have distinct functions in tissues, especially after an injury. To fully support the claims of the study, specific targeting of monocyte-derived macrophages or the inhibition of their influx at different stages after injury would be of high interest. In summary, the study is well done and important for the field of cardiac injury. But it also provides a novel model (FIRE mice + RANK-Cre fate-mapping) for other tissues to study the function of fetal-derived macrophages while monocyte-derived macrophages remain intact.

      Response from the authors: We thank the reviewer for the thorough review and the positive feedback, and we agree that the Csf1r-FIRE mice represent an interesting model for studying the role of resident embryo-derived macrophages in different tissues and pathologies.

      Recent work of the Cochain lab demonstrated by combined CITE-seq analysis and CCR2 antibody treatment that monocyte depletion does not affect levels of resident tissue macrophages after myocardial infarction (Rizzo et al., PMID: 35950218), supporting the concept of specifically investigating the role of resident and recruited macrophages. While previous work has addressed the effects of broad CCR2-mediated monocyte depletion, information on differential macrophage subsets derived from blood monocytes has been lacking. We agree with the reviewer that targeting subsets of monocyte-derived macrophages, such as for example Ly6Chi monocytes, MHCII+Il1b+ macrophages, and Isg15hi populations (Rizzo et al., PMID: 35950218), or interference with their recruitment at different time-points after myocardial infarction would be of interest and could help to decipher their functions in the different stages of cardiac healing. However, these studies would go beyond the scope of the current analysis and will be addressed in a separate project.

      Reviewer #2 (Public Review):

      In this study Weinberger et al. investigated cardiac macrophage subsets after ischemia/reperfusion (I/R) injury in mice. The authors studied a ∆FIRE mouse model (deletion of a regulatory element in the Csf1r locus), in which only tissue resident macrophages might be ablated. The authors showed a reduction of resident macrophages in ∆FIRE mice and characterized their macrophage populations via scRNAseq at baseline conditions and after I/R injury. 2 days after the I/R protocol, ∆FIRE mice showed an enhanced pro-inflammatory phenotype in the RNAseq data and differential effects on echocardiographic function 6 and 30 days after I/R injury. Via flow cytometry and histology the authors confirmed existing evidence of increased bone marrow-derived macrophage infiltration to the heart, specifically to the ischemic myocardium. Macrophage populations in ∆FIRE mice after I/R injury were only changed in the remote zone. Further RNAseq data on resident or recruited macrophages showed transcriptional differences between both cell types in terms of homeostasis-related genes and inflammation. Depleting all macrophages using a Csf1r inhibitor resulted in reduced cardiac function and increased fibrosis.

      Strengths

      1) The authors utilized robust methodology encompassing state of the art immunological methods, different genetic mouse models and transcriptomics.

      2) The topic of this work is important given the emerging role of tissue resident macrophages in cardiac homeostasis and disease.

      Response from the authors: We thank the reviewer for pointing out the strengths of our study, and putting the findings in context of the current view of the role of resident macrophages.

      Weaknesses:

      1) Specificity of ∆FIRE mouse model for ablating resident macrophages.

      The study builds on the assumption that only resident macrophages are ablated in ∆FIRE mice, while bone marrow-derived macrophages are unaffected. While the effects of the ∆FIRE model are nicely shown for resident macrophages, the authors did not directly assess bone marrow-derived macrophages. Moreover, in the immunohistological images in Fig. 1D nearly all macrophages appear to be absent. It would be helpful to further address the question of whether recruited macrophages are influenced in ∆FIRE mice. Evaluation of YFP positive heart and blood cells in ∆FIRE mice crossed with Flt3CreRosa26eYFP mice could clarify whether bone marrow-derived cardiac macrophages are influenced in ∆FIRE mice. This would be even more relevant in the I/R model where recruitment of bone marrow-derived macrophages is increased. A more direct assessment of recruited macrophages in ∆FIRE mice could also help to discuss potential similarities or discrepancies to the study of Bajpai et al, Circ Res 2018, which showed distinct effects of resident versus recruited macrophages after myocardial infarction. Providing the quantification of flow cytometry data (fig. 1E-F) would be supportive.

      Response from the authors: We thank the reviewer for these comments. The reviewer addresses the specificity of the ∆FIRE mouse model for ablating resident macrophages and its potential effects on bone marrow-derived macrophages. Our single-cell sequencing data support the specificity of the ∆FIRE model regarding embryo-derived resident macrophages in two ways. First, the ∆FIRE mice are characterized by the specific reduction of embryo-derived macrophage clusters (e.g. homeostatic macrophages as well as antigen-presenting macrophages) in baseline conditions, while the abundance of recruited macrophages (e.g. Ccr2hiLy6chi macrophages, Cx3Cr1hi macrophages) is not altered (Fig. 2B-D). Second, transcriptomic analysis of bone marrow-derived macrophage clusters (e.g. Ccr2hiLy6chi macrophages, Cx3Cr1hi macrophages) and of monocytes revealed no differences in ∆FIRE compared to control mice. On the other hand, we found substantial transcriptome differences in clusters that were mainly of embryonic origin (e.g. homeostatic macrophages as well as antigen-presenting macrophages) (Fig. 2 and Fig. S4). These findings indicate that the ∆FIRE model mainly induces changes in embryo-derived macrophages.

      We agree with this reviewer that crossbreeding of ∆FIRE mice with Flt3CreRosa26eYFP mice would be of interest, and we have been working hard to establish this line. However, our breeding efforts have thus far been in vain, which is probably due to the necessity to keep a CBA/Ca background for the FIRE model (as reported by JAX: https://www.jax.org/strain/032783) and requires further backcrossing of Flt3CreRosa26eYFP mice with the respective CBA strain. In future work, we plan to carry out this experiment and also to specifically target monocyte-derived macrophages.

      The reviewer further asks about the modality to quantify cardiac macrophages, and suggests flow cytometry to quantify their number and not only use immunohistology. The quantification of cardiac immune cells shown in Fig. 1D (formerly 1C) was in fact performed by flow cytometry. We apologize for the lack of clarity. We rearranged the figure and added this information to the figure legend. We also added quantification by immunohistology, which is now shown in Fig. 1G.

      2) Limited adverse cardiac remodeling in ∆FIRE mice after I/R.

      The authors suggested an adverse cardiac remodeling in ∆FIRE mice. However, the relevance of a <5% reduction in ejection fraction/stroke volume within an overall normal range in ∆FIRE mice is questionable. Moreover, 6 days after I/R injury ∆FIRE mice were protected from the impairment in ejection fraction and had a smaller viability defect. Based on the data, a few questions may arise: Why was ablation of resident macrophages beneficial at earlier time points? Are recruited macrophages affected in ∆FIRE mice (see above)? Overall, the manuscript could benefit if the claim of an adverse remodeling in ∆FIRE mice were discussed more carefully.

      Underlying mechanisms:

      The study did not functionally evaluate targets from transcriptomics to provide further mechanistic insights. It would be helpful if the authors discussed potential mechanisms of the differential effects of macrophages after ischemia in more detail.

      Response from the authors: The reviewer raises the question why the ablation of resident macrophages trends towards a beneficial effect at earlier time points after I/R injury. Further, the reviewer questions the relevance of a <5% reduction in ejection fraction/stroke volume over time in the light of an otherwise modestly reduced ejection fraction.

      In this study we used the experimental mouse model of ischemia-reperfusion injury with transient (1h) coronary artery occlusion. The potential disadvantage of this model is the smaller infarct size and smaller effects on cardiac function. However, it better represents the clinical picture and pathology of myocardial infarction in human patients with timely reperfusion by percutaneous coronary intervention. Infarct size after I/R was approx. 25% in control animals, indicating relevant cardiac injury. Further, infarct size was reduced to approx. 16% in ∆FIRE mice 6 days after infarction; however, the difference did not reach statistical significance. In line with this, the ejection fraction was numerically reduced on d6 after infarction in the control group, however with no statistical significance. In the chronic phase after infarction, the ejection fraction improved over time in the control group by approx. 5% and decreased in ∆FIRE mice by 4%, which resulted in a difference (delta) of 9% change of ejection fraction. This indicated adverse remodeling in ∆FIRE mice.
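
      The delta quoted in this paragraph can be written out explicitly. This is only a restatement of the approximate values given in the text, assuming both changes are absolute percentage points of ejection fraction over the chronic phase; the symbols are introduced here for illustration.

```latex
\Delta\mathrm{EF}_{\text{control}} \approx +5, \qquad
\Delta\mathrm{EF}_{\Delta\text{FIRE}} \approx -4, \qquad
\Delta\mathrm{EF}_{\text{control}} - \Delta\mathrm{EF}_{\Delta\text{FIRE}}
  \approx (+5) - (-4) = 9 \ \text{percentage points}
```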

      We agree that the different impact of the absence of resident cardiac macrophages during the course of myocardial healing after injury is of great interest to the field. We discuss potential mechanisms of the differential effects of resident macrophage ablation in lines 290-314 in the revised manuscript. However, to decipher the influence of embryo-derived macrophages at different time points after infarction, an inducible model for specific depletion of this macrophage population would be necessary, which to our knowledge does not exist.

      In the revised manuscript, we now discuss the effects on cardiac healing in ∆FIRE and also the limitations more thoroughly.

      Other:

      • It is unclear why the authors performed RNAseq experiments 2 days after I/R (fig. 5/6), while the proposed functional phenotype occurred later.

      • A sample size of 2 animals per group appears very limited for RNAseq in ∆FIRE mice (fig. 6).

      Response from the authors: We chose a time point in the “late early phase” of myocardial infarction (= day 2 post I/R) as we were also interested in the effect of resident macrophage depletion on other immune cell subsets (e.g. neutrophils) which could only be captured in this time period.

      We aimed to analyse 10000 cells per condition. The applied sample size allowed us to analyse 13452 CD45+ cells from ∆FIRE mice and 9152 cells from control mice in infarct condition.

      Lines 299-324 "Ablation of resident macrophages altered macrophage crosstalk to non-macrophage immune cells, especially lymphocytes and neutrophils. This was characterized by a proinflammatory gene signature, such as neutrophil expression of inflammasome-related genes and a reduction in anti-inflammatory genes like Chil3 and Lcn2. Interestingly, inflammatory polarization of neutrophils has also been associated with poor outcome after ischemic brain injury (Cuartero et al, 2013). Clinical trials in myocardial infarction patients showed a correlation of inflammatory markers with the extent of myocardial damage (Sanchez, 2006) and with short- and long-term mortality (Mueller, 2002).

      Our study provides evidence that the absence of resident macrophages negatively influences cardiac remodeling in the late postinfarction phase in ∆FIRE mice indicating their biological role in myocardial healing. In the early phase after I/R injury, absence of resident macrophages had no significant effect on infarct size or LV function. These observations potentially indicate a protective role in the chronic phase after myocardial infarction by modulating the inflammatory response, including adjacent immune cells like neutrophils or lymphocytes.

      Deciphering in detail the specific functions of resident macrophages is of considerable interest but requires both cell-specific and temporally-controlled depletion of respective immune cells in injury, which to our knowledge is not available at present. These experiments could be important to tailor immune-targeted treatments of myocardial inflammation and postinfarct remodelling."

      Reviewer #1 (Recommendations For The Authors):

      1) Fetal-derived macrophages are often involved in organ development and function during steady-state. The authors should show heart morphology/function before I/R injury to make sure that the cause for a worsened outcome in FIRE mice is not due to a developmental/functional defect.

      Response from the author: We conducted a gross analysis of cardiac morphology by histology, and did not detect differences compared with littermate controls. However, we have not conducted a detailed investigation of cardiac development since this was not within the scope of this study. Further, our study mainly shows differences in cardiac healing between d6 and d30, which are unlikely to be influenced by developmental defects.

      2) Line 164: The authors state that they have analysed macrophages via flow cytometry, but Figure 4a only shows IF. Quantification of different macrophage subsets via flow cytometry should be included in this model.

      Response from the author: The sentence “To gain a deeper understanding of the inflammatory processes taking place in the infarcted heart, we quantified macrophage distribution by immunofluorescence and flow cytometry analysis of ischemic and remote areas after I/R.” beginning line 164 describes the entire figure 4 and not only 4a. Here we show IF as well as flow cytometry to describe numbers but also different subpopulations of macrophages (BM-derived vs. resident).

      3) Lines 254-255 (now starting 267): it is not entirely true that the heart does not harbor BM-derived macrophages under steady state. Of course, there are many more after I/R injury, but the authors should take also their own data into account (Figure 1c, e showing a clear reduction but not complete absence of macrophages) and not claim a "scarce" population. See also Dick et al (PMID: 30538339), where both, the Ccr2-Tim4- and Ccr2+ populations are (slowly) replaced by BM monocytes.

      Response from the author: We thank the reviewer for this comment. We changed “scarce population” to “small population”.

      4) Lines 269-273 (now starting line 283): The point that DT-mediated depletion of cells causes inflammation that may have an impact on macrophages is compelling. However, the approach of combining and correlating data from PLX diet and FIRE mice is not proof that the significant increase in infarct size and deterioration of left ventricular function after I/R injury is driven by monocyte-derived macrophages. The authors could use Ccr2KO mice or injection of Ly6C antibody to show the specific functions of recruited macrophages.

      Response from the author: In this study we combine a specific genetic depletion of resident macrophages (FIRE) with a pharmaceutical depletion of all macrophage populations (Csf1r inhibition with PLX5622). We did not aim to specifically deplete monocyte-derived macrophages, which has been addressed previously by Bajpai et al. (PMID: 30582448) using the CCR2-DTR mouse line. Addressing the functions of recruited macrophages would go beyond the scope of the manuscript.

      Along these lines: the authors discuss that neutrophils may have been targeted in the Ccr2-DTR model. However, the egress of neutrophils in the CCR2 KO model is not affected and should be a good model to look at the impact of monocyte-derived macrophages after I/R injury in the heart.

      Response from the author: We agree with the reviewer that CCR2 under steady state conditions might not be important for the egress of neutrophils. However, after ischemic injury CCR2-inhibition has been shown to impair neutrophil egress as well as neutrophil recruitment to ischemic tissue in an ischemia-reperfusion injury model (PMID: 28670376).

      5) Line 299 (now line 332): Reference is missing for Ccr2-DTR mice study

      Response from the author: We added the respective reference.

      6) Can the authors also take the timing of treatment/cell depletion into account in their discussion? Incoming monocytes may be required in the first days after injury to promote the regeneration process, so that targeting them before the onset of the injury may be detrimental while targeting them during the chronic phase may be beneficial.

      Response from the author: We thank the reviewer for this comment. We added the following sentence to the manuscript (Lines 343-346):

      “An explanation of this controversy might be the timing and duration of macrophage depletion. Bajpai et al. depleted recruited macrophages only in the initial phase of myocardial infarction which improved cardiac healing (Bajpai et al., 2019), while depletion of macrophages over a longer period of time, as shown in our study, is detrimental for cardiac repair.”

      7) Figure 6E, F: Why are the outgoing signals pooled? The data has the strength of distinguishing between distinct populations. This data should be used and exploited to work out distinct pathways of distinct macrophage populations in more detail. From the representation, it remains unclear which pathways are active and distinct between Ctrl and FIRE mice besides the few chosen ones (inflammasome). Also, legends are missing (what is red/blue?)

      Response from the author: We thank the reviewer for this comment. The aim of this analysis was to evaluate the effect of the FIRE ko on communication of immune cells in infarct conditions. To address changes in all populations which are affected by the FIRE ko we pooled the respective clusters (e.g. homeostatic, antigen-presenting and Ccr2loLy6clo Mø clusters). We provided the detailed analysis of the individual clusters in the new Supplemental Figure 9. Further, we added the respective legend to the Figure.

      8) The methods part mentioned CD169-DTR mice, however, there are no experiments shown in the manuscript. Further, how did the authors breed the FIRE mice? It is known in the field that they have big developmental issues and behavioural deficits if kept on a B6 background, which was likely the case in the study, at least for the fate-mapping approach.

      Response from the author: We removed the CD169-DTR reference from the methods part. FIRE mice were kept on a CBA/Ca background. As mentioned by the reviewer this was not the case for the experiment where reporter mice were bred with FIRE mice (Csf1rΔFIRE/+RankCreRosa26eYFP) as these mice are on a C57Bl6 background. All experiments evaluating cardiac function and outcome after infarction in FIRE mice were performed on mice kept with a CBA/Ca background.

      Reviewer #2 (Recommendations For The Authors):

      • Please provide the sample size for Fig. 5.

      We described the sample size in the methods part (lines 448-450: “Cell sorting was performed on a MoFlo Astrios (Beckman Coulter) to obtain cardiac macrophages from CD45.2; Mx1CreMybflox/flox after BM-transplantation of CD45.1 BM (n=3 for 2 days after I/R injury) for bulk sequencing,..“). We added the sample size also to the figure legend.

      • Please state in the methods how the normality of data was tested.

      We added the respective normality test to the methods part: “The Shapiro-Wilk test was used to test normality.”

      • How did the authors ensure a standardized infarct size?

      The authors ensured a standardized infarct size in mice following myocardial infarction through a carefully controlled experimental protocol. We employed the well-established I/R procedure for inducing myocardial infarction in mice by ligation of the LAD for 1h to mimic the transient blockage of blood flow to the anterior wall of the heart. Success of the ligation of the LAD and the induction of ischemia was confirmed by the pale color of the myocardium after ligation and the success of reperfusion by the return of color after removing the suture. The surgical technique was consistently performed by the same highly trained veterinarian in a blinded fashion to minimize variability.

    1. Author Response

      We are grateful to the three reviewers and the editors who have provided comments about our manuscript, "Formation of malignant, metastatic small cell lung cancers through overproduction of cMYC protein in TP53 and RB1 depleted pulmonary neuroendocrine cells derived from human embryonic stem cells."

      We are pleased that the reviewers recognized the importance of the problem we have addressed – namely, the need for better models of small cell lung cancer, a relatively common and refractory cancer. We also appreciate their acknowledgement of the significance of our major finding: that addition of an efficiently expressed CMYC transgene to neuroendocrine cells derived from human embryonic stem cells in which the RB1 and TP53 genes have been suppressed serves to drive aggressive growth and metastatic spread, rendering this system an appealing one for future studies of this recalcitrant cancer. Further, we acknowledge that more work needs to be done to more fully characterize and better understand the mechanistic features of this model system and to exploit it for therapeutic purposes.

      More specifically, we agree with the reviewers that this manuscript would be stronger if it included: (i) tests of other oncogenes, especially other members of the MYC gene family, to serve as drivers of tumor growth and metastasis and tests of orthotopic implantation of cells into the lung; (ii) descriptions of how such tumors with various genotypes respond to therapeutic approaches, both established and novel; and (iii) a more complete assessment of the contribution of abundant MYC proteins to physiological changes in tumor cells, such as growth, apoptosis, and invasion.

      While we wish we could provide such information, it is unrealistic to believe that it will be generated by the current constellation of authors in the foreseeable future. Data in the present manuscript has been generated over nearly five years, mostly in the early phases of that interval. Since then, some of us have moved from one institution to another, and some have shifted the focus of our studies. Further delays in publishing the main messages in this paper will only delay the pursuit of further studies, most likely by others. Indeed, one of the strongest justifications for the novel publication policies at eLife is to return control of the time for dissemination of results to the hands of the authors. Our situation illustrates the wisdom of that approach.

      We also note that the reviewers have raised a few issues that we aim to clarify by revisions of the current manuscript, thereby creating an improved Version of Record, within the next few weeks. We acknowledge here the significance of those issues and the ambiguities noted by the reviewers.

      The issues include the following point noted by more than one reviewer: our claim that expression of the CMYC oncogene increases the neuroendocrine character of the tumors. We recognize that this observation may be influenced by the nature of the analysis (single cell or bulk RNA sequencing), the choice of lineage markers (eg, NEUROD1 or ASCL1 or others), and the statistical evaluation of the data. We will review these aspects of the problem and make appropriate changes in the text to be submitted as the Version of Record.

      Reviewer 1 also makes a good point about the possible effects of CMYC on the differentiation of hESC-derived lung progenitors (LPs). In this paper, we examine this issue only in LPs in which the tumor suppressor genes, RB1 and TP53, have been suppressed. Further studies of the effect of CMYC on differentiation of LPs with various combinations of functional tumor suppressor genes might well prove valuable in exploring the origins of SCLC.

      Finally, we wish to note that a topic discussed by Reviewer 1 (and by us) about the still poorly understood relationship between cancer genotypes and cell lineages has been partially addressed in a paper from our group that has been accepted for publication in Science.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      1) A single biomarker seems very unlikely to be of much help in the detection of glaucoma due to the etiological heterogeneity of the disease, the existence of different subtypes, and the genetic variability among patients. Rather, a panel of biomarkers may provide more useful information for clinical prediction, including better sensitivity and specificity. The inclusion of additional metabolites already identified in the study, in combination, may provide more reliable and correct assignment results.

      The authors’ answer: Thank you for your comment. We recognize the constraints of using single biomarkers for diagnosis. In upcoming research, we aim to incorporate multiple biomarkers to improve diagnostic accuracy and will consider adding more metabolites as suggested.

      2) The number of samples in the supplementary phase is low, larger sample sizes are mandatory to confirm the diagnostic accuracy.

      The authors’ answer: Thank you for your comment. Collecting aqueous humor is invasive, making samples scarce. We acknowledge the small sample size limitation. In future studies, we plan to use larger samples to verify the biomarker's diagnostic accuracy. Your feedback emphasizes the need for thorough validation in our future research.

      3) Cohorts from different populations are needed to verify the applicability of this candidate biomarker.

      The authors’ answer: Thank you for the suggestion. We agree on the need to test the biomarker's relevance across varied populations. Reports from other groups will help confirm and broaden our results.

      4) Sex hormones seem to be associated also with other types of glaucoma, such as primary open-angle glaucoma (POAG), although the molecular mechanisms are unclear (see doi:10.1167/iovs.17-22708). The inclusion of patients diagnosed with other subtypes of glaucoma, like POAG, may contribute to determining the sensitivity and specificity of the proposed biomarker. Androstenedione levels should be determined in POAG, NTG, or PEXG patients.

      The authors’ answer: I agree with your comment and thank you for your suggestion. PACG is a major cause of irreversible blindness in Asians. While this study centers on PACG, the link between sex hormones and other glaucoma subtypes, like POAG, merits investigation. Future studies will include POAG and other subtypes to further assess androstenedione's diagnostic relevance.

      5) In addition, the levels of androstenedione were found significantly altered during other diseases as described by the authors or by conditions like polycystic ovary syndrome, limiting the utility of the proposed biomarker.

      The authors’ answer: Thank you for your advice. Androstenedione levels also change in conditions like polycystic ovary syndrome, which could affect the biomarker's specificity. We plan to further study androstenedione's unique changes in glaucoma versus other conditions to clarify its diagnostic value.

      6) Uncertainty of the androstenedione levels compromises its usefulness in clinical practice.

      The authors’ answer: The uncertainty surrounding androstenedione levels and its impact on clinical applicability is a valid concern. We plan to delve deeper into understanding the variability and determinants of androstenedione levels to better assess its clinical relevance.

      Reviewer #2 (Public Review):

      The "predict" part is on much less solid ground. The visual field progression and association with serum androstenedione within the current experimental design alludes to a correlation. It truly cannot be stated as predictive. To predict, one needs to add the substance where none is present and demonstrate that the desired endpoint is reached. Conversely, the substance (androstenedione) can be removed to show that the condition regresses. None of these are possible without model system experiments, which have not been done. The authors could put some additional details in the methods, such as: 1) how much sample was collected, 2) whether equal serum volume for analysis had equal serum proteins (or cells). They have used an LC-MS/MS and a Chemiluminescence method, but another independent method such as GC-MS/MS or NMR to detect androstenedione for a subset of patients with different stages of visual field defect would be desirable.

      The authors’ answer: We acknowledge your constructive critique concerning our use of the term "predict". In the present study, we elucidated a discernible correlation between visual field progression and serum androstenedione concentrations. We are cognizant of the critical distinction between correlation and causation, and we concur that our application of the term “predict” may have been overly assertive in this context.

      Your emphasis on the imperative of employing model system experiments to unequivocally ascertain causative relationships is well-received. The experimental approach of modulating the substance, androstenedione in this case, to empirically observe its consequential impact on the condition, is a pivotal direction that warrants exploration in subsequent research endeavors. With regard to the variability of serum protein concentrations across participants, we adopted a methodological standardization by ensuring that the analyzed serum volume remained consistent across samples. This was implemented to enhance the reliability and generalizability of our findings.

      Your recommendation to consider alternative detection methodologies, specifically GC-MS/MS or NMR, is duly noted. Although our choice of LC-MS/MS and Chemiluminescence was predicated on available resources, we recognize the scientific merit in leveraging multiple analytical techniques. In future investigations, we endeavor to incorporate a broader spectrum of detection methodologies for androstenedione, particularly when assessing patients with varied visual field defect stages, thereby bolstering the robustness and validity of our conclusions.

      Reviewer #1 (Recommendations for The Authors):

      1) POAG is the leading cause of irreversible blindness worldwide (see reference #4). The prevalence of PACG is highest in Asia, but the major form of glaucoma is still POAG. The authors should modify the abstract and background sections accordingly (see line 30 and lines 61-62).

      The authors’ answer: Thank you for your suggestion, and we apologize for this mistake. The sentence “Primary angle closure glaucoma (PACG) is the leading cause of irreversible blindness worldwide” has been changed to “Primary angle closure glaucoma (PACG) is the leading cause of irreversible blindness in Asia”. (Page 2, line 33; Page 3, lines 62-64)

      2) Line 69, please change the sentence "the He et al. taught us..." to the following "the He et al. study taught us.".

      The authors’ answer: Thank you for your comment. The sentence "the He et al. taught us..." has been changed to "the He et al. study taught us.". (Page 3, lines 72)

      3) I suggest including the name of the identified candidate biomarker in the title of the manuscript. The title must be straightforward.

      The authors’ answer: We agree with your comment and thank you for your suggestion. The sentence “Metabolomics Identifies and Validates Serum Novel Biomarker for Diagnosing Primary Angle Closure Glaucoma and Predicting the Visual Field Progression” has been changed to “Metabolomics Identifies and Validates Serum Androstenedione as Novel Biomarker for Diagnosing Primary Angle Closure Glaucoma and Predicting the Visual Field Progression”. (Page 1, lines 1)

      4) Line 88, please change "normal subjects" to "control individuals".

      The authors’ answer: Thank you for your comment. We have changed "normal subjects" to "control individuals”. (Page 4, lines 91)

      5) Line 95 and so on along the manuscript, avoid the term "normal controls" or "normal" and use only the term "controls".

      The authors’ answer: Thank you for your advice. "normal subjects" has been changed to "controls". (Page 4, line 113; Page 5, lines 118, 120, 124, 128, 133)

      6) In the participants section, indicate the ocular treatments of PACG patients. For example, on line 141, which "treatment" are you referring to?

      The authors’ answer: Thank you for your comment. We apologize for this vague statement. Treatment included medical treatment and surgical treatment. We have revised it in the manuscript. (Page 5, line 142)

      7) The entire section 2.4 is confusing. According to Figure S2, untargeted metabolomics was conducted with a mixed sample containing "all" serum extracts in order to obtain an in-house database with molecular features present in serum by LCHRMS. Then, this database was used for targeted metabolomics in individual serum samples using LCQQQ. However, as it is described in the manuscript, it seems that first, an untargeted metabolomics analysis was carried out to identify altered metabolites, then targeted metabolomics was carried out to validate the untargeted analysis and finally, a profiling analysis was carried out to construct the database. The workflow must be clearly discussed and amended to be understandable.

      The authors’ answer: Thank you for your comment. We have revised the description of the experimental method section 2.4. (Page 7, lines 195-198)

      8) Please, briefly explain what widely-targeted metabolomics is and how it works in this study (see section 2.4).

      The authors’ answer: Thank you for your comment. For widely targeted metabolome detection, a local database was first established from the standard database, and ion-pair information was obtained by scanning the mixed (QC) samples with Q-TOF. A wide range of metabolites was identified by comparison with the local self-built database, and the metabolites in each sample were then identified and quantified in MRM scanning mode on a triple quadrupole (QQQ) instrument. In this project, a database constructed from untargeted scanning against public databases was combined with the widely targeted local database to build a new database; the samples of this project were then scanned against this database with Q-TOF, and the metabolites of each sample were detected qualitatively and quantitatively in MRM mode. (Figure S2)

      9) On Table 1, indicate the number of patients and controls with cataracts.

      The authors’ answer: For the glaucoma group and the control group, we have excluded people with cataracts. This section is described in the inclusion and exclusion criteria for supplementary materials. (Inclusion and exclusion criteria)

      10) On "Sample processing" section, lines 152 and 153: Have you used cold methanol to ensure metabolic quenching? If not, how metabolite quenching was carried out?

      The authors’ answer: Thank you for your comment. We use cold methanol to extract metabolites, and the early blood samples have been stored in a -80°C refrigerator to ensure a low temperature process and ensure metabolic quenching. (Page 6, lines 196)

      11) On the same "Sample processing" section, have you used internal standards during metabolite extraction? If yes, which ones? If not, why?

      The authors’ answer: Thank you for your comment. In the metabolite extraction process, the same internal standard was added to each sample, and the same volume of serum (50 μL) was extracted from each. The name of the specific internal standard has been added in the "Sample processing" section. (Page 6, lines 153-155)

      12) Lines 161-163, I suggest including in the supplementary material the worklist of the entire experiment run by LC-MS, including analytical replicates and QCs.

      The authors’ answer: Thank you for your comment. Worklist for mass spectrometry can be found in supplementary sheet1. (Page 6, lines 165)

      13) The title of the section "Detection method" does not seem appropriate, please change it to "Analytical methods" or something similar.

      The authors’ answer: Thank you for your advice. "Detection method" has been changed to "Analytical methods". (Page 6, line 168)

      14) Section 2.4.1, I suggest changing "Untargeted detection conditions" to "Untargeted metabolomics analysis".

      The authors’ answer: Thank you for your comment. "Untargeted detection conditions" has been changed to "Untargeted metabolomics analysis". (Page 6, lines 169)

      15) Lines 170-172, the column used is compatible with 100% water, why start with 5% acetonitrile?

      The authors’ answer: Thank you for your comment. If the acetonitrile starting gradient is 0, it will cause a lot of water-soluble substances to elute and easily clog the column, so we want to use 5% organic phase.

      16) Section 2.4.1, the chromatographic conditions (mobiles phases) were the same in both positive and negative ion mode? It is desirable to change or adjust a basic pH when working in negative, so please amend and clarify it.

      The authors’ answer: Thank you for your comment. In the negative ion mode, the peak shape of the chromatogram under the acidic system is better than that under the alkaline system, so we choose the acidic system.

      17) I am not able to clearly understand what is "widely targeted conditions" (see section 2.4.2). What is the difference with the conventional targeted metabolomics analysis? In my view, widely-targeted metabolomics refers to the combination of untargeted metabolomics and targeted metabolomics. This must be clarified and simplified.

      The authors’ answer: Thank you for your suggestion. The characterization of metabolites in this study was conducted using a non-targeted database and a self-built database. Non-targeted metabolites were characterized with mixed samples and then combined with the laboratory's self-established database to form a new metabolome database for this study. In section 2.4.2, "widely targeted" refers to the use of the MWDB standard self-built database to characterize metabolites, followed by the QQQ MRM mode to quantify them. To describe the detection process clearly, this part of the methods has been revised. (Figure S2)

      18) Line 199, please, indicate the normalization carried out.

      The authors’ answer: We agree with your comment and thank you for your suggestion. The description of the normalization was missing from the data processing steps and has been added to the manuscript. (Page 7, line 203)

      19) How many instrumental replicates have you carried out both in untargeted and targeted metabolomics? Please, indicate it.

      The authors’ answer: Thank you for your advice. In this project, a mixture of all samples was used as the QC sample, which was injected repeatedly during testing (one QC sample was inserted between every 10 samples). The correlation between repeated QC injections was more than 99%, which ensured the stability of sample testing. (Sheet1)
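
      The QC scheme described in this answer (one pooled QC injection between every 10 samples) can be sketched as follows. This is only an illustration: the function name `build_worklist`, the sample labels, and the choice not to append a trailing QC after the final block are assumptions, not details taken from the authors' worklist.

```python
def build_worklist(samples, qc_label="QC", interval=10):
    """Return an injection order with one QC entry between blocks of `interval` samples.

    Hypothetical helper: the real worklist (supplementary Sheet1) may differ,
    e.g. by adding conditioning or trailing QC injections.
    """
    worklist = []
    for position, sample in enumerate(samples, start=1):
        worklist.append(sample)
        # Insert a QC after every full block, but not after the very last sample.
        if position % interval == 0 and position < len(samples):
            worklist.append(qc_label)
    return worklist


# Example with 25 made-up sample labels: QCs land after S10 and S20.
example = build_worklist([f"S{i}" for i in range(1, 26)])
```

      With 25 samples this yields 27 injections, with the pooled QC at the 11th and 22nd positions.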

      20) Line 267, why did you select a fold changes threshold greater than 1.15 (or lower 0.85)? In metabolomics, it would be desirable to have a minimum of 1.5-fold change considering the variability of data.

      The authors’ answer: Thank you for your comment. We chose a lower FC threshold to expand the pool of potential candidate metabolites, and the resulting candidates could be reproduced in three batches. The screening threshold follows the method in "Blood metabolomics uncovers inflammation-associated mitochondrial dysfunction as a potential mechanism underlying ACLF".
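
      To make the threshold discussion concrete, the sketch below applies both cut-offs to the group means quoted elsewhere in this response (discovery set 1: controls 33987, PACG 42852; discovery set 2: controls 31559, PACG 37934). It assumes that fold change is the ratio of group means and that the lower bound matching a 1.5-fold cut-off is 1/1.5; neither definition is stated in the response, and the function names are hypothetical.

```python
def fold_change(case_mean, control_mean):
    """Fold change as a simple ratio of group means (an assumed definition)."""
    return case_mean / control_mean


def passes_fc_threshold(fc, upper=1.15, lower=0.85):
    """True if the fold change lies outside the [lower, upper] band."""
    return fc > upper or fc < lower


# Group means for androstenedione as quoted in the response to point 27.
fc_set1 = fold_change(42852, 33987)  # discovery set 1, roughly 1.26
fc_set2 = fold_change(37934, 31559)  # discovery set 2, roughly 1.20

for name, fc in [("discovery set 1", fc_set1), ("discovery set 2", fc_set2)]:
    lenient = passes_fc_threshold(fc)  # 1.15 / 0.85 band used in the study
    strict = passes_fc_threshold(fc, upper=1.5, lower=1 / 1.5)  # reviewer's 1.5-fold
    print(f"{name}: FC={fc:.3f}, passes 1.15 band={lenient}, passes 1.5-fold={strict}")
```

      Under these assumptions, androstenedione passes the 1.15 band in both discovery sets but would not pass a 1.5-fold cut-off, which is consistent with the authors' stated reason for relaxing the threshold.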

      21) To include anywhere the molecular formula of androstenedione.

      The authors’ answer: I agree with your comment and thank you for your suggestion. We have added the molecular formula of androstenedione to the supplementary material. (Page 17, lines 475)

      22) Line 290 is not Figure 4B and 4C, you may refer to Figure 3B and 3C.

      The authors’ answer: Thank you for your advice. We apologize for this mistake. Figure 4B and 4C have been changed to Figure 3B and 3C.

      23) Figure S3 was lost from Supplementary material, please include it.

      The authors’ answer: Thank you for your comment. We apologize for this mistake. There was an error in the ordering of the supplementary figures. Figure 3 was redundant, and we have corrected this in the supplementary materials.

      24) Figure 4 B, indicate in the text the average and uncertainty of androstenedione levels in both control and PACG groups.

      The authors’ answer: Thank you for your comment. In the manuscript, we have added the mean ± standard deviation of androstenedione levels in the control group and the disease group. (Page 11, lines 311-312)

      25) Section 3.6. please include the average and uncertainty of androstenedione levels in males and females in both control and PACG groups.

      The authors’ answer: Thank you for your advice. For 3.6 section, we supplemented the mean ± standard deviation of androstenedione levels in the control and disease groups. (Page 13, lines 350-356)

      26) Figure S9 seems missing.

      The authors’ answer: Thank you for your comment. We apologize for this mistake. Figure S9 has been added to the Supplementary material.

      27) Lines 345-346, indicate the levels obtained for the metabolite in the compared groups.

      The authors’ answer: Thank you for your suggestion. The levels of androstenedione in each group are seen in “The results from both discovery set 1 (Figure S9A, Mild:32600±17011, Moderate:33215±17855, Severe:46060±21789) and discovery set 2 (Figure S9B, Mild:27866±19873, Moderate:27057±13166, Severe:43972±19234) indicated that the mean serum androstenedione levels were significantly higher in the severe PACG group compared to the moderate and mild PACG groups (P<0.001). These findings were further validated in both validation phase 1 (Figure S9C, Mild:75726±45719, Moderate:65798±30610, Severe:94348±30858) and validation phase 2 (Figure S9D, Mild:1.121±0.3143 ng/ml, Moderate:1.461±0.4391 ng/ml, Severe:2.147±0.6476 ng/ml).” and “Notably, the level of androstenedione was found to be significantly higher in PACG patients than in normal subjects in both discovery set 1 (Figure 4B, P=0.0081, Normal:33987±11113, PACG:42852±20767) and discovery set 2 (Figure 4C, P=0.0078, Normal:31559±10975, PACG:37934±18529).”

      28) Line 368, you don't need to indicate the PACG abbreviation again.

      The authors’ answer: Thank you for your comment. We apologize for this mistake. We have changed "patients with PACG" to "patients". (Page 13, line 377)

      29) Figure 6, panels A and B are not labeled (i.e., commented) in the body text of the manuscript.

      The authors’ answer: Thank you for your suggestion. We’re very sorry for this mistake. Figure 6, panels A and B have been labeled in the manuscript. (Page 13, lines 377-379)

      30) Section 3.7., when you indicate "after therapy" are you referring to surgical treatment? Please, clarify.

      The authors’ answer: Thank you for your comment. We apologize for this vague statement. Blood samples were taken before and three months after surgery. “therapy” has been changed to “surgical treatment” in the manuscript. (Page 13, line 377)

      31) Line 370, "97th patient" should be replaced by "nine patients"?

      The authors’ answer: Thank you for your advice. We apologize for this mistake. "97th patient" has been changed to "nine patients". (Page 13, lines 378-379)

      32) Lines 370-372, it is difficult to understand, please clarify why these findings indicate that severity is related to increased PACG according to Figure 6B.

      The authors’ answer: Thank you for your comment. We’re very sorry for this vague statement. The sentence “These findings showed that the levels of androstenedione that were tightly connected with PACG severity rose dramatically as PACG progressed.” has been removed.

      33) Line 447, the word "corrected" should be changed to "correlated"?

      The authors’ answer: Thank you for your comment. "corrected" has been changed to "correlated". (Page 16, lines 453,456)

      34) According to the literature, the levels found in control subjects are within the range of the "normal" values, i.e., are they comparable?

      The authors’ answer: Thank you for your advice. Androstenedione ranges from 0.4 to 2 in the normal population. The mean ± standard deviation of androstenedione in the normal population was 1.552 ± 0.4859.

      35) Lines 471-474, why "steroid hormone biosynthesis appears to be the critical node to high-match PACG pathophysiological concepts" while the high enrichment was observed in the "metabolic pathways"?

      The authors’ answer: Metabolic pathways encompass a series of chemical reactions within a cell that enable the synthesis or breakdown of molecules to maintain the cell's energy balance. Steroid hormone biosynthesis is one of these metabolic pathways, and its products, steroid hormones, participate in a wide range of physiological processes, including metabolism, immune response, and the regulation of inflammation. In a different context, a study related to fatigue during Androgen Deprivation Therapy (ADT) showed a significant difference in metabolite levels within the steroid hormone biosynthesis pathways, emphasizing the role these pathways play in metabolic alterations. The mentioned findings suggest that steroid hormone biosynthesis and metabolic pathways are intertwined. (Page 17, lines 481-488)

      36) Figure S13 and Figure S14A are the same.

      The authors’ answer: Thank you for your comment. Figure S14A has been removed.

      37) On lines 476-485, it would be interesting to discuss whether alterations of this metabolite could be a cause or consequence of PACG.

      The authors’ answer: Based on the literature found, androstenedione is a naturally occurring steroid hormone produced by the gonads and adrenal glands, and serves as an intermediate in testosterone biosynthesis (Androstenedione (a Natural Steroid and a Drug Supplement): A Comprehensive Review of Its Consumption, Metabolism, Health Effects, and Toxicity with Sex Differences). Early events in the pathobiology of glaucoma involve oxidative, metabolic, or mechanical stress acting on retinal ganglion cells (RGCs), leading to their rapid release of danger signals such as extracellular ATP, thus triggering microglial and macroglial activation as well as neuroinflammation (Immune Responses in the Glaucomatous Retina: Regulation and Dynamics). However, one might speculate that since androstenedione is a steroid hormone, it could potentially impact the inflammatory and metabolic stress observed in the pathophysiological processes of glaucoma (Adaptive responses to neurodegenerative stress in glaucoma). Metabolic and anti-inflammatory avenues might be crucial in understanding the relationship between alterations in androstenedione levels and the severity of glaucoma. Nevertheless, more research and literature analysis would be necessary to better understand the precise relationship and its underlying mechanisms between these two entities.

      38) I suggest sending the MS and MS/MS into a publicly available repository.

      The authors’ answer: Thank you for your suggestion. Further research will necessitate the utilization of the raw mass spectrometry data. We anticipate making this raw data available in a public repository upon the conclusion of subsequent experiments.

      Reviewer #2 (Recommendations for The Authors):

      The authors should aim to describe methods in greater detail.

      The authors could improve the writing to accurately describe their results and their interpretation and state what else could be done to make the result truly "predictive".

      The authors’ answer: (1) Detail Enhancement in the Methods section: We have expanded the description of methods such as sample pre-processing, mass spectrometry detection, and result analysis to provide more detailed information about the procedures, equipment, and materials used. (2) Improvement in Writing Quality: We have engaged a scientific editor to review our manuscript for clarity, coherence, and consistency to ensure that the results and interpretations are accurately and clearly conveyed. Terminologies and phrases have been revised to better reflect the findings and interpretations. (3) Limitation supplement: We have included a discussion on the limitations of our study and suggested additional studies and analyses that could be conducted to enhance the predictive value of our findings. We sincerely appreciate the constructive feedback from the reviewer, which has greatly contributed to improving the quality and rigor of our manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Issue 1: The relevance is somewhat unclear. High cysteine levels can be achieved in the laboratory, but, is this relevant in the life of C. elegans? Or is there physiological relevance in humans, e.g. a disease? The authors state "cells and animals fed excess cysteine and methionine", but is this more than a laboratory excess condition? SUOX nonfunctional conditions in humans don't appear to tie into this, since, in that context, the goal is to inactivate CDO or CTH to prevent sulfite production. The authors also mention cancer, but the link to cysteine levels is unclear. In that sense, then, the conditions studied here may not carry much physiological relevance.

      Response 1: We set out to answer a fundamental question: what pathways regulate the function of cysteine dioxygenase, a highly conserved enzyme in sulfur amino acid metabolism? In an unbiased genetic screen that sampled millions of EMS generated mutations across all ~20,000 C. elegans genes, we discovered loss of function/null mutations in egl-9 and rhy-1, two negative regulators of the hypoxia inducible transcription factor (hif-1). Genetic ablation of the egl-9 or rhy-1 loci is likely not relevant to the life of a C. elegans animal, i.e. this is not representative of a natural state. Yet, this extreme genetic intervention has taught us a new fundamental truth about the interaction between EGL-9/RHY-1, HIF-1, and the transcriptional activation of cdo-1. Similarly, the high cysteine levels used in our assays may or may not be representative of a state in nature; we do not know (nor do we make any claims about the environmental relevance of our choice of cysteine concentrations). It seems very plausible that pathological states exist where cysteine concentrations may rise to levels comparable to those in our experimental system. More importantly, we started with cysteine levels in excess of physiology to elicit a clear response that we can study in the lab. Similar strategies established the cysteine-induction phenotype of CDO1 in mammalian systems. For instance, in Kwon and Stipanuk 2001, hepatocytes are cultured in media supplemented with 2mmol/L cysteine to promote a ~4-fold increase in CDO1 mRNA.

      Issue 2: The pathway is described as important for cysteine detoxification, which is described to act via H2S (Figure 6). Much of that pathway has already been previously established by the Roth, Miller, and Horvitz labs as critical for the H2S response. While the present manuscript adds some additional insight such as the additional role of RHY-1 downstream of HIF-1 in promoting toxicity, this study therefore mainly confirms the importance of a previously described signalling pathway, essentially adding a new downstream target rhy-1 -> cysl-1 -> egl-9 -> hif-1 -> sqrd-1/cdo-1. The impact of this finding is reduced by the fact that cdo-1 itself isn't actually required for survival in high cysteine, suggesting it is merely a marker of the activity of this previously described pathway.

      Response 2: We agree that the primary impact of our manuscript is the establishment of a novel intersection between the H2S-sensing pathway (largely worked out by Roth, Miller, and Horvitz) and our gene of interest, cysteine dioxygenase. We believe that the connection between these two pathways is exciting as it suggests a logical homeostatic circuit. High cysteine yields enzymatically produced H2S. This H2S may then act as a signal promoting HIF-1 activity (via RHY-1/CYSL-1/EGL-9). High HIF-1 activity increases cdo-1 transcription and activity promoting the degradation of the high-cysteine trigger. As pointed out by the reviewer, cdo-1(-) loss of function alone does not cause cysteine sensitivity at the concentrations tested. Given that cysl-1(-) and hif-1(-) mutants are exquisitely sensitive to high levels of cysteine, we propose that HIF-1 activates the transcription of additional genes that are required for high cysteine tolerance. However, our genetic data show that cdo-1 is more than simply a marker of HIF-1 transcription. Our genetic data in Table 1 demonstrate that HIF-1 activation (caused by egl-9(-)) is sufficient to cause severe sickness in a suox-1 hypomorphic mutant which cannot detoxify sulfites, a critical product of cysteine catabolism. This severe sickness can be reversed by inactivating hif-1, cth-2, or cdo-1. These data demonstrate a functional intersection between the established H2S-sensing pathway and cysteine catabolism governed by cdo-1.

      Reviewer #2 (Public Review):

      Issue 3: First, the authors show that the supplementation of exogenous cysteine activates cdo-1p::GFP. Rather than showing data for one dose, the author may consider presenting dose-dependency results and whether cysteine activation of cdo-1 also requires HIF-1 or CYSL-1, which would be important data given the focus and major novelty of the paper in cysteine homeostasis, not the cdo-1 regulatory gene pathway.

      Response 3: We agree with the reviewer and have performed the suggested dose-response curve for expression of Pcdo-1::GFP in wild-type C. elegans. We observe substantial activation of the Pcdo-1::GFP transcriptional reporter beginning at 100µM supplemental cysteine (Figure 3C). Higher doses of cysteine do not elicit a substantially stronger induction of the Pcdo-1::GFP reporter. Thus, we find that 100µM supplemental cysteine strikes the right balance between strongly inducing the Pcdo-1::GFP reporter while not inducing any toxicity or lethality in wild-type animals (Figure 3E).

      We further agree that testing for induction of the Pcdo-1::GFP reporter in a hif-1(-) or cysl-1(-) mutant background is a critical experiment. However, we have not been able to identify a cysteine concentration that induces Pcdo-1::GFP and is not 100% lethal for hif-1(-) or cysl-1(-) mutant C. elegans. The remarkable sensitivity of hif-1(-) or cysl-1(-) mutant C. elegans to supplemental cysteine demonstrates the critical role of these genes in promoting cysteine homeostasis. Because of this lethality, we could not assay the Pcdo-1::GFP reporter in the hif-1(-) or cysl-1(-) mutant animals; nevertheless, the lethality caused by excess cysteine demonstrates that this cysteine response is salient. To get at how cysteine might be interacting with the HIF-1-signaling pathway, we performed new additivity experiments by supplementing 100µM cysteine to wild type, egl-9(-), and rhy-1(-) mutant C. elegans expressing the Pcdo-1::GFP reporter. Surprisingly, we found that cysteine had no significant impact on Pcdo-1::GFP expression in an egl-9(-) mutant background but significantly increased the Pcdo-1::GFP expression in a rhy-1(-) background (Figure 3A,B). These data suggest that cysteine acts in a pathway with egl-9 and in parallel to rhy-1. These data have been incorporated into Figure 3A,B and are included in the Results section of the manuscript.

      Issue 4: While the genetic manipulation of cdo-1 regulators yields much more striking results, the effect size of exogenous cysteine is rather small. Does this reflect a lack of extensive condition optimization or robust buffering of exogenous/dietary cysteine? Would genetic manipulation to alter intracellular cysteine or its precursors yield similar or stronger effect sizes?

      Response 4: We agree that the induction of the Pcdo-1::GFP reporter by supplemental cysteine is not as dramatic as the induction caused by the egl-9 or rhy-1 null alleles. We believe our Response 3 and new Figure 3C demonstrate that this phenomenon is not due to lack of condition optimization, but likely reflects some biology. As pointed out by the reviewer, C. elegans likely buffers exogenous cysteine and this (perhaps) prevents the impressive Pcdo-1::GFP induction observed in the egl-9(-) and rhy-1(-) mutant animals. We have now mentioned this possible interpretation in the Results section. Furthermore, we like the idea of using genetic tricks to promote cysteine accumulation within C. elegans cells and tissues and will consider these approaches in future studies.

      Issue 5: Second, there remain several major questions regarding the interpretation of the cysteine homeostasis pathway. How much specificity is involved for the RHY-1/CYSL-1/EGL-9/HIF-1 pathway to control cysteine homeostasis? Is the pathway able to sense cysteine directly or indirectly through its metabolites or redox status in general? Given the very low and high physiological concentrations of intracellular cysteine and glutathione (GSH, a major reserve for cysteine), respectively, there is a surprising lack of mention and testing of GSH metabolism.

      Response 5: Future studies are required to determine the specificity of the RHY-1/CYSL-1/EGL-9/HIF-1 pathway for the control of cysteine homeostasis. Our proposed mechanism, that H2S activates the HIF-1 pathway, is based largely on the work of the Horvitz lab (Ma et al. 2012). They demonstrate that H2S promotes a direct inhibitory interaction between CYSL-1 and EGL-9, leading to activation of HIF-1. These findings align nicely with our genetic and pharmacological data. However, our work does not provide direct evidence as to the cysteine-derived metabolite that activates HIF-1. We propose H2S as a likely candidate.

      We have added a note to the introduction regarding the role of GSH as a reservoir of excess cysteine and agree that future studies might find interesting links between CDO-1, GSH metabolism, and HIF-1.

      Issue 6: In addition, what are the major similarities and differences of cysteine homeostasis pathways between C. elegans and other systems (HIF dependency, transcription vs post-transcriptional control)? These questions could be better discussed and noted with novel findings of the current study that are likely C. elegans specific or broadly conserved.

      Response 6: We have included a new section in the Discussion highlighting the nature of mammalian CDO1 regulation. We propose the hypothesis that a homologous pathway to the C. elegans RHY-1/CYSL-1/EGL-9/HIF-1 pathway might operate in mammalian cells to sense high cysteine and induce CDO1 transcription. Importantly, all proteins in the C. elegans pathway have homologous counterparts in mammals. However, this hypothesis remains to be tested in mammalian systems.

      Reviewer #3 (Public Review):

      Major weaknesses of the paper include:

      Issue 7: the over-reliance on genetic approaches.

      Response 7: This is a fair critique. Our expertise is genetics. Our philosophy, which the reviewers may not share, is that there is no such thing as too much genetics!

      Issue 8: the lack of novelty regarding prolyl hydroxylase-independent activities of EGL-9.

      Response 8: We believe the primary novelty of our work is establishing the intersection between the H2S-sensing HIF-1 pathway and cysteine catabolism governed by cysteine dioxygenase. Our demonstration that cdo-1 regulation operates largely independent of VHL-1 and EGL-9 prolyl hydroxylation is a mechanistic detail of this regulation and not the critical new finding, although we believe it does suggest where pathway analyses should be directed in the future. We also believe that our homeostatic feedback model for the regulation of HIF-1 (and cdo-1) by cysteine-derived H2S is new and exciting and provides insight into the logic of why HIF-1 might respond to H2S and promote the activity of cdo-1. Our work suggests that one reason for this intersection of hif-1 and cdo-1 is to sense and maintain cysteine homeostasis when cysteine is in excess.

      Issue 9: the lack of biochemical approaches to probe the underlying mechanism of the prolyl hydroxylase-independent activity of EGL-9.

      Response 9: While not the primary focus of our current manuscript, we agree that this is an exciting area of future research. To uncover the prolyl hydroxylase-independent activity of EGL-9, we agree that a combination of approaches will be required, including biochemical, structure-function, and genetic approaches.

      Major Issues We Feel the Authors Should Address:

      Issue 10: One particularly glaring concern is that the authors really do not know the extent to which the prolyl hydroxylase activity is (or is not) impacted by the H487A mutation in egl-9(rae276). If there is a fair amount of enzymatic activity left in this mutant, then it complicates interpretation. The paper would be strengthened if the authors could show that the egl-9(rae276) eliminates most if not all prolyl hydroxylase activity. In addition, the authors may want to consider doing RNAi for egl-9 in the egl-9(rae276) mutant as a control, as this would support the claim that whatever non-hydroxylase activity EGL-9 may have is indeed the causative agent for the elevation of CDO-1::GFP. Without such experiments, readers are left with the nagging concern that this allele is simply a hypomorph for the single biochemical activity of EGL-9 (i.e., the prolyl hydroxylase activity) rather than the more interesting, hypothesized scenario that EGL-9 has multiple biochemical activities, only one of which is the prolyl hydroxylase activity.

      Response 10: We have two lines of evidence that suggest the egl-9(rae276)-encoded H487A variant eliminates prolyl hydroxylase activity. First, Pan et al. 2007 (reference 57) demonstrate that when the equivalent histidine (H313) is mutated in the human protein, that protein lacks detectable prolyl hydroxylase activity. Second, egl-9(rae276) and the vhl-1 null allele, ok161, cause similar phenotypes. Both alleles cause nearly identical activation of the Pcdo-1::GFP reporter transgene (Fig. 5C,D), and similarly impact the growth of the suox-1(gk738847) hypomorphic mutant (Table 1). This phenotypic overlap is highly relevant as the established role of VHL-1 is to recognize the hydroxyl mark conferred by the EGL-9 prolyl hydroxylase domain and promote the degradation of HIF-1. If EGL-9[H487A] had residual prolyl hydroxylase activity, we would expect the vhl-1(-) null mutant C. elegans to display more dramatic phenotypes than their egl-9(rae276) counterparts. This is not the case.

      Issue 11: The authors observed that EGL-9 can inhibit HIF-1 and the expression of the HIF-1 target cdo-1 through a combination of activities that are (1) dependent on its prolyl hydroxylase activity (and subsequent VHL-1 activity that acts on the resulting hydroxylated prolines on HIF-1), and (2) independent of that activity. This is not a novel finding, as the authors themselves carefully note in their Discussion section, as this odd phenomenon has been observed for many HIF-1 target genes in multiple publications. While this manuscript adds to the description of this phenomenon, it does not really probe the underlying mechanism or shed light on how EGL-9 has these dual activities. This limits the overall impact and novelty of the paper.

      Response 11: See response to Issue #8.

      Issue 12: Cysteine dioxygenases like CDO-1 operate in an oxygen-dependent manner to generate sulfites from cysteine. CDO-1 activity is dependent upon availability of molecular oxygen; this is an unexpected characteristic of a HIF-1 target, as its very activation is dependent on low molecular oxygen. Authors neither address this in the text nor experimentally, and it seems a glaring omission.

      Response 12: We agree this is an important point to raise within our manuscript. However, despite its induction by HIF-1, there is no evidence that cdo-1 transcription is induced by hypoxia. In fact, in a genome-wide transcriptomic study, cdo-1 was not found to be induced by hypoxia in C. elegans (Shen et al. 2005, reference 71).

      We have newly commented on the use of molecular oxygen as a substrate by both EGL-9 and CDO-1 in our Discussion section. The mammalian oxygen-sensing prolyl hydroxylase (EGLN1) has been demonstrated to have a high Km value for O2 (high µM range). This likely allows EGLN1 to be poised to respond to small decreases in cellular oxygen from normal oxygen tensions. Clearly, CDO-1 also requires oxygen as a substrate; however, the Km of CDO-1 for O2 is likely to be much lower, preventing sensitivity of cysteine catabolism to physiological decreases in O2 availability. To our knowledge, however, the CDO1 Km value for O2 has not been experimentally determined. We have added a new Discussion section where we address the conundrum about low oxygen inducing HIF-1 but oxygen being needed by CDO-1/CDO1.

      Issue 13: The authors determined that the hypodermis is the site of the most prominent CDO-1::GFP expression, relevant to Figure 4. This claim would be strengthened if a negative control tissue, in the animal with the knockin allele, were shown. The hypodermal specific expression is a highlight of this paper, so it would make this article even stronger if they could further substantiate this claim.

      Response 13: Our claim that the hypodermis is the critical site of cdo-1 function is based on i) our hands-on experience looking at Pcdo-1::GFP, Pcdo-1::CDO-1::GFP, and CDO-1::GFP (encoded by cdo-1(rae273)) and our reporting of these expression patterns in multiple figures throughout the manuscript, and ii) the functional rescue of cdo-1(-) phenotypes by a cdo-1 rescue construct expressed by a hypodermal-specific promoter (col-10). We agree that providing negative control tissues would modestly improve the manuscript. However, we do not think that adding these controls will substantially alter the conclusions of the paper. Importantly, we acknowledge this limitation of our work with the sentence, “However, we cannot exclude the possibility that CDO-1 also acts in other cells and tissues as well.”

      Minor issues to note:

      Issue 14: Mutants for hif-1 and cysl-1 are sensitive to exogenous cysteine levels, yet loss of CDO-1 expression is not sufficient to explain this phenomenon, suggesting other targets of HIF-1 are involved. Given the findings the authors (and others) have had showing a role for RHY-1 in sulfur amino acid metabolism, shouldn't the authors consider testing rhy-1 mutants for sensitivity to exogenous cysteine?

      Response 14: To test the hypothesis that rhy-1(-) C. elegans might be sensitive to supplemental cysteine, we cultured wild type and rhy-1(-) animals on 0, 100, and 1000µM supplemental cysteine. At 0 and 100µM supplemental cysteine, neither wild-type nor rhy-1(-) animals display any lethality, suggesting rhy-1 is not required for survival in the face of excess cysteine (Fig. 3D,E). We also cultured these same strains on 1000µM supplemental cysteine, a concentration that is highly toxic to wild-type animals (100% lethality). rhy-1(-) animals were resistant to 1000µM supplemental cysteine with a substantial fraction of the population surviving overnight exposure to this lethal dose of cysteine. Similarly, egl-9(-) mutant C. elegans were also resistant to 1000µM supplemental cysteine. We propose that loss of egl-9 or rhy-1 activates HIF-1-mediated transcription, which primes these mutants to cope with the lethal dose of cysteine. These data are now presented in Figure 3D-F and described in the Results section.

      Issue 15: The cysteine exposure assay was performed by incubating nematodes overnight in liquid M9 media containing OP50 culture. The liquid culture approach adds two complications: (1) the worms are arguably starving or at least undernourished compared to animals grown on NGM plates, and (2) the worms are probably mildly hypoxic in the liquid cultures, which complicates the interpretation.

      Response 15: We agree that it is possible that animals growing overnight in liquid culture are undernourished and mildly hypoxic. However, we are confident in our data interpretation as all our experiments are appropriately controlled: control and experimental groups were all grown under the same liquid culture conditions. Thus, these animals would all experience the same stressors that come with liquid culture. Importantly, we never make comparisons between groups that were grown under different culture conditions (i.e. solid media vs. liquid culture).

      Issue 16: An easily addressable concern is the wording of one of the main conclusions: that cdo-1 transcription is independent of the canonical prolyl hydroxylase function of EGL-9 and is instead dependent on one of EGL-9's non-canonical, non-characterized functions. There are several points in which the wording suggests that CDO-1 toxicity is independent of EGL-9. In their defense, the authors try to avoid this by saying, "EGL-9 PHD," to indicate that it is the prolyl hydroxylase function of EGL-9 that is not required for CDO-1 toxicity. However, this becomes confusing because much of the field uses PHD and EGL-9/EGLN as interchangeable protein names. The authors need to be clear about when they are describing the prolyl hydroxylase activity of EGL-9 rather than other (hypothesized) activities of EGL-9 that are independent of the prolyl hydroxylase activity.

      Response 16: We appreciate the reviewer alerting us to this practice within the field. To avoid confusion, we have removed the “PHD” abbreviation from our manuscript and explicitly referred to the “prolyl hydroxylase domain” where relevant.

      Issue 17: The authors state in the text, "the egl-9; suox-1 double mutants are extremely sick and slow growing." We appreciate that their "health" assay, based on the exhaustion of food from the plate, is qualitative. We also appreciate that it is a functional measure of many factors that contribute to how fast a population of worms can grow, reproduce, and consume that lawn of food. However, unless they do a lifespan assay and/or measure developmental timing and specifically determine that the double mutant animals themselves are developing and/or growing more slowly, we do not think it is appropriate to use the words "slow growing" to describe the population. As they point out, the rate of consumption of food on the plate in their health assay is determined by a multitude and indeed a confluence of factors; the growth rate is one specific one that is commonly measured and has an established meaning.

      Response 17: We see how the phrase ‘slow growing’ might imply a phenotype that we have not actually assessed with this assay. Therefore, we have removed all claims about “slow growth” of the strains presented in Table 1 and have highlighted the assay more overtly in the Results section. For example: “While egl-9(-) and suox-1(gk738847) single mutant animals are healthy under standard culture conditions, the egl-9(-); suox-1(gk738847) double mutant animals are extremely sick and require significantly more days to exhaust their E. coli food source under standard culture conditions (Table 1).”

      Reviewer #1 (Recommendations For The Authors):

      Issue 18: Relevance could be addressed further in the text.

      Response 18: We have added additional context for our work in the Discussion section. Please see our responses to Issues #5, 6, 12, and 24.

      Issue 19: Better appreciation and integration of the manuscript's findings with published studies would be appropriate.

      Response 19: We have added additional context for our work in the Discussion section. Please see our responses to Issues #5, 6, 12, and 24.

      Issue 20: It might be perhaps relevant to test whether cdo-1 is relevant for hypoxia resistance since it appears to be a key target for hif-1.

      Response 20: We agree that this is an interesting future direction; however, given that cdo-1 mRNA is not induced by hypoxia (Shen et al. 2005), we have not prioritized these experiments for the current manuscript.

      Issue 21: "egl-9 inhibits cdo-1 transcription in a prolyl-hydroxylase and VHL-1-independent manner" should be tempered. vhl-1 mutants and egl-9 hydroxylase point mutant still have significant induction of the reporter.

      Response 21: Thank you for identifying this oversight. We have modified the Figure 5 legend title to read, “egl-9 inhibits cdo-1 transcription in a largely prolyl-hydroxylase and VHL-1-independent manner.”

      Issue 22: Please use line numbers in the future for easier tracking of comments.

      Response 22: We shall.

      Issue 23: Abstract and elsewhere, "high cysteine activates...", should be rephrased to "high levels of cysteine".

      Response 23: We have made this change throughout the manuscript.

      Reviewer #3 (Recommendations For The Authors):

      Issue 24: The authors discuss CDO1 in the context of tumorigenesis, as well as the potential regulation between cysteine and the hypoxia response pathway. Thus, I was surprised that there was no mention of the foundational Bill Kaelin paper (Briggs et al 2016) showing how the accumulation of cysteine is related to tumorigenesis, and that cysteine is a direct activator of EglN1. Puzzling that CDO1 is a tumor suppressor: you lose it, cysteine can accumulate and activate EglN1, causing HIF1 turnover. How do the authors reconcile their results with this paper? I was also surprised that there was no mention in the Discussion of the role of hydrogen sulfide, cysteine metabolism, and CTH and CBS in oxygen sensation in the carotid body given the role they play there. Seems important to discuss this issue.

      Response 24: We have added new sections to our Discussion that consider the relationship between our work and Briggs et al. 2016 as well as mentioned the role of CTH and H2S in the mammalian carotid body.

      Issue 25: The abstract has a variety of contradictory statements. For example, the authors state that "HIF-1-mediated induction of cdo-1 functions largely independent of EGL-9," but then go on to conclude in the final sentence that cysteine stimulates H2S production, which then activates EGL-9 signaling, which then increases HIF-1-mediated transcription of cdo-1. A quick reading of the abstract leaves the reader uncertain whether EGL-9 is or is not involved in this regulation of cdo-1 expression. In addition, the conclusion sentence implies that activation of the EGL-9 pathway increases HIF-1-mediated transcription, yet it is well established that EGL-9 is an inhibitor of HIF-1. The abstract fails to deliver a clear summary of the paper's conclusions. Perhaps consider this alternative (changes in capital letters):

      The amino acid cysteine is critical for many aspects of life, yet excess cysteine is toxic. Therefore, animals require pathways to maintain cysteine homeostasis. In mammals, high cysteine activates cysteine dioxygenase, a key enzyme in cysteine catabolism. The mechanism by which cysteine dioxygenase is regulated remains largely unknown. We discovered that C. elegans cysteine dioxygenase (cdo-1) is transcriptionally activated by high cysteine and the hypoxia inducible transcription factor (hif-1). hif-1-dependent activation of cdo-1 occurs downstream of an H2S-sensing pathway that includes rhy-1, cysl-1, and egl-9. cdo-1 transcription is primarily activated in the hypodermis where it is sufficient to drive sulfur amino acid metabolism. EGL-9 and HIF-1 are core members of the cellular hypoxia response. However, we demonstrate that the mechanism of HIF-1-mediated induction of cdo-1 IS largely independent of EGL-9 prolyl hydroxylASE ACTIVITY and the von Hippel-Lindau E3 ubiquitin ligase. We propose that the REGULATION OF cdo-1 BY HIF-1 reveals a negative feedback loop for maintaining cysteine homeostasis. High cysteine stimulates the production of an H2S signal. H2S then ACTS THROUGH the rhy-1/cysl-1/egl-9 signaling pathway DISTINCTLY FROM THEIR ROLE IN HYPOXIA RESPONSE TO INCREASE HIF-1-mediated transcription of cdo-1, promoting degradation of cysteine via CDO-1.

      Response 25: We agree that the abstract could be clearer. We believe this concern stems from the fact that we did not discuss our initial screen in the abstract. Thus, we failed to establish a role for egl-9 in the regulation of cdo-1. To remedy this, we have modified the abstract as suggested by the reviewer and added additional context. We believe that these changes improve the clarity of the Abstract substantially.

      Issue 26: An easily addressable concern involves the "dark" microscopy controls showing lack of fluorescence from a nematode. In these dark negative control micrographs, the authors should draw dotted outlines around where the worms are or include a brightfield image next to the fluorescence image. On a computer screen, it is in fact possible to make out the worms. Yet, when printed out, the reader must assume there are worms in the dark images. Additionally, we realize that adjusting fluorescence so that wild-type CDO-1 expression can be seen will result in oversaturation of the egl-9 and rhy-1; cdo-1 doubles; however, this would be a useful figure to add into the supplement to both provide a normal reference of CDO-1 low-level expression and a demonstration of just how bright it is in the mutant backgrounds. It would also be useful for you to please report your exposure settings for purposes of reproducibility.

      Response 26: As suggested, we have added dotted lines around the location of the C. elegans animals in all images where GFP expression is low or basal. We have also reported the exposure times for each image in the appropriate figure legends.

      Issue 27: This title is quite generic and doesn't even mention the main players (CDO-1 and sulfite metabolism).

      Response 27: We have updated our title to call attention to cysteine dioxygenase. The improved title is: “Hypoxia-inducible factor induces cysteine dioxygenase and promotes cysteine homeostasis in Caenorhabditis elegans”

      Issue 28: The authors mention two disorders in which CDO-1 plays a pathogenic role: MoCD and ISOD. We recommend switching the order in which the authors mention these, as the remainder of the paragraph is about MoCD. Also, they should write out the number "2" in the first sentence of that paragraph.

      Response 28: We have made the suggested changes.

      Issue 29: The authors state in the main text, "...to ubiquitinate HIF-1, targeting it for degradation by the proteosome." Here, they should refer to the pathway in Figure 5a.

      Response 29: We have made the suggested change.

      Issue 30: The authors state in the main text, "Elements of the HIF-1 pathway have emerged..." which is vague and confusingly worded. Change to, "Members of the HIF-1 pathway and its targets have emerged from C. elegans genetic studies."

      Response 30: We have made the suggested change.

      Issue 31: Clarify in the figure legends that supplemental cysteine did not affect the mortality of worms that were imaged.

      Response 31: We have added this note to Figure 3A and Figure S3A.

      Issue 32: Figure 1b. "the cdo-1 promoter is shown..." Add: "as a straight line" to the end of this phrase.

      Response 32: We have made the suggested change.

      Issue 33: The authors should consider changing the red text in Figure 1 to magenta, which tends to be more readable for people who have limited color vision.

      Response 33: We have adjusted the colors in Figure 1 as suggested.

      Issue 34: Figure 2, legend title. Consider changing "hif-1" to "HIF-1," as well as rhy-1, cysl-1, and egl-9. In this case, they are talking about proteins, not mutants or genes. This will make the paper easier to follow for readers who lack a C. elegans background.

      Response 34: We have made the suggested change.

      Issue 35: Figure 5, caption text. "...indicates weak similarity." Add, "amongst species compared."

      Response 35: We have made the suggested change.

      Issue 36: It is starting to become a standard for showing the datapoints in bar graphs. Although this is done in many graphs in the paper, it should also be done for Figure S1 and Figure 4C.

      Response 36: We have made the suggested change.

      Issue 37: An extensive ChIP-seq and RNA-seq analysis of C. elegans HIF-1 was recently published (Vora et al, 2022), which the authors should reference in support of the regulation of CDO-1 transcription by HIF-1 in their description of published expression studies of the pathway (Results section, page 4). Indeed, Vora et al were key generators of the ChIP-seq data cited in Warnhoff et al but not included as authors in the ModERN/ModENCODE publication: their contributions were published separately in Vora et al and should be acknowledged equivalently.

      Response 37: We appreciate the reviewer pointing this detail out and we have added the correct citation as indicated.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Some suggestions:

      1) It's obviously concerning that your GWAS results are not at all robust to the approach used (Fig S3). Did you try something non-parametric, like a Kruskal-Wallis test?

      We used both GWAS and F2 crosses to validate the presence of the QTL, so the evidence does not come from GWAS alone. We did not use non-parametric tests because it would be difficult to account for population structure and relatedness with such approaches. Our GWAS approach is certainly somewhat underpowered, given the number of individuals we used and the likely polygenic nature of the root growth traits. However, the F2 crosses allow us to put more weight on some of the regions we identified with GWAS.

      2) You don't explain what you do with heterozygotes, nor discuss the level of inbreeding in general.

      We are dealing with inbred lines, but indeed they are not completely fixed. The remaining heterozygous genotypes were randomly assigned to one or the other allele. The median heterozygosity was low, at 5.6%. We clarified this point in the Materials and Methods.
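
      The random fixing of residual heterozygotes described above can be sketched as follows. This is only an illustration, not the authors' pipeline: the 0/1/2 genotype coding (1 = heterozygous), the function names, and the example calls are all assumptions made for the sketch.

```python
import random

def fix_heterozygotes(genotypes, rng):
    """Replace each heterozygous call (1) with a random homozygous call (0 or 2).

    Assumed coding: 0 = homozygous reference, 1 = heterozygous,
    2 = homozygous alternate, None = missing call (left untouched).
    """
    return [rng.choice((0, 2)) if g == 1 else g for g in genotypes]

def heterozygosity(genotypes):
    """Fraction of non-missing calls that are heterozygous."""
    called = [g for g in genotypes if g is not None]
    return sum(g == 1 for g in called) / len(called)

# Hypothetical genotype calls for one inbred line.
rng = random.Random(42)  # seeded so the example is reproducible
line = [0, 2, 1, 0, 1, 2, None, 0]
fixed = fix_heterozygotes(line, rng)
```

      Homozygous and missing calls pass through unchanged, so only the small heterozygous fraction is affected by the random draw.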

      3) The finding that over 30% of RNA-seq reads don't seem to have an annotated home should give you pause. Do they map anywhere? At least discuss what is going on. Also, note that you likely have enormous errors in SNP-calling due to cryptic structural variation - think about what this might do?

      We agree with reviewer #1. We added a few sentences in the Results section to clarify this point: “When further analyzed, 15.15% of the unmapped reads (with no correspondence to predicted CDS) were found not to match the reference genome. These might correspond either to unsequenced regions or to genotype-specific genomic regions that are not present in the reference line. The remaining unmapped reads corresponded to either rRNA and tRNA genes (40.28% of the unmapped reads) or to non-annotated genes or non-coding RNAs (44.57% of the unmapped reads).” As we used the same reference genome for mapping the RNAseq reads, some genes might not be present in our analysis for the two lines we studied.
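
      As a quick arithmetic check on the breakdown quoted above, the three categories can be summed and converted into shares of all reads. The percentages are taken from the response; the overall unmapped fraction of 0.30 is only an illustrative assumption based on the reviewer's "over 30%", and the names in the sketch are hypothetical.

```python
# Percentages of the unmapped reads, as quoted in the response.
UNMAPPED_BREAKDOWN = {
    "no match to reference genome": 15.15,
    "rRNA and tRNA genes": 40.28,
    "non-annotated genes or non-coding RNAs": 44.57,
}

def share_of_all_reads(breakdown_pct, unmapped_fraction):
    """Convert percentages of unmapped reads into percentages of all reads."""
    return {k: v * unmapped_fraction for k, v in breakdown_pct.items()}

# The three categories should account for all unmapped reads.
total = sum(UNMAPPED_BREAKDOWN.values())

# Assumed overall unmapped fraction (illustrative value, not from the paper).
shares = share_of_all_reads(UNMAPPED_BREAKDOWN, 0.30)
```

      Under that assumed fraction, reads that fail to match the reference genome at all would make up only a few percent of the total.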

      4) Did you consider moving PgGRXC9 into Arabidopsis?

      This is a great suggestion. In fact, we plan to explore further how some GRXs regulate root growth and how this is conserved in plants in a follow-up project. This is, however, beyond the scope of this manuscript.

      Minor suggestions:

      1) Why not calculate H^2 simply as line variance divided by total?

      When heritability is estimated on single individuals in a population, as is generally done in human genetics and animal breeding, it is indeed calculated as the line variance divided by the total phenotypic variance.

      In plant breeding and plant science, however, we generally work with genotypes replicated across blocks or experimental repetitions, so we estimate the heritability of the mean phenotype of each genotype. There is ample literature on this (Nyquist, 1991; Holland et al. 2003; for a very clear explanation, see the introduction of this PhD thesis: http://opus.uni-hohenheim.de/volltexte/2020/1720/pdf/20200221_PhD_Thesis_Publikationsversion.pdf). When calculating the heritability of the mean phenotype, the phenotypic variance in the denominator must take into account the number of replicates per genotype (n), because with inbred lines we have several clones of each genotype rather than a single plant. The meaning of the formula is that the error in the model is inflated because we have n replicate plants per genotype, so to estimate the heritability of the genotype mean we have to account for this inflated error variance.
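
      The distinction drawn above between single-plant heritability and heritability of the genotype mean can be sketched as follows. This is a minimal illustration assuming the commonly used entry-mean form, in which the error variance is divided by the number of replicates n; the exact formula is not reproduced in this response, and the function names and variance values are hypothetical.

```python
def single_plant_heritability(var_genotype, var_error):
    """Genotypic variance divided by total phenotypic variance (single plants)."""
    return var_genotype / (var_genotype + var_error)

def line_mean_heritability(var_genotype, var_error, n_reps):
    """Heritability of the genotype mean over n_reps replicate plants.

    Assumed entry-mean form: the error variance is divided by the number
    of replicates, because averaging n plants reduces the error on the mean.
    """
    if n_reps <= 0:
        raise ValueError("n_reps must be a positive number of replicates")
    return var_genotype / (var_genotype + var_error / n_reps)

# Hypothetical variance components, for illustration only.
h2_single = single_plant_heritability(2.0, 6.0)
h2_mean = line_mean_heritability(2.0, 6.0, n_reps=3)
```

      With the same variance components, the estimate on a line-mean basis is higher than on a single-plant basis, which is why the two quantities should not be compared directly.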

      2) While the paper overall is well-written, the captions need further proof-reading.

      We corrected all the captions.

      Reviewer #2 (Recommendations For The Authors):

      Major suggestions:

      1) The experimental support for the mutant phenotype of roxy19 needs to be further substantiated. Current methods available for CRISPR mutagenesis make it relatively easy to generate additional alleles. Alternatively, the authors could complement the mutant with a wild-type copy of the gene. These approaches represent the standard of the field and should be used here as well.

      We agree with Reviewer #2. We added some sentences in the Discussion to stress the limitations of our study in linking the QTL to PgGRXC9.

      As stated above, we would like to explore further how some GRXs regulate root growth and how this is conserved in plants. We plan to generate new single and multiple mutants in ROXY19 and its closest homologues (using CRISPR). This is, however, beyond the scope of this manuscript.

      2) The authors may want to state more clearly what the hypothesis is for how redox levels might contribute to root length differences and more clearly state what the limits of their current study are.

      We modified the discussion to try to clearly indicate the limitations of our study.

      3) Differences in root growth can be the consequence of a number of different parameters that contribute to root elongation and the authors need to more clearly define which of these are likely affected in their different genotypes.

      We agree with Reviewer #2. However, as stated before, we plan to further explore the molecular and cellular mechanisms responsible for the phenotype we observe in Arabidopsis. This will need extra work and is beyond the scope of this manuscript.

      4) Page 13, first paragraph. The authors provide an overly strong statement that suggests they have determined the molecular basis for the difference in PgGRXC9: " Altogether, our results suggest that PgGRXC9 is a positive regulator of root growth and that a polymorphism in the promoter region of PgGRXC9 associated with changes in its expression level appeared responsible for a quantitative difference in root growth between the two lines."

      While their results suggest the PgGRXC9 locus is associated with root growth variation, they have not directly tested the effect of the polymorphisms in the promoter on gene expression and this statement needs to be weakened.

      We changed the text to: “Altogether, our results suggest that PgGRXC9 is a positive regulator of root growth and that a polymorphism in the promoter region of PgGRXC9 might lead to changes in its expression level and ultimately to a quantitative difference in root growth between the two lines. However, the effect of the polymorphisms in the promoter on gene expression needs to be tested to validate this hypothesis.”

      We also changed the title of the manuscript to better reflect our results.

      Minor suggestions:

      1) Page 4: "FTSW below 0.3 was considered a stressful condition." It was not specified how this threshold was determined.

      This value corresponds to the measured FTSW value at which pearl millet genotypes subjected to a dry down generally start to reduce their transpiration rate (see Fig. 1 of Kholová et al, 2010; https://doi.org/10.1093/jxb/erp314). At FTSW values above 0.3, transpiration is not affected. At FTSW values around 0.3, the water supply from pearl millet roots cannot fully support transpiration. The plant enters a drought stress responsive phase and progressively closes its stomata to reduce water losses and decrease plant productive functions to match water supply. We have clarified this in the manuscript.

      2) Page 6: Figure 1; footnote: at the end of the description of panel A, a comma is missing between "red" and "blue."

      Thanks for pointing that out. This was corrected.

      3) The root growth data determined by X-ray imaging is not significant (Fig S4B), yet the authors describe the result in the main text without qualification. The authors should clarify this in the text.

      We added some text to clarify this.

      4) Page 9: Figure 2C; It would be better to enlarge these images and annotate them to indicate what specific anatomical features have been measured. Currently, only an expert in the field would be able to interpret these images.

      While we understand the point made by Reviewer #2, Fig2C was meant to illustrate differences in the root tip of the two lines.

      5) Page 9: Figures 2D and E; the number of biological samples measured is not indicated (what is "n"?).

      Thanks again for pointing this out. This was added to the figure legend.

      6) Page 14: Figure 4B; scale bar needs to be included.

      Scale bars were added to the pictures.

      7) Page 14: Figure 4; I recommend adding confocal images or DIC of cleared root apex tissues to easily compare the RAM size and cell lengths in both WT and roxy19 mutant.

      Once again, we plan to carry out a follow-up study on the molecular and cellular mechanisms of action of ROXY19 and its closest homologues in root development. We believe a thorough analysis of the differences in phenotype would be better presented in a future manuscript.

      8) Page 18: main text; "we propose that redox regulation in the root meristem is responsible for a root growth QTL in pearl millet." This statement is ambiguous in the description of the mechanism. The authors do not clarify if the role they propose for PgGRXC9 is in the meristematic or elongation zone. Likely the authors are not able to know precisely where the gene is acting at this point, and so the presented hypothesis needs to more clearly state what limitations there are in assigning a mode of action for the PgGRXC9 and ROXY19 genes in root growth.

      We rewrote this paragraph to clarify the current gap in our understanding of the putative PgGRXC9 function.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the Editor and the referees for their questions and remarks. In this document we provide a point-by-point response to revisions requested by the reviewers.

      Public Reviews:

      Reviewer #1 (Public Review):

      Jafarinia et al. have made an interesting contribution to unravelling the molecular mechanisms underlying pathological phenotypes of repeat expansion of the C9orf72 gene. The repeat expansion leads to the expression of polyPR proteins. Using coarse-grained molecular dynamics simulations, the authors identify putative binding partners involved in nucleocytoplasmic transport (NCT), and conjecture that polyPR affects essential processes by binding to NCT-related proteins. The results are well-reported, but only putative, and need experimental support to be more conclusive. Also, a comparison with results from all-atom MD simulations in explicit water could help verify the results. But even without these, the work is very useful as a first step to unravel the role of polyPR and related peptides.

      We greatly appreciate the reviewer's positive assessment of our work and the suggestions. We acknowledge the need for more experimental validation of the binding behavior of some of the transport components. Our results coincide with the experimental findings of Hutten et al. [1] ([16] in our paper) for example regarding the binding of polyPR to Kapβs and Impαs, but experimental validation of additional transport components, especially for RanGAP, would be valuable. We hope that our work will inspire colleagues from the field to actually perform such experiments.

      We also agree with the reviewer's suggestion that all-atom simulations can provide further details on the molecular conformations at the local NTR-PR binding regions. Nonetheless, such simulations for all transport components, particularly for interactions involving large conformational flexibility of longer polyPR chains such as PR50, would require significant computational expense. In a recent publication (Jafarinia et al. [2]) we reported on the close resemblance in binding behavior between our coarse-grained MD data and the all-atom MD simulations of Nanaura et al. [3], both showing polyPR binding to a negatively-charged cavity of Kapβ2. We expect future MD simulations to elucidate more atomistic detail with the continuously increasing power of high-performance computing clusters.

      Reviewer #2 (Public Review):

      This study used coarse-grained molecular dynamics simulation to explain how the binding of polyPR might interfere with distinct stages of the transport cycle. This finding shows that the interaction between polyPR and transport components is driven by electrostatic interactions and is correlated with the salt concentration and the length of polyPR, providing an important basis for subsequent exploration of the impact of C9orf72 R-DPRs on NCT disruption.

      We appreciate the reviewer's positive feedback and the recognition of the significance of our work.

      Reviewer #3 (Public Review):

      Onck and co-workers present in this work the identification of binding partners and sites of polyPR on various nuclear transport components and elucidate how polyPR might potentially influence the transport process. It's interesting to note that some interaction sites on transport components also serve as their inherent/functional binding sites. The difference in the effects between short polyPR (PR7) and long polyPR (PR50) is also evident, although the authors might need to clarify the mechanisms better. Overall, the manuscript is well organized and concisely written, and it would greatly enhance our understanding of the toxicity induced by polyPR. In general, the 1-bead-per-amino-acid force field model used in the study is well-tuned for studying the interactions between polyPR and proteins, as the essential cation-pi interactions (between Arg and Phe/Tyr/Trp) were included using an 8-6 LJ model.

      We thank the reviewer for recognizing the suitability of our 1-bead-per-amino-acid force field for studying R-DPRs' interactions with transport components and for acknowledging our work's contribution to understanding polyPR toxicity mechanisms. Below we comment on the mechanisms describing the difference between short and long polyPR molecules.

      Recommendations for the authors:

      1) Regarding Figure 2 (also see below for more specific comments), there is a major concern that the dipole moment is not included in Fig 2b (as the correlation is better with f=0), but the authors still conclude that this is generally important (lines 258-261). As a minimum, this needs to be discussed more carefully. Is f (i.e. the importance of dipole moment for binding) dependent on the specific binding partner, or what is going on? Maybe, there is a good explanation?

      Indeed, the significance of the dipole moment depends on the specific type of transport component involved. Our analysis reveals that for Kapβs, see figure 2b, the best-fit is obtained with f=0, indicating that the separation of charge within Kapβs has a relatively minor effect on their interaction with polyPR. Instead, the primary determinant for polyPR-Kapβ interaction appears to be the net charge per residue (NCPR), with a more negative NCPR leading to stronger interactions.

      We attribute this behavior to the structural characteristics of Kapβs, particularly the superhelical structure which features inner and outer surfaces with differing charge distributions. Importantly, this structural arrangement creates an inner surface characterized by a negative electrostatic potential. As demonstrated in our previous work, polyPR predominantly binds to this negatively charged cavity within Kapβs. Consequently, the separation of charges on the Kapβ surface becomes less influential compared to the overall charge. Other transport components, however, depicted in figure 2a, do not share this feature and the distribution of charges over the surface becomes a more critical factor in polyPR interactions. We have now added this explanation to page 6, and emphasized in the conclusion section that the effect of dipole moment is only observed for the transport components in figure 2a.

      2) Write out nucleoporin, Nup, at first appearance (line 51).

      We have changed it in line 51.

      3) Fig 1: a (representative) CG structure of polyPR (PR7,PR20 and PR70) would be very useful.

      We have added a CG representation of PR7 and PR20 to figure 1.

      4) Please use chi-square, not R-square, to evaluate the fit, as chi-square takes experimental errors into account.

      We use R-square as a standard measure to assess the quality of the fit in the simulations, as it considers the summation of residuals. This choice aligns with the methodology we have used in our previous publications, and we therefore prefer to use this measure here as well.
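      To make the distinction in this exchange concrete, here is a small sketch of the two goodness-of-fit measures. It uses the textbook definitions (R-square from residual and total sums of squares, chi-square as the sum of squared residuals weighted by the per-point uncertainty); the function names and the data are hypothetical and do not come from the paper.

```python
def r_squared(y, y_fit):
    """Coefficient of determination: 1 - SS_res / SS_tot (ignores error bars)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_fit))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def chi_square(y, y_fit, sigma):
    """Chi-square: squared residuals weighted by each point's uncertainty."""
    return sum(((yi - fi) / si) ** 2 for yi, fi, si in zip(y, y_fit, sigma))


# Hypothetical data: the last point is off by 1 but has a large error bar.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_model = [1.0, 2.0, 3.0, 5.0]
y_err = [0.5, 0.5, 0.5, 2.0]

r2 = r_squared(y_obs, y_model)
chi2 = chi_square(y_obs, y_model, y_err)
print(r2, chi2)
```

      In this toy case the same residual lowers R-square regardless of the error bar, whereas its contribution to chi-square shrinks as the uncertainty on that point grows.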

      5) Please use a dot (not a full stop) for multiplication in line 151 and Figure 2 legend.

      We made the adjustment in line 151, the caption of figure 2, and the y-axis label of figure S2.

      6) 330: it is very unconventional to plot half the std dev as an error bar. Please plot the std dev (standard error) of the mean.

      We made the suggested change and now the error bars in figure 2 are standard errors of the mean (SEM) calculated from block averaging with three blocks at equilibrium. We also amended the caption of figure 2 and the Methods section.
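      A minimal sketch of a block-averaged SEM, assuming equal-length contiguous blocks and the sample standard deviation of the block means; the function name and the series are hypothetical, and the authors' actual analysis scripts may differ in how blocks are chosen.

```python
import math
import statistics


def block_sem(series, n_blocks=3):
    """Standard error of the mean from block averaging.

    The series is cut into n_blocks contiguous blocks of equal length
    (trailing samples that do not fill a block are dropped), and the SEM is
    the sample standard deviation of the block means divided by sqrt(n_blocks).
    """
    block_len = len(series) // n_blocks
    if n_blocks < 2 or block_len < 1:
        raise ValueError("need at least 2 blocks with at least 1 sample each")
    block_means = [
        statistics.fmean(series[i * block_len:(i + 1) * block_len])
        for i in range(n_blocks)
    ]
    return statistics.stdev(block_means) / math.sqrt(n_blocks)


# Hypothetical equilibrated contact-count series, for illustration only.
contacts = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
sem = block_sem(contacts, n_blocks=3)
print(sem)
```

      With only three blocks the SEM estimate is itself noisy, which is worth keeping in mind when comparing error bars between panels.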

      7) Please write an explicit equation for the linear relation that is plotted in Figure 2. Something like: C_t = a(NCPR - fM/Rg)+b ? That would make it easier to read.

      We have now added the linear equation of the fit to a new table S4, and included a reference to it in the caption of figure 2.
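      For reference, the form suggested in the comment above can be written out as follows. This is only the reviewer's proposed expression; the fitted coefficients and the exact equation reported in table S4 are not reproduced here, and reading R_g as the radius of gyration is an assumption.

```latex
\[
  C_t = a\left(\mathrm{NCPR} - f\,\frac{M}{R_g}\right) + b
\]
```

      Here C_t denotes the number of contacts plotted in figure 2, NCPR the net charge per residue, M the dipole moment, f the weight given to the dipole term, and a and b the slope and intercept of the fit. With f = 0 the expression reduces to C_t = a NCPR + b, which corresponds to the Kapβ case in figure 2b discussed under point 1.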

      8) Fig 2: why is the fit to PR7 not reported/shown?

      The fits for PR7 resulted in R² values of 0.89 (a) and 0.83 (b) for 200 mM and of 0.7 (a) and 0.59 (b) for 100 mM. Because of the low R² values for 100 mM, the fits for PR7 are not shown. We have added this explanation to the caption of figure 2.

      9) Fig 4: isn't the blue shape KapB (and not importin)?

      We changed "importin" to "Kapβ Imp" for consistency.

      10) In the interest of reproducibility, a recommendation is to make the scripts for setting up, running, and analyzing the simulations freely available, e.g. at GitHub. This will increase reproducibility and transparency.

      At the moment we do not have the scripts available on GitHub. However, codes can be provided by the authors upon reasonable request, as also mentioned in the data availability statement in the paper.

      11) Can the authors explain the salient advances in this article versus the one published last year?

      In our previous work, we showed that polyPR binds to the Kapβ family of nuclear transport receptors (NTRs), consistent with experimental findings. While this provided valuable insights, it was essential to broaden our investigation as C9orf72 toxicity not only affects the Kapβ family of NTRs but also disrupts other key regulators of NCT. For instance, recent literature (see lines 87-91 in our paper) showed that Ran and its regulators RanGAP and RanGEF are mislocalized in cells expressing R-DPRs, and genetic screening studies have identified several nucleocytoplasmic transport genes as modifiers of R-DPR-mediated toxicity.

      In the present study, we therefore delved deeper into the underlying mechanisms of polyPR-modification of NCT. We focused on exploring whether polyPR directly interacts with Impα isomers, CAS/Cse1, RanGEF, RanGAP, Ran, and NTF2. By doing so, we unveiled a network of direct interactions between polyPR and a remarkably wide range of NCT components. This newfound insight is valuable for interpreting existing experimental findings, such as the mislocalization of RanGAP. We also demonstrate that polyPR binding is influenced not only by factors such as the net charge per residue and the polyPR chain length, as previously observed for Kapβs, but also by the spatial separation of charges, incorporated by an additional dependence on dipole moments in influencing the total number of contacts with polyPR. This sheds new light on how polyPR interacts with numerous targets within the cellular environment, providing a valuable reference for future (experimental) investigations of R-DPR-compromised nuclear transport. These points are explained in the last paragraph of the introduction and paragraphs 2 and 3 of the conclusion section. Paragraph 2 of the conclusion is also modified for clarification.

      12) In Figure 2(a), the vertical coordinates of the first graph do not match the others.

      We have now modified figure 2a left panel to match the others.

      13) When the polyPR length is large enough, it seems that the binding of polyPR to RanGEF and NTF2 is not significantly improved.

      The binding behavior depends on polyPR length, as well as on the net charge per residue and the dipole moment (expressed as NCPR-fM/R_g). We note that the number of contacts in figure 2 is normalized by the polyPR length so that for both NTF2 and RanGEF the total number of contacts increases with length (PR7 to PR20) when binding occurs. Specifically, for RanGEF, especially at lower ion concentrations (100 mM), PR7 and PR20 exhibit a similar number of contacts per unit length of polyPR. This implies that the absolute number of contacts between PR20 and RanGEF is higher than that of PR7. However, as we extend the polyPR length to PR50, there is a reduction in the number of contacts per unit length of polyPR. This phenomenon indicates that the more extended PR50 has regions that make little to no contact with RanGEF, resulting in a smaller number of contacts per unit length for PR50. Lines 188-195 are now modified to put more emphasis on the difference between number of contacts and number of contacts normalized by polyPR length.

      14) The representation of the mechanism in Figure 4 is not intuitive enough and the color scheme still needs to be improved.

      We have tried to improve clarity by including the names of each transport component next to their schematic representations.

      15) Figure 3 shows that the longer polyPR exhibits a higher contact probability with individual residues compared to a shorter polyPR, is this result in conflict with Figure 2?

      We re-iterate here that the number of contacts in figure 2 is normalized by the polyPR length, while the results in Fig. 3 are not.

      Figure 3 and figure S4 demonstrate that as the length of polyPR increases, the contact probability of individual residues of transport components for interaction with polyPR also increases.

      In figure 2, we have normalized the time-averaged number of contacts by the length of polyPR. For example, in the top-right panel of figure 2a, when comparing results for PR7 with PR50 interaction with RanGAP, a higher value for PR7 indicates that PR7 makes more contacts per unit of its length with RanGAP. In terms of absolute number of contacts, however, the PR50 chain makes more contacts with RanGAP, resulting in a higher contact probability. We now added a sentence (see lines 188-189) for clarification.

      In summary, when a short polyPR strongly binds to a transport component (evidenced by a relatively large number of contacts), it makes more contacts per unit length than a long polyPR. This occurs because for shorter polyPRs most of the residues come into contact with the target protein. In contrast, for longer polyPRs, only certain parts of the chain are in contact with the transport components, while other regions make fewer or no contacts. This is explained in lines 188-195.
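      The difference between absolute and length-normalized contacts can be illustrated with a toy calculation. The contact counts below are invented for illustration and are not taken from figure 2 or figure 3; the chain length is taken as the number of PR repeats, which is an assumption about how the normalization is defined.

```python
def contacts_per_unit_length(total_contacts: float, chain_length: int) -> float:
    """Time-averaged number of contacts divided by the polyPR chain length."""
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    return total_contacts / chain_length


# Hypothetical time-averaged contact counts, not values from the paper.
pr7 = {"length": 7, "contacts": 21.0}
pr50 = {"length": 50, "contacts": 100.0}

pr7_norm = contacts_per_unit_length(pr7["contacts"], pr7["length"])
pr50_norm = contacts_per_unit_length(pr50["contacts"], pr50["length"])
print(pr7_norm, pr50_norm)
```

      In this made-up case the longer chain makes more contacts in absolute terms (100 versus 21) but fewer per unit length (2 versus 3), which is the pattern the response describes when contrasting figure 2 with figure 3.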

      16) In S2 and S3, does the data require an error bar?

      NCPR, defined as total charge divided by sequence length of the transport components, is a constant and therefore figure S3 does not require an error bar.

      In figure S3 we have added error bars (standard deviation) for the dipole moment calculated from 2.5 µs simulations of the isolated transport components.
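      The definition of NCPR given above (total charge divided by sequence length) can be sketched in a few lines. The charge assignment used here (+1 for Lys and Arg, -1 for Asp and Glu, everything else neutral, termini ignored) is an assumption made for illustration and may not match the charge model of the authors' force field.

```python
# Assumed residue charges: K/R positive, D/E negative, all others neutral.
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def ncpr(sequence: str) -> float:
    """Net charge per residue: total charge divided by sequence length."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total_charge = sum(RESIDUE_CHARGE.get(res, 0) for res in sequence.upper())
    return total_charge / len(sequence)


# A three-repeat PR peptide carries one positive charge per two residues.
print(ncpr("PRPRPR"))
```

      Because the value depends only on the sequence, it is the same in every simulation frame, which is why no error bar applies to it.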

      17) What is the physiological significance when the salt concentration is 100 mM?

      We conducted simulations at two different salt concentrations: 200 mM, which aligns with in vitro conditions as reported in Hutten et al. [1], and a lower 100 mM salt concentration. The inclusion of the 100 mM salt concentration enables us to assess the significance of salt concentration, and to confirm the dominance of electrostatic interactions in polyPR binding. We also note that this range of salt concentration is commonly used in in vitro experiments [1, 4, 5].

      18) Please introduce abbreviation NLS in the abstract.

      We added the full name of NLS to the abstract.

      19) Given the high number of Arg residues in its sequence, polyPR should interact with many proteins. It would be beneficial to discuss the frequency of binding/non-binding interactions of polyPR with nuclear transport components in comparison to general proteins.

      We appreciate the reviewer's comment. While such a comparison is indeed interesting, our study primarily focused on elucidating the interactions between polyPR and crucial nuclear transport components, aiming to provide insights into potential defects in nucleocytoplasmic transport. The broader comparison of polyPR interactions with different protein classes in the proteome is an interesting direction for future research, but is outside the scope of the current manuscript.

      20) The authors should provide a convergence check to determine whether the 2.5 µs simulations are sufficient for sampling the interaction modes, particularly with the long PR50.

      We have included a new figure (figure S5) and additional text in the Methods section to verify that extending the simulation duration does not alter the contact probabilities (which are indicators of binding modes) presented in figure 3a, confirming convergence of our computations.

      21) In reference to Figure 4, the upper panel merely summarizes the known transport mechanisms, while the lower part (A-H) provides potential novel insights from this study. Unfortunately, these novel insights are not sufficiently detailed. It is recommended to include more details to make these relevant plots clearer by expanding the corresponding discussions (currently, only the last paragraph in the Results section addresses these). If possible, the authors should also carry out some CG simulations of the most relevant processes to further elucidate the interference caused by polyPR.

      We have taken the reviewer's feedback into consideration and made the suggested revisions. Specifically, we have expanded the last paragraph of the discussion to provide more detailed explanations of the insights derived from our computational model. For each mechanism, we begin by presenting the reader with the baseline understanding of normal function of the transport component. Subsequently, we discuss how the findings presented in figures 2 and 3 offer insights into polyPR's potential interference with the function of NCT components. Furthermore, we have made improvements to the schematic representation of mechanisms in figure 4 to enhance clarity.

      At the moment, accurately capturing the binding of NCT components to their native binding targets and the competition with polyPR are best resolved by all-atom molecular dynamics simulations, which come with significant computational demands. This level of detail and computation-intensive analyses is beyond the scope of the current study, but we hope that our results will provide the groundwork for future, more detailed investigations.

      References

      1. Hutten, S., et al., Nuclear Import Receptors Directly Bind to Arginine-Rich Dipeptide Repeat Proteins and Suppress Their Pathological Interactions. Cell Rep., 2020. 33(12): p. 108538.

      2. Jafarinia, H., E. Van der Giessen, and P.R. Onck, Molecular basis of C9orf72 poly-PR interference with the β-karyopherin family of nuclear transport receptors. Sci. Rep., 2022. 12(1): p. 21324.

      3. Nanaura, H., et al., C9orf72-derived arginine-rich poly-dipeptides impede phase modifiers. Nat. Commun., 2021. 12(1): p. 5301.

      4. Brady, J.P., et al., Structural and hydrodynamic properties of an intrinsically disordered region of a germ cell-specific protein on phase separation. Proceedings of the National Academy of Sciences, 2017. 114(39): p. E8194-E8203.

      5. Fisher, R.S. and S. Elbaum-Garfinkle, Tunable multiphase dynamics of arginine and lysine liquid condensates. Nat. Commun., 2020. 11(1): p. 4628.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      Zhang et al. provide valuable data for understanding molecular features of the human spinal cord. The authors made considerable efforts to acknowledge and objectively address the limitations of Visium while attempting to overcome them by utilizing single-nucleus RNA sequencing (snRNA-seq) from the same tissue. By mapping snRNA-seq clusters to Visium data, they offer spatial information, complemented by RNA-ISH and immunofluorescence (IF) validation. They also discuss gender-related differences and the similarities between human and mouse data, aiming to establish a crucial foundation for experimental research. However, I have some comments below.

      1) The observation of gender-related differences is interesting. The authors reported that SCN10A, associated with nociceptors, exhibited stronger expression in females. While they intend to validate this finding through IF, the quantitative difference is not clearly observed in the IF data (Figure 5f). It would be essential to provide validation through DAPI-based cell counts, demonstrating the difference in CHAT/SCN10A co-expression.

      Thank you for this important question! We have added panel G to Figure 5, which provides a quantitative analysis of the percentage of CHAT neurons expressing SCN10A in male and female spinal cords.

      2) It is meritorious that in novel features of the transcriptomic study, the authors considered gender-related differences and similarities between humans and mice. Nevertheless, despite the extensive bioinformatics-based analyses performed, the results mostly confirm what has been previously reported (Nguyen et al. 2021; Yadav et al. 2023; Jung et al. 2023).

      Thank you! In addition to confirming the findings from previous studies, our results also provided new information regarding the difference between human and mouse. For example, we found that PVALB and SST showed broader expression across human DRG neuronal clusters than in mice, suggesting that genes are more selectively expressed in mice than in human DRGs. Moreover, we identified several genes associated with pain that were differentially expressed in motor neurons between sexes.

      3) The study did not perform snRNA-seq in the DRG. The limitations of Visium in cell type separation are acknowledged, and the authors are aware that Visium alone has limitations in describing cell expression patterns. The authors need to validate their findings via analyses of public DRG snRNA-seq data (Jung et al. 2023 Ncom; Nguyen et al. 2021 eLife) before drawing broad conclusions.

      Thank you for this critical question! It is true that snRNA-seq has a higher resolution than spatial transcriptomics in describing cell expression patterns. We have acknowledged the limitation that we only performed spatial transcriptomics in human DRG, without snRNA-seq. Nevertheless, our spatial transcriptomics results in human DRG were similar to previously published snRNA-seq data of human DRG, suggesting the feasibility of using spatial transcriptomics in human DRG.

      4) Figure 7's comparison between human Visium spot data and Renthal et al.'s mouse snRNA-seq may have limitations as Visium spot data could not provide a transcriptional profile at the single cell resolution. The authors need to clarify this point.

      Thank you! We have clarified this in the limitation section.

      5) Recent findings indicate that type 2 cytokines can directly stimulate sensory neurons. This includes the expression of IL-4RA, IL31RA, and IL13RA in DRG. These findings support the role of JAK kinase inhibitors in mediating chronic itch. Demonstrating the expression of these itch receptors in DRG would be valuable.

      We have provided the expression patterns of IL-4RA, IL31RA, and IL13RA in human and mouse DRG (Figure 7-figure supplement 4), and cited the relevant paper.

      6) Given that juxtacrine and paracrine signals operate from 0 to 200 µm, spatial information is vital to understanding intercellular communication. The presentation of spatial information using Visium is meaningful, and more comprehensive analyses of potential interaction based on distance should be provided, beyond the top 10 interactions (Figure 8).

      Thank you for this good question! In this study, we focused on the putative projections from DRG to spinal neuronal types, which may be an important future direction for research on sensory transduction. It will be interesting to determine the intercellular communication in the spinal spot using the spatial transcriptomics data in future studies.

      7) The gender-related differences are interesting and, if possible, it would be interesting to explore whether age-related differences or degeneration-related factors exist. Using public data could allow the examination of age-related changes.

      We agree with the reviewer that it is of great importance to identify the age-related differences using spatial transcriptomics and scRNA-seq data of human spinal cord. However, it is currently difficult to obtain comprehensive results due to the limited human spinal cord datasets regarding different ages.

      Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors generated a comprehensive dataset of human spinal cord transcriptome using single-cell RNA sequencing and the Visium spatial transcriptomics platform. They employed Visium data to determine the spatial orientation of each cell type. Using single-cell RNA sequencing data, they identified differentially expressed genes by comparing human and mouse samples, as well as male and female samples.

      Strengths:

      This study offers a thorough exploration of both cellular and spatial heterogeneity within the human spinal cord. The resulting atlas datasets and analysis findings represent valuable resources for the neuroscience community.

      Weaknesses:

      The analysis of spatial transcriptomics data was conducted as if it were single-cell RNA-seq data. However, there are established tools for effectively integrating these two types of data. The incorporation of deconvolution methods could enhance the characterization of each spot's cell type composition.

      Thank you very much for your positive comments and suggestions! Indeed, we have used deconvolution methods to incorporate the spinal snRNA-seq and spatial transcriptomics data.

      Reviewer #3 (Public Review):

      Summary:

      Zhang et al sought to use spatial transcriptomics and single-nucleus RNA sequencing to classify human spinal cord neurons. The authors reported 17 clusters on 10x Visium slides (6 donors) and 21 clusters by single-nucleus sequencing (9 donors). The authors tried to compare the results to those reported in mice and claimed similar patterns with some differing genes.

      Strengths:

      The manuscript provides a valuable database for the molecular and cellular organization of adult human spinal cords in addition to published datasets (Andersen, et al. 2023; Yadav, et al. 2023).

      Weaknesses:

      The results are largely observational and lack quantitative analysis. Moreover, the assertions regarding the sex differences in motor neurons and the potential interactions between DRG and spinal cord neuronal subclusters appear preliminary and necessitate more rigorous validation.

      Thank you very much! We have provided the quantitative analysis of the differential expression of SCN10A in male and female spinal cord motor neurons. Our sequencing data revealed putative projections from DRG to spinal neuronal types, which may be an important future direction for research on sensory transduction. We did not use animal models to verify these interactions between DRG and spinal cord neuronal subclusters, which is a major limitation in our study. Nevertheless, our analysis results will provide an important resource for future research to investigate the molecular mechanism underlying spinal cord physiology and diseases.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This work presents important findings for the field of Alzheimer's disease, especially for the electrophysiology subfield, by investigating the temporal evolution of different disease stages typically reported using M/EEG markers of resting-state brain activity. The evidence supporting the conclusions is solid and the methodology as well as the descriptions of the processes are of high quality, although a separation of individuals who are biomarker positive versus negative would have strengthened the interpretability of the results and the conclusions of the study.

      Response: Thank you for the positive assessment of the paper.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors aimed to infer the trajectories of long-range and local neuronal synchrony across the Alzheimer's disease continuum, relative to neurodegeneration and cognitive decline. The trajectories are inferred using event-based models, which infer a set of data-driven disease stages from a given dataset. The authors develop an adapted event-based modelling approach, in which they characterise each stage as a particular biomarker increasing by a particular z-score deviation from controls. Fitting infers the optimal set of z-scores to use for each biomarker and the order in which each biomarker reaches each z-score. The authors apply this approach to data from 148 individuals (70 cognitively unimpaired older adults and 78 individuals with mild cognitive impairment or Alzheimer's disease), identifying trajectories in which long-range (amplitude-envelope correlation) and local (regional spectral power) neuronal synchrony in the alpha and beta bands becomes abnormal prior to neurodegeneration (measured as the volume of the parahippocampal gyrus) and cognitive decline (measured using the mini-mental state examination).

      Strengths:

      • The main strength is that the authors assess two models. In the first they derive a staging system based only on the volume of the parahippocampal gyrus and mini-mental state examination score. They then investigate how neuronal synchrony metrics change compared to this staging system. In the second they derive a staging system that also includes an average (combined long-range and local) neuronal synchrony metric and investigate how long-range and local synchrony metrics change relative to this staging system. This is a strength as the first model provides confidence that there is not overfitting to the neuronal synchrony data, and the second provides more detailed insights into the dynamics of the early neuronal synchrony changes.

      • Another strength is that the authors automatically infer the optimal z-scores to choose, rather than having to pre-select them manually, as in previous approaches.

      Response: Thank you for the positive comments and a succinct summary of the paper and its strengths.

      Weaknesses:

      • The dataset is small and no external validation is performed.

      Response: We agree that future validation studies of the predictions are necessary. We now include the related sentences in the last paragraph of the limitations section in the revised manuscript.

      • A high proportion of the data is from controls (nearly 50%) with no biomarker evidence of Alzheimer's disease, and so the changes may be driven by aging or other non-Alzheimer's effects.

      Response: We would like to clarify that the z-scores of the metrics used in the EBMs were computed using age-adjusted values. All our controls were recruited from an ongoing longitudinal study of healthy aging. Amongst the 70 controls, 39 have confirmed A-beta negative PET scans and 8 have confirmed A-beta positive PET scans, and for the remaining 23 we do not have any biomarker data available. However, in all the controls, we have conducted comprehensive neuropsychological assessment (see Appendix 1—table 1 in the revised supplementary file) and based on this data we can be quite confident about their lack of clinical deficits, and we have a very high degree of confidence that none of the controls have any neurodegeneration (AD-related or otherwise). Consistent with this assessment, in our EBM analyses, most of the control participants were indeed categorized to the preclinical stages.

      • Inferring the optimal z-scores is a strength, however as different sets of z-scores are allowed per biomarker, there is a concern that the changes reflected are mainly driven by the choice of z-score, rather than the markers themselves (e.g. if lower z-scores are selected for one marker than another, then changes in that marker will appear to be detected earlier, even if both markers change at the same time).

      Response: Indeed, the biomarker sequence depends on the choice of the z-scores per biomarker. However, please note that our choice of z-scores is based on maximizing the sequence likelihood. Therefore, other values of the z-scores will have by construction a smaller likelihood of sequence occurrence compared to the results shown.

      • In equation 2 it is unclear why the gaussian is measured based on a sum over I. The more obvious choice would be to use a multivariate gaussian with no covariance, which would mean taking the product rather than the sum over I.

      Response: We thank the reviewer for pointing this out and we now clarify this point. In this revision, we do not use the term ‘multivariate’. Indeed, the model likelihood assumes independence for each metric’s priors, and hence is the product of each metric’s univariate Gaussian probability distribution. This can be seen in equations 1 and 2 of the revised manuscript (section titled “Event-based sequencing modeling”). The assumption about independent priors is similar to the one used in the original event-based model (see equation (2) in A. L. Young et al., Nature Comm. 9.1 (2018): 4273).
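
      As a minimal sketch of the point made in this response (not the authors' code), the snippet below checks numerically that a product of univariate Gaussian densities equals a multivariate Gaussian density with a diagonal covariance. The function names and the example values are hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate normal density at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def factorized_likelihood(x, mu, sigma):
    """Product of per-metric univariate densities (independence assumption)."""
    return math.prod(gaussian_pdf(xi, mi, si) for xi, mi, si in zip(x, mu, sigma))

def diagonal_mvn_pdf(x, mu, sigma):
    """Multivariate normal density with diagonal covariance, written out directly."""
    d = len(x)
    det = math.prod(s ** 2 for s in sigma)  # determinant of a diagonal covariance
    quad = sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, mu, sigma))
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** d * det)

# Hypothetical values for three metrics of one subject.
x, mu, sigma = [0.3, 1.2, -0.5], [0.0, 1.0, 0.0], [1.0, 0.5, 2.0]
same = math.isclose(factorized_likelihood(x, mu, sigma), diagonal_mvn_pdf(x, mu, sigma))
```

      Under the independence assumption the two forms agree, so a sum over metrics would only be appropriate for log-densities, not for the densities themselves.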

      • In the original event-based model, k is a hidden variable. Presumably that is also the case here, however the notation k=stage(j) makes it seem like each subject is assigned a stage during the sequence optimisation.

      Response: We would like to clarify that the posterior probability of each stage for every subject is estimated during the sequence optimization. To clarify the notation, we have now deleted the term “stage” and use “tj” to denote stages for each subject j. The sequence optimization was performed with the assumption of a uniform prior distribution p(tj=k) = 1/(N+1) for each stage k. Then, the posterior probability p(tj=k|Zj,S), i.e., the probability that subject j belongs to stage k, given the metrics and the sequence, was computed during the sequence optimization procedure.
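
      The posterior described above can be sketched as follows. This is an illustrative outline, assuming that the per-stage likelihoods p(Zj | S, tj=k) for one subject are already available as a list; the function name and inputs are hypothetical and not taken from the manuscript.

```python
def stage_posterior(likelihoods):
    """Posterior p(t_j = k | Z_j, S) for one subject under a uniform stage prior.

    likelihoods[k] is assumed to hold p(Z_j | S, t_j = k) for each of the
    N + 1 stages, so the uniform prior is 1 / (N + 1) = 1 / len(likelihoods).
    """
    prior = 1.0 / len(likelihoods)
    joint = [lik * prior for lik in likelihoods]
    evidence = sum(joint)  # normalizing constant over all stages
    return [j / evidence for j in joint]

# Hypothetical likelihoods for a model with four stages.
posterior = stage_posterior([2.0, 2.0, 4.0, 2.0])
```

      Because the prior is uniform, it cancels in the normalization and the posterior is simply the likelihood rescaled to sum to one across stages.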

      • Typically for event-based modeling, positional variance diagrams are created from the markov chain monte carlo samples of the event sequence, enabling visualisation of the uncertainty in the sequence, but these are not included in the study.

      Response: In the revised supplementary file, we have now included positional uncertainty diagrams for the optimal set of z-score events that were created from 50,000 MCMC samples. Please see Appendix 1—figure 2 for the AC-EBM and Appendix 1—figure 9 for the SAC-EBMs.
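
      A positional variance (uncertainty) diagram is, in essence, a table of how often each event occupies each position across the sampled sequences. The sketch below assumes the MCMC samples are available as lists of event indices; the function name and the toy samples are hypothetical, and plotting is omitted.

```python
def positional_variance(samples, n_events):
    """Return pv[event][position]: fraction of samples placing event at position.

    samples is assumed to be a list of sequences, each a permutation of
    range(n_events), for example the retained MCMC samples.
    """
    counts = [[0] * n_events for _ in range(n_events)]
    for sequence in samples:
        for position, event in enumerate(sequence):
            counts[event][position] += 1
    total = len(samples)
    return [[c / total for c in row] for row in counts]

# Four toy samples over three events (the study used 50,000 samples).
pv = positional_variance([[0, 1, 2], [0, 2, 1], [0, 1, 2], [1, 0, 2]], n_events=3)
```

      Each row of the result sums to one; a row concentrated on a single position indicates an event whose place in the sequence is well determined.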

      • Many of the figures in the manuscript (e.g. Figure 1E/G, Figure 2A/B, Figure 3A/B/E/F/I/J, Figure 4 A/B/E/F/I/J) are based on averages in both the x and the y axis. In the x dimension, individuals have a weighted contribution to the value on the y axis, depending on their stage probability. In the y dimension, the values are averages across those individuals, and the error bars represent the standard error rather than the standard deviation. Whilst the trajectories themselves are interesting, they may not be discriminative at the individual level and may be more heterogeneous than it appears.
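
      The stage-weighted averaging described in this comment can be illustrated with a short sketch. It assumes each individual contributes to a stage in proportion to their posterior probability for that stage; the function name and the toy numbers are hypothetical, and the standard-error computation is not shown.

```python
def stage_weighted_mean(values, stage_probs, stage):
    """Mean of a metric at one stage, weighting each subject by p(stage | subject).

    values[j] is the metric for subject j; stage_probs[j][k] is the assumed
    posterior probability that subject j belongs to stage k.
    """
    weights = [probs[stage] for probs in stage_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Two toy subjects and two stages.
values = [1.0, 3.0]
stage_probs = [[0.75, 0.25], [0.25, 0.75]]
mean_stage0 = stage_weighted_mean(values, stage_probs, 0)
mean_stage1 = stage_weighted_mean(values, stage_probs, 1)
```

      Because every subject contributes to every stage with some weight, the resulting curves are smoothed cohort-level summaries, which is why they may look less heterogeneous than the individual data.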

      Response: In the current study, the predictions of trajectories are intended at the cohort level. Individual level investigations will be the topic of future investigations.

      • The bootstrapped statistical analyses comparing metrics between the stages do not consider the variability in the sequence.

      Response: Please see the response above. The positional uncertainty diagrams are included in the revised supplementary file.

      Reviewer #2 (Public Review):

      Summary:

      This work presented by Kudo and colleagues is of great importance to strengthen our understanding of electrophysiological changes in the course of AD. Although the main conclusions regarding functional connectivity and spectral power change through the course of the disease are not new and have been largely studied and theorised on, this article offers an innovative approach that certainly consolidates previous knowledge on the topic. Not only that, this article also broadens our knowledge presenting useful and important details on the specificity of frequency and cortical distribution of these early alterations. The main take-home message of this work is the early disruption of electrophysiological signatures that precedes detectable alterations in other more commonly used pathology markers (i.e. gray matter atrophy and cognitive impairment). More specifically, these signatures include long-range connectivity in the alpha and beta bands, and local synchrony (spectral power) in the same frequency bands.

      Response: Thank you for the positive comments and for providing a nice succinct summary.

      Strengths:

      The present work has some major strengths that make it paramount for the advance of our understanding of AD electrophysiology. It is a very well written manuscript that, despite the complexity of the analyses employed, runs the reader through the different steps of the analysis in a pedagogic and clever way, making the points raised by the results easy to grasp. The methodology itself is carefully chosen and appropriate to the nature of the question posed by the researchers, as event-based models are well-suited for cross-sectional data.

      The quality of the figures is outstanding; not only are they aesthetic but, more importantly, the figures convey information exceptionally well and facilitate comprehension of the main results.

      The conclusions of the paper are, in general, well described and discussed, and consider the state-of-the-art works of AD electrophysiology. Furthermore, even though the conclusions themselves are not groundbreaking at all (synaptic damage preceding structural and cognitive impairment is one of the epitomes of the pathological cascading model proposed by Jack in 2010), this article is innovative and groundbreaking in the way it addresses the question, with clever analyses in a relatively large sample for neuroimaging standards.

      Response: Thank you for the positive comments of the strengths of the paper.

      Weaknesses:

      The main limitation of the work revolves around sample definition and inclusion criteria that are somewhat confusing, obscuring some of the points of the analyses. Firstly, it is not clear why the purely clinical approach is employed to diagnose the "probable Alzheimer's Disease" for the 78 participants in the "AD group". In the same paragraph, it is stated that 67 out of the 78 participants show biomarker positivity, thus allowing a more biologically guided diagnosis that is preferred according to current NIA-AA criteria. This would avoid highly possible mixing of different subtypes of dementia etiologies. One might wonder, why would those 11 participants be included if we have strong indications that their symptoms are not due to AD? Furthermore, the real pathological status of the control group is somewhat questionable. The authors do not specify whether common AD biomarkers are available for this subgroup. In that case, it would have highly increased the clarity and interpretability of the results if this group was subdivided in a preclinical and completely healthy control group. This would be particularly interesting since a significant proportion of the control group is labeled as belonging to stages 2, 3, 4 (MCI) and even 5 (mild dementia). This raises the question of whether these participants are true healthy controls mislabeled by the EBM model, or actual cognitive controls with actual underlying AD pathology well identified by the model proposed.

      Response: Please see responses above to a similar comment from R1. To clarify, all our controls were recruited from an ongoing longitudinal study of healthy aging. Amongst the 70 controls, 39 have confirmed A-beta negative PET scans and 8 have confirmed A-beta positive PET scans, and for the remaining 23 we do not have any biomarker data available. The biomarker positivity rates in our control cohort are completely consistent with the prevalence of A-beta positivity in cognitively healthy individuals and are within a normal biological continuum for amyloid beta (Jansen WJ et al. 2015). In all the controls, we have conducted comprehensive neuropsychological assessment (see Appendix 1—table 1 in the revised supplementary file) and based on this data we can be quite confident about their lack of clinical deficits, and we have a high degree of confidence that none of the controls have any neurodegeneration (AD-related or otherwise). We include these details in the revision (see the revised ‘Participants’ section in the Materials and methods).

      Jansen WJ et al., 2015. JAMA 313(19):1924-1938.

      On this note, Figure 2 (C and D) and Figure 3 (C, G and K) show a cortical surface depicting the mean difference of each stage vs the control group, which again, is formed by subjects that can be included (and in fact, are included) in all those stages, obscuring the meaning and interpretability of these cortical distributions.

      Response: We would like to clarify that these figures depict the regional maps of each metric for each stage of AD progression, not the contrast against a control group.

      Reviewer #1 (Recommendations For The Authors):

      • If possible, perform independent validation of the results.

      Response: This is something we indeed intend to examine in our future investigations.

      • Repeat the analysis in the subset of individuals that are amyloid positive.

      Response: Amongst the 78 AD patients, 20 had autopsy-confirmed AD neuropathology, an additional 41 patients had molecular pathology identified by Abeta-PET, and a further 9 had fluid biomarker (CSF) confirmation of amyloid and tau levels consistent with AD diagnosis. The eight remaining patients had a diagnosis of AD with high certainty, based on clinical presentation, neurological assessment, and cortical atrophy on MRI. Given that only eight patients had a clinical diagnosis of AD with no biomarkers, and given the comprehensive clinical characterization of all the AD patients in our cohort (Appendix 1—table 1), we do not believe that any subgroup analysis is warranted.

      • When inferring the optimal z-scores, select the same set of z-scores per biomarker, or include diagrams of stage vs z-score that include all of the markers so that it is easy to see how one marker changes relative to the others (overlay Figure 1G on Figure 2A and 2B).

      Response: How the neural synchrony metrics, PHG volume and MMSE scores change relative to each other is exactly what we show in Figures 3 B/F/J and 4 B/F/J. Since each EBM model optimizes the z-score thresholds, sequence likelihood and posterior probability of each stage for each subject, the EBM framework provides the most likely estimate for each metric at every stage. Therefore, the SAC-EBM model gives the most accurate description of the relative differences in these metrics over the AD progression stages. The reviewer’s suggestion to overlay Figure 1G (now figure 1F, based on optimized z-scores for PHG volume and MMSE scores) on Figures 2A and 2B will be inaccurate, as the neural synchrony measures plotted in figures 2A and 2B are not for optimized z-scores.

      • Change equation 2 to use a multivariate gaussian.

      Response: We now clarify that we use a factorized multivariate form that reflects independent priors for each metric which are Gaussian.

      • Clarify whether k is a hidden variable and possibly change the notation.

      Response: We now clarify that in our notation, k is a label for the stage [k=1,..,7 (when I=2) or k=1,...,10 (when I =3)] and is indeed a hidden variable and not observed (but inferred from the EBM). Specifically, the posterior probability for each subject j belonging to stage k was estimated as part of the sequence optimization procedure.

      • Generate positional variance diagrams of the MCMC samples.

      Response: We are doing the MCMC to obtain the most likely sequence. We have now included positional variance diagrams of the optimal set of z-score events in Appendix 1—figure 2 and Appendix 1—figure 9 in the revised supplementary file.

      • It would be interesting to study whether the stages are predictive of conversion or look at longitudinal data, if available.

      Response: This is something we indeed intend to examine in our future investigations.

      • Also look at statistics across MCMC samples of the sequence.

      Response: Thank you for this suggestion. In the Appendix 1—figure 10, we now include an example of the MCMC samples for an SAC-EBM including the alpha-band AEC. We then derived the positional variances for each metric that are now shown in Appendix 1—figure 2 and Appendix 1—figure 9.

      Reviewer #2 (Recommendations For The Authors):

      Some really minor changes are suggested on two specific points that somewhat confused me as a reader and got me stuck in the reading process to try to get the meaning of what I was seeing/reading:

      1. It is not specified (or at least I was unable to find it) what are you comparing exactly for the group comparison in the long-range synchrony metric (AEC) before creating your scalar metric. Are you comparing individual links (in which case you would have 93 link values for each ROI to compare)? Or are you comparing the strength for each ROI (thus, one value -the individual links sum- for each ROI)? I guess it should be the latter for what I see in the figures but it could be useful to specify it.

      Response: The reviewer is correct. We compare the strength of each ROI, i.e., averaging over edges of the symmetric AEC matrix of functional connectivity. We now clarify this in the Amplitude-envelope correlation section and the caption of the revised Appendix 1—figure 6.
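
      The per-ROI strength described here can be sketched as below. This is an illustration only: it assumes a symmetric AEC matrix stored as nested lists and assumes the diagonal (self-connection) is excluded from the average, which the response does not state explicitly. The function name and the toy matrix are hypothetical.

```python
def roi_strength(aec):
    """Average connectivity of each ROI over its edges to all other ROIs.

    aec is assumed to be a symmetric n x n matrix (list of lists); the
    diagonal is skipped, so each ROI is averaged over n - 1 edges.
    """
    n = len(aec)
    return [
        sum(aec[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]

# Toy 3-ROI matrix.
strength = roi_strength([
    [1.0, 0.2, 0.4],
    [0.2, 1.0, 0.6],
    [0.4, 0.6, 1.0],
])
```

      This reduces the connectivity matrix to one scalar per ROI, which is what allows a regional map of long-range synchrony to be compared across groups.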

      1. In Figure 1 (which, by the way, is exceptionally aesthetic, congratulations for that!) I got stuck for a relatively long time in a really small detail and I am not completely sure if I came to the right conclusion. It is regarding the X axis of the histograms in panels B and D. They are expressed as "PHG volume loss" and "MMSE decline". So I supposed those histograms were showing some kind of subtraction (maybe from stage X to stage Y, or from group X to group Y). I was trying to understand the histogram and rereading the methods to see if I overlooked any description of that graphic and then just realized they might be just the z-score itself for each group (control and AD) with respect to the whole population. If that is the case I would suggest changing the X-label to "PHG z-score" and "MMSE z-score", avoiding the reference to "loss" and "decline" as they are just reflecting the direct transformation to z-score.

      Response: Thank you. We would like to clarify that the z-score for PHG volume and MMSE scores were sign-inverted so that higher values denote “PHG Volume loss” and “MMSE decline”, respectively. We now clarify this point in the revised text and legend for the revised figure 1.
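
      A sign-inverted z-score of this kind can be sketched as follows. The sketch assumes the reference distribution is the control group and uses the sample standard deviation; the age adjustment mentioned elsewhere in the response is not shown. The function name and the toy scores are hypothetical.

```python
import statistics

def inverted_z_scores(values, control_values):
    """Sign-inverted z-scores relative to controls (assumed reference group).

    A value below the control mean yields a positive score, so higher
    numbers denote more loss or decline.
    """
    mu = statistics.mean(control_values)
    sd = statistics.stdev(control_values)  # sample standard deviation
    return [-(v - mu) / sd for v in values]

# Toy MMSE-like scores: controls average 29 with a standard deviation of 1.
decline = inverted_z_scores([26, 29, 30], control_values=[28, 29, 30])
```

      With this convention a score three points below the control mean maps to +3, which matches an axis label such as "MMSE decline".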

      Lastly, regarding the point I raised in the limitations section of the public review, I understand it might fall out of the scope of eLife reviewing process as it would require a more extensive change of the current manuscript, which is great as it is. But as a reader and researcher in the field, I would have recommended using biomarkers to divide the control group (if available) thus including in the models only those belonging to the AD continuum according to their biomarker status, and leaving those control without any biomarker positivity as the reference group for the figures I mention in that section (those showing differences for each stage in the cortical surface with respect to the control group).

      Response: Please see our response to a similar comment from R1. Amongst the 70 controls, 39 have confirmed A-beta negative PET scans and only 8 have confirmed A-beta positive PET scans, and for the remaining 23 we do not have any biomarker data available. In all the controls, we have conducted comprehensive neuropsychological assessment (see Appendix 1—table 1 in the revised supplementary file) and based on this data we can be quite confident about their lack of clinical deficits, and we have a high degree of confidence that none of the controls have any neurodegeneration (AD-related or otherwise). Since only 8 participants were confirmed as amyloid positive in the control group and this sample size is small, we do not conduct this recommended re-analysis in this manuscript.

    1. Author Response

      We appreciate your comments and thank the reviewers for providing valuable feedback and recommendations. We will respond to most of the recommendations in the revised version, which will provide more information for readers to understand and apply the study. For some of the recommendations, we can give quick responses as follows:

      Reviewer #2 (Public Review):

      The differences between passive and active immunolabeling, as well as photobleaching data, should be addressed for a comprehensive understanding.

      In passive immunolabeling, antibodies penetrate and reach their targets merely via diffusion, without any additional force. In contrast, active immunolabeling utilizes an external force, such as pressure or electrophoresis, to facilitate antibody penetration and therefore significantly speeds up the staining process (i.e., one day vs. 2 months for a whole mouse brain). In our study, the samples we were dealing with were centimeter-sized; therefore, we employed only active electrophoretic immunolabeling (details provided in Materials and Methods). However, laboratories that do not possess adequate devices, or that handle small specimens, can employ passive immunolabeling instead. As for the photobleaching data, we will provide it in the revised version.

      The compatibility of MOCAT with genetically encoded fluorescent proteins remains unclear and warrants further investigation.

      We agree with the possibility that the encoded fluorescent proteins will be affected. Since there is evidence that fluorescence can be quenched by xylene and alcohol, which are two organic solvents used in paraffin processing, we think boost immunolabeling is necessary for observing genetically encoded fluorescent proteins. We also pointed out this limitation in the Discussion:

      “Fourth, endogenous fluorescence—such as GFP, YFP, and tdTomato—may be quenched during paraffin processing and thus need to be visualized by means of additional immunolabeling.”

      However, the extent to which endogenous fluorescence will be quenched during the paraffin processing and MOCAT procedure, and how much boost labeling can rescue, is worth investigating for broadening the application of MOCAT. We will provide it in the revised version.

      The composition of NFC1 and NFC2 solutions for refractive index matching should be provided.

      Since NFC1 and NFC2 are commercial products from Nebulem (Taiwan), their composition cannot be disclosed. However, the refractive indices of NFC1 and NFC2 are 1.47 and 1.52, respectively.

    1. Author Response:

      Update, January 11, 2024:

      During the course of our careful revising of the paper, we discovered an inconsistency in the way we presented data for figures 5 and 6. Specifically, we used optogenetics to induce ataxia in mice. However, "ataxia", as a phenotype, can be initiated by a spectrum of cell dysfunctions as revealed by previous studies. We systematically explored this with optogenetics in this current work. Our error is that we presented one stimulation paradigm to show ataxic cell firing (2 ms on / 11 ms off square wave) and then presented a slightly different paradigm to show ataxic animal behavior (10 ms on / 10 ms off square wave). We note that our ataxia paradigms do not affect the outcomes of the dystonia and tremor stimulations. Importantly, the choice of ataxia paradigm does not change the conclusions of the paper. Regardless, for clarity we are actively working to make the stimulation parameters that we present consistent between figures 5 and 6.

      October 10, 2023:

      We would like to thank all three reviewers for providing excellent suggestions that will enable us to strengthen our manuscript and enhance the impact of our findings. We plan on addressing the comments by altering the text, providing additional data, revising the figures as requested, and most importantly by providing an improved classifier model. Where relevant, we will also provide the reviewers with a response to specific questions that they raised. We will respond to the reviewers’ comments in a point-by-point manner when we submit a revised manuscript. Below, we include an outline of the main points that we intend to address.

      Although we will respond in full to all comments and suggestions in the revised documents, here we outline only the major areas in order to provide context for our revisions. 1) The major point of concern raised by the reviewers is the strength of the classifier model. We agree with the reviewers that we should put forward the strongest model possible as this forms a core component of our paper. We are planning on retraining our model using the suggestions put forward by the reviewers in the public and author-directed comments. Importantly, given the healthy discussion about our model, our revised manuscript will now also include additional clarification about the choice of the model architecture and limitations of our data structure. Based on the reviewers’ comments, we will include a brief discussion about possible future ways of improving the model. 2) We will provide additional figures and updated figure panels to reflect the new data analyses. Ultimately, we agree that the major strength of our manuscript lies within the many mouse models tested and validation of the classification in different genetic, pharmacological, and optogenetic mouse models, a point raised by all three reviewers. We are confident that the revised images will reflect these strengths. 3) In addition to improving our classifier model, we are planning on making textual changes to clarify several parts of the text and propose a new title that better reflects the data put forth in our manuscript. 4) There are several minor but important comments that were raised by all three reviewers. We will also incorporate these changes as suggested.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #3 (Recommendations For The Authors):

      1. Fig. 2B: In their previous comment #6, I assume that Reviewer #2 was asking about peaks that were called as statistically significant above background, not just "higher" as assessed by eye. The authors have now marked peaks that are "higher" but still do not indicate that they were called as statistically significant by any software. I agree that they need to indicate in the figure which peaks were discovered by formal analysis.

      Response: Thank you for the professional suggestions. We used the Piranha software (version 1.2.1) to call peaks from the CLIP-seq data, with the P-value threshold for peaks (i.e., the -p parameter) set to 0.05. Any region with signal above the IgG peak could then be a binding region, and the higher the peak, the more pre-mRNA SRSF1 binds in that region.

      1. Similar to the above comment, in Fig. 7G "visual analysis" of IGV tracks is not an assay. It is fine to show the tracks as an example of the differential expression called using DESeq2, but this should be described for what it is.

      Response: We thank the reviewer for the professional comments. Following this advice, we have corrected the text in this revised version (Page 11, Line 233).

      1. Fig 5C: TUNEL results are supported by a single image of only a few cells. It is important to include quantitation as has been done for other microscopy data.

      Response: Thank you for the professional suggestions. Following this advice, we have added the quantitative data in Figure 5C. Also, we have added specific quantification methods to the text (Page 23, Line 484-485).

      1. Legend to Fig 6C-E: I assume n=4 refers to the number of animals. It would be best to also know how many cells/tubules were counted for each animal.

      Response: Thank you for the helpful comments. Following this advice, we have revised the legend for Figure 6D, E (Page 12, Line 246-249).

      1. There appears to be a mistake in line 285-287, which reads: "the overall analysis of aberrant AS events showed that SRSF1 effectively promotes the occurrence of SE and MXE events and inhibits the occurrence of RI events." The data in Fig 8C appears to show the opposite, with more SE and MXE, and fewer RI events, in the SRSF1 KO. This would imply that SRSF1 normally inhibits SE/MXE and promotes RI.

      Response: Thank you very much for the professional comments. Following this advice, we have corrected the text in this revised version (Page 14, Line 286-288).

      1. In Fig. 8E, an upper band is depleted in SRSF1 KO, but in Figure 8J, a much lower band is depleted. How is this explained?

      Response: Thank you for the professional suggestions. Since exon 7 of Tial1 is in the non-coding region, the lower band in Figure 8E does not correspond to the lower band in Figure 8J. For better understanding, we show the detailed information of Tial1 in the attached Figure S3.

      1. Line 81: As a very minor point, "AS" is defined as alternative splicing in the abstract, but should be re-defined again in the main text when first mentioned.

      Response: Thank you for the helpful comments. Following this advice, we have corrected the text in this revised version (Page 3, Line 81).

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the editor and the reviewers for their valuable and constructive feedback. In the revised manuscript, we have incorporated and addressed the suggestions provided by the reviewers.

      Reviewer #1 (Recommendations For The Authors):

      The primary recommendation is to provide additional language explaining how KinCytE will be updated.

      Response: We appreciate the reviewer’s insightful feedback regarding the KinCytE update. In response, we have included additional details in the “Development and use of KinCytE” section as follows: “We welcome researchers to actively participate in advancing the development of KinCytE by sharing external screening data, especially data on new secreted factors and cell types that extend beyond macrophages. This collaborative effort promises to enhance our understanding of kinase-focused networks, opening new avenues for cutting-edge therapeutic approaches”. In addition, we explicitly state in the "Data, Software, and Availability" section, "To contribute data, kindly email the corresponding author and refer to Table S2 for guidance on the preferred file format."

      Reviewer #2 (Recommendations For The Authors):

      Would have been nice to see a validation of the regression models from outside of the training data. I would also consider removing statements like "We anticipate that KinCytE will be highly sought after by biologists... " , it reads like a grant application (and this is not)! Could tone the language down a bit. In the future, you might consider displaying your graphs as "biofabrics", they're much cleaner than "hairballs" (PMID: 23102059). Or potentially, show a hierarchical view where the selected cytokine (or other) is at the root, and you can immediately see what's connected. Anyway, the network display can be expanded. Consider maybe adding the nearest neighbors to the table on the right after selecting the node. Generally, though, I like how it works.

      There needs to be a button to download the graph as a .csv file. Maybe the subgraph after selecting a node (or set of nodes). Also, once you're at a graph view, it's hard to guess how to get back to the starting page. Maybe just one button with a "home" on it would fix that. On the Kinases Discovery, why are the gene symbols all lower case? Very cool!

      Response: We greatly value the reviewer's constructive suggestions. To incorporate these, we have made the following changes:

      (1) "We anticipate that KinCytE will be highly sought after by biologists... " This sentence is removed.

      (2) A ‘SAVE CSV’ button is added to the bottom right of the Cytokine Explorer page, which allows the users to download the graph as a csv file.

      (3) A redesigned KinCytE logo now functions as the 'HOME' button, located at the top left of the webpage, ensuring that users can easily return to the homepage at any time.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript describes the synergy among PI3Kbeta activators, providing compelling results concerning the mechanism of their activation. The particular strengths of the work arise to a great extent from the reconstitution system better mimicking the natural environment of the plasma membrane than previous setups have. The study will be a landmark contribution to the signaling field.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript aims to provide mechanistic insight into the activation of PI3Kbeta by its known regulators tyrosine phosphorylated peptides, GTP-loaded Rac1 and G-protein beta-gamma subunits. To achieve this the authors have used supported lipid bilayers, engineered recombinant peptides and proteins (often tagged with fluorophores) and TIRF microscopy to enable bulk (averages of many molecules) and single molecule quantitation. The great strength of this approach is the precision and clarity of mechanistic insight. Although the study does not use "in transfecto" or in vivo models the experiments are performed using "physiologically-based" conditions and provide a powerful insight into core regulatory principles that will be relevant in vivo.

      The results are beautiful, high quality, well controlled and internally consistent (and consistent with other published work that overlaps on some points) and as a result are compelling. The primary conclusion is that the primary regulators of PI3Kbeta are tyrosine phosphorylated peptides (and by inference tyrosine phosphorylated receptors/adaptors) and that the other activators can synergise with that input but have relatively weak impacts on their own.

      Although the methodology is not easily imported, for reasons of both cost and the experience needed to execute it well, the results have broad importance for the field and reverse an impression that had built in large parts of the broader signalling and PI3K communities that all of the inputs to PI3Kbeta were relatively equivalent; however, these conclusions were based on "in cell" or in vivo studies that were very difficult to interpret clearly.

      Reviewer #2 (Public Review):

      The manuscript of Duewell et al. has made critical observations that help to understand the mechanisms of activation of the class IA PI3Ks. By using single-molecule kinetic measurements, the authors have made outstanding progress toward understanding how PI3Kbeta is uniquely activated by phosphorylated tyrosine kinase receptors, Gbeta/gamma heterodimers and the small G protein Rac1. While previous studies have defined these as activators of PI3Kbeta, the current manuscript makes clear the quantitative limitations of these previous observations. Most previous quantitative in vitro studies of PI3Kbeta activation have used soluble peptides derived from bis-phosphorylated receptors to stimulate the enzyme. These soluble peptides stimulate the enzyme, and even stimulate membrane interaction. Although these previous studies showed that the release of p85-mediated autoinhibition unmasks an intrinsic affinity of the enzyme for lipid membranes, they ignored what would be the consequence of these peptide sequences being present in the context of intrinsic membrane proteins. The current manuscript shows that the effect of membrane-conjugated peptides on the enzyme activity is profound, in terms of recruiting the enzyme to membranes. In this context, the authors show that G proteins associated with the membranes have an important contribution to membrane recruitment, but they also have a profound allosteric effect on the activity on the membrane. These are observations that would not have been possible with bulk measurements, and they do not simply recapitulate observations that were made for other class IA PI3Ks.

      An important observation that the authors have made is that Gbeta/gamma heterodimers and Rac1 alone have almost no ability to recruit PI3Kbeta to the membranes that they are using, and this is central to one of the most profoundly novel activation mechanisms offered by the manuscript. The authors propose that the nSH2- and Gbeta/gamma binding sites partially overlap, so that Gbeta/gamma can only bind once the nSH2 domain releases the p110beta subunit. This mechanism would mean that once the nSH2 is engaged by membrane-conjugated pY, the Gbg heterodimer can bind and increase the association of the enzyme with membranes. Indeed, this increased membrane association is observed by the authors. However, the authors also show that this increased recruitment to membranes accounts for relatively little increase in activity, and that the far greater component of activation is due to an allosteric effect of the membrane association on the activity of the enzyme. The proposal for competition between Gbg binding and the nSH2 is consistent with the behavior of an nSH2 mutant that cannot bind to pY and which, consequently, does not vacate the Gbg-binding site. In addition to the outstanding contribution to understanding the kinetics of activation of PI3Kbeta, the authors have offered the first structural interpretation for the kinetics of Gbg activation in synergy with pY activation. The proposal for an overlapping nSH2/Gbg binding site is supported by predictions made by John Burke, using AlphaFold-Multimer. Although there is no experimental structure to support this structural model, it is consistent with HDX-MS analyses that were published previously.

      Reviewer #1 (Recommendations For The Authors):

      1. The approximate relative concentrations (surface densities) of Rac1-GTP, GBetagammas and PY-peptides used in experiments in Fig 1 are not easy to understand, and they would be useful to give an intuitive feel for the relative sensitivity of the PI3Kbeta reporter to those inputs.

      In our revised manuscript, we provide densities of the individual signaling inputs used to reconstitute Dy647-PI3Kβ membrane recruitment (see Figure legend 1). We provide a more detailed explanation about our quantification method in subsequent figures where the membrane surface density of signaling inputs is varied to modulate the strength of PI3Kβ membrane localization and activity.

      Building off the quantification of Rac1-GTP and pY membrane density measurements presented in our initial manuscript submission, we now include an estimate of the GβGγ membrane density. For these new measurements, we recombinantly expressed and purified additional SNAP-GβGγ protein, which we fluorescently labeled with AlexaFluor 555. The membrane surface density of GβGγ was quantified at equilibrium using a combination of AF488-SNAP-GβGγ (bulk signal) and dilute AF555-SNAP-GβGγ (0.0025%), which allowed us to resolve and count the single molecule density (Figure 3A). We calculate the total surface density of GβGγ based on the AF555-SNAP-GβGγ dilution factor. In the methods section titled, “surface density calibration,” we describe our protocol.
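
The dilution-factor calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the particle count, and the field-of-view area are hypothetical, and the sketch assumes that labeled and unlabeled GβGγ bind the membrane equally, so the counted AF555 molecules represent a fixed fraction (0.0025%) of the total.

```python
def total_surface_density(counted_molecules, area_um2, labeled_fraction):
    """Scale a single-molecule count up to the total surface density.

    counted_molecules: number of sparsely labeled particles counted (hypothetical input)
    area_um2: imaged membrane area in square micrometers (hypothetical input)
    labeled_fraction: fraction of all molecules carrying the countable dye
    Returns molecules per square micrometer.
    """
    if area_um2 <= 0:
        raise ValueError("area_um2 must be positive")
    if not 0 < labeled_fraction <= 1:
        raise ValueError("labeled_fraction must be in (0, 1]")
    observed_density = counted_molecules / area_um2
    # Dividing by the labeled fraction applies the dilution factor.
    return observed_density / labeled_fraction


# 0.0025% labeled molecules, as stated in the response.
LABELED_FRACTION = 0.0025 / 100

# Hypothetical example: 50 particles counted in a 1000 µm² field of view.
density = total_surface_density(50, 1000.0, LABELED_FRACTION)
print(f"{density:.0f} molecules/µm²")
```

With these illustrative inputs, 0.05 counted molecules/µm² scales to roughly 2,000 molecules/µm² in total.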

      1. The estimates of the PIP3 concentrations/densities measured using the BTK reporter seem good but it's unclear (to me) how they were derived.

      The density of PI(3,4,5)P3 lipids in our supported lipid bilayers was calculated based on the incorporation of a defined molar ratio of PI(3,4,5)P3 in our small unilamellar vesicles. Based on the average footprint of 0.72 nm² for a single lipid, we calculated the density of lipids per µm². In the methods section titled, “kinetic measurements of PI(3,4,5)P3 lipid production,” we include the following description:

      “Assuming an average footprint of 0.72 nm² for phosphatidylcholine (Carnie et al., 1979; Hansen et al., 2019), we calculated a density of 2.8 × 10⁴ PI(3,4,5)P3 lipids/μm² for supported membranes that contain an initial concentration of 2% PI(4,5)P2. We assume that the plateau fluorescence intensity of the AF488-SNAP-Btk sensor following reaction completion in the presence of PI3Kβ represents the production of 2% PI(3,4,5)P3. The bulk membrane intensity of AF488-SNAP-Btk was normalized from 0 to 1, and then multiplied by the total density of PI(3,4,5)P3 lipids to generate kinetic traces that report the kinetics of PI(3,4,5)P3 production.”
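
The arithmetic in the quoted passage can be checked with a short sketch. The 0.72 nm² footprint and the 2% mole fraction come from the response; the function names and the min-max normalization are assumptions made here for illustration, since the response only says the trace was "normalized from 0 to 1".

```python
LIPID_FOOTPRINT_NM2 = 0.72   # average footprint per lipid, from the response
NM2_PER_UM2 = 1_000_000      # 1 µm² = 10^6 nm²


def lipids_per_um2(footprint_nm2=LIPID_FOOTPRINT_NM2):
    """Total lipids per µm² in one leaflet, given the footprint per lipid."""
    return NM2_PER_UM2 / footprint_nm2


def pip3_density(mole_fraction, footprint_nm2=LIPID_FOOTPRINT_NM2):
    """Density (lipids/µm²) of a lipid species present at the given mole fraction."""
    return mole_fraction * lipids_per_um2(footprint_nm2)


def scale_trace(intensities, max_density):
    """Min-max normalize a sensor trace to 0..1, then scale to lipids/µm².

    Assumes the trace minimum is zero product and the plateau (maximum)
    is complete conversion, as the quoted methods text assumes.
    """
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        raise ValueError("trace has no dynamic range")
    return [(x - lo) / (hi - lo) * max_density for x in intensities]


max_density = pip3_density(0.02)          # about 2.8e4 lipids/µm²
print(f"{max_density:.3g} PI(3,4,5)P3 lipids/µm²")

# Hypothetical raw intensities (arbitrary units) for illustration only.
print(scale_trace([100.0, 150.0, 200.0], max_density))
```

Dividing 10⁶ nm² by 0.72 nm² gives about 1.39 × 10⁶ lipids/µm², and 2% of that is about 2.8 × 10⁴ lipids/µm², which matches the value quoted above.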

      Minor points

      l164; Rac1(GTP) AND GBeta gammas. In this context it should be OR. Or have I misunderstood?

      l1093; kineticS measurementS.

      Thank you for pointing out these typos. We made the appropriate edits.

      The paper of Suire et al. (Suire, S., Lécureuil, C., Anderson, K. E., Damoulakis, G., Niewczas, I., Davidson, K., Guillou, H., Pan, D., Jonathan Clark, Phillip T Hawkins, & Stephens, L. (2012). GPCR activation of Ras and PI3Kc in neutrophils depends on PLCb2/b3 and the RasGEF RasGRP4. The EMBO journal, 31(14), 3118-3129. https://doi.org/10.1038/emboj.2012.167) makes the point that in vivo, although Ras activation is required for full activation of PI3Kgamma (and can activate PI3Kgamma in vitro directly), if you use tools to activate Ras in the absence of receptor and Gbetagamma signalling, it has no effect on PIP3. This directly supports the authors' conclusions.

      Thank you for sharing this citation. We incorporated the reviewer’s insight into our discussion section to broaden the significance of our work.

      Reviewer #2 (Recommendations For The Authors):

      There are only a few relatively minor points that could be addressed to improve the paper:

      1. Why is the density still going up after 10 minutes in Figure 1 Figure supplement 2? Doesn't this seem like a very long time? Are we seeing fast on/off combined with fast on/slow off? Are the particles eventually becoming stuck in odd places or are they slowly denaturing?

      Our movies do not indicate a slow accumulation of immobilized or stuck Dy647-PI3Kβ particles on the membrane surface. On the long timescale, we believe that a small fraction of Dy647-PI3Kβ molecules do exhibit longer dwell times on membranes containing a high density of pY (>6,000 molecules/µm²). This is likely due to membrane hopping of Dy647-PI3Kβ. In other words, rather than Dy647-PI3Kβ dissociating from the membrane surface directly into the solution, the Dy647-PI3Kβ molecule immediately rebinds to another membrane conjugated pY peptide. This type of behavior of a peripheral membrane binding protein is generally correlated with there being a higher surface density of the binding partner (Yasui et al., 2014). Characterization of potential Dy647-PI3Kβ membrane hopping will require additional experimentation (e.g. PI3Kβ mutants) and quantitative analysis that goes beyond the scope of this study.

      1. Lines 188-189. "By quantifying the average number of Alexa488-pY particles per unit area of supported membrane we calculated the absolute density of pY per μm² (Figure 2D)." I think this should be Figure 2C, right-hand y-axis.

      Thank you for identifying our typo. We’ve corrected the text for clarity.

      1. Lines 102-193. "When Dy647-PI3Kβ was flowed over a membrane containing a low density of ≤500 pY/μm², we observed rapid equilibration kinetics consistent with a 1:1 binding stoichiometry (Figure 2E)." There is no density shown in Fig. 2E. There is only "membrane intensity." Perhaps it was their intent to include a right-hand axis with density (number of particles/area), as they did in Figure 2C. However, they did not, so Figure 2E does not support the text. The value of intensity/(#pY/μm²) does not appear to be the same for Figure 2C as for Figure 2E, assuming that the statement in the text is correct. The authors should include the density as a right-hand axis in 2E.

      We have reworded this portion of the results section for clarity. In reading the reviewer's comment, we recognize that a more convincing way to support our claim of a 1:1 binding stoichiometry would be to show that there are ~500 Dy647-PI3Kβ/μm² membrane bound complexes when the pY surface density equals ~500 pY/μm². For us to make this connection, we would need to perform experiments using a Dy647-PI3Kβ concentration that fully saturates all the pY binding sites. However, at this elevated Dy647-PI3Kβ solution concentration, individual Dy647-PI3Kβ complexes can start to bind to a single phosphotyrosine of the dually phosphorylated peptide due to competition for pY binding sites. As an alternative to performing the experiment described above, we can infer binding stoichiometry from the shape of the membrane absorption kinetic traces. For example, a simple bimolecular interaction exhibits rapid equilibration kinetics with a hyperbolic-shaped kinetic trace. Systems that have more complex binding equilibria, however, generally take longer to equilibrate (due to the change in KOFF) and can often be broken down into 2 or 3 distinct dissociation constants (KD). This type of kinetic analysis has previously been used to describe multivalent membrane binding interactions for the Btk-PI(3,4,5)P3 (Chung et al., 2019) and PI3Kγ-GβGγ (Rathinaswamy et al., 2021) complexes. Considering that there are multiple interpretations of the Dy647-PI3Kβ membrane absorption traces shown in Figure 2E, we refrain from saying that our results explicitly reveal a 1:1 binding stoichiometry. Instead, we provide several possible explanations for the results. Ultimately, additional experiments and kinetic modeling of wild-type and mutant PI3Kβ are necessary to define the binding stoichiometry under different conditions.

      1. Table 1. The authors have analysed the data to extract two dwell times and two diffusion coefficients. The legend should make this clear, referring to D1 as the slow diffusion component and D2 as the fast diffusion component. Similarly, there are short and long dwell times, and this should be stated in the legend. There are two columns labelled "alpha". These presumably should be alpha1 and alpha2, the fractions of particles with short and long dwell times. The table legend should clarify this.

      In our revision, additional text has been added to the figure legends and Table 1.

      Text from Table 1: “Alpha (α) equals the fraction of molecules with the characteristic dwell time, τ1 (DT = dwell time). The fraction of molecules with the characteristic dwell time, τ2, equals 1-α. Alpha (αD) equals the fraction of molecules with the characteristic diffusion coefficient, D1. The fraction of molecules with diffusion coefficient, D2, equals 1-αD.”
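
To make the role of α concrete, here is a minimal sketch of a two-component dwell-time model. The response does not state the fitting function used, so the bi-exponential survival form below is an assumption (it is a common choice for single-molecule dwell times), and the parameter values are hypothetical.

```python
import math


def dwell_survival(t, alpha, tau1, tau2):
    """Fraction of molecules still bound at time t under a two-population model.

    alpha: fraction of molecules with characteristic dwell time tau1
    (1 - alpha): fraction of molecules with characteristic dwell time tau2
    """
    return alpha * math.exp(-t / tau1) + (1.0 - alpha) * math.exp(-t / tau2)


def mean_dwell_time(alpha, tau1, tau2):
    """Population-weighted mean dwell time of the two-component mixture."""
    return alpha * tau1 + (1.0 - alpha) * tau2


# Hypothetical parameters: 80% short-lived (0.5 s), 20% long-lived (3.0 s).
alpha, tau1, tau2 = 0.8, 0.5, 3.0
print(dwell_survival(0.0, alpha, tau1, tau2))   # all molecules bound at t = 0
print(mean_dwell_time(alpha, tau1, tau2))
```

The same structure applies to the diffusion coefficients in the table legend: αD weights D1 and 1-αD weights D2.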

      1. In the legend for Figure 5 figure supplement 1, for part D, the "Cumulative membrane of binding events..." The "of" should be deleted.

      Thank you for identifying this typo.

      1. Lines 423-426: "We found that PI3Kβ kinase activity is also relatively insensitive to either Rac1(GTP) or GβGγ alone. This is in contrast to previous reports that showed Rho-GTPases (Fritsch et al. 2013) and GβGγ (Katada et al. 1999; Hashem A. Dbouk et al. 2012; Maier, Babich, and Nürnberg 1999) can activate PI3Kβ, albeit modest, compared to synergistic activation with pY peptides plus Rac1(GTP) or GβGγ." It is not clear what this statement means. On the surface, it might be interpreted as saying that these previous studies had some flaw that led the authors to conclude that there is some activation caused by Rac1 or Gbeta/gamma on their own. The current manuscript is an important contribution to understanding the mechanism of synergistic activation, but it is also true that Hansen and his colleagues have not used the same membranes as were used previously. The authors state that they have used a wide range of membrane compositions, but the only ones that have appeared in the manuscript are nearly pure PC (with 2% PIP2) or PC with 20% PS. Extensive studies with varying membrane compositions are beyond the scope of the current study, since the current manuscript concisely makes important observations regarding mechanism. However, it would be helpful for readers if the authors at least mention the differences in membrane compositions among the studies.

      The reviewer raises an important point concerning our interpretation of PI3Kβ activation data in relationship to existing literature. In our original submission, we made conclusions concerning how individual signaling inputs modulate PI3Kβ activity, without showing all our data or providing sufficient explanation. In our revised manuscript, we include PI3Kβ kinase activity measurements performed in the presence of either pY, Rac1(GTP), or GβGγ alone (Figure 5B-5C). These experiments were reconstituted on supported membranes in the absence or presence of 20% PS lipids. We found that increasing the density of anionic lipids increased the overall activity of PI3Kβ in the presence of pY or GβGγ alone. This is consistent with a subtle increase in PI3Kβ membrane affinity due to the negatively charged PS lipids. Mutations that disrupt the direct interaction between PI3Kβ and GβGγ eliminated the observed lipid kinase activity. We were unable to detect PI3Kβ activity in the presence of Rac1(GTP) alone. In conclusion, we’re able to detect some PI3Kβ activity in the presence of GβGγ alone, which is consistent with previous reports (Dbouk et al., 2010; Katada et al., 1999; Maier et al., 2000). In the future, a more comprehensive analysis will be required to map the relationship between PI3Kβ activity, membrane localization, and lipid composition. For example, previous reconstitutions have revealed differential activation of PI3Kα that depends on the most abundant lipid being phosphatidylethanolamine (PE) rather than phosphatidylcholine (PC) (Hon et al., 2012; Ziemba et al., 2016). PE lipids comprise 25-30% of the cellular plasma membrane (Yang et al., 2018) and have been used in previous studies to measure PI3K lipid kinase activity on small unilamellar vesicles (Dbouk et al., 2010; Hon et al., 2012).

      In this study, we elected to use a simplified membrane composition that minimized non-specific membrane localization of fluorescently labeled PI3Kβ. This allowed us to more clearly define the strength of individual and combinations of protein-protein interactions that regulate PI3Kβ localization and kinase activity. When reconstituting amphiphilic molecules (i.e. lipids) in aqueous solution a variety of structures, including micelles, inverted micelles, and planar bilayers can form based on the lipid composition (Kulkarni, 2019). The organization of these membrane structures is related to the molecular packing parameter of the individual phospholipids (Israelachvili et al., 1976). The packing parameter, P = v/(a·l_c), depends on the volume of the hydrocarbon (v), area of the lipid head group (a), and the lipid tail length (l_c). When generating supported lipid bilayers on a flat two-dimensional glass surface, we aim to create a fluid lamellar membrane. We find that phosphatidylcholine (PC) lipids are ideal for making supported lipid bilayers because they have a packing parameter of ~1 (Costigan et al., 2000). In other words, PC lipids are cylindrical like a paper towel roll. In contrast, cholesterol and phosphatidylethanolamine (PE) lipids have packing parameters of 1.22 and 1.11, respectively (Angelov et al., 1999; Carnie et al., 1979). This gives cholesterol and PE lipids an inverted truncated cone shape, which prefers to adopt a non-lamellar phase structure. Due to the intrinsic negative curvature of PE lipids, they can spontaneously form inverted micelles (i.e. hexagonal II phase) in aqueous solution when they are the predominant lipid species (Israelachvili et al., 1980; Kobierski et al., 2022; Wnętrzak et al., 2013).
      In the methods section of our manuscript, we note that, in our experience, incorporation of PE lipids dramatically reduced the protein-maleimide coupling efficiency, produced more membrane defects, and resulted in a larger fraction of surface immobilized Dy647-PI3Kβ. This could be related to the intrinsic negative curvature of PE membranes. However, further investigation is needed to decipher these issues.
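
The packing parameter relationship described above can be illustrated with a short sketch. The formula P = v/(a·l_c) and the values 1.22 (cholesterol) and 1.11 (PE) come from the response; the function names are hypothetical, and the shape cutoffs (1/3, 1/2, 1) are the conventional approximate boundaries from the Israelachvili packing model, used here as rough guides and not as sharp thresholds.

```python
def packing_parameter(v_nm3, a_nm2, lc_nm):
    """P = v / (a * l_c): hydrocarbon volume over head-group area times tail length."""
    if a_nm2 <= 0 or lc_nm <= 0:
        raise ValueError("head-group area and tail length must be positive")
    return v_nm3 / (a_nm2 * lc_nm)


def preferred_phase(p):
    """Map a packing parameter to the aggregate shape it roughly favors."""
    if p < 1 / 3:
        return "spherical micelles"
    if p < 1 / 2:
        return "cylindrical micelles"
    if p <= 1:
        return "bilayers (lamellar)"
    return "inverted (non-lamellar) structures"


# Packing parameters quoted in the response.
for name, p in [("PC", 1.0), ("PE", 1.11), ("cholesterol", 1.22)]:
    print(f"{name}: P = {p:.2f} -> {preferred_phase(p)}")
```

Under these cutoffs, PC (P ≈ 1) falls in the lamellar range, whereas PE and cholesterol (P > 1) fall in the inverted, non-lamellar range, consistent with the reasoning given for choosing PC-based bilayers.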

      Angelov B, Ollivon M, Angelova A. 1999. X-ray Diffraction Study of the Effect of the Detergent Octyl Glucoside on the Structure of Lamellar and Nonlamellar Lipid/Water Phases of Use for Membrane Protein Reconstitution. Langmuir 15:8225–8234. doi:10.1021/la9902338

      Carnie S, Israelachvili JN, Pailthorpe BA. 1979. Lipid packing and transbilayer asymmetries of mixed lipid vesicles. Biochim Biophys Acta 554:340–357. doi:10.1016/0005-2736(79)90375-4

      Chung JK, Nocka LM, Decker A, Wang Q, Kadlecek TA, Weiss A, Kuriyan J, Groves JT. 2019. Switch-like activation of Bruton’s tyrosine kinase by membrane-mediated dimerization. Proc Natl Acad Sci 116:10798–10803. doi:10.1073/pnas.1819309116

      Costigan SC, Booth PJ, Templer RH. 2000. Estimations of lipid bilayer geometry in fluid lamellar phases. Biochim Biophys Acta 1468:41–54. doi:10.1016/s0005-2736(00)00220-0

      Dbouk HA, Pang H, Fiser A, Backer JM. 2010. A biochemical mechanism for the oncogenic potential of the p110 catalytic subunit of phosphoinositide 3-kinase. Proc Natl Acad Sci 107:19897–19902. doi:10.1073/pnas.1008739107

      Hansen SD, Huang WYC, Lee YK, Bieling P, Christensen SM, Groves JT. 2019. Stochastic geometry sensing and polarization in a lipid kinase–phosphatase competitive reaction. Proc Natl Acad Sci 116:15013–15022. doi:10.1073/pnas.1901744116

      Hon W-C, Berndt A, Williams RL. 2012. Regulation of lipid binding underlies the activation mechanism of class IA PI3-kinases. Oncogene 31:3655–3666. doi:10.1038/onc.2011.532

      Israelachvili JN, Marcelja S, Horn RG. 1980. Physical principles of membrane organization. Q Rev Biophys 13:121–200. doi:10.1017/s0033583500001645

      Israelachvili JN, Mitchell DJ, Ninham BW. 1976. Theory of self-assembly of hydrocarbon amphiphiles into micelles and bilayers. J Chem Soc Faraday Trans 2 Mol Chem Phys 72:1525–1568. doi:10.1039/F29767201525

      Katada T, Kurosu H, Okada T, Suzuki T, Tsujimoto N, Takasuga S, Kontani K, Hazeki O, Ui M. 1999. Synergistic activation of a family of phosphoinositide 3-kinase via G-protein coupled and tyrosine kinase-related receptors. Chem Phys Lipids 98:79–86. doi:10.1016/S0009-3084(99)00020-1

      Kobierski J, Wnętrzak A, Chachaj-Brekiesz A, Dynarowicz-Latka P. 2022. Predicting the packing parameter for lipids in monolayers with the use of molecular dynamics. Colloids Surf B Biointerfaces 211:112298. doi:10.1016/j.colsurfb.2021.112298

      Kulkarni CV. 2019. Calculating the “chain splay” of amphiphilic molecules: Towards quantifying the molecular shapes. Chem Phys Lipids 218:16–21. doi:10.1016/j.chemphyslip.2018.11.004

      Maier U, Babich A, Macrez N, Leopoldt D, Gierschik P, Illenberger D, Nürnberg B. 2000. Gβ5γ2 Is a Highly Selective Activator of Phospholipid-dependent Enzymes. J Biol Chem 275:13746–13754. doi:10.1074/jbc.275.18.13746

      Rathinaswamy MK, Dalwadi U, Fleming KD, Adams C, Stariha JTB, Pardon E, Baek M, Vadas O, DiMaio F, Steyaert J, Hansen SD, Yip CK, Burke JE. 2021. Structure of the phosphoinositide 3-kinase (PI3K) p110γ-p101 complex reveals molecular mechanism of GPCR activation. Sci Adv 7:eabj4282. doi:10.1126/sciadv.abj4282

      Wnętrzak A, Lątka K, Dynarowicz-Łątka P. 2013. Interactions of alkylphosphocholines with model membranes-the Langmuir monolayer study. J Membr Biol 246:453–466. doi:10.1007/s00232-013-9557-4

      Yang Y, Lee M, Fairn GD. 2018. Phospholipid subcellular localization and dynamics. J Biol Chem 293:6230–6240. doi:10.1074/jbc.R117.000582

      Yasui M, Matsuoka S, Ueda M. 2014. PTEN Hopping on the Cell Membrane Is Regulated via a Positively-Charged C2 Domain. PLoS Comput Biol 10:e1003817. doi:10.1371/journal.pcbi.1003817

      Ziemba BP, Burke JE, Masson G, Williams RL, Falke JJ. 2016. Regulation of PI3K by PKC and MARCKS: Single-Molecule Analysis of a Reconstituted Signaling Pathway. Biophys J 110:1811–1825. doi:10.1016/j.bpj.2016.03.001

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      We thank the referee for the positive review.

      Reviewer #2 (Public review):

      We thank the referee for his/her constructive comments.

      1. The weakness of this work is the lack of clarification on the function of eIF2A in general. The novelty of this study was limited.

      We believe our study is valuable in providing strong evidence that eIF2A does not functionally substitute for eIF2 in tRNAi recruitment even when eIF2 function is impaired, and in showing that it does not contribute to translational control by uORFs or IRESs, thus ruling out the most likely possibilities for its function in yeast based on studies of the mammalian factor. We agree that the function of yeast eIF2A remains to be identified; however, we think this should be regarded as a limitation rather than a weakness in experimental design or data obtained in the current study.

      1. Related to this, it would be worth investigating common features in mRNAs selectively regulated (surveyed in Figure 3A).

      We did not embark on this because only 17 of the 32 transcripts showing TE reductions in Fig. 3A showed a pattern of TE changes consistent with a conditional requirement for eIF2A under conditions of reduced eIF2 function, exhibiting greater TE decreases when both eIF2 function was impaired by phosphorylation and eIF2A was eliminated from cells. Moreover, we could validate this conditional eIF2A dependence by LUC reporter for only a single mRNA, HKR1.

      Also, it would be worth analyzing the effect of eIF2A deletion on elongation (ribosome occupancy on each codon and/or global ribosome footprint distribution along CDS) and termination/recycling (footprint reads on stop codon and on 3′ UTR).

      We have analyzed the effects of deleting eIF2A on ribosome pausing at individual codons by calculating tri-peptide pause scores from our ribosome profiling data. The results shown in new Fig. 7 reveal that eIF2A plays no discernible role in stimulating the rate of decoding of any three-codon combinations.

      1. Regarding Figure 3D, the reporters were designed to include promoter and 5′ UTR of the target genes. Thus, it should be worth noting that reporter design was based on the assumption that eIF2A-dependency in translation regulation was not dependent on 3′ UTR or CDS region. The reason why the effects on ribosome profiling-supported mRNAs could not be recapitulated in reporter assay may originate from this design. This should be also discussed.

      We agree and included this stipulation in the DISCUSSION, while at the same time noting that the native mRNAs were examined in the orthogonal assay of polysome distributions.

      1. Related to the point above, the authors claimed that eIF2A affects "possibly only one" (HKR1) mRNA. However, this was due to the reporter assay which is technically variable and could not allow some of the constructs to pass the authors' threshold. Alternative wording for this point should be considered.

      We agree and revised text in the DISCUSSION to read: “A possible limitation of our LUC reporter analysis in Fig. 3D was the lack of 3’UTR sequences of the cognate transcripts, which might be required to observe eIF2A dependence. Given that native mRNAs were examined in the orthogonal assay of polysome profiling in Fig. 3E, the positive results obtained there for SAG1 and SVL3 in addition to HKR1 should be given greater weight. Nevertheless, our findings indicate a very limited role of yeast eIF2A in providing a back-up mechanism for Met-tRNAi recruitment when eIF2 function is diminished by phosphorylation of its α-subunit.”

      1. For Figure 3D, it would be worth considering testing the #-marked genes (in Figure 3C) in this set up.

      Actually, we did test 10 of the 17 mRNAs marked with “#”s in the reporter assays of Fig. 3C, which had been noted in the Fig. 3C legend.

      1. In box plots, the authors should provide the statistical tests, at least where the authors explained in the main text.

      At the first occurrence of a notched box plot (Fig. 2D), we explained in the main text that in all such plots, when the notches of different boxes do not overlap, their median values differ significantly with a 95% confidence level. In cases where overlap between notches is difficult to assess by eye, we added the results of Mann-Whitney U tests with the p values indicated by asterisks, as explained in the legends. We added results of additional Mann-Whitney U tests to such box plots in Figs. 3B, 6A-C, and 6-supp. 1E & G and mentioned this in the corresponding legends.
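
The notch-overlap criterion described above can be sketched numerically. The response does not say how the notches were computed, so this sketch assumes the widely used convention of median ± 1.57 × IQR/√n; the function names and data are hypothetical, and the Mann-Whitney U test mentioned in the response is not implemented here.

```python
import math
import statistics


def notch_interval(values):
    """Approximate 95% interval around the median: median ± 1.57 * IQR / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    # Quartiles by linear interpolation over the observed data range.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / math.sqrt(n)
    med = statistics.median(values)
    return med - half_width, med + half_width


def notches_overlap(a, b):
    """True if the notch intervals of the two samples overlap."""
    a_lo, a_hi = notch_interval(a)
    b_lo, b_hi = notch_interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Hypothetical samples for illustration only.
group_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
group_b = [21, 22, 23, 24, 25, 26, 27, 28, 29]
print(notch_interval(group_a))
print(notches_overlap(group_a, group_b))
```

When the intervals are close, as the response notes, an explicit rank-based test is the safer way to judge whether the medians differ.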

      Reviewer #2 (Recommendations For The Authors):

      The first section of "Yeast eIF2A does not play a prominent role as a functional substitute for eIF2 in the presence or absence of amino acid starvation" can be subdivided into a couple of sections for better readability.

      Done.

      Although the authors have used SM to induce ISR in yeasts previously, the validation of eIF2alpha phosphorylation in Western blot would be helpful for readers. Also, it should be worth testing whether eIF2alpha phosphorylation was properly induced in eIF2A KO cells.

      The translational induction of GCN4 mRNA, which we have documented in WT and eIF2A∆ cells, provides a quantitative read-out of eIF2 functional attenuation superior to determining the proportion of eIF2α that is phosphorylated.

      For Figure 2B, a Venn diagram that shows the overlap between genes with TE changes in WT_SM/WT and those in eIF2A∆_SM/eIF2A∆ would be helpful (although a list was provided in the source data).

      The Venn diagram has been provided in a new figure, Figure 2-figure supplement 1B.

      For Figures 1C and 5A-B, the depiction of the positions of uORFs within the orange gene region would be helpful for readers.

      Done.

      For Figure 4A-C, the depiction of the IRES regions (if known) within the orange gene region would be helpful for readers.

      Done for the URE2 IRES, whose location is known.

      For Figures 1C, 4A-C, and 5A-B, the y-axis should have a label/scale.

      Added.

      For Figure 3C, the definition of #-marked genes should be concretely described (e.g., value range) in the legend.

      Added.

      For Figure 3D-E, the statistical test has been only shown in a couple of data. A full depiction of the statistical results for all the data sets may be helpful for readers.

      We explained that when notches in box plots do not overlap, their medians differ with 95% confidence. In cases where overlaps were difficult to discern, we added p values from Mann-Whitney U tests to the relevant box plots.

      For Figure 3E, it would be helpful if the authors could show the UV spectrum of the sucrose density gradient to show the regions isolated for the experiments.

      Added for a representative replicate gradient in the new figure, Figure 3-figure supplement 1.

      Reviewer #3 (Public Review):

      We thank the referee for his/her positive assessment of our study.

      Weaknesses:

      While no role of eIF2A in translation initiation is apparent, the authors do not determine what function eIF2A does play in yeast. Whether it plays a role in regulating translation in a different stress response is not determined.

      We agree that there are many additional possibilities to consider for functions of eIF2A in translation initiation, including different stress situations or mutant backgrounds; however, we regard this as a limitation rather than a weakness in the experimental design and data obtained in the current study in which we examined the most likely possibilities for eIF2A function in yeast based on studies of the mammalian factor.

      Reviewer #3 (Recommendations For The Authors):

      Curiously, the authors indicate that they could not replicate published results for eIF2A's repressor function for URE2, PAB1, or GIC1 translation. This is a little concerning and one wonders if the yeast strain used in the previous study is different in some way from the authors' strain. Did the authors obtain that strain to test it in their assays?

      The same WT and eIF2A∆ strains have been analyzed here and in the two cited studies on yeast IRESs.

      The authors do discuss the fact that eIF2A may function to regulate translation in response to different stresses. It would have been a strength to test an alternative stress in the current study. However, I also appreciate that this could be the subject of a future study.

      Agreed.

      One minor question I have is whether the yeast strains used possess L-A dsRNA virus? While it may not be that this virus would necessarily mask a role of eIF2A-dependent translation, do the authors have any specific thoughts on this? Would different results be obtained if cured strains were used?

      According to Ravoityte et al. (doi: 10.3390/jof8040381), the S. cerevisiae strain we employed, BY4741, harbors L-A-1 dsRNA; however, we have not explored whether curing the virus would alter the consequences of eliminating eIF2A.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Response to reviewers

      We thank the two reviewers for their constructive criticism, which helped to significantly improve our manuscript.

      During the revision process, we had to realize that the localization pattern reported for H. neptunium LmdCN-mCherry was an artifact caused by bleed-through of the BacA-YFP signal in the mCherry channel. More detailed studies showed that the fusion protein was detectable by Western blot analysis but, for unknown reasons, did not produce any fluorescence signal. Therefore, we have now removed the localization data shown in previous Figure 8B,C and Figure 8—figure supplement 1.

      To provide more evidence for a functional interaction between BacA and LmdC in H. neptunium, we have now established an inducible CRISPR interference system for this species and used it successfully to deplete LmdC (new Figure 9A-F). The loss of LmdC causes morphological defects very similar to those observed for the ΔbacA(D) mutant. In line with the physical interaction of BacA with the cytoplasmic region of LmdC observed in vitro, these findings support the hypothesis that the two proteins act in the same pathway. Consistent with the results obtained in H. neptunium, the absence of BacA leads to the delocalization of LmdC in R. rubrum. Moreover, we now provide in vivo evidence for a critical role of the cytoplasmic region of LmdC in the interaction of this protein with BacA in R. rubrum cells (new Figure 11). Together, these new findings strongly support the model that BacA and LmdC form a conserved morphogenetic module involved in the establishment of complex cell shapes in bacteria.

      Please see below for a more detailed explanation of our new results and for our response to the issues raised in the first round of review.

      Reviewer #1 (Public Review)

      In their study, Osorio-Valeriano and colleagues seek to understand how bacterial-specific polymerizing proteins called bactofilins contribute to morphogenesis. They do this primarily in the stalked budding bacterium Hyphomonas neptunium, with supporting work in a spiral-shaped bacterium, Rhodospirillum rubrum. Overall the study incorporates bacterial genetics and physiology, imaging, and biochemistry to explore the function of bactofilins and cell wall hydrolases that are frequently encoded together within an operon. They demonstrate an important, but not essential, function for BacA in morphogenesis of H. neptunium. Using biochemistry and imaging, they show that BacA can polymerize and that its localization in cells is dynamic and cell-cycle regulated. The authors then focus on lmdC, which encodes a putative M23 endopeptidase upstream of bacA in H. neptunium, and find that it is essential for viability. The purified LmdC C-terminal domain could cleave E. coli peptidoglycan in vitro, suggesting that it is a DD-endopeptidase. LmdC interacts directly with BacA in vitro and co-localizes with BacA in cells. To expand their observations, the authors then explore a related endopeptidase/bactofilin pair in R. rubrum; those observations support a function for LmdC and BacA in R. rubrum morphogenesis as well.

      An overall strength of this study is the breadth and completeness of approaches used to assess bactofilin and endopeptidase function in cells and in vitro. The authors establish a clear function for BacA in morphogenesis in two bacterial systems, and demonstrate a physical relationship between BacA and the cell wall hydrolase LmdC that may be broadly conserved. The eventual model the authors favor for BacA regulation of morphogenesis in H. neptunium is that it serves as a diffusion barrier and limits movement of morphogenetic machinery like the elongasome into the elongating stalk and/or bud. However, there is no data presented here to address that model and the role of LmdC in H. neptunium morphogenesis remains unclear.

      We hypothesize that BacA establishes a barrier that prevents the movement of elongasome complexes into the stalk, either directly by steric hindrance and/or indirectly by promoting the formation of an annular region of high positive inner cell curvature that cannot be passed by the elongasome. To test this model, we have now analyzed the localization dynamics of RodZ, a core structural component of the elongasome complex, in wild-type and ΔbacAD cells. We found that wild-type cells show dynamic YFP-RodZ foci whose movement is limited to the mother cell and the nascent bud, with no signal observed in the stalk. In ΔbacAD cells, by contrast, the fusion protein is consistently detected in all regions of the cell, including nascent stalks (new Figure 5). These results support the idea that BacA is required to confine the elongasome to the mother cell and bud regions and, thus, set the limits of the different growth zones in H. neptunium. We also attempted to follow the localization dynamics of other elongasome components, such as PBP2, MreC and MreD, but none of the corresponding fluorescent protein fusions was functional.

      In the past, we tried intensively to generate conditional mutants of lmdC, but all attempts to place the expression of this gene under the control of the copper- or zinc-inducible promoters available for H. neptunium were unsuccessful. To clarify the role of LmdC in H. neptunium morphogenesis, we have now established an inducible CRISPR interference system for this species and managed to block the expression of lmdC using an sgRNA directed against the 5' region of its non-coding strand. We observed that cells lacking LmdC show a phenotype very similar to that of the ΔbacA mutant. Together with the finding that the N-terminal cytoplasmic region of LmdC physically interacts with BacA, this result strongly supports the hypothesis that BacA and LmdC act in the same pathway, forming a complex that ensures proper morphogenesis in H. neptunium (new Figure 9).

      The data presented illuminate aspects of bacterial morphogenesis and the physical and functional relationship between polymerizing proteins and cell wall enzymes in bacteria, a recurring theme in bacterial cell biology with a variety of underlying mechanisms. Bactofilins in particular are relatively recently discovered and any new insights into their functions and mechanisms of action are valuable. The findings presented here are likely to interest those studying bacterial morphogenesis, peptidoglycan, and cytoskeletal function.

      Reviewer #2 (Public Review):

      This is an excellent study. It starts with the identification of two bactofilins in H. neptunium, a demonstration of their important role for the determination of cell shape and discovery of an associated endopeptidase to provide a convincing model for how these two classes of proteins interact to control cell shape. This model is backed up by a quantitative characterisation of their properties using high-resolution imaging and image analysis methods.

      Overall, all evidence is very convincing and I do not have many recommendations on how to improve the manuscript.

      In my opinion, there are only two issues that I have with the paper:

      1. The single particle dynamics of BacA is presented as analysed and I would like to give some suggestions how to maybe extract even more information from the already acquired data:

      1.1. Presentation: Figure 5A is only showing projections of single particle time-lapse movies. To convince the reader that it was indeed possible to detect single molecules it would be helpful if the authors present individual snapshots and intensity traces. In case of single molecules these will show step wise bleaching.

      We have now added a supplementary video that shows both time series and intensity traces of individual BacA-YFP molecules (Figure 6—Video 1). It verifies the step-wise bleaching of the particles observed and thus shows that we observe the mobility of single molecules. Moreover, we have now included a supplementary figure that shows all trajectories identified within representative cells. This visualization provides a more comprehensive view of our data and further supports the notion that our analysis is based on the detection of single molecules.

      1.2. Analysis: Figure 5B and Supplement Figure 1 are showing the single particle tracking results, revealing that there are two populations of BacA-YFP in the cell. However, this data does not show if individual BacA particles transition between these two populations or not. A more detailed analysis of the existing data, where one can try to identify confinement events in single particle trajectories could be very revealing and help to understand the behaviour of BacA in more detail.

      We agree that an analysis of the single-molecule traces for transitions between the mobile and static states would help to achieve a more detailed understanding of the polymerization behavior of BacA. We believe that the dynamic formation, reorganization and disappearance of BacA-YFP foci observed by time-lapse analysis (Figure 4) indicates that BacA undergoes reversible polymerization in vivo. A deeper investigation of this aspect is beyond the scope of the present study and will be performed at a later point.

      2. The title of Fig. 3 says that BacA and BacD copolymerise; however, the data presented to confirm this conclusion are actually rather weak. First, the Alphafold prediction does not show the co-polymer, and second, the in vitro polymerisation experiments were only done with BacA in the absence of BacD. Accordingly, the only evidence that supports this is their colocalization in fluorescence microscopy. I suggest weakening the statement, changing the title, or adding more evidence.

      To support the idea that BacA and BacD interact with each other, we have now added images of cells producing BacA-YFP or BacD-CFP individually (new Figure 3—figure supplement 1B,C). The results obtained show that BacA-YFP alone still forms filamentous structures, whereas BacD-CFP condenses into tight foci in the absence of its paralog. When produced together with BacA-YFP, however, the two proteins colocalize into filamentous structures, supporting the notion that they interact with each other. Nevertheless, we agree that it is unclear whether BacA and BacD copolymerize into mixed protofilaments or whether they form distinct protofilaments that then interact laterally to form larger bundles. We have therefore replaced the term “co-polymerize” with “assemble” in the heading of this section.

      Finally, did the authors think about biochemical experiments to study the interaction between the cytoplasmic part of LmdC and the bactofilins? These could further support their model.

      We show the interaction between the cytoplasmic region of H. neptunium LmdC and BacA in Figure 9G,H (previously Figure 8D,E). For technical reasons, it was not possible to synthesize a peptide comprising the corresponding region of R. rubrum LmdC, so that our in vitro analysis is limited to the H. neptunium proteins.

      To further support the notion that BacA interacts with the cytoplasmic region of LmdC, we have now analyzed the localization behavior of two LmdC variants with amino acid exchanges in the conserved cytoplasmic β-hairpin motif (new Figure 11). Both variants no longer colocalize with BacA and are no longer enriched at the inner cell curve. Interestingly, these exchanges also affect the enrichment of BacA at the inner cell curvature, suggesting that BacA needs to interact with LmdC for proper localization. It is tempting to speculate that BacA polymers have a preferred intrinsic curvature and that the activity of the BacA-LmdC complexes adjusts cell curvature in a manner that facilitates their association with the inner curve.

      Reviewer #1 (Recommendations for The Authors):

      We have the following specific recommendations for the improvement of the manuscript:

      1. Several places would benefit from additional quantitation of data:

      a. Figure 1 and supplements: can cell shape be quantified in a more specific way? (e.g. principal component analysis of shape as in https://onlinelibrary.wiley.com/doi/10.1111/mmi.13218). It looks as if BacD production may partially rescue the bacA shape phenotype?

      We have made considerable efforts to establish methods to quantify morphological changes and protein localization patterns in Hyphomonas neptunium. Since standard software packages, such as Oufti or MicrobeJ, are not able to reliably detect stalks and, thus, typically identify buds as separate cells, we have developed our own analysis software (BacStalk; Hartmann et al, 2020, Mol Microbiol), which is optimized for the detection of thin cellular extensions. However, while this software works very well with wild-type cells, it also fails to recognize amorphous cells with multiple, ill-defined extensions. Given these problems in cell segmentation, it is currently not possible to use principal component analysis to obtain a robust measure of the morphological defects of bactofilin mutants in H. neptunium.

      b. Figures 2-S2b, 7D and 9-S1b - can the area under the peaks be quantified and compared across strains? Visual examination of the spectra makes it difficult to discern differences.

      A direct comparison of the peak areas between strains is not possible, because the absolute values depend on the amount of peptidoglycan used in the muropeptide analyses. It is very difficult to precisely quantify peptidoglycan, which makes it challenging to use equal amounts of material from different strains in the reactions. However, the relative proportion of different muropeptide species, as provided in Figure 2—Dataset 1, faithfully reflects the composition of peptidoglycan and can easily be compared between strains.
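
      The reasoning above, that relative proportions are independent of the amount of material loaded, can be illustrated with a short sketch. The species names and peak areas below are hypothetical placeholders, not values from the datasets.

```python
def relative_abundance(peak_areas):
    """Convert integrated peak areas into percentages of the total area.

    peak_areas: dict mapping a muropeptide species name to its
    integrated peak area (arbitrary units).
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical runs: the same sample loaded at two different amounts.
run_low = {"monomers": 50.0, "dimers": 40.0, "trimers": 10.0}
run_high = {name: 2.5 * area for name, area in run_low.items()}

profile_low = relative_abundance(run_low)
profile_high = relative_abundance(run_high)
```

      The absolute areas in the two runs differ by a factor of 2.5, yet both yield the same relative profile, which is why the relative content is the comparable quantity.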

      c. Figure 9E,F, 9-S4d - BacA and LmdC localization in R. rubrum is very difficult to assess. It does not look linear/filamentous in most cells and is difficult to tell if it is associated with the inner curvature. Can you quantify the position of the signal along the short axis of the cell to better demonstrate that?

      We agree that a better quantification of the distribution of protein along the cell envelope of R. rubrum is required to support the conclusions drawn. To address this issue, we have now used line scans to measure the fluorescence intensities along the inner and outer curve of cells (n=200 per strain) and visualized the data in the form of demographs. The results clearly show an enrichment of BacA and LmdC at the inner curve in wild-type cells and a disruption of this pattern in various mutant backgrounds (new Figures 10F,G,J and 11D,E).

      1. Figure 2-S2A. Does ∆bacD grow better than wild-type? It would also be useful to add growth curves of the bacA complemented strains.

      In the case of H. neptunium growth curves are often misleading, because cells start to aggregate at the late exponential phase due to abundant EPS formation. The degree of cell aggregation also depends on the morphology of cells, because EPS production is limited to the mother cell body, which makes it challenging to compare morphologically distinct mutant strains. We have now performed growth assays for all H. neptunium deletion and complementation strains used in the study and limited the analysis of doubling times to the early and mid-exponential phase, in which cells do not yet form visible aggregates. The results obtained are now included in the new Figure 1F and Figure 1—figure supplement 2D. They show that the doubling times of the different bactofilin mutants are close to that of the wild-type strain.
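
      As a minimal sketch of how a doubling time could be estimated when the analysis is restricted to the early and mid-exponential phase, one can fit a straight line to ln(OD) versus time using only readings below a cutoff. The cutoff value, the function name and the fitting procedure are assumptions for illustration and are not taken from the study.

```python
import math


def doubling_time(times_h, ods, od_max=0.3):
    """Estimate the doubling time (hours) from a least-squares fit of
    ln(OD) against time, using only readings with 0 < OD <= od_max.

    od_max is a hypothetical cutoff marking the end of the window in
    which cells do not yet aggregate.
    """
    pts = [(t, math.log(od)) for t, od in zip(times_h, ods) if 0 < od <= od_max]
    if len(pts) < 2:
        raise ValueError("need at least two readings in the exponential window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    growth_rate = sxy / sxx  # slope of ln(OD) vs time, per hour
    return math.log(2) / growth_rate
```

      Readings above the cutoff are simply ignored, so late time points distorted by aggregation do not influence the estimate.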

      1. Figure 4BC: From the demographs provided, BacA and BacD appear to have different localization dynamics. BacD seems to stay at the base of the stalk, nearest the mother cell, whereas BacA migrates towards the bud? Also, "length" is misspelt in the panels.

      During the transition to bud formation, we indeed observe that the localization patterns of BacA and BacD are in many cases not fully superimposable, with BacD lagging behind BacA and forming transient additional clusters in the vicinity of the stalk base. Examples are now shown in Figure 4—figure supplement 4. This effect explains the distinct patterns in the demographs. We have now modified the text accordingly. We have also corrected the spelling of “length” in the figure.

      1. Can BacD polymerize on its own? It colocalizes with BacA in E. coli but that does not necessarily mean it co-polymerizes.

      Please see our response to a similar issue (point 2) raised by Reviewer #2.

      1. Lines 263-266. You use E. coli PG as a substrate for LmdC in vitro because "peptidoglycan from H. neptunium shows only a low degree of cross-linkage and hardly any pentapeptides." Does this not have relevance to the physiological significance of the observed activity? Or do you presume that LmdC activity (and/or that of other endopeptidases) is very high in H. neptunium so it is difficult to detect additional activity using HnPG as a substrate? It would be useful to clarify this logic in the text.

      DD-crosslinks are formed by all major peptidoglycan biosynthetic complexes, including the elongasome and the divisome, so that their general relevance to cell growth in H. neptunium is beyond doubt. The low degree of crosslinkage observed suggests that H. neptunium contains high endopeptidase activity, which cleaves crosslinks after their formation by DD-transpeptidases. We have now added the explanation “likely due to a high level of autolytic activity” to make this point clearer. Whether LmdC makes a major contribution to the low level of crosslinkage remains to be determined. However, our data suggest that it mostly acts in complex with BacA, so that it may only cleave peptidoglycan locally and not have a global effect on cell wall composition. It would not be possible to detect the DD-endopeptidase activity of LmdC using H. neptunium peptidoglycan as a substrate, because it has a low content of DD-linked peptide chains. To facilitate the in vitro activity assay, we therefore used highly crosslinked peptidoglycan from a mutant E. coli strain.

      1. Lines 268-269: Is there some explanation for why monomers do not increase on LmdC treatment? Here quantitation of peaks before and after treatment would allow the reader to more precisely interpret these data.

      The absolute peak sizes are not comparable, because there is some variation in the amount of peptidoglycan included in the assays (see also our comments on point 1b raised by Reviewer #1) and the integrated peak areas (which correspond to the amounts of muropeptide species produced) depend on both the height and the width of the peaks, which vary to some degree in different HPLC runs. The relevant measure to compare the muropeptide profiles is therefore the relative content of different muropeptide species in the different conditions. For clarification, we have now added the following sentence to the legend of Figure 8D: “A quantification of the relative abundance of different muropeptide species in each condition, based on a comparison of the relative integrated peak areas, is provided in Figure 8—Dataset 1.” The control reaction lacking LmdC only contains peptidoglycan diluted in buffer and thus provides insight into muropeptide composition of untreated peptidoglycan.

      1. Lines 280-283: It would be interesting to know if the transmembrane domain of LmdC is required for its localization since it is dispensable for binding BacA and since LmdC still localizes to foci without BacA.

      Given that it is currently not possible to localize LmdC in H. neptunium, we were not able to perform this analysis.

      1. Line 296: it is also possible that LmdC localizes with another protein and does not independently assemble into larger complexes.

      Since the localization pattern reported for LmdC in the ΔbacAD background is no longer valid, we have not discussed this aspect in the revised version of our manuscript. However, in general, we do not exclude the possibility that LmdC could interact with other peptidoglycan biosynthetic proteins.

      1. Line 304-306 and Fig 9: Is the domain organization of RrLmdC the same as for HnLmdC? It would be useful to include its domain organization as well. Also, please add amino acid numbering to Figure 9B.

      We have now added a schematic showing the domain organization of LmdC from R. rubrum (new Figure 10B). The protein is highly similar to its homolog from H. neptunium.

      1. Line 340-341: "In both cases, they functionally interact with LmdC-type DD-endopeptidases to promote local changes in the pattern of peptidoglycan biosynthesis." This conclusion is not experimentally supported. Since LmdC is essential and you could not make a depletion strain in H. neptunium, it was not shown that the interaction with LmdC is how BacA promotes changes in PG patterning. HADA/FDAA labeling was not performed in R. rubrum, and no global changes in PG chemistry were observed in bacA or lmdC mutants, so you cannot claim BacA or LmdC influences PG patterning there, either. Either soften this statement to a hypothesis or otherwise rephrase.

      To further corroborate a functional interaction between BacA and LmdC, we have now established an inducible CRISPRi system to deplete LmdC from H. neptunium cells (see also our comments on the public review of Reviewer #1). We observe that the loss of LmdC leads to a phenotype very similar to that observed for the ΔbacA(D) mutant, supporting the idea that BacA and LmdC act in the same pathway. We have now also performed localization studies of the elongasome component RodZ in H. neptunium, which demonstrate that the spatial distribution of elongasome complexes is affected in the absence of the bactofilin cytoskeleton. Combined with the observation that LmdC is a catalytically active DD-endopeptidase and its absence leads to morphological defects, these results indicate that BacA, together with LmdC, induces local changes in the pattern of peptidoglycan biosynthesis, both by affecting elongasome movement and, likely, by reducing peptidoglycan crosslinking in the cell envelope regions it occupies.

      1. Figure 9-S4: there is no panel C (change D to C).

      Corrected.

      1. Lines 344-355: No data is presented here to support the barrier model of bactofilin function. In addition, it is unclear why cells would take on amorphous shapes instead of extended rod shapes/filaments if elongasome function was not constrained on the longitudinal axis. It would be helpful to have more discussion of the potential mechanisms of LmdC function in H. neptunium in this section of the discussion since that is the emphasis of the results section.

      To support the barrier model, we have now compared the localization dynamics of the elongasome component RodZ in wild-type and ΔbacAD cells. The results show that RodZ is excluded from the stalk in the wild-type background, whereas it readily enters the stalk in the mutant cells, leading to the expansion of stalks into large, amorphous extensions. Consistent with these findings, HADA labeling is not observed within the stalks in wild-type cells, whereas it is readily observed in the enlarged stalk structures (pseudohyphae) formed in the mutant cells.

      The current model of MreB movement suggests that MreB filaments have an intrinsic curvature and thus preferentially align along regions of similar curvature, which is along the circumference of the cell in rod-shaped geometries. However, previous work has shown that MreB starts to move along randomly oriented trajectories as soon as cells lose their rod-shaped morphology and adopt more spherical shapes (Hussain et al, 2018, eLife). In line with these findings, our current and our previous work (Cserti et al, 2017, Mol Microbiol) indicate that the expansion of the ovoid H. neptunium mother cell prior to the onset of stalk biosynthesis as well as bud formation are mediated by the elongasome complex. Thus, the elongasome can clearly also give rise to shapes other than rods. Interestingly, however, the H. neptunium elongasome also appears to drive the formation of the rod-shaped stalk, possibly by moving around the circumference of the stalk base. Thus, species- or growth phase-dependent regulatory mechanisms or, potentially, differences in the spatial arrangement of the glycan strands within the peptidoglycan layer may result in different modes of elongasome movement and, thus, modulate the morphogenetic activity of elongasome complexes.

      1. Lines 395-397: It is also possible that LmdC positioning is dependent on cell morphology, rather than directly on BacA, since morphology is so distorted in bacA mutant cells.

      We provide several lines of evidence showing that LmdC and BacA functionally and physically interact (see above), making it highly unlikely that the two proteins are not associated with each other. However, our previous (Figure 10I,J) and new (Figure 11) results suggest that the physical interaction with LmdC and/or the cell shape-modulating activity of the complex are required for the proper localization of BacA at the inner curve of the cell. This finding may indicate the existence of a self-reinforcing cycle, in which the morphological changes induced by BacA-LmdC assemblies stimulate the recruitment of additional assemblies to their site of action.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study presents useful findings regarding the impact of forest cover and fragmentation on the prevalence of malaria in non-human primates. The evidence supporting the claims of the authors is, however, incomplete, as the sampling design cannot adequately address the geospatial issues that this study focuses on.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study as a concept is well designed, although there is still one issue I see in the methodology.

      I still have concerns with their attempts to combine the different scales of data. While the use of point data is great, it limits the sample size, and they have included the district- to country-level data to try to increase the sample size. The problem is that they try to get an overall estimate at the district/state/country level by taking 10 random sample points. This would be a suitable method if the primates were evenly distributed across the district/state/country. The reality is that the primates are not evenly distributed, so random point sampling is not a reasonable method to get an estimate of the environmental variables in relation to the macaques. For example, if you had a mountainous country and you took 10 random points to estimate altitude, you would end up with a large number, but if all the animals of interest lived on the coast, your average altitude would be meaningless in relation to the animals of interest, as they all live at low altitude. The fact that the model relies less on highly variable components and more on less variable components is really not relevant, as the district/state/country measurements have no real meaning in relation to the distribution of macaques.

      A simple possible way forward could be to run the model without the district/state/country samples and see what the outcome is. If the outcome is similar then the random point method may be viable (but if it gives the same outcome as ignoring those samples then you don't need the district/state/country samples). If you get a totally different outcome then it should raise concerns about using the district/state/country samples.

      This paper is a really nice piece of work and is a valuable contribution but the district/state/country sample issue really needs to be addressed.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A simple possible way forward could be to run the model without the district/state/country samples and see what the outcome is. If the outcome is similar then the random point method may be viable (but if it gives the same outcome as ignoring those samples then you don't need the district/state/country samples). If you get a totally different outcome then it should raise concerns about using the district/state/country samples.

      Thank you for your comments, and for the suggestions to address the issues identified in your main commentary by running an analysis on exclusively GPS geolocated data points. This was the original plan for analysis, but the available data identified in the literature review includes only 14 data points (macaque P. knowlesi prevalence surveys) with associated GPS coordinates. This was found to be too limited to obtain meaningful results from a regression analysis, and hence we then explored methods for utilising all available data to identify trends whilst accounting for spatial uncertainty in the analysis. As the point location only represents the location of capture and not the extent of the home range of the NHPs, we additionally feel there is value in exploring methods to encompass the wider surrounding habitat.

      We do appreciate the concerns you raise with the random point method being used to represent macaque survey sites when species of interest are not necessarily evenly distributed across an area. To investigate this, we ran a sensitivity analysis on a subset of the dataset according to whether the points fall in areas of >50%, >75% or >90% predicted probability of macaque occurrence, with maps derived from published models of macaque suitability in Southeast Asia. For each of these thresholds, points that fall outside these areas were removed – such that, if a random point is located on a mountain range where there is 0 likelihood of macaque occurrence, it is excluded from the analysis. We found that restricting analysis to areas with highly probable macaque habitat still shows a robust effect of forest cover on NHP prevalence, and additionally that for the most conservative (>90%) habitat threshold there remains an effect of forest fragmentation on prevalence (SI Table S17c, Figure S15c). Given that using the full data set increases the uncertainty, as there is more variation in covariates between the replicates, this can be considered a more conservative approach to detecting an effect of environment as reported in the main findings.
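
      The thresholding step of this sensitivity analysis can be sketched as follows. The record layout, the field names and the example values are hypothetical; in practice the occurrence probability would be read from the published habitat suitability maps at each random point.

```python
def filter_by_occurrence(points, threshold):
    """Keep only points whose predicted probability of macaque
    occurrence exceeds the threshold (e.g. 0.5, 0.75 or 0.9)."""
    return [p for p in points if p["p_occurrence"] > threshold]


# Hypothetical random points sampled within one administrative unit.
random_points = [
    {"id": 1, "p_occurrence": 0.95, "forest_cover": 0.62},
    {"id": 2, "p_occurrence": 0.80, "forest_cover": 0.40},
    {"id": 3, "p_occurrence": 0.55, "forest_cover": 0.15},
    {"id": 4, "p_occurrence": 0.00, "forest_cover": 0.05},  # e.g. mountain range
]

# One filtered subset per threshold; the regression would then be refit on each.
subsets = {t: filter_by_occurrence(random_points, t) for t in (0.5, 0.75, 0.9)}
```

      Each stricter threshold retains fewer points, so the trade-off is between habitat relevance and the number of replicates available to the model.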

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      1. A more thorough analysis of transition boundaries between different types of patterns would further strengthen the conclusions.

      We agree that the transition between different patterning regimes should be discussed more quantitatively in the manuscript. Specifically, we identified a highly sensitive parameter range where the disorder in the patterns rapidly increases as a function of the VEGF stimulus. We have improved our discussion of the transition between ‘ordered-like’ patterns and ‘disordered-like’ patterns in the main text as follows: “At relatively low VEGF levels, the patterns were mostly ordered, with small deviations from the expected ‘salt and pepper’ geometry with a 25%-75% ratio of Tip:Stalk (Fig. 2D). However, as the VEGF input increased, the fraction of Tips grew and the patterns became sharply more disordered over a relatively narrow range of magnitude of the VEGF input, which could be identified as a highly sensitive area separating more ‘ordered-like’ and ‘disordered-like’ patterns. Finally, increasing VEGF stimuli beyond the highly sensitive area further increased the disorder of the patterns, but with a lower VEGF sensitivity, over several more orders of magnitude of VEGF inputs”.

      Reviewer #2 (Recommendations For The Authors):

      Please refer to the Public Comments above for a broad review. Below, I provide specific concerns that could be addressed.

      Main comments

      1. Is the salt-and-pepper model observed for the case when there is no VEGF in the experiments? It would be good to confirm the same. If not, the analysis presented in Fig. 3 could be performed for this case and used as a baseline while referring to the data in Fig. 3.

      We thank the referee for the interesting suggestion. The pattern predicted by the model is not strictly salt-and-pepper in the absence of VEGF, but the disorder quantified in terms of “incorrect” contacts between Tip cells is considerably lower (see for example the disorder quantification in supplementary figure 1C). We have included the Tip-Tip contact statistics for a case of VEGF=1 ng/ml (100-fold lower than the level used in Fig. 3 to compare model and experiment). In this case, there is clearly more spacing between Tip cells, thus demonstrating how high VEGF stimuli increase the probability of contacts between Tip cells. In the main text, we commented: “As a baseline comparison, the mathematical model with a 100-fold reduction of VEGF stimulus (1 ng/ml) exhibited Tip-Tip distance statistics more closely comparable with the ‘salt-and-pepper’ model”.

      1. The authors mention in the Discussion (end of pg. 7) that ...a low level of exogeneous VEGF is essential to induce Delta-NOTCH signalling.. However, in the standard NOTCH signalling (Boareto et al.), we can get the salt-and-pepper pattern without any VEGF. Am I missing something? The authors may want to take a re-look.

      We appreciate the referee’s understanding of the mathematical model. The model used here still exhibits a bistable behavior between the low-Delta and high-Delta cell states even in the absence of VEGF input, as seen for example in the cell state distribution of Fig. 2B, and in agreement with the original model by Boareto et al. This behavior is reflective of the more general applicability of the model, as it describes Delta-NOTCH interactions in various systems. For endothelial cells, VEGF is indeed required to trigger this interaction, but this was not the primary focus of the paper, hence the original model was used. In the text referred to by the reviewer, we are discussing the role of VEGF based on its known biological effects as well as modeling results. We anticipate that the future further adaptation of the model to endothelial cells will refine its description of cell interactions in the absence of VEGF.

      1. The size of cells (or spacing between cell nuclei) is highly variable (Fig. 3). Since it is known that the size of cell-cell junctions influences signalling, it would be good to at least comment on the same, considering that the model in the paper consists of regular static hexagons. Similarly, it seems desirable to comment on expressing the distance between Tip cells (Fig. 3) in cell length units, when the cell lengths are so variable.

      We concur with the suggestion that our consideration of the cell-cell contact size in NOTCH signaling should be clarified in the manuscript.

      Sprinzak et al. reported in their 2017 article published in Developmental Cell that the cell-cell contact area does influence NOTCH Signaling. In this article, they found that NOTCH trans-endocytosis (TEC) for pairs with a larger contact width (25µm) is up to five times higher than for pairs with a smaller contact (2.5µm), as observed through the two-cell TEC assay. While TEC correlates with contact width across a range from 1 to 40µm, the values fluctuate significantly in the middle range, particularly when excluding extremely low cell-cell contact areas.

      In our experiments, we observed that the cell-cell contact area ranges from essentially infinitesimal corner-to-corner contact to roughly 50µm. We excluded the corner contacts, which might correspond to extremely low cell-cell contact areas, from the Tip-Tip distance measurements as depicted in Fig. 3B. We also made the assumption that variations in cell-cell contact size within tens of microns correlate weakly with the strength of NOTCH signaling. This assumption did not impede our effort to compare the overall trends with results from modeling using hexagonal cells, as shown in Figs 6 D&E. We have included this comment and the corresponding reference to elucidate our assumption in the results as follows: In our experiments, the observed cell-cell contact area varied, spanning from very low (cell corner-to-corner contact) up to approximately 50µm. Previous studies(14, 15) have clearly demonstrated the influence of the cell-cell contact area on NOTCH Signaling, but the values get noisy in the middle range, particularly when excluding extremely low cell-cell contact areas. Reflecting these findings, we excluded the corner contacts, which might correspond to extremely low cell-cell contact areas, from the Tip-Tip distance measurements as depicted in Fig. 3B. We also made an assumption that variations in cell-cell contact size within tens of microns correlate weakly with the strength of NOTCH signaling. This assumption did not impede our effort to compare the overall trends with results from modeling using hexagonal cells, as shown in Figs 3 D&E.

      1. The results presented in Fig. 6J are quite striking. However, the number of samples N = 10 and N = 11 seem somewhat low. How does one justify that the findings are not influenced by low number fluctuations?

      We acknowledge the reviewer's concerns regarding potential biases stemming from a limited number of samples. The analysis presented in Fig. 6J was specifically designed to complement and support the findings in Fig. 6H. In this context, the counts of sprout and mini-sprout dots correspond to the number of instances "including a sprout" and "including a mini-sprout."

      While the counts of sprouts and mini-sprouts in Fig. 6H might seem limited as highlighted by the reviewer, the statistical difference between the two groups was found to be significant. Nevertheless, we expanded our regions of interest to encompass neighboring cells, based on the rationale that the local environment might have closely interacting and similar features. The sample sizes in Figure 6J, represented as N=10 and N=11, equate to an examination of 70 cells and 77 cells, respectively. For instance, in the category "including a sprout," five out of ten groups indicated that all seven neighboring cells in a group exhibited fibronectin levels exceeding a given threshold, translating to 35 cells with fibronectin levels above this threshold. Given that the observed trends in distribution were consistently reasonable across the examinations of both 70 and 77 cells, we would like to state that we are confident in our results.
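
      The cell counts quoted above follow from simple arithmetic, sketched here for clarity. The group size of seven cells is taken from the response; the function and variable names are illustrative only.

```python
# Each examined group consists of seven neighboring cells, as stated in the response.
CELLS_PER_GROUP = 7


def cells_examined(n_groups):
    """Total number of cells covered by n_groups groups of seven cells."""
    return n_groups * CELLS_PER_GROUP


sprout_cells = cells_examined(10)        # N = 10 groups "including a sprout"
mini_sprout_cells = cells_examined(11)   # N = 11 groups "including a mini-sprout"
all_above_threshold = cells_examined(5)  # five groups in which all seven cells exceed the threshold
```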

      1. It is written towards the end on pg. 5 that ... although all sprouts indeed formed from mini-sprouts, not all .... However, as can be seen from Fig. 4O, Sprouts can also be generated from Stalk cells. This should be corrected.

      Thank you for highlighting the discrepancy between our statement on page 5 and the observations in Fig. 4O. While all sprouts undergo a mini-sprout phase, the transition from Stalk to mini-sprout is not always observed due to the limitations of our observational timeframe. We acknowledge this oversight and adjusted our statement to clarify that sprouts appearing to form directly from Stalks likely passed through an unobserved intermediate mini-sprout stage as follows: We found that all sprouts formed either directly from Stalks or from mini-sprouts, suggesting a non-observed transition from Stalk to mini-sprout due to observational timeframe limitations. Strikingly, however, not all mini-sprouts persisted and initiated sprout formation.

      1. No solid blue bars are shown in Fig. S2A as mentioned in the caption. Kindly correct.

      We apologize for the mistake. We have corrected the figure to show the blue bars depicting the experimental measurements for sprout distance probability.

      1. How are the high-Delta cells or high-NOTCH cells decided in experiments or simulations? Does it happen that Delta and NOTCH levels are comparable? In that case, what is done? This point could be clarified in the main manuscript or Materials and Methods.

      We agree with the reviewer that Tip cell definition should be clarified. In the model, we define a threshold level for cellular Delta to distinguish Tip and Stalk cells, which is now explained in the Methods section “Definition of Tip cells in the model”. As elaborated in the new section, Delta and NOTCH levels are never comparable due to the circuit’s bistable behavior. In experiments, Tip cells are identified based on their key phenotypic characteristic, invasive migration into the surrounding collagen matrix, rather than Delta or NOTCH levels. The details can be found in “Precise quantification of Tip cell spatial arrangement suggests disordered patterning in the engineered angiogenesis model” section and Figure 3A.
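
      A threshold-based fate assignment of the kind described for the model can be sketched as follows. This is only an illustration under stated assumptions: the threshold value, the example Delta levels and the `classify_cell` helper are hypothetical and are not taken from the manuscript's Methods.

```python
# Hypothetical threshold on cellular Delta (arbitrary units); the actual value
# used by the authors is given in their Methods and is not reproduced here.
DELTA_THRESHOLD = 1000.0


def classify_cell(delta_level, threshold=DELTA_THRESHOLD):
    """Label a simulated cell as Tip (high Delta) or Stalk (low Delta)."""
    return "Tip" if delta_level > threshold else "Stalk"


# Made-up Delta levels for four simulated cells, chosen far from the threshold
# to mimic the two well-separated states of a bistable circuit.
delta_levels = [2500.0, 120.0, 90.0, 3100.0]
fates = [classify_cell(d) for d in delta_levels]
tip_fraction = fates.count("Tip") / len(fates)
```

      Because the two states are well separated, the classification in such a sketch is insensitive to the exact threshold chosen between them.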

      Minor comments

      There are a good number of typos in the paper. The manuscript should be carefully checked and corrected for the same. Below, I provide a few instances.

      1. In the abstract towards the end, it should be "understanding" instead of "understating"

      2. On pg. 5, just before the beginning of the last paragraph, there is a typo "parodied" which should most likely be "provided"

      3. First paragraph on pg. 6 typo "spouts" instead of "Sprouts"

      4. Second paragraph on pg. 6, correctly write "testS"

      5. Near the beginning of pg. 8, should be "C. elegans" instead of "C. elegance"

      6. Figure 1 caption, towards the end, should be "Stalk" instead of "Salk"

      We sincerely appreciate your keen attention to detail. We have thoroughly reviewed the manuscript and made the necessary corrections, including those that you have highlighted.

      Reviewer #3 (Recommendations For The Authors):

      Major concern:

      The authors should discuss in more detail how their work can be used for a better understanding of the angiogenesis process in physiological conditions and in pathological conditions such as post-ischemic revascularization or tumor vascularization.

      We have included comments and the corresponding references to clarify the aspect the reviewer suggested: The results in this study can further inform our understanding of angiogenesis in physiological and pathophysiological conditions. In particular, in many circumstances, the level of VEGF is determined by the degree of hypoxia, which can be highly elevated following oxygen supply interruption, e.g., in wound healing or ischemia, or due to progression of neoplastic growth. Our results suggest that in these cases, formation of sprouts can be dysregulated due to higher incidences of co-localizations of prospective Tip cells. In addition, since these conditions are frequently accompanied by altered synthesis of ECM, the sprout density can increase, which may lead to formation of denser and less developed vascular beds frequently observed as a result of tumor angiogenesis(42, 43). Our results thus suggest that the disorder and higher plasticity of the endothelial cell fate speciation at higher VEGF inputs can be a key contributor to some pathological states associated with persistently hypoxic conditions.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      Summary:

      Ngoune et al. present compelling evidence that Slender cells are challenged to infect tsetse flies. They explore the experimental context of a recent important paper in the field, Schuster et al., that presents evidence suggesting the proliferative Slender bloodstream T. brucei can infect juvenile tsetse flies. Schuster et al. were disruptive to the widely accepted paradigm that the Stumpy bloodstream-form is solely responsible for tsetse infection and T. brucei transmission potential. Evidence presented here shows that in all cases, Stumpy form parasites are exponentially more capable of infecting tsetse flies. They further show that Slender cells do not infect mature flies.

      However, they raise questions of immature tsetse immunological potential and field transmission potential that their experiments do not address. Specifically, they do not show that teneral tsetse flies are immunocompromised, that tsetse flies must be immunocompromised for Slender infection nor that younger teneral tsetse infection is not pertinent to field transmission.

      Strengths:

      Experimental Design is precise and elegant, outcomes are convincing. Discussion is compelling and important to the field. This is a timely piece that adds important data to a critical discussion of host: parasite interactions, of relevance to all parasite transmission.

      Thank you

      Weaknesses:

      As above, the authors dispute the biological relevance of teneral tsetse infection in the wild, without offering evidence to the contrary. Statements need to be softened for claims regarding immunological competence or relevance to field transmission.

      We have modified the revised version to soften these claims (l.156 and l.159). Please note that the limited immunocompetence of teneral flies has been extensively studied by the labs of S. Aksoy at Yale and M. Lehane at Liverpool. In the discussion, we provide key references from these two labs (refs. 18-21). Our comment on the relevance to field transmission is simply based on field observations of the fly biology.

      Reviewer #2:

      Summary:

      Contrary to findings recently reported by Schuster S et al., this short paper shows evidence that the stumpy form of T. brucei is probably the most pre-adapted form to progress with the life cycle of this parasite in the tsetse vector.

      Strengths:

      One of the most important pieces of experimental evidence is that they conduct all fly infection experiments in the absence of metabolites like GlcNAc or S-glutathione; by doing so, the infection rates in flies infected with slender trypanosomes seem very low or non-existent. This, on its own, is a piece of important experimental evidence that the Schuster S et al findings may need to be revisited.

      Thank you

      Weaknesses:

      I consider that the authors should have included their own experiments demonstrating that the addition of these chemicals enhances the infection rates in flies receiving bloodmeals containing slender trypanosomes.

      The main purpose of this study is to assess the intrinsic infectivity of SL vs. ST in teneral vs. adult flies, not to reproduce the results obtained by Schuster et al. We think that the suggested experiment is not necessary as L-glutathione is well-known to enhance infection rates by reducing the fly immune response efficiency (Ref 24). Most of the experimental infections with procyclic or ST forms (even at low densities) published by our lab and others, especially for studying parasite stages in the salivary glands, were actually performed by complementing the infective meal with L-glutathione for this reason.

      Reviewer #3:

      The dogma in the Trypanosome field is that transmission by Tsetse flies is ensured by stumpy forms. This has been recently challenged by the Engstler lab (Schuster et al.), which showed that slender forms can also be transmitted by teneral flies. In this work, the authors aimed to test whether transmission by slender forms is possible and frequent.

      For this, the authors repeated Tsetse transmission experiments but with some key critical differences relative to Schuster et al. First, they infected teneral and adult flies. Second, their infective meals lacked two components (N-acetylglucosamine and glutathione), which could have boosted the infection rates in the Schuster et al. work. In these conditions, the authors observed that most stumpy form infections with teneral and adult flies were successful while only 1 out of 24 slender-form infections was successful. Adult flies showed a lower infection rate, which is probably because their immune system is more developed.

      Given that in Tsetse-infested areas most transmission is likely ensured by adult flies, the authors conclude that the parasite stage that will have a significant epidemiologic impact on transmission is the stumpy form.

      Strengths:

      • This work tackles an important question in the field.

      • The Rotureau laboratory has well-known expertise in Tsetse fly transmission experiments.

      • Experimental setup is robust and data is solid.

      • The paper is concise and clearly written.

      Thank you

      Weaknesses:

      • The reason(s) for why this work has lower infection rates with slender forms than Schuster et al. remain unknown. The authors suggested it could be because of the absence of N-acetylglucosamine and/or glutathione, but this was not formally tested. Could another source of variation be the clone of EATRO1125 AnTat1.1 (Paris versus Munich origin)? To reduce the workload, such additional experiments could be done with just one dose of parasites.

      Differences between the strain clones, the cell culture conditions and/or the fly colony maintenance conditions could indeed explain the differences in infection rates observed in the two studies. However, the main purpose of this study is to assess the intrinsic infectivity of SL vs. ST in teneral vs. adult flies. Our study was designed to stand alone in providing a clear answer to this question, not to reproduce the results obtained by Schuster et al. Hence, we don’t think that any additional experiments are required here.

      • The characterization of what is slender and stumpy is critical. The authors used PAD1 protein expression as the sole reporter. While this is a robust assay to confirm stumpy, an analysis of the cell cycle would have been helpful to confirm that slender forms have not initiated differentiation (Larcombe S et al. 2023, preprint).

      In this study, ST are indeed defined by their general morphology and by the expression of PAD1 proteins at the cell membrane as assessed by IFA. This is the simplest and most accurate ST proxy accessible by IFA. We do not think that monitoring the cell cycle in more detail would provide key information here. If some SL forms had initiated differentiation in our experiments, then the low infection rates observed with SL would have reinforced the fact that mostly mature PAD1+ ST are infectious for flies.

      • Statistical analysis is missing. Is the difference between adult and teneral infections statistically significant?

      An ANOVA statistical analysis was performed and a dedicated section was added to the revised version.

      For all conditions, MG infection rate comparisons between adult and teneral flies were statistically significant.

      Recommendations for the authors:

      Reviewer #1:

      While some perceived outcomes pertaining to immunological competence and transmission relevance of teneral flies are overstated, the overall tone of the paper is inappropriately apologetic. The authors obviously don't want to offend their colleagues but the current writing style obscures meaning, making the paper a bit 'flowery' and difficult to read.

      Ngoune et al. have important outcomes that need to be stated more directly.

      Words such as 'unequivocally' are not appropriate to Schuster et al's outcomes. As your study shows, their findings are experimentally based, with inherent caveats, and are therefore suggestive, not demonstrated or proven.

      The word 'unequivocally' has been removed from the revision.

      Reviewer #3:

      The Engstler lab cultivates AnTat1.1 in methylcellulose (Munich clone, if I am not mistaken). The Rotureau lab uses the Paris AnTat1.1 clone and uses no methylcellulose. Given that methylcellulose helps stumpy formation, it seems important to show that the results of this paper are reproducible with the Munich clone grown in the presence of methylcellulose.

      Differences between the strain clones and culture conditions could indeed explain the differences in infection rates observed in the two studies. However, the main purpose of this study is to assess the intrinsic infectivity of SL vs. ST in teneral vs. adult flies. Our study was designed to stand alone in providing a clear answer to this question, not to reproduce the results obtained by Schuster et al. Hence, we don’t think that any additional experiments are required here.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Summary of the reviewers’ discussion:

      • The development of MSI-1 as a post-transcriptional regulator of gene expression in Escherichia coli represents a valuable addition to the synthetic biology toolkit. MSI-1 has advantages over transcriptional regulators because it has the potential to target single genes in operons. Allosteric control of MSI-1 by oleic acid increases its versatility.

      Authors’ response: We thank the reviewers and editor for this evaluation.

      • We recommend that authors add experiments to test the mechanism of regulation by MSI-1 or soften their claims about translational regulation. We also recommend that the authors expand their discussion of other natural and synthetic regulatory systems that target translation.

      Authors’ response: In this revision, we have added new experimental results from RT-qPCR, bulk fluorometry, and flow cytometry assays to further support our conclusions. We have also enlarged the Introduction and Discussion.

      • Adding an experiment to quantify the effect of oleic acid with the most strongly regulated reporter construct (i.e., flow cytometry with redesign-3) would substantially increase the impact of the work.

      Authors’ response: We have done this experimental quantification (see the new Fig. 5d).

      Reviewer #1 (Public Review):

      The authors develop reporter constructs in E. coli where gene expression, presumably translation, is repressed by MSI-1. This is a potentially useful tool for synthetic biologists, with the advantage over transcriptional regulation that one gene in an operon could be targeted. That being said, an important caveat of translational regulation that is not addressed in the manuscript is the potential for downstream effects on RNA stability and/or transcription termination. The authors' MSI-1-regulated reporter constructs could also be useful for mechanistic studies of MSI-1.

      Authors’ response: We thank the reviewer for such appreciation of our work. Regarding the potential effects on RNA stability or transcription termination, we would like to highlight our results with the sfGFP-mScarlet bicistron (Fig. 6c), showing the specific regulation of sfGFP by MSI-1* and not of mScarlet. Anyway, for this revision we have conducted an RT-qPCR experiment to quantify the mRNA level of sfGFP to further support our conclusions (see the new Fig. S2).

      The authors' initial construct design led to only weak regulation by MSI-1, presumably because the MSI-1 binding sites were not suitably positioned to repress translation initiation. A more rationally designed construct led to considerably greater repression. One weakness of the paper is that the authors did not use their redesigned construct that is more strongly repressed to demonstrate allosteric regulation by oleic acid using a comparable assay (e.g., flow cytometry) to that used in other experiments. The potential for allosteric regulation is a major strength of the MSI-1 system, so this is a significant gap. Similarly, the authors use the weakly regulated constructs to assess the effect of MSI-1 binding site mutations and for their mathematical modeling; these experiments would be better suited to the more strongly regulated construct.

      Authors’ response: For this revision, we have performed the flow cytometric quantification of the allosteric regulation by oleic acid in the redesigned-3 system (see the new Fig. 5d). Regarding the kinetic study, we focused on the reporter system with just one recognition motif for simplicity. A reporter system with two recognition motifs, thereby recruiting two different proteins, increases the complexity to distill the effect of point mutations.

      Reviewer #1 (Recommendations For The Authors):

      1. Figure 5. Panels c-f look at colonies on plates, with numbers from these data being difficult to compare with either the bulk fluorescence or single-cell fluorescence values shown in other figures. Supplementary Figure 8 shows data for single cells; these data would be more appropriate in Figure 5, with the plate-based data moving to the supplement. Moreover, measuring the effect of oleic acid on the redesign-3 reporter using flow cytometry would assess the impact of oleic acid on the most strongly regulated reporter; this would be the most impactful analysis.

      Authors’ response: We have redone Fig. 5 to include flow cytometry data (also for the system implemented with the redesign-3 reporter).

      1. Paragraph starting line 438. The authors should briefly discuss the potential for translational repression leading to reduced RNA stability, and in the case of rapid repression that impacts transcription-coupled translation, its impact on Rho-dependent transcription termination. These factors could alter the expression of neighboring genes.

      Authors’ response: As we have shown with the RT-qPCR experiment, the mRNA level of the target gene does not change in response to protein binding. We agree that mRNA stability could potentially be changed by using other RNA-targeting proteins. But in our view, a reduction of RNA stability is not a regulation of translation. We have added the following sentence in the Discussion: “The additional use of RNA-binding proteins able to alter mRNA stability might lead to the implementation of more complex circuits at the posttranscriptional level.”

      1. Figure 1. It would be informative to include a control where cells have an empty plasmid rather than a plasmid expressing MSI-1, to address leakiness of MSI-1 expression.

      Authors’ response: We have constructed a void plasmid as suggested and performed new bulk fluorometry assays. The new Fig. S8 shows the tight control of MSI-1* expression with the PLlac promoter. No apparent leakage is observed.

      1. Line 132. Where were the two sequences positioned with respect to each other and the start codon? It would be helpful to show the sequence in Figure 1.

      Authors’ response: The precise sequence is shown in the inset of Fig. 1b. The motif is placed just after the start codon.

      1. Line 135. The authors' envisioned repression mechanism isn't clear from the text, specifically the meaning of "block the progression" and "initial phase". As far as I know, there is no precedent for RNA-binding proteins repressing translation in bacteria by preventing translation elongation. Presumably, repression in the context described here would be due to MSI-1 binding over the ribosome-binding site, although the predicted hairpin may also occlude binding of initiating 30S ribosomes in the absence of MSI-1 binding.

      Authors’ response: It is difficult to know the exact mode of action. On page 7, we have rewritten a sentence to read: “In this way, MSI-1* can repress translation by blocking the binding of the ribosome, presumably by imposing a steric hindrance for the 30S ribosomal subunit.”

      1. Figure 1e is overly complicated and hence is difficult to interpret. The key result is that mScarlet expression is unchanged as a function of lactose concentration. It is sufficient to show the inset graph as a supplementary figure panel and to conclude that regulation of sfGFP is at a post-transcriptional level. Similarly, the inset in Figure 4b is unnecessary.

      Authors’ response: The inset of Fig. 1e shows that the growth rate of the cells is almost constant when lactose varies. A change in growth rate will affect protein expression. The use of a two-reporter system, one regulated translationally and the other not, is instrumental to extract from fluorescence data estimates of transcription and translation rates. Of course, showing that mScarlet expression is almost constant when lactose varies would be sufficient, but we believe that performing a fine treatment of the data helps to better understand the regulatory system from a mathematical and mechanistic point of view. Therefore, despite increasing the complexity of the figure, we prefer to keep the representation of the Crick spaces (following Alon’s terminology, see our ref. 32). We have tried to carefully explain Fig. 1e in the text.

      1. Figure 1f and Figure 4c would be easier to interpret as two-dimensional plots.

      Authors’ response: We decided to use 3D plots to have more compact representations of the data in the main figures. The accompanying insets show the percentage of cells above the threshold, which helps to understand the regulatory effects. In any case, we have provided the corresponding 2D plots in Fig. S10.

      1. I don't think Figure 2e is relevant. The key result is shown in Figure 2f, i.e., the effect of mutations on regulation by MSI-1.

      Authors’ response: We agree with the reviewer that the key result is shown in panel f. However, we prefer to keep panel e in Fig. 2 because, even if negative, this result may incite further research. In addition, we avoid the rearrangement of the whole figure.

      1. Lines 311-313. Without additional evidence that the mutants are toxic, I suggest removing this text.

      Authors’ response: As suggested, we have removed that claim.

      Reviewer #2 (Public Review):

      Summary:

      Dolcemascolo and colleagues describe the use of the mammalian RNA-binding protein Musashi-1 (MSI-1) to implement translational regulation systems in E. coli. They perform detailed in vitro studies of MSI-1 and its binding to different RNA sequences. They provide compelling evidence of the effectiveness of the regulatory system in multiple circuits using different mRNA sequence motifs. They harness allosteric inhibition of MSI-1 by omega-9 monounsaturated fatty acids to demonstrate a fatty-acid-responsive circuit in E. coli.

      Strengths:

      The experimental results are compelling and the characterization of the binding between MSI-1 and different RNA sequences is thorough and performed via multiple complementary techniques. Several new useful circuit components are demonstrated.

      Authors’ response: We thank the reviewer for such appreciation of our work.

      Weaknesses:

      MSI-1 provides 8.6-fold downregulation of sfGFP with an optimized mRNA sequence. In some applications, a larger degree of repression may be required.

      Authors’ response: We agree with the reviewer on this point. We expect to conduct further research in the future to optimize the dynamic range of the system. We have added the following sentence in the Discussion: “Further work should be conducted to enhance the fold change of the regulatory module and engineer complex circuits with it.”

      Reviewer #2 (Recommendations For The Authors):

      Overall, I think this paper is very well done and quite thorough. I only have minor suggestions:

      • For Figures 1f and 4c, it is quite hard to interpret the fraction of cells above the threshold with the 3d perspective. It would be clearer to use a more standard 2d plot where the histograms are offset along the y-axis and the threshold is indicated by a vertical line.

      Authors’ response: We decided to use 3D plots to have more compact representations of the data in the main figures. The accompanying insets show the percentage of cells above the threshold, which helps to understand the regulatory effects. In any case, we have provided the corresponding 2D plots in Fig. S10.

      • For Figure 4b, the highlighting of different sequence regions in red3 appears to be offset by one base (e.g. AAU is highlighted rather than AUG).

      Authors’ response: This has been corrected.

      • For line 504, it seems that MSI-1 is used for two different proteins. A different name should be assigned to this 200-residue protein to avoid confusion with the other MSI-1.

      Authors’ response: We now use the term MSI-1h* for the human version of the protein.

      • The note (Page S12) that A_0 + A_R = alpha/delta only applies in steady-state conditions, which should be stated.

      Authors’ response: We have specified that.

      • It seems that some authors work for the companies that sell some of the instruments/consumables used for the assays, specifically switchSENSE and LigandTracer. This may be something that should be declared under Competing Interests for the paper.

      Authors’ response: We are sorry for having missed this point. We have included a Competing Interests section to state that “RAHR and WFV work for Dynamic Biosensors. GPR and JB work for Ridgeview Instruments”.

      Reviewer #3 (Public Review):

      Summary:

      In this work, the authors co-opt the RRM-binding protein Musashi-1 to act as a translational repressor. The novelty of the work is in the adoption of the allosteric RRM protein Musashi-1 into a translational reporter and the demonstration that RRM proteins, which are ubiquitous in eukaryotic systems, but rare in prokaryotic ones, may act effectively as post-translational regulators in E. coli. The extent of repression achieved by the best design presented in this work is not substantially improved compared to other synthetic regulatory schemes developed for E. coli, even those that similarly regulate translation (eg. native PP7 repression is approximately 10-fold, Lim et al. J. Biol. Chem. 2001 276:22507-22513). Furthermore, the mechanism of regulation is not established due to missing key experiments. The work would be of broader interest if the allosteric properties of Musashi-1 were more effective in the context of regulation. Unfortunately, the authors do not demonstrate that fatty acids can completely de-repress expression in the experimental system used for most of their assays, nor do they use this ability in their provided application (NIMPLY gate).

      Authors’ response: For this revision, we have performed the flow cytometric quantification of the allosteric regulation by oleic acid in the redesigned-3 system, showing substantial de-repression of the system with the biochemical compound. We have redone Fig. 5 and modified the Results section accordingly. Aligned with the reviewers and editor, we believe that this new result helps to improve our manuscript.

      Strengths:

      The first major achievement of this work is the demonstration that a eukaryotic RRM protein may be used to posttranscriptionally regulate expression in bacteria. In my limited literature search, this appears to be the first engineering attempt to design an RBP to directly regulate translation in E. coli, although engineered control of translation via other approaches including alterations to RNA structure or via trans-acting sRNAs have been previously described (for review see Vigar and Wieden Biochim Biophys. Acta Gen. Subj. 2017, 1861:3060-3069). Additionally, several viral systems (e.g. MS2 and PP7) have been directly co-opted to work in a similar fashion in the past (utilized recently in Nguyen et al. ACS Synthetic Biol 2022, 11:1710-1718).

      Authors’ response: We thank the reviewer for such appreciation of our work.

      The second achievement of this work is the demonstration that the allosteric regulation of Musashi-1 binding can be utilized to modulate the regulatory activity. However, the liquid culture demonstration (Suppl. Fig 8) shows that this is not a very effective switch, with de-repressed reporter activity showing substantial change but not approaching un-repressed activity. This effect is stronger when colonies are grown on a solid medium (Fig. 5).

      Authors’ response: As we have previously indicated, the flow cytometric quantification of the allosteric regulation by oleic acid in the redesigned-3 system in liquid culture showed substantial de-repression with the biochemical compound. The text now states the following: “Nevertheless, the system implemented with the redesign-3 reporter displayed a better dynamic behavior in response to lactose and oleic acid. In particular, the percentage of cells in the ON state increased from 0 (with 1 mM lactose) to 71% upon addition of 20 mM oleic acid (Fig. 5d).” This new result helps to improve our manuscript.

      Weaknesses:

      In this work, the authors codon optimize the mouse Musashi-1 coding sequence for expression in E. coli and demonstrate using an sfGFP reporter that an engineered Musashi-1 binding site near the translational start site is sufficient to enable a modest reduction in reporter gene expression. The authors postulate that the reduction in expression is due to inhibition of ribosome translocation along the transcript (lines 134/135), as expression of a control transcript (mScarlet) driven by the same promoter (Plac) but without the Musashi-1 recognition site does not demonstrate the same repression. However, the situation could be more complex. Other possibilities include inhibition of translation initiation rather than elongation, as well as accelerated mRNA decay of transcripts that are not actively translated. The authors do not present any measurements of sfGFP mRNA levels.

      Authors’ response: On page 7, we have rewritten a sentence to read: “In this way, MSI-1* can repress translation by blocking the binding of the ribosome, presumably by imposing a steric hindrance for the 30S ribosomal subunit.” In addition, for this revision we have conducted an RT-qPCR experiment to quantify the mRNA level of sfGFP to further support our conclusions (see the new Fig. S2). As shown, there is no change in the mRNA level upon inducing the system with lactose.

      In subsequent sections of the work, the authors create a series of point mutations to assess RNA-protein binding and assess these via both a sfGFP reporter and in vitro binding assays (switchSENSE). Ultimately, it is difficult to fully rationalize and interpret the behavior of these mutants in the context provided. The authors do identify a relationship between equilibrium constant (1/KD) and fold-repression. However, it is not clear from the narrative why this relationship should exist. Fold-repression is one measure of regulator efficacy, but it is an indirect measure determined from unrepressed and repressed expression. It is not clear why unrepressed expression (in the absence of the protein) is expected to be a function of the equilibrium constant.

      Authors’ response: A mathematical derivation from mass action kinetics of why the fold change scales with 1/KD is provided in Note S2. It is the ratio between the unrepressed and repressed expression (i.e., the fold change) that scales with 1/KD, not the expression of a particular state. This kind of relationship has been previously established in the case of transcription regulation [see e.g. Garcia & Phillips, PNAS (2011), our ref. 39]. Our mathematical modeling results expand previous work by providing a single picture from which to analyze transcription and translation regulation.
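
      The 1/KD scaling can be illustrated with a minimal sketch, assuming a simple equilibrium occupancy model in which a bound repressor fully blocks translation of its target mRNA. This is an illustration under stated assumptions, not the derivation given in Note S2; the function name and the free repressor concentration `repressor_conc` are hypothetical. With occupancy R/(R + KD), the unbound fraction is KD/(R + KD), so the ratio of unrepressed to repressed expression is 1 + R/KD.

```python
def fold_change(repressor_conc: float, kd: float) -> float:
    """Unrepressed / repressed expression under simple equilibrium binding.

    Assumes (hypothetically) that bound mRNA is not translated at all and that
    repressor_conc and kd are expressed in the same concentration units.
    """
    if kd <= 0 or repressor_conc < 0:
        raise ValueError("kd must be positive and repressor_conc non-negative")
    occupancy = repressor_conc / (repressor_conc + kd)  # fraction of mRNA bound
    unbound_fraction = 1.0 - occupancy                  # fraction still translated
    return 1.0 / unbound_fraction                       # equals 1 + repressor_conc / kd


# Illustrative values only: tighter binding (smaller KD) gives a larger fold change.
for kd in (100.0, 10.0, 1.0):
    print(f"KD = {kd:g}: fold change = {fold_change(10.0, kd):.2f}")
```

      In this sketch the unrepressed expression level cancels out of the ratio, which is why only the fold change, and not either expression state on its own, depends on 1/KD.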

      Subsequent rational redesign of the Musashi-1 binding sequence to produce three alternative designs shows that fold-repression may be improved to approximately 8.6-fold. However, the rationalization of why the best design (red3) achieves this increase based on either the extensive modelling or in vitro measured binding constants is not well articulated. Furthermore, this extent of regulation is approximately that which can be achieved from the PP7 system with its native components (Lim et al. J. Biol. Chem. 2001 276:22507-22513).

      Authors’ response: In the case of translation control, the regulation is more challenging because the target is quickly degraded, especially in bacteria (in contrast to transcription control, where the target is stable). This is acknowledged in the manuscript. Even so, it is possible to engineer synthetic circuits with sRNAs or RNA-binding proteins with sufficient dynamic range. We expect to conduct further research in the future to optimize the dynamic range of the system. We have added the following sentence in the Discussion: “Further work should be conducted to enhance the fold change of the regulatory module and engineer complex circuits with it.” Regarding the articulation of the results for the mutants and mathematical model, see our responses to the following questions.

      The application provided for this regulator (NIMPLY gate), is not an inherently novel regulatory paradigm, and it does not capitalize on the allosteric properties of Musashi-1, but rather treats Musashi-1 as a non-allosteric component of a regulatory circuit.

      Authors’ response: The NIMPLY gate refers to lactose and aTC as inputs. Considering oleic acid as an additional input will lead to a more complex logic. In the last Results section, we wanted to show that the post-transcriptional mechanism engineered with Musashi-1 can be useful to specifically regulate a gene within an operon, to implement combinatorial regulation (i.e., coupling transcription and translation control), and to reduce protein expression noise. To these ends, the allosteric ability of Musashi-1 was not decisive. In this regard, it is true that such fine regulatory effects might be achieved as well with non-allosteric RNA-binding proteins, such as MS2CP or PP7CP.

      Reviewer #3 (Recommendations For The Authors):

      1. In the introduction the authors should adequately address the native bacterial mechanisms that allow posttranscriptional regulation in bacteria as well as better discuss previous examples of translational repressors.

      Authors’ response: We have added the following paragraph in the Introduction: “Even though bacteria do not appear to exploit proteins to regulate translation in a gene-specific manner, it is worth noting that some bacteriophages do follow this mechanism to modulate their infection cycle. These are the cases, e.g., of the coat proteins of the phages MS2 (infecting Escherichia coli) or PP7 (infecting Pseudomonas aeruginosa), which regulate the expression of the cognate phage replicases through protein-RNA interactions [18]. However, one limitation for synthetic biology developments is that such phage proteins are not allosteric. At the post-transcriptional level, bacteria mostly rely on a large palette of cis- and trans-acting non-coding RNAs to either activate or repress protein expression, resulting in the regulation of translation initiation, mRNA stability, or transcription termination, and even allowing sensing small molecules [1,15]. Thus, there should be efforts to replicate this functional versatility with proteins in bacteria.”

      1. Given the location of the Musashi-1 binding site in the sfGFP reporter, it may be blocking translation initiation, rather than blocking the progression of the ribosome once attached (line 134/135). The schematic in Fig 1a. is also not overly clear in describing the differences in mechanisms between eukaryotic and prokaryotic systems described in the text.

      Authors’ response: On page 7, we have rewritten a sentence to read: “In this way, MSI-1 can repress translation by blocking the binding of the ribosome, presumably by imposing a steric hindrance for the 30S ribosomal subunit.” On page 14, we have added the following sentence: “In this way, MSI-1 can also block the RNA component of the 30S ribosomal subunit.”

      1. The authors did not directly examine mRNA levels of their reporter to establish translational regulation. In many cases, inhibition of translation is accompanied by an increased degradation rate in bacterial systems. The authors do not seem to recognize this as a possible amplifier in their system, relying exclusively on normalization via another transcript produced from the same promoter (mScarlet).

      Authors’ response: For this revision we have conducted an RT-qPCR experiment to quantify the mRNA level of sfGFP to further support our conclusions (see the new Fig. S2). As shown, there is no change in the mRNA level upon inducing the system with lactose.

      1. The results presented for mutations 1-5 are not consistent with the author's models for what is occurring. In particular, mutant 1 displays a reduction in reporter production in the absence of Musashi-1, but the production in the presence does not change from the unaltered sequence. The claim that mutation 1 (in the UAG binding site) results in less binding and ultimately in less regulation is not substantiated since this loss of regulation is due to a reduction in unrepressed expression rather than an increase in expression when Musashi-1 is present.

      Authors’ response: We respectfully disagree with this appreciation. In the case of mutant 1, if the Musashi protein recognized the target mRNA with the same affinity as in the original scenario, the red bar would be much lower. Because the Musashi protein hardly recognizes the mutant-1 mRNA, the blue and red bars are quite similar. To clarify this point, we have added the following text in the manuscript: “Despite that mutation substantially reduced sfGFP expression in absence of MSI-1*, the presumed repressed state upon addition of lactose did not change much, suggesting the difficulty of the protein for targeting the mutated mRNA.”

      1. Given point 5 above, it is not clear to me why one would expect 1/KD to be predictive of fold-repression in the presence and absence of the repressor. I would rather see the relationship described as predictive in Fig. 2f (fold change vs. 1/KD) rather than the non-linear relationship. It is difficult to qualitatively evaluate the fit quality with the way the data are currently presented.

      Authors’ response: Note S2 provides a mathematical derivation from mass action kinetics of why the fold change scales with 1/KD. The R² value that we provide for the fitting corresponds to the linear regression between fold and 1/KD, as specified in the figure legend. However, we think that the representation of fold vs. KD in log scale is more illustrative in this case.

      1. It is not clear what conclusion is determined from the computational modeling, or how this work contributes to the narrative presented. It does not seem like what is learned from these experiments is utilized for novel designs. Furthermore, several of the assumptions within the model may be problematic including the high rate of "elongation leakage" described and the lack of justification for RNA degradation rates utilized.

      Authors’ response: The mathematical modeling was performed to rationalize our experimental data. Our aim was to recapitulate the observed dynamics rather than to guide the design of new systems. Our model might be exploited to this end in further research, as the reviewer suggests. Besides, elongation leakage is a concept that applies to both transcription and translation regulation systems, and it is nothing more than the ability of the RNA polymerase or ribosome to elongate even if there is a protein bound to the nucleic acid. This parameter can be set to 0 in the model if appropriate. Moreover, we cite the paper by Bernstein et al., PNAS (2002), our ref. 38, to justify that in E. coli the average mRNA half-life is about 5 min (i.e., degradation rate of 0.14 min⁻¹).
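
      The conversion from half-life to degradation rate quoted above follows from first-order decay, where the rate constant equals ln(2) divided by the half-life. A minimal sketch of that arithmetic, assuming simple exponential decay of the mRNA (the function name is illustrative):

```python
import math


def degradation_rate(half_life_min: float) -> float:
    """First-order degradation rate (per minute) from a half-life in minutes."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_min


rate = degradation_rate(5.0)      # average E. coli mRNA half-life of about 5 min
print(f"{rate:.2f} per minute")   # prints "0.14 per minute"
```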

      1. The data presented in Figure 4 are not presented in a consistent way. While it would be somewhat redundant, including the 0 and 1 mM lactose data for red3 in Figure 4a would be helpful for comparison purposes.

      Authors’ response: We have added the requested bar plot in Fig. 4a.

      1. The presence of additional Musashi-1 sites upstream of the start codon in red3, and their impact on the fold-repression, may support a model of inhibition of translation initiation rather than inhibition of elongation.

      Authors’ response: On page 7, we have rewritten a sentence to read: “In this way, MSI-1 can repress translation by blocking the binding of the ribosome, presumably by imposing a steric hindrance for the 30S ribosomal subunit.” On page 14, we have added the following sentence: “In this way, MSI-1 can also block the RNA component of the 30S ribosomal subunit.”

    1. Author Response

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors aim to address a critical challenge in the field of bioinformatics: the accurate and efficient identification of protein binding sites from sequences. Their work seeks to overcome the limitations of current methods, which largely depend on multiple sequence alignments or experimental protein structures, by introducing GPSite, a multi-task network designed to predict binding residues of various molecules on proteins using ESMFold.

      Strengths:

      1. Benchmarking. The authors provide a comprehensive benchmark against multiple methods, showcasing the performances of a large number of methods in various scenarios.

      2. Accessibility and Ease of Use. GPSite is highlighted as a freely accessible tool with user-friendly features on its website, enhancing its potential for widespread adoption in the research community.

      We thank the reviewer for acknowledging the contributions and strengths of our work!

      Weaknesses:

      1. Lack of Novelty. The method primarily combines existing approaches and lacks significant technical innovation. This raises concerns about the original contribution of the work in terms of methodological development. Moreover, the paper reproduces results and analyses already presented in previous literature, without providing novel analysis or interpretation. This further diminishes the contribution of this paper to advancing knowledge in the field.

      The novelty of this work is primarily manifested in four key aspects. Firstly, although we agree with the reviewer that we did employ several existing tools such as ProtTrans and ESMFold to extract sequence features and predict protein conformations, these techniques were hardly explored in the field of binding site prediction. We have successfully demonstrated the feasibility of substituting multiple sequence alignments with language model embeddings and training with “less accurate” predicted structures, providing a new solution to overcome the limitations of current methods for genome-wide applications. Secondly, though a few methods tend to capture geometric information based on protein surfaces or atom graphs, surface calculation and property mapping are usually time-consuming, while message passing on full atom graphs is memory-consuming and thus makes long sequences challenging to process. Besides, these methods are sensitive to details and errors in the predicted structures. To facilitate large-scale annotations, we have innovatively applied geometric deep learning to protein residue graphs for comprehensively capturing backbone and sidechain geometric contexts in an efficient and effective manner (Figure 1). Thirdly, we have not only exploited multi-task learning to integrate diverse ligands and enhance performance, but also shown its capability to easily extend to the binding site prediction of other unseen ligands (Figure 4 D-E). Last but not least, as a Tools and Resources article, we have provided a fast, accurate and user-friendly webserver, as well as constructed a large annotation database for the sequences in Swiss-Prot. Leveraging this database, we have conducted extensive analyses on the associations between binding sites and molecular functions, biological processes, and disease-causing mutations (Figure 5), indicating the potential of our tool to unveil unexplored biology underlying genomic data.

      1. Benchmark Discrepancies. Benchmark results vary, especially between the initial comparisons and those with PeSTo. GPSite achieves a PR AUC of 0.484 on the global benchmark but a PR AUC of 0.61 on the benchmark against PeSTo. For consistency, PeSTo should be included in the benchmark against all other methods. The discrepancy suggests potential issues with the benchmark set or the stability of the method. This inconsistency needs to be addressed to validate the reliability of the results.

      We thank the reviewer for the constructive comments. Since our performance comparison experiments involved numerous competitive methods whose training sets were disparate, it was difficult to compare or rank all these methods fairly using a single test set. As described in the “GPSite outperforms state-of-the-art methods” section, 358 out of 375 proteins in our protein-protein binding site test set share >30% sequence identity with the training sequences of PeSTo. To address this, we meticulously re-split our entire protein-protein binding site dataset to generate a new test set that avoids any overlap with the training sets of both GPSite and PeSTo and performed a separate evaluation. This is quite common in this field. For instance, in the study of PeSTo [Nat Commun 2023], the comparisons of PeSTo with MaSIF-site, SPPIDER, and PSIVER were conducted using one test set, while the comparison with ScanNet was performed on a separate test set. Based on the reviewer’s suggestion, in the revised version of the manuscript, we intend to include other comparative methods alongside PeSTo on the new test set or retrain our model directly on PeSTo's training set for comparison, which should enhance the completeness of our results.

      1. Interface Definition Ambiguity. There is a lack of clarity in defining the interface for the binding site predictions. Different methods are trained using varying criteria (surfaces in MaSIF-site, distance thresholds in ScanNet). The authors do not adequately address how GPSite's definition aligns with or differs from these standards and how this issue was addressed. It could indicate that the comparison of those methods is unreliable and unfair.

      We thank the reviewer for the comments. The precise definition of ligand-binding sites is given in the “Benchmark datasets” section. Specifically, the datasets of DNA, RNA, peptide, ATP, HEM and metal ions used to train GPSite were collected from the widely acknowledged BioLiP database [PMID: 23087378]. In BioLiP, a residue is defined as binding if the smallest atomic distance between the residue and the ligand is <0.5 Å plus the sum of the van der Waals radii of the two nearest atoms. Most comparative methods for these ligands were also trained on data from BioLiP, thereby ensuring fair comparisons.
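
      The distance criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not BioLiP's actual code: the atom representation, the function name, and the radius table (commonly quoted Bondi values) are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Atom = Tuple[str, Sequence[float]]  # (element symbol, (x, y, z) coordinates in Å)

# Commonly quoted Bondi van der Waals radii in Å, assumed here for illustration;
# the radius table actually used by BioLiP may differ.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80, "P": 1.80}


def is_binding_residue(
    residue_atoms: List[Atom], ligand_atoms: List[Atom], margin: float = 0.5
) -> bool:
    """Return True if the closest residue-ligand atom pair lies within the sum
    of the two atoms' van der Waals radii plus `margin` (in Å)."""
    if not residue_atoms or not ligand_atoms:
        return False
    # Find the nearest atom pair, keeping track of the two elements involved.
    distance, elem_r, elem_l = min(
        (
            (math.dist(pos_r, pos_l), elem_r, elem_l)
            for elem_r, pos_r in residue_atoms
            for elem_l, pos_l in ligand_atoms
        ),
        key=lambda item: item[0],
    )
    return distance < VDW_RADII[elem_r] + VDW_RADII[elem_l] + margin


# Hypothetical coordinates: a carbon and an oxygen 3.5 Å apart (cutoff 3.72 Å).
print(is_binding_residue([("C", (0.0, 0.0, 0.0))], [("O", (3.5, 0.0, 0.0))]))
```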

      However, since BioLiP does not include data on protein-protein binding sites, studies for protein-protein binding site prediction may adopt slightly distinct label definitions, as the reviewer suggested. Here, we employed protein-protein binding site data from our previous study [PMID: 34498061], where a protein-binding residue was defined as a surface residue (relative solvent accessibility > 5%) that lost more than 1 Å² absolute solvent accessibility after protein-protein complex formation. This definition was initially introduced in PSIVER [PMID: 20529890] and widely applied in various studies (e.g., PMID: 31593229, PMID: 32840562). SPPIDER [PMID: 17152079] and MaSIF-site [PMID: 31819266] have also adopted similar surface-based definitions as PSIVER. On the other hand, ScanNet [PMID: 35637310] employed an atom distance threshold of 4 Å to define contacts while PeSTo [PMID: 37072397] used a threshold of 5 Å. However, it is noteworthy that current methods in this field including ScanNet [Nat Methods 2022] and PeSTo [Nat Commun 2023] directly compared methods using different label definitions without any alignment in their benchmark studies, likely due to the subtle distinctions among these definitions. For instance, the study of PeSTo directly performed comparisons with ScanNet, MaSIF-site, SPPIDER, and PSIVER. Therefore, we followed these previous works, directly comparing GPSite with other protein-protein binding site predictors. In our revised manuscript, we will provide more details for the binding site definitions to avoid any potential ambiguity.
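
      The surface-based label described above can be expressed as a short rule. The sketch below assumes the solvent accessibility values have already been computed by an external tool for the unbound chain and for the complex; the function and parameter names are hypothetical.

```python
def is_protein_binding_residue(
    rsa_unbound: float,
    asa_unbound: float,
    asa_bound: float,
    rsa_cutoff: float = 0.05,
    delta_asa_cutoff: float = 1.0,
) -> bool:
    """Surface-based interface label.

    rsa_unbound: relative solvent accessibility (0 to 1) in the isolated chain.
    asa_unbound, asa_bound: absolute solvent accessibility (Å²) of the residue
    in the isolated chain and in the complex, respectively.
    """
    is_surface = rsa_unbound > rsa_cutoff    # surface residue: RSA > 5%
    buried_area = asa_unbound - asa_bound    # accessibility lost on complex formation
    return is_surface and buried_area > delta_asa_cutoff


# Hypothetical residue: 30% exposed, loses 15 Å² on complex formation.
print(is_protein_binding_residue(0.30, 45.0, 30.0))
```

      A distance-based definition, such as the 4 Å or 5 Å atom-distance thresholds mentioned for ScanNet and PeSTo, would replace the accessibility test with a nearest-atom distance check, which is why labels from the two families of definitions can differ for residues at the rim of an interface.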

      While GPSite demonstrates the potential to surpass state-of-the-art methods in protein binding site prediction, the evidence supporting these claims seems incomplete. The lack of methodological novelty and the unresolved questions in benchmark consistency and interface definition somewhat undermine the confidence in the results. Therefore, it's not entirely clear if the authors have fully achieved their aims as outlined.

      The work is useful for the field, especially in disease mechanism elucidation and novel drug design. The availability of genome-scale binding residue annotations GPSite offers is a significant advancement. However, the utility of this tool could be hampered by the aforementioned weaknesses unless they are adequately addressed.

      We thank the reviewer for acknowledging the advancement and value of our work, as well as pointing out areas where improvements can be made. As discussed above, we will carry out the corresponding revisions in the next version of the manuscript to enhance the completeness and clearness of our work.

      Reviewer #2 (Public Review):

      Summary:

      This work provides a new framework, "GPsite" to predict DNA, RNA, peptide, protein, ATP, HEM, and metal ions binding sites on proteins. This framework comes with a webserver and a database of annotations. The core of the model is a Geometric featurizer neural network that predicts the binding sites of a protein. One major contribution of the authors is the fact that they feed this neural network with predicted structure from ESMFold for training and prediction (instead of native structure in similar works) and a high-quality protein Language Model representation. The other major contribution is that it provides the public with a new light framework to predict protein-ligand interactions for a broad range of ligands.

      The authors have demonstrated the interest of their framework with mostly two techniques: ablation and benchmark.

      Strengths:

      The performance of this framework as well as the provided dataset and web server make it useful to conduct studies.

      The ablations of some core elements of the method, such as the protein Language Model part, or the input structure are very insightful and can help convince the reader that every part of the framework is necessary. This could also guide further developments in the field. As such, the presentation of this part of the work can hold a more critical place in this work.

      We thank the reviewer for recognizing the contributions of our work and for noting that our experiments are thorough.

      Weaknesses:

      Overall, we can acknowledge the important effort of the authors to compare their work to other similar frameworks. Yet, the lack of homogeneity of training methods and data from one work to the other makes the comparison slightly unconvincing, as the authors pointed out. Overall, the paper puts significant effort into convincing the reader that the method is beating the state of the art. Maybe, there are other aspects that could be more interesting to insist on (usability, interest in protein engineering, and theoretical works).

      We sincerely appreciate the reviewer for the constructive and insightful comments. As to the concern of training data heterogeneity raised by the reviewer, it is noteworthy that current studies in this field, such as ScanNet [Nat Methods 2022] and PeSTo [Nat Commun 2023], tend to directly compare methods trained on different datasets in their benchmark experiments. Therefore, we have adhered to the paradigm in these previous works. According to the detailed recommendations by the reviewer, we will improve our manuscript by incorporating additional ablation studies regarding the effects of predicted structures and language model representations. Besides, we will refine the Discussion section to focus more on the achievements of this work and its potential applications including protein engineering. A comprehensive point-by-point response to the reviewer’s recommendations will be provided alongside the revised manuscript. This will ensure that all concerns and suggestions are adequately addressed.

      Reviewer #3 (Public Review):

      Summary

      The authors of this work aim to address the challenge of accurately and efficiently identifying protein binding sites from sequences. They recognize the limitations of current methods, including reliance on multiple sequence alignments or experimental protein structures and the under-explored geometry of the structure, which limit performance and genome-scale applications. The authors have developed a multi-task network called GPSite that predicts binding residues for a range of biologically relevant molecules, including DNA, RNA, peptides, proteins, ATP, HEM, and metal ions, using a combination of sequence embeddings from protein language models and ESMFold-predicted structures. Their approach attempts to extract residual and relational geometric contexts in an end-to-end manner, surpassing current sequence-based and structure-based methods.

      Strengths

      1. The GPSite model's ability to predict binding sites for a wide variety of molecules, including DNA, RNA, peptides, and various metal ions.

      2. Based on the presented results, GPSite outperforms state-of-the-art methods in several benchmark datasets.

      3. GPSite adopts predicted structures instead of native structures as input, enabling the model to be applied to a wider range of scenarios where native structures are rare.

      4. The authors emphasize the low computational cost of GPSite, which enables rapid genome-scale binding residue annotations, indicating the model's potential for large-scale applications.

      We thank the reviewer for recognizing the significance and value of our work!

      Weaknesses

      1. One major advantage of GPSite, as claimed by the authors, is its efficiency. Although the manuscript mentioned that the inference takes about 5 hours for all datasets, it remains unclear how much improvement GPSite can offer compared with existing methods. A more detailed benchmark comparison of running time against other methods is recommended (including the running time of different components, since some methods like GPSite use predicted structures while some use native structures).

      We thank the reviewer for the valuable suggestion. Empirically, it takes about 30 min for existing MSA-based methods to make predictions for a protein with 500 residues, while it only takes less than 1 min for GPSite (including structure prediction). However, it is worth noting that some predictors in our benchmark study are solely available as webservers, and it is challenging to compare the runtime between a standalone program and a webserver due to the disparity in hardware configurations. Therefore, we will include comprehensive runtime comparisons between the GPSite webserver and other existing servers in the revision to illustrate the practicality and efficiency of our method.

      2. Since the model uses predicted protein structure, the authors have conducted some studies on the effect of the predicted structure's quality. However, only the 0.7 threshold was used. A more comprehensive analysis with several different thresholds is recommended.

      We thank the reviewer for the comment. We assessed the effect of the predicted structure's quality by evaluating GPSite’s performance on high-quality (TM-score > 0.7) and low-quality (TM-score ≤ 0.7) predicted structures. We did not employ multiple thresholds (e.g., 0.3, 0.5, and 0.7), as the majority of proteins in the test sets were accurately predicted by ESMFold. Specifically, as shown in Figure 3B, Appendix 3-figure 2 and Appendix 2-table 5, the numbers of proteins with TM-score ≤ 0.7 are small in most datasets. Consequently, there is insufficient data available for analysis with lower thresholds, except for the RNA test set. Notably, Figure 3C presents a detailed inspection of the proteins with TM-score < 0.5 in the RNA test set. Within this subset, GPSite consistently outperforms the state-of-the-art structure-based method GraphBind with predicted structures as input, regardless of the prediction quality of ESMFold. Only in cases where structures are predicted with extremely low quality (TM-score < 0.3) does GPSite fall behind GraphBind input with native structures. This result further demonstrates the robustness of GPSite.

      3. To demonstrate the robustness of GPSite, the authors performed a case study on human GR containing two zinc fingers, where the predicted structure is not perfect. The analysis could benefit from a more detailed explanation of why the model can still infer the binding site correctly even though the input structural information is slightly off.

      We thank the reviewer for the comment. We have actually explained the potential reason for the robustness of GPSite in the second paragraph of the “GPSite is robust for low-quality predicted structures” section. In summary, although the whole structure of this protein is not perfectly predicted, the binding domains of peptide, DNA and Zn2+ are actually predicted accurately as evidenced by the superpositions of the native and predicted structures in Figure 3D and 3E. Therefore, GPSite can still make reliable predictions.

      4. To analyze the relatively low AUC value for protein-protein interactions, the authors claimed that it is "due to the fact that protein-protein interactions are ubiquitous in living organisms while the Swiss-Prot function annotations are incomplete", which is unjustified. It is highly recommended to support this claim by showing at least one example where GPSite's prediction is a valid binding site that is not present in the current Swiss-Prot database or via other approaches.

      We thank the reviewer for the valuable recommendation. We will perform such analysis in the revised manuscript.

      5. The authors reported that many GPSite-predicted binding sites are associated with known biological functions. Notably, for RNA-binding sites, there is a significantly higher proportion of translation-related binding sites. The analysis could benefit from a further investigation into this observation, such as analyzing the percentage of such interactions in the training set. In addition, if there is sufficient data, it would also be interesting to see the cross-interaction-type performance of the proposed model, e.g., train the model on a dataset excluding specific binding sites and test its performance on that class of interactions.

      We thank the reviewer for the suggestion. We would like to clarify that the analysis in Figure 5C was conducted at “protein-level” instead of “residue-level”. As described in the second paragraph of the “Large-scale binding site annotation for Swiss-Prot” section, a protein-level ligand-binding score was assigned to a protein by averaging the top k residue-level predictive binding scores. This protein-level score indicates the overall binding propensity of the protein to a specific ligand. We gathered the top 20,000 proteins with the highest protein-level binding scores for each ligand and found that their biological process annotations from Swiss-Prot were consistent with existing knowledge.
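
      The protein-level score described above can be sketched as follows. This is a minimal illustration and not the GPSite implementation: the function names, the default `k=10`, and the toy scores are assumptions made here, since the response does not state the value of k.

```python
def protein_level_score(residue_scores, k=10):
    """Average the k highest residue-level binding scores of one protein.

    k=10 is a placeholder; the value used by the authors is not stated here.
    """
    if not residue_scores:
        raise ValueError("protein has no residue scores")
    # If the protein has fewer than k residues, all of them are averaged.
    top_k = sorted(residue_scores, reverse=True)[:k]
    return sum(top_k) / len(top_k)


def top_proteins(scores_by_protein, k=10, n_top=20000):
    """Rank proteins by protein-level score and keep the n_top highest."""
    ranked = sorted(
        scores_by_protein,
        key=lambda name: protein_level_score(scores_by_protein[name], k),
        reverse=True,
    )
    return ranked[:n_top]


# Hypothetical residue-level scores for two toy proteins.
example = {"protA": [0.1, 0.9, 0.8, 0.2], "protB": [0.3, 0.4, 0.2]}
print(top_proteins(example, k=2, n_top=1))
```

      Averaging only the top k residues, rather than all residues, keeps a long protein with a small but confident binding patch from being diluted by its many non-binding residues.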

      As for the cross-interaction-type performance raised by the reviewer, we will include such analysis in the revised manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study combines genetically barcoded rabies viruses with spatial transcriptomics in vivo in the mouse brain to decode connectivity of neural circuits. The data generated by the combination of these approaches in this new way is mostly convincing as the authors provide validation and proof-of-concept that the approach can be successful. While this new combination of established techniques has promise for elucidating brain connectivity, there are still some nuances and caveats to the interpretations of the results that are lacking especially with regards to noting unexpected barcodes either due to unexpected/novel connections or unexpected rabies spread.

      In this revised manuscript, we added a new control experiment and additional analyses to address two main questions from the reviewers: (1) How the threshold of glycoprotein transcript counts used to identify source cells was determined, and (2) whether the limited long-range labeling was expected in the trans-synaptic experiment. The new experiments and analyses validated the distribution of source cells and presynaptic cells observed in the original barcoded transsynaptic tracing experiment and validated the choice of the threshold of glycoprotein transcripts. As the reviewers suggested, we also included additional discussion on how future experiments can improve upon this study, including strategies to improve source cell survival and minimize viral infection caused by leaky expression of TVA. We also provided additional clarification on the analyses for both the retrograde labeling experiment and the trans-synaptic tracing experiment. We modified the Results and Discussion sections on the trans-synaptic tracing experiment to improve clarity to general readers. Detailed changes to address specific comments by reviewers are included below.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this preprint, Zhang et al. describe a new tool for mapping the connectivity of mouse neurons. Essentially, the tool leverages the known peculiar infection capabilities of Rabies virus: once injected into a specific site in the brain, this virus has the capability to "walk upstream" the neural circuits, both within cells and across cells: on one hand, the virus can enter from a nerve terminal and infect retrogradely the cell body of the same cell (retrograde transport). On the other hand, the virus can also spread to the presynaptic partners of the initial target cells, via retrograde viral transmission.

      Similarly to previously published approaches with other viruses, the authors engineer a complex library of viral variants, each carrying a unique sequence ('barcode'), so they can uniquely label and distinguish independent infection events and their specific presynaptic connections, and show that it is possible to read these barcodes in-situ, producing spatial connectivity maps. They also show that it is possible to read these barcodes together with endogenous mRNAs, and that this allows spatial mapping of cell types together with anatomical connectivity.

      The main novelty of this work lies in the combined use of rabies virus for retrograde labeling together with barcoding and in-situ readout. Previous studies had used rabies virus for retrograde labeling, albeit with low multiplexing capabilities, so only a handful of circuits could be traced at the same time. Other studies had instead used barcoded viral libraries for connectivity mapping, but mostly focused on the use of different viruses for labeling individual projections (anterograde tracing) and never used a retrograde-infective virus.

      The authors creatively merge these two bits of technology into a powerful genetic tool, and extensively and convincingly validate its performance against known anatomical knowledge. The authors also do a very good job at highlighting and discussing potential points of failure in the methods.

      We thank the reviewer for the enthusiastic comments.

      Unresolved questions, which more broadly also affect other viral-labeling methods, are for example how to deal with uneven tropism (i.e., if the virus is unable or inefficient in infecting some specific parts of the brain), or how to prevent the cytotoxicity induced by the high levels of viral replication and expression, which will tend to produce "no source networks", neural circuits whose initial cell can't be identified because it's dead. This last point is particularly relevant for in-situ based approaches: while high expression levels are desirable for the particular barcode detection chemistry the authors chose to use (gap-filling), they are also potentially detrimental for cell survival, and risk producing extensive cell death (which indeed the authors single out as a detectable pitfall in their analysis). This is likely to be one of the major optimisation challenges for future implementations of these types of barcoding approaches.

      As the reviewer suggested, we included additional discussion about tropism and cytotoxicity in the revised Discussion. Our sensitivity for barcode detection is sufficient, since we estimated (based on manual proofreading) that most barcoded neurons had more than ten counts of a barcode in the trans-synaptic tracing experiment. The high sensitivity may potentially allow us to adapt next-generation rabies virus with low replication, such as the third generation ΔL rabies virus (Jin et al., 2022, bioRxiv) in future optimizations.

      Overall the paper is well balanced, the data are well presented and the conclusions are strongly supported by the data. Impact-wise, the method is definitely going to be useful for the neurobiology research community.

      We thank the reviewer for her/his enthusiasm.

      Reviewer #2 (Public Review):

      Although the trans-synaptic tracing method mediated by the rabies virus (RV) has been widely utilized to infer input connectivity across the brain to a genetically defined population in mice, the analysis of labeled pre-synaptic neurons in terms of cell-type has been primarily reliant on classical low-throughput histochemical techniques. In this study, the authors made a significant advance toward high-throughput transcriptomic (TC) cell typing by both dissociated single-cell RNAseq and the spatial TC method known as BARseq to decode a vast array of molecularly labeled ("barcoded") RV vector library. First, they demonstrated that a barcoded-RV vector can be employed as a simple retrograde tracer akin to AAVretro. Second, they provided a theoretical classification of neural networks at the single-cell resolution that can be attained through barcoded-RV and concluded that the identification of the vast majority (ideally 100%) of starter cells (the origin of RV-based trans-synaptic tracing) is essential for the inference of single-cell resolution neural connectivity. Taking this into consideration, the authors opted for the BARseq-based spatial TC that could, in principle, capture all the starter cells. Finally, they demonstrated the proof-of-concept in the somatosensory cortex, including inferred connectivity from 381 putative pre-synaptic partners to 31 uniquely barcoded-starter cells, as well as many insightful estimations of input convergence at the cell-type resolution in vivo. While the manuscript encompasses significant technical and theoretical advances, it may be challenging for the general readers of eLife to comprehend. The following comments are offered to enhance the manuscript's clarity and readability.

      We modified the Results and Discussion sections on the trans-synaptic tracing experiment to improve clarity to general readers. We separated out the theoretical discussion about barcode sharing networks as a separate subsection, explicitly stated the rationale of how different barcode sharing networks are distinguished in the in situ trans-synaptic tracing experiment, and added additional discussion on future optimizations. Detailed descriptions are provided below.

      Major points:

      1. I find it difficult to comprehend the rationale behind labeling inhibitory neurons in the VISp through long-distance retrograde labeling from the VISal or Thalamus (Fig. 2F, I and Fig. S3) since long-distance projectors in the cortex are nearly 100% excitatory neurons. It is also unclear why such a large number of inhibitory neurons was labeled at a long distance through RV vector injections into the RSP/SC or VISal (Fig. 3K). Furthermore, a significant number of inhibitory starter cells in the somatosensory cortex was generated based on their projection to the striatum (Fig. 5H), which is unexpected given our current understanding of the cortico-striatum projections.

      The labeling of inhibitory neurons can be explained by several factors in the three different experiments.

      (1) In the scRNAseq-based retrograde labeling experiment (Fig. 2 and Fig. S3), the injection site VISal is adjacent to VISp. Because we dissected VISp for single-cell RNAseq, we may find labeled inhibitory neurons at the VISp border that extend short axons into VISal. We explained this in the revised Results.

      (2) In the in situ sequencing-based retrograde labeling experiment (Fig. 3,4), the proximity between the two injection sites VISal and RSP/SC, and the sequenced areas (which included not only VISp but also RSP) could also contribute to labeling through local axons of inhibitory neurons. Furthermore, because we also sequenced midbrain regions, inhibitory neurons in the superior colliculus could pick up the barcodes through local axons. We included an explanation of this in the revised Results.

      (3) In the trans-synaptic tracing experiment, we speculate that low-level leaky expression from the TREtight promoter led to non-Cre-dependent expression in many neurons. To test this hypothesis, we first performed a control injection in which we saw that the fluorescent protein expression was indeed restricted to layer 5, as expected from corticostriatal labeling. Based on the labeling pattern, we estimated that about 12 copies of the glycoprotein transcript per cell would likely be needed to achieve fluorescent protein expression. Since many source cells in our experiment were below this threshold, these results support the hypothesis that the majority of source cells with low level expression of the glycoprotein were likely Cre-independent. Because these cells could still contribute to barcode sharing networks, we could not exclude them as in a conventional bulk trans-synaptic tracing experiment. In future experiments, we can potentially reduce this population by improving the helper AAV viruses used to express TVA and the glycoprotein. We included this explanation in Results and more detailed analysis in Supplementary Note 2, and discussed potential future optimizations in the Discussion. This new analysis in Supplementary Note 2 is also related to the Reviewer’s question regarding the threshold used for determining source cells (see below).

      2. It is unclear as to why the authors did not perform an analysis of the barcodes in Fig. 2. Given that the primary objective of this manuscript is to evaluate the effectiveness of multiplexing barcoded technology in RV vectors, I would strongly recommend that the authors provide a detailed description of the barcode data here, including any technical difficulties or limitations encountered, which will be of great value in the future design of RV-barcode technologies. In case the barcode data are not included in Fig. 2, I would suggest that the authors consider excluding Fig. 2 and Fig. S1-S3 in their entirety from the manuscript to enhance its readability for general readers.

      In the single-cell RNAseq-based retrograde tracing, all barcodes recovered matched to known barcodes in the corresponding library. We included a short description of these results in the revised manuscript.

      3. Regarding the trans-synaptic tracing utilizing a barcoded RV vector in conjunction with BARseq decoding (Fig. 5), which is the core of this manuscript, I have a few specific questions/comments. First, the rationale behind defining cells with only two rolony counts of rabies glycoprotein (RG) as starter cells is unclear. Why did the authors not analyze the sample based on the colocalization of GFP (from the AAV) and mCherry (from the RV) proteins, which is a conventional method to define starter cells? If this approach is technically difficult, the authors could provide an independent histochemical assessment of the detection stringency of GFP-positive cells based on two or more rolonies of RG.

      In situ sequencing does not preserve fluorescent protein signals, so we used transcript counts to determine which cells expressed the glycoprotein. We have added new analyses in the Results and in Supplementary Note 2 to determine the transcript counts that were equivalent to cells that had detectable BFP expression. We found that BFP expression is equivalent to ~12 counts of the glycoprotein transcript per cell, which is much higher than the threshold we used. However, we could not solely rely on this estimate to define the source cells, because cells that had lower expression of the glycoprotein (possibly from leaky Cre-independent expression) may still pass the barcodes to presynaptic cells. This can lead to an underestimation of double-labeled and connected-source networks and an overestimation of single-source networks and can obscure synaptic connectivity at the cellular resolution. We thus used a very conservative threshold of two transcripts in the analysis. This conservative threshold will likely overestimate the number of source cells that shared barcodes and underestimate the number of single-source networks. Since this is a first study of barcoded transsynaptic tracing in vivo, we chose to err on the conservative side to make sure that the subsequent analysis has single-cell resolution. Future characterization and optimization may lead to a better threshold to fully utilize data.
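
      The effect of the threshold choice can be illustrated with a small sketch. It is not the authors' pipeline: the cell names, counts, barcodes, and the "at least N transcripts" rule are assumptions made for illustration, and the labels only count how many source cells carry each barcode.

```python
from collections import Counter

CONSERVATIVE_THRESHOLD = 2    # glycoprotein transcripts per cell used in the analysis
BFP_EQUIVALENT_THRESHOLD = 12  # approximate count matching detectable BFP


def source_cells(glyco_counts, threshold):
    """Cells with at least `threshold` glycoprotein transcripts (assumed rule)."""
    return {cell for cell, n in glyco_counts.items() if n >= threshold}


def classify_barcodes(cell_barcodes, sources):
    """Label each barcode by the number of source cells that carry it."""
    n_sources = Counter()
    for cell, barcodes in cell_barcodes.items():
        for bc in barcodes:
            n_sources[bc] += cell in sources  # adds 1 for source cells, 0 otherwise
    labels = {}
    for bc, n in n_sources.items():
        if n == 0:
            labels[bc] = "no source found"
        elif n == 1:
            labels[bc] = "single-source"
        else:
            labels[bc] = "shared by source cells"
    return labels


# Hypothetical data: transcript counts and barcodes detected in three cells.
glyco = {"c1": 14, "c2": 3, "c3": 0}
barcodes = {"c1": {"A"}, "c2": {"A"}, "c3": {"A", "B"}}

for threshold in (CONSERVATIVE_THRESHOLD, BFP_EQUIVALENT_THRESHOLD):
    labels = classify_barcodes(barcodes, source_cells(glyco, threshold))
    print(threshold, sorted(labels.items()))
```

      In this toy example, barcode A counts as shared between source cells at a threshold of two transcripts but as single-source at twelve, which is the direction of bias described in the response.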

      Second, it is difficult to interpret the proportion of the 2,914 barcoded cells that were linked to barcoded starter cells (single-source, double-labeled, or connected-source) and those that remained orphan (no-source or lost-source). A simple table or bar graph representation would be helpful. The abundance of the no-source network (resulting from Cre-independent initial infection of the RV vector) can be estimated in independent negative control experiments that omit either Cre injection or AAV-RG injection. The latter, if combined with BARseq decoding, can provide an experimental prediction of the frequency of double-labeled events since connected-source networks are not labeled in the absence of RG.

      We have added Table 2, which breaks down the 2,914 barcoded cells based on whether they are presynaptic or source cells, and which type of network they belong to. We agree with the reviewer that the additional Cre- or RG- control experiments in parallel would allow an independent estimate of the double labeled networks and the no-source networks. We have added a discussion of possible controls to further optimize the trans-synaptic tracing approach in future studies in the Discussion.

      Third, I would appreciate more quantitative data on the putative single-source network (Fig. 5I and S6) in terms of the distribution of pre- and post-synaptic TC cell types. The majority of labeling appeared to occur locally, with only two thalamic neurons observed in sample 25311842 (Fig. S6). How many instances of long-distance labeling (for example, > 500 microns away from the injection site) were observed in total? Is this low efficiency of long-distance labeling expected based on the utilized combinations of AAVs and RV vectors? A simple independent RV tracing solely detecting mCherry would be useful for evaluating the labeling efficiency of the method. I have experienced similar "less jump" RV tracing when RV particles were prepared in a single step, as this study did, rather than multiple rounds of amplification in traditional protocols, such as Osakada F et al Nat Protocol 2013.

      We imaged an animal that was injected in parallel to assess labeling (now included in Supplementary Note 2 and Supp. Fig. S5). The labeling pattern in the newly imaged animal was largely consistent with the results from the barcoded experiment: most labeled neurons were seen in the vicinity of the injection site, and sparser labeling was seen in other cortical areas and the thalamus. We further found that most neurons that were labeled in the thalamus were about 1 mm posterior to the center of the injection site, and thus would not have been sequenced in the in situ sequencing experiment (in which we sequenced about 640 µm of tissue spanning the injection site).

      In addition, we found that the bulk of the cells that expressed mCherry from the rabies virus only partially overlapped with the area that contained cells co-expressing BFP with the rabies glycoprotein. Moreover, very few cells co-expressed mCherry and BFP, which would be considered source cells in a conventional mono-synaptic tracing experiment. The small numbers of source cells likely also contributed to the sparseness of long-range labeling in the barcoded experiment.

      These interpretations and comparisons to the barcoded experiment are now included in Supplementary Note 2.

      Reviewer #3 (Public Review):

      The manuscript by Zhang and colleagues attempts to combine genetically barcoded rabies viruses with spatial transcriptomics in order to genetically identify connected pairs. The major shortcoming with the application of a barcoded rabies virus, as reported by 2 groups prior, is that with the high dropout rate inherent in single cell procedures, it is difficult to definitively identify connected pairs. By combining the two methods, they are able to establish a platform for doing that, and provide insight into connectivity, as well as pros and cons of their method, which is well thought out and balanced.

      Overall the manuscript is well-done, but I have a few minor considerations about tone and accuracy of statements, as well as some limitations in how experiments were done. First, the idea of using rabies to obtain broader tropism than AAVs isn't really accurate - each virus has its own set of tropisms, and it isn't clear that rabies is broader (or can be made to be broader).

      As the reviewer suggested, we toned down this claim and stated that rabies virus has different tropism to complement AAV.

      Second, rabies does not label all neurons that project to a target site - it labels some fraction of them.

      We meant to say that retrograde labeling is not restricted to labeling neurons from a certain brain region. We have clarified this in the text.

      Third, the high rate of rabies virus mutation should be considered - if it is, or is not a problem in detecting barcodes with high fidelity, this should be noted.

      Our analysis showed that sequencing 15 bases was sufficient to tolerate a small number of mismatches in the barcode sequences and could distinguish real barcodes from random sequences (Fig. 4A). Thus, we can tolerate mutations in the barcode sequence. We have clarified this in the text.
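
      Mismatch-tolerant barcode matching of this kind can be sketched as below. The sketch is an assumption-laden illustration, not the method used in the manuscript: the function names, the toy library, and the default of one allowed mismatch are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def match_barcode(read, library, max_mismatches=1):
    """Return the single library barcode within max_mismatches of the read.

    Returns None when no barcode, or more than one barcode, is close enough,
    so random or ambiguous sequences are rejected rather than assigned.
    """
    hits = [bc for bc in library if hamming(read, bc) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 15-base library and reads.
library = ["ACGTACGTACGTACG", "TTTTGGGGCCCCAAA"]
print(match_barcode("ACGTACGTACGTACC", library))  # one mismatch to the first barcode
print(match_barcode("GGGGGGGGGGGGGGG", library))  # far from both barcodes
```

      With 15 sequenced bases, a random sequence is unlikely to fall within one or two mismatches of a given barcode, which is why a small mismatch tolerance can absorb mutations without admitting many spurious matches.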

      Fourth, there are a number of implicit assumptions in this manuscript, not all of which are equally backed up by data. For example, it is not clear that all rabies virus transmission is synaptic specific; in fact, quite a few studies argue that it is not (e.g., detection of rabies transcripts in glial cells). Thus, arguments about lost-source networks and the idea that if a cell is lost from the network, that will stop synaptic transmission, is not clear. There is also the very real propensity that, the sicker a starter cell gets, the more non-specific spread of virus (e.g., via necrosis) occurs.

      We agree with the reviewer that how strictly virus transmission is restricted to synapses remains a hotly debated question in the field, and this question is relevant not only to techniques based on barcoded rabies tracing, but to all trans-synaptic tracing experiments. A barcoding-based approach can generate single-cell data that enable direct comparison to other data modalities that measure synaptic connectivity, such as multi-patch and EM. These future experiments may provide additional insights into the questions that the reviewer raised. We have included additional discussion about how non-synaptic transmission of barcodes because of the necrosis of source cells may affect the analysis in the Discussion.

      Regarding the scenario in which the source cell dies, we agree with the reviewer and have clarified this in the revised manuscript.

      Fifth, in the experiments performed in Figure 5, the authors used a FLEx-TVA expressed via a retrograde Cre, and followed this by injection of their rabies virus library. The issue here is that there will be many (potentially thousands) of local infection events near the injection site that are TVA-mediated but Cre-independent (=off-target expression of TVA in the absence of Cre). This is a major confound in interpreting the labeling of these cells. They may express very low levels of TVA, but still have infection be mediated by TVA. The authors did not clearly explore how expression of TVA related to rabies virus infection of cells near the rabies injection site. A modified version of TVA, such as 66T, should have been used to mitigate this issue. Otherwise, it is impossible to determine connectivity locally. The authors do not go to great lengths to interpret the findings of these observations, so I am not sure this is a critical issue, but it should be pointed out by the authors as a caveat to their dataset.

      We agree with the reviewer that this type of infection could potentially be a major contributor to no-source networks, which were abundant in our experiment. Because small no-source networks were excluded from our analyses, and large no-source networks were only included for barcodes with low frequency (i.e., it would be nearly impossible statistically to generate such large no-source networks from independent infections), we believe that the effect of independent infections on our analyses was minimized. We have added a control experiment in Fig S5 and Supplementary Note 2, which further supported the hypothesis that there were many independent infections. We also included additional discussion about how this can be assessed and optimized in future studies in the Discussion.

      Sixth, the authors are making estimates of rabies spread by comparison to a set of experiments that was performed quite differently. In the two studies cited (Liu et al., done the standard way, and Wertz et al., tracing from a single cell), the authors were likely infecting with a rabies virus using a high multiplicity of infection, which likely yields higher rates of viral expression in these starter cells and higher levels of input labeling. However, in these experiments, the authors need to infect with a low MOI, and explicitly exclude cells with >1 barcode. Having only a single virion trigger infection of starter cells will likely reduce the #s of inputs relative to starter neurons. Thus, the stringent criteria for excluding small networks may not be entirely warranted. If the authors wish to only explore larger networks, this caveat should be explicitly noted.

      In the trans-synaptic labeling experiment, we actually used high rabies titer (200 nL, 7.6e10 iu/mL) that was comparable to conventional rabies tracing experiments. We did not exclude cells with multiple barcodes (as opposed to barcodes in multiple source cells), because we could resolve multiple barcodes in the same cell and indeed found many cells with multiple barcodes. We have clarified this in the text.
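
      For scale, the injected dose implied by these numbers can be worked out directly. The helper name below is hypothetical; the volume and titer are the values quoted above. Converting the result into a multiplicity of infection would additionally require the number of infectable cells, which is not given here.

```python
def infectious_units(volume_nl, titer_iu_per_ml):
    """Total infectious units delivered by an injection."""
    volume_ml = volume_nl * 1e-6  # 1 nL = 1e-6 mL
    return volume_ml * titer_iu_per_ml


# 200 nL at 7.6e10 iu/mL, as stated in the response.
dose = infectious_units(200, 7.6e10)
print(f"{dose:.3g} infectious units")
```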

      Overall, if the caveats above are noted and more nuance is added to some of the interpretation and discussion of results, this would greatly help the manuscript, as readers will be looking to the authors as the authority on how to use this technology.

      In addition to addressing the specific concerns of the reviewer as described above, we modified the Results and Discussion sections on the trans-synaptic tracing experiment to improve clarity to general readers and expanded the discussion on future optimizations.

      Reviewer #1 (Recommendations For The Authors):

      The scientific problem is clearly stated and well laid out, the data are clearly presented, and the experiments well justified and nicely discussed. It was overall a very enjoyable read. The figures are generally nice and clear, however, I find the legends excessively concise. A bit too often, they just sort of introduce the title of the panel rather than a proper explanation of what is depicted. A clear case is for example visible in Fig 2, where the description of the panels is minimal, but this is a general trend of the manuscript. This makes the figures a bit hard to follow as self-contained entities, without having to continuously go back to the main text. I think this could be improved with longer and more helpful descriptions.

      We have revised all figure legends to make them more descriptive.

      Other minor things:

      In the cDNA synthesis step for in-situ sequencing, I believe the authors might have forgotten one detail: the addition of aminoallyl dUTP to the RT reaction. If I recall correctly this is done in BARseq. The fact that the authors crosslink with BS-PEG on day 2, makes me suspect they spike in these nucleotides during the RT but this is not specified in the relevant step. Perhaps this is a mistake that needs correction.

      The RT primers we used have an amine group at 5’, which directly allows crosslinking. Thus, we did not need to spike in aminoallyl dUTP in the RT reaction. We have clarified this in the Methods.

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, there are frequent references to the "Methods" section for important details. However, it can be challenging to determine which specific section of the Methods the authors are referring to, and in some cases, a thorough examination of the entire Methods section fails to locate the exact information needed to support the authors' claims. Below are a few specific examples of this issue. The authors are encouraged to be more precise in their references to the Methods section.

      In the revised manuscript, we numbered each subsection of Methods and updated pointers and associated hyperlinks in the main text to the subsection numbers.

      • On page 7, line 14, it is unclear how the authors compared the cell marker gene expression with the marker gene expression in the reference cell type.

      We have clarified this in the revised manuscript.

      • On page 7, line 33, the authors note that some barcodes may have been missed during the sequencing of the rabies virus libraries, but the Methods section lacked a convincing explanation on this issue (see my point 2 above).

      We included a separate subsection on the sequencing of rabies libraries and the analysis of the sequencing depth in the Methods. In this new subsection, we further clarified our reasoning for identifying the lack of sequencing depth as a reason for missing barcodes, especially in comparison to sequencing depth required for establishing exact molecule counts used in established MAPseq and BARseq techniques with Sindbis libraries.

      • On page 9, line 44, the authors state that they considered a barcode to be associated with a cell if they found at least six molecules of that barcode in a cell, as detailed in the Methods section. However, the rationale behind this level of stringency is not provided in the Methods.

      We initially chose this threshold based on visual inspection of the sequencing images of the barcoded cells. Because the labeled cell types were consistent with our expectations (Fig. 4E-G), we did not further optimize the threshold for detecting retrogradely labeled barcoded cells.
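      The thresholding rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `assign_barcodes`, the `(cell_id, barcode)` input layout, and the toy reads are all assumptions made for the example. Only the cutoff of six molecules comes from the text.

```python
from collections import Counter

# Cutoff stated in the response: at least six molecules of a barcode in a cell.
MIN_MOLECULES = 6


def assign_barcodes(reads, min_molecules=MIN_MOLECULES):
    """Assign barcodes to cells by molecule count.

    `reads` is an iterable of (cell_id, barcode) pairs, one pair per detected
    molecule (a hypothetical input layout chosen for this sketch).
    Returns a dict mapping cell_id to the sorted list of barcodes that reach
    the threshold in that cell.
    """
    counts = Counter(reads)  # molecules per (cell, barcode) pair
    assigned = {}
    for (cell_id, barcode), n in counts.items():
        if n >= min_molecules:
            assigned.setdefault(cell_id, []).append(barcode)
    return {cell: sorted(barcodes) for cell, barcodes in assigned.items()}


# Invented toy data: cell_1 has 7 molecules of AAGT and 2 of CCTA,
# cell_2 has exactly 6 molecules of CCTA.
reads = (
    [("cell_1", "AAGT")] * 7
    + [("cell_1", "CCTA")] * 2
    + [("cell_2", "CCTA")] * 6
)
result = assign_barcodes(reads)
```

      With these toy reads, only the seven-molecule barcode is kept for `cell_1`, and the barcode with exactly six molecules is kept for `cell_2`. Lowering `min_molecules` shows how sensitive the assignment is to the cutoff.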

      • I have noticed that some important explanations of figure panels are missing in the legends, making it challenging to understand the figures. Below are typical examples of this issue.

      In addition to the examples that the reviewer mentioned below, we also revised many other figure panels to make them clear to the readers.

      • In Fig. 2, "RV into SC" in panel C does not make sense, as RV was injected into the thalamus. There is no explanation of the images in this panel C.

      We have corrected the typo in the revision.

      • In Fig. 3, information on the endogenous gene panel for cell type classification (Table S3) could be mentioned in the legend or corresponding text.

      We now cite Table S3 both in Fig 3 legend and in the main text. We also included a list of the 104 cell type marker genes we used in Table S3.

      • In panel J, it is unclear why the total number of BC cells is 2,752, and not 4,130 as mentioned in the text.

      This is a typo. We have corrected this in the revision. The correct number (3,746) refers to the number of cells that did not belong to either of the two categories at the bottom of the panel, and not the total number of neurons. To make this clear, we now also include the total number of barcoded cells at the top of the panel.

      • In Fig. 4, the definitions of "+" and "−" symbols in panels K and L are unclear. Also, it seems that the second left column of panel K should read "T −."

      We corrected the typo in K, further clarified the “Area” labels, and changed the “S” label in 4K to “−”. This change does not change the original meaning of the figure: when considering the variance explained in L4/5 IT neurons, considering the subclass compositional profile is equivalent to not using the compositional profiles of cell types, because L4/5 IT neurons all belong to the same subclass (L4/5 IT subclass). Although operationally we simply considered subclass-level compositional profiles when calculating the variance explained, we think that changing this to “−” is clearer for the readers.

      • In Fig. 5, panel E is uninterpretable.

      We revised the main text and the figure to clarify how we manually proofread cells to determine the QC thresholds for barcoded cells. These plots showed a summary of the proofreading. We also revised the figures to indicate that they showed the fraction of barcoded cells that were considered real after proofreading. In the revised version, we moved these plots to Fig. S5.

      • In Fig. S1, I do not understand the identity of the six samples on the X-axis of panel A (given that only two animals were described in the main text) and what panel B shows, including the definition of map_cluster_conf and map_cluster_corr.

      In the revised Fig. S1, we made it more explicit that the six animals include both animals used for retrograde tracing (2 animals) and those used for trans-synaptic tracing (4 animals). We updated the y axis labels to be more readable and cited the relevant Methods section for definitions.

      • In Fig. S2, please provide the definitions of blue and red dots and values in panel A, as well as the color codes and size of the circles in panel B. My overall impression from panel B is that there is no significant difference between RV-infected and non-infected cells. The authors should provide more quantitative and statistical support for the claim that "RV-infected cells had higher expression of immune response-related genes."

      We toned down the statement to “Consistent with previous studies […], some immune response related genes were up-regulated in virus-infected cells compared to non-infected cells.” Because the main point of the single-cell RNAseq analysis was that rabies did not affect the ability to distinguish transcriptomic types, the change in immune response-related genes was not essential to the main conclusions. We clarified the red and blue dots in panel A and changed panel B to show the top up-regulated immune response-related genes in the revised manuscript.

      • In Fig. S3, the definitions of the color code and circle size are missing.

      We have added the legends in Fig. S3.

    1. Author Response

      We thank the reviewers for their detailed and constructive criticisms of our work. They raise many important questions (such as the issue of defining context) that we have also been thinking about extensively, and they provide new and insightful avenues that have the potential to meaningfully improve the manuscript. We also appreciate that they commented on the novelty and importance of this work. Going forward, we will address the methodological concerns raised as best we can and thereby hope to make the evidence for our conclusion more compelling.

    1. Author Response

      eLife assessment

      This study provides direct evidence showing that Kv1.8 channels underlie several potassium currents in the two types of sensory hair cells found in the mouse vestibular system. This is an important finding because the nature of the channels underpinning the unusual potassium conductance gK,L in type I hair cells has been under scrutiny for many years. Although most of the experimental evidence is compelling and the analysis is rigorous, the evidence supporting some of the claims related to Kv1.4 channels is incomplete. The study will be of interest to cell and molecular biologists and auditory neuroscientists.

      We are thankful to the editor and reviewers for their thorough assessment of our work and insightful feedback. Our responses to the comments and suggestions are below.

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors provide a thorough demonstration of the role that one particular type of voltage-gated potassium channel, Kv1.8, plays in a low voltage-activated conductance found in type I vestibular hair cells. Along the way, they find that this same channel protein appears to function in type II vestibular hair cells as well, contributing to other macroscopic conductances. Overall, Kv1.8 may provide especially low input resistance and short time constants to facilitate encoding of more rapid head movements in animals that have necks. Combination with other channel proteins, in different ratios, may contribute to the diversified excitability of vestibular hair cells.

      Strengths:

      The experiments are comprehensive and clearly described, both in the text and in the figures. Statistical analyses are provided throughout.

      Weaknesses:

      None.

      Reviewer #2 (Public Review):

      The focus of this manuscript was to investigate whether Kv1.8 channels, which have previously been suggested to be expressed in type I hair cells of the mammalian vestibular system, are responsible for the potassium conductance gK,L. This is an important study because gK,L is known to be crucial for the function of type I hair cells, but the channel identity has been a matter of debate for the past 20 years. The authors have addressed this research topic by primarily investigating the electrophysiological properties of the vestibular hair cells from Kv1.8 knockout mice. Interestingly, gK,L was completely abolished in Kv1.8-deficient mice, in agreement with the hypothesis put forward by the authors based on the literature. The surprising observation was that in the absence of Kv1.8 potassium channels, the outward potassium current in type II hair cells was also largely reduced. Type II hair cells express the largely inactivating potassium conductance gK,A, but not gK,L. The authors concluded that heteromultimerization of non-inactivating Kv1.8 and the inactivating Kv1.4 subunits could be responsible for the inactivating gK,A. Overall, the manuscript is very well written and most of the conclusions are supported by the experimental work. The figures are well described, and the statistical analysis is robust.

      My only comment relates to the statement regarding the results providing "evidence" that Kv1.4 form heteromultimers with Kv1.8 channels (see Discussion). The only data I can see from the results is that Kv1.4 channels are expressed in the membrane of type II hair cells, which is not sufficient evidence for the above claim. Is the distribution of Kv1.8 and Kv1.4 overlapping in type II hair cells? Have the authors attempted to perform some pharmacological studies on Kv1.4? For example, would gK,A be completely blocked by a Kv1.4 antagonist? Addressing at least some of these questions would strengthen your argument.

      Author response: With respect to the “evidence” for heteromultimerization of Kv1.4 and Kv1.8: We agree that there is not conclusive evidence but have pulled together reasons to suggest that the fast inactivation of Kv1.8-dependent gA in type II hair cells reflects a contribution from Kv1.4 subunits. The reasons we note are mostly from other sources: 1) Kv1.4 subunits are the only Kv1 alpha subunits known to make channels with intrinsic rapid inactivation (Bertoli et al., 1994); 2) Kv1.4 is highly expressed in type II hair cells, but not type I hair cells, in mouse utricle (McInturff et al., Biol. Open., 2018; Jan et al., Cell Reports, 2021; Orvis et al., Nat. Methods, 2021); 3) previous work from M. Correia and colleagues suggested Kv1.4 as the likely source of A-current in pigeon vestibular hair cells; 4) some rat type II hair cells show comparatively strong Kv1.4-like immunoreactivity (our Fig. 5). While we consider heteromultimerization of Kv1.4 and Kv1.8 alpha subunits a plausible explanation consistent with available data from different sources, we agree that the question is not at all settled, and indeed raise the possibility that KV beta subunits, which are also differentially expressed by type I and II hair cells, play a role. Experiments to definitively advance or refute this hypothesis are beyond the scope of this paper.

      Reviewer #3 (Public Review):

      Summary:

      This paper by Martin et al. describes the contribution of a Kv channel subunit (Kv1.8, KCNA10) to voltage-dependent K+ conductances and membrane properties of type I and type II hair cells of the mouse utricle. Previous work has documented striking differences in K+ conductances between vestibular hair cell types. In particular, amniote type I hair cells are known to express a non-typical low-voltage-activated K+ conductance (GK,L) whose molecular identity has been elusive. K+ conductances in hair cells from 3 different mouse genotypes (wildtype, Kv1.8 homozygous knockouts, and heterozygotes) are examined here and whole-cell patch-clamp recordings indicate a prominent role for Kv1.8 subunits in generating GK,L. Results also interestingly support a role for Kv1.8 subunits in type II hair cell K+ conductances; inactivating conductances in null mice are reduced in type II hair cells from striola and extrastriola regions of the utricle. Kv1.8 is therefore proposed to contribute as a pore-forming subunit for 3 different K+ conductances in vestibular hair cells. The impact of these conductances on membrane responses to current steps is studied in the current clamp. Pharmacological experiments use XE991 to block some residual Kv7-mediated current in both hair cell types, but no other pharmacological blockers are used. In addition, immunostaining data are presented and raise some questions about Kv7 and Kv1.8 channel localization. Overall, the data present compelling evidence that the removal of Kv1.8 produces profound changes in hair cell membrane conductances and sensory capabilities. These changes at hair cell level suggest vestibular function would be compromised and further assessment in terms of balance behavior in the different mice would be interesting.

      Strengths:

      This study provides strong evidence that Kv1.8 subunits are major contributors to the unusual K+ conductance in type I hair cells of the utricle. It also indicates that Kv1.8 subunits are important for type II hair cell K+ conductances because Kv1.8-/- mice lacked an inactivating A conductance and had reduced delayed rectifier conductance compared to controls. A comprehensive and careful analysis of biophysical profiles is presented of expressed K+ conductances in 3 different mouse genotypes. Voltage-dependent K+ currents are rigorously characterized at a range of different ages and their impact on membrane voltage responses to current input is studied. Some pharmacological experiments are performed in addition to immunostaining to bolster the conclusions from the biophysical studies. The paper has a significant impact in showing the role of Kv1.8 in determining utricular hair cell electrophysiological phenotypes.

      Weaknesses:

      1. From previous work it is known that GK,L in type I hair cells has unusual ion permeation and pharmacological properties that differ greatly from type II hair cell conductances. Notably GK,L is highly permeable to Cs+ as well as K+ ions and is slightly permeable to Na+. It is blocked by 4-aminopyridine and divalent cations (Ba2+, Ca2+, Ni2+), enhanced by external K+, and modulated by cyclic GMP. The question arises, if Kv1.8 is a major player and pore-forming subunit in type I and type II cells (and cochlear inner hair cells as shown by Dierich et al. 2020) how are subunits modified to produce channels with very different properties? A role for Kv1.4 channels (gA) is proposed in type II hair cells based on previous findings in bird hair cells and immunostaining for Kv1.4 channels in rat utricle presented here in Fig. 6. However, hair cell-specific partner interactions with Kv1.8 that result in GK,L in type I hair cells and Cs+ impermeable, inactivating currents in type II hair cells remain for the most part unexplored.

      Author response: Our results raise the question of how Kv1.8/Kcna10 is regulated to produce gK,L in type I hair cells, which has different properties from the Kv1.8 conductance expressed heterologously (Lang et al., Am. J. Physiol. Renal Physiol., 2000; Ranjan et al., Front. Cell. Neurosci., 2019; Dierich et al., Cell Reports, 2020) and the Kv1.8 conductance inferred in inner hair cells (Dierich et al., 2020). We lay out several possibilities in the Discussion, but testing these suggestions is beyond the scope of the present paper.

      The relatively high Cs+ permeability of gK,L (0.31 pCs/pK, Rüsch & Eatock, J. Neurophysiol., 1996; Rennie & Correia, J. Membr. Biol., 2000) suggests there is something different about the selectivity filter and pore region of gK,L relative to most Kv1 family members, although the intrinsic Cs+ permeability of heterologously expressed Kv1.8 has not been reported. While we note that the pore region in Kv1.8 differs from other Kv1 subunits by a single amino acid (a glycine instead of alanine at position 411 – placed by AlphaFold in the pore helix of hKCNA10, Jumper et al., Nature, 2021), the effect of this difference is not known. A separate study is needed to determine why gK,L has a high Cs+ permeability relative to other Kv channels.

      For type II hair cells, the Cs+ permeability of Kv currents has not been fully characterized. Internal Cs+ does appear to reduce outward current more effectively in type II hair cells (Lang & Correia, J. Neurophysiol., 1989; Sokolowski et al., Dev. Biol., 1993) than in type I hair cells (Rüsch & Eatock, J. Neurophysiol., 1996; Rennie & Correia, J. Membr. Biol., 2000).

      With respect to cochlear inner hair cells, note that the assignment of Kv1.8 by Dierich et al. (2020) to a delayed rectifier in cochlear inner hair cells (IHCs) was based on inference: existing inner ear expression databases show that Kv1.8 is expressed in IHCs, and heterologous Kv1.8 channels have a current resembling that observed in IHCs after block of multiple other K channels. We agree with Dierich et al. that Kv1.8 is an attractive candidate for the residual conductance in cochlear IHCs based on comparison with its properties in heterologous expression data. Together, their study and ours suggest that Kv1.8 takes on quite different voltage dependence depending on the hair cell environment, and it will be an interesting challenge to sort out the reasons.

      1. Data from patch-clamp and immunocytochemistry experiments are not in close alignment. XE991 (Kv7 channel blocker) decreases remaining K+ conductance in type I and type II hair cells from null mice supporting the presence of Kv7 channels in hair cells (Fig. 7). Also, Holt et al. (2007) previously showed inhibition of GK,L in type I hair cells (but not delayed rectifier conductance in type II hair cells) using a dominant negative construct of Kv7.4 channels. However, immunolabelling indicates Kv7.4 channels on the inner face of calyx terminals adjacent to hair cells (Fig. 5). Some reconciliation of these findings is needed.

      Author response: Our pharmacology with XE991 suggests a small but significant population of Kv7 channels in type I and II hair cells (Fig 7). With the immunogold technique, Kharkovets et al. (PNAS, 2000) and Hurley et al. (J. Neurosci., 2006) counted significant Kv7.4 particles in type I hair cells, although the particles occurred at much greater density in the postsynaptic calyx membrane facing the hair cell. These results lead us to propose that the Kv7 channel we identified pharmacologically includes the Kv7.4 subunit, possibly in combination with other Kv7 subunits (Lysakowski et al., J. Neurosci., 2011). By this argument, the absence of clear hair cell staining in the confocal images of Fig. 5A is likely to reflect differences in methods, which include the use of different mouse strains, different sensitivities of immunogold vs. confocal imaging, and different antibodies.

      Holt et al. (J. Neurosci., 2007) indeed saw inhibition of gK,L in hair cells grown in organotypic cultures of the neonatal mouse utricle after viral expression of a dominant negative Kv7.4 construct. However, other studies show that Kv7 antagonists do not block gK,L (Hurley et al., J. Neurosci., 2006), and the Jentsch group, which first proposed Kv7.4 as a likely candidate for gK,L (Kharkovets et al., PNAS, 2000), ultimately showed that knocking out Kv7.4 and Kv7.5 expression failed to eliminate gK,L (Spitzmaul et al., J. Biol. Chem., 2013). Together, these results suggest that in Holt et al. (2007), the inhibition of gK,L by transfection with the dominant negative KCNQ4 construct may have occurred through unintended interactions with native gK,L channels. The young age of the neonatal cultured and transfected utricles raises the possibility of a developmental effect – that functional Kv7 channels are needed for the developmental transition to a Kv1.8 conductance. Consistent with this idea is the observation that Kv7 current is present in neonatal hair cells, where it is a relatively large proportion of Kv current in type I HCs before they acquire gK,L (Hurley et al., J. Neurosci., 2006). Alternatively, the overexpression of nonfunctional Kv7.4 channels in virally-transfected hair cells may have inhibited or delayed gK,L acquisition through a more general effect on membrane proteins.

      1. Strong immunosignal appears in the cuticle plates of hair cells in addition to signal in basal regions of hair cells and supporting cells. Please provide a possible explanation for this.

      Author response: There is significant non-specific staining of apical cell surfaces and supporting cell membranes in addition to specific staining of hair cell basolateral membranes. We infer non-specific staining when immunolabeling is present in the knockout tissue, as it is for the apical surfaces and supporting cell membranes—compare Fig. 5B.3 (control tissue) with Fig. 5B.4 (Kv1.8 null mutant). Non-specific immunostaining can occur with polyclonal antibodies (specific to several epitopes) if the antibodies are not affinity-purified, but we used an affinity-purified antibody. The apical surfaces are reputed to be “sticky” (susceptible to non-specific staining) but the non-specific labeling in the basal parts of supporting cells is more puzzling. One possibility is that the Kv1.8 antibody weakly recognized closely related Kv1.1 channels, which are more strongly expressed in supporting cells than hair cells (Scheffer et al., J. Neurosci., 2015).

      1. A previous paper reported that a vestibular evoked potential was abnormal in Kv1.8-/- mice (Lee et al. 2013) as briefly mentioned (lines 94-95). It would be very interesting to know if any vestibular-associated behaviors and/or hearing loss were observed in the mice populations. If responses are compromised at the sensory hair cell level across different zones, degradation of balance function would be anticipated and should be elucidated.

      Author response: We agree; some of these questions are the subject of another paper in preparation.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Thank you for overseeing the assessment of our manuscript, “Comprehensive mutagenesis maps the effect of all single codon mutations in the AAV2 rep gene on AAV production". We would also like to thank the reviewers for their feedback. We have carried out the suggested experiments that we feel are most central to our conclusions and summarized the revisions to the manuscript below.

      We appreciate the reviewers’ suggestion with regard to testing different rAAV genomes. We have measured the effect of Rep variants on the production of rAAV containing three additional genomes: a 4.4 kb single-stranded genome, a 3.9 kb single-stranded genome, and a 2.1 kb self-complementary genome (Figures 5C and 5D). The DNase-resistant particle titers, reported as a percent of wild-type Rep titers, are relatively consistent across these three constructs as well as the 5.0 kb single-stranded genome previously tested.

      We agree with the reviewers that measurement of the relative transduction efficiency of rAAV produced with different Rep variants is an important experiment to conduct. To address this, we transduced HEK293T cells with rAAVs, containing a luciferase genome, which were produced using two different Rep variants. When a constant volume of purified rAAV was used for transduction, we observed that the rAAV produced with the S110R Rep variant resulted in higher transduction than rAAV produced with wild-type Rep (as measured by luciferase signal). While we tested only a small number of variants, these results indicate that at least one of the Rep variants we identified can increase not only the viral genome titer but also the titer of transducing particles.

      To generate these transduction data, we produced additional rAAV preps using the S110R and Q439T Rep variants. In the previous version of this manuscript, we used the Q439T variant to produce rAAV and noted a 10% increase in the ratio of viral genomes to capsids, as determined by comparison of qPCR and capsid ELISA titers. However, a similar increase was not observed in the more recent experiment discussed above. We attribute this discrepancy to changes in the plasmid quantification methods used for transfection. Previously, we quantified plasmids using a fluorometric assay (Qubit); in our more recent experiments, we used qPCR to quantify plasmids for transfection. qPCR provides a more accurate measurement of plasmid concentration due to the specific nature of the primers and probes used, which may account for the subtle shift in quantification. While outside the scope of the current work, it will also be interesting to further investigate the proportion of full capsids using additional Rep variants and more direct methods, such as cryoEM or analytical ultracentrifugation.

      We agree with the reviewers’ observation that there are differences in the production fitness values for synonymous variants. However, the variation in production fitness values between synonymous variants is smaller than that between non-synonymous variants. We conducted the following analysis to clarify this point. We calculated two mean centered fitness values for each codon variant in the WT AAV2 library. The “positional mean centered fitness value” was determined using the production fitness values of all variants at a given amino acid position and describes how far a given fitness value diverges from the mean fitness value for that position. The “synonymous codon mean centered fitness value” was determined using the production fitness values of all synonymous variants at a given position and describes how far a given fitness value diverges from the mean fitness value for all its synonymous codon variants. We then plotted both mean centered fitness values versus amino acid position (Figure S8).
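      The two mean-centered values defined above can be illustrated with a short sketch. It is not the authors' analysis code: the record layout `(position, amino_acid, codon, fitness)`, the function name `mean_centered`, and the fitness numbers are invented for illustration, and it assumes that "synonymous variants at a given position" means codons encoding the same amino acid at that position.

```python
from collections import defaultdict
from statistics import mean


def mean_centered(variants):
    """Compute both mean-centered fitness values for each codon variant.

    `variants` is a list of (position, amino_acid, codon, fitness) tuples
    (a hypothetical layout for this sketch).
    - "positional": fitness minus the mean over all variants at that position.
    - "synonymous": fitness minus the mean over variants at that position
      that encode the same amino acid.
    """
    by_position = defaultdict(list)
    by_synonymous = defaultdict(list)
    for pos, aa, _codon, fitness in variants:
        by_position[pos].append(fitness)
        by_synonymous[(pos, aa)].append(fitness)

    rows = []
    for pos, aa, codon, fitness in variants:
        rows.append({
            "position": pos,
            "amino_acid": aa,
            "codon": codon,
            "positional": fitness - mean(by_position[pos]),
            "synonymous": fitness - mean(by_synonymous[(pos, aa)]),
        })
    return rows


# Invented example: two serine codons and two arginine codons at one position.
variants = [
    (110, "S", "TCT", 0.0),
    (110, "S", "TCC", 0.2),
    (110, "R", "CGT", 1.0),
    (110, "R", "CGC", 1.2),
]
rows = mean_centered(variants)

# Largest deviation under each centering scheme, for comparing spreads.
max_positional = max(abs(r["positional"]) for r in rows)
max_synonymous = max(abs(r["synonymous"]) for r in rows)
```

      In this toy case the synonymous-level deviations are much smaller than the position-level deviations, which is the kind of pattern the comparison of the two distributions is meant to reveal.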

      The distribution of mean centered selection values is narrower when calculated at the synonymous codon level as opposed to the position level. This indicates that, in general, synonymous variants have more tightly distributed production fitness values than non-synonymous variants. This observation precludes us from conducting a more thorough analysis of the effects of synonymous codons on AAV production, although there is at least one instance where clear differences between synonymous codons can be observed (Figure S9C and Figure S9D). We agree with the reviewers that synonymous variants almost certainly influence aspects of AAV production, such as genome replication, transcriptional regulation, mRNA stability, and protein expression. However, our assay measures the aggregate effect of rep variants on all steps in the AAV production process and is likely unable to detect the effects of synonymous variants on specific steps in this process if those steps are not rate-limiting. We have updated the discussion section to include an explanation of the above.

      The X-axes in Figures 5B and 5D have been updated to plot s’ instead of percent WT titer. We have also added asterisks to indicate significance in Figures 5A and 5C. Thank you for these suggestions.

      We agree with Reviewer 3 that it would be interesting to sequence barcodes from the mRNA pool. The 20 bp barcodes are located upstream of the polyA site and should be present in mRNA transcripts. Something to consider is that AAV2 transcripts expressed from all three promoters (p5, p19, and p40) are polyadenylated at the same site (Stutika et al., 2016). As such, in our WT AAV2 library, barcode representation in the mRNA pool would indicate the aggregate effect of a rep variant on the levels of all AAV2 transcripts. In the pCMV-Rep78/68 library, only two AAV2 transcripts are generated - a spliced and unspliced version of the p5 product. Sequencing of barcodes present in the mRNA pool could be informative regarding the effect of rep variants on combined Rep78/68 expression levels. However, we feel that this experiment is outside the scope of the current work.

      We were also surprised at the number of novel functional Rep variants that were identified in our library. As the reviewer pointed out, optimal rAAV production likely does not equate to optimal fitness of naturally occurring AAV in the endogenous host. Naturally occurring AAV has both a latent and a lytic cycle, and the Rep proteins play a role in both of these processes (Pereira et al., 1997; Surosky et al., 1997). rAAV production, however, is primarily analogous to the lytic cycle of naturally occurring AAV. In its endogenous hosts, AAV must balance the effect of any mutations on fitness in both the lytic and latent contexts, whereas we assay specifically for production fitness. We additionally attribute this finding to the relatively small number of AAV serotypes for which rep sequences are available. We have added a discussion of the above to the manuscript.

      Finally, in response to feedback from other researchers, we determined which amino acid substitutions resulted in production fitness values that were significantly different from that of wild-type (Figure S4). These results further emphasized the importance of the origin-binding domain; most statistically significant beneficial substitutions clustered here. Additionally, we noted that the majority of substitutions in the zinc-finger domain resulted in production fitness changes that were not significant. This lines up with previous work indicating that the zinc-finger domain is dispensable for rAAV production. We have added a discussion of these results to the main text.

      We again thank the reviewers for their suggestions; we feel that incorporation of their suggestions has strengthened support for our conclusions and enhanced the utility of this work for others in the field.

      References

      Pereira, D. J., McCarty, D. M., & Muzyczka, N. (1997). The adeno-associated virus (AAV) Rep protein acts as both a repressor and an activator to regulate AAV transcription during a productive infection. Journal of Virology, 71(2), 1079–1088. https://doi.org/10.1128/jvi.71.2.1079-1088.1997

      Stutika, C., Gogol-Döring, A., Botschen, L., Mietzsch, M., Weger, S., Feldkamp, M., Chen, W., & Heilbronn, R. (2016). A Comprehensive RNA Sequencing Analysis of the Adeno-Associated Virus (AAV) Type 2 Transcriptome Reveals Novel AAV Transcripts, Splice Variants, and Derived Proteins. Journal of Virology, 90(3), 1278–1289. https://doi.org/10.1128/JVI.02750-15

      Surosky, R. T., Urabe, M., Godwin, S. G., McQuiston, S. A., Kurtzman, G. J., Ozawa, K., & Natsoulis, G. (1997). Adeno-associated virus Rep proteins target DNA sequences to a unique locus in the human genome. Journal of Virology, 71(10), 7951–7959. https://doi.org/10.1128/jvi.71.10.7951-7959.1997

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors were trying to investigate whether viral IBs are involved in antagonizing IFN-I production during EBOV trVLPs infection. They found that IRF3 is hijacked and sequestered into EBOV IBs after viral infection, thereby leading to the spatial isolation of IRF3 from TBK1 and IKKε. In this process, the activity of IRF3 is suppressed and downstream IFN-I induction is inhibited. The authors designed many experiments, such as the PLA that examined the colocalization, to support their conclusions. However, necessary negative controls were missing in several assays, and more key indicators need to be examined in several assays.

      The paper is well organized and most data in this paper could support the conclusions, while there are several issues that need to be further solved.

      1. In Figure 2-4, authors should examine the expression of downstream IFNs as well as the phosphorylation and nuclear localization of IRF3 to further prove the suppression of IRF3 activity by infecting with trVLPs.

      Response: The inhibitory effect of trVLPs infection on the phosphorylation of IRF3 S396 and SeV-induced IRF3 nuclear localization was determined by immunoprecipitation (Figure 3D) and immunofluorescence (Figure 4A and 4B), respectively. In addition, we demonstrated that IFN-β transcription was inhibited more potently by EBOV viral inclusion bodies compared with VP35 alone (Figure 7B and 7C).

      Moreover, EBOV viral inclusion bodies were demonstrated to inhibit the transcription of IFN downstream genes (e.g., CXCL10, ISG15 and ISG56) more potently than VP35 alone (new Figure 7D-F).

      1. In Figure 5, to better prove the conclusion that EBOV NP and VP35 play an important role in sequestering IRF3 in IBs, authors should add the "NP+VP35+VP30" and "NP+VP35+VP24" groups and repeat the assay.

      Response: According to the reviewer’s suggestion, VP24 or VP30 was added to the “VP35+NP” group, and the results showed that the “NP+VP35+VP24” and “NP+VP35+VP30” groups exhibited little, if any, effect on the distribution of IRF3 compared with the “NP+VP35” group (new Figure 5 - figure supplement 2A-B).

      1. In Figure 6f, the expression of STING should be examined by immunostaining to show the knockdown efficiency in trVLPs-infected cells.

      Response: As suggested by the reviewer, immunostaining was performed to visually detect the effect of STING knockdown on the IRF3 distribution during trVLPs infection (new Figure 6F).

      Reviewer #2 (Public Review):

      The manuscript by Zhu et al explored molecular mechanisms by which Ebola virus (EBOV) evades host innate immune response. EBOV has a number of means to shut down the type I interferon induction (by viral VP35 protein) and block type I interferon action (by viral VP24 protein). This study reported a new mechanism that inclusion body (IB) used for viral replication sequesters IRF3, a key transcription factor involved in the interferon signaling, resulting in blockade of downstream type I interferon gene transcription. This finding is potentially interesting and may provide a new insight into EBOV's evasion of innate immunity. However, there are some flaws in the experimentations and analyses that need to be addressed.

      1. Most of the experiments were performed by transfection of trVLP plasmids, which is very different from virus infection. The conclusions should be examined and verified in the context of virus infection.

      Response: As suggested by the reviewer, the effects of IRF3 depletion on live Ebola virus replication were examined as described in the revised manuscript. Consistent with the results obtained after trVLPs infection, IRF3 depletion exerted little, if any, effect on viral replication (new Figure 7H), which supports the notion that, upon EBOV infection and the formation of inclusion bodies, IRF3 has little, if any, transcription activation activity after sequestration by inclusion bodies.

      1. Fig 1 - VP35 displayed a classical IB staining only in Panel A, while much less so in Panel C and not in Panel B. It seemed that the VP35 staining images were chosen in a way that favors the authors' conclusions. The statistical analysis of co-localization of VP35 and IRF3, TBK1 or IKKe should be performed to draw the conclusion. Another concern is that IKKe is normally expressed at low levels under resting conditions and becomes induced only when the interferon signaling is activated. It seemed to be expressed at a high level even when the interferon signaling is blocked in Panel C. The authors should comment on this discrepancy.

      Response: Ebola virus inclusion bodies show variations in both shape and size. According to the reviewer’s suggestion, the colocalization of TBK1 or IKKε and VP35 is shown in new figures (new Figure 1C and 1E), and quantitatively analyzed by the fluorescence intensity using ImageJ software (new Figure 1B, 1D and 1F).

      1. Fig 2 - Was this experiment done by transfection or infection? The description of result is not consistent with the figure legend. The labeling was also not consistent between panel A and B. I would suggest performing Western blot to analyze the expression level of IRF3.

      Response: We apologize for the incorrect description of the data. Ebola virus trVLPs were initially produced based on transfection but also involved the viral infection process. The use of “transfection” in the figure and figure legends has been changed to “infection” in the revised manuscript. As suggested by the reviewer, Western blotting was performed to analyze the IRF3 expression levels at different time points after trVLPs infection (new Figure 2D).

      1. Fig 3 and 4 - As VP35 is well known for its highly efficient blockade of type I interferon activation, how would the authors differentiate the effect of VP35 alone from the sequestration of IRF3 in IBs in these experiments?

      Response: Previous studies have found that VP35, rather than NP, inhibits the expression of interferon, and the “VP35+NP” treatment, which induces IRF3 sequestration, inhibited IFN-β luciferase activity much more potently than VP35 expression alone (Figure 7B).

      1. Fig 3 - PolyIC can activate both RLR and TLR signaling pathways. Can the author comment on which pathway it activates in this experiment?

      Response: In this study, the effect of poly(I:C) was consistent with the results observed with SeV, which indicated that poly(I:C) may mainly activate the RLR signaling pathway. A discussion was added to the revised manuscript.

      1. The authors demonstrated that VP35 interacts with STING and recruit the latter to IBs. How would this affect the function of STING given that STING plays essential roles in cGAS/cGAMP pathway?

      Response: This study unexpectedly showed that VP35 can recruit IRF3 into viral inclusion bodies through STING, but whether it regulates the cGAS-STING pathway remains to be further investigated. Related discussion was added to the revised manuscript.

      1. It is difficult to follow the logic of Fig 7. The expression level of each viral protein should be determined. Ideally, a mutation in VP35 that disrupts its ability to antagonize the interferon signaling but still allows for the IB formation can be used to assess the relative contribution of IB sequestering IRF3.

      Response: As suggested by the reviewer, a series of VP35 mutants were constructed, but we failed to obtain a VP35 mutant that contains a mutation that disrupts the ability of the protein to antagonize interferon signaling but still allows IB formation. Instead, coexpression of “NP+VP35+VP30+L”, which induces IBs formation, inhibited IFN-I more potently than the expression of VP35 alone (Figure 7B). IRF3 knockout inhibited poly(I:C)-induced IFN-I production but had little, if any, effect on poly(I:C)-induced IFN-I production in the “NP+VP35+VP30+L” group (Figure 7C). IRF3 knockout in the cells did not significantly affect viral replication, but overexpression of activated IRF3 (IRF3/5D), instead of wild-type IRF3, inhibited viral replication (new Figure 7G-H). These results collectively suggested that almost all IRF3 in cells was hijacked and sequestered into IBs in the Ebola virus-infected cells.

    1. Author Response

      The following is the authors’ response to the original reviews.

      RESPONSE TO REVIEWERS:

      Reviewer #1 (Recommendations For The Authors):

      I think the manuscript of this excellent work can be improved, especially in writing (including a suggestion in the title) and presentation (Figure 6). Also, some additional specific experiments and analyses could be important, as I suggest below.

      1. For the title, perhaps a shorter "The acetylase activity of Cdu1 protects Chlamydia effectors from degradation" would be better to convey the major significance of this work. Of course, Cdu1 must regulate the function of InaC, IpaM and CTL0480. But perhaps it is speculative to think that egress is the major function of these effectors as their activity on other host cell processes during the cycle could eventually impact the extrusion process indirectly.

      Although we concur with the insights provided by reviewer 1, we wish to underscore that a significant breakthrough presented in our study revolves around the regulation of Chlamydia exit by Cdu1. Consequently, we believe that this noteworthy discovery should be incorporated into the title.

      1. For the writing:

      a. The description of ubiquitination and DUBs could be synthesized to the essential, so that space is gained to explain things that then come a bit out of the blue in the results (what are Incs, the specific functions of InaC, IpaM, and CTL0480 - at least place the citations in lines 110-112 next to the corresponding Incs -, Cdu2, etc - see specifics below)

      In lines 182-196 of the revised manuscript, we have incorporated additional contextual information concerning the roles of Incs, along with descriptions of the functions of InaC, IpaM, and CTL0480.

      b. In the Results, there is a lot of Chlamydia- and maybe lab-specific jargon that could be significantly simplified for the more general reader. I detail some suggestions below in the specific issues.

      We have improved the readability of our manuscript for a general audience by removing Chlamydia-specific terminology from the entire text and figures.

      1. For the figures:

      a. Figure 6, this figure could be reorganized: why two graphs in panel D? If detailed quantifications were done, perhaps in panel B just zoom on the examples of Golgi distributed/compacted? And again the labelling Rif-R L2, L2 pBOMB, M407 p2TK2, etc, simplify?

      Figure 6 has undergone restructuring. The representative images have been relocated to Supplemental Figures 5 and 6, while we have introduced sample images demonstrating F-actin assembly and Golgi repositioning. Furthermore, the quantification of Golgi dispersal has been streamlined into a single panel. Additionally, we have simplified the labeling of the strains utilized in the study.

      b. Figure 3, in the labelling, WT, inaC null, cdu1::GII wouldn't be enough? Leave the details to the legend and/or M&M.

      We have simplified the labeling of Ct strains in Figure 3.

      c. Figure 3C, these arrowheads should not be so symmetric (small arrows instead?) and it is unclear that the indicated cells do not show CTL0480.

      We have substituted arrowheads with small arrow symbols and have also revised the figure to incorporate a new representative image that prominently illustrates the absence of CTL0480 at the inclusion membrane of some cdu1::GII inclusions within infected HeLa cells at 36 hpi.

      1. Experiments:

      a. In Figure 7, at least extrusion should be analysed also with the Cdu1-deficient strain expressing Ac-deficient Cdu1 and the inaC and ipaM phenotypes should be complemented.

      We have conducted additional experiments to analyze extrusion production in HeLa cells infected with a cdu1 null strain expressing the acetylase-deficient Cdu1 variant. We have incorporated the relevant data into revised Figure 7, where the impact of this strain on extrusion production and size is presented. Additionally, we updated Supplemental Figure 8 to include data illustrating the number of inclusions produced by this strain. We have also addressed these new results in the revised manuscript (lines 424-432). We are currently complementing inaC and ipaM mutant strains with various InaC and IpaM constructs that will be used in a follow-up manuscript.

      b. Does overexpression of InaC, IpaM, or CTL0480 in a cdu1-null background prevent the degradation of these Incs and suppress the defects of cells infected by the cdu1 mutant (F-actin, Golgi, MYPT1)? This would show that the multiple phenotypes displayed by cells infected by the cdu1 null mutant are indeed related to the decreased levels of InaC, IpaM and CTL0480.

      We opted not to include data from the overexpression of these effectors in a cdu1-null background due to an unexpected decrease in shuttle plasmid load during overexpression. This development prompted concerns regarding the potential detrimental effects of overexpressing these effectors in the absence of Cdu1. Data supporting this observation are not included in this report.

      c. Figures 3A and 3B should be quantified (it says it is from 3 independent experiments). It would be important to have a relative perspective of how much Cdu1 protects these Incs over time (for InaC, it would also be nice to have the 36 and 48 hpi time-point). This is in contrast with the microscopy data in Figure 5, which illustrates very clear effects, and the quantification is a bit redundant.

      In Figure 3, we have incorporated a new Western Blot image showing endogenous InaC protein levels in HeLa cells following infection with both WT Ct and cdu1::GII strains at 24, 36, and 48 hours post-infection (hpi). Additionally, we have quantified the Western Blot signals for both InaC and IpaM, and these results are also presented in Figure 3. The quantification of MYPT1 recruitment has been relocated to a supplementary figure. We have also included details regarding the methodology employed for the quantification of Western Blot signals in the Materials and Methods section.

      d. What is the subcellular localization of InaC, IpaM, CTL0480 and Cdu1 when analysed by transfection? Does Cdu1 bind to InaC, IpaM, and CTL0480 in infected cells? If this was attempted and unsuccessful it should be mentioned.

      In transfected HEK cells, InaC, IpaM, CTL0480, and Cdu1 all exhibit cytoplasmic localization with a diffuse pattern (data not shown). Despite our efforts, we encountered challenges in observing co-immunoprecipitation of Cdu1 with all three Incs in infected HeLa cells at 24 hpi. We have duly acknowledged this limitation in our findings, as reflected in lines 221-226 of the revised manuscript.

      1. Specific issues:

      2. Line 87, "propagule" is really needed to describe the EB?

      The EB is the infectious form of Chlamydia species that spreads within the host to renew its life cycle; thus, "propagule" is a suitable term to characterize the EB.

      • Exocytosis implies fusion with the plasma membrane so "inclusion is exocytosed" (line 91) is not entirely correct.

      In line 91 of the revised manuscript, we referred to extrusion as the exit of an intact inclusion from the host cell and omitted the use of "exocytosed" to describe this process.

      • Line 126, "a Ct L2 (LGV L2 434 Bu) background". Maybe "a Ct cdu1-null strain" would be enough and leave the detail for Materials and Methods.

      In line 128 of the revised manuscript, we omitted "(LGV L2 434 Bu)" to avoid using jargon that may be unfamiliar to readers not well-versed in Chlamydia terminology.

      • Line 138, in the previous Pruneda et al, Nature Microbiol 2018, the title of figure 4 is "ChlaDUB deubiquitinase activity is required for C. trachomatis Golgi fragmentation", so why raise this hypothesis? And why in the end is it the acetylation activity of Cdu1 that promotes Golgi distribution? I think this is related to infection vs transfection experiments, but it deserves to be briefly explained/discussed.

      In lines 140-142 of the revised manuscript, we provide clarification that the DUB activity of Cdu1 is required for Golgi fragmentation in transfected cells. This observation supports our initial hypothesis suggesting that the DUB activity of Cdu1 is also required for Golgi distribution in infected cells, and our rationale for identifying targets of its DUB activity.

      • Lines 147-155, what is the relevance of these non-ubiquitinated proteins that come along? Couldn't this be synthesized?

      We have included a discussion on non-ubiquitinated proteins, as they could potentially encompass proteins that interact with those protected by Cdu1. This perspective provides supplementary insights into the roles of proteins targeted for ubiquitination in the absence of Cdu1. The results of this analysis have been succinctly summarized in a single paragraph within the initial manuscript (lines 151-159 of the revised manuscript).

      • Line 170, I think it is the first time that "Type 3 secretion" is mentioned; perhaps explain in the introduction.

      Type 3 secretion systems have been extensively characterized and discussed in the literature, and we anticipate that the majority of our readers are well-acquainted with this secretory mechanism.

      • Line 184, I think it is the first time "microdomains" are mentioned; perhaps mention in the introduction.

      The definition of "microdomains" has been provided in line 191 of the revised manuscript.

      • Figure 2, as it stands the analysis with truncated Cdu1 proteins adds little to the work. Binding to the Incs seems to be affected when the TM domain is not present, but it still binds. And this is in a transfection context.

      The results depicted in Figure 2, involving truncated Cdu1 proteins, illustrate that Cdu1 is capable of interacting with InaC, IpaM, and CTL0480 even in the absence of infection. This finding serves as evidence suggesting that all three Incs could potentially serve as direct targets for Cdu1 activity. As a result, we prefer to keep these findings in the manuscript.

      • Line 219, "late stages of infection", this is shown (albeit not completely quantified) for IpaM and CTL0480, but not for InaC.

      In the revised Figure 3, we show InaC protein levels at 24, 36, and 48 hours post-infection, and we have incorporated quantitative data for both InaC and IpaM protein levels in the context of HeLa cells infected with both WT L2 and cdu1::GII strains. This updated figure serves to emphasize the pivotal role of Cdu1 in safeguarding all three Incs during the late stages of infection.

      • Line 233, "pBOMB-MCI backbone" - is this needed in the Results section? And this refers to Figure 4 while pBOMB appear already in Fig. 3.

      We have removed “pBOMB-MCI backbone” in the revised manuscript.

      • Line 236, should be cdu1 endogenous promoter.

      In line 265 of the revised manuscript we have replaced Cdu1 with cdu1 (italicized).

      • Line 263, WT.

      In line 293 of the revised manuscript we replaced “wild type” with “WT”.

      • Line 277, IncA instead of "the Inc protein IncA".

      In the manuscript we wanted to emphasize that IncA is also an inclusion membrane protein, therefore we have included “the Inc protein IncA” in the revised manuscript to avoid any confusion.

      • How does the data in Figure 5 relate to the relatively few proteins ubiquitinated in cells infected with cdu1-mutant Ct? Does this Ub-labelling correspond to ubiquitinated InaC, IpaM and CTL0480?

      The findings presented in Figure 5 demonstrate that the acetylase activity of Cdu1 plays a crucial role in enabling Ct to block all ubiquitination events taking place on or in proximity to the periphery of the inclusion membrane. This encompasses Cdu1 targets that might not have been identified through our proteomic analysis.

      • Lines 299-301, "M923 inclusions", there is certainly a clear way to write this.

      In lines 326-327 and 332-332 of the revised manuscript, we have clarified that “M923” is an incA null strain.

      • Line 309, is "peripheries" correct?

      We have replaced “peripheries” with “periphery” in the revised manuscript (line 360).

      • Line 312, "Rif-R L2" and "M407" - can this be simplified?

      In the revised manuscript, "Rif-R L2" was substituted with "WT L2" in lines 363 and 382, while "M407" was exchanged with "an inaC null strain" in lines 311, 367, and 368. These same replacements were applied to the Figures and their corresponding legends for consistency.

      • Lines 308-321, and 326-335, these % are all approximate figures and this should be made clear.

      In lines 364-395 of the revised manuscript we have stated that all percentages are approximate values.

      • Fig. S1, kb and not k.b; what's the "+ control"; and is it not really possible to have a PCR that works for the *? 3 kb is not that long.

      In the updated Figure S1, we have corrected "k.b" to "kb". In the legend of Figure S1, we have clarified that the + control corresponds to the cdu2 locus. Moreover, we could not cleanly amplify a 3 kb PCR product from bacteria in whole cell lysates of infected mammalian cells (Vero cells).

      • Fig. S2, kb and not k.b, bp and not b.p

      In the updated Figure S2, we have corrected “k.b” to “kb” and “b.p” to “bp”.

      Reviewer #2 (Recommendations For The Authors):

      Figure 1 describes an affinity-based purification and mass spectrometric identification of differentially ubiquitinated proteins (host and chlamydial). Through different permutations of combinations of infection (mock, wild type, and Cdu1 mutant), three effectors, IpaM, InaC, and CTL0480, were identified as putative targets of Cdu1. The authors used a high-stringency cutoff, which could explain identification of only three targets. Having said this, the localization of Cdu1 to the inclusion membrane would be expected to also narrow down the number of targets. Interestingly, Cdu2, another deubiquitinase, remained active in these experiments, which could have affected identification of Cdu1 targets. The authors addressed this issue by referring to previously reported structural studies. A somewhat glaring omission is the lack of reference to NF-kB as a substrate of ChlaDub1/Cdu1. In experiments by Le Negrate et al., ChlaDub1 ectopic overexpression in cells led to the deubiquitination of IkB-alpha, thus inhibiting the nuclear translocation of NF-kB. Based on the inclusion membrane localization of Cdu1 during infection, is the identification of IkB an artifact of overexpression of Cdu1, or is it still a bona fide Cdu1 target?

      We conducted experiments using our cdu1 null strain to investigate whether IκBα could be a target of Cdu1 activity. While our findings are intriguing and relevant, it is not feasible to determine, at this stage, whether our findings result from a direct or indirect consequence of Cdu1 localizing to the inclusion membrane. Consequently, these findings extend beyond the scope of the current manuscript. We plan to explore the implications of our observations more deeply in a subsequent manuscript, where we intend to provide a more comprehensive and mechanistic analysis based on these preliminary findings. Additionally, we have referenced the potential targeting of IκBα by Cdu1 in lines 100-101 and 166-171 of the revised manuscript.

      Figure 2 demonstrates the individual interaction of the identified effectors with Cdu1. Interaction at the inclusion membrane is inferred from colocalization studies, while protein-protein interaction is monitored using ectopic overexpression of tagged versions of Cdu1 and the individual effectors. This is somewhat of a weakness of the manuscript because the mechanism of action of Cdu1 towards its target hinges on protein-protein interaction.

      Despite our efforts, we encountered challenges in co-immunoprecipitating endogenous Cdu1 with all three Incs in infected HeLa cells at 24 hpi. There are multiple technical reasons why these interactions, which are predicted to be transient, may not be captured by bulk affinity approaches such as immunoprecipitations, especially when the starting materials are present in very low abundance. We acknowledged these limitations in our findings, as reflected in lines 221-226 of the revised manuscript.

      Figure 3 provides the first evidence in this paper of the importance of the inferred interaction of Cdu1 with the three effectors. The authors show that the loss of cdu1 has stability consequences on the three effectors. This figure would benefit from quantifying InaC- or IpaM-positive inclusions in the same manner done with CTL0480. The timepoint-dependent effect of Cdu1 loss of function is intriguing. Do InaC and IpaM retention at the inclusion show the same timepoint-dependent characteristic?

      In the revised Figure 3, we have incorporated InaC protein levels at 24, 36, and 48 hours post-infection. Additionally, we have included quantitative data representing both InaC and IpaM protein levels in HeLa cells infected with both WT L2 and cdu1::GII strains. The quantification of CTL0480 localization to cdu1::GII inclusions has been moved to a supplementary figure.

      This updated figure illustrates that the absence of Cdu1 has a time-dependent impact on both InaC and IpaM. However, it is noteworthy that the kinetics of degradation for these two proteins diverge significantly.

      For Figure 7, the authors should consider monitoring timing of inclusion extrusion to gain additional insight into the functional interactions between the effectors. For example, the loss of CTL0480 leads to increased extrusion, implying a role in delaying or suppressing extrusion. In a time-course experiment, a CTL0480 mutant could exhibit an earlier occurrence of inclusion extrusion.

      One of the principal discoveries of this study is that Cdu1, InaC, IpaM, and CTL0480 collaborate to facilitate optimal extrusion of Ct from host cells. These findings represent a significant contribution to our understanding of how Chlamydia controls its exit from infected cells. We are currently in the process of expanding on these results. A forthcoming follow-up manuscript will provide more detailed and comprehensive exploration of these findings.

      Reviewer #3 (Recommendations For The Authors):

      Specific comments.

      a. I have some concerns related to the time point chosen for mass spec analysis and potential caveats and alternative interpretations. This work was done relatively early (24 hours) compared to the most convincing Cdu1 functions that occur later, thus this may limit the authors' global understanding of protein changes. For example, the known substrate of Cdu1, Mcl-1, was not identified, but this is altered relatively late during infection. Thus, the surprise that minimal host proteins are altered in ubiquitination may be partially driven by the timing of the assay. This should be more clearly discussed as a caveat.

      In the revised manuscript (lines 166-171), we have acknowledged that there might be additional targets of Cdu1 that remain unidentified, primarily due to the specific time point we utilized in our study.

      b. Another caveat to these studies is that while the loss of Cdu1 alters the stability and function of different effectors and extrusion size, these changes do not modulate bacterial growth in cells. The authors speculate that regulating extrusion size may alter interactions with innate cells to drive dissemination. However, a previous study using a Cdu1 transposon mutant in an animal model found decreased bacterial load in the genital tract. It is also possible that redundancy of effectors may mask the importance of Cdu1 in growth, but the authors strongly argue against redundancy of Cdu1 and Cdu2, so this weakens the authors' argument here. These concepts and published data should be more directly discussed in the context of the authors' proposed extrusion model and the role in driving Chlamydia growth and pathogenesis.

      In our revised manuscript (lines 460-466) we propose that while we do not observe any growth impairments during Ct growth in the absence of Cdu1 in HeLa cells, the reduction in bacterial loads observed in murine models of infection with an independent cdu1 mutant strain (cdu1::Tn) may potentially be linked to defects in extrusion production or alterations in Cdu1-dependent regulation of extrusion size.

      c. Recent studies have found that IFNg activation can result in dramatic changes in ubiquitination to pathogen-containing vacuoles. While some of these are blocked by the newly found GarD, it seems possible that Cdu1 may also play a role (and perhaps use its deubiquitinating activity) to further protect the inclusion. In light of published results showing that Cdu1 mutants have lower IFU burst size only in IFNg activated cells, this may be an important caveat in the current studies. This should be more directly addressed in the current manuscript.

      We have incorporated two experimental findings indicating that the presence of Cdu1 is not required for Ct to defend itself against IFN cellular immunity in human cells. These recent discoveries are now presented in the updated Figure 5 and detailed in lines 338-355 of the revised manuscript.

      d. On lines 433-434 the authors claim that Cdu1 is atypical since it is not encoded with the metaeffector/target pairs. However, this is an oversimplification of what is known about metaeffectors. For example, there are meta-effector/effector pairs that are not encoded together in Legionella (see table 1 DOI: https://doi.org/10.3390/pathogens10020108). Thus, the discussion should be adjusted. It seems Cdu1 is the first meta-effector found in Chlamydia, and maybe this should be highlighted more strongly rather than its uniqueness in this aspect of meta-effector/effector functions.

      In lines 488-489 of the revised manuscript, we have removed the assertion that Cdu1 functions as an atypical metaeffector and emphasized that it represents the initial discovery of a metaeffector within Ct.

    1. Author Response

      eLife assessment

      This important work describes the first high-resolution structure of HGSNAT, a lysosomal membrane protein required for the degradation of heparan sulfate (HS). Through careful structural analysis, this work proposes potential reasons why certain mutations in HGSNAT lead to lysosomal storage disorders and outlines the enzyme's catalytic mechanism. The experimental evidence presented provides incomplete support for the proposed molecular mechanism of the HS acetylation reaction and the impact of disease-causing mutations.

      We thank the editors and reviewers for taking the time to provide a critical assessment of our manuscript. We appreciate the input and suggestions to improve the analysis. Included here are only our provisional responses. We will address the concerns raised in more detail and incorporate them in the revised version of the manuscript.

      Reviewer #1 (Public Review):

      This article by Navratna et al. reports the first structure of human HGSNAT in an acetyl-CoA-bound state. Through careful structural analysis, the authors propose potential reasons why certain human mutations lead to lysosomal storage disorders and outline a catalytic mechanism. The structural data are of good quality, and the manuscript is clearly written. This study represents an important step toward understanding the mechanism of HGSNAT and is valuable to the field. I have the following suggestions:

      We thank the reviewer for their encouraging and positive overall assessment of our work.

      1. The authors should characterize whether the purified protein is active. Otherwise, how does one know if the detergent used maintains the protein in a biologically relevant state? The authors should at least attempt to do so. If these prove to be challenging, at the very least, the authors should try a cell-based assay to demonstrate that the GFP tag does not interfere with the function.

      Thank you for highlighting this concern. The cryo-EM sample was prepared without the exogenous addition of ligand, as noted in the manuscript; the acetyl-CoA that we see in the structure was intrinsically bound to the protein, indicating the ability of GFP-tagged HGSNAT protein to bind the ligand. We purified the protein at a pH optimal for acetyl-CoA binding, as suggested by Bame, K. J. and Rome, L. H. (1985) and Meikle, P. J. et al. (1995). Because we see acetyl-CoA in a structure obtained using a GFP fusion, we argue that GFP does not interfere with protein stability or the ability to bind the co-substrate. As demonstrated by existing literature, the HGSNAT-catalyzed reaction is compartmentalized spatially and conditionally. The binding of acetyl-CoA happens towards the cytosol and is optimal at pH 7.0-8.0, while the transfer of the acetyl group to heparan sulfate occurs towards the luminal side and is optimal at pH 5.0-6.0. We are working on establishing a robust assay to study this complicated and compartmentalized acetyl transfer reaction.

      1. In Figure 5, the authors present a detailed schematic of the catalytic cycle, which I find to be too speculative. There is no evidence to suggest that this enzyme undergoes isomerization, like a transporter, between open-to-lumen and open-to-cytosol states. Could it not simply involve some movements of side chains to complete the acetyl transfer?

      The acetyl-CoA bound structure presented in the paper does not conclusively support a potential for isomerization and conformational dynamics. We agree with the reviewer that the reaction schematic presented in Figure 5 is speculative. We acknowledge in the discussion that our structure represents only a single step of the reaction, and defining the precise mechanism of acetyl transfer needs additional work. However, we will reword the discussion and change Figure 5 to address this concern raised by multiple reviewers.

      Reviewer #2 (Public Review):

      Summary:

      This work describes the structure of Heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT), a lysosomal membrane protein that catalyzes the acetylation reaction of the terminal alpha-D-glucosamine group required for the degradation of heparan sulfate (HS). HS degradation takes place during the degradation of the extracellular matrix, a process required for restructuring tissue architecture, regulation of cellular function, and differentiation. During this process, HS is degraded into monosaccharides and free sulfate in lysosomes.

      HGSNAT catalyzes the transfer of the acetyl group from acetyl-CoA to the terminal non-reducing amino group of alpha-D-glucosamine. The molecular mechanism by which this process occurs has not been described so far. One of the main reasons to study the mechanism of HGSNAT is that multiple mutations spanning the entire sequence of the protein, such as nonsense mutations, splice-site variants, and missense mutations, lead to dysfunction that causes abnormal accumulation of HS within the lysosomes. This accumulation is a cause of mucopolysaccharidosis IIIC (MPS IIIC), an autosomal recessive neurodegenerative lysosomal storage disorder, for which there are no approved drugs or treatment strategies.

      This paper provides a 3.26 Å structure of HGSNAT, determined by single-particle cryo-EM. The structure reveals that HGSNAT is a dimer in detergent micelles and shows a density assigned to acetyl-CoA. The authors speculate about the molecular mechanism of the acetylation reaction, map the mutations known to cause MPS IIIC on the structure, and speculate about the nature of the HGSNAT dysfunction caused by such mutations.

      Strengths:

      The description of the architecture of HGSNAT is the highlight of the paper since this corresponds to the first description of the structure of a member of the transmembrane acyl transferase (TmAT) superfamily. The high-resolution structure of HGSNAT bound to acetyl-CoA is an important leap in our understanding of the HGSNAT mechanism. The density map is of high quality, except for the luminal domain. The location of the acetyl-CoA allows speculation about the mechanistic role of multiple residues surrounding this molecule. The authors thoroughly describe the architecture of HGSNAT and map the mutations leading to MPS IIIC. The description of the dimeric interface is a novel result, and future studies are left to confirm the importance of oligomerization for function.

      We thank the reviewer for their time and for highlighting both the quality and novelty of the structure presented in this work.

      Weaknesses:

      Apart from the cryo-EM structure, the article does not provide any other experimental evidence to support or explain a molecular mechanism. Due to the complete absence of functional assays, mutagenesis analysis, or other structures such as a ternary complex or an acetylated enzyme intermediate, the mechanistic model depicted in Figure 5 should be taken with caution.

      Thank you for pointing out this concern. The proposed mechanistic model in Figure 5 is a hypothesis based on previously reported biochemical characterization of HGSNAT by Rome & Crain (1981), Rome et al. (1983), Meikle et al. (1995) and Fan et al. (2011). However, we agree with the reviewer that this schematic is not experimentally proven and is speculative at best, especially because our structure presents only a single step of the reaction, which does not conclusively support either ping-pong or random-order bi-substrate reactions. We will rephrase this section of our discussion and edit Figure 5 to address this concern.

      The authors discuss that H269 is an essential residue that participates in the acetylation reaction, possibly becoming acetylated during the process. However, there is no solid experimental evidence, e.g. mutagenesis analysis or structural analysis, in this or previous articles, that demonstrates this to be the case.

      H269, as a crucial catalytic residue, was suggested by monitoring the effect of chemical modifications of amino acids on acetylation of HGSNAT membranes by Bame, K. J. and Rome, L. H. (1986). We agree that mutagenesis, catalysis, and structural evidence for the same are not currently available. We are pursuing a more thorough exploration of the role of both H269 (previous studies) and N258 (from this study) on the stability and function of HGSNAT.

      In the discussion part, the authors mention previous studies in which it was postulated that the catalytic reaction can be described by a random order mechanistic model or a Ping Pong Bi Bi model. However, the authors leave open the question of which of these mechanisms best describes the acetylation reaction. The structure presented here does not provide evidence that could support one mechanism or the other.

      We agree with the reviewer’s observation that the structure indeed does not support one reaction mechanism over the other. We are pursuing the structural and kinetic characterization of HGSNAT in the presence of other co-substrates and at multiple pH values, which is required to address this concern thoroughly.

      Although the authors map the mutations leading to MPS IIIC on the structure and use FoldX software to predict the impact of these mutations on folding and fold stability, there is no experimental evidence to support FoldX's predictions.

      We are working on assessing the impact of specific mutations on the stability of HGSNAT and will add them to the revised version of the manuscript. We thank the reviewer for this suggestion.

      Reviewer #3 (Public Review):

      Summary:

      Navratna et al. have solved the first structure of a transmembrane N-acetyltransferase (TNAT), resolving the architecture of human heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT) in the acetyl-CoA bound state using single particle cryo-electron microscopy (cryoEM). They show that the protein is a dimer and define the architecture of the alpha- and beta- GSNAT fragments, as well as convincingly characterizing the binding site of acetyl-CoA.

      Strengths:

      This is the first structure of any member of the transmembrane acyl transferase superfamily, and as such it provides important insights into the architecture and acetyl-CoA binding site of this class of enzymes.

      The structural data is of a high quality, with an isotropic cryoEM density map at 3.3Å facilitating the building of a high-confidence atomic model. Importantly, the density of the acetyl-CoA ligand is particularly well-defined, as are the contacting residues within the transmembrane domain.

      The open-to-lumen structure of HGSNAT presented here will undoubtedly lay the groundwork for future structural and functional characterization of the reaction cycle of this class of enzymes.

      We thank the reviewer for their positive assessment of the data presented in this work. We really appreciate and agree with the reviewer's comment that the “structure of HGSNAT presented here will undoubtedly lay the groundwork for future structural and functional studies.”

      Weaknesses:

      While the structural data for the open-to-lumen state presented in this work is very convincing, and clearly defines the binding site of acetyl-CoA, to get a complete picture of the enzymatic mechanism of this family, additional structures of other states will be required.

      We agree with the reviewer’s assessment and are heavily invested in pursuing the structures of all the steps of acetyl transfer by HGSNAT.

      A potentially significant weakness of the study is the lack of functional validation. The enzymatic activity of the enzyme characterized was not measured, and the enzyme lacks native proteolytic processing, so it is a little unclear whether the structure represents an active enzyme.

      We thank the reviewer for this comment. While the proteolytic cleavage of the protein remains debated, we find no evidence of such an event in our purification (SDS-PAGE and SEC). Studies such as Durand et al. (2010) and Fan et al. (2011) suggest that even the ER-retained monomeric HGSNAT is active. Because we see acetyl-CoA (co-substrate) bound to the protein in our structure, we surmise that proteolysis is not necessary for function, at least not for substrate binding. However, we are working towards the structural and kinetic characterization of recombinant α- and β-HGSNAT constructs to explore the role of proteolysis in HGSNAT stability and function.

    1. Author Response

      We are delighted that eLife has assessed our study as a valuable contribution and has appreciated the importance of working on asymptomatic reservoirs of P. falciparum in high-transmission settings, where not just children but also adolescents and adults harbor multiclonal infections. The constructive public reviews will serve to improve our manuscript.

      Detailed responses to referees’ comments and a revised manuscript are forthcoming. Here we make a provisional response to three key areas addressed by the referees:

      (1) census population size

      Referee 1 raises important questions although we respectfully disagree on the terminology we have adopted (of “census”) and on the unclear utility of the proposed quantity.

      We consider the quantity a census in that it is a total enumeration or count of the infections in a given population sample and over a given time period. In this sense, it gives us a tangible notion of the size of the parasite population, in an ecological sense, distinct from the formal effective population size used in population genetics. Given the low overlap between var repertoires of parasites (as observed in monoclonal infections), the population size we have calculated translates to a diversity of strains or repertoires. But our focus here is in a measure of population size itself. The distinction between population size in terms of infection counts and effective population size from population genetics has been made before for pathogens (see for example Bedford et al. 2011 for the seasonal influenza virus and for the measles virus) and is a clear one in the ecological literature for non-pathogen populations (Palstra et al. 2012).

      Both referees 1 and 2 point out that census population size will be sensitive to sample size. We completely agree with the dependence of our quantity on sample size. We used it for comparisons across time of samples of the same depth, to describe the large population size characteristic of high transmission, and persistent across the IRS intervention. Of course, one would like to be able to use this notion across studies that differ in sampling depth.

      Here, referee 1 makes an insightful and useful suggestion. It is true that we can use mean MOI, and indeed there is a simple map between our population size and mean MOI (as we just need to divide or multiply by sample size). We can do even more, as with mean MOI we can presumably extrapolate to the full sample size of the host population, or the population size of another sample in another location. What is needed for this purpose is a stable mean MOI relative to sample size. We can show that indeed in our study mean MOI is stable in that way, by subsampling to different depths of our original sample. We will include in the revision discussion of this point and result, which allows an extrapolation of the census population size to the whole population of hosts in the local area. We’ll also clarify the time denominator, as given the typical duration of infections, we expect our population size to be representative of a per-generation measure.
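
The map between census population size and mean MOI, and the subsampling check mentioned above, can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names, the MOI values, and the host population size of 1000 are all hypothetical, and the sketch assumes the census is simply the sum of per-host MOI in the sample.

```python
import random
from statistics import mean

def census_size(moi_values):
    """Census population size of a sample: the sum of per-host MOI."""
    return sum(moi_values)

def mean_moi(moi_values):
    """Mean MOI: census population size divided by sample size."""
    return census_size(moi_values) / len(moi_values)

def subsampled_mean_moi(moi_values, depth, n_reps=200, seed=0):
    """Average mean MOI over random subsamples (without replacement) of a given depth."""
    rng = random.Random(seed)
    return mean(mean_moi(rng.sample(moi_values, depth)) for _ in range(n_reps))

def extrapolated_census(moi_values, host_population_size):
    """Extrapolate the census to a larger host population, assuming mean MOI is stable."""
    return mean_moi(moi_values) * host_population_size

# Hypothetical per-host MOI estimates (illustrative only, not data from the study).
moi = [1, 2, 3, 1, 4, 2, 1, 5, 2, 3]

if __name__ == "__main__":
    print("census:", census_size(moi))
    print("mean MOI:", mean_moi(moi))
    for depth in (4, 6, 8, 10):
        print("depth", depth, "subsampled mean MOI:", subsampled_mean_moi(moi, depth))
    print("extrapolated census for 1000 hosts:", extrapolated_census(moi, 1000))
```

If the subsampled means stay close to the full-sample mean across depths, multiplying mean MOI by a host population size gives the kind of extrapolation described in the response.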

      Referee 2 suggests we adopt the term “census count” but as a census in our mind is a count we prefer to use “census”.

      Referee 3 considers that the genetic data tracking parasite MOI and census changes give the same result as prevalence, which tracks infected hosts. Respectfully, we disagree and will provide an expanded response.

      (2) the importance of lineages (in response to referee 2)

      We do not think that lineages moving exclusively through a given type of host or “patch” is a requirement for enumerating the size of the total infections in such a subset. It is true that what we have is a single parasite population, but we are enumerating for the season the respective size in host classes (children and adults). This is akin to enumerating subsets of a population in ecological settings.

      We are also not clear on the concept of lineage for these highly recombinant parasites as we struggle to find highly related repertoires. In fact, we see the use of the var fingerprinting methodology as a means to capture changes in strain or var repertoires dynamics as a result of changing transmission conditions.

      (3) var methodology

      Comments and queries were made by all three referees about aspects of var methodology, including the Bayesian approach. These will be addressed in our full response.

      Here we respond to a very good point made by referee 2: “Thinking about the applicability of this approach to other studies, I would be interested in a larger treatment of how overlapping DBLa repertoires would impact MOIvar estimates. Is there a definable upper bound above which the method is unreliable? Alternatively, can repertoire overlap be incorporated into the MOI estimator?”

      There is no predefined threshold one can present a priori. Intuitively, the approach to estimating MOI would appear to break down as overlap moves away from extremely low values, and therefore for locations with lower transmission intensity. Interestingly, we have observed that this is not the case in our paper by Labbé et al. 2023, where we used model simulations in a gradient of three transmission intensities, from high to low. The original varcoding method performed well across the gradient. This may arise from a nonlinear and fast transition from low overlap to high overlap that is accompanied by the MOI transitioning quickly from primarily multiclonal (MOI > 1) to monoclonal (MOI = 1). This issue needs to be investigated further, including ways to extend the estimation to explicitly include the distribution of DBL repertoire overlap.

      References: Bedford T, Cobey S, Pascual, M. 2011. Strength and tempo of selection revealed in viral gene genealogies. BMC Evol Biol 11, 220. https://doi.org/10.1186/1471-2148-11-220

      Labbé F, He Q, Zhan Q, Tiedje KE, Argyropoulos DC, Tan MH, Ghansah A, Day KP, Pascual M. 2023. Neutral vs. non-neutral genetic footprints of Plasmodium falciparum multiclonal infections. PLoS Comput Biol 19:e1010816. doi:doi.org/10.1101/2022.06.27.49780

      Palstra FP, Fraser DJ. 2012. Effective/census population size ratio estimation: a compendium and appraisal. Ecol Evol. Sep;2(9):2357-65. doi:10.1002/ece3.329.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The study isolated extracellular vesicles (EV) from healthy controls (HCs) and Parkinson patients (PwP), using plasma from the venous blood of non-fasting people. Such EVs were characterized and validated by the presence of markers, their size, and their morphology. The main aim of the manuscript is to correlate the presence of synaptic proteins, namely SNAP-25, GAP-43, and SYNAPTOTAGMIN-1, normalized with HSP70, with the clinical progression of PwP. Changes in synaptic proteins have been documented in the CSF of Alzheimer's and Parkinson's patients. The demographics of participants are adequately presented.

      • One important limiting, as well as puzzling, aspect is the fact that the authors did not find differences between groups at the beginning of the study or after one year, after age and sex adjustment.

      Response: Thanks for your comments. We acknowledge your observation that the absence of a discernible difference in plasma EV synaptic protein levels between the PD and control subjects constitutes a significant limitation of our study. This outcome could be attributed to the fact that the controls were recruited from the neurology outpatient clinic, representing a group that could be considered "sub-healthy." Moreover, these individuals are not exempt from aging-related neurodegenerative processes. Considering that our PD subjects are in the early stages of the disease (with a mean disease duration of less than 3 years) and that synaptic dysfunction is a broader indicator rather than specific to PD, these factors could collectively contribute to the lack of distinction between the PD and control groups.

      However, our primary intention was also to explore the potential of plasma EV synaptic proteins as predictive markers for disease progression in PD. In this regard, we have identified their applicability within the current PD cohort. We are committed to conducting further follow-up with these study subjects over an extended duration to delve deeper into these findings.

      We revised the following statement in the discussion to address this issue, as follows: “Additionally, synaptic dysfunction is a frequently observed phenomenon in several neurological diseases, and it is not exclusive to PD. Consequently, the HC group in our current study may have included individuals with coexisting neurological conditions, potentially explaining the lack of a significant difference between the PD group and the HCs. However, this approach also illuminates the significance of synaptic dysfunction in the advancement of PD. This insight can be invaluable for monitoring disease progression, particularly in the context of clinical trials focused on disease modification.”

      • Tables in general are hard to follow. Specifically, Table 2 does not convey a clear message, either in the text or in the Table itself, and the per 100% of change needs to be explained in the corresponding legend.

      Response: Thanks for your comment. In Table 2, our aim was to demonstrate the association between the change in plasma EV synaptic proteins and the change in clinical severity, presented as coefficient (p value). We apologize for any prior ambiguity in the main text's description of these results and have since made revisions to enhance clarity.

      Regarding the "per 100% change," this is due to the quantification of plasma EV synaptic proteins being based on a semi-quantitative Western blot method. Each measurement was normalized by the average baseline plasma synaptic protein levels of healthy controls (HCs). The term "per 100% change" denotes the increase or decrease in plasma EV synaptic protein abundance relative to the average baseline levels observed in healthy controls. We apologize for any confusion caused and have removed this term. In addition, we rephrased the statement in the Table legend of the revised manuscript to improve understanding and readability, as follows: “The association between the change of plasma EV synaptic proteins abundance (between baseline and follow-up) with the change of clinical severity in motor and cognitive domains (between baseline and follow-up) in people with Parkinson’s disease. A generalized linear model was employed and the data was presented as coefficient (p value).”
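
The normalization described in this response can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the authors' analysis code: the function names and all band-intensity values are hypothetical, and a change of 1.0 in normalized units is taken to correspond to a 100% change relative to the average baseline level in healthy controls.

```python
from statistics import mean

def normalize_to_hc(raw_value, hc_baseline_values):
    """Express a raw measurement relative to the average HC baseline level."""
    return raw_value / mean(hc_baseline_values)

def relative_change(baseline_raw, followup_raw, hc_baseline_values):
    """Change between baseline and follow-up, in units of the average HC baseline."""
    return (normalize_to_hc(followup_raw, hc_baseline_values)
            - normalize_to_hc(baseline_raw, hc_baseline_values))

# Hypothetical band intensities (arbitrary units), not data from the study.
hc_baseline = [1.5, 2.0, 2.5]      # healthy controls at baseline, mean = 2.0
patient_baseline = 2.0             # one patient at baseline
patient_followup = 3.0             # the same patient at follow-up

if __name__ == "__main__":
    change = relative_change(patient_baseline, patient_followup, hc_baseline)
    # A value of 0.5 means an increase equal to 50% of the average HC baseline level.
    print("normalized change:", change)
```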

      • It is only when PwP were classified as a first quartile that a significantly greater deterioration was found. However, in the case of tremor, the top 25% had values going from 0.46-0.47 to 0.32-0.35, whereas the lower three quarters went from 0.33-0.34 to 0.27-0.28 depending on the protein analyzed. This needs to be clarified in the text.

      Response: Thanks for your comments. As per the Unified Parkinson's Disease Rating Scale (UPDRS), a higher score indicates greater severity of symptoms. Regarding tremor, we observed a general trend of improvement in both groups. PwP with elevated baseline plasma EV proteins showed a trend toward worse tremor scores at baseline, and their improvement was significantly greater than that of the rest of the PwP. This improvement seems to contradict the progressive nature of PD, and one possible explanation could be the alleviation of symptoms due to medication usage. The assessment of motor symptoms took place within the hospital setting, where we refrained from requesting patients to withhold their anti-PD medications due to concerns about safety issues such as falls. Consequently, certain motor symptoms might have been effectively controlled by the anti-PD medication. Traditionally, symptoms like tremor and rigidity (as reflected by the akinetic rigidity score) respond well to medications, while postural instability and gait disturbance (PIGD) are less responsive. In our cohort, we noted an improvement in tremor scores and stability in akinetic rigidity (AR) scores. Conversely, PD patients with higher baseline plasma EV synaptic protein levels exhibited notable progression in PIGD scores. These findings have been documented in the results section and discussed comprehensively within the revised manuscript, as follows: “On the other hand, the evaluation of motor symptoms occurred in a hospital setting where we did not ask patients to stop taking their anti-PD medications due to safety concerns like the risk of falls. As a result, specific motor symptoms, particularly tremor and AR, which are more sensitive to medication compared to PIGD, may have been effectively managed by the anti-PD medications. This could potentially explain the improvement in tremor observed between the baseline and one-year follow-up, especially among PwP with elevated baseline plasma EV synaptic proteins.”

      • Table 3 is hard to read and some of the values seem repetitive, especially for tremor, AR, and PIGD. It looks as if Figure 2 represents the same information as Table 3.

      Response: Thanks for your information. We have ensured the accuracy of the results presented in Table 2. While some of the entries may appear similar, they do indeed possess distinct differences.

      To enhance readability, we streamlined the information in Table 3 by removing the p-values from the intra-group comparisons between baseline and the 1-year follow-up within each domain. We retained the original p-values for trend related to the inter-group comparisons for changes. Detailed information has been relocated to the supplementary section of the revised manuscript. In Figure 2, we illustrated the relationship between baseline plasma extracellular vesicle (EV) synaptic protein levels and the clinical assessment parameters during follow-up in patients with Parkinson's disease (PwP). This portrayal is distinct from the information depicted in Table 3.

      If you had concerns about the resemblance between Table 3 and Figure 3, please note that the values in Table 3 represent raw scores, while the values in Figure 3, namely the estimated marginal means, are the "adjusted" scores for UPDRS-II and PIGD at baseline and follow-up. These adjustments encompass age, sex, and disease duration. We sincerely apologize for any lack of clarity in our previous description and have since revised it accordingly.

      • The text and figure legends are not helpful in guiding the reader to understand the presented information.

      Response: Thanks for your comments, and we apologize for the unclear statements. We have revised the figure legends and the main text to make them easier for readers to follow.

      Reviewer #2 (Public Review):

      Hong and collaborators investigated variations in the amount of synaptic proteins in plasma extracellular vesicles (EV) in Parkinson's Disease (PD) patients on one-year follow-up. Their findings suggest that plasma EV synaptic proteins may be used as clinical biomarkers of PD progression.

      • It is a preliminary study using semi-quantitative analysis of synaptic proteins.

      Response: Thanks for your comments. The present study represents the initial phase of our investigation into the role of plasma EV synaptic proteins within our PD cohort. Our findings have revealed the potential predictive significance of these synaptic proteins in relation to PD progression. We are committed to conducting further follow-up with these study subjects over an extended period.

      Furthermore, it's important to acknowledge that the semi-quantitative approach employed to assess protein abundance was a limitation of this study. This limitation stems from the low concentration of plasma EV synaptic proteins, which restricts the feasibility of utilizing techniques such as ELISA or other quantitative methods for protein assessment. We have duly acknowledged this limitation within the scope of the present study, as follows: “Semiquantitative assessment of plasma EV synaptic protein (SNAP-25, GAP-43, and synaptotagmin-1) levels was performed using western blot analysis. The lack of absolute values limits further clinical application.”

      Moving forward, we intend to adopt alternative EV isolation methods that enable the extraction of a larger abundance of plasma EV proteins, facilitating more accurate quantitative assessments. In addition, a longer longitudinal follow-up is warranted to clearly assess the prognostic efficacy of plasma EV synaptic proteins in PwP, which we have mentioned in the manuscript.

      • The authors have a cohort of PD patients with clinical examination and a know-how on EV purification. Regarding this latter part, they may improve their description of EV purification. EV may be broken into smaller size EV after freezing. Does it explain the relatively small size in their EV preparation? Do the authors refer to the MISEV guidelines for EV purity?

      Response: Thanks for your comments. In our previous manuscript, we provided a relatively detailed account of the procedures related to EV isolation and validation (https://doi.org/10.1096/fj.202100787R). In the revised manuscript, we added some information about the principle of the EV isolation kit and the validation antibodies, as follows: “Plasma EVs were isolated from 1 mL of plasma using the exoEasy Maxi Kit (Qiagen, Valencia, CA, USA), which uses a membrane-based affinity binding step to isolate exosomes and other EVs without relying on a particular epitope, in accordance with the manufacturer’s instructions, and stored in a −80 °C freezer. The isolated plasma EVs were then eluted and stored. Usually, 400 μL of eluate is obtained per mL of plasma. The isolated plasma EVs were validated according to the International Society for Extracellular Vesicles guidelines, which include: (1) markers, including the presence of CD63 (ab59479, Abcam, Cambridge, UK), CD9 (ab92726, Abcam, Cambridge, UK), and tumor susceptibility gene 101 protein (GTX118736, GeneTex, CA, USA), and the absence of cytochrome c (ab110325; Abcam, Cambridge, UK); (2) physical characterization through nanoparticle tracking analysis, which demonstrated that the majority of EVs are within 50-100 nm in size; and (3) morphology from electron microscopy analysis. The validation has been described previously [29-31].”

      It's important to note that our primary focus was on exosomes, the smallest subtype of EVs. Through nanoparticle tracking analysis, we observed that the majority of isolated EVs fell within the diameter range of 50-150nm, exhibiting significant surface marker (i.e. CD63 and CD9) expression. Moreover, electron microscopy confirmed their vesicular morphology. These meticulously validated EVs were promptly analysed post-isolation.

      However, we acknowledge that the plasma obtained from study participants might have undergone freezing prior to EV isolation. This freezing process has the potential to diminish the yield rate of EVs and result in some degree of fragmentation. We have duly included this issue as a limitation in our revised manuscript, as follows: “The final technical issue in the present study was the relatively small size of the isolated EVs. Despite the primary focus on isolating exosomes, which are the smallest type of EVs, it's important to consider that the presence of small-sized EVs could potentially be attributed to EV fragmentation that occurs during the freezing and thawing processes.”

      • Regarding synaptic protein quantification, the choice of western blotting may not be the best one. ELISA and other multiplex arrays are available. How do the authors justify their choice?

      Response: Thanks for your comments. We appreciate your input regarding the semi-quantitative western blot analysis not being the most optimal approach. Owing to the limited quantity of isolated plasma EVs and the significant protein abundance of synaptic proteins within these EVs, we did explore the use of an ELISA assay. However, it's worth noting that for a specific subset of the samples, the readout obtained was below the lower limit of detection of the ELISA kit. In response, we have incorporated this point as a limitation within the discussion section of the revised manuscript, as follows: “Semiquantitative assessment of plasma EV synaptic protein (SNAP-25, GAP-43, and synaptotagmin-1) levels was performed using western blot analysis. The lack of absolute values, i.e. from the results of enzyme-linked immunosorbent assay, limits further clinical application.”

      • Do the authors try to sort plasma EV by membrane-associated neuronal EV markers using either vesicle sorting or immunoprecipitation?

      Response: Thanks for your comments. The current study did not specifically isolate neuron-derived extracellular vesicles (EVs), potentially introducing some bias to the results. However, it's important to note that synaptic proteins, such as SNAP-25, exhibit a high degree of neuron-specific expression, with a predominant presence in the brain (as indicated by https://www.proteinatlas.org/ENSG00000132639-SNAP25/tissue). Given this context, the limitation of not analyzing neuron-derived EVs could be mitigated to some extent. In response, we have incorporated this point as a limitation within the discussion section of the revised manuscript, as follows: “Furthermore, this study evaluated the overall plasma EVs rather than specifically focusing on neuron-derived exosomes, potentially introducing a bias towards somatic-origin EVs. Nonetheless, it is worth noting that synaptic proteins primarily originate from neurons. Even when considering neuron-derived exosomes, it's important to recognize that they are not exclusively derived from the brain, which can lead to contamination from the peripheral nervous system.”

      • Many technical aspects may be improved. Such technical questions weakened the authors' conclusions.

      Response: Thanks for your comments. We recognize that the aforementioned issues represent limitations of our current study. In response, we have incorporated these points as limitations, including the semi-quantitative assessments, the isolation of total but not neuron-derived exosomes in the plasma, and the short follow-up time within the discussion section of the revised manuscript.

      • The discussion is pretty long to justify the data. It may be shortened by adding some information in the introduction.

      Response: Thanks for your comments. We have repositioned a statement from the second paragraph of the discussion to the introduction. This adjustment serves to enrich the background understanding of the link between synaptic dysfunction and neurodegenerative diseases.

    1. Author Response

      Reviewer #1 (Public Review)

      The manuscript by Singh et al. proposes a new theoretical model for the phenomenon of planar cell polarity (PCP). The new model simulates the emergence of the subcellular polarity of the Fat-Ds pathway, based on the interactions of the protocadherins Fat and Ds at the boundary between cells and in response to external gradients. Several mathematical models for PCP have been previously developed focusing on different aspects of PCP, including domineering non-autonomy (Amonlirdviman et al.), the effect of stochasticity on polarity (Burak et al.), gradient sensing (Mani et al.), and formation of molecular bridges (Fisher et al.), to name a few. The current modeling approach suggests a new model, based on a relatively simple set of equations for membrane Fat and Ds and their interactions, both in 1D (line of cells) and in 2D (hexagonal array). The equations are relatively simple on one hand, allowing tractable computational analysis as well as analytical approximations, while on the other hand allowing tracking of membrane protein levels, which is what is measured experimentally. It has been previously shown that achieving polarity requires local feedback that amplifies complexes in one orientation at the expense of complexes in the opposite orientation (e.g. Mani et al.). Interestingly, the current manuscript shows that a simple assumption, that Fat-Ds complexes are stabilized when bound, is sufficient to induce PCP when concentrations are high enough. The authors use the model to show how it captures several experimental observations, as well as to analyze the sensitivity to noise, the response to gradients, and the response to local perturbations (mutant clones). The manuscript is clear and the analysis is mostly coherent and sensible (although some parts need to be clarified, see below). The main issue I have with the manuscript is that it mostly describes how it captures different features that were mostly explained in previous models. I do think the authors should do more with their model to explain features that were not explained by other models, and/or generate non-trivial predictions that can be tested experimentally.

      We thank the reviewer for the positive feedback and valuable comments. To address the concerns, we have comprehensively modified the manuscript by including new results and detailing the specific model predictions and their potential experimental tests.

      Reviewer #2 (Public Review):

      The setting of planar cell polarity in epithelial tissues involves a complex interplay of chemical interactions. While local interactions can spontaneously give rise to cell polarity, planar cell polarity also involves tissue scale gradients whose effects are not clear. To understand their role, the authors built a minimal mechanistic model in considering two atypical cadherins, Fat (Ft) and Dachsous (Ds) which can associate at cell-cell interfaces to form hetero-dimers in which monomers belong to adjacent cells. This association can be seen as a local interaction between cells and is also sensitive to overall concentration gradients. From their model which appears to capture diverse experimental observations, the authors conclude that tissue-scale gradients provide to planar cell polarity a directional cue and some robustness to cellular stochasticity. While this model comes after similar works reaching similar predictions, the quality of this model is in its simplicity, its convenience for experimental testing, and the diversity of experimental observations it recapitulates.

      A strength of this work is to recapitulate many experimental observations made on planar cell polarity. It, for example, seems to capture the response of tissues to perturbations such as local downregulation of some important proteins, and the polarity patterns observed in the presence of noise in synthesis or cell-to-cell heterogeneity. It also gives a mechanistic description of planar cell polarity, making its experimental interpretation simple. Finally, the simplicity of the model facilitates its exploration and makes it easily testable because of the reduced amount of free model parameters.

      A weakness of this work is that it comes after several models with similar hypotheses and similar predictions.

      Another weakness is that some conclusions of this work rely on visual appreciation rather than quantification. This is particularly true for what concerns 2D patterns. An argument of the authors is for example that their model reproduces a variety of known spatial patterns, but the comparison with experiments is only visual and would be more convincing in being more quantitative.

      We are grateful to the reviewer for a critical evaluation of the manuscript and for giving important suggestions. We have incorporated all the comments and revised the manuscript accordingly by including quantitative analysis of all the results presented.

      Reviewer #3 (Public Review):

      Using theory, the authors study mechanisms for establishing planar cell polarity (PCP) through local and global modules. These modules refer to the interaction between neighbouring cells and tissue-wide gradients, respectively. Whereas local interactions alone can lead to tissue-wide alignment PCP, a global gradient can set the direction of PCP and maintain the pattern in presence of noise. In contrast, the authors argue that a global gradient can only generate PCP to an extent that is proportional to the gradient magnitude.

      The authors formulate a discrete model in one and two spatial dimensions that describe the assembly dynamics of PCP proteins on membranes. The number of proteins per cell remains constant. Additive noise is introduced to account for stochasticity in the attachment/detachment kinetics of proteins. Furthermore, ’quenched’ noise is introduced to account for variations of protein numbers between cells. The authors perform simulations of the stochastic discrete model in various situations. In addition, they derive a continuum description to perform some analytical computations.

      The strength of this analysis relies clearly on showing that simple dynamics can lead to tissue-wide PCP even in absence of a gradient in protein expression. A number of phenomena observed in tissues are qualitatively reproduced. In two spatial dimensions, they find swirling patterns that resemble patterns found in tissues when a global gradient is absent. The model also captures qualitative effects due to the down-regulation of one of the PCP proteins in a certain region of the tissue.

      The main weak point is that, from a physical point of view, the findings are not particularly surprising. Furthermore, some assumptions underlying the model need more justification. This holds notably for the question of why additive noise is appropriate to account for the effect of stochasticity in the attachment-detachment dynamics of the proteins. Finally, the authors consider a situation that they consider to be one of the most interesting features of PCP, namely, the formation of PCP in the presence of a region with a down-regulated PCP protein and in the presence of a gradient. Unfortunately, the effect is not very clear and the data provided remain limited.

      We thank the reviewer for the valuable comments and critique of the work. We have considered all the concerns and revised the manuscript comprehensively. In particular, we have elaborated the sections on model assumptions and added new figures/figure-panels to quantitatively present the model predictions. We have also revised the details of the one-dimensional continuum theory for PCP which, we feel, presents a detailed quantitative picture of PCP and its dependence on model parameters.

    1. Author Response

      Reviewer #2 (Public Review):

      In this study, Leiba et al. aim to establish the developing zebrafish embryo as a suitable infection model to study Salmonella persistence in vivo. Under environmental stress (e.g., in macrophage phagosomes), a proportion of bacteria switch to a slow/arrested growth state conferring increased resistance to antibiotic treatments. Persisters are increasingly linked to infection relapses. Understanding how persistent infections emerge and how bacteria survive in an organism for a long time without replicating before switching back to a replicative state is essential. Zebrafish represents an alternative model to mice, offering the possibility to image the whole organism and capture persistence with an amazing spatio-temporal resolution.

      In this paper, the authors demonstrate that persistent infections of Salmonella can be reproduced in the developing zebrafish. The kinetics of infection have been well characterized and show a very nice heterogeneity between animals, demonstrating the complex host-pathogen interactions (Fig 1). From the perspective of persistence, the presence of Salmonella survivors to host clearing is reported until 14dpi, demonstrating the possibility to induce persistent infection in this model. Throughout the manuscript, the authors have used a variety of state-of-the-art techniques illustrating the flexibility of this model, including microscopy and imaging of specific immune populations, various transgenic animals and selective depletion of macrophages or neutrophils to assess their relative contributions. Overall, the conclusions of the authors are well supported by the presented data. This said, the authors should strengthen the conclusions of the paper by providing a better characterization of the infection.

      Major comments:

      1) Figure 1: What is the general life-span of the fish?

      The general life-span of the zebrafish is approximately 3 years on average. Persistent infection is determined by the existence of a fraction of bacteria that endure over an extended period (after 96 hpi). Further, we observed Salmonella persistence for 14 days. In figure 1, we don’t think that the information of the general life-span of the zebrafish is critical.

      2) Figure 2: It would be nice to clearly state what infection scenario we are looking at. Have the authors studied "high proliferation", "infected" or "cleared" zebrafish?

      In Figure 2 we have studied the "infected" group. Both "high proliferation" and "cleared" larvae were excluded from the analysis. This is now clearly stated in the legend of Figure 2.

      3) Figure 3 and 4: It would be very informative if the authors can tell us what proportion of Salmonella is associated with macrophages and neutrophils. From panel C and D (Figure 3) and Figure 4 C and D and Suppl Fig 1, it seems that a lot of bacteria are extracellular. Maybe an EM image of the tissue would help to understand if the bacteria are "all" intracellular or extracellular.

      We apologize for any misunderstanding regarding the presence of intra- and extracellular bacteria depicted in Figure 3 C and D, Figure 4 C and D and Figure 3 - Suppl Fig 1. These figures illustrate infection experiments conducted in single-reporter larvae, limiting our analysis to bacteria associated with a single cell type. In Figure 3G and Figure 4E-G, the panels depict infection experiments carried out in dual-reporter larvae, showing bacteria associated or not with macrophages and neutrophils. The present study aimed to establish the role of neutrophils and macrophages in the control of early and persistent Salmonella infection, but further studies will focus on the exact localization of Salmonella during the course of the infection. Despite being a challenging technique for zebrafish, electron microscopy could be of great interest, as it allows any type of cell to be visualized at high resolution (to determine if all bacteria are intracellular).

      4) Figure 3 and 4: It would be very useful if the authors can tell us if the intracellular bacteria are mainly found individually (like in Figure 3C) or does host cells harbor many intracellular bacteria. Looking at figure 4G: it is not clear to me how many intracellular bacteria can be counted on this image.

      This is an interesting suggestion. At present, an accurate quantification of the intracellular bacteria on microscopy 3D-datasets is challenging because bacteria aggregate inside the cells. At 4 hpi, single bacteria can occasionally be observed outside leukocytes, while most infected macrophages harbored several intracellular bacteria (bacterial aggregates). To compare the levels of intracellular bacteria between acute and persistent stages, we measured the size of E2Crimson-positive (E2Crimson+) events. At 5 hpi, the median volume of E2Crimson+ events was lower than that at 4 dpi. The size distribution analysis of E2Crimson+ events indicated a higher representation of smaller volumes (0.5-1.5 µm³ and 1.5-10 µm³) at 5 hpi compared to 4 dpi, a stage during which very large E2Crimson+ events were observed (between 100-1000 µm³, with some exceeding 1000 µm³). This observation suggests an elevated presence of intracellular bacteria within the cells during persistent stages and that intracellular bacteria are predominantly observed as multiple rather than as solitary entities. This analysis has been incorporated in new Figure 5.
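
      As an illustration only, the size-distribution comparison described in this response could be sketched as follows. This is a minimal sketch, not the authors' pipeline: the function name `size_distribution`, the handling of events below 0.5 µm³, and the intermediate 10-100 µm³ bin (which the response does not quote) are assumptions.

```python
from bisect import bisect_right

# Bin edges in cubic micrometres, taken from the ranges quoted in the response.
# The 10-100 bin is an assumption that fills the gap between the quoted ranges.
BIN_EDGES = [0.5, 1.5, 10.0, 100.0, 1000.0]
BIN_LABELS = ["0.5-1.5", "1.5-10", "10-100", "100-1000", ">=1000"]


def size_distribution(volumes):
    """Return the fraction of fluorescent events that fall in each volume bin."""
    counts = [0] * len(BIN_LABELS)
    kept = 0
    for v in volumes:
        if v < BIN_EDGES[0]:
            continue  # below the smallest quoted bin; ignored in this sketch
        idx = bisect_right(BIN_EDGES, v) - 1  # index of the bin containing v
        counts[idx] += 1
        kept += 1
    if kept == 0:
        return {label: 0.0 for label in BIN_LABELS}
    return {label: c / kept for label, c in zip(BIN_LABELS, counts)}
```

      Applying the same function to the event volumes measured at 5 hpi and at 4 dpi would give two distributions that can be compared bin by bin.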

      5) Figure 3 and 4: The authors should also perform an experiment with a Salmonella strain harboring a growth reporter to quantify the amount of replicating and non-replicating bacteria. This experiment is not absolutely necessary for the story, but if possible, it would provide a very nice add-up to the story and impact to the paper.

      We welcome the reviewer’s suggestion, which we have indeed considered and plan to carry out in the future, along with experiments more oriented toward the bacterial side.

      6) Figure 6: The authors should provide in suppl. the flow cytometry scatter plots used to delineate the different subpopulations.

      We agree with the reviewer that the flow cytometry scatter plots used to delineate the different subpopulations were missing and are now incorporated in new Fig 7 - figure supplement 2.

      7) Figure 6: A specific characterization of macrophages harboring Salmonella persisters at 4dpi is missing. As shown by the authors in Figure 6, the tnfa- populations of macrophages at 4dpi are very similar for both infected and non-infected larvae. Persisters should indeed reside within tnfa- macrophages but they should also induce a specific signature through the actions of Salmonella effectors. Measuring this signature will allow a direct comparison with published data in mice and assess how accurately the zebrafish model recapitulates the manipulation of macrophages by Salmonella.

      We agree with the reviewer that a specific characterization of macrophages harboring persistent Salmonella at 4 dpi is missing. However, due to a technical limitation inherent to the model (limited recovery of infected cells following FACS sorting), we were not able to specifically sort infected macrophages at 4 dpi.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper combines a number of cutting-edge approaches to explore the role of a specific mouse retinal ganglion cell type in visual function. The approaches used include calcium imaging to measure responses of RGC populations to a collection of visual stimuli and CNNs to predict the stimuli that maximally activate a given ganglion cell type. The predictions about feature selectivity are tested and used to generate a hypothesized role in visual function for the RGC type identified as interesting. The paper is impressive; my comments are all related to how the work is presented.

      We thank the reviewer for appreciating our study and for the interesting comments.

      Is the MEI approach needed to identify these cells?

      To briefly summarize the approach, the paper fits a CNN to the measured responses to a range of stimuli, extracts the stimulus (over time, space, and color) that is predicted to produce a maximal response for each RGC type, and then uses these MEIs to investigate coding. This reveals that G28 shows strong selectivity for its own MEI over those of other RGC types. The feature of the G28 responses that differentiate it appears to be its spatially-coextensive chromatic opponency. This distinguishing feature, however, should be relatively easy to discover using more standard approaches.
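
      For readers unfamiliar with the idea, the core of an MEI search (finding the input that maximizes a model's predicted response under a stimulus-energy constraint) can be sketched as projected gradient ascent. This is a toy sketch under stated assumptions, not the paper's method: `find_mei` and `toy_cell` are hypothetical names, the model is a two-dimensional softplus unit rather than a CNN, the gradient is estimated by finite differences rather than backpropagation, and the fixed-norm constraint is an assumption.

```python
import math


def find_mei(model, dim, norm=1.0, steps=200, lr=0.1, eps=1e-5):
    """Projected gradient ascent: maximize model(x) subject to ||x|| == norm."""
    x = [norm / math.sqrt(dim)] * dim  # arbitrary starting stimulus
    for _ in range(steps):
        # Central finite-difference estimate of the gradient of the response.
        grad = []
        for i in range(dim):
            xp = x[:]
            xp[i] += eps
            xm = x[:]
            xm[i] -= eps
            grad.append((model(xp) - model(xm)) / (2 * eps))
        # Ascent step, then projection back onto the fixed-norm sphere.
        x = [xi + lr * g for xi, g in zip(x, grad)]
        length = math.sqrt(sum(xi * xi for xi in x)) or 1.0
        x = [xi * norm / length for xi in x]
    return x


def toy_cell(x):
    """Hypothetical 'opponent' unit: excited by channel 0, suppressed by channel 1."""
    drive = 0.6 * x[0] - 0.8 * x[1]
    return math.log1p(math.exp(drive))  # softplus output nonlinearity


mei = find_mei(toy_cell, dim=2)
```

      In this toy case the search converges to a stimulus with opposite signs in the two channels, which is the sense in which an MEI can expose opponency that is built into the model.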

      The concern here is that the paper could be read as indicating that standard approaches to characterizing feature selectivity do not work and that the MEI/CNN approach is superior. There may be reasons why the latter is true that I missed or were not spelled out clearly. I do think the MEI/CNN approach as used in the paper provides a very nice way to compare feature selectivity across RGC types - and that it seems very well suited in this context. But it is less clear that it is needed for the initial identification of the distinguished response features of the different RGC types. What would be helpful for me, and I suspect for many readers, is a more nuanced and detailed description of where the challenges arise in standard feature identification approaches and where the MEI/CNN approaches help overcome those challenges.

      Thank you for the opportunity for clarification. In fact, the MEI (or an alternative nonlinear approach) is strictly necessary to discover this selectivity: as we show above (response #1 to editorial summary), the traditional linear filter approach does not reveal the color opponency. We realize that this fact was not made sufficiently clear in the initial submission. In the revised manuscript, we now include this analysis. Moreover, throughout the manuscript, we added explanations on the differences between MEIs and standard approaches and more intuitions about how to interpret MEIs. We also added a section to the discussion dedicated to explaining the advantages and limitations of the MEI approach.

      Interpretation of MEI temporal structure

      Some aspects of the extracted MEIs look quite close to those that would be expected from more standard measurements of spatial and temporal filtering. Others - most notably some of the temporal filters - do not. In many of the cells, the temporal filters oscillate much more than linear filters estimated from the same cells. In some instances, this temporal structure appears to vary considerably across cells of the same type (Fig. S2). These issues - both the unusual temporal properties of the MEIs and the heterogeneity across RGCs of the same type - need to be discussed in more detail. Related to this point, it would be nice to understand how much of the difference in responses to MEIs in Figure 4d is from differences in space, time, or chromatic properties. Can you mix and match MEI components to get an estimate of that? This is particularly relevant since G28 responds quite well to the G24 MEI.

      One advantage of the MEI approach is that it allows us to distinguish between transient and sustained cells in a way that is not possible with the linear filter approach: because we seek to maximize activity over an extended period of time, transient cells need to be repetitively stimulated, whereas sustained cells will also respond in the absence of multiple contrast changes. In the revised manuscript, we add a section explaining this, together with Figure 3-supplement 2, illustrating this point by showing that oscillations disappear when we optimize the MEI for a short time window. The benefit of a longer time window lies in the increased discriminability between transient and sustained cells, which is also shown in the new supplementary figure.

      Regarding the heterogeneity of MEIs, this is most likely due to heterogeneity within the RGC group: “The mixed non-direction-selective groups G17 and G31 probably contain more than one type, as supported by multiple distinct morphologies and genetic identities (for example, G31,32, Extended Data Fig. 5) or response properties (for example, G17, see below)” (Baden et al. Nature 2016). We added a paragraph in the Results section.

      Concerning the reviewer’s last point: We agree that it is important to know whether the defining feature - i.e., the selectivity for chromatic contrast - is robust against variations in other stimulus properties. New electrophysiological data included in the manuscript (Fig. 6e,f) offers some insights here. We probed G28/tSbC cells with full-field flashed stimuli that varied in chromatic contrast. Despite not matching the cell’s preferred spatial and temporal properties, this stimulus still recovered the cell’s preference for chromatic contrast. While we think it is an interesting direction to systematically quantify the relative importance of temporal, spatial and chromatic MEI properties for an RGC type’s responses, we think this is beyond the scope of this manuscript.

      Explanation of RDM analysis

      I really struggled with the analysis in Figure 5b-c. After reading the text several times, this is what I think is happening. Starting with a given RGC type (#20 in Figure 5b), you take the response of each cell in that group to the MEI of each RGC type, and plot those responses in a space where the axes correspond to responses of each RGC of this type. Then you measure euclidean distance between the responses to a pair of MEIs and collect those distances in the RDM matrix. Whether correct or not, this took some time to arrive at and meant filling in some missing pieces in the text. That section should be expanded considerably.
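
      The procedure as reconstructed in this comment can be written down compactly. The following is a minimal sketch of that reading only; the function name `rdm` and the nested-list data layout are assumptions, not taken from the manuscript.

```python
import math


def rdm(responses):
    """Representational dissimilarity matrix for one RGC type.

    responses[m][c] is the response of cell c (all cells of the same type)
    to the MEI of RGC type m. Entry (i, j) of the result is the Euclidean
    distance between the population responses to MEI i and MEI j.
    """
    n = len(responses)
    return [
        [math.dist(responses[i], responses[j]) for j in range(n)]
        for i in range(n)
    ]
```

      Each row of `responses` is one point in a space whose axes are the individual cells, so a small entry means the population responds similarly to the two MEIs.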

      We appreciate the reviewer’s efforts to understand this analysis and confirm that they interpreted it correctly. However, we decided to remove the analysis. The point we were trying to make with this analysis is that the transformation implemented by G28/tSbC cells “warps” stimulus space and increases the discriminability of stimuli with characteristics similar to the cell’s MEI. We now make this point in a - we think - more accessible manner by the new analysis about the nonlinearity of G28/tSbC cell’s color opponency (see above).

      Centering of MEIs

      How important is the lack of precise centering of the MEIs when you present them? It would be helpful to have some idea about that - either from direct experiments or using a model.

      In the electrophysiological experiments, the MEIs were centered precisely (now Fig. 5 in revised manuscript) and these experiments yielded almost identical results to the 2P imaging experiments, where the MEIs were presented on a grid to approach the optimal position for the recorded cells. Additionally, all model simulations work with perfectly centered MEIs. We hence conclude that our grid-approach at presenting stimuli provided sufficient precision in stimulus positioning.

      We added this information to the revised manuscript.

      Reviewer #2 (Public Review):

      This paper uses two-photon imaging of mouse ganglion cells responding to chromatic natural scenes along with convolutional neural network (CNN) models fit to the responses of a large set of ganglion cells. The authors analyze CNN models to find the most effective input (MEI) for each ganglion cell as a novel approach to identifying ethological function. From these MEIs they identify chromatic opponent ganglion cells, and then further perform experiments with natural stimuli to interpret the ethological function of those cells. They conclude that a type of chromatic opponent ganglion cell is useful for the detection of the transition from the ground to the sky across the horizon. The experimental techniques, data, and fitting of CNN models are all high quality. However, there are conceptual difficulties with both the use of MEIs to draw conclusions about neural function and the ethological interpretations of experiments and data analyses, as well as a lack of comparison with standard approaches. These bear directly both on the primary conclusions of the paper and on the utility of the new approaches.

      We thank the reviewer for the detailed comments.

      1) Claim of feature detection.

      The color opponent cells are cast as a "feature detector" and the term 'detector' is in the title. However insufficient evidence is given for this, and it seems likely a mischaracterization. An example of a ganglion cell that might qualify as a feature detector is the W3 ganglion cell (Zhang et al., 2012). These cells are mostly silent and only fire if there is differential motion on a mostly featureless background. Although this previous work does not conduct a ROC analysis, the combination of strong nonlinearity and strong selectivity are important here, giving good qualitative support for these cells as participating in the function of detecting differential motion against the sky. In the present case, the color opponent cells respond to many stimuli, not just transitions across the horizon. In addition, for the receiver operator characteristic (ROC) analysis as to whether these cells can discriminate transitions across the horizon, the area under the curve (AUC) is on average 0.68. Although there is not a particular AUC threshold for a detector or diagnostic test to have good discrimination, a value of 0.5 is chance, and values between 0.5 and 0.7 are considered poor discrimination, 'not much better than a coin toss' (Applied Logistic Regression, Hosmer et al., 2013, p. 177). The data in Fig. 6F is also more consistent with a general chromatic opponent cell that is not highly selective. These cells may contribute information to the problem of discriminating sky from ground, but also to many other ethologically relevant visual determinations. Characterizing them as feature detectors seems inappropriate and may distract from other functional roles, although they may participate in feature detection performed at a higher level in the brain.

      The reviewer apparently uses a rather narrow definition of a feature detector. We, however, argue for a broader definition, which, in our view, better captures the selectivities described for RGCs in the literature. For example, while W3 cells have been quite extensively studied, one can probably agree that so far only a fraction of the possible stimulus space has been explored. Therefore, it cannot be excluded that W3 cells also respond to features other than small dark moving dots, but we (like the reviewer) still refer to them as feature detectors. Or, for instance, direction-selective (DS) RGCs are commonly considered feature detectors (i.e., responsive to a specific motion direction), although they also respond to flashes and spike when null-direction motion is paused (Barlow & Levick J Physiol 1965).

      The G28/tSbC cells’ selectivity for full-field changes in chromatic contrast enables them to encode ground-sky horizon transitions reliably across stimulus parameters (e.g., see new Fig. 7i panel). This cell type is thus well-suited to contribute to detecting context changes, as elicited by ground-sky transitions.

      Therefore, we think that the G28/tSbC RGC can be considered a feature detector and as such, could be used at a higher level in the brain to quickly detect changes in visual context (see also Kerschensteiner Annu Rev Vis Sci 2022). Still, their signals may also be useful for other computations (e.g., defocus, as discussed in our manuscript).

      Regarding the ROC analysis, we acknowledge that an average AUC of .68 may seem comparatively low; however, this is based on the temporally downsampled information (i.e., by way of Ca2+ imaging) gathered from the activity of a single cell. A downstream area would have access to the activity of a local population of cells. This AUC value should therefore be considered a lower bound on the discrimination performance of a downstream area. We now comment on this in the manuscript.
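
      For reference, the AUC discussed here has a simple probabilistic reading: it is the probability that a response drawn from one stimulus class exceeds a response drawn from the other, with ties counted as one half. The sketch below computes it that way; the function name `roc_auc` and the two-list input format are assumptions, and this is not necessarily how the authors computed their values.

```python
def roc_auc(pos, neg):
    """Area under the ROC curve via pairwise comparison.

    pos: responses on trials containing the feature (e.g. a horizon transition)
    neg: responses on trials without it
    Returns 0.5 at chance and 1.0 for perfect separation.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))
```

      On this reading, the reported average of 0.68 means that a single cell's response on a transition trial exceeds its response on a non-transition trial in roughly two out of three random pairings.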

      2) Appropriateness of MEI analysis for interpretations of the neural code.

      There is a fundamental incompatibility between the need to characterize a system with a complex nonlinear CNN and then characterizing cells with a single MEI. MEIs represent the peak in a complex landscape of a nonlinear function, and that peak may or may not occur under natural conditions. For example, MEIs do not account for On-Off cells, On-Off direction selectivity, nonlinear subunits, object motion sensitivity, and many other nonlinear cell properties where multiple visual features are combined. MEIs may be a useful tool for clustering and distinguishing cells, but there is not a compelling reason to think that they are representative of cell function. This is an open question, and thus it should not be assumed as a foundation for the study. This paper potentially speaks to this issue, but more work is needed to support the usefulness of the approach. Neural networks enable a large set of analyses to understand complex nonlinear effects in a neural code, and it is well understood that the single-feature approach is inadequate for a full understanding of sensory coding. A great concern is that the message that the MEI is the most important representative statistic directs the field away from the primary promise of the analysis of neural networks and takes us back to the days when only a single sensory feature is appreciated, now the MEI instead of the linear receptive field. It is appropriate to use MEI analyses to create hypotheses for further experimental testing, and the paper does this (and states as much) but it further takes the point of view that the MEI is generally informative as the single best summary of the neural code. The representation similarity analysis (Fig. 5) acts on the unfounded assumption that MEIs are generally representative and conveys this point of view, but it is not clear whether anything useful can be drawn from this analysis, and therefore this analysis does not support the conclusions about changes in the representational space. Overall this figure detracts from the paper and can safely be removed. In addition, in going from MEI analysis to testing ethological function, it should be made much more clear that MEIs may not generally be representative of the neural code, especially when nonlinearities are present that require the use of more complex models such as CNNs, and thus testing with other stimuli is required.

      The reviewer correctly characterizes MEIs as representing the peak in a nonlinear loss landscape that, in this case, describes the neurons’ tuning. As such, the MEI approach is indeed capable of characterizing nonlinear neuronal feature selectivities that are captured by a nonlinear model, such as the CNN we used here. We therefore disagree with the suggestion that MEIs should not be used “when nonlinearities are present that require the use of more complex models such as CNNs”. It is unclear what other “analysis of neural networks” the reviewer refers to. MEIs are one approach to analyzing the predictive neural network.

      We also want to clarify that, while the reviewer is correct in stating that the MEI approach as used here only identifies a single peak, this does not mean that it cannot capture neuronal selectivities for a combination of features, as long as this combination of features can be described as a point in high-dimensional stimulus space. In fact, this is demonstrated in our manuscript for the case of G28/tSbC cell’s selectivity for large or full-field, sustained changes in chromatic contrast (a combination of spatial, temporal, and chromatic features). While approaches similar to the one used here generate several diverse exciting inputs (Ding et al. bioRxiv 2023) and could therefore also fully capture On-Off selectivities, we pointed out the limitation of MEIs when describing On-Off cells in the manuscript (both original and revised).

      Regarding the reviewer’s concern that “[...] the message that the MEI is the most important representative statistic [...] takes us back to the days when only a single sensory feature is appreciated”. It was certainly not our intention to proclaim MEIs as the ultimate representation of a cell’s response features and we have clarified this in the revised manuscript. However, we also think that (i) in applying a nonlinear method to extract chromatic, temporal, and spatial response properties from natural movie responses, we go beyond many characterizations that use linear methods to extract spatial or temporal only, achromatic response properties from static, white-noise stimuli. This said, we agree that (ii) expanding around the peak is desirable, and we do that in an additional analysis (new Fig. 6); but that reducing complexity to a manageable degree (at least, at first) is useful and even necessary when discovering novel response properties.

      Concerning the representational similarity analysis (RSA): the point we were trying to make with this analysis is that the transformation implemented by G28 “warps” stimulus space and increases the discriminability of stimuli with similar characteristics like the cell’s MEI. We now made this point in a more accessible fashion through the above-mentioned analysis, where we extended the estimate around the peak. We therefore agree to remove the RSA from the paper.

      In the revised manuscript, we (a) discuss the advantages and limitations of the MEI approach in more detail (in Results and Discussion; see also our reply #1) and (b) replaced the RSA analysis.

      3) Usefulness of MEI approach over alternatives. It is claimed that analyzing the MEI is a useful approach to discovering novel neural coding properties, but to show the usefulness of a new tool, it is important to compare results to the traditional technique. The more standard approach would be to analyze the linear receptive field, which would usually come from the STA of white noise measurement, but here this could come from the linear (or linear-nonlinear) model fit to the natural scene response, or by computing an average linear filter from the natural scene model. It is important to assess whether the same conclusion about color opponency can come from this standard approach using the linear feature (average effective input), and whether the MEIs are qualitatively different from the linear feature. The linear feature should thus be compared to MEIs for Fig. 3 and 4, and the linear feature should be compared with the effects of natural stimuli in terms of chromatic contrast (Fig. 6b). With respect to the representation analysis (Fig. 5), although I don't believe this is meaningful for MEIs, if this analysis remains it should also be compared to a representation analysis using the linear feature. In fact, a representation analysis would be more meaningful when performed using the average linear feature as it summarizes a wider range of stimuli, although the most meaningful analysis would be directly on a broader range of responses, which is what is usually done.

      We agree that the comparison with a linear model is an important validation. Therefore, we performed an additional analysis (see also reply #1, as well as Fig. 6 and corresponding section in the manuscript) which demonstrates that an LN model does not recover the chromatic feature selectivity. This finding supports our claims about the usefulness of the MEI approach over linear approaches.

      Regarding the comment on the representation analysis, as mentioned above, we consider it replaced by the analysis comparing results from an LN model and a nonlinear CNN.

      4) Definition of ethological problem. The ethological problem posed here is the detection of the horizon. The stimuli used do not appear to relate to this problem as they do not include the horizon and only include transitions across the horizon. It is not clear whether these stimuli would ever occur with reasonable frequency, as they would only occur with large vertical saccades, which are less common in mice. More common would be smooth transitions across the horizon, or smaller movements with the horizon present in the image. In this case, cells which have a spatial chromatic opponency (which the authors claim are distinct from the ones studied here) would likely be more important for use in chromatic edge detection or discrimination. Therefore the ethological relevance of any of these analyses remains in question.

      It is further not clear if detection is even the correct problem to consider. The horizon is always present, but the problem is to determine its location, a conclusion that will likely come from a population of cells. This is a distinct problem from detecting a small object, such as a small object against the background of the sky, which may be a more relevant problem to consider.

      Thank you for giving us the opportunity to clear these things up. First, we would like to clarify that we propose that G28/tSbC cells contribute to detecting context changes, such as transitions across the horizon from ground to sky, not to detecting the horizon itself. We acknowledge that we were not clear enough about this in the manuscript and have corrected this. To back up our hypothesis that G28 RGCs contribute to detecting context changes, we performed an additional simulation analysis, which is described in our reply #3 (see above).

      5) Difference in cell type from those previously described. It is claimed that the chromatic opponent cells are different from those previously described based on the MEI analysis, but we cannot conclude this because previous work did not perform an MEI analysis. An analysis should be used that is comparable to previous work; the linear spatiotemporal receptive field should be sufficient. However, there is a concern that because linear features can change with stimulus statistics (Hosoya et al., 2005), a linear feature fit to natural scenes may be different than those from previous studies even for the same cell type. The best approach would likely be presenting a white noise stimulus to the natural scenes model to compute a linear feature, which still carries the assumption that this linear feature from the model fit to a natural stimulus would be comparable to previous studies. If the previous cells have spatial chromatic opponency and the current cells only have chromatic opponency in the center, there should be both types of cells in the current data set. One technical aspect relating to this is that MEIs were space-time separable. Because the center and surround have a different time course, enforcing this separability may suppress sensitivity in the surround. Therefore, it would likely be better if this separability were not enforced in determining whether the current cells are different than previously described cells. As to whether these cells are actually different than those previously described, the authors should consider the following uncited work: (Ekesten & Gouras, 2005), which identified chromatic opponent cells in mice in approximate numbers to those here (~ 2%). In addition, (Yin et al., 2009) in guinea pigs and (Michael, 1968) in ground squirrels found color-opponent ganglion cells without effects of a spatial surround as described in the current study.

      First of all, we did not intend to claim to have discovered a completely new type of color-opponent tuning in general; what we were trying to say is that tSbC cells display spatially co-extensive color opponency, a feature selectivity previously not described in this mouse RGC type, and which may be used to signal context changes as elicited by ground-sky transitions.

      Concerning the reviewer’s first argument about a lack of comparability of our results to results previously obtained with a different approach: We think that this is now addressed by the new analysis (new Fig. 6), where we show why linear methods are limited in their capability to recover the type of color opponency that we discovered with the MEI approach.

      Regarding the argument about center-surround opponency, we agree that “if the previous cells have spatial chromatic opponency and the current cells only have chromatic opponency in the center, there should be both types of cells in the current data set”. We did not focus on analyzing center-surround opponency in the present study, but from the MEIs, it is visible that many cells have a stronger antagonistic surround in the green channel compared to the UV channel (see Fig. 4a, example RGCs of G21, G23, G24; Figure 3-supplement 1 example RGCs of G21, G23, G24, G31, G32). Importantly, the MEIs shown in Fig. 4a were also shown in the verification experiment, and had G28 RGCs preferred this kind of stimulus, they would have responded preferentially to these MEIs, which was not the case (Fig. 4f).

      It should also be noted here that, while the model’s filters were space-time separable, we did not impose a restriction on the MEIs to be space-time separable during optimization. However, we analyzed only the rank 1 components of the MEIs (see Methods section Validating MEIs experimentally), since our analysis focused on aspects of retinal processing not contingent on spatiotemporal interactions in the stimulus.

      In summary, we are convinced that our finding of center-opponency in G28 is not an artifact of the methodology.

      We discuss this in the manuscript and add the references mentioned by the reviewer to the respective part of the Discussion.

      Reviewer #3 (Public Review):

      This study aims to discover ethologically relevant feature selectivity of mouse retinal ganglion cells. The authors took an innovative approach that uses large-scale calcium imaging data from retinal ganglion cells stimulated with both artificial and natural visual stimuli to train a convolutional neural network (CNN) model. The resulting CNN model is able to predict stimuli that maximally excite individual ganglion cell types.

      The authors discovered that modeling suggests that the "transient suppressed-by-contrast" ganglion cells are selectively responsive to Green-Off, UV-On contrasts, a feature that signals the transition from the ground to the sky when the animal explores the visual environment. They tested this hypothesis by measuring the responses of these suppressed-by-contrast cells to natural movies, and showed that these cells are preferentially activated by frames containing ground-to-sky transitions and exhibit the highest selectivity of this feature among all ganglion cell types. They further verified this novel feature selectivity by single-cell patch clamp recording.

      This work is of high impact because it establishes a new paradigm for studying feature selectivity in visual neurons. The data and analysis are of high quality and rigor, and the results are convincing. Overall, this is a timely study that leverages rapidly developing AI tools to tackle the complexity of both natural stimuli and neuronal responses and provides new insights into sensory processing.

      We thank the reviewer for appreciating our study.

    1. Author Response

      Reviewer #3 (Public Review):

      This manuscript uses ASO to inhibit the self-cleaving ribozyme within CPEB intron 3 and test its effect on CPEB3 expression and memory consolidation. The authors conclude that the intronic ribozyme negatively affects CPEB3 mRNA splicing and expression, and suggests its implications for experience-induced gene expression underlying learning and memory.

      The strength of the manuscript is in its exploration of a potentially novel mechanism of regulating CPEB3 expression in learning and memory, a combination of both biochemical and behavioral approaches to gain a wide perspective of this regulatory mechanism, and the application of ASO in this context. The introduction is sufficiently detailed. Statistics are thorough and appropriate. If the results could be more robust, the mechanism would provide a novel target and venue to modify learning and memory paradigm.

      The weakness of the manuscript is that the magnitude of the activity-dependent regulation of ribozyme, the effects of ASOs on CPEB3 expression (mRNA and protein) and downstream target gene expression, in vitro and in vivo, are generally weak, raising concerns about the robustness of the result. This may have caused some of the inconsistencies between the data presentation (see below). Also unclear is whether the ribozyme activity is physiologically regulated by experience without ASO interference.

      While the statistical tests support the corresponding figure panels and their conclusions, the manuscript can be significantly strengthened by additional evidence, clarification of some methodologies, and reconciling some inconsistent results.

      The premise of a comparable timescale between transcription and ribozyme activity as the foundation of the whole thesis was based on in vitro measurement of self-scission half-life and a broadly generalized transcription rate (which actually varies significantly between genes). This premise is weak and needs direct experimental support.

      The physiological relevance of the proposed mechanism has yet to be demonstrated without ASO interference.

      Fig2b: how were total and uncleaved Ribozymes measured by qRT-PCR? Where are the primers' locations? If the two products were amplified using different primers, their subtraction to derive % cleavage would not be appropriate.

      We thank the reviewer for the thoughtful review. We measured the levels of the total ribozyme by measuring a 220-bp amplicon that starts 18 nts downstream from the ribozyme cleavage site. The uncleaved ribozyme levels were measured using oligos that amplify a region of the intron that starts 45 nts upstream and ends 238 nts downstream of the ribozyme cleavage site. We added this information to the Table of primers in the manuscript. For all PCR oligos we established independent standard curves and calculated RNA levels independently of other amplicons, as noted in the Methods section and now specified in the Results section as well (Page 15). The measurements were thus appropriate for the calculation of the cleaved ribozyme fractions in the various experiments. The fraction ribozyme cleaved was calculated from the uncleaved fraction as the difference between uncleaved fraction and unity (1 – fraction uncleaved), now specified on page 16 of the manuscript. Fraction uncleaved was calculated as [uncleaved ribozyme]/[total ribozyme], as was done previously (see Salehi-Ashtiani et al. Science 313:1788-1792 or Webb et al. Science 326:953).
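
      The calculation described above can be sketched in a few lines of Python. This is only a minimal illustration of the stated arithmetic (fraction uncleaved = [uncleaved ribozyme]/[total ribozyme]; fraction cleaved = 1 – fraction uncleaved). The function name and the input values are hypothetical and are not taken from the manuscript; the inputs are assumed to be RNA levels already quantified from independent standard curves.

```python
def fraction_cleaved(uncleaved_level: float, total_level: float) -> float:
    """Return the cleaved fraction as 1 - (uncleaved / total).

    Both arguments are assumed to be RNA levels in the same units,
    each derived from its own standard curve (hypothetical inputs).
    """
    if total_level <= 0:
        raise ValueError("total ribozyme level must be positive")
    fraction_uncleaved = uncleaved_level / total_level
    return 1.0 - fraction_uncleaved


# Hypothetical example values, chosen only to illustrate the arithmetic.
example = fraction_cleaved(uncleaved_level=2.0, total_level=8.0)
print(example)  # 0.75
```

      Because the two amplicons are quantified independently, measurement noise could in principle give an uncleaved level slightly above the total level, in which case this sketch would return a small negative value rather than clamp it.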

      Line 400-403: shouldn't ribozyme-blocking ASO prevent ribozyme self-cleavage, and as a result should further increase ribozyme levels? This would contradict the result in fig3a.

      We showed that the ribozyme is inhibited in vitro (Fig. 1F and 1G) and all our data are consistent with ASO inhibition of the ribozyme in cellulo and in vivo. However, we do not have direct evidence for this ribozyme inhibition in vivo, because such an experiment would require a single-molecule FRET-type sensitivity in cells and this assay has not been developed for ribozyme cleavage in cellulo or in vivo. We measured the ribozyme levels by RT-qPCR and observed lower ribozyme levels in the presence of ASO in cultured neurons (Fig. 3A) as well as in vivo (Fig. 5B), which is nominally in contrast to the observations in vitro. However, in these situations we do not measure the co-transcriptional fate of the intron or the ribozyme; rather, we measure the levels of the intron after splicing (evidenced by the increased levels of spliced exons 2–3) when the intron is likely already being degraded. We also do not know what effect the ribozyme ASO has on the intron stability once splicing occurs. Understandably, this is a weakness of the study, and we are fully open about this result; however, given the abundance of evidence that the ribozyme ASO leads to an increase in CPEB3 mRNA under all conditions tested, we feel that there is strong, if indirect, evidence that our model for the ribozyme function is correct. Future studies will examine this issue more closely, but a definitive experimental investigation of the mechanism and timing of ribozyme inhibition and intron degradation is out of the scope of this study.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      Weakness: Although the cross-links stimulate ATP hydrolysis, further controls are needed to convince me that the TM1 conformations observed in the structures are physiologically relevant, since they have been trapped by "large" substrates covalently-tethered by crosslinks.

      Our response: Reviewer 1 raised concerns about the relatively large size of our covalently attached AAC substrate that would potentially distort TM1 in Pgp. We would like to clarify that AAC has a molecular weight of 462 Da, which, in comparison to many known Pgp substrates ranging from 250 to over 1,000 Da, is not a large compound. For instance, the few other Pgp substrates mentioned in our manuscript all have a comparable or larger size: verapamil, 455 Da; doxorubicin, 544 Da; FK506, 804 Da; valinomycin, 1,111 Da; cyclosporin A, 1,203 Da.

      Furthermore, AAC was strategically attached to a site distant from TM1 in the inward-facing Pgp conformation. After it was exported to the outward-facing state, several TM helices accommodate the compound. The observation that only TM1 exhibited significant conformational changes suggests its potential role in the transport mechanism. This hypothesis is supported by our findings, where a conservative substitution (G72A) in TM1 resulted in a dramatic loss of transport function for various drug substrates and impaired verapamil-stimulated ATPase activity.

      Reviewer 1 (Recommendations for the Authors):

      I understand the need for an unconventional approach to understanding the translocation pathway. What would help to support this model is to cross-link a much smaller substrate, as the one used is quite large and could potentially distort TM1 in the outward-state when cross-linked.

      Our response: We thank the reviewer for this recommendation, and we have outlined plans for future experiments involving other substrates, including smaller ones, to further investigate our proposed model. However, it is important to acknowledge that conducting these studies will require a significant amount of effort and resources, which we believe extend beyond the scope of our current manuscript.

      In unbiased MD simulations starting from the IF state are there any simulations where the substrate follows the same path as proposed here?

      Our response: All our MD simulations were performed in the outward-facing state to focus on potential substrate release pathways. Starting MD simulations from the inward-facing state would introduce complexities in capturing the necessary domain motions and nucleotide binding and hydrolysis required for substrate translocations. Therefore, we opted not to perform MD studies starting from the inward-facing state.

      Reviewer 2 (Public Review):

      Weakness: There is much to like about the experimental work here but I am less sanguine on the interpretation. The main idea is to covalently link via disulfide bonds a model tripeptide substrate under different conditions that mimic transport and then image the resulting conformations. The choice of the Pgp cysteine mutants here is critical but also poses questions regarding the interpretation. What seems to be missing, or not reported, is a series of control experiments for further cysteine mutations.

      Our response: Reviewer 2 raised concerns about the interpretation of our results and suggested the need for additional mutant designs to validate our proposed TM1 mechanism. Firstly, we believe that the observed TM1 conformational changes are valid in our cryoEM structures, despite the use of different conditions and several mutants to capture Pgp in the outward-facing state.

      Regarding the G72A mutant, we consider it conclusive that this single point mutation in the TM1 has a profound effect. Importantly, the G72A mutant was readily expressed and purifiable as a stable protein. We were able to resolve a high-resolution structure of the G72A mutant (without the substrate), confirming that the protein is not generally destabilized but properly folded.

      Above all, we appreciate the Reviewer’s suggestion to explore additional mutations and intend to do so in future studies.

      Reviewer 2 (Recommendations for the Authors):

      I am sold on the results regarding TM1 conformational changes as they are evident in the cryoEM structures. However, the set of states compared between mutants are not biochemically equivalent: for 335 and 978 they used an ATP-impaired Pgp whereas for 971 they used what appears to be WT, and the conformation was imaged presumably subsequent to ATP hydrolysis and Vanadate trapping. This is significant if the authors were unable to trap the OF in the impaired mutant background and should be highlighted. I have to believe that they tried that condition but I could be wrong.

      Our response: We acknowledge the point made by the Reviewer about the biochemical equivalence of mutant states and the potential significance of using an ATP-impaired mutant for trapping the outward-facing conformation of 971. We have not yet attempted to use the ATPase-deficient 971C mutant for crosslinking and intend to address this question in future studies.

      In our current approach, we used the ATPase-active 971C for two specific reasons:

      1) Our biochemistry data, as shown in Fig 1C, indicates that 971C only crosslinks in the presence of ATP hydrolysis conditions. Vanadate trapping was employed to stabilize the outward-facing conformation.

      2) In our experience, the conformations of ATP-bound (mutant) and vanadate-trapped states of an ABC transporter are structurally equivalent at the resolution of our study (see ref. 21: Hoffmann et al. Nature 2019).

      The authors propose a new model for substrate translocation. It is based on three mutants and a number of structures. If the authors were not challenging the current dogma I would not have written the next comment. Considering the impact of the findings, I would have designed a couple more cysteine mutants based on their model. For instance, this pathway has a number of stabilizing interactions, can't they make a mutant that preserves conformational switching but eliminates substrate translocation? I like the G97A mutant result but I am worried that the effect could just be a general destabilization or misfolding as part of the cryoEM particles seem to suggest. The authors advance one interpretation of the disorder observed in this mutant but it could easily be my interpretation.

      Our response: We thank the reviewer for the suggestion to design additional mutants to further validate our proposed model for substrate translocation. We agree that this would be highly valuable, considering the potential impact of our findings. However, given the time-intensive nature of our approach, we believe that presenting these additional designs in a future study is a reasonable course of action.

      Regarding the G72A mutation, we believe that our current data fully supports our model and the role of TM1 in regulating the Pgp activity. Importantly, we would like to emphasize that the G72A mutant was readily expressed and purifiable as a stable protein. Additionally, our cryoEM structural determination of the G72A mutant at high resolution confirmed that the protein is not generally destabilized but properly folded.

      There are a couple of troubling methodological questions that I want the authors to address or clarify:

      1. In the methods they report that the final sample for cryoEM was prepared on a SEC devoid of detergent. It is obvious that the sample was folded but I was wondering why the detergent was removed? Was that critical for observing these structures with multiple ligands? Did they observe any lipids in their cryoEM?

      Our response: We omit detergent from the buffer in the final SEC purification. This step removes free detergent from the background, which helps during cryoEM imaging. Of course, this cannot be done with every detergent, but it is possible with LMNG because of its very low CMC. By now, we have verified this method for several other transporters with the same success. While this procedure helps us to obtain better images, it is not necessary for obtaining specific conformations or ligand-bound states, nor does it affect these states or conformations.

      In our cryoEM structures, we did observe multiple cholesterol hemisuccinate (CHS) molecules on the outer transmembrane surface of Pgp.

      2. Can the authors comment on why labeling was carried out in the presence of ATP? Does it matter if the substrate was added prior to ATP and incubated for a few minutes?

      Our response: For every dataset, we first added the substrate to be cross-linked and afterwards added the ATP. In the cases of 335C and 978C, labeling was successful before ATP was added, as evidenced by the inward-facing structures with cross-linked substrate. However, for 971C, cross-linking only occurred after the addition of ATP. We interpret this data to suggest that the 971 site is inaccessible to the substrate in the inward-facing state, and cross-linking can only occur after the transporter transitions to outward-facing state. This is in line with our inward-facing structure which does not show a cross-linked substrate, and our biochemical data shown in Fig 1C, where 971C only crosslinked in the presence of ATP.

      3. I am not an expert on MD simulations and I understand that carrying out simulations at higher temperatures used to be a trick to accelerate the process. Is this still necessary? Why didn't the author use approaches such as WESTPA?

      Our response: Most so-called enhanced sampling methods, including WESTPA, explicitly define a reaction coordinate for the process of interest, usually based on intuition or prior studies. If this coordinate is chosen poorly, enhanced sampling usually fails, either because the sampling becomes inefficient or because the sampling biases the transition pathway (or both). Lacking reliable intuition or prior knowledge on which motions would result in substrate release, we chose temperature to speed up the process. High temperature largely avoids the introduction of any bias through the definition of a progress coordinate. By contrast, the weighted ensemble method underlying WESTPA is a great method to simulate unbiased dynamics of a process with a known progress coordinate, but unfortunately it requires choosing a progress coordinate prior to the simulation and will then mostly sample the process along this progress coordinate, because this is the only direction in which sampling is improved. High-temperature MD, on the other hand, accelerates all processes in the system under study. Indeed, we have now confirmed that the pathway found at high temperature is also feasible at near-ambient conditions.

      In new simulations, we have now observed a similar release pathway at T=330 K. The only difference is that the substrate has not fully dissociated from the protein after 2.5 µs, with weak interactions persisting at the top part of TM1 from the extracellular side. Importantly, this configuration is also observed in higher-temperature simulations, but with a much shorter lifetime.

      In response, we now included these new findings and a new Extended Data Fig. 15 in the revised manuscript.

      4. One way to show that the two substrates binding mode is biochemically relevant is to measure Vmax at different substrate concentrations. One would expect a cooperative transition if that interaction is mechanistically important.

      Our response: We have measured Vmax as a function of QZ-Ala concentration in a previous report (ref. 24), supporting positive cooperativity for binding to two sites.

      Reviewer 3 (Public Review):

      We thank Reviewer 3 for recommending the acceptance of our manuscript as is.

      Reviewer 3 (Recommendations for the Authors):

      Page 4, last line: Pgp302 should be Pgp1302. In addition, I can only encourage the authors to add an additional table to the manuscript. Here, the mutation, the obtained structure(s), IF or OF, the resolution, and the main message should be summarized.

      Our response: Following the reviewer’s suggestion, we have added Extended Data Table 2 summarizing the Pgp mutants and respective structural data in the revised manuscript.

      We verified that Pgp302 is the correct term on Page 4, last line.

      Pg. 5, section 'Covalent ligand design for Pgp labeling', it is mentioned that even in the presence of Mg2+ATP, Pgp302 could not react with AAC-DNPT. Maybe it would be worthwhile to add the data either in Supplementary Information or state 'data not shown'.

      Our response: We stated ‘data not shown’ in the text.

      Pg. 47, last line : A space is missing between M68, and M74.

      Our response: Space was added.

      Pg. 7, line 2: The authors mention that a single dataset of ATP-bound Pgp335 revealed three different OF conformations: ligand-free, single-ligand-bound, and double-ligand-bound. However, the percentage fraction of each dataset sums up to be more than 100%. Would request the authors to recalculate the fraction size of each conformation.

      Our response: We have corrected the error in our calculation, based on the particle distribution in our dataset (OF335-nolig: 1,437,110 particles, 40.4%; OF335-1lig: 1,184,253 particles, 33.3%; and OF335-2lig: 939,924 particles, 26.4%).
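
      As a quick check of the corrected fractions, the percentages can be recomputed from the particle counts quoted above. This is a minimal sketch: the counts and class labels are taken from the response, while the function and variable names are illustrative assumptions.

```python
# Particle counts per class, as quoted in the response above.
particle_counts = {
    "OF335-nolig": 1_437_110,
    "OF335-1lig": 1_184_253,
    "OF335-2lig": 939_924,
}


def class_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Return each class's share of the total particle count, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


percentages = class_percentages(particle_counts)
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```

      The unrounded shares sum to 100%. The values rounded to one decimal place (40.4%, 33.3%, 26.4%) sum to 100.1%, which is a rounding effect rather than a counting error.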

      Pg 53, Figure legend of Extended Data Fig. 11: Please include the color coding for the helix TM1 and also the residues colored plum.

      Our response: We added the color coding for TM1 and other residues in the figure legend.

      Pg. 8, line 3: While referring to the structure of OF971-1lig, the authors nicely point towards the conserved residues M74 and F78 which coordinate the ligand. However, in Fig. 3b, residues M74 and F78 should also be indicated.

      Our response: We updated Fig. 3b by adding arrows pointing towards the residues M74 and F78.

      Pg. 54, Extended data Fig. 12: The authors should adopt a single writing style. In some places, Pgp is referred to as P-gp while in others as Pgp.

      Our response: We updated the protein labels in Extended Data Fig. 12.

      Pg. 54, Extended data Fig. 12: The authors should clearly mention which OF335 structure (1st panel) was used for visualizing the interactions.

      Our response: To clarify, we added the following sentences in the figure legend: “Pgp335 OF in the top panel refers to OF335-1lig. In the bottom panel describing OF335-2lig, the left and right diagrams refer to the binding positions of non-covalent and covalent ligand, respectively”.

      Pg. 18, section 'synthesis of dipeptide 8': In the text it is mentioned that for the synthesis of thiazole acid 6, compound 3 was dissolved in a mixture of THF/MeOH/H2O (3:1:1), while in the corresponding figure (Extended Data Fig. 1), the ratio is stated as 5:1:2.

      Our response: 3:1:1 ratio is correct. We made the correction in Extended Data Fig. 1.

      Pg. 19, section 'synthesis of linear tripeptide 10': Same as above for compounds 10 and 4, respectively.

      Our response: We corrected the conditions in the Extended Data Fig. 1 accordingly.

      Pg. 20, section 'Synthesis of cyclic peptide 11': There seems to be a discrepancy in the synthesis protocol between the text and the extended figure 1, especially regarding the use of THF/MeOH/H2O, followed by NaOH and TFA or only NaOH and TFA.

      Our response: we further clarified the conditions of using NaOH in THF/MeOH/H2O (3:1:1) and TFA in DCM in the text for synthesis and Extend Data Fig. 1.

      Pg. 40, Extended Data Fig. 1: In the bottom last panel showing the synthesis of peptide 11, the authors have missed showing peptide 10 as the starting material for the reaction.

      Our response: Label for the peptide 10 was added following the suggestion.

      Pg. 26, third last line: 'o' is missing from the last word cry'o'

      Our response: We corrected the typo.

      Pg. 63 and 64, Extended Data Table 1: The Cryo-EM data collection, refinement, and validation statistics for OF971-1lig, IF971-1lig, OF978-1lig, and IF978-2lig are mentioned twice in the table.

      Our response: This was now corrected in the revision.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the Authors):

      The authors have addressed my recommendations in the previous review round in a satisfactory way. I only have one additional comment to the authors:

      In the manuscript abstract lines 31-32, the authors state that: "Using NIH data for the period 2006-2022, we report that ~230 K99 awards were made every year, representing ~$25 million annually." The "~$25 million" understates the actual funds spent, because this sum covers only the first year of newly made K99 awards, while in any given year the NIH is also paying years 2, 3, 4, etc. of awards made in previous years (with a ~90% conversion rate to R00). The NIH is actually spending ~$230-$250 million a year on the K99 award mechanism, so the authors need to amend the stated amount in the manuscript.

      Thank you for pointing this out. The reviewer is correct that we had calculated only the investment in new K99 awards. We have corrected this in the revised manuscript. We appreciate your careful reading of our manuscript, and the edits made based on your comments have improved the final version.

      Reviewer #2 (Recommendations for the Authors):

      Thank you for taking the time to revise this important work. I learned a lot reading this paper a second time, and appreciate the improvements you have made.

      My only major thought while re-reading this is that I wish you all had written two papers! I see two themes in this work: one looking at faculty hiring networks from the Wapman et al. dataset, and another at K99/R00 conversions by institution, gender, and researcher mobility and its impact on subsequent funding success. After reading, I felt like I had many follow-up questions about both analyses, but it would be impractical for me to suggest all these follow-up analyses without making your paper unreasonably long.

      Thank you for these comments. We agree that there are 2 general themes in this paper, and we feel that significantly expanding on both themes will be important in future research. Our hope is that this work continues to inspire others to critically examine funding practices and inequity in the same way that the work of Wapman, Pickett, etc. inspired the present work.

      For example, regarding the results that more R00 are activated at different institutions, and that moving institutions improves subsequent funding success, I wonder: Do proportionally more women or men move institutions? Do proportionally more K99 awardees at less-funded places move for their R00, or less? The Cox proportional hazard models illustrate the impact of various characteristics on subsequent funding success, but they do not illustrate disparate impacts of mobility on different groups (if I am understanding them correctly). (You sort of dive into these questions in the very interesting subsection, "K99/R00 awardee self-hires are more common at institutions with top NIH funding." I wanted to read more!)

      Thank you for these kind comments. These are fantastic follow-up questions. We do not feel that we can adequately address them within the present manuscript without potentially splitting it into 2 separate manuscripts. However, we may examine these in future analyses. We are particularly interested in examining additional aspects such as how the K99 MOSAIC funding mechanism may differ from the traditional K99 mechanism. Since the K99 MOSAIC mechanism is newer, there may not be enough K99 MOSAIC awards made for a thorough exploration.

      As another example, for your analysis on faculty hiring networks, the prevalence of self-hiring amongst institutions and regions was one finding. However, this finding seems somewhat at odds with the previous takeaway about how researcher mobility improves subsequent funding success. Are institutions doing themselves a disfavor by hiring their own, then? I suspect there is more to say here about this pattern... maybe there are important differences between PhD institution and postdoc institution and its impact on hiring/subsequent funding success? Or is this a story about upward mobility into the top 25 well-funded NIH institutions?

      Again, these are very insightful comments and follow-up questions. We hope to address these in potential future manuscripts. We also hope that others may become interested in finding answers to these questions by exploring our dataset as well as other publicly available datasets such as the Wapman et al. dataset.

      I can completely understand how combining the faculty hiring network analysis with the K99/R00 conversions would seem like a natural fit, but I personally feel - emphasis on this being a personal opinion - that there would have been benefits to giving more space to the details of both analyses separately. Perhaps this is a "hindsight is 20/20" issue. Or an issue with the current times in which one's brain can only hold so many main takeaways from a single body of work. (For example, I struggled to summarize your paper in my public review because I find so many takeaways important.)

      I suppose this is all to say that I find your work important enough to warrant additional follow-up work! :)

      Thank you for these very kind remarks. This work evolved over 8-10 months, as evidenced by the updates to the bioRxiv preprint. With unlimited time and foresight, it would probably have been best to separate the 2 themes into separate manuscripts and expand both. Given current constraints, we plan to make some changes/updates to the present manuscript and hopefully include more in-depth analyses on each theme in future works. Thank you again for the thoughtful reading and critique of both our original manuscript and the revised version.

      Minor comments/questions:

      "K99 to R00 conversions are increasing in time"

      • Assuming I am interpreting the figures correctly, in my opinion, the most important takeaway is that the number of R00 awards has increased, but only for awardees moving to another institution. This key result, best illustrated by panels A and C of Figure 1, is buried in the long paragraph in this section. The organization of content in this section could be improved and more focused. Consider renaming this subsection to be more declarative: "K99 to R00 conversions have increased, but only for awardees moving to another institution."

      This is a very concise interpretation of this data. We have edited the paragraph referenced by the reviewer, split it into 2 paragraphs, and changed the title to “K99 awardees increasingly move to other institutions for R00 awards from 2008 to 2022” and the final sentence to “Thus, the number of K99 to R00 conversions is consistent over time, but increasingly more R00 awardees have moved to other institutions since 2013”.

      • Similarly, I personally found the current title of the subsection, "K99 to R00 conversions are increasing with time" is mildly confusing. An R00 award indicates a successful conversion, so why not simply call this an R00 award instead of saying K99-to-R00 conversion? Also, when I look at Figure 1B and exclude the conversion rates for 2007 and 2008 (because this is a 3 year rolling average), I see that conversion rates (or R00 awards) have remained stagnant. This comment is very much in-the-weeds and is mainly to do with clarity of language.

      Thank you for these comments. We had “K99 to R00 conversion” to highlight the unique nature of this award mechanism, in which a person can only receive an R00 if they previously had a K99 award. Nevertheless, we have edited the text to “R00 awards” and “R00 awardees” to simplify things. We also want to note that we did not compute a 3-year rolling average. The function we used was: (X/(Y -1))x100 where X is the number of R00 awards made in a year and Y is the number of K99 awards made in a year. We did note an error in our calculation in the previous version of the manuscript. Previously, we included all R00 awards and K99 awards for each year from the NIH Reporter dataset; however, this is a flawed methodology. NIH Reporter includes only extramural K99 awards and extramural R00 awards, but intramural K99 awardees can receive extramural R00 awards and thus are only included in the R00 dataset. There were 141 R00 awardees in our dataset from NIH Reporter that did not have K99 data, so we assume these are intramural K99 awardees, since a K99 is required to be eligible for the R00 award. Since we do not know the awarding year for intramural K99 awardees or have data on intramural K99 awardees who fail to activate the R00 award (or stay internal at NIH), we have excluded these 141 R00 awardees. In the previous version, this miscalculation exaggerated the rolling conversion rate (we had correctly calculated the 78% total conversion rate). We re-analyzed our rolling conversion rate and found the average is 81.8% (excluding the first 2 years of the K99 program and the last 2 years).

      This is a long explanation, but essentially, we overestimated the number of R00 awards which inadvertently increased the rolling conversion rate. We have corrected this and simplified the first 2 paragraphs of the Results section.
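
      The per-year calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code. The notation "(Y -1)" is ambiguous in the text; the sketch assumes it denotes the number of K99 awards made in the preceding year, which is an interpretation and is not confirmed by the source. The function name and all counts are hypothetical.

```python
from typing import Dict


def yearly_conversion_rates(
    k99_by_year: Dict[int, int], r00_by_year: Dict[int, int]
) -> Dict[int, float]:
    """Percent of R00 awards in year t relative to K99 awards in year t-1.

    ASSUMPTION: "(Y -1)" in the text is read as "K99 awards made in the
    preceding year". Years without a usable prior-year K99 count are skipped.
    """
    rates: Dict[int, float] = {}
    for year, r00_count in sorted(r00_by_year.items()):
        k99_prev = k99_by_year.get(year - 1)
        if not k99_prev:  # missing or zero: no rate can be computed
            continue
        rates[year] = 100.0 * r00_count / k99_prev
    return rates


# Hypothetical counts for illustration only; these are NOT data from the study.
# R00 awardees without a matching K99 record would be removed before counting.
k99_counts = {2010: 200, 2011: 220, 2012: 210}
r00_counts = {2011: 160, 2012: 176, 2013: 168}

rates = yearly_conversion_rates(k99_counts, r00_counts)
average_rate = sum(rates.values()) / len(rates)
```

      Excluding early and late program years, as the response describes, would amount to dropping those keys from the result before averaging.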

      • I was also mildly confused looking at Figure 1c. The caption says that the percentages represent the K99 awardees that stayed at the same institution for the R00 activation, but the percentages are next to the solid circles which the legend labels as "different institution." Perhaps another or different way to show this is a stacked bar chart, where one bar represents the percentage of R00 awards activated at the same institution and another bar represents the percentage of R00 awards activated at a different institution. The bars always add to 100% but the change in proportions illustrates that proportionally fewer awards are being made to those remaining at the same institution.

      Great idea. We have included a stacked bar chart here. Since the stacked bar chart shows percentages, we felt it was important to also show the total numbers, so we kept the previous chart but removed the percentage numbers from it. We also changed the departmental analysis to stacked bar charts. This shows the stark difference between 2008-2012 and 2013 onward. These changes were made in the revised Fig. 1.

      • Minor question: I would love to see Table 3 and Table 4 as a time-series. Has the proportion of recipients at various institution types changed with time?

      This is a great suggestion and we felt it fit best in Figure 5, so we’ve added it there.

      • Table 3 is useful but only indirectly addresses my first "Recommendation to the Authors" from my previous review. I did some number crunching myself from the data provided. Assuming I did this correctly: If you're a K99 awardee at a private institute, you had a 76.3% chance of getting an R00 compared to 80.4% for a K99 awardee at a public institution. If you're a K99 awardee at a top-funded institution, you had a 76.8% chance of R00 compared to 78.6% for a lower-funded institution. I would have liked to see more figures and tables to illustrate conversion rates by institution type in this way. Interestingly, to me, these data suggest that there are not enormous conversion rate differences by institution type (though looking at these now, I am confused by the 89% statistic in line 174 and where it comes from, since it is much higher than what I've calculated).

      Thank you for this suggestion and these comments. Please see above where we describe how we incorrectly overestimated the 89% statistic. This has been corrected. As the reviewer suggested, we now show yearly percent of grants to specific institution types in the revised Figure 5. We agree with the reviewer that showing the conversion rate by institution type is interesting; however, it is fairly obvious from the new panels in Figure 5 that there is not much difference in conversion rate. Thus, to avoid crowding too many panels into the figure, we opted to keep the stacked bar plot.

      Reviewer #3 (Recommendations for the Authors):

      -One minor change to Figure 1C would be to switch the color coding for the lines so that they match with 1D whereby "same institution" would be white circles, or whatever the authors decide would be best for consistency since they are similar comparisons.

      Thank you for this suggestion. We have corrected this to be consistent.

      -Minor note for lines 459-461: I would suggest changing the wording to "intersectional inequalities" as it is not that a scientist's identities impact their careers as much as how those identities are positioned within an unequal opportunity structure and differentially treated that produce varying career trajectories and experiences of marginalization and cumulative (dis)advantages.

      Thank you and we agree with you. We have made this correction.

      -To carry forward a suggestion for the authors in my previous review, future research that more fully explores the research infrastructure of institutions for how top NIH funded institutions continue to be top funded institutions year after year could help clarify some of the career mobility and same/similar institution hiring found in the data. Rather than hand coding institutions for some of the infrastructure, the National Center for Education Statistics' Integrated Postsecondary Education Data System (IPEDS) has data on colleges and universities including whether they operate a hospital, have a medical degree, and many other interesting data about student and faculty demographics, institutional expenditures (including research budgets), and degrees awarded in different fields of study (undergrad and grad) that may be helpful to the authors as they continue their research stream in this area.

      Thank you very much. We will look into this data set as we continue our investigations in this area.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      The discussion seems to imply that the ball-and-chain peptide is or is related to the common gate. (Although it isn't stated explicitly, it is implied based on the presentation of the gating model in Figure 8 immediately after the discussion of common gating, and the simultaneous opening of both pores in Figure 8). What does the asymmetric structure say about the relationship between the N-term peptide and common gating in ClC-2? It seems like this structure suggests that the CTDs can independently rotate, and independently bind N-terminal peptide, which might not be expected to impact both pores. Some additional clarification and/or discussion of these ideas could be helpful here.

      We thank the reviewer for raising these very important points. We agree we should have been more explicit and have now expanded our discussion on this topic, highlighting the independent movement of the N-term peptide and CTDs and clarifying that it is currently unknown whether CLC-2 has a common gate (lines 431-484).

      Discussion of "Revised Framework for CLC-2 gating": I think this would be a little easier to follow if most of the legend from Figure 8 was in the main text at the end of that section. Also, additional labels in Figure 8 (of the glutamates, the N-terminal peptide, and what the CTD arrows represent).

      We have revised this section of the text and added labels to the (revised) Figure as suggested.

      Line 261: typo, misspelling of "hydrogen"

      Fixed. (Now line 279.)

      Figure 6 - supplement 2B: Looks like an error in numbering y-axis - should be 90/120/150, I think. Can you show the three data points for the WT initial current rectification? Can you clarify whether the 3 that you are analyzing are the ones where the AK42 "zero current" level is not more than the initial positive current?

      We apologize for this error, which arose from the Y-axis label overlapping the tick labels, so 90/120/150 showed as 90/20/50. We have fixed this error and have added a new panel (C) to show three data points for the WT initial current rectification. In the Figure legend to panel C, we clarify that the 3 experiments we analyzed are the ones where the AK-42 current level is not more than the initial current at 80 mV.

      Reviewer #2 (Recommendations For The Authors):

      1. It appears from a close inspection of Figure 2 that the TM dimer is not quite symmetric, but I couldn't tell for sure from the figures as presented. No comment is made in the methods about symmetry imposed, and the authors explicitly comment on asymmetry in the cytoplasmic domain. It would be useful to have an explicit discussion of the TM dimer symmetry.

      We have now explicitly stated that the TM dimer is symmetric, and we have clarified the wording in the Methods:

      Main text, line 81: "The TM region of CLC-2 displays a typical CLC family symmetric homodimeric structure, with each subunit containing an independent Cl– pathway (Figure 2A, B)."

      Methods (lines 557-558): "The following ab initio reconstruction and 3D refinement (for all structures presented in this paper) were performed with C1 symmetry (no symmetry imposed)."

      1. For the simulations in Figure 5 Supplement 2, the N terminus flexibility is shown, but this of course can't be compared to a control. However, given the structural results, one might expect the JK helix to show changes in flexibility/mobility in the apo vs inactivated structures. Is this observed?

      We agree that the structures strongly suggest the JK-helix is not as stable without the N-terminus bound. We did not perform comparative simulations on the JK helix in the apo vs inactivated structures. While we agree this could be of interest, we don’t think it is essential to our conclusions, and the simulations might need to be quite long to adequately capture dynamics of the JK helix. [In the simulation results shown in Figure 5 Supplement 2, our aim was to test the validity of the structure by determining whether the N-terminus remains bound to the channel in simulations. The plot shows that the N-terminus stays in the same binding pose with an average RMSD (to the initial structure) of less than 2 angstroms, which is generally considered to be relatively stable.]

      1. I find the section "revised framework for ClC-2 gating" to be wanting. The ideas are illustrated in the cartoon, but should also be laid out in the text. In what ways are you revising the framework, and in what aspects are you carrying through ideas already proposed?

      Thank you for raising this point, which was also raised by Reviewer 1. We have revised this section and the accompanying Figure (Figure 8 and Lines 431-484).

      1. The authors mention in passing the idea that the hairpin could contribute to inward rectification (lines 227/8), but also suggest a role for the gating glutamate in this process. They also mention the idea of a common gate, but don't flesh out its function very much. These possibilities are very interesting and should be substantially fleshed out in the "framework" section, even if they cannot be fully answered yet.

      We have expanded on these points in the “framework” section.

      1. Figure 6E. points representing individual experiments should be shown.

      We added points representing individual experiments for Delta N (normalized to WT) in the surface-expression experiments in Figure 6E. Individual data points for the electrophysiology experiments are in panel C; we did not replot these in panel E because some of the points would have been off scale.

      1. The density in Figure 2A is hard to see, is there a better way to display it? Also, the orientation of the rightmost panel in Figure 2C is difficult to interpret.

      We revised 2A to make the density easier to see. We revised Figure 2C so that the middle and rightmost panels have the same orientation.

      1. P6. Line 87. This sentence is a little confusing, and perhaps could be a little clearer: the density is consistent with a Cl- ion, but no experiments have been done to support this, no?

      We have clarified the wording as suggested (now line 89) and added references supporting Cl⁻ binding to the Sext site in CLCs (line 90).

      1. P6 lines 89-98. Two lines of evidence, the conformation of the gate and the pinch point, both point to the structure representing a closed state. The wording as presented is a little hard to follow.

      We have revised the wording in this paragraph (lines 92-111).

      1. It's hard to distinguish water protons and oxygens in the lower right panel (QQQ).

      We revised this panel (in Figure 3 – figure supplement 2) to better distinguish the water protons and oxygens.

      Reviewer #3 (Recommendations For The Authors):

      A few points to consider for improving the manuscript

      1. It is intriguing that in the AK-42 structure, there is no density for the hairpin loop even though the CTD is in a symmetrical conformation as the apo. The authors could perhaps comment on whether there is any difference in the rectification properties of currents (or run-up) upon unblocking of AK-42 which may suggest that the hairpin binding is prevented by AK-42.

      We have not yet performed the suggested experiment nor any experiments to examine state-dependence, though we agree such experiments would be informative. We have added a note on this point in the discussion, lines 334-337.

      1. Although the conformation-dependent placement of the hairpin loop is convincing based on the density, the sequence assigned to this region is not conclusive.

      To strengthen our conclusion concerning the hairpin assignment, we investigated fits of peptide segments from the disordered sections of the C-terminal cytoplasmic domain to the hairpin density. We found that these fits are not as good as that with the N-terminal peptide. This analysis is described in lines 179-181 and a new figure (Figure 5 – figure supplement 1). We appreciate the reviewer’s point that it is extremely difficult to conclusively assign residues that are not contiguous with the rest of the structure. Nevertheless, given the wide variety of evidence all pointing to the conclusion that the hairpin loop corresponds to residues 14-28, we think the assignment is on strong footing. We respectfully ask that you consider removing this criticism from the public review, as we think it will hinder the casual reader from recognizing the strength of the evidence: (1) of unresolved regions in CLC-2, residues 14-28 fit best; (2) residues 14-28 were previously identified as part of the ball blocking region (lines 158-161); (3) MD simulations support that the N-terminal residues stay stably bound (Figure 5 – figure supplement 4); (4) gain-of-function disease-causing mutations map onto either the N-terminal residues or interacting residues on the TM domain (Figure 5 – figure supplement 6). Thank you for considering this request.

      1. The authors should comment on the physiological relevance of the CBS domain rearrangements during gating.

      We have added this sentence (lines 131-133): “The physiological relevance of C-terminal domain rearrangements is suggested by disease-causing mutations that alter channel gating (Estevez et al., 2004; Brenes et al., 2023).”

      1. For the figures with cryo-EM maps, indicate the contour levels.

      Contour levels are now indicated in the Figure legends.

      1. It will be useful to show the electrostatic map of the N-terminal peptide and the docking site.

      This is now shown in Figure 5 – figure supplement 3 and Video 5.

      1. Include a comment on the recent CLC-2 /AK-42 structure and if there are any differences in the structural features.

      We added this text to lines 273-274: “The RMSD between our CLC2-TM-AK42 structure and that of Ma et al. is 0.655 Å, and the RMSD between the apo TM structures is 0.756 Å.”
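
      For readers unfamiliar with the metric, the RMSD quoted above can be sketched as below. This is a minimal illustration under stated assumptions: the two structures are already superposed, the atoms are paired one-to-one in the same order, and coordinates are in Å. The response does not state which atoms or which software were used, so the function name and coordinates here are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two paired coordinate sets.

    ASSUMPTION: both sets are already superposed and listed in matching
    atom order; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical coordinates for illustration only.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
example_rmsd = rmsd(structure_a, structure_b)
```

      In practice a superposition step precedes this calculation, and the reported value depends on which atoms are included.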

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The paper contains some useful analysis of existing data but there are concerns regarding the conclusion that there might be alternative mechanisms for determining the location of origins of DNA replication in human cells compared to the well known mechanism known from many eukaryotic systems, including yeast, Xenopus, C. elegans and Drosophila. The lack of overlap between binding sites for ORC1 and ORC2, which are known to form a complex in human cells, is a particular concern and points to the evidence for the accurate localization of their binding sites in the genome being incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the best genetically and biochemically understood model of eukaryotic DNA replication, the budding yeast, Saccharomyces cerevisiae, the genomic locations at which DNA replication initiates are determined by a specific sequence motif. These motifs, or ARS elements, are bound by the origin recognition complex (ORC). ORC is required for loading of the initially inactive MCM helicase during origin licensing in G1. In human cells, ORC does not have a specific sequence binding domain and origin specification is not specified by a defined motif. There have thus been great efforts over many years to try to understand the determinants of DNA replication initiation in human cells using a variety of approaches, which have gradually become more refined over time.

      In this manuscript Tian et al. combine data from multiple previous studies using a range of techniques for identifying sites of replication initiation to identify conserved features of replication origins and to examine the relationship between origins and sites of ORC binding in the human genome. The authors identify a) conserved features of replication origins e.g. association with GC-rich sequences, open chromatin, promoters and CTCF binding sites. These associations have already been described in multiple earlier studies. They also examine the relationship of their determined origins and ORC binding sites and conclude that there is no relationship between sites of ORC binding and DNA replication initiation. While the conclusions concerning genomic features of origins are not novel, if true, a clear lack of colocalization of ORC and origins would be a striking finding. However, the majority of the datasets used do not report replication origins, but rather broad zones in which replication origins fire. Rather than refining the localisation of origins, the approach of combining diverse methods that monitor different objects related to DNA replication leads to a base dataset that is highly flawed and cannot support the conclusions that are drawn, as explained in more detail below.

      Response: We are using the narrowly defined SNS-seq peaks as the gold standard origins and making sure to focus in on those that fall within the initiation zones defined by other methods. The objective is to make a list of the most reproducible origins. Unlike what the reviewer states, this actually refines the dataset to focus on the SNS origins that have also been reproduced by the other methods in multiple cell lines. We have changed the last box of Fig. 1A to make this clearer: Shared origins = reproducible SNS-seq origins that are contained in initiation zones defined by Repli-seq, OK-seq and Bubble-seq. This and the Fig. 2B (as it is) will make our strategy clearer.

      Methods to determine sites at which DNA replication is initiated can be divided into two groups based on the genomic resolution at which they operate. Techniques such as bubble-seq and ok-seq can localise zones of replication initiation in the range ~50kb. Such zones may contain many replication origins. Conversely, techniques such as SNS-seq and ini-seq can localise replication origins down to less than 1kb. Indeed, the application of these different approaches has led to a degree of controversy in the field about whether human replication does indeed initiate at discrete sites (origins), or whether it initiates randomly in large zones with no recurrent sites being used. However, more recent work has shown that elements of both models are correct, i.e. there are recurrent and efficient sites of replication initiation in the human genome, but these tend to be clustered and correspond to the demonstrated initiation zones (Guilbaud et al., 2022).

      These different scales and methodologies are important when considering the approach of Tian et al. The premise that combining all available data from five techniques will increase accuracy and confidence in identifying the most important origins is flawed for two principal reasons. First, as noted above, of the different techniques combined in this manuscript, only SNS-seq can actually identify origins rather than initiation zones. It is the former that matters when comparing sites of ORC binding with replication origin sites, if a conclusion is to be drawn that the two do not co-localise.

      Response: We agree. So the reviewer should agree that our method of finding SNS-seq peaks that fall within initiation zones actually refines the origins to find the most reproducible origins. We are not losing the spatial precision of the SNS-seq peaks.

      Second, the authors give equal weight to all datasets. Certainly, in the case of SNS-seq, this is not appropriate. The technique has evolved over the years and some earlier versions have significantly different technical designs that may impact the reliability and/or resolution of the results (e.g. in Foulk et al., 2015, lambda exonuclease was added to single-stranded DNA from a total genomic preparation rather than purified nascent strands), which may lead to significantly different digestion patterns (i.e. underdigestion). Curiously, the authors do not make the best use of the largest SNS-seq dataset (Akerman et al., 2020) by ignoring these authors' separation of core and stochastic origins. By blending all data together any separation of signal and noise is lost. Further, I am surprised that the authors have chosen not to use data and analysis from a recent study that provides subsets of the most highly used and efficient origins in the human genome, at high resolution (Guilbaud et al., 2022).

      Response: 1) We are using the data from Akerman et al., 2020: Dataset GSE128477 in Supplemental Table 1. We have now separately examined the core origins defined by the authors to check their overlap with ORC binding (Supplementary Fig. S8b).

      2) To take into account the refinement of the SNS-seq methods through the years, we actually included in our study only those SNS-seq studies after 2018, well after the lambda exonuclease method was introduced. Indeed, all 66 of the SNS-seq datasets we used were obtained after the lambda exonuclease digestion step. To reiterate, we recognize that there may be many false positives in the individual origin mapping datasets. Our focus is on the True positives, the SNS-seq peaks that have some support from multiple SNS-seq studies AND fall within the initiation zones defined by the independent means of origin mapping (described in Fig. 1A and 2B). These True positives are most likely to be real and reproducible origins and should be expected to be near ORC binding sites.

      We have changed the last box of Fig. 1A to make this clearer: Shared origins = reproducible SNS-seq origins that are contained in initiation zones defined by Repli-seq, OK-seq or Bubble-seq.
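      The definition of shared origins above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the interval representation, the function names `contained_in` and `shared_origins`, and the reading of "Repli-seq, OK-seq or Bubble-seq" as "contained in a zone from at least one method" are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def contained_in(origin: Interval, zones: List[Interval]) -> bool:
    """True if the origin lies entirely inside at least one zone."""
    chrom, start, end = origin
    return any(c == chrom and zs <= start and end <= ze for c, zs, ze in zones)


def shared_origins(
    reproducible_origins: List[Interval],
    zones_by_method: Dict[str, List[Interval]],
) -> List[Interval]:
    """Keep reproducible SNS-seq origins contained in an initiation zone
    from at least one independent method (assumed reading of 'or')."""
    return [
        origin
        for origin in reproducible_origins
        if any(contained_in(origin, zones) for zones in zones_by_method.values())
    ]


# Toy data, invented for illustration only.
origins = [("chr1", 100, 200), ("chr1", 5000, 5100), ("chr2", 100, 200)]
zones = {
    "OK-seq": [("chr1", 0, 1000)],
    "Repli-seq": [("chr2", 150, 900)],
}
result = shared_origins(origins, zones)
```

      In this toy input only the first origin is retained: the second falls outside every zone, and the third overlaps a zone without being contained in it.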

      Ini-seq by Torsten Krude and co-workers (Guilbaud, 2022) does NOT use lambda exonuclease digestion. So using Ini-seq-defined origins is at odds with the suggestion above that we focus only on SNS-seq datasets that use lambda exonuclease. However, Ini-seq identifies a much smaller subset of SNS-seq origins, so, as requested, we have also done the analysis with just that smaller set of origins, and it does show a better proximity to ORC binding sites, though even then the ORC-proximate origins account for only 30% of the Ini-seq2 origins (Supplementary Fig. S8d). Note that Ini-seq2 identifies DNA replication initiation sites seen in vitro on isolated nuclei.

      Update in response to authors' comments on the original review:

      While the authors have clarified their approach to some aspects of their analysis, I believe they and I are just going to have to disagree about the methodology and conclusions of this work. I do not find the authors' responses sufficiently compelling to change my mind about the significance of the study or the veracity of the conclusions. In my opinion, the method for identification of strong origins is not robust and is of insufficient resolution. In addition, the resolution and the overlap of the MCM ChIP-seq datasets are poor. While the conclusion of the paper would indeed be striking and surprising if true, I am not at all persuaded that it is supported by the presented data.

      Reviewer #2 (Public Review):

      Tian et al. performed a meta-analysis of 113 genome-wide origin profile datasets in humans to assess the reproducibility of experimental techniques and shared genomics features of origins. Techniques to map DNA replication sites have quickly evolved over the last decade, yet little is known about how these methods fare against each other (pros and cons), nor how consistent their maps are. The authors show that high-confidence origins recapitulate several known features of origins (e.g., correspondence with open chromatin, overlap with transcriptional promoters, CTCF binding sites). However, surprisingly, they find little overlap between ORC/MCM binding sites and origin locations.

      Overall, this meta-analysis provides the field with a good assessment of the current state of experimental techniques and their reproducibility, but I am worried about: (a) whether we've learned any new biology from this analysis; (b) how binding sites and origin locations can be so mismatched, in light of numerous studies that suggest otherwise; and (c) some methodological details described below.

      • I understand better the inclusion/exclusion logic for the samples. But I'm still not sure about the fragments. As the authors wrote, there is both noise and stochasticity; the former is not important but the latter is essential to include. How can these two be differentiated, and what may be the expected overlap as a function of different stochasticity rates?

      It is difficult to separate the effect of noise from the effect of stochastic firing of origins. We therefore took the simplest approach: focus only on the most reproducible origins (shared origins) and ignore the non-reproducible origins. At least the most reproducible origins can be used to test the hypotheses regarding origin firing.

      • Many of the major genomic features analyzed have already been found to be associated with origin sites. For example, the correspondence with TSS has been reported before:

      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6320713/

      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6547456/

      • Line 250: The most surprising finding is that there is little overlap between ORC/MCM binding sites and origin locations. The authors speculate that the overlap between ORC1 and ORC2 could be low because they come from different cell types. Equally concerning is the lack of overlap with MCM. If true, these are potentially major discoveries that butt heads with numerous other studies that have suggested otherwise.

      The key missing dataset is ORC1 and ORC2 ChIP-seq from the same cell type. This shouldn't be too expensive to perform, and I hope someone performs this test soon. Without this, I remain on the fence about how much existing datasets are "junk" vs how much the prevailing hypothesis about replication needs to be revisited. Nonetheless, the authors do perform a nice analysis showing that existing techniques should be carefully used and interpreted.

      We agree that a thorough set of ChIP-seq data (with multiple antibodies or with equivalent techniques that do not use antibodies) for all six subunits of ORC in mammalian cells will be very useful for the field. Note, though, that just by simple cell lysis, it is very easy to divide human ORC into at least three different parts: ORC1, ORC2-5, and ORC6. The subunits do not form as robust a complex as seen in the yeasts and in flies.

      Reviewer #3 (Public Review):

      Summary: The authors present a thought-provoking and comprehensive re-analysis of previously published human cell genomics data that seeks to understand the relationship between the sites where the Origin Recognition Complex (ORC) binds chromatin, where the replicative helicase (Mcm2-7) is loaded, and where DNA replication actually begins (origins). The view that these should coincide is influenced by studies in yeast where ORC binds site-specifically to dedicated nucleosome-free origins where Mcm2-7 can be loaded and remains stably positioned for subsequent replication initiation. However, this is most certainly not the case in metazoans, where it has already been reported that chromatin binding sites of ORC and Mcm2-7 do not necessarily overlap, nor do they always overlap with origins. This is likely due to Mcm2-7 possessing linear mobility on DNA (i.e., it can slide) such that other chromatin-contextualized processes can displace it from the site at which it was originally loaded. Additionally, Mcm2-7 is loaded in excess and thus only a fraction of Mcm2-7 would be predicted to coincide with replication start sites. This study reaches a conclusion very similar to that of these previous studies: it finds a high degree of discordance between ORC, Mcm2-7, and origin positions in human cells.

      Strengths: The strength of this work is its comprehensive and unbiased analysis of all relevant genomics datasets. To my knowledge, this is the first attempt to integrate these observations. It also is an important cautionary tale to not confuse replication factor binding sites with the genomic loci where replication actually begins, although this point is already widely appreciated in the field.

      Response: Thank you for recognizing the comprehensive and unbiased nature of our analysis. Our findings will prevent the unwise adoption of ORC or MCM binding sites as surrogate markers of origins and will stimulate the field to try and improve methods of identifying ORC or MCM binding until the binding sites are found to be proximal to the most reproducible origins. The last possibility is that there are ORC- or MCM-independent modes of defining origins, but we have no evidence of that.

      Weaknesses: The major weakness of this paper is the lack of novel biological insight and that the comprehensive approach taken failed to provide any additional mechanistic insight regarding how and why ORC, Mcm2-7, and origin sites are selected or why they may not coincide.

      Response: We agree that we cannot provide a novel biological insight from this kind of meta-analysis. The importance of this study is in highlighting that either there are significant problems with the data collected till now (preventing the co-localization of ORC or MCM binding sites with the most reproducible origins) or ORC and MCM binding sites are often far away from where the most reproducible origins fire, which should make us consider ways in which origins could be activated kilobases away from ORC and MCM binding sites.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      All suggestions and recommendations were described in a previous review.

      Reviewer #3 (Recommendations For The Authors):

      The most significant omission is a contextualization of the results in the discussion and an explanation of why these results matter for the biology of replication, disease, and/or our confidence in the genomic techniques reported on in this study. As written, the discussion simply restates the results without any interpretation towards novel insight. I suggest that the authors revise their discussion to fill this important gap.

      A second important, unresolved point is whether replication origins identified by the various methods differ due to technical reasons or because different cell types were analyzed. Given the correlation between TSS and origins (reported in this study but many others too), it is somewhat expected that origins will differ between cell types as each will have a distinct transcriptional program. This critique is partly addressed in Figure S1C. However, given the conclusion that the techniques are only rarely in agreement (only 0.27% origins reproducibly detected by the four techniques), a more in-depth analysis of cell type specific data is warranted. Specifically, I would suggest that cell type-specific data be reported wherever origins have been defined by at least two methods in the same cell type, specifically reporting the percent of shared origins amongst the datasets. This type of analysis may also inform on whether one or more techniques produces the highest (or lowest) quality list of true origins.

      We have done what has been suggested: used K562 cell type-specific data because here the origins have been defined by at least two methods in the same cell type and reported the percent of shared origins amongst the datasets (Supp. Fig. S4).

      Other MINOR comments include:

      • Line 215: the authors show that shared origins overlap with TF binding hotspots more often than union origins, which they claim suggests "that they are more likely to interact with transcription factors." As written, it sounds like the authors are proposing that ORC may have some direct physical interaction with transcription factors. Is this intended? If so, what support is there for this claim?

      The reviewer is correct. We have rephrased because we have no experimental support for this claim.

      • In the text, Figure 3G is discussed before Figure 3F. I suggest switching the order of these panels in Figure 3.

      Done.

      • It's not clear what Figure 5H to Figure 6 accomplishes. What specifically is added to the story by including these data? Is there something unique about the high confidence origins? If there is nothing noteworthy, I would suggest removing these data.

      We want to keep them to highlight the small number of origins that meet the hypothesis that ORC and MCM must bind at or near reproducible origins. These would be the origins that the field can focus in on for testing the hypothesis rigorously. They also show the danger of evaluating proximity between ORC or MCM binding sites with origins based on a few browser shots. If we only showed this figure, we could conclude that ORC and MCM binding sites are very close to reproducible origins.

      • Line 394: "Since ORC is an early factor for initiating DNA replication, we expected that shared human origins will be proximate to the reproducible ORC binding sites." This is only expected if one disbelieves the prior literature that shows that ORC and origins are not, in many cases, proximal. This statement should be revised, or the previous literature should be cited, and an explanation provided about why this prior work may have missed the mark.

      We do not know of any genome-wide study in mammalian cell lines in which ORC and MCM binding sites have been compared to highly reproducible origins, or that shows that these binding sites and highly reproducible origins are mostly not proximal to each other. Most studies cherry-pick a few origins and show by ChIP-PCR that ORC and/or MCM bind near those sites. Alternatively, studies sometimes show a selected browser shot, without a quantitative measure of the overlap genome-wide and without a permutation test to determine whether the observed overlap or proximity is higher than what would be expected at random with similar numbers of sites of similar lengths. In the revised manuscript we have discussed Dellino, 2013; Kirstein, 2021; Wang, 2017; Mas, 2023. None of them has addressed the question we address here: is the small subset of the most reproducible origins proximal to ORC or MCM binding sites?
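      The kind of permutation test mentioned above can be sketched as follows. This is a minimal illustration on a single linear "genome", not the procedure used in the manuscript: the function names, the uniform reshuffling of binding-site positions, and the toy coordinates are all assumptions made for the example.

```python
import random
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, on one hypothetical chromosome


def count_overlaps(a: List[Interval], b: List[Interval]) -> int:
    """Number of intervals in `a` that overlap at least one interval in `b`."""
    return sum(any(s < be and bs < e for bs, be in b) for s, e in a)


def shuffle_intervals(
    intervals: List[Interval], genome_length: int, rng: random.Random
) -> List[Interval]:
    """Place each interval at a uniformly random position, preserving its length."""
    shuffled = []
    for s, e in intervals:
        length = e - s
        new_start = rng.randrange(0, genome_length - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled


def permutation_p_value(
    origins: List[Interval],
    binding_sites: List[Interval],
    genome_length: int,
    n_perm: int = 200,
    seed: int = 0,
) -> Tuple[int, float]:
    """Observed overlap count and a one-sided empirical p-value."""
    rng = random.Random(seed)
    observed = count_overlaps(origins, binding_sites)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = shuffle_intervals(binding_sites, genome_length, rng)
        if count_overlaps(origins, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


# Toy data, invented for illustration: every origin sits next to a binding site.
toy_origins = [(i * 1000, i * 1000 + 100) for i in range(20)]
toy_sites = [(i * 1000 + 50, i * 1000 + 150) for i in range(20)]
observed, p_value = permutation_p_value(toy_origins, toy_sites, genome_length=1_000_000)
```

      A real analysis would shuffle within chromosomes and exclude unmappable regions; the sketch only shows why a count of overlaps needs a null distribution built from the same number of sites of the same lengths.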

      • Line 402-404: given the lack of agreement between ORC binding sites and origins the authors suggest as an explanation that "MCM2-7 loaded at the ORC binding sites move much further away to initiate origins far from the ORC binding sites, or that there are as yet unexplored mechanisms of origin specification in human cancer cells". The first part of this statement has been shown to be true (Mcm2-7 movement) and should be cited. But what do the authors mean by the second suggestion of "unexplored mechanisms"? Please expand.

      We have addressed this point in the revised manuscript.

      • The authors should better reference and discuss the previous literature that relates to their work, some of these include Gros et al., 2015 Mol Cell, Powell et al., 2015 EMBO J, Miotto et al., 2016 PNAS, but likely there are many others.

      We have addressed this point in the revised manuscript.

      Note for authors:

      Line 107: The introduction discusses the mechanism by which yeast ORC recognizes specific origins and discusses the Orc4 contribution, but it is known that Orc2 also binds DNA in a base-specific manner (see PMID 33056978). Thus Lee et al. did not "humanize ORC" as stated.

      Done

      Lines 117-119: Two of the cited papers are on endo-reduplication and not on initiation in a normal cell cycle, and this should be pointed out. Second, there is contradictory evidence that ORC is essential in human cells, and this should be cited (PMID 33522487).

      Done

    1. Author Response

      The following is the authors’ response to the original reviews.

      Based on the reviewer comments (see below) and subsequent discussion between the reviewers and the Reviewing Editor, I would like to invite the authors to make major revisions, including new experiments. However, if major new experiments are not feasible, as may be the case, then at a minimum, I would urge the authors to:

      1. Tone down the language regarding a causative role for changes in GH/IGF-I signaling in mediating the effects of Tmem263 on the skeleton, and also be very open in acknowledging the lack of mechanistic insight into how Tmem263 regulates GH signaling.

      Response: We toned down the language as suggested and also acknowledged the lack of mechanistic insights into how Tmem263 regulates GH signaling.

      1. Revise/redo or if not possible, then delete the problematic experiment in Fig. 5E.

      Response: We have included additional Western blot data in Figure 5 from control WT and KO male mice without exogenous GH injection. In the absence of GH injection, we could not detect Jak2 and Stat5 phosphorylation in the liver of male WT and KO mice.

      1. Address the comments about liver feminization.

      Response: We have performed additional analysis as suggested by reviewer # 3. We have now included additional data to address the issue of liver feminization (new Fig. 6G-I and Figure 6-figure supplement 1). We plan to expand on this very topic in future studies as this is an interesting transcriptional phenomenon.

      1. Revise the manuscript to address as many of the recommendations for the authors as possible, many of which can be addressed by textual edits.

      Response: We have addressed as many of the textual changes as suggested in the revised manuscript.

      Reviewer #2 (Recommendations for The Authors):

      TMEM263 has been suggested to be associated with bone mineral density and growth in humans and mice, but the functional role of this transmembrane protein in the regulation of bone metabolism is unknown. Using a knockout mouse approach, this manuscript demonstrates that Tmem263 is essential for longitudinal bone growth in the mouse, as Tmem263 knockout (KO) mice developed severe postnatal growth impairment and proportional dwarfism. It is determined that the dwarfism was caused by a substantial reduction in liver expression of growth hormone receptor (GHR), a slight increase in serum GH, and a reduction in serum IGF-I, which resulted in disruption of the GH/IGF-I regulatory axis of endochondral bone formation.

      The study was relatively well designed, and the results in general are supportive of the conclusions. While this study discloses new and intriguing functional information about a novel cytoplasmic membrane gene, there are a few minor issues that the authors may wish to address. These issues are listed in the following:

      1. One of the intriguing findings of this manuscript is that deletion of a gene encoding a small cytoplasmic membrane protein could cause a substantial reduction in the expression and protein levels of GHR. Although a couple of potential explanations were offered in the Discussion section (first complete paragraph of page 10), there has been no attempt to test any of the suggested causes, even though many of these potential mechanisms can readily be tested experimentally. Accordingly, the lack of mechanistic investigation into this intriguing effect renders the manuscript largely descriptive in nature.

      Response: The point made by the reviewer is well taken. We do plan to have follow up studies to establish which among the mechanisms we highlighted in the discussion is contributing to the reduction in GHR transcript and protein level. Our present study is the first functional characterization of this enigmatic novel membrane protein. We anticipate that multiple follow-up studies are needed to gain a deeper understanding of the biology of Tmem263. We believe that our present study represents an important first step.

      1. Because a major conclusion is that the bone phenotype of Tmem263 KO mice was caused by deficient hepatic expression and/or action of GHR, it would strengthen the conclusion if a brief comparison of the bone phenotype between GHR KO mice and Tmem263 KO mice were included in the Discussion section.

      Response: We have now included this information in the revised manuscript.

      1. In Figure 3, the cortical bone parameters (i.e., Tt.Ar, Ct.Ar, and Ct.Th), but none of the trabecular bone parameters (i.e., BV/TV, Tb.N, Tb.Th), were normalized against femur length. The authors did not provide a rationale for this differential treatment with the cortical bone parameters from the trabecular bone parameters. If the reason to normalize the cortical bone parameters against bone length was to demonstrate that the reduced cortical bone mass in mutants was related to the impaired longitudinal bone growth, then why did the authors not also assess whether the observed reduction in these trabecular bone parameters in KO mutants was proportional to reduced longitudinal bone growth?

      Response: We actually made the exact adjustments that the reviewer refers to, as stated in the methods section. Please see page 14. The regions of interest (ROIs) of both the trabecular bone analysis and the cortical analysis in the mutants were reduced in proportion to the length of the bone (40% smaller). The normalization of Tt.Ar to femur length in Figure 3I was only meant to show that the reduction in Tt.Ar in the mutants was proportional. We have modified the text in our results section for clarity.

      1. Elements described in Fig. 5A have been well documented. Therefore, Fig. 5A is unnecessary and can be deleted.

      Response: We felt that Figure 5A should remain. It helps orient readers that are not familiar with the literature to be aware that both liver- and bone-derived IGF-1 contribute to longitudinal bone growth.

      1. Figure 6 was performed with male KO mice. Were the altered gene expression profiles in female KO mice any different from male KO mice?

      Response: We plan to perform RNA-seq in female mouse liver in our follow-up studies. We do not know, at present, whether and to what extent the liver transcriptomic profile would be different between male and female KO mice. As far as dwarfism and deficiency in skeletal acquisition, both male and female KO mice showed the same phenotypes.

      1. The number of animals (or samples) per group in some of the Figures (i.e., Fig. 2G, 2I, 2J, 3A to J, the entire Fig. 4, 5D, 5F, and Suppl Fig. 1) needs to be provided in the legends.

      Response: We have included this information in the figure legends.

      Reviewer #3 (Recommendations for The Authors):

      1. Explain the discrepancy between the impact of KO on serum Igfbp3 (= decreased) vs. hepatic Igfbp3 (= unchanged).

      Response: We do not have a plausible mechanism, at present, that can explain the reduction in circulating serum Igfbp3 level without an apparent reduction in Igfbp3 transcript level in the liver. In human studies, typically only serum IGFBP3 levels are measured but not the hepatic IGFBP3 transcript level. Therefore, it is unclear whether the circulating level of IGFBP3 is regulated at the posttranscriptional level, an issue that can be explored in future studies.

      1. Line 215, 221, and elsewhere - Foxa1 does not show significant male-biased expression in mouse liver.

      Response: We have removed Foxa1 from the text.

      1. Line 225- According to the abstract of Ref. #45, Cux2 regulates a subset of sex-biased genes in the liver. The authors should compare the genes dysregulated by TMEM263-KO (Fig. 6) to those altered by Cux2 loss (Ref. #45) to ascertain whether the results of Fig. 6 are partially or entirely explained by Cux2 overexpression.

      Response: We agree that this is a great area of future study. We do feel, however, that this would be better explored in a more in-depth follow-up article. Given the current direction of the paper, we felt it made more sense to include differential expression comparisons of male vs female, hypophysectomized vs sham control, and Stat5b-KO vs WT mouse liver gene expression data. Our future work will explore the transcriptomes of male and female WT and Tmem263-KO liver in the context of the observed physiology.

      1. Line 262- "lower transcription of Ghr gene". A decrease in mRNA levels does NOT equate with a decrease in transcription per se. Altered mRNA splicing, poly A, export, cytoplasmic stability, etc. are all potential contributors.

      Response: We have included these possibilities highlighted by the reviewer in our revised Discussion section.

      1. Line 273, "TMEM263... most highly expressed in liver" Not correct - see Fig. 1C for TMEM263 RNA levels in mouse tissues.

      Response: We have corrected the text on page 11.

      1. Line 425 - Include GEO accession number.

      Response: We have already uploaded our RNA-seq data to the NCBI Sequence Read Archive (SRA), and the data can be accessed under accession number PRJNA938158.

      1. Fig. 6 - Line 796 - Specify the age and sex of mice analyzed.

      Response: We have included the information in the revised figure 6 legend.

      1. Fig.2 - Suppl 1- Specify age of mice.

      Response: We have included the information in the revised Figure 2-figure supplement 2.

      1. Fig.2G -Specify the sex of the mice.

      Response: For the P1 to P21 pups’ data, we did not separate by sex, as sex determination of pups at P1 and P7 can be challenging. We have now indicated this in the figure legend.

      1. Fig. 6A and 6C-6F: Which of these genes shows sex-dependent expression in wild-type liver? Use color to highlight gene names for genes that show male-biased or female-biased expression.

      Response: We agree with the reviewer that additional labels on Figure 6A and 6C-F would be helpful to show sex-biased genes. However, this is not the primary point of the paper. This topic deserves a much more in-depth analysis in follow-up studies focused on defining the exact type and degree of transcript feminization in the liver of Tmem263-KO mice, as well as its physiologic consequences. For readers interested in this topic, we have included subfigures G-I in Figure 6 and, for greater transcript-level detail, Figure 6-figure supplement 1.

    1. Author response

      Reviewer #1 (Public Review):

      Loss of skeletal muscle tissue from traumatic injury is debilitating. Restoring muscle mass and function remains a challenge. Using a mouse model, the authors performed punch biopsy injuries of the tibialis anterior in which the volume of muscle loss was varied to result in either successful muscle regeneration with a smaller injury or the unsuccessful outcome of fibrosis with a larger injury. For both conditions, a novel lipidomic profiling approach was used to evaluate pro-inflammatory and anti-inflammatory lipids at key time points post-injury with respect to collagen deposition, macrophage infiltration, muscle fiber regeneration, and force produced during isometric contractions. A key finding was that while all lipids increased at 3 days post-injury (dpi) and then declined through 14 dpi, pro-inflammatory lipids remained elevated during recovery from greater muscle loss which led to fibrosis. Maresin 1 was identified as an anti-inflammatory lipid that, when injected into injured muscle, reduced fibrosis, improved muscle regeneration, and partially restored the strength of contraction.

      Strengths: The metabolipidomic profiling demonstrated here represents a novel approach to identifying pro-inflammatory and anti-inflammatory mediators of successful vs unsuccessful skeletal muscle regeneration. These findings may translate into a new therapeutic approach for promoting successful regeneration following volumetric muscle loss.

      Weaknesses: Certain aspects of the data are overinterpreted; while some measures appear to have an adequate sample size to make sound conclusions, other measures are likely to lack sufficient statistical power given their variability. Presentation of the results would be strengthened by adhering to consistent terminology and labeling of figures throughout; specific examples are identified in recommendations to the authors. Several of the images used to illustrate differences between treatments are unconvincing because differences are not readily.

      We agree with the reviewer and have scaled back some of the interpretation as well as clarified the sample sizes. We have also amended the text to maintain a consistent terminology.

      Reviewer #2 (Public Review):

      The study is novel and valuable to the field and provides new and important insights into the role of lipid mediators in VML injuries. By expanding our understanding of the mechanisms that regulate muscle regeneration following VML injuries, the study has the potential to guide the development of novel therapeutic interventions that promote tissue repair and recovery. The data presented in the manuscript is of good quality. The findings and conclusions are supported by a variety of different analyses (e.g., gene expression, histology, flow cytometry).

      Despite the strengths of the study, some limitations are identified. Specifically, the impact of maresin 1 on macrophage phenotypes (M1/M2) could have been explored in more detail using histological or protein expression analysis. Moreover, additional data are needed to substantiate the claims about increased muscle regeneration. Lastly, the study does not address myofiber innervation, myofiber-type transitions, or motor unit remodeling.

      We thank the reviewer for the suggestions and have performed a more in-depth exploration of macrophage phenotypes through additional scRNA-sequencing analysis. We have also included additional data describing how Maresin 1 impacts muscle stem cells through cyclic AMP. Respectfully, profiling myofiber innervation, motor unit remodeling and myofiber-type transitions is beyond the scope of this manuscript.

    1. Author Response

      LD Score regression (LDSC) is a software tool widely used in the field of genome-wide association studies (GWAS) for estimating heritabilities, genetic correlations, the extent of confounding, and biological enrichment. LDSC is for the most part not regarded as an accurate estimator of \emph{absolute} heritability (although useful for relative comparisons). It is relied on primarily for its other uses (e.g., estimating genetic correlations). The authors propose a new method called \texttt{i-LDSC}, extending the original LDSC in order to estimate a component of genetic variance in addition to the narrow-sense heritability---epistatic genetic variance, although not necessarily all of it. Epistasis in quantitative genetics refers to the component of genetic variance that cannot be captured by a linear model regressing total genetic values on single-SNP genotypes. \texttt{i-LDSC} seems aimed at estimating that part of the epistatic variance residing in statistical interactions between pairs of SNPs. To simplify, the basic model of \texttt{i-LDSC} for two SNPs $X_1$ and $X_2$ is

      \begin{equation}\label{eq:twoX} Y = X_1 \beta_1 + X_2 \beta_2 + X_1 X_2 \theta + E, \end{equation}

      and estimation of the epistatic variance associated with the product term proceeds through a variant of the original LD Score that measures the extent to which a SNP tags products of genotypes (rather than genotypes themselves). The authors conducted simulations to test their method and then applied it to a number of traits in the UK Biobank and Biobank Japan. They found that for all traits the additive genetic variance was larger than the epistatic, but for height the absolute size of the epistatic component was estimated to be non-negligible. An interpretation of the authors' results that perhaps cannot be ruled out, however, is that pairwise epistasis overall does not make a detectable contribution to the variance of quantitative traits.

      We thank the reviewer for carefully reading our manuscript and we appreciate the constructive comments. Our responses and edits to the specific major comments and minor issues are given below.

      Major Comments

      This paper has a lot of strong points, and I commend the authors for the effort and ingenuity expended in tackling the difficult problem of estimating epistatic (non-additive) genetic variance from GWAS summary statistics. The mere possibility of the estimated univariate regression coefficient containing a contribution from epistasis, as represented in the manuscript's Equation~3 and elsewhere, is intriguing in and of itself.

      Is \texttt{i-LDSC} Estimating Epistasis?

      Perhaps the issue that has given me the most pause is uncertainty over whether the paper's method is really estimating the non-additive genetic variance, as this has been traditionally defined in quantitative genetics with great consequences for the correlations between relatives and evolutionary theory (Fisher, 1930, 1941; Lynch & Walsh, 1998; Burger, 2000; Ewens, 2004).

      Let us call the expected phenotypic value of a given multiple-SNP genotype the \emph{total genetic value}. If we apply least-squares regression to obtain the coefficients of the SNPs in a simple linear model predicting the total genetic values, then the partial regression coefficients are the \emph{average effects of gene substitution} and the variance in the predicted values resulting from the model is called the \emph{additive genetic variance}. (This is all theoretical and definitional, not empirical. We do not actually perform this regression.) The variance in the residuals---the differences between the total genetic values and the additive predicted values---is the \emph{non-additive genetic variance}. Notice that this is an orthogonal decomposition of the variance in total genetic values. Thus, in order for the variance in $\mathbf{W}\bm{\theta}$ to qualify as the non-additive genetic variance, it must be orthogonal to $\mathbf{X} \bm{\beta}$.

      At first, I very much doubted whether this is generally true. And I was not reassured by the authors' reply to Reviewer~1 on this point, which did not seem to show any grasp of the issue at all. But to my surprise I discovered in elementary simulations of Equation~\ref{eq:twoX} above that for mean-centered $X_1$ and $X_2$, $(X_1 \beta_1 + X_2 \beta_2)$ is uncorrelated with $X_1 X_2 \theta$ for seemingly arbitrary correlation between $X_1$ and $X_2$. A partition of the outcome's variance between these two components is thus an orthogonal decomposition after all. Furthermore, the result seems general for any number of independent variables and their pairwise products. I am also encouraged by the report that standard and interaction LD Scores are ``lowly correlated'' (line~179), meaning that the standard LDSC slope is scarcely affected by the inclusion of interaction LD Scores in the regression; this behavior is what we should expect from an orthogonal decomposition.

      I have therefore come to the view that the additional variance component estimated by \texttt{i-LDSC} has a close correspondence with the epistatic (non-additive) genetic variance after all.

      In order to make this point transparent to all readers, however, I think that the authors should put much more effort into placing their work into the traditional framework of the field. It was certainly not intuitive to multiple reviewers that $\mathbf{X}\bm{\beta}$ is orthogonal to $\mathbf{W}\bm{\theta}$. There are even contrary suggestions. For if $(\mathbf{X}\bm{\beta})^\intercal \mathbf{W} \bm{\theta} = \bm{\beta}^\intercal \mathbf{X}^\intercal \mathbf{W} \bm{\theta} $ is to equal zero, we know that we can't get there by $\mathbf{X}^\intercal \mathbf{W}$ equaling zero because then the method has nothing to go on (e.g., line~139). We thus have a quadratic form---each term being the weighted product of an average (additive) effect and an interaction coefficient---needing to cancel out to equal zero. I wonder if the authors can put forth a rigorous argument or compelling intuition for why this should be the case.

      In the case of two polymorphic sites, quantitative genetics has traditionally partitioned the total genetic variance into the following orthogonal components:

      \begin{itemize}

      \item additive genetic variance, $\sigma^2_A$, the numerator of the narrow-sense heritability;

      \item dominance genetic variance, $\sigma^2_D$;

      \item additive-by-additive genetic variance, $\sigma^2_{AA}$;

      \item additive-by-dominance genetic variance, $\sigma^2_{AD}$; and

      \item dominance-by-dominance genetic variance, $\sigma^2_{DD}$.

      \end{itemize}

      See Lynch and Walsh (1998, pp. 88-92) for a thorough numerical example. This decomposition is not arbitrary or trivial, since each component has a distinct coefficient in the correlations between relatives. Is it possible for the authors to relate the variance associated with their $\mathbf{W}\bm{\theta}$ to this traditional decomposition? Besides justifying the work in this paper, the establishment of a relationship can have the possible practical benefit of allowing \texttt{i-LDSC} estimates of non-additive genetic variance to be checked against empirical correlations between relatives. For example, if we know from other methods that $\sigma^2_D$ is negligible but that \texttt{i-LDSC} returns a sizable $\sigma^2_{AA}$, we might predict that the parent-offspring correlation should be equal to the sibling correlation; a sizable $\sigma^2_D$ would make the sibling correlation higher. Admittedly, however, such an exercise can get rather complicated for the variance contributed by pairs of SNPs that are close together (Lynch & Walsh, 1998, pp. 146-152).

      I would also like the authors to clarify whether LDSC consistently overestimates the narrow-sense heritability in the case that pairwise epistasis is present. The figures seem to show this. I have conflicting intuitions here. On the one hand, if GWAS summary statistics can be inflated by the tagging of epistasis, then it seems that LDSC should overestimate heritability (or at least this should be an upwardly biasing factor; other factors may lead the net bias to be different). On the other hand, if standard and interaction LD Scores are lowly correlated, then I feel that the inclusion of interaction LD Score in the regression should not strongly affect the coefficient of the standard LD Score. Relatedly, I find it rather curious that \texttt{i-LDSC} seems increasingly biased as the proportion of genetic variance that is non-additive goes up---but perhaps this is not too important, since such a high ratio of narrow-sense to broad-sense heritability is not realistic.

      We thank the reviewer for taking the time to thoughtfully offer more context on how we might situate the i-LDSC framework within the greater context of traditional quantitative genetics. We now formalize the interaction component used in the i-LDSC model as an estimate of the phenotypic variance explained by additive-by-additive interactions between genetic variants (which we denote by $\sigma^2_{AA}$ to follow the conventional notation). In the newly revised Materials and Methods, we also show how the i-LDSC model can be formulated to include dominance effects in a more general framework. Our updated derivations provide two key takeaways.

      First, we assume that the additive and interaction effect sizes in the general model $(\bm{\beta}, \bm{\theta})$ are each normally distributed with variances proportional to their individual contributions to trait heritability: $\beta_j \sim \mathcal{N}(0, \sigma^2_A)$ and $\theta_m \sim \mathcal{N}(0, \sigma^2_{AA})$. This independence assumption implies that the additive and non-additive components $\mathbf{X}\bm{\beta}$ and $\mathbf{W}\bm{\theta}$ are orthogonal, in the sense that $\mathbb{E}[\bm{\beta}^\intercal \mathbf{X}^\intercal \mathbf{W} \bm{\theta}] = \mathbb{E}[\bm{\beta}^\intercal] \mathbf{X}^\intercal \mathbf{W} \mathbb{E}[\bm{\theta}] = 0$. This is important because, as the reviewer points out, it means that there is a unique partitioning of genetic variance when studying a trait of interest. In the revised version of the manuscript, we show this derivation in the main text (see lines 129-143). We also extend this derivation in the Materials and Methods, where we show the same result even after we include dominance effects in the generative model (see lines 415-417 and 438-457).
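      The expectation argument above can be checked numerically. The following is a minimal sketch only, not part of the manuscript's analysis: the two correlated "genotype" columns, their product column, and the unit effect-size variances are all hypothetical choices made for illustration. It holds $\mathbf{X}$ and $\mathbf{W}$ fixed, redraws independent zero-mean $\bm{\beta}$ and $\bm{\theta}$ many times, and averages the cross term $(\mathbf{X}\bm{\beta})^\intercal \mathbf{W}\bm{\theta}$.

```python
import random
import statistics


def cross_term(X, W, beta, theta):
    """Return (X beta)^T (W theta) for list-of-rows matrices X and W."""
    total = 0.0
    for x_row, w_row in zip(X, W):
        additive = sum(x * b for x, b in zip(x_row, beta))
        interaction = sum(w * t for w, t in zip(w_row, theta))
        total += additive * interaction
    return total


def simulate_cross_terms(n_draws=4000, n_ind=50, seed=1):
    rng = random.Random(seed)
    # Hypothetical fixed design: two correlated columns and their product.
    X, W = [], []
    for _ in range(n_ind):
        x1 = rng.gauss(0, 1)
        x2 = 0.6 * x1 + 0.8 * rng.gauss(0, 1)
        X.append([x1, x2])
        W.append([x1 * x2])
    draws = []
    for _ in range(n_draws):
        # Independent, zero-mean effect sizes (variances set to 1 by assumption).
        beta = [rng.gauss(0, 1), rng.gauss(0, 1)]
        theta = [rng.gauss(0, 1)]
        draws.append(cross_term(X, W, beta, theta))
    return draws


draws = simulate_cross_terms()
mean = statistics.fmean(draws)
stderr = statistics.stdev(draws) / len(draws) ** 0.5
print(f"mean cross term = {mean:.3f} (standard error {stderr:.3f})")
```

      Individual draws of the cross term are generally nonzero because $\mathbf{X}^\intercal \mathbf{W}$ is nonzero in a finite sample; under these assumptions it is only the average over effect-size draws that should sit within sampling error of zero.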

      Second, we show that the genotype matrix $\mathbf{X}$ and the matrix of genetic interactions $\mathbf{W}$ are not linearly dependent because the additive-by-additive effects between two SNPs are encoded as the Hadamard product of two genotypic vectors in the form $\mathbf{w}_m = \mathbf{x}_j \circ \mathbf{x}_k$ (which is a nonlinear function of the genotypes). Linear dependence would have implied that one could find a transformation between a SNP and an interaction term in the form $\mathbf{w}_m = c \times \mathbf{x}_j$ for some constant $c$. However, despite their linear independence, $\mathbf{X}$ and $\mathbf{W}$ are themselves not orthogonal and still have a nonzero correlation. This implies that the inner product between genotypes and their interactions is nonzero, $\mathbf{X}^\intercal \mathbf{W} \neq \mathbf{0}$. To see this, we focus on a focal SNP $\mathbf{x}_j$ and consider three different types of interactions:

      • Scenario I: Interaction between a focal SNP with itself ($\mathbf{x}_j \circ \mathbf{x}_j$).
      • Scenario II: Interaction between a focal SNP with a different SNP ($\mathbf{x}_j \circ \mathbf{x}_k$).
      • Scenario III: Interaction between a focal SNP with a pair of different SNPs ($\mathbf{x}_k \circ \mathbf{x}_l$).

      In the Materials and Methods of the revised manuscript, we now provide derivations showing when we would expect nonzero correlation between $\mathbf{X}$ and $\mathbf{W}$. These rely on the fact that: (1) we assume that genotypes have been mean-centered and scaled to have unit variance, and (2) under Hardy-Weinberg equilibrium, SNPs marginally follow a binomial distribution $\mathbf{x}_j \sim \mathrm{Bin}(2, p)$, where $p$ represents the minor allele frequency (MAF) (Wray et al. 2007, Genome Res; Lippert et al. 2013, Sci Rep). These new additions are given in new lines 460-485.
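      As a small worked illustration of Scenario I under the two stated assumptions (this sketch is ours for exposition and does not reproduce the manuscript's derivations; the function names are hypothetical), one can enumerate the three genotype classes exactly. For a standardized genotype $z$, $\mathbb{E}[z] = 0$ and $\mathbb{E}[z^2] = 1$, so the covariance between a SNP and its own square reduces to $\mathbb{E}[z^3]$, which is the skewness of the $\mathrm{Bin}(2, p)$ distribution, $(1 - 2p)/\sqrt{2p(1-p)}$.

```python
from math import isclose, sqrt


def hwe_genotype_probs(p):
    """Genotype probabilities for allele counts 0, 1, 2 under Hardy-Weinberg."""
    q = 1.0 - p
    return {0: q * q, 1: 2 * p * q, 2: p * p}


def cov_snp_with_own_square(p):
    """Exact Cov(z, z*z) for a standardized Bin(2, p) genotype z."""
    probs = hwe_genotype_probs(p)
    mean = 2 * p
    sd = sqrt(2 * p * (1 - p))
    z = {g: (g - mean) / sd for g in probs}
    # E[z] = 0 and E[z^2] = 1, so Cov(z, z^2) = E[z^3].
    return sum(pr * z[g] ** 3 for g, pr in probs.items())


for p in (0.05, 0.1, 0.25, 0.5):
    print(f"p = {p:.2f}  Cov(z, z^2) = {cov_snp_with_own_square(p):.4f}")
```

      In this toy calculation the covariance vanishes only at $p = 0.5$, where the genotype distribution is symmetric, and grows as the allele becomes rarer.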

      Lastly, we agree with the reviewer that our results indicate that LDSC inflates estimates of SNP-based narrow-sense heritability. Our intuition for why this happens is largely consistent with the reviewer’s first point: since GWAS summary statistics can be inflated by the tagging of non-additive genetic variance, it makes sense that LDSC should overestimate heritability. LDSC uses a univariate regression without the inclusion of cis-interaction scores. A simple case of “omitted variable bias” is likely at work: since LDSC does not explicitly account for contributions from the tagged non-additive components, which also contribute to the variance in the GWAS summary statistics, the estimate for the coefficient $\sigma^2_A$ becomes slightly inflated.
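      The omitted-variable intuition can be sketched with a generic two-predictor regression. This is an illustration under stated assumptions, not an LDSC implementation: `ld` and `cis` are hypothetical Gaussian stand-ins for the standard and cis-interaction scores, `chi` is a stand-in outcome rather than real chi-square statistics, and the correlation `rho = 0.4` is deliberately exaggerated (the manuscript reports that the two scores are only lowly correlated). The joint coefficient is obtained by residualizing `ld` on `cis` (the Frisch-Waugh-Lovell result).

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residualize(x, z):
    """Residuals of x after regressing it on z (with an intercept)."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    slope = ols_slope(z, x)
    return [a - mean_x - slope * (c - mean_z) for a, c in zip(x, z)]


def simulate(n=20000, slope_add=0.5, slope_int=0.3, rho=0.4, seed=7):
    """Draw two correlated predictors and an outcome that depends on both."""
    rng = random.Random(seed)
    ld, cis, chi = [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        ld.append(a)
        cis.append(b)
        chi.append(slope_add * a + slope_int * b + rng.gauss(0, 1))
    return ld, cis, chi


ld, cis, chi = simulate()
univariate = ols_slope(ld, chi)               # second predictor omitted
joint = ols_slope(residualize(ld, cis), chi)  # second predictor accounted for
print(f"univariate slope = {univariate:.3f}, joint slope = {joint:.3f}")
```

      In this sketch the univariate slope is expected to be close to `slope_add + rho * slope_int`, while the joint fit recovers `slope_add`; the inflation shrinks toward zero as the correlation between the two predictors does, which is consistent with the bias being described as slight.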

      How Much Epistasis Is \texttt{i-LDSC} Detecting?

      I think the proper conclusion to be drawn from the authors' analyses is that statistically significant epistatic (non-additive) genetic variance was not detected. Specifically, I think that the analysis presented in Supplementary Table~S6 should be treated as a main analysis rather than a supplementary one, and the results here show no statistically significant epistasis. Let me explain.

      Most serious researchers, I think, treat LDSC as an unreliable estimator of narrow-sense heritability; it typically returns estimates that are too low. Not even the original LDSC paper pressed strongly to use the method for estimating $h^2$ (Bulik-Sullivan et al., 2015). As a practical matter, when researchers are focused on estimating absolute heritability with high accuracy, they usually turn to GCTA/GREML (Evans et al., 2018; Wainschtein et al., 2022).

      One reason for low estimates with LDSC is that if SNPs with higher LD Scores are less likely to be causal or to have large effect sizes, then the slope of univariate LDSC will not rise as much as it ``should'' with increasing LD Score. This was a scenario actually simulated by the authors and displayed in their Supplementary Figure~S15. [Incidentally, the authors might have acknowledged earlier work in this vein. A simulation inducing a negative correlation between LD Scores and $\chi^2$ statistics was presented by Bulik-Sullivan et al. (2015, Supplementary Figure 7), and the potentially biasing effect of a correlation over SNPs between LD Scores and contributed genetic variance was a major theme of Lee et al. (2018).] A negative correlation between LD Score and contributed variance does seem to hold for a number of reasons, including the fact that regions of the genome with higher recombination rates tend to be more functional. In short, the authors did very well to carry out this simulation and to show in their Supplementary Figure~S15 that this flaw of LDSC in estimating narrow-sense heritability is also a flaw of \texttt{i-LDSC} in estimating broad-sense heritability. But they should have carried the investigation at least one step further, as I will explain below.

      Another reason for LDSC being a downwardly biased estimator of heritability is that it is often applied to meta-analyses of different cohorts, where heterogeneity (and possibly major but undetected errors by individual cohorts) lead to attenuation of the overall heritability (de Vlaming et al., 2017).

      The optimal case for using LDSC to estimate heritability, then, is incorporating the LD-related annotation introduced by Gazal et al. (2017) into a stratified-LDSC (s-LDSC) analysis of a single large cohort. This is analogous to the calculation of multiple GRMs defined by MAF and LD in the GCTA/GREML papers cited above. When this was done by Gazal et al. (2017, Supplementary Table 8b), the joint impact of the improvements was to increase the estimated narrow-sense heritability of height from 0.216 to 0.534.

      All of this has at least a few ramifications for \texttt{i-LDSC}. First, the authors do not consider whether a relationship between their interaction LD Scores and interaction effect sizes might bias their estimates. (This would be on top of any biasing relationship between standard LD Scores and linear effect sizes, as displayed in Supplementary Figure~S15.) I find some kind of statistical relationship over the whole genome, induced perhaps by evolutionary forces, between \emph{cis}-acting epistasis and interaction LD Scores to be plausible, albeit without intuition regarding the sign of any resulting bias. The authors should investigate this issue or at least mention it as a matter for future study. Second, it might be that the authors are comparing the estimates of broad-sense heritability in Table~1 to the wrong estimates of narrow-sense heritability. Although the estimates did come from single large cohorts, they seem to have been obtained with simple univariate LDSC rather than s-LDSC. When the estimate of $h^2$ obtained with LDSC is too low, some will suspect that the additional variance detected by \texttt{i-LDSC} is simply additive genetic variance missed by the downward bias of LDSC. Consider that the authors' own Supplementary Table~S6 gives s-LDSC heritability estimates that are consistently higher than the LDSC estimates in Table~1. E.g., the estimated $h^2$ of height goes from 0.37 to 0.43. The latter figure cuts quite a bit into the estimated broad-sense heritability of 0.48 obtained with \texttt{i-LDSC}.

      Here we come to a critical point. Lines 282--286 are not entirely clear, but I interpret them to mean that the manuscript's Equation~5 was expanded by stratifying $\ell$ into the components of s-LDSC and this was how the estimates in Supplementary Table~S6 were obtained. If that interpretation is correct, then the scenario of \texttt{i-LDSC} picking up missed additive genetic variance seems rather plausible. At the very least, the increases in broad-sense heritability reported in Supplementary Table~S6 are smaller in magnitude and \emph{not statistically significant}. Perhaps what this means is that the headline should be a \emph{negligible} contribution of pairwise epistasis revealed by this novel and ingenious method, analogous to what has been discovered with respect to dominance (Hivert et al., 2021; Pazokitoroudi et al., 2021; Okbay et al., 2022; Palmer et al., 2023).

      This is an excellent question raised by the reviewer and, again, we really appreciate such a thoughtful and thorough response. First, we completely agree with the reviewer that the s-LDSC estimates previously included in the Supplementary Material should instead be discussed in the main text of the manuscript. In the revision, we have now moved the old Supplemental Table S6 to be the new Table 2. Second, we also agree that the conclusions about the magnitude of additive-by-additive effects should be based upon the variance explained when using the cis-interaction score in addition to scores specific to different biological annotations when available, per s-LDSC.

      However, we respectfully disagree that the results indicate a negligible contribution of additive-by-additive genetic variance to all the traits we analyzed (see Figure 4D). Although the additive-by-additive genetic variance component is not significant for any trait in the UK Biobank, there is little reason to expect that it would be, given the inclusion of 97 other biological annotations from the s-LDSC model. Indeed, in the s-LDSC paper itself, the authors look only for enrichment of heritability for a given annotation, not for a statistically significant test statistic. It is also worth noting that jackknife approaches tend to be conservative and yield slightly larger standard errors for hypothesis testing. Taking all the great points that the reviewer mentioned into account, we believe that a moderate stance on the interpretation of our results is one that: (i) emphasizes the importance of using s-LDSC with the cis-interaction score to better assess the variance explained by additive-by-additive interaction effects and (ii) allows for the significance of the additive-by-additive component to not be the only factor when determining the importance of the role of non-additive effects in shaping trait architecture.

      In the revision, we now write the following in lines 331-343:

      Lastly, we performed an additional analysis in the UK Biobank where the cis-interaction scores are included as an annotation alongside 97 other functional categories in the stratified-LD score regression framework and its software s-LDSC (Materials and Methods). Here, s-LDSC heritability estimates still showed an increase with the interaction scores versus when the publicly available functional categories were analyzed alone, albeit at a much smaller magnitude (Table 2). The contributions from the additive-by-additive component to the overall estimate of genetic variance ranged from 0.005 for MCHC (P = 0.373) to 0.055 for HDL (P = 0.575) (Figures 4C and 4D). Furthermore, in this analysis, the estimates of the additive-by-additive components were no longer statistically significant for any of the traits in the UK Biobank (Table 2). Despite this, these results highlight the ability of the i-LDSC framework to identify sources of “missing” phenotypic variance explained in heritability estimation. Importantly, moving forward, we suggest using the cis-interaction scores with additional annotations whenever they are available, as this provides more conservative estimates of the role of additive-by-additive effects on trait architecture.

      Lastly, in the Discussion, we now mention that an area of future work would be to explore how the relationship between cis-interaction LD scores and interaction effect sizes might bias heritability estimates from i-LDSC (e.g., similar to the relationship explored between standard LD scores and linear effect sizes in Figure 3 – figure supplement 8). See new lines 364-367.

    1. Author Response

      The following is the authors’ response to the current reviews.

      We agree with Reviewer #1 that it is not typical to include primary data in a review, but this seems to be a very unusual situation and it is not unprecedented. We seriously believe that it would significantly dilute the impact of the message if we were to separate this into two papers. We intended initially to do a comprehensive review of the αC-β4 motif as we think it is an extremely important element of secondary structure that has been rather overlooked in the protein kinase field. It is the site where the nucleotide and peptide/protein binding sites converge in the C:PKI complex and also in the RIα holoenzyme, which is also a pseudo-substrate inhibitor. This stable element is highly conserved in all protein kinases, and we think it is an extremely important allosteric site where the kinases differ. Thus, it is highly relevant for this set of eLife papers on kinase allostery. In parallel, we have developed the Local Spatial Pattern (LSP) alignment method for identifying Protein Residue Networks (PRNs) into a robust tool. When the Veglia team, our long-time collaborators, did their NMR analysis of the F100A mutant, which is in the αC-β4 loop, we thus decided to do the LSP analysis. The LSP results were so interesting and striking that we decided immediately to explore the motif further and to specifically compare the various crystal structures that we had solved in the past to see if indeed we had missed some changes. In addition to looking at the backbone, we decided to also look at the side chains and to compare the structures with the simulations. The results proved to be extremely informative and defined a multi-pronged approach that could be used to screen any disease mutation or alternatively as an Ala scan for any residue in any protein. I consider this to be one of the most important papers that I have published in many years. It describes a process for exploring the potential dynamic impact of any disease mutation or any point mutation.
      We emphasize repeatedly that the hypotheses generated from the computational screen will need to be validated experimentally, but our LSP analysis is a rapid and relatively inexpensive way to screen a set of mutations and predict which will have the greatest impact on dynamics. It is an especially powerful and robust way to identify allosteric sites, as the LSP approach maps global changes of a single mutation across the entire protein. These mutants would then be prioritized for experimental follow-up. We are indeed now implementing this more comprehensive strategy in two ways. We are specifically exploring three disease mutations in the αC-β4 loop and, in parallel, are also doing a computational Ala scan of the entire loop (L95-L106); however, this is part of a separate and more comprehensive study that will take much longer. It will be the “Proof-of-Principle” of the hypotheses that we propose in our eLife paper. In addition to the LSP method, the MD simulations provide new and complementary insights into side chain dynamics in contrast to the static crystal structures. We will also begin to compare the αC-β4 loop in other kinases, specifically PKCβ2 and LRRK2, but once again this is part of a separate study and is clearly beyond the scope of this eLife paper. This focus on the αC-β4 loop is an excellent strategy that can be applied to any protein kinase. The LSP approach, however, can obviously also be applied to any protein or any motif, so it is potentially a very powerful tool. We think that the impact and potential importance of this paper will be lost if it is split into two papers.

      I went back to look at a recent review that we did for the Biochemical Journal on the PKA Cβ isoform, and there we also included some new primary data in the review. It was never questioned. We believe that our manuscript is perfectly appropriate for this eLife series focused on allostery in kinases, and having our paper back-to-back with the Veglia NMR paper is especially important and relevant. We thus ask that you seriously consider keeping this as a single paper as part of this series on allostery.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work Wu, J., et al., highlight the importance of a previously overlooked region on kinases: the αC-β4 loop. Using PKA as a model system, the authors extensively describe the conserved regulatory elements within a kinase and how the αC-β4 loop region integrates with these important regulatory elements. Previous biochemical work on a mutation within the αC-β4 loop region, F100A showed that this region is important for the synergistic high affinity binding of ATP and the pseudo substrate inhibitor PKI. In the current manuscript, the authors assess the importance of the αC-β4 loop region using computational methods such as Local Spatial Pattern Alignment (LSP) and MD simulations. LSP analysis of the F100A mutant showed decreased values for degree centrality and betweenness centrality for several key regulatory elements within the kinase which suggests a loss in stability/connectivity in the mutant protein as compared to the WT. Additionally, based on MD simulation data, the side chain of K105, another residue within the αC-β4 loop region had altered dynamics in the F100A mutant as compared to the WT protein. While these changes in the αC-β4 loop region seem to be consistent with the previous biochemical data, the results are preliminary and the manuscript can be strengthened (as the authors themselves acknowledge) with additional experiments. Specific comments/concerns are listed below.

      1. MD simulations were carried out using a binary complex of the catalytic subunit of PKA and ATP/Mg and not the ternary complex of PKA, ATP/Mg and PKI. MD simulations carried out using the ternary complex instead of the binary complex would be more informative, especially on the role of the αC-β4 loop region in the synergistic binding of ATP/Mg and PKI.

      Response 1. Thank you for your suggestion. We have included the data for the MD simulations of the ternary complex in the revised manuscript. This includes a new figure and was indeed informative (Figure 11). Text describing this simulation is also added on pages 15-17. All the changes in the revised manuscript are highlighted in red.

      2. The LSP analysis shows a decrease in degree centrality for the αC-β4 loop region in the F100A mutant compared to the WT protein which suggests a gain in stability in this region for the F100A mutant (Fig. 8A). These results seem to be contradictory to the MD simulation data which shows the side chain dynamics of K105 destabilizes the αC-β4 loop region in the F100A mutant (Fig. 10B). It would be helpful if the authors could clarify this apparent discrepancy.

      Response 2. In Figure 8A, the negative values of degree centrality (DC) for the αC-β4 loop region show that DC is lower in the WT than in the mutant, suggesting that those regions are more stable in the mutant. This indicates that the mutation in the αC-β4 loop region both rigidifies the motif and alters the communication network between the two lobes.

      The betweenness centrality plots (Figure 8B) also show how the connectivity between the two lobes is altered upon mutation. In the mutant, the major connectors become V104 and I150 in the C-lobe, whereas connectivity was primarily governed by K72 (N-lobe) and D184 (C-lobe) in the WT C-subunit. Overall, the mutation causes rigidification of the αC-β4 loop, and this leads to loss of allosteric communication between the two lobes.

      The MD simulation results as shown in Figure 10B are not contradictory. This figure shows the overall dynamic profile of the protein, based on principal component analysis (PCA) using the parameter of the residual flexibility. It does not reflect a particular motif's stability or flexibility. Instead it shows that overall the protein upon mutation becomes more dynamic and can sample different conformational states, while, in contrast, the WT protein preferred a single global state of conformation. However, the LSP results showed that, compared to the other parts, the αC-β4 loop, especially V104 at the tip, becomes more stable following mutation, and this has an impact on the allosteric communication between the two lobes. We have added this information into the revised manuscript on page 14, also highlighted in red.
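      For readers less familiar with the two network measures discussed in this response, the following toy sketch shows what degree centrality and betweenness centrality capture on a small graph. It is an illustration only and is not the LSP method: the graph, the node labels, and the helper functions are hypothetical, the edges are unweighted, and betweenness is computed with Brandes' algorithm on shortest paths. Two tightly connected clusters (standing in for two lobes) are joined through a single bridging node.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized shortest-path betweenness (Brandes) for an undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies back toward s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical toy network: cluster {a, b, c} and cluster {e, f, g} joined via d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("f", "g"), ("e", "g")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

dc = degree_centrality(adj)
bc = betweenness_centrality(adj)
for node in sorted(adj):
    print(node, round(dc[node], 3), bc[node])
```

      In this toy graph the bridging node has the lowest degree centrality among the connecting residues yet the highest betweenness, which is why the two measures are read separately: one reflects local connectivity, the other reflects how much communication between clusters passes through a node.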

      3. The foundation for the experiments carried out in this paper is based on previous NMR and computational data for the F100A mutant. However, the specific results and conclusions from these previous experiments are not clearly described.

      Response 3. The NMR paper has already been accepted by eLife; the bioRxiv version is available at https://www.biorxiv.org/content/10.1101/2023.09.12.557419v1.

      Reviewer #1 (Recommendations For The Authors):

      In this work Wu, J., et al., draw attention to the αC-β4 loop, a previously neglected region within kinases. A comprehensive review on the important regulatory elements within the kinase along with how the αC-β4 loop (and the αE helix) integrates with these different regulatory elements is presented well. As the authors themselves acknowledge, the data presented here while promising is preliminary. Additional biochemical, NMR and computational experiments need to be carried out to assess the importance of F100, K105 and other residues in this region.

      1. The authors indicate that previous computational studies predict a flip in the αC-β4 loop in the apo state. It would be helpful to have a figure showing the predicted flip as well as an explanation for the significance of this predicted flip.

      Response 1. The NMR paper has already been accepted by eLife; the bioRxiv version is available at https://www.biorxiv.org/content/10.1101/2023.09.12.557419v1. Figures 3 and 6 in that paper describe the predicted flip in the αC-β4 loop in the apo state. We did not see a flip in any of our crystal structures, and the 200 ns simulations on which the LSP analysis is based are not long enough to capture this major conformational change.

      2. The authors cite previous NMR and biochemical experiments (reference 62), work that has just been submitted to eLife. Access to this work was difficult as this manuscript could not be found on the eLife website.

      Response 2. The NMR paper has already been accepted by eLife; the bioRxiv version is available at https://www.biorxiv.org/content/10.1101/2023.09.12.557419v1.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      Despite the importance of T follicular helper cells (Tfh cells) in vaccine-induced humoral responses, it is still unclear which type of Tfh cells (Tfh1, Tfh2, and Tfh17) is critical for generating protective humoral immunity. By using the rhesus macaques model (most similar to human), the authors have addressed this potentially important question and obtained suggestive data that Tfh1 is critical. Although being suggestive, the evidence for the importance of Tfh1 is incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Developing vaccination capable of inducing persistent antibody responses capable of broadly neutralizing HIV strains is of high importance. However, our ability to design vaccines to achieve this is limited by our relative lack of understanding of the role of T-follicular helper (Tfh) subtypes in the responses. In this report Verma et al investigate the effects of different prime and boost vaccination strategies to induce skewed Tfh responses and its relationship to antibody levels. They initially find that live-attenuated measles vaccine, known to be effective at inducing prolonged antibody responses has a significant minority of germinal center Tfh (GC-Tfh) with a Th1 phenotype (GC-Tfh1) and then explore whether a prime and boost vaccination strategy designed to induce GC-Tfh1 is effective in the context of anti-HIV vaccination. They conclude that a vaccine formulation referred to as MPLA before concluding that this is the case.

      Clarification: MPLA serves as the adjuvant, and the vaccine formulation is characterized as a Th1 formulation based on the properties of the adjuvant.

      Strengths:

      While there is a lot of literature on Tfh subtypes in blood, how this relates to the germinal centers is not always clear. The strength of this paper is that they use a relevant model to allow some longitudinal insight into the detailed events of the germinal center Tfh (GC-Tfh) compartment across time and how this related to antibody production.

      Weaknesses:

      The authors focus strongly on the numbers of GC-Tfh1 as a proportion of memory cells and their comparison to GC-Tfh17. There seems to be little consideration of the large proportion of GC-Tfh which express neither CCR6 nor CXCR3, and currently no clear reasoning for excluding the majority of GC-Tfh from most analyses. There seems to be an assumption that since the MPLA vaccine has a higher number of GC-Tfh1, this explains the higher levels of antibodies. There is not sufficient information to make it clear whether the primary difference in vaccine efficacy is due to a greater proportion of GC-Tfh1 or to an overall increase in GC-Tfh of which the percentage of GC-Tfh1 is relatively fixed.

      Response: We appreciate the reviewer's comment. Indeed, while there is substantial literature on Tfh subtypes in blood, the strength of our study lies in utilizing a relevant model to provide longitudinal insights into the dynamics of the germinal center Tfh (GC-Tfh) compartment over time and its relationship to antibody production. Regarding the concern about the comprehensive analysis of GC Tfh subsets, including GC-Tfh1, GC-Tfh17, and others not expressing CCR6 and/or CXCR3, we fully acknowledge its importance. To address this, we will conduct a detailed analysis of GC Tfh and GC Tfh1 frequencies, encompassing subsets without CCR6 and CXCR3 expression, to provide a more comprehensive view of the GC-Tfh population.

      Reviewer #2 (Public Review):

      Summary:

      Anil Verma et al. have performed prime-boost HIV vaccination to enhance HIV-1 Env antibodies in the rhesus macaque model. The authors used two different adjuvants, a cationic liposome-based adjuvant (CAF01) and a monophosphoryl lipid A (MPLA)+QS-21 adjuvant. They demonstrated that these two adjuvants promote different transcriptomes in the GC-TFH subsets. The MPLA+QS-21 adjuvant induces abundant GC TFH1 cells expressing CXCR3 at first priming, while the CAF01 adjuvant predominantly induced GC TFH1/17 cells co-expressing CXCR3 and CCR6. Both adjuvants initiate comparable Env antibody responses. However, MPLA+QS-21 shows more significant IgG1 antibodies binding to gp140 even after 30 weeks.

      The enhancement of memory responses by MPLA+QS-21 consistently associates with the emergence of GC TFH1 cells that preferentially produce IFN-γ.

      Strengths:

      The strength of this manuscript is that all experiments have been done in the rhesus macaque model with great care. This manuscript beautifully indicated that MPLA+QS-21 would be a promising adjuvant to induce the memory B cell response in the HIV vaccine.

      Weaknesses:

      The authors did not provide clear evidence to indicate the functional relevance of GC TFH1 in IgG1 class-switch and B cell memory responses.

      Response. We appreciate the recognition of our meticulous work in the rhesus macaque model and the potential of MPLA+QS-21 as an adjuvant for HIV vaccine-induced humoral immunity. We acknowledge the need to provide clearer evidence of the functional relevance of GC Tfh1 in IgG1 class-switching and B cell memory responses. We will attempt to address this concern in our revisions.

      Recommendations for Authors:

      Reviewer #1:

      1. Is the proportion of GC-Tfh1 within GC-Tfh significantly increased in MPLA vs CAF01? The balance between Tfh1 and Tfh17 data is shown in 4C but appears quite a modest difference. Additionally, it excludes the majority of GC-Tfh since it only considers CCR6 and CXCR3 expressing cells.

      Response. We have now included a comparison of the relative proportions of GC Tfh cells expressing CCR6 and CXCR3, as well as those lacking these markers. Our data now demonstrate an increased presence of Tfh1 within the GC-Tfh population when MPLA is employed at P1w2, as depicted in Figure 4D.

      1. Is there any relationship between GC-Tfh17, 1/17 and non Th1/17 GC-Tfh and antibody levels? In Figure 5C only GC Tfh1 is examined making it impossible to judge if this is specific to GC-Tfh1 or a general relationship between higher total GC-Tfh and antibodies.

      Response. In our revised description of the results, we have mentioned that GC Tfh frequencies correlated with antibody levels (r = 0.6, p < 0.05). However, it is important to note that this correlation was specific to the GC Tfh1 subset and was not observed with other subsets.

      Other points:

      1. The authors make a number of statements that rather exaggerate differences such as stating in the abstract that CAF01 induces Tfh1/17 while MPLA predominantly induces Tfh1. As shown in Figure 4C the majority of CCR6-CXCR3- GC-Tfh induced by CAF01 are GC-Tfh1 i.e. both formulations predominantly induce GC-Tfh1. Also, it is difficult to judge since the data is never provided but the predominant group of GC-Tfh appears to be CCR6-CXCR3- in both cases.

      Response. We acknowledge the need for greater precision in our descriptions. In response, we have addressed this concern by providing the frequencies of CCR6-CXCR3- GC Tfh cells in Figure 4D. We have also included a comparison of the relative frequencies across the adjuvant groups in the Results section (Lines 331-338).

      1. The authors use the term peripheral Tfh (pTfh), it may be better to use the more common term circulating Tfh (cTfh) to avoid confusion with T peripheral helper cells (Tph).

      Response. We appreciate the reviewer's suggestion to use the more commonly accepted term "circulating Tfh (cTfh)" instead of "peripheral Tfh (pTfh)." We have incorporated this change into our manuscript to ensure clarity and avoid potential confusion with "peripheral helper cells (Tph)."

      1. Some further labelling of the pie chart in Figure 1G to at least specify larger groups such as Tfh2, Tfh17, Tfh1/17 would be helpful.

      Response. We have incorporated the suggestion and identified cTfh2, cTfh17, and cTfh2/17 cells. We additionally now state in the legend that overlapping pie arcs correspond to specific polarized Tfh subsets denoted by arc color.

      1. A gating example of the CXCR3, CCR6, CCR4 patterns in the GC Tfh would be helpful. "up to 25% of GC Tfh cells expressed CCR6" I think it is better to state the average here since 25% appears an outlier.

      Response. We have now included a gating example of chemokine receptor expression patterns in the GC Tfh. Additionally, we have revised the statement to mention the median (7%) of GC Tfh cells expressing CCR6 instead of specifying the upper limit.

      1. Figure 1I, does this graph exclude triple negative cells? It's not clear from the figure legend but the numbers do not seem to add up with the graphical proportions shown in figure 1H.

      Response. We have clarified in the Results section, the figure, and the figure legend that the Boolean analysis is based on cells expressing either CXCR3 or CCR6, which explains the exclusion of triple-negative cells.

      1. Figure 3C. Some label should be added to make clear which violins are from the CD95- and CD95+ groups. There may be too much data in this panel for p values to be legible. Either less graphs or more space may be needed.

      Response. We have updated the Y axis labels in the figure to state that the violin plots show the differences in gene expression between CD95+ CD4 T cells and CD95- CD4 T cells (naive).

      1. Figure 4B. Numbers attached to the gates (1, 17 etc) should be more clearly labeled Tfh1, Tfh17 etc since normally they might be expected to be gate percentages in this format. Gate percentages should also be added.

      Response. We have clearly labeled the subsets as "Tfh1" and "Tfh17," making it easier for readers to interpret the figure. Additionally, we have included gate percentages in the flow plot. Furthermore, the percentages of GC Tfh subsets are now depicted in Figure 4D.

      1. Overlarge and indistinct datapoint symbols are often a problem e.g. Figure 4G most of the CAF01 datapoints are merged into a single blob with no indication of where one point ends or begins. Supplementary figure 5E. Datapoint sizes are large to the extent that the lines are difficult to see. Lines indicating central tendency are often lost.

      Response. We have reworked the graphs (including 4G, now 4I) to ensure clarity.

      1. Generally greater care is needed with graph layout e.g. the B indicating figure 6B is on the graph of figure 6A.

      Response. We have made the necessary adjustment to ensure that the letter "B" correctly corresponds to the graph in Figure 6B.

      1. Figure 6J, the text seems to indicate "higher avidity with MPLA against autologous Env including V1V2 loops." However, the graph seems to indicate lower avidity for V1V2 loops?

      Response. We appreciate the careful observation. We have rectified this by updating the description in the results section to accurately reflect the graph, which shows higher avidity for V1V2 loops with CAF01.

      2. Figure 6A. The authors state that significantly higher IgG1 was induced but Figure 6A seems to be the only graph lacking an indication of statistical significance.

      Response. We have made the necessary adjustment to ensure that a significance symbol is shown in Figure 6A.

      1. Brackets indicating significance are often unclear. e.g. in Figure 4B MPLA graph there are three groups and a single multipoint bracket with a single result making it unclear which groups are being compared.

      Response. We have added clarification to the legend. It now states that the temporal comparisons in GC Tfh subsets for each vaccine group are made in relation to frequencies at baseline. This revision provides a clear reference point for the significance comparisons and ensures that readers can easily understand which groups are being compared.

      Reviewer #2:

      Overall, the manuscript is well-written and addresses an important issue. However, further investigation is warranted to understand how the MPLA+QS-21-induced GC TFH1 cells influence the memory B cell response. This manuscript only shows the correlation between GC TFH1 and antibody responses. If the authors could explain the adjuvant preference in memory B cell responses, this manuscript would be a stronger candidate for publication.

      1. This reviewer recommends that the author provide more evidence to indicate the functional relevance of GC TFH1 in IgG1 class-switch and B cell memory responses. Some evidence supports that IFN-γ controls the antigen-specific IgG1 responses in humans, but it is still controversial. The author also suggests the involvement of IL-21, but this is also an open question even in the human system. This is also the case in the memory responses. There is no direct link between IFN-γ and memory B cell responses in the human system. The authors need more evidence of how GC TFH1 cell development has more advantages in IgG1 and memory responses than GC TFH1 /17 cells. I believe an antibody blockade of cytokines would be a possible strategy to prove these questions.

      Response. We appreciate the reviewer's valuable suggestion to provide more evidence regarding the functional relevance of GC Tfh1 cells in IgG1 class-switch and B cell memory responses. It is indeed important to establish a direct link between GC Tfh1 cells and these responses, particularly in the context of cytokine skewing. The suggestion of antibody blockade studies to mechanistically link the modulation of the inflammatory milieu to Tfh differentiation and subsequent antibody functions is important. However, we must acknowledge that these studies are currently beyond the scope of our work. We have included this as a limitation in our study, recognizing the need for further studies to address these important questions.

      1. In Fig.5, the authors use different scales to indicate the IgG antibody titer. A shows the log scale, while B shows the linear scale. Moreover, the differences are minimal, even though the authors indicated a significant difference. I am not sure this difference is meaningful.

      Response. To clarify, we used a log scale in Figure 5A to demonstrate temporal changes over the course of vaccination. In Figure 5B, where we are comparing differences across vaccine regimens at week 30, a linear scale was deemed more appropriate, as it allows for a clear representation of the approximately two-fold difference observed. We fully acknowledge that to establish the biological significance of the observed difference, challenge studies will be essential.

    1. Author Response

      Reviewer #1 (Public Review):

      This article proposes a new statistical approach to identify which of several experimenter-defined strategies best describes a biological agent's decisions when such strategies are not fully observable by choices made in a given trial. The statistical approach is described as Bayesian but can be understood instead as computing a smoothed running average (with decay) of the strategies' success at matching choices, with a winner-take-all inference across the rules. The article tests the validity of this statistical approach by applying it to both simulated agents and real data sets in mice and humans. It focuses on dynamically changing environments, where the strategy best describing a biological agent may change rapidly.

      The paper asks an important question, and the analysis is well conducted; the paper is well-written and easy to follow. However, there are several concerns that limit the strength of the contribution. Major concerns include the framing of the method, considerations around the strategy space, limitations in how useful the technique may be, and missing details in analyses.

      Reviewer #2 (Public Review):

      In this study, the goal is to leverage the power of Bayesian inference to estimate online the probability that any given arbitrarily chosen strategy is being used by the decision-maker. By computing the trial-by-trial MAP and variance of the posterior distribution for each candidate strategy, the authors can not only see which strategy is primarily being used at every given time during the task and when strategy changes occur but also detect when the target rule of a learning task becomes the front-running strategy, i.e., when successful learning occurs.

      Strengths:

      1) The proposed approach adds to recent methods for capturing the dynamics of decision-making at finer temporal resolution (trials) (Roy et al., 2021; Ashwood et al., 2022) but it is novel and differs from these in that it is suited especially well for analyzing when learning occurs, or when a rule switches and learning must recommence, and it does not necessitate large numbers of trials.

      2) The manuscript starts with a validation of the approach using synthetic data and then is applied to datasets of trial-based two-alternative forced choice tasks ranging from rodent to non-human primate to human, providing solid evidence of its utility.

      3) Compared to classic procedures for identifying when an animal has learned a contingency which typically needs to be conservative in favor of better accuracy, this method retrieves signs of learning happening earlier (~30 trials earlier on average). This is achieved by identifying the moment (trial) when the posterior probability of the correct "target" rule surpasses the probability of all other strategies. Having greater temporal precision in detecting when learning happens may have a very significant impact on studies of the neural mechanisms of learning.

      4) This approach seems amenable to testing many different strategies depending on the purpose of the analysis. In the manuscript, the authors test target versus non-target strategies (correct versus incorrect) and also in another version of the analysis, they test what they call "exploratory" strategies.

      5) One of the main appeals of this method is its apparent computational simplicity. It necessitates only updating on every trial the parameters of a beta distribution (prior distribution for a given strategy) with the evidence that the behavior on trial was either consistent or inconsistent with the strategy. Two scalars, the mode of the posterior (MAP) and the inverse of the variance, are all that are required for identifying the decision criterion (highest MAP and if tied lowest variance) and the learning criterion (first trial where MAP for target strategy is higher than chance).
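
      The learning criterion summarised in point 5 could be sketched as below. This is an illustrative reading of that one-line description only: the function name `first_trial_above_chance`, the chance level of 0.5 and the example values are assumptions, and the criterion used in the paper may carry further conditions.

```python
from typing import Optional, Sequence


def first_trial_above_chance(
    map_estimates: Sequence[float], chance: float = 0.5
) -> Optional[int]:
    """Return the 1-based index of the first trial whose MAP exceeds chance.

    Returns None if the MAP never rises above chance.
    """
    for trial, value in enumerate(map_estimates, start=1):
        if value > chance:
            return trial
    return None


# Hypothetical MAP trace for the target strategy over five trials.
target_map = [0.5, 0.4, 0.5, 0.55, 0.7]
learning_trial = first_trial_above_chance(target_map)
```

      In this made-up trace the MAP first exceeds 0.5 on the fourth trial, so that trial would be flagged under this simple reading.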

      Weaknesses:

      1) It seems like a limitation of this approach is that the candidate strategies to arbitrate between must be known ex-ante. It is not clear how this approach could be applied to uncover latent strategies that are not mixtures of the strategies selected.

      2) Different strategies may be indistinguishable from each other and thus it may not be possible to distinguish between them. Similarly, the fact that two strategies seem to be competing for the highest MAP doesn't necessarily mean that those are correct strategies and perhaps interchangeable as the manuscript seems to suggest.

      3) The decay parameter is a necessary component to make the strategy selection non-stationary and accommodate data sets where the rules are changing throughout the task. However, the choice of the decay parameter value bounds does not seem very principled. Having this parameter as a free-parameter adds a flexibility that seems to have significant effects on when the strategy switch is detected and how stable the detected switch is.

      4) This method is a useful approach for arbitrating between strategies and describing the behavior with a temporal precision that may prove important for studies attempting to tie these precise events to changes in neural activity. However, it seems limited in its explanatory power. In its current form, this method does not provide a prediction of the probability to transition from one strategy to another. And, because the MAP of different strategies may be close at any given moment, it is hard to imagine using this approach to tease out the different "mental states" that represent each strategy being at play.

      The reviewers’ detailed comments, not shared here, helped us considerably to improve the paper, and we thank the reviewers for their time here. We are unsure of the merits of sharing public reviews of a paper that has now changed considerably from the version that these reviews address. Nonetheless we shall address some key points of potential misunderstanding here.

      “The statistical approach is described as Bayesian but can be understood instead as computing a smoothed running average (with decay) of the strategies' success at matching choices, with a winner-take-all inference across the rules.”

      This is inaccurate. The algorithm performs sequential Bayesian updates on the evidence for and against the use of each strategy considered; for a given strategy i, its output at each trial is a fully parameterised posterior distribution over the probability of that strategy being used by the subject.
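
      A sequential update of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the paper: the uniform Beta(1, 1) prior, the class name `StrategyPosterior`, the way the decay `gamma` discounts past evidence, and the value of `gamma` are all choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class StrategyPosterior:
    """Beta posterior over the probability that one strategy is in use."""

    gamma: float = 0.9      # assumed decay rate; 1.0 means no forgetting
    alpha0: float = 1.0     # assumed uniform prior, Beta(1, 1)
    beta0: float = 1.0
    successes: float = 0.0  # decayed count of trials consistent with the strategy
    failures: float = 0.0   # decayed count of trials inconsistent with it

    def update(self, consistent: bool) -> None:
        """Discount past evidence, then add the evidence from this trial."""
        self.successes = self.gamma * self.successes + (1.0 if consistent else 0.0)
        self.failures = self.gamma * self.failures + (0.0 if consistent else 1.0)

    @property
    def alpha(self) -> float:
        """First shape parameter of the current posterior."""
        return self.alpha0 + self.successes

    @property
    def beta(self) -> float:
        """Second shape parameter of the current posterior."""
        return self.beta0 + self.failures


# Three trials: consistent, consistent, inconsistent with the strategy.
posterior = StrategyPosterior(gamma=0.5)
for consistent in [True, True, False]:
    posterior.update(consistent)
```

      After every trial the pair (`alpha`, `beta`) fully specifies the posterior, so any summary of it can be computed afterwards without rerunning the update.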

      We are careful in the paper to separate the algorithm’s output from our further use of that output. To plot and analyse the output we often make use of the maximum a posteriori (MAP) estimate from each posterior. Other choices are of course possible, and we discuss them in the text.

      In one set of simulations we quantify the results using a decision rule that chooses the strategy with the highest MAP - this is presumably the “winner-takes-all inference” in the quoted text. We do not use this anywhere else in the paper, including the analyses of the 4 datasets, and so do not consider it as part of our method, but one possible use of the output of the algorithm.
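
      To make the distinction concrete, the sketch below takes posterior parameters as given and derives the summaries separately: the MAP, the precision (inverse variance), and a decision rule that picks the highest MAP and breaks ties by lowest variance, as described in the review. The function names and the example Beta parameters are hypothetical, and the formulas assume both shape parameters are at least 1.

```python
from typing import Dict, Tuple


def beta_map(alpha: float, beta: float) -> float:
    """Mode of Beta(alpha, beta), assuming alpha >= 1 and beta >= 1."""
    if alpha + beta == 2.0:
        return 0.5  # uniform posterior has no unique mode; 0.5 is a convention here
    return (alpha - 1.0) / (alpha + beta - 2.0)


def beta_precision(alpha: float, beta: float) -> float:
    """Inverse of the variance of Beta(alpha, beta)."""
    total = alpha + beta
    variance = (alpha * beta) / (total ** 2 * (total + 1.0))
    return 1.0 / variance


def front_runner(posteriors: Dict[str, Tuple[float, float]]) -> str:
    """Pick the strategy with the highest MAP; break ties by highest precision."""
    return max(
        posteriors,
        key=lambda name: (beta_map(*posteriors[name]), beta_precision(*posteriors[name])),
    )


# Hypothetical posteriors (alpha, beta) for three candidate strategies.
posteriors = {
    "go left": (4.0, 2.0),
    "win-stay": (7.0, 3.0),
    "lose-shift": (2.0, 4.0),
}
chosen = front_runner(posteriors)
```

      Here "go left" and "win-stay" share a MAP of 0.75, and the tie goes to "win-stay" because its posterior is narrower. The decision rule is a separate layer that only reads the posteriors, so it can be swapped out or omitted.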

      “Major concerns include the framing of the method, considerations around the strategy space, limitations in how useful the technique may be, and missing details in analyses”

      Our goal for this paper was to develop a computationally lightweight, trial-resolution, Bayesian approach to tracking the probability of user-specified strategies, so that we can capture the observer’s evidence for learning or for the features driving exploratory choice (e.g. whether subjects are responding to losses or wins; are they responding to cues or choice etc). The above quote reflects their detailed review comments, where we felt this reviewer wanted a solution to a different problem, that of a parameterised latent model of strategy use: while a perfectly valid research goal, this was not what we addressed here.

      “1) It seems like a limitation of this approach is that the candidate strategies to arbitrate between must be known ex-ante. It is not clear how this approach could be applied to uncover latent strategies that are not mixtures of the strategies selected.”

      The problem of knowing which strategies to analyse in advance only applies when running our algorithm in real-time. The fact that it could be run in real-time on modest computing hardware is to us one of its strengths, so we consider this a good problem to have.

      As noted above, rather than determine latent strategies, our goal was to build an observer model that allows users to specify whatever strategies they want in order to answer their scientific question(s) about their data: for example, to define when a particular rule has been learnt, or to look for changes in response to particular features of the environment, such as a cue, or to a drug treatment or other intervention.

      “2) Different strategies may be indistinguishable from each other and thus it may not be possible to distinguish between them. Similarly, the fact that two strategies seem to be competing for the highest MAP doesn't necessarily mean that those are correct strategies and perhaps interchangeable as the manuscript seems to suggest.”

      As noted above, this is an observer model, and it is thus necessarily true that there are strategies that the observer does not have sufficient evidence to distinguish. For example, a subject who continually chooses the rewarded left-hand lever will be doing both a strategy of “go left” and of “win-stay” in response to their choice. The inability to distinguish strategies is a property of the data, not of the algorithm. Also as noted above, we do not here consider the competition between strategies.
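
      The lever example can be written out as a small sketch. The trial encoding, the function names and the treatment of trials where a strategy's precondition is not met (recorded as `None`, meaning no evidence either way) are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Trial = Tuple[str, bool]  # (choice, rewarded)


def go_left(trials: List[Trial]) -> List[Optional[bool]]:
    """Evidence for "go left": True whenever the left lever was chosen."""
    return [choice == "left" for choice, _ in trials]


def win_stay(trials: List[Trial]) -> List[Optional[bool]]:
    """Evidence for "win-stay": after a rewarded trial, was the choice repeated?"""
    # The first trial has no predecessor, so it carries no evidence.
    evidence: List[Optional[bool]] = [None] if trials else []
    for (prev_choice, prev_rewarded), (choice, _) in zip(trials, trials[1:]):
        # Only trials that follow a win are informative about win-stay.
        evidence.append(choice == prev_choice if prev_rewarded else None)
    return evidence


# A subject who always presses the rewarded left-hand lever.
session: List[Trial] = [("left", True)] * 6
left_evidence = go_left(session)
stay_evidence = win_stay(session)
```

      From the second trial onward the two evidence sequences are identical, so any observer fed only these choices and outcomes would update both strategies in the same way.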

      “3) The decay parameter is a necessary component to make the strategy selection non-stationary and accommodate data sets where the rules are changing throughout the task. However, the choice of the decay parameter value bounds does not seem very principled. Having this parameter as a free-parameter adds a flexibility that seems to have significant effects on when the strategy switch is detected and how stable the detected switch is.”

      The revised manuscript draws together the existing simulations and analysis of the method to directly address this point, showing that there is a principled range of the decay parameter in which the algorithm should operate. The Discussion also points out that this free parameter is no different from the one in any frequentist approach to strategy analysis, which must choose some time window over which to compute the frequentist probability.
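
      One way to see the parallel with a time window is the sketch below. It assumes, for illustration only, that a trial k steps in the past is weighted by gamma**k and that the prior is uniform, so the MAP reduces to the decayed fraction of consistent trials; the function names and the example session are hypothetical.

```python
from typing import Iterable


def effective_window(gamma: float) -> float:
    """Limiting total weight of past trials when trial k steps back weighs gamma**k."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return 1.0 / (1.0 - gamma)


def decayed_map(evidence: Iterable[bool], gamma: float) -> float:
    """MAP under an assumed uniform prior after decayed sequential updates."""
    successes = failures = 0.0
    for consistent in evidence:
        successes = gamma * successes + (1.0 if consistent else 0.0)
        failures = gamma * failures + (0.0 if consistent else 1.0)
    total = successes + failures
    return 0.5 if total == 0.0 else successes / total


# Hypothetical session: a strategy is followed for 50 trials, then abandoned for 10.
evidence = [True] * 50 + [False] * 10
map_without_decay = decayed_map(evidence, gamma=1.0)
map_with_decay = decayed_map(evidence, gamma=0.9)
```

      Under these assumptions a decay of 0.9 behaves roughly like a window of about ten trials: ten trials after the switch the decayed MAP has fallen below 0.5, whereas without decay it stays at 50/60.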

      “4) This method is a useful approach for arbitrating between strategies and describing the behavior with a temporal precision that may prove important for studies attempting to tie these precise events to changes in neural activity. However, it seems limited in its explanatory power. In its current form, this method does not provide a prediction of the probability to transition from one strategy to another. And, because the MAP of different strategies may be close at any given moment, it is hard to imagine using this approach to tease out the different "mental states" that represent each strategy being at play.”

      As noted above, this is an observer model and does not intend to infer mental states. The goal is to make accurate statements about observable behaviour. We agree that an interesting extension to this approach would be to model the transitions between strategies, and had already outlined this in the Discussion.

    1. Author Response

      The following is the authors’ response to the original reviews.

      REVIEWER 1:

      Reviewer 1 stated: “The authors have provided strong evidence that high levels of auxin exposure perturb feeding behavior, survival rates, lipid metabolism, and gene expression patterns, providing a cautionary note for the field in using this technology.” They also concluded that “overall, the experiments were suitably designed with appropriate sample size and data analysis methods.”

      Reviewer 1 provided the following recommendations for improvement, which are addressed below:

      Point 1: “Although authors showed that auxin causes gene expression changes including the possible alteration of Gal4 expression levels, no cell-type-specific data is provided. It would be informative to the Drosophila field if the authors could examine major Gal4 drivers in their expression levels, such as the ones used in studying metabolism and oogenesis.”

      We agree with the reviewer that cell-type-specific Gal4 expression should be thoroughly analyzed by scientists in the community wishing to use the current auxin-inducible gene expression system (AGES) in their studies; however, those analyses are beyond the scope of our manuscript. There are many tissues and cell types that are used to study metabolism and oogenesis (e.g., muscle, adipocytes, oenocytes, multiple cell types in the gut, multiple cell types in the ovary), and Gal4 expression patterns could be different depending on age, sex, and diet. It is therefore impossible for us to pinpoint one or two key tissues important for regulating lipid levels, and a comprehensive survey would require a significant investment of time. We believe that each researcher should thoroughly check the Gal4 expression pattern for their specific tissue of interest under their normal standard or altered food conditions. As this reviewer pointed out, our current study provides a cautionary note for the field in using this technology. Nevertheless, we have provided a reference to a recent micropub (Hawley et al; PMID: 37396791) which describes neuronal Gal4 expression patterns comparing the AGES and temporal and regional gene expression targeting (TARGET) systems and updated the text in lines 539-544 of the revised manuscript.

      Point 2: “Although the authors briefly mentioned aging research, feeding behavior, and lipid metabolism, RNA-seq data are provided only for short-term treatment (2 days). The ovary phenotype was examined with long-term treatment (15 days). It would be informative if the authors could also show other long-term treatment data.”

      We respectfully point out to the reviewer that a 5-day auxin feeding assay was provided in Figure S4H, which reproduces the data provided for the 2-day auxin treatment. In addition, the original AGES paper (McClure et al, PMID: 35363137) provided adult survival data that extended to 80 days. In our updated manuscript, we have provided data for a 10-day auxin treatment that also addresses Point #4 below regarding whether the decrease in lipid levels upon auxin feeding is reversible.

      Point 3: “The auxin used in this work is a more water-soluble version and at a high concentration (10 mM). In the C. elegans system, researchers are using a much lower concentration of auxin typically at 1 mM. Therefore, the discussion of their results in terms of potential impacts on other experimental systems should be done carefully. It would be helpful to know what impacts might be observed at a lower concentration of auxin. The recommendation would be that the authors add the 1 mM auxin data point to key elements of their analysis.”

      The concentration of 10 mM auxin used in our study is the recommended dose to use in Drosophila (see McClure et al) and has been used in at least one additional study (Hawley et al). We also would like to point out that other systems (e.g., C. elegans and mice) have many differences in physiology and therefore the concentration of auxin used to elicit a response is likely to be different (e.g., 71.4 mM final concentration is the recommended concentration used in mice; Macdonald et al; PMID: 35736539). We have merely suggested that researchers using auxin for protein degradation should carefully check whether lipid levels (or other physiological processes of interest) are altered upon auxin feeding (or soaking) alone compared to a 0 mM auxin control. The text in lines 467-470 has been altered to reflect this. In addition, the specific recommended dose for Drosophila is highlighted and referenced in multiple places (i.e., methods and results and discussion) throughout the updated text.

      Point 4: “Another related question is whether these detected changes are reversible or not after exposure to auxin at different concentrations. This would be informative for researchers to better design their temporally controlled experiments.”

      We thank the reviewer for this suggestion and have provided the data in Figure S4I. Briefly, we found that after a 5-day treatment of auxin, removal of auxin for an additional 5 days does not recover lipid levels to those of control animals never exposed to auxin.

      Point 5: “It would also be helpful to know whether spermatogenesis is affected or not.”

      Although it would be interesting to determine whether this developmental process is affected by auxin exposure, we believe that these analyses are beyond the scope of the current manuscript.

      Point 6: “A few other points include changing the nomenclature and validating some of the key genes shown in Figure 3 using quantitative RT-PCR experiments with the tissues where the affected genes are known to be expressed and functional.”

      We thank the reviewer for this suggestion. We have performed qRT-PCR analysis using whole-body samples, and these data are now provided in the new Figure S8. We used whole-body samples for the qRT-PCR analysis because it would be impossible to pinpoint the specific tissue in which the differentially regulated genes are required to elicit the response to auxin exposure. For example, according to FlyBase (flybase.org), GstE3 transcripts are moderately to highly expressed in 15 of the 23 cell types annotated by the Fly Cell Atlas project (Li et al; PMID: 35239393).

      REVIEWER 2:

      Reviewer 2 stated: “The authors provide evidence of several Auxin effects. Experiments are suitably designed with appropriate sample size and data analysis methods.”

      This reviewer expressed the following concerns, which are addressed below:

      Point 1: “The provided information is limited and not very helpful for many applications. For example, although authors briefly mentioned aging research, feeding behavior, and lipid data, RNA seq data are provided only for short-term (48 hours) treatment. Especially, since ovary phenotype was examined with long-term treatment (15 days), authors should also show other data for long-term treatment as well.”

      Please see our response to Point #2 of Reviewer 1 regarding long-term treatment experiments. Furthermore, although the ending timepoint for the ovarian analyses is 15 days, we also provide analysis at shorter time points (e.g., daily analysis for egg counts, 5 and 10 day timepoints for fixed sample analyses).

      Point 2: “Although the authors show that Auxin causes a change in gene expression patterns and suggests the possible alteration of Gal4 expression levels, no cell-type-specific data is provided. It would be informative if the authors could examine the expression level of major Gal4 drivers. Authors should discuss how severe these changes are by comparing them with other treatments or conditions, such as starvation or mutant data (ideally, comparing with reported data or their own data if any?).”

      Please see our response to Point #1 from Reviewer 1.

      REVIEWER 3:

      Reviewer 3 stated that they “found the study to be carefully done” and “this study will be of interest to researchers using the Drosophila system, especially those focusing on fatty acid metabolism or physiology.”

      Reviewer 3 also had the following minor points, which are addressed below:

      Point 1: “Auxin, actually 1-naphthaleneacetic acid here, which is a more water-soluble version of auxin (indole-3-acetic acid) is used at what I consider to be a high concentration: 10 mM. The problem I have is that the authors are discussing their results in terms of potential impacts on other experimental systems. At least for C. elegans, I think this is not a reasonable extension of the current dataset. In the C. elegans system, researchers are using 1 mM auxin. The authors note that their RNA-seq results suggest a xenobiotic response. Could this apparent xenobiotic response be due to a metabolic byproduct following auxin administration at high concentrations? Figure S1A shows that there is quite a robust transcriptional response at 1 mM auxin. It would be helpful to know what impacts might be observed at this lower concentration in which the transcriptional induction could be used in the context of biologically meaningful experiments. The recommendation would be that the authors add the 1 mM auxin data point to key elements of their analysis.”

      Regarding the comparisons to other model organisms, we refer to our response to Point #3 from Reviewer 1. We also point out that although there is a robust response to 1 mM auxin using the 3.1Lsp2-Gal4 driver, 1 mM is not sufficient for a robust response using additional driver lines in Drosophila (see Hawley et al). It is possible that the xenobiotic response is due to using the recommended dose of auxin (McClure et al).

      However, given the fact that researchers are currently using the 10 mM dose for experiments in Drosophila, we believe that the 10 mM transcription dataset is the most relevant. Nevertheless, we do agree that researchers who choose to use lower concentrations of auxin in the future should carefully look at whether any transcriptional induction alters physiological processes of interest.

      Point 2: “This reviewer was confused by the genetic nomenclature the authors use. The authors have chosen to use the designation 3.1Lsp2-Gal4 (3.1Lsp2-Gal4AID). I think this is potentially confusing because a reader might think that it is the Gal4 transcription factor that is the direct target of auxin- and TIR1-mediated protein degradation, as I initially did. Rather, it is the Gal80 repressor protein that is the direct target. The authors might consider a nomenclature that is more reflective of how this system works. It would also be helpful if the full genotypes of strains were included in each figure legend.”

      We apologize for the nomenclature confusion in our original submission. We have changed our “AID” nomenclature throughout the manuscript to “AGES,” which is the nomenclature used in McClure et al. We respectfully note that the traditional nomenclature for using the temperature-sensitive Gal80 system is Gal80ts or adding the “ts” superscript to the Gal4 line used (e.g., 3.1Lsp2ts).

      Point 3: “The RNA-seq dataset does not appear to be validated by RT-PCR experiments. The authors should consider validating some of the key genes shown in Figure 3 using quantitative RT-PCR experiments, potentially adding a 1 mM auxin data point.”

      Please see our response to Point #6 to Reviewer 1.

      REVIEWER 4:

      Reviewer 4 stated: “Overall, the experiments were well-designed and carefully executed. The results were quantified with appropriate statistical analyses. The paper was also well-written and the results were presented logically.”

      RECOMMENDATIONS FOR THE AUTHORS:

      We have further addressed reviewer recommendations below. Thank you again for your critique of our manuscript.

      REVIEWER 2:

      As I mentioned in my public review, long-term treatment data would be especially helpful. Examining changes in the expression level of major Gal4 lines is also informative.

      Please see our responses to Points #1 and #2 to Reviewer 1 in the “Public Reviews” section. Although examination of Gal4 expression patterns is extremely important, we believe that these analyses should be carefully performed on a case-by-case basis in the future for labs who wish to continue to use this methodology.

      REVIEWER 4:

      I feel addressing #2 would be a great addition to the current version, while #1 and #3 could be addressed in future studies or by researchers who are interested in these processes.

      Recommendation 1: “Both the metabolomics and transcriptome analyses were done using the whole animals, would it be more informative if these were done using specific tissue/organs such as the adult adipose tissue?”

      Please see our response to Points #1 and #6 to Reviewer 1 in the “Public Reviews” section.

      Recommendation 2: “Another related question is whether these detected changes are reversible or not after exposure to auxin? This would be informative for researchers to better design their temporally controlled experiments.”

      We thank the reviewer for this suggestion and the analysis for this experiment is now provided in Figure S4I.

      Recommendation 3: “Is spermatogenesis affected at all?”

      We respectfully point out that many processes in spermatogenesis (as well as other biological processes) are affected by feeding (e.g., starvation), and analyzing them with the required rigor would be extremely time consuming. We agree with Reviewer 4 that this is best examined on a case-by-case basis in the future.

    1. Author Response

      The following are our responses to the reviewers, to accompany the published preprint. We thank the reviewers for their encouraging and thoughtful comments. These are good points that we would like to comment on as follows:

      Reviewer 1:

      Some important and interesting data are missing. For example, whether the gene therapy can extend the life span of these mutants? The overall in vivo voiding function is missing. AAV9/HSPE2 expression in the bladder wall is not shown.

      A. Our study was not designed to determine whether gene therapy can improve life span of the Hpse2 mutant mice. We know that the mutant mice usually become ill after the first month of life and can die. However, we wanted to study the mice when they were generally well so that there would be no confounding effects on the bladder physiology caused by general ill health. Indeed, a recent study of Hpse2 inducible deletion in adult mice has shown evidence of exocrine pancreatic insufficiency (Kayal et al., PMID 37491420). We are currently exploring the status of the pancreas in our non-conditional juvenile Hpse2 mice, and whether gene transfer into the pancreas is possible.

      B. We strongly agree that in vivo voiding studies will be important in the future. We suggest that in vivo cystometry is the gold standard for this, but it is currently beyond the remit of this study.

      C. It is correct that in this paper we have focussed on gene transduction into the pelvic ganglia, because the evidence is mounting that this is a neurogenic disease. Our ex vivo physiological studies show predominantly neurogenic defects that are corrected by the gene therapy. A detailed study of the bladder body is an interesting idea, in terms of possible transgene expression and detailed histology, and is something we will pursue in future studies.

      Reviewer 2:

      Weaknesses include a lack of discussion of the basis for differences in carbachol sensitivity in Hpse2 mutant mice, limited discussion of bladder tissue morphology in Hpse2 mutant mice, some questions over the variability of the functional data, and a need for clarification on the presentation of statistical significance of functional data.

      A. Yes, it is interesting that untreated male mutant mice have an increased bladder body contraction to carbachol compared with WT males. In a previous paper (Manak et al., 2020) we performed quantitative western blots for the M2 and M3 receptors and found levels were similar in mutants to the WTs, thus the increased sensitivity probably lies post-receptor.

      B. A detailed study of the bladder body is an interesting idea, in terms of possible transgene expression and detailed histology, and is something we will pursue in future studies.

      C. We have reported in our physiology graphs what we find. We do find some variability, particularly at lower frequencies, but our conclusions depend on analyses of the whole curve, which depend on multiple frequencies and show the expected overall pattern of frequency-dependent relaxation.

      D. Thank you, the stats for Figure 8 will be corrected in the final version.

      Reviewer 3:

      Single-cell analysis of mutants versus control bladder, urethra including sphincter. This would be great also for the community.

      A. Yes, in future we are very interested in using a single cell sequencing approach to look at the mutant, WT and rescued pelvic ganglia. In relation to this, there is a recent proof-of-principle paper pre-print in WT mouse pelvic ganglia, which suggests this may be feasible (Sivori et al., 2023).

      Detailed tables showing data from each mouse examined.

      B. In theory, it would be very interesting to correlate the strength of human gene transduction into the pelvic ganglia, with, for example, the effect on a physiological parameter. However, in general we used different sets of mice for these techniques so at the present we don’t have this information.

      Use of measurements that are done in vivo (spot assay for example). This sounds relatively simple.

      C. We strongly agree that in vivo voiding studies will be important in the future. We suggest that in vivo cystometry is the gold standard for this, but it is currently beyond the remit of this study.

      Assessment of viral integration in tissues besides the liver (could be done by QPCR).

      D. This is an important point, and we suggest the pancreas is a particularly interesting target for future studies. A recent study of Hpse2 inducible deletion in adult mice has shown evidence of exocrine pancreatic insufficiency (Kayal et al., PMID 37491420). We are currently exploring the status of the pancreas in our non-conditional juvenile Hpse2 mice, and whether gene transfer into the pancreas is possible.

      Discuss subtypes of neurons that are present and targeted in the context of mutants and controls.

      E. The make-up of the pelvic ganglia in Hpse2 mutant mice is a fascinating question. Future analysis using scRNA-Seq may be the most effective way to answer this question and is a molecular approach we are looking to pursue in the future.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper reports the development of SCA-seq, a new method derived from PORE-C for simultaneously measuring chromatin accessibility, genome 3D and CpG DNA methylation. Most of the conclusions are supported by convincing data. SCA-seq has the potential to become a useful tool to the scientific communities to interrogate genome structure-function relationships.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work, Xie et al. developed SCA-seq, which is a multiOME mapping method that can obtain chromatin accessibility, methylation, and 3D genome information at the same time. SCA-seq first uses M.CviPI DNA methyltransferase to treat chromatin, then performs proximity ligation followed by long-read sequencing. This method is highly relevant to a few previously reported long read sequencing technologies. Specifically, NanoNome, SMAC-seq, and Fiber-seq have been reported to use m6A or GpC methyltransferase accessibility to map open chromatin, or open chromatin together with CpG methylation; Pore-C and MC-3C have been reported to use long read sequencing to map multiplex chromatin interactions, or together with CpG methylation. Therefore, as a combination of NanoNome/SMAC-seq/Fiber-seq and Pore-C/MC-3C, SCA-seq is one step forward. The authors tested SCA-seq in 293T cells and performed benchmark analyses testing the performance of SCA-seq in generating each data module (open chromatin and 3D genome). The QC metrics appear to be good and I am convinced that this is a valuable addition to the toolsets of multi-OMIC long-read sequencing mapping.

      The revised manuscript addressed most of my questions except my concern about Fig. S9. This figure is about a theory that a chromatin region can become open due to interaction with other regions, and the authors propose a mathematical model to compute such effects. I was concerned about the errors in the model of Fig. S9a, and I was also concerned about the lack of evidence or validation. In their responses, the authors admitted that they cannot provide biological evidence or validations but still chose to keep the figure and the text.

      The revised Fig. S9a now uses a symmetric genome interaction matrix as I suggested. But Figure S9a still has a lot of problems. Firstly, the diagonal of the matrix in Fig. S9a still has many 0's, which I asked about in my previous comments without an answer. The legend mentioned that the contacts were defined as 2, 0 or -2 but the revised Fig. S9a only shows 1, 0, or -1 values. Furthermore, Fig. S9b, 9c, and 9d all added a panel of CTCF+/- but there is no explanation in the text or figure legend about these newly added panels. Given many unaddressed problems, I would still suggest deleting this figure.

      In my opinion, this paper does not need Fig. S9 to support its major story. The model in this figure is independent of SCA-seq. I think it should be spun off as an independent paper if the authors can provide more convincing analysis or experiments. I understand eLife lets authors decide what to include in their paper. If the authors insist on including Fig. S9, I strongly suggest they should at least provide adequate explanation of all the figure panels. At this point, Fig. S9 is not solid and clearly has many errors. The readers should ignore this part.

      We appreciate the reviewer for raising these concerns regarding Fig. S9. After careful consideration, we have decided to address your concerns by deleting Fig. S9 and the corresponding text from the manuscript. We understand your point that the model presented in Fig. S9 is independent of SCA-seq and may require additional evidence and validation to be presented in a separate paper.

      We agree that it is important to maintain the integrity and accuracy of the manuscript, and we appreciate your feedback in helping us make this decision.

      Reviewer #2 (Public Review):

      In this manuscript, Xie et al presented a new method derived from PORE-C, SCA-seq, for simultaneously measuring chromatin accessibility, genome 3D and CpG DNA methylation. SCA-seq provides a useful tool to the scientific communities to interrogate the genome structure-function relationship.

      The revised manuscript has clarified almost all of the concerns raised in the previous round of review, though I still have two minor concerns:

      1. In fig 2a, there is no number presented in the Venn diagram (although the left panel indeed showed the numbers of the different categories, including the numbers in the right panel would be more straightforward).

      We thank the reviewer for pointing out the need for clarification in the Venn diagram in Fig. 2a. We have added the numbers to the Venn diagram.

      1. The authors clarified the discrepancy between sfig 7a and sfig 7g. However, the remaining question is, why is there a big difference in the percentage of the cardinality count of concatemers of the different groups between the chr7 and the whole genome?

      We apologize for the confusion regarding the difference in the percentage of the cardinality count of concatemers between chr7 and the whole genome in figures S7a and S7g. The difference arises because the chr7 cardinality count only considers the intra-chromosome segments that are adjacent to each other on a SCA-seq concatemer, while the whole genome cardinality count includes both intra-chromosome and inter-chromosome segments.

      In the case of a SCA-seq concatemer that contains both intra-chromosome junctions and inter-chromosome junctions, the whole genome cardinality count will be greater than the intra-chromosome cardinality count. This explains the difference in the percentages between chr7 and the whole genome in figures S7a and S7g.

      To better clarify the definition of intra-chromosome cardinality, we have added an illustrative graph in figure S7a. In the updated figure S7a, the given exemplary SCA-seq concatemer has a whole genome cardinality of 4 and a chr7 intra-chromosome cardinality of 3.
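The distinction between the two counts can be illustrated with a minimal Python sketch. It assumes that a concatemer is represented as an ordered list of aligned segments tagged by chromosome, and it reads the intra-chromosome cardinality simply as the number of segments that map to the chosen chromosome. The `Segment` type, the function names, and the coordinates are hypothetical and are not taken from the SCA-seq pipeline.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One aligned fragment of a concatemer (hypothetical representation)."""
    chrom: str
    start: int
    end: int


def whole_genome_cardinality(concatemer: List[Segment]) -> int:
    """Count every segment on the concatemer, whatever its chromosome."""
    return len(concatemer)


def intra_chromosome_cardinality(concatemer: List[Segment], chrom: str) -> int:
    """Count only the segments that map to the given chromosome (assumed reading)."""
    return sum(1 for seg in concatemer if seg.chrom == chrom)


# Illustrative concatemer: three chr7 segments followed by one chr2 segment.
example = [
    Segment("chr7", 1_000, 1_500),
    Segment("chr7", 52_000, 52_600),
    Segment("chr7", 130_000, 130_700),
    Segment("chr2", 9_000, 9_400),
]

print(whole_genome_cardinality(example))              # all segments
print(intra_chromosome_cardinality(example, "chr7"))  # chr7 segments only
```

Under these assumptions, any concatemer that carries at least one inter-chromosome junction has a whole-genome cardinality larger than its chr7 count, which is the direction of the difference described above.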

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study reports investigation of the dynamics of PKA at the single-cell level in vitro and in epithelia in vivo. Using different fluorescent biosensors and optogenetic actuators, the authors dissect the signaling pathway responsible for PKA waves, finding that PKA activation is a consequence of PGE2 release, which in turn is triggered by calcium pulses, requiring high ERK activity. The evidence supporting the claims is solid. At this stage the work is still partly descriptive in nature, and additional measurements would increase the strength of mechanistic insights and physiological relevance.

      We are deeply grateful to Dr. Alejandro San Martín, Dr. Jonathan Cooper, and the reviewers. Each comment is valuable and reasonable, and we will revise our paper as far as possible.

      Below, we describe how we address each of the reviewers’ comments in turn.

      Reviewer #1 (Recommendations For The Authors):

      1. Even though the phenomenon of PGE2 signal propagation is elegantly demonstrated and well described, the whole paper is mostly of descriptive nature - the PGE2 signal is propagated via intercellular communication and requires Ca transients as well as MAPK activity, however function of these RSPAs in dense epithelium is not taken into consideration. What is the function of these RSPAs in cellular crowding? - Does it promote cell survival or initiate apoptosis? Does it feed into epithelial reorganization during cellular crowding? Still something else? The authors discuss possible roles of this phenomenon in cell competition context, but show no experimental or statistical efforts to answer this question. I believe some additional analysis or simple experiment would help to shed some light on the functional aspect of RSPAs and increase the importance of all the elegant demonstrations and precise experimental setups that the manuscript is rich of. Monolayer experiments using some perturbations that challenge the steady state of epithelial homeostasis - drug treatments/ serum deprivation/ osmotic stress/ combined with live cell imaging and statistical methods that take into account local cell density might provide important answers to these questions. The authors could consider following some of these ideas to improve the overall value of the manuscript.

      We thank the reviewer for this comment. Although we have tried intensively to identify the physiological relevance of RSPA, we have not been able to detect its function at present.

      In MDCK cells, treatment with NSAIDs, which abolishes RSPA, did not affect cell growth, ERK wave propagation during collective migration, migration velocity, cell survival, or apoptosis. In mouse epidermis, the frequency of RSPA was NOT affected by inflammation or collective cell migration, evoked by TPA treatment and wounding, respectively.

      Notably, RSPA also occurs in the normal epidermis, implying its relevance in homeostasis. However, at the current stage, we believe that the PGE2 dynamics and its regulation mechanism in the normal epidermis would be worth reporting to researchers in the field.

      1. In the line 82-84 the authors claim: "We found that the pattern of cAMP concentration change is very similar to the activity change of PKA, indicating that a Gs protein-coupled receptor (GsPCR) mediates RSPA". In our opinion, this conclusion is not well-supported by the results. The authors should at least show that some measurements of the two patterns show correlation. Are the patterns of cAMP of the same size as the pattern of PKA? Do they have the same size depending on cell density? Do they occur at the same frequency as the PKA patterns, depending on the cell density? Do they have an all or nothing activation as PKA or their activation is shading with the distance from the source?

      We have modified the text (line 85):

      “Although the increment of the FRET ratio was not so remarkable as that of Booster-PKA, we found that the pattern of cAMP concentration change is very similar to the activity change of PKA, indicating that a Gs protein-coupled receptor (GsPCR) mediates RSPA. This discrepancy may be partially explained by the difference in the dynamic ranges for cAMP signaling in each FRET biosensor (Watabe2020).”

      1. In general, the absolute radius of the waves is not a good measurement for single-cell biology studies, especially when comparing different densities or in vivo vs in vitro experiments. We suggest the authors add the measurement of the number of the cells involved in the waves (or the radius expressed in number of cells).

      We appreciate the reviewer’s comment. We have reanalyzed our results to report the number of cells involved, as shown in Fig. 2E, which should be easier for readers to understand.
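As a rough illustration of how a radius relates to a cell count, the following Python sketch converts a wave radius into an approximate number of cells, assuming a uniform cell density across a circular area. This is not the procedure used for Fig. 2E; the function names and the 50 µm radius are hypothetical, and the density of 2×10⁶ cells cm⁻² is the basal-layer value quoted in the Reviewer #3 comments.

```python
import math


def cells_within_radius(radius_um: float, density_cells_per_cm2: float) -> float:
    """Approximate number of cells inside a circle, assuming uniform density."""
    radius_cm = radius_um * 1e-4  # 1 µm = 1e-4 cm
    return density_cells_per_cm2 * math.pi * radius_cm ** 2


def radius_in_cells(radius_um: float, density_cells_per_cm2: float) -> float:
    """Express a radius in units of the mean cell-to-cell spacing."""
    spacing_um = 1e4 / math.sqrt(density_cells_per_cm2)  # side of the mean cell footprint
    return radius_um / spacing_um


# Illustrative values only: a 50 µm radius at 2e6 cells per cm^2.
print(cells_within_radius(50.0, 2e6))
print(radius_in_cells(50.0, 2e6))
```

Because the cell count scales with both the density and the square of the radius, the same absolute radius corresponds to very different numbers of cells in sparse and dense monolayers, which is why a cell-based measure is easier to compare across conditions.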

      1. In 6D, the authors should also show the single-cell trajectories to understand better the correlation between PKA and ERK peaks. Is the greater variability in ERK activity ratio due to different peak times or different ERK activity levels in different cells? The authors should show the variability in both time and intensity.

      We have added a few representative results as Fig. S4.

      1. In lines 130-132, the authors write, "This observation indicates that the amount of PGE2 secretion is predetermined and that there is a threshold of the cytoplasmic calcium concentration for the triggered PGE2 secretion". How could the author exclude that the amount of PGE2 is not regulated in its intensity as well? For sure, there is a threshold effect regarding calcium, but this doesn't mean that PGE2 secretion can be further regulated, e.g. by further increasing calcium concentration or by other mechanisms.

      We agree with the reviewer’s comment. We have modified the text.

      1. The manuscript shows that not all calcium transients are followed by RSPAs. Does the local cell density/crowding increase the probability of overlap between calcium transients and RSPAs?

      We appreciate the reviewer’s comment. We had also hypothesized this model. However, we did not see the correlation that the reviewer describes. At present, our data indicate that the increase in RSPA frequency at high density is partially caused by an increase in calcium transients.

      Reviewer #2 (Recommendations For The Authors):

      1. The work is hardly conclusive as to the actual biological significance of the phenomenon. It would be interesting to know more under which physiological and pathological conditions PGE2 triggers such radial PKA activity changes. It is not well explained in which tissues and organs and under what conditions this type of cell-to-cell communication could be particularly important.

      The greatest weakness of the study seems to be that the biological significance of the phenomenon is not clearly clarified. Although it can be deduced that PKA activation has many implications for cell signaling and metabolism, the work lacks the actual link to physiological or pathological significance.

      We deeply appreciate the reviewer’s comment. As in our response to Reviewer #1, although we have tried intensively to identify the physiological relevance of RSPA, we have not been able to detect its function.

      On the other hand, we believe that the PGE2 dynamics and its regulation mechanism in the normal epidermis would be worth reporting to researchers in the field.

      1. The authors do not explain further why in certain cells of the cell clusters Ca2+ signals occur spontaneously and thus trigger the phenomenon. What triggers these Ca2+ changes? And why could this be linked to certain cell functions and functional changes?

      At this moment, we do not have a clear answer or model for the comment although the calcium transients have been reported in the epidermis (https://doi.org/10.1038/s41598-018-24899-7). Further studies are needed and we will pursue this issue as a next project.

      1. What explains the radius and the time span of the radial signal continuation? To what extent are these factors also related to the degradation of PGE2? The work could be stronger if such questions and their answers would be experimentally integrated and discussed.

      We agree with the reviewer’s comment. Although we have studied this point intensively, we have omitted the results because of their complexity. In HeLa cells, but not MDCK cells, we have demonstrated the meaning of the radius of RSPA (https://pubmed.ncbi.nlm.nih.gov/37813623/).

      1. The authors could consider whether they could investigate the subcellular translocation of cPLA2 in correlation with cytosolic Ca2+ signals using GFP technology and high-resolution fluorescence microscopy with their cell model.

      We did try to monitor cPLA2 translocation using GFP-tagged cPLA2. However, translocation of GFP-cPLA2 was detected only when the cells were stimulated with a calcium ionophore. At this point, we have concluded that quantitative analysis of cPLA2 translocation would be difficult.

      Reviewer #3 (Recommendations For The Authors):

      1. "The cell density in the basal layer is approximately 2×10⁶ cells cm⁻², which is markedly higher than that in MDCK cells (Fig. 2D). It is not clear whether this may be related to the lower frequency (~300 cm⁻² h⁻¹) and smaller radius of RSPA in the basal layer cells compared to MDCK cells (Fig. 2E)." Wasn't the relationship with cell density the opposite, higher density higher frequency? Isn't then this result contradicting the "cell density rule" that the authors argue is there in the in vitro system? The authors need to revise their interpretation of the data obtained.

      We agree with the reviewer’s comment. Currently, we do not find the "cell density rule" in mouse epidermis. It would be difficult to identify common rules between mouse epidermis and MDCK cells. However, although the comparison is descriptive, we believe it is worth comparing the epidermis data with the MDCK results at this stage.

      1. Similarly, the authors over conclude on the explanation of lack of change in the size of RSPA size when the change in fluorescence for the calcium reporter surpasses a threshold by saying that "This observation indicates that the amount of PGE2 secretion is predetermined and that there is a threshold of the cytoplasmic calcium concentration for the triggered PGE2 secretion." First, the study does not really measure directly PGE2 secretion. Hence, there is no way that they can argue that the level of PGE2 secreted is "predetermined". Instead, there could be an inhibitory mechanism that is triggered to limit further activation of PGE2 signaling/PKA in neighboring cells.

      We agree with the reviewer’s comment. We have omitted the context.

      1. To rule out a transcription-dependent mechanism in the apparent cell density-regulated sensitivity to PGE2, the authors need to inhibit transcription.

      We agree that our RNA-seq analysis would not 100% rule out the transcription-dependent mechanism. However, we believe that shutting down all transcription will show a severe off-target effect that indirectly affects the calcium transients and the PGE2-synthetase pathway. Therefore, our conclusion is limited.

      4) EGF is reported to increase the frequency of RSPA but the change shown in Fig. 6F is not statistically significant, hence, EGF does not increase RSPA frequency in their experiments.

      We have toned down the claim that EGF treatment increases the frequency (line 172).

      “Accordingly, the addition of EGF faintly increased the frequency of RSPA in our experiments, while the MEK and EGFR inhibitors almost completely abrogated RSPA (Fig. 6F), representing that ERK activation or basal ERK activity is essential for RSPA.”

      1. The Discussion section is at times redundant with the results section. References to figures should be kept in the Results section.

      We respectfully disagree with this comment. We believe that references to figures in the Discussion are helpful to readers. However, if eLife recommends removing these references from the Discussion section, we will follow the publication policy.

      1. "Notably, the propagation of PKA activation, ~100 μm/min (Fig. 1H), is markedly faster than that of ERK activation, 2-4 μm/min (Hiratsuka et al., 2015)." The 2 kinase reporters are based on different molecular designs. Thus, it does not seem appropriate to compare the kinetics of both reporters as a proxy of the comparison of the kinetics of propagation of both kinases.

      We think that we should discuss the comparison of the activity propagation between ERK and PKA. First, among many protein kinases, only ERK and PKA activities have been shown to spread in the epithelial cells. Second, both pathways are considered to be intercellular communication. Finally, crosstalk between these two pathways has been reported in several cells and organs.

      1. In Figure 1E it is unclear what is significantly different from what. Statistical analysis should be added and reporting of the results should reflect the results from that analysis.

      2. In Figure 3F and G the color coding is confusing. In F pink is radius and black is GCaMP6 and in G is RSPA+ and - cells. The authors should change the color to avoid ambiguity in the code.

      We have amended the panels.

      1. In Fig. 5C, how do they normalize per cell density if they are measuring radius of the response?

      In Fig. 5C, we simply measured the increase in the FRET ratio across the whole field of view.

      1. In Fig. 5D, what is the point of having a label for PTGER3 if data were not determined (ND)?

      We have added what N.D. means.

      “N.D. represents Not Detected.”

      1. It is important to assess whether ERK activation depends on PGE2 signaling to better place ERK in the proposed signaling pathway. In fact, the authors argue that "ERK had a direct effect on the production of PGE2." But it could be that ERK is downstream of PGE2 signaling instead.

      It could be possible in other experimental conditions via EP1 and/or EP3 pathways. However, we never detected an effect of RSPA on ERK activity by analyzing our imaging system. In addition, treatment with NSAIDs or COX-2 depletion, which completely abolishes RSPA, did not affect ERK wave propagation. Thus, in our context, we concluded that ERK is not downstream of PGE2. This notion is also supported by the NGS results in Fig. 5D.

      We have refrained from discussing the pathway of PGE2-dependent ERK activation because it would be redundant.

      1. The authors need to explain better what they mean by "AND gate" if they want to reach a broad readership like that of eLife

      We have modified the legend to explain the “AND gate” as much as possible (line639).

      “Figure 7: Models for PGE2 secretion.

      The frequency of calcium transients is cell density-dependent, while the ERK activation wave is present in both conditions. Because both a calcium transient and ERK activation are required for RSPA, the probability of PGE2 secretion is regulated as an “AND gate”.”
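The “AND gate” idea in this legend can be made concrete with a minimal Python sketch. It treats a calcium transient and ERK activation as two boolean inputs that must both be present, and it estimates how often both coincide when each occurs with some probability. The function names, the probabilities, and the assumption that the two events are independent are all illustrative and are not taken from the manuscript.

```python
import random


def rspa_can_fire(calcium_transient: bool, erk_active: bool) -> bool:
    """AND gate: the output is on only when both inputs are on."""
    return calcium_transient and erk_active


def estimate_firing_probability(p_calcium: float, p_erk: float,
                                n_trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of how often both inputs coincide.

    Assumes the two events are independent, which is a simplification.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        calcium = rng.random() < p_calcium
        erk = rng.random() < p_erk
        hits += rspa_can_fire(calcium, erk)
    return hits / n_trials


# Illustrative probabilities: rarer calcium transients (as at low density)
# versus more frequent ones (as at high density), with the same ERK probability.
print(estimate_firing_probability(0.05, 0.5))
print(estimate_firing_probability(0.20, 0.5))
```

In this toy picture, raising only the calcium-transient probability raises the output probability in proportion, while removing either input drives it to zero.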

      1. In Fig. 5D, "The average intensity of the whole view field of mKate2 or mKOκ, at 20 to 30 min after the addition of PGE2, was applied to calculate the mKate2/mKOκ ratio." But this means that overlapping/densely plated cells in high density will show stronger changes in fluorescence. This should be done per cell not per field of view. It is obvious that the higher density will have more dense/brighter signal in a given field of view.

      We are sorry for the confusion. Cell density does not affect the FRET ratio, although the brightness may change. A typical example is Fig. 1D. Thus, we are confident that our procedure represents the PKA activity in plated cells.

      1. In Fig. 6B the authors need to explain how were the "randomly set positions" determined.

      We have modified the legend section as below (line618).

      “The ERK activities within 10 µm from the center of RSPA and within 10 µm from randomly set positions with a random number table generated by Python are plotted in the left panel. Each colored dot represents an average value of an independent experiment.”
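A comparison of this kind can be sketched in a few lines of Python. The sketch below draws random control positions within a field of view and averages a per-cell activity value within 10 µm of a given position. The actual script is not described in the response, so the function names, the `(x, y, activity)` cell representation, the field dimensions, and the use of the standard `random` module are all assumptions made for illustration.

```python
import math
import random
from typing import List, Optional, Tuple

Cell = Tuple[float, float, float]  # (x in µm, y in µm, activity value); hypothetical layout


def random_positions(n: int, width_um: float, height_um: float,
                     seed: Optional[int] = None) -> List[Tuple[float, float]]:
    """Draw n control positions uniformly within the field of view."""
    rng = random.Random(seed)
    return [(rng.uniform(0, width_um), rng.uniform(0, height_um)) for _ in range(n)]


def mean_activity_near(center: Tuple[float, float], cells: List[Cell],
                       radius_um: float = 10.0) -> Optional[float]:
    """Average the activity of cells within radius_um of center, or None if there are none."""
    cx, cy = center
    nearby = [a for (x, y, a) in cells if math.hypot(x - cx, y - cy) <= radius_um]
    return sum(nearby) / len(nearby) if nearby else None


# Illustrative data: three cells, two of them close to the origin.
cells = [(0.0, 0.0, 1.0), (5.0, 0.0, 3.0), (50.0, 50.0, 9.0)]
print(mean_activity_near((0.0, 0.0), cells))
for position in random_positions(3, 100.0, 80.0, seed=1):
    print(position, mean_activity_near(position, cells))
```

Fixing the seed makes the control positions reproducible, which is useful when the same random positions need to be reported alongside the measured RSPA centers.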

      1. Sentences 314-318 are repeated in 318-322.

      We deeply appreciate the reviewer’s comment and have amended

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Here, Boor et al focus on the regulation of daf-7 transcription in the ASJ chemosensory neurons, which has previously been shown to be sensitive to a variety of external and internal signals. Interestingly, they find that soluble (but not volatile) signals released by food activate daf-7 expression in ASJ, but that this is counteracted by signals from the ASIC channels del-3 and del-7, previously shown to detect the ingestion of food in the pharynx. Importantly, the authors find that ASJ-derived daf-7 can promote exploration, suggesting a feedback loop that influences locomotor states to promote feeding behavior. They also implicate signals known to regulate exploratory behavior (the neuropeptide receptor PDFR-1 and the neuromodulator serotonin) in the regulation of daf-7 expression in ASJ. Additionally, they identify a novel role for a pathway previously implicated in C. elegans sensory behavior, HEN1/SCD-2, in the regulation of daf-7 in ASJ, suggesting that the SCD-2 homolog ALK may have a conserved role in feeding and metabolism.

      Strengths:

      The studies reported here, particularly the quantitation of gene expression and the careful behavioral analysis, are rigorously done and interpreted appropriately. The results suggest that, with respect to food, DAF-7 expression encodes a state of "unmet need" - the availability of nearby food to animals that are not currently eating. This is an interesting finding that reinforces and extends our understanding of the neurobiological significance of this important signaling pathway. The identification of a role for ASJ-derived daf-7 in motor behavior is a valuable advance, as is the finding that SCD-2 acts in the AIA interneurons to influence daf-7 expression in ASJ.

      We appreciate Reviewer 1’s thoughtful assessment of our work and inference that the expression of daf-7 encodes internal state corresponding to “unmet need.” Based on comments of Reviewer 1 and other reviewers, we have revised the title, abstract, and parts of the discussion to highlight not only the functional contribution of daf-7 expression in the ASJ neurons to behavioral state, but also the remarkable correlation between gene expression and internal state driving foraging behavior.

      Weaknesses:

      A limitation of the work is that some mechanistic relationships between the identified signaling pathways are not carefully examined, but this provides interesting opportunities for future work.

      To enable the reader to begin to infer the relative contributions of the identified signaling pathways to the circuitry coupling distinct bacterial cues to foraging behavior, we have added data for the analysis of DAF-7 expression in the ASJ neurons in the tph-1 and pdfr-1 mutants in the complete absence of food. Our current leaning is that multiple pathways, including those we have begun to characterize here, may function in parallel to influence DAF-7 expression and internal state driving foraging behavior. Future work to explore this further is certainly of interest.

      A minor weakness concerns the experiment in which daf-7 is conditionally deleted from ASJ. This is an ideal approach for probing the function of daf-7, but these experiments seem to be carried out in the well-fed, on-food condition in which control animals should express little or no daf-7 in ASJ. Thus, the experimental design does not allow an assessment of the role of daf-7 under conditions in which its expression is activated (e.g., in animals exposed to un-ingestible food).

      The interpretation of genetic analysis in the complete absence of food is complicated by what we think are multiple parallel pathways that function to strongly promote roaming, as indicated in the prior work of Ben Arous et al. Our observation that the conditional deletion of daf-7 from the ASJ pair of neurons confers altered roaming behavior on a lawn of bacterial food supports an ongoing physiological role for dynamic daf-7 expression from the ASJ neurons even in the presence of bacterial food that may contribute to the control of transitions between foraging states and the persistence of roaming and dwelling states.

      To demonstrate the functional contribution of DAF-7 expression from the ASJ neuron pair during constitutive expression favoring roaming, we examined the roaming behavior of scd-2(syb2455) animals that carry a gain-of-function mutation in scd-2 that promotes roaming and how the selective deletion of daf-7 from the ASJ neurons in the scd-2(syb2455) genetic background influences roaming behavior. This new experiment supports a model in which DAF-7 expression from the ASJ neurons contributes to the increased roaming behavior exhibited by scd-2(syb2455) animals. The new experiment is added as Figure 4I.

      An additional minor issue concerns the interpretation of the scd-2 experiments. The authors' findings do support a role for scd-2 signaling in the activation of daf-7 expression by un-ingestible food, but the data also suggest that scd-2 signaling is not essential for this effect, as there is still an effect in scd-2 mutants (Figure 4B).

      Considering that most of previous Figure 4B is redundant with previous Figure 4D, we removed previous Figure 4B. Our current Figure 4 has redesignated previous Figure 4D as 4B. We have also added qualification to the text to indicate that other pathways may modulate the daf-7 expression response to ingested food in parallel to SCD-2 signaling.

      Reviewer #2 (Public Review):

      Summary:

      In this work, Boor and colleagues explored the role of microbial food cues in the regulation of neuroendocrine-controlled foraging behavior. Consistent with previous reports, the authors find that C. elegans foraging behavior is regulated by the neuroendocrine TGFβ ligand encoded by daf-7. In addition to its known role in the neuroendocrine/sensory ASI neurons, Boor and colleagues show that daf-7 expression is dynamically regulated in the ASJ sensory neurons by microbial food cues - and that this regulation is important for exploration/exploitation balance during foraging. They identify at least two independent pathways by which microbial cues regulate daf-7 expression in ASJ: a likely gustatory pathway that promotes daf-7 expression and an opposing interoceptive pathway, also likely chemosensory in nature but which requires microbial ingestion to inhibit daf-7 expression. Two neuroendocrine pathways known to regulate foraging (serotonin and PDF-1) appear to act at least in part via daf-7 induction. They further identify a novel role for the C. elegans ALK orthologue encoded by scd-2, which acts in interneurons to regulate daf-7 expression and foraging behavior. These results together imply that distinct cues from microbial food are used to regulate the balance between exploration and exploitation via conserved signaling pathways.

      Strengths:

      The findings that gustatory and interoceptive inputs into foraging behavior are separable and opposing are novel and interesting, which they have shown clearly in Figure 1. It is also clear from their results that removal of the interoceptive cue (via transfer to non-digestible food) results in rapid induction of daf-7::gfp in ASJ, and that ASJ plays an important role in the regulation of foraging behavior.

      We thank Reviewer 2 for underscoring the modulation of neuroendocrine gene expression in the ASJ neuron pair by distinct gustatory and interoceptive inputs derived from bacterial food that we show in Figure 1.

      The role of the hen-1/scd-2 pathway in mediating the effects of ingested food is also compelling and well-interpreted. The use of precise gain-of-function alleles further supports their conclusions. This implies that important elements of this food-sensing pathway may be conserved in mammals.

      We thank Reviewer 2 for emphasizing the implications of our study on SCD-2/ALK as well as the generation and use of gain-of-function scd-2 alleles based on oncogenic mutations in ALK.

      Weaknesses:

      What is less clear to me from the work at this stage is how the gustatory input fits into this picture and to what extent can it be strongly concluded that the daf-7-regulating pathways that they have identified (del-3/7, 5-HT, PDFR-1, scd-2) act via the interoceptive pathway as opposed to the gustatory pathway.

      It follows from the work of the Flavell lab that del-3/7 likely acts via the interoceptive pathway in this context as well but this isn't shown directly - e.g. comparing the effects of aztreonam-treated bacteria and complete food removal to controls. The roles of 5-HT and PDFR-1 are even a bit less clear. Are the authors proposing that these are entirely parallel pathways? This could be explained in better detail.

      We have added additional data regarding daf-7 expression from the ASJ neurons in the complete absence of food in the different mutant backgrounds noted by Reviewer 2. Data regarding daf-7 expression in the ASJ neurons under three distinct conditions—ingestible bacterial food, non-ingestible bacterial food, and the complete absence of food—enable the pairwise comparison of mutant data that allows for inference regarding the relative contributions of the genes to the interoceptive vs. gustatory pathways. In particular, effects on the interoceptive pathway can be inferred from the comparison of daf-7 expression on ingestible vs. non-ingestible food, whereas effects on the gustatory pathway can be inferred from the comparison of daf-7 expression on non-ingestible food vs. the absence of food (newly added).

      These additional data are most informative for del-3; del-7 (Figure 1H), where the added data corroborate a role for these genes in the interoceptive pathway, consistent with the findings of the Flavell lab. Specifically, the observation that daf-7 expression levels are equivalent between wild-type and del-3; del-7 animals when there is no ingestible food (either no food or non-ingestible food conditions) suggests that DEL-3 and DEL-7 are functioning specifically to sense ingested food.

      For pdfr-1, the analysis of the gain-of-function allele suggests that this pathway may have a greater relative effect on the gustatory pathway compared with the interoceptive pathway (Figure 3D). The robust upregulation seen in the pdfr-1(syb3826) animals between ingestible and non-ingestible food suggests that the interoceptive regulation is functional in these mutants, while the lack of upregulation between no-food and non-ingestible-food conditions suggests that the gustatory pathway is affected.

      The observations with the 5-HT biosynthesis mutant are most consistent with serotonin signaling affecting daf-7 expression in the ASJ neurons through a mechanism that is parallel to the gustatory and interoceptive inputs into daf-7 expression in the ASJ neurons, as tph-1(n4622) animals appear to have an elevated baseline expression of daf-7 in the ASJ neurons while retaining sensitivity to both gustatory and interoceptive food cues (Figure 3B).

      The data with scd-2 are consistent with a role in the epistatic interoceptive pathway, considering the roughly equivalent levels of daf-7 expression in the ASJ neurons under all food conditions in scd-2(syb2455) animals (Figure 4B). However, it is difficult to exclude the possibility that SCD-2 functions in both pathways or parallel to the gustatory and interoceptive inputs.

      While we agree that our genetic analysis alone cannot distinguish between genes acting in parallel or directly in serial with the gustatory or interoceptive inputs, our data do establish that signaling through SCD-2, 5-HT or PDFR-1-dependent pathways can act on the same gene expression and signaling node (i.e. daf-7 expression in the ASJ neurons) to modulate the effects of bacterial food inputs on foraging behavior, with the effects on daf-7 expression in the ASJ neurons in scd-2, tph-1 and pdfr-1 mutants correlating with their effects on roaming and dwelling behaviors.

      It would also be helpful to elaborate more on why the identified transcriptional positive feedback loop is predicted to extend roaming state duration - as opposed to some other mechanism of increasing roaming such as increased probability of roaming state initiation. This doesn't seem self-evident to me.

      Given that animals can exist in only two states, the increased probability of roaming state initiation would present as shorter dwelling states, which we do not see for daf-7 mutants. As described in Flavell, et al., 2013, a decreased fraction of time roaming can be attributed to longer dwelling states, shorter roaming states, or both. Our positive feedback loop is predicted to extend roaming states because of the predicted effect of DAF-7 on stabilizing the roaming state.
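      The two-state argument above can be made concrete with a small calculation. The sketch assumes that roaming and dwelling simply alternate, so the long-run fraction of time roaming is the mean roaming duration divided by the sum of the mean roaming and mean dwelling durations. The function name and the durations are hypothetical and are not measurements from the study.

```python
def fraction_roaming(mean_roam, mean_dwell):
    """Long-run fraction of time roaming for an alternating two-state process.

    mean_roam and mean_dwell are mean state durations in the same time unit.
    """
    return mean_roam / (mean_roam + mean_dwell)


# Hypothetical durations, chosen only to illustrate the logic.
baseline = fraction_roaming(2.0, 8.0)

# Two different routes to the same lower roaming fraction:
shorter_roams = fraction_roaming(1.0, 8.0)    # roaming states end sooner
longer_dwells = fraction_roaming(2.0, 16.0)   # roaming is initiated less often

print(baseline, shorter_roams, longer_dwells)
```

      Both routes lower the fraction of time roaming by the same amount in this toy case, which is why the state durations themselves, rather than the fraction alone, are needed to tell a change in roaming-state stability from a change in roaming-state initiation.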

      Related to this point is the somewhat confusing conclusion that the effects of tph-1 and pdfr-1 mutations on daf-7 expression are due to changes in ingestion during roaming/dwelling. From my understanding (e.g. Cermak et al., 2020), pharyngeal pumping rate does not reliably decrease during roaming - so is it clear that there are in fact lower rates of ingestion during roaming in their experiments?

      This is an interesting point. Despite consistent pumping rates, we still believe that roaming animals ingest less food than dwelling animals. For instance, dwelling animals are localized to areas with bacterial food, while roaming animals might traverse patches with no food where pumping does not result in food ingestion.

      If so, why does increased roaming (via tph-1 mutation) result in further increases in daf-7 expression in animals fed aztreonam-treated food (Fig 3B)?

      This is possibly because although roaming animals are eating less, when animals are on non-ingestible food, they’re not eating at all, resulting in further daf-7 upregulation.

      Alternatively, there could be a direct signaling connection between the 5-HT/PDFR-1 pathways and daf-7 expression which could be acknowledged or explained.

      Yes, this is certainly possible. We do not propose that all of the difference in daf-7 expression is due to changes in foraging behavior, but rather we are highlighting further instances of the correlation between daf-7 expression in the ASJ neurons and roaming. For instance, in the case of our tph-1 mutants, we see a relatively modest effect on daf-7 expression in the ASJ neurons but a large difference in the fraction of time roaming. This suggests that the magnitude of change in one (daf-7 expression in ASJ or roaming) does not predict the magnitude of the change in the other, but rather that they trend in the same direction.

      Reviewer #3 (Public Review):

      Summary:

      In this interesting study, the authors examine the function of a C. elegans neuroendocrine TGF-beta ligand DAF-7 in regulating foraging movement in response to signals of food and ingestion. Building on their previous findings that demonstrate the critical role of daf-7 in a sensory neuron ASJ in behavioral response to pathogenic P. aeruginosa PA14 bacteria and different foraging behavior between hermaphrodite and male worms, the authors show, here, that ingestion of E. coli OP50, a common food for the worms, suppresses ASJ expression of daf-7 and secreted water-soluble cues of OP50 increases it. They further showed that the level of daf-7 expression in ASJ is positively associated with a higher level of roaming/exploration movement. Furthermore, the authors identify that a C. elegans ortholog of Anaplastic Lymphoma Kinase, scd-2, functions in an interneuron AIA to regulate ASJ expression of daf-7 in response to food ingestion and related cues. These findings place the DAF-7 TGF-beta ligand in the intersection of environmental food conditions, food intake, and food-searching behavior to provide insights into how orchestrated neural functions and behaviors are generated under various internal and external conditions.

      Strengths:

      The study addresses an important question that appeals to a wide readership. The findings are demonstrated by generally strong results from carefully designed experiments.

      We thank Reviewer 3 for the comments and interest in the work.

      Weaknesses:

      However, a few questions remain to provide a complete picture of the regulatory pathways and some analyses need to be strengthened. Specifically,

      1. The authors show that diffusible cues of bacteria OP50 increase daf-7 expression in ASJ which is suppressed by ingestible food. Their results on del-3 and del-7 suggest that NSM neuron suppresses daf-7 ASJ expression. What sensory neurons respond to bacterial diffusible cues to increase daf-7 expression of ASJ? Since ASJ is able to respond to some bacterial metabolites, does it directly regulate daf-7 expression in response to diffusible cues of OP50 or does it depend on neurotransmission for the regulation? Some level of exploration in this question would provide more insights into the regulatory network of daf-7.

      The focus of our study has been on the modulation of daf-7 expression in the ASJ neurons by distinct bacterial food cues and the downstream neuroendocrine circuitry that is influenced. The question of whether bacterial cues are directly sensed by the ASJ neurons remains unresolved by our study. However, we have previously demonstrated that the daf-7 expression in the ASJ neurons induced by P. aeruginosa metabolites is likely the result of direct detection by the ASJ neurons. We would also note (and have added to the manuscript) the observation of Zaslaver et al. (2015), in which increased calcium transients were observed in the ASJ neurons in response to the withdrawal of E. coli OP50 supernatant, which is consistent with our observations of the effect of a soluble bacterial food signal on daf-7 expression in the ASJ neurons.

      1. The results including those in Figure 2 strongly support that daf-7 in ASJ is required for roaming. Meanwhile, authors also observe increased daf-7 expression in ASJ under several conditions, such as non-ingestible food. Does non-ingestible food induce more roaming?

      Yes, this has been published by Ben Arous, et al., 2009. Figure 3C shows increased roaming on aztreonam-treated food. We have added specific mention of this in the text.

      It would complete the regulatory loop by testing whether a higher (than wild type) level of daf-7 in ASJ could further increase roaming. The results in pdf-1 and scd-2 gain-of-function alleles support more ASJ leads to more roaming, but the effect of these gain-of-function alleles may not be ASJ-specific and it would be interesting to know whether ASJ-specific increase of daf-7 leads to a higher level of roaming. In my opinion, either outcome would be informative and strengthen our understanding of the critical function of daf-7 in ASJ demonstrated here.

      We looked at roaming in animals with a ptrx-1::daf-7 cDNA transgene in a wild-type background and did not see changes in the fraction of time animals roam. However, multiple experimental factors could contribute to our inability to detect an effect, including relative promoter strength and context of other variables that alter daf-7 expression. Nevertheless, our data confirmed that ASJ neuron-specific expression of daf-7 cDNA can increase roaming in a daf-7 mutant background (Figure 2B).

      We have also included an experiment (Figure 4I) looking at roaming in the scd-2(syb2455) gain-of-function animals in animals with daf-7 deleted from the ASJ neurons. These results suggest that part of the increased roaming seen in these scd-2(syb2455) animals is specifically due to increased daf-7 expression in the ASJ neurons.

      1. The analyses in Figure 4 cannot fully support "We further observed that the magnitude of upregulation of daf-7 expression in the ASJ neurons when animals were moved from ingestible food to non-ingestible food was reduced in scd-2(syb2455) to levels only about one-fourth of those seen in wild-type animals (Figure 4D)...", because the authors tested and found the difference in daf-7 expression between ingestible and non-ingestible food conditions in both wild type and the mutant worms. The authors did not analyze whether the induction was different between wild type and mutant. Under the ingestible food condition, ASJ expression of daf-7 already looks different in scd-2(syb2455).

      We appreciate the reviewer pointing out our lack of clarity in discussing our analysis of the data. The 4x difference represents the difference in fold change from ingested to non-ingested food in wild-type and scd-2(syb2455) backgrounds. For wild-type animals, daf-7 expression in the ASJ neurons is 8.1 times higher on non-ingestible food than on ingestible food. In scd-2(syb2455) animals, this difference is 1.7 times. We have clarified this in the text.
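      The comparison of fold changes can be written out as a short calculation. The two fold-change values (8.1 and 1.7) are the ones reported in the response; the function name `relative_induction` is hypothetical, and the sketch only reproduces the arithmetic, not the underlying statistical analysis.

```python
def relative_induction(mutant_fold, wild_type_fold):
    """Ratio of the mutant fold change to the wild-type fold change."""
    return mutant_fold / wild_type_fold


WILD_TYPE_FOLD = 8.1   # non-ingestible vs ingestible food, wild type
SCD2_GOF_FOLD = 1.7    # same comparison in scd-2(syb2455)

ratio = relative_induction(SCD2_GOF_FOLD, WILD_TYPE_FOLD)
print(f"{ratio:.2f}")      # 0.21
print(f"{1 / ratio:.1f}")  # 4.8
```

      With the rounded values given, the induction in scd-2(syb2455) comes out at roughly one-fifth to one-fourth of the wild-type induction, in line with the approximate "4x" description.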

      1. The authors used unpaired two-tailed t-tests for all the statistical analyses, including when there are multiple groups of data and more than one treatment. In their previous study Meisel et al 2014, the authors used one-way ANOVA, followed by Dunnett's or Tukey's multiple comparison test when they analyzed daf-7 expression or lawn leaving in different mutants or under different bacterial conditions. It is not clear why a two-tailed t-test was used in similar analyses in this study.

      We have performed one-way ANOVAs for all comparisons included, and the results were largely consistent with what we found for t-tests. Ultimately, for our analysis we were most interested in pairwise comparisons and decided that t-tests would be most appropriate.

      Reviewer #1 (Recommendations For The Authors):

      Line 170: For clarity, I suggest editing this to: "When animals are removed from edible food but are still exposed to soluble food signals, upregulation of daf-7..."

      We have edited this in the text and appreciate the suggestion.

      The authors report that pdfr-1(syb3826) was retrieved from "a screen done in parallel to this work." syb3826 is a Suny Biotech allele, suggesting that this screen may not have been done in the authors' lab but rather outsourced. Some additional details might be useful.

      This S325F allele was originally recovered as qd385 in an EMS screen performed in our lab. syb3826 is an independently generated Suny Biotech allele we ordered to confirm that the S325F substitution in PDFR-1 was responsible for our phenotypes. This has been clarified in the text.

      Line 210: Please provide a citation for the screen that identified hen-1(qd259).

      This is the first time the allele is being published. The screen is included in two theses from our lab, Meisel 2016 and Park 2019.

      Line 214: It would be useful here to also mention the previously identified role of scd-2 in sensory integration.

      Yes, we have added this to the text. Additionally, we have included a couple of sentences in the discussion about how previous studies that have found a role for SCD-2 in sensory integration may instead be detecting the role for SCD-2 in food sensing, as many of the assays used for sensory integration are also sensitive to nutritional status of the animals.

      Line 271: Please provide a citation for the sex differences in food-leaving behavior (Lipton 2004 PMID 15329389 is the first careful characterization of this).

      We have added this to the text.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #2 (Recommendations For The Authors):

      The evidence provided in this study reflects important discoveries on language lateralisation and most of the conclusions of this paper are supported by evidence. However, there are several areas regarding the characteristics of participants tested, hypotheses/predictions and the type of analysis, that need to be clarified and/or corrected.

      1. There is a substantial disconnection between the introduction and the methods/results section.

      One reason is a lack of consistency. For example, only IFC is mentioned in the introduction, yet the analyses carried out to examine neural activity in different groups focused on IFC as well as other brain regions related to inhibitory control, and these areas were not mentioned at all in the introduction. Second and related to the above, the rationale for conducting certain types of analyses is not specified. Some brain analyses focus on IFC only. Instead, other analyses focus on several areas.

      Another weakness is that there is not sufficient detail regarding the hypotheses/predictions and the specific types of analyses chosen to test these hypotheses/predictions. For example, there is no mention of resting state fMRI data in the introduction, but later we discover that this type of data was collected and analyzed. Even a brief mention of the inclusion of resting state data in the introduction would be beneficial. Along the same lines, by reading the methods section we find out that VBM analyses were conducted. But it is unclear why. What was the purpose of this data analysis? This should be clarified briefly in the introduction and then in the methods section. It remains unclear why resting state results would be particularly informative for addressing the research question of this study. Task-related brain connectivity seems a more appropriate choice. Additionally, it is not explained what comparisons and outcomes would be informative/expected to distinguish between the two mentioned competing hypotheses. This should be made clear.

      Another aspect that lacks clarity is the authors' predictions when investigating the relationship "between the lateralization of both functions and inter-hemispheric structural-functional connectivity, as well as with behavioural markers of certain clinical conditions that have been related with atypical lateralization". The hypotheses are completely omitted in this section.

      Thank you for bringing this to our attention. We concur with Reviewer #2 that our introduction was somewhat lacking in detail and assumed too much prior knowledge on the part of the reader. This, together with a lack of a clear presentation of our tested hypotheses, made the introduction have a poor connection with both the results and discussion sections, which hindered the understanding of the paper.

      As a result, we have made some additions to enhance the exposition of the following areas: (1) the causal and statistical hypotheses of lateralization (Lines 55-65); and (2) the hypotheses regarding subclinical markers of neurological disorders and the corpus callosum (Lines 90-104).

      Furthermore, we have extensively revised the final paragraph of the introduction (Lines 105-121) to provide a clearer and more coherent linkage between the drivers presented during the introduction, our hypotheses, and the subsequent analyses.

      1. It is important to provide more information on the language background of the participants. Were the participants in this study Catalan-Spanish bilinguals? If so, it is crucial for the authors to mention this.

      Language background of the participants has been added to the corresponding section (Lines 138-145).

      In fact, previous studies, including several publications from the authors themselves (Garbin et al., 2010; Rodríguez-Pujadas et al., 2013; Anderson et al., 2018), have shown that there are qualitative differences between bilinguals and monolinguals in the neural circuitry underlying executive control. Across all these studies, it was consistently reported that bilingual individuals, when engaged in non-linguistic inhibitory control tasks, recruited a broader network of left-brain regions associated with language control, including the left IFC, in comparison to monolingual individuals. If the participants in this study were indeed bilinguals, it raises concern if the aim of the study is to generalize the conclusions on lateralization effects beyond the bilingual population.

      Rodríguez-Pujadas, A., Sanjuán, A., Ventura-Campos, N., Román, P., Martin, C., Barceló, F., … & Ávila, C. (2013). Bilinguals use language-control brain areas more than monolinguals to perform non-linguistic switching tasks. PLoS One, 8(9), e73028.

      Garbin, G., Sanjuan, A., Forn, C., Bustamante, J. C., Rodríguez-Pujadas, A., Belloch, V., ... & Ávila, C. (2010). Bridging language and attention: Brain basis of the impact of bilingualism on cognitive control. NeuroImage, 53(4), 1272-1278.

      Anderson, J. A., Chung-Fat-Yim, A., Bellana, B., Luk, G., & Bialystok, E. (2018). Language and cognitive control networks in bilinguals and monolinguals. Neuropsychologia, 117, 352-363.

      Indeed, we have thoroughly reported that, when compared to monolinguals, bilinguals exhibit significant involvement of left-brain regions during switching and inhibition tasks. So, this is a legitimate concern. Unfortunately, the society from which our participants were drawn is primarily bilingual, encompassing both active and passive bilinguals. The monolingual sample in those previous studies consisted of university students originating from predominantly monolingual regions of Spain. Given this context, it is unsurprising that the current study has a rather limited number of monolinguals (n=8), with only 2 displaying atypical language lateralization. Thus, we cannot provide a reliable answer to the role of bilingualism status in our data. Consequently, we have included a comment on this limitation in the discussion (Lines 504-512).

      1. Regarding the methods section, I have the following specific queries. The first is about the control condition in the verb generation task. I find it puzzling that the 'task' and 'control' conditions differ in terms of the number of words uttered. Could the authors please provide further clarification on this?

      Thank you for raising this question. Regarding the control condition, it is important to note that the design of this task drew inspiration from previously published verb generation tasks for fMRI (Benson et al., 1999; Fitzgerald et al., 1997) and PET (Petersen et al., 1988). In the fMRI tasks, a fixation cross served as the control condition, while the PET study used word repetition as the control. We acknowledged that a mere fixation cross might not adequately control for the movement and visual-related activations inherent in the verb generation task. Conversely, word repetition could potentially engage the default mode network due to the repetition of the same simple task, which might not be suitable for a control condition, and it could be overly linguistic because it involves a word. Consequently, we aimed to strike a balance by employing a control condition that consisted of reading letters. This approach allowed us to control for movement and vision factors without invoking semantics. Thus, after careful consideration, we ultimately opted for the reading of two letters to equate the response to the vocalization length of generating a verb.

      Although we understand the concern of single vs. two vocalizations, it is worth emphasizing that this version of the verb generation task had undergone prior testing to assess its suitability for determining language lateralization in both healthy and clinical populations (Sanjuan et al., 2010). In fact, this task has been an integral component of our lab’s standard presurgical assessment protocol, which has been used for nearly two decades in individually evaluating language function in over 500 patients with central nervous system lesions.

      Benson, R. R., Fitzgerald, D. B., Lesueur, L. L., Kennedy, D. N., Kwong, K. K., Buchbinder, B. R., Davis, T. L., Weisskoff, R. M., Talavage, T. M., Logan, W. J., Cosgrove, G. R., Belliveau, J. W., & Rosen, B. R. (1999). Language dominance determined by whole brain functional MRI in patients with brain lesions. Neurology, 52(4), 798–809.

      Fitzgerald, D. B., Cosgrove, G. R., Ronner, S., Jiang, H., Buchbinder, B. R., Belliveau, J. W., Rosen, B. R., & Benson, R. R. (1997). Location of Language in the Cortex: A Comparison between Functional MR Imaging and Electrocortical Stimulation. AJNR Am J Neuroradiol, 18, 1529–1539.

      Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1988). Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature, 331(18), 585–589.

      Sanjuán, A., Bustamante, J. C., Forn, C., Ventura-Campos, N., Barrós-Loscertales, A., Martínez, J. C., Villanueva, V., & Ávila, C. (2010). Comparison of two fMRI tasks for the evaluation of the expressive language function. Neuroradiology, 52(5), 407–415. https://doi.org/10.1007/s00234-010-0667-8

      Second, it is mentioned that some participants were excluded from different tasks due to technical issues or time constraints. It is important to ensure that all the results can be attributed to the exact same sample of participants across all tasks.

      We absolutely agree that excluding participants can be problematic when presenting the results of multiple sets of analyses. Therefore, we repeated all analyses while excluding the 7 participants that lacked resting-state data. All results remained virtually identical, with a few minor exceptions:

      1) Region-wise analysis of the stop-signal task: Hemisphere × Group effect in the preSMA region is significant (uncorrected P = 0.019), but it does not survive Bonferroni correction (corrected P = 0.076)

      2) Voxel-wise analysis of the stop-signal task: The Thalamus + STN and Caudate clusters are significant at the voxel level, but do not survive the cluster-based FWE correction. They do survive FDR correction, though.

      3) Correlation between SPQ score and LI of the stop-signal task: This correlation weakens to just short of statistical significance, with a P value of 0.053.

      4) Correlation between reading variables and LIs of both tasks: Marked losses of significance are evident for the associations between both LIs and reading length accuracy (P = .111 and .133), as well as between verb generation LI and reading familiarity accuracy (P = .111). However, the association between the stop-signal LI and the reading length time is now significant (r = −.229, P = .042).

      Accordingly, we have included the following statement in the methods section (Lines 218-220): “It is important to highlight that the exclusion of these seven participants across all analyses does not notably impact the overall results.”
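      The two P values reported for the preSMA effect in item 1 above are consistent with a standard Bonferroni adjustment. The sketch below is only an illustration in Python: the function name is hypothetical, and the number of comparisons (four) is an assumption inferred from the reported values rather than something stated in the response.

```python
def bonferroni(p_uncorrected: float, n_comparisons: int) -> float:
    """Return the Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_uncorrected * n_comparisons)

# Assumption: four comparisons, which reproduces the values reported above.
p_corrected = bonferroni(0.019, 4)
print(f"corrected P = {p_corrected:.3f}")  # prints "corrected P = 0.076"
```

      Under that assumption, an uncorrected P of 0.019 falls below 0.05 while the adjusted value of 0.076 does not, which matches the description of an effect that is significant before correction only.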

      It is unclear how the authors have estimated the RTs results from the practice trials. This requires more explanation. Also, why was the median used for the Go Reaction Time instead of the mean, when calculating the individual SSRT?

      We adapted the procedure used by Xue et al. (2008), implementing their approach to calculate SSRT. This has been elaborated further (Lines 227-230), together with the use of practice trials (Lines 233-236).

      Xue, G., Aron, A.R., and Poldrack, R.A. (2008). Common Neural Substrates for Inhibition of Spoken and Manual Responses. Cerebral Cortex 18, 1923–1932. 10.1093/CERCOR/BHM220.
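      As a rough illustration of the kind of estimate involved, the sketch below assumes the commonly used form in which SSRT is the median Go reaction time minus the mean stop-signal delay (SSD). The function name and the trial values are hypothetical, and the exact procedure should be taken from the revised manuscript and from Xue et al. (2008), not from this example.

```python
from statistics import mean, median

def estimate_ssrt(go_rts_ms, stop_signal_delays_ms):
    """SSRT estimate: median Go RT minus mean stop-signal delay (both in ms)."""
    return median(go_rts_ms) - mean(stop_signal_delays_ms)

# Hypothetical data with one very slow Go trial (900 ms).
go_rts = [420, 450, 480, 510, 900]
ssds = [200, 250, 300]

ssrt = estimate_ssrt(go_rts, ssds)
print(f"median Go RT = {median(go_rts)} ms, mean Go RT = {mean(go_rts)} ms")
print(f"SSRT = {ssrt} ms")
```

      In this toy data set the single slow trial pulls the mean Go RT up to 552 ms while the median stays at 480 ms, which is one general reason a median is often preferred for right-skewed reaction-time distributions.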

      On a final note, information about the different types of pre-processing and data analysis is all reported in the same paragraph. I think using subsections would increase the intelligibility of the section.

      Thank you for this suggestion. We have added subsections in both the ‘image processing’ and ‘statistical analyses’ sections.

      1. Data analysis and Interpretation of the results. It is unclear how the mean BOLD signal was extracted to conduct ROI analysis (Marsbar?).

      Thank you for pointing this out. Indeed, we were not very accurate in the description of this procedure. We extracted the first eigenvariate via the VOI function within SPM12. This has been included in Lines 291-293.

      I feel uneasy about the way results are corrected for multiple comparisons. For instance, it is mentioned that in the ROI analysis, all p-values were FDR-corrected for four comparisons, but it is unclear why. The correct procedure for supporting conclusions about the effect of specific brain regions would be to have 'brain region' (n=4) as another within-subject factor. Furthermore, the one-tailed correlation is appropriate but only when testing for the possibility of a relationship in one direction and completely disregarding the possibility of a relationship in the other direction. However, this does not seem to be the case here (see Introduction), so a two-tailed correlation would be more appropriate.

      We agree with Reviewer #2 that presenting this analysis as a single MANOVA that includes a ‘Region’ factor is a more accurate approach. Consequently, we have made the aforementioned correction in the methods section (Lines 357-364) and the results section (Lines 395-406). The LI-LI one-tailed correlation was also changed to a two-tailed correlation in the methods section (Line 383), the results section (Line 417), and Figure 2 (Line 886).

      I am quite confused about using the term interhemispheric connectivity to refer to the volume of the genu, body and splenium of the corpus callosum. In fact, the volumes of genu, body and splenium of the corpus callosum do not reflect a measure of how strongly RH and LH IFC are connected to each other.

      We agree that using the term ‘interhemispheric connectivity’ when referring to callosal volume may be somewhat misleading. We have replaced every instance of this terminology throughout the paper.

      Furthermore, it is unclear why in a set of analyses (ROI and whole brain analyses) the authors focus on brain responses in different ROIs but instead, in connectivity measures the focus is only on IFC.

      Our initial rationale was to focus on regions that are prominently involved in language, particularly the IFC, for examining inter-hemispheric connectivity at rest.

      However, upon more careful consideration, it is true that the preSMA is also implicated in the language network (Labache et al., 2018), and certain studies have reported an impact of STN stimulation on specific language skills (for a review, see Vos et al., 2021). Consequently, we have incorporated these two regions into the resting-state analysis, along with subsequent correlations with LIs (Table 1 and Lines 118, 321-322 & 449-452).

      Labache, L., Joliot, M., Saracco, J., Jobard, G., Hesling, I., Zago, L., Mellet, E., Petit, L., Crivello, F., Mazoyer, B., & Tzourio-Mazoyer, N. (2018). A SENtence Supramodal Areas AtlaS (SENSAAS) based on multiple task-induced activation mapping and graph analysis of intrinsic connectivity in 144 healthy right-handers. Brain Structure and Function 2018 224:2, 224(2), 859–882. https://doi.org/10.1007/S00429-018-1810-2

      Vos, S. H., Kessels, R. P. C., Vinke, R. S., Esselink, R. A. J., & Piai, V. (2021). The Effect of Deep Brain Stimulation of the Subthalamic Nucleus on Language Function in Parkinson’s Disease: A Systematic Review. Journal of Speech, Language, and Hearing Research, 64(7), 2794–2810. https://doi.org/10.1044/2021_JSLHR-20-00515

      Minor corrections/comments:

      It is unclear why in figure caption 1, the conjunction maps are mentioned even if formal conjunction analysis was not conducted.

      This poor choice of words has been replaced with ‘overlapping maps’.

      Line 382. VHMC should be VMHC.

      Fixed. Thank you.

      Line 334. This sentence and especially its relationship with the results is not clear at all. What do you mean by 'This finding is consistent with previous reports showing that cognitive deficits appear only in specific cognitive domains'?

      This has been clarified (Lines 521-525).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Throughout the study, there is insufficient information about how experiments were performed and how often (imaging, pull-downs etc), how data was acquired, modified and analysed (especially imaging data, see below), how statistical analyses were done and what is presented in the figures (single planes or maximum intensity projections etc). This makes it difficult to evaluate the data and results.

      We have incorporated additional experimental details into the Materials and Methods section: "Recent advancements in optical and camera technologies permit the acquisition of Z-stacks without perturbing Q cell division or overall animal development. Z-stack images were acquired over a range of -1.6 to +1.6 μm from the focal plane, at intervals of 0.8 μm. The field-of-view spanned 160 μm × 160 μm, and the laser power, as measured at the optical fiber, was approximately 1 mW. ImageJ software (http://rsbweb.nih.gov/ij/) was used to perform image analysis and measurement. Image stacks were z-projected using the average projection for quantification and using the maximum projection for visual display."
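      To make the stated acquisition parameters concrete, the following sketch works through them in plain Python. It is not the authors' ImageJ workflow: the pixel values are hypothetical, and the helper names are introduced only for illustration. It shows that a range of -1.6 to +1.6 μm sampled every 0.8 μm corresponds to five focal planes, and how an average projection differs from a maximum projection.

```python
# Z positions described in the response: -1.6 to +1.6 um in 0.8 um steps.
Z_START_UM, Z_STEP_UM, N_PLANES = -1.6, 0.8, 5
z_positions = [round(Z_START_UM + i * Z_STEP_UM, 1) for i in range(N_PLANES)]

def project(stack, reducer):
    """Collapse a z-stack (list of 2D planes) pixel by pixel with `reducer`."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[reducer([plane[r][c] for plane in stack]) for c in range(cols)]
            for r in range(rows)]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

# Hypothetical 5-plane stack of 2x2 pixel intensities (arbitrary units).
stack = [
    [[10, 20], [30, 40]],
    [[12, 22], [32, 42]],
    [[50, 24], [34, 44]],
    [[12, 22], [32, 42]],
    [[10, 20], [30, 40]],
]

average_projection = project(stack, mean)  # the response uses this for quantification
maximum_projection = project(stack, max)   # the response uses this for visual display
print(z_positions)
print(average_projection)
print(maximum_projection)
```

      In this toy stack the single bright pixel in the middle plane dominates the maximum projection (50) but contributes only proportionally to the average projection (18.8), which illustrates why the two projections can serve different purposes.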

      The majority of our experimental procedures adhere to methodologies delineated in our prior publications and other scientific literature. We were pioneers in the development of fluorescence time-lapse live microscopy techniques for capturing Q cell migration and asymmetric division (Ou and Vale, Journal of Cell Biology, 2009; Ou et al., Science, 2010; Chai et al., Nature Protocols, 2012). Our innovative imaging protocol uncovered a novel mode of polarized, non-muscle myosin-II-dependent asymmetric cell division (Ou et al., Science, 2010). Subsequently, we unveiled another previously uncharacterized mechanism of asymmetric cell division dependent on polarized actin polymerization (Chai et al., Cell Discovery, 2022). In the present study, we have significantly refined our imaging and quantification protocols. Different from the single-focal-plane imaging employed in our earlier study by Ou et al. 2009, advancements in optical technologies and camera resolution now enable us to undertake time-lapse imaging across multiple focal planes and track signal differences between the anterior and posterior segments of dividing cells.

      There is insufficient information about tools and reporters used. This is misleading and impacts the conclusions that can be made from the results presented. To give an example, in Figure 1D-F, the authors present data that HDA-1::GFP and LIN-53::mNeonGreen (both components of the nucleosome remodeling and deacetylation complex) but not the histone acetyltransferase MYS-1::GFP are 'asymmetrically segregated' during QR.a division. However, the authors do not mention that HDA-1::GFP and LIN-53::mNeonGreen are expressed at endogenous levels (they are CRISPR alleles) whereas MYS-1::GFP is overexpressed (integration of a multi-copy extrachromosomal array). The difference in 'segregation' could therefore be a consequence of different levels of expression rather than different modes of segregation ('asymmetric' versus 'symmetric').

      Figure S2 shows overexpressed HDA-1, LIN-53 and CHD-3 are also asymmetrically segregated during ACD of QR.a, which indicates that different levels of expression do not affect the modes of segregation, at least for the NuRD subunits. In the main text, however, we presented the asymmetric segregation of HDA-1::GFP and LIN-53::mNeonGreen using their CRISPR KI alleles.

      There is insufficient information about the phenotypes of the animals used (RNAi knock-downs of hda-1, lin-53 RNAi, pig-1 etc). Again this is misleading and impacts the conclusions that can be made. To give some examples,

      1. In Figure 3A-G, control RNAi embryos are compared to hda-1 RNAi and lin-53 RNAi embryos. What the authors do not mention is that hda-1 RNAi and lin-53 RNAi embryos have severe developmental defects and essentially cannot be compared to control RNAi embryos. The differences between the embryos can be seen in Figure S7B where bright-field images of control RNAi, hda-1 RNAi and lin-53 RNAi embryos are shown. At the 350 min time point, a normal embryo is visible for the control, a 'ball of cells' embryo for hda-1 RNAi and an embryo that seems to have arrested at an earlier developmental stage (and therefore have much larger cells) for lin-53 RNAi. Because of these pleiotropic phenotypes, it is unclear whether differences seen for example in sAnxV::GFP positive cells (Figure 3A) are the result of a direct effect of hda-1(RNAi) on cell death or whether they are the result of global changes in development and cell fate induced by hda-1(RNAi). hda-1(RNAi) and lin-53(RNAi) embryos are also used for the data shown in Figures S6 and S7, raising the same concerns;

      In the submitted manuscript, we mentioned that hda-1 RNAi and lin-53 RNAi caused embryonic lethality and that we could track some of the apoptotic events in hda-1 RNAi embryos arrested between the late gastrulation stage and bean stage. We agree with the reviewers that because of the pleiotropic phenotypes, we cannot distinguish whether sAnxV::GFP positive cells (Figure 3A) are the result of a direct effect of hda-1 (RNAi) on cell death or whether they are the result of global changes in development and cell fate induced by hda-1 (RNAi). We added the sentence to page 9 line 26: “Considering the pleiotropic phenotypes caused by loss of HDA-1, we cannot exclude the possibility that ectopic cell death might result from global changes in development, even though HDA-1 may directly contribute to the life-versus-death fate determination.”

      1. The authors do not mention what the impact of Baf A1 treatment is on animals; however, the images provided in Figure 5E indicate that Baf A1 treatment causes pleiotropic effects in L1 larvae.

      We have carefully checked the BafA1 treated animals, but have not been able to detect any visible defect in Baf A1 treated animals under a 25× dissection microscope at the given dosage and duration of treatment. We also searched for the published images or literature and did not find pleiotropic effects on the animal level at this dosage and duration; however, we agree with the reviewers that perturbation of pH homeostasis in lysosomes by BafA1 will certainly generate pleiotropic cellular defects. We discussed the issue below:

      "Although BafA1-mediated disruption of lysosomal pH homeostasis is recognized to elicit a wide array of intracellular abnormalities, we found no evidence of such pleiotropic effects at the organismal level with the dosage and duration of treatment employed in this study."

      There is a lack of adequate controls. Because of this, some of the data presented must be considered as preliminary. To give some examples:

      1. Controls are lacking for the data shown in Figure 3D-G (i.e. genes other than egl-1). Since hda-1 RNAi has a pleiotropic effect and most likely affects H3K27 acetylation genome-wide, this is critical. Based on what is shown, it is unclear whether the results presented are specific to egl-1 or not;

      In figure 3F, we added F23B12.1 and sru-43 as the controls of egl-1. We added “while the H3K27ac level of genes adjacent to egl-1 showed no significant changes” to Page 10 line 22 in the revised text.

      1. The co-IP and mass spec data shown in Figure 4A, C and Figure S8 also lack a critical control, which is GFP only. Because of this, it is unclear whether subunits of the V-ATPase bind to HDA-1 or GFP. The co-IP and mass spec data forms the basis of Figures 5 and 6 as well as Figure S9. Data presented in these figures therefore has to be considered preliminary as well.

      In the co-IP and mass spec shown in Figure 4A, we used ACT-4::GFP as the negative control, which can preclude V-ATPase subunits that bind to GFP. In Figure 4C, we used anti-V1A (V-ATPase V1 domain A subunit) antibody to confirm the interaction between V1A and HDA-1. In Figure S8B, we also used ACT-4::GFP as a control, showing other NuRD subunits bind to HDA-1 rather than GFP.

      Inappropriate methods are used. For this reason, some of the data again must be considered preliminary. To give some examples:

      1. In Figure 5A, B, the authors used super-ecliptic pHluorin to look at changes in pH in the daughter cells. However, the authors used quenching of super-ecliptic pHluorin fluorescence rather than a ratio-metric method to 'measure' changes in pH. Because of this, it is unclear whether the changes in fluorescence observed are due to changes in pH or changes in the amount of pHluorin protein. Figure 5A, B forms the basis for the experiments presented in the remaining parts of Figure 5 as well as in Figure 6 and Figure S9;

      Bafilomycin A1 inhibits the activity of V-ATPase, presumably preventing the pumping of protons into the apoptotic daughter cell. It is more likely that the apoptotic daughter cell becomes less acidic and more neutral after BafA1 treatment, although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein. A ratio-metric method to measure changes in pH will be further used to distinguish the two possibilities.

      We added “although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein.” to Page 12 line 12 in the revised text.

      1. The authors' description of how some images were modified before quantitative analysis raises concerns. The figures of concern are particularly Figure 1 and Figure S4, where background subtraction with denoising and deconvolution was used. Background subtraction, with denoising and deconvolution is an image manipulation that enhances the contrast between background and what looks like foreground. Therefore, background subtraction should be applied primarily in experiments involving image segmentation not fluorescence intensity measurement. Not being provided any information by the authors about the kind of subtraction that was made, this processing could lead to an uneven subtraction across the image, which can easily lead to artefacts. Since the fluorescence intensity in the smaller daughter cell is lower, and thus closer to background, the algorithm the authors used may have misinterpreted the grey value information in the smaller daughter cell pixels. This could have led to an asymmetric subtraction of background in the two daughter cells, leading to a stronger subtraction in the smaller daughter cell. Ultimately, their processing could have artificially increased the intensity asymmetry between the two daughter cells in all their results.

      As mentioned earlier, the imaging and quantification methods of this manuscript have been routinely used in our previous publications or studies from many other labs (Gräbnitz F, et al., Cell Rep. 2023; Herrero E, et al., Genetics. 2020; Roubinet C, et al., Curr Biol. 2021). Background subtraction is a standard procedure to quantify cellular fluorescence intensities. The fluorescence intensity of the slide background was measured from a region without worm bodies, of the same size as the region of interest. We have added how we measured the background to page 19 Line 24: “The fluorescence intensity of the slide background was measured from a region without worm bodies, of the same size as the region of interest.”
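      The measurement described in the quoted sentence can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' ImageJ procedure: the function names and all pixel values are hypothetical, and the background is assumed to be a single mean value taken from a worm-free region of the same size as the region of interest.

```python
def mean_intensity(pixels):
    """Mean grey value of a list of pixel intensities."""
    return sum(pixels) / len(pixels)

def background_corrected(roi_pixels, background_pixels):
    """Mean ROI intensity minus the mean of an equally sized background region."""
    if len(roi_pixels) != len(background_pixels):
        raise ValueError("ROI and background region must be the same size")
    return mean_intensity(roi_pixels) - mean_intensity(background_pixels)

def intensity_ratio(roi_a, roi_b, background_pixels):
    """Ratio of background-corrected mean intensities (roi_a over roi_b)."""
    return (background_corrected(roi_a, background_pixels)
            / background_corrected(roi_b, background_pixels))

# Hypothetical pixel values (arbitrary units), not measurements from the study.
surviving_daughter = [170, 180, 175, 175]
apoptotic_daughter = [150, 150, 145, 155]
slide_background = [100, 98, 102, 100]

ratio = intensity_ratio(surviving_daughter, apoptotic_daughter, slide_background)
print(f"background-corrected ratio = {ratio:.2f}")  # prints 1.50
```

      With these made-up numbers the uncorrected ratio would be 175/150 (about 1.17), whereas subtracting the same background value from both regions gives 75/50 = 1.5, so the choice and uniformity of the background estimate directly affect the reported asymmetry.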

      The imaging data is of low quality (for example Figures 1, 2, 5, 6; Figures S2, S3, S5, S6, S9). Since much of the study and the findings are based on imaging, this is a major concern. Critical parameters are not mentioned (number of sections in z-stack, size of the field-of-view, laser power used etc), which makes it difficult to understand what was done and what one is looking at.

      Fluorescence imaging of neuroblast asymmetric cell division in developing C. elegans larvae has historically presented considerable challenges. Our recent methodological advancements have facilitated live imaging in this intricate system with improved resolution. In the revised manuscript, we have specified the z-stack parameters, field-of-view dimensions, and laser power settings employed: "Z-stack images were acquired over a range of -1.6 to +1.6 μm from the focal plane, at intervals of 0.8 μm. The field-of-view spanned 160 μm × 160 μm, and the laser power, as measured at the optical fiber, was approximately 1 mW."

      To give some specific examples,

      1. The images shown in Figure 2B are of very low quality with severe background from neighbouring cells. In addition, the outline of the cells (plasma membrane) or the nuclei of the daughter cells is unknown. Based on this it is not clear how the authors could have measured 'Fluorescence intensity ratio between sister nuclei' in an accurate and unbiased way (what is clear from these images is that there is an increase in HDA-1::GFP signal in ALL surviving daughters (asymmetric and symmetric divisions) post cytokinesis but not in the daughter cell that is about to die (asymmetric and unequal division));

      We employed live-cell imaging in conjunction with automated cell lineage tracing algorithms (Du et al., Cell, 2014) to scrutinize NuRD asymmetry in embryos from the two- or four-cell stage up to the 350-cell stage. This sophisticated approach was initially pioneered by Dr. Zhirong Bao at Sloan Kettering and subsequently refined by Dr. Zhuo Du during Dr. Du's postdoctoral training in Dr. Bao's laboratory. This advanced imaging pipeline enables the scientific community to quantify cellular fluorescence intensity in an automated fashion, thereby substantially mitigating manual intervention and bias.

      1. The images in Figure 6A and Figure S9A on VHA-17 segregation and its colocalization to ER and lysosome segregation during QR.a division are of very low quality and it is unclear to the reviewer how such images were used to obtain the quantitative data shown.

      In some cases, there is a discrepancy between what is shown in figures and what the authors state in the text. To give some examples:

      1. On page 7, the authors state "..., we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold at cytokinesis, respectively (Figure 1D-E)." Looking at the images for HDA-1 and LIN-53 in Figure 1D, the increase in the ratio mainly occurs between 4 min and 6 min, which is post cytokinesis and NOT prior to cytokinesis;

      We thank the reviewer for pointing this out. The nuclear HDA-1 or LIN-53 asymmetry increased to 1.5 or 1.8-fold 6 min after the onset of anaphase, when QR.a just completes cytokinesis. Therefore, we changed the sentence “we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold at cytokinesis, respectively (Figure 1D-E).” to “we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold upon the completion of cytokinesis, respectively (Figure 1D-E).”

      However, nuclear HDA-1 or LIN-53 asymmetry initiates prior to cytokinesis. We started to see the nuclear HDA-1 or LIN-53 asymmetry (1.4 fold for HDA-1 and 1.2 fold for LIN-53 ) 2 min after the onset of anaphase (Figure 1D).

      1. These images (Figure 1D) also show that there is an increase in the HDA-1 and LIN-53 signals in the larger daughter cells (QR.ap), which suggests that the increase in ratios (Figure 1E) is the result of increased HDA-1 and LIN-53 synthesis post cytokinesis. However, on top of page 8, the authors state "The total fluorescence of HDA-1, LIN-53 and MYS-1 remained constant during ACDs, suggesting that protein redistribution may establish NuRD asymmetry (Figure S4C)." In Figure S4C, the authors present straight lines for 'relative total fluorescence' for imaging (probably z-stacks) that was done every min over the course of 7 min. If there was no increase in material as the authors claim, they should have seen significant photobleaching over the course of the 7 min and therefore reduced level of 'relative total fluorescence' over time. How the data presented in Figure S4C was generated is therefore unclear. (Despite the fact that the authors claim that the asymmetry seen is not due to new synthesis in the larger daughter cell post cytokinesis, it would be more consistent with the first experiment presented in this study (Figure S1) that shows that there is more hda-1 mRNA in egl-1(-) cells compared to egl-1(+) cells);

      Regarding the concern of photo-bleaching, we have meticulously calibrated our imaging system over the past several years. Rigorous controls, qualification, and analyses were scrupulously undertaken during the development of our fluorescence time-lapse imaging system for the investigation of Q cell dynamics, initially established by Dr. Guangshuo Ou in Ron Vale's laboratory—a renowned hub for avant-garde imaging techniques (Ou & Vale, Journal of Cell Biology, 2009; Ou et al., Science, 2010). Remarkably, no discernible photobleaching was observed even during two to three-hour imaging.

      We agree that protein turnover, involving both degradation and synthesis, may occur. However, NuRD asymmetric distribution occurred within several minutes after metaphase and QR.a completes cytokinesis ~6 min after the onset of anaphase, while GFP protein translation and maturation require ~30 min in Q neuroblasts (Ou & Vale, Journal of Cell Biology, 2009). Even if hda-1::gfp mRNA is translated during cell division, the nascent GFP-tagged protein will mature long after the completion of cytokinesis. Consequently, we postulate that the influence of newly synthesized GFP-tagged protein during Q cell division is negligible for quantification purposes. It is plausible that the asymmetry in HDA-1 protein distribution is independent of hda-1 mRNA asymmetry.

      1. On page 12, the authors state "..., in Baf A1-treated animals, QRaa inherited similar levels of HDA-1::GFP as its sister cell,...". However, looking at the image provided in Figure 5E (0 min), there seems to be a similar ratio of HDA-1::GFP between the daughter cells in DMSO and Baf A1-treated animals.

      We have adjusted the images in Figure 5E to show the asymmetry in DMSO-treated control animals. We acknowledge variations among animals. Our quantifications from more than 10 animals show the HDA-1 asymmetry in DMSO-treated animals in Figure 5B.

      Recommendations for the authors:

      Conclusion 1

      "Here, we demonstrate that the nucleosome remodeling and deacetylase (NuRD) complex is asymmetrically segregated into the surviving daughter cell rather than the apoptotic one during ACDs in Caenorhabditis elegans" (Abstract)

      Results described on pages 6-9 ("NuRD asymmetric segregation during neuroblast ACDs" and "NuRD asymmetric segregation in embryonic cell lineages") and data shown in Figure S1, Figure 1, Figures S2, S3, S4, S5, Figure 2.

      Conclusion 1 is not supported by the results as numerous concerns exist about the data in many of these figures (see above, major weaknesses). A more likely explanation for the authors' observations is that there is synthesis of NuRD post cytokinesis and that asymmetries in the amounts of NuRD observed in the two daughter cells is a consequence of their different cell sizes (QR.ap is 3x as large as QR.aa). This is supported by the finding that the loss of pig-1, which causes 'equal' division resulting in two daughter cells of similar sizes, abolishes the differences in NuRD seen between the daughter cells.

      As discussed earlier, GFP protein translation and maturation require ~30 min in Q neuroblasts (Ou & Vale, Journal of Cell Biology, 2009). Even if there is synthesis of NuRD post cytokinesis, the nascent GFP-tagged protein will not mature within our imaging timeframe. Therefore, NuRD asymmetry is unlikely to be a result of the synthesis of NuRD post cytokinesis. In addition, we found that MYS-1::GFP was symmetrically segregated into the small apoptotic daughter cells and big surviving daughter cells, suggesting NuRD asymmetry may be irrelevant to cell size asymmetry.

      Interestingly, despite the fact that the loss of pig-1 causes 100% of the divisions to be equal by size and symmetric with respect to NuRD amounts, it only causes about 30% of QR.aa cells to inappropriately survive. This demonstrates that there is a correlation between NuRD asymmetry and daughter cell size asymmetry but NOT between NuRD asymmetry and cell death. This also demonstrates that loss of 'NuRD asymmetry' and presence of NuRD in the daughter that should die is NOT sufficient to block its death.

      Cordes et al. 2006 (DOI: 10.1242/dev.02447) reported that in pig-1 loss-of-function mutants, <40% of Q.p lineages produce extra neurons because Q.pp cells inappropriately survive. Notably, only 30% and 5% of Q.p lineages produce extra neurons in ced-3 and egl-1 loss-of-function single mutants, respectively. pig-1 ced-3 or pig-1 egl-1 double mutants show a dramatically stronger phenotype than either single mutant, resulting in about 80% of Q.p lineages producing extra neurons. These results suggest that pig-1 functions in parallel to the EGL-1-CED-9-CED-4-CED-3 cell death pathway to promote Q cell apoptosis.

      We agree with the reviewer that “loss of 'NuRD asymmetry' and presence of NuRD in the daughter that should die is NOT sufficient to block its death” in pig-1 loss-of-function mutants. However, these results do not rule out the correlation between NuRD asymmetry and cell death. In the pig-1 mutant, the concentration of NuRD in Q.pp might not be high enough to completely block the death pathway. Alternatively, NuRD may be one but not the only factor blocking the cell death pathway.

      Lastly, it is important to underscore that cellular aberrations observed during early developmental stages are frequently corrected by compensation during subsequent developmental stages or even initial aging stages. For example, in certain cell migration mutants exhibiting early migration defects, the initial penetrance exceeds 80%; however, the penetrance falls to only 30% in adults. Such observations have been corroborated in our prior publications focusing on cell migration dynamics (Wang et al., PNAS, 2013; Zhu et al., Dev Cell, 2016). This appears to be a pervasive phenomenon, echoed by several laboratories specializing in neural development. The Sengupta and Blacque labs have reported that early aging can ameliorate ciliary phenotypes in C. elegans mutants with compromised intraflagellar transport mechanisms. Accordingly, late developmental stages may act as a compensatory buffer for antecedent developmental abnormalities.

      Conclusion 2

      "The absence of NuRD triggers apoptosis via the EGL-1-CED-9-CED-4-CED-3 pathway, while an ectopic gain of NuRD enables apoptotic cells to survive." (Abstract) Results described on pages 8-10 ("Loss of the deacetylation activity of NuRD causes ectopic apoptosis" and "NuRD RNAi upregulates the egl-1 expression by increasing its H3K27 acetylation") and data shown in Figure S6, Figure 3, Figure S7 and data shown in Figure 5.

      Because of the various concerns raised above (major weaknesses) about the data presented in Figure S6, Figure 3, Figure S7 (pleiotropic phenotypes of hda-1 and lin-53 RNAi animals, lack of controls etc), there is no evidence that NuRD has a specific and/or direct effect on egl-1 expression in cells programmed to die or that loss of NuRD causes ectopic egl-1-dependent cell death. The claim that "ectopic gain of NuRD enables apoptotic cells to survive." is based on Figure 5E, where the authors show that Baf A1 treatment causes symmetric NuRD segregation in 11/12 animals and that QR.aa survives in 11/12 animals. However, those data are unconvincing. As mentioned above (major weaknesses), from the low-quality images provided, it is not clear whether there is 'symmetric NuRD segregation' in Baf A1 treated animals, and the conditions of the animals are a concern because of pleiotropic effects of blocking V-ATPase. (I am not convinced I am actually looking at the same region of an L1 larva in the three animals because the HDA-1::GFP signal seems inconsistent across them.) One process that is affected by a block of V-ATPase is engulfment. The fact that the authors observe that 130 min post-cytokinesis, QR.aa still persists in Baf A1 treated animals could therefore be the result of a delay in engulfment rather than a block in cell death. In addition, the claim that ectopic gain of NuRD enables apoptotic cells to survive contradicts their findings on loss of pig-1 described above ('Conclusion 1').

      We acknowledge the limitations of our imaging system; however, as we pointed out earlier, we developed our imaging methods and have kept improving them. We have tried our best to obtain images from developing C. elegans larvae. On the other hand, we showed that hda-1 RNAi and lin-53 RNAi increase the expression of a subset of genes, including egl-1, either directly or indirectly (Fig. 3C). Figure 3B shows that the ectopic cell death caused by loss of NuRD is dependent on the EGL-1-CED-9-CED-4-CED-3 pathway. While we cannot exclude several other possibilities when addressing such a complex problem in such a challenging model system, these results provide some evidence that our claim is one of the possibilities.

      Conclusion(s) 3

      "We identified the vacuolar H+-adenosine triphosphatase (V-ATPase) complex as a crucial regulator of NuRD's asymmetric segregation. V-ATPase interacts with NuRD and is asymmetrically segregated into the surviving daughter cell. Inhibition of V-ATPase disrupts cytosolic pH asymmetry and NuRD asymmetry" (Abstract)

      Results described on pages 10-13 ("V-ATPase regulates asymmetric segregation of NuRD during somatic ACDs") and data shown in Figures 4, 5, 6, Figures S8, S9.

      As outlined above (major weaknesses), the evidence that HDA-1 interacts with the V-ATPase complex is preliminary (no GFP control), and I consider the imaging data showing that V-ATPase asymmetrically segregates very low quality and unconvincing (Figure 6). The data on pH changes are also preliminary as the experiment was not done the way it should have (quenching rather than ratiometric). Finally, there are concerns about the results that apparently demonstrate that inhibiting V-ATPase activity disrupts pH asymmetry and NuRD asymmetry (impact of Baf A1 treatment).

      As discussed earlier, Bafilomycin A1 inhibits the activity of V-ATPase, presumably preventing the pumping of protons into apoptotic daughter cells. It is more likely that the apoptotic daughter cell becomes less acidic and more neutral after Baf A1 treatment, although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein. A ratiometric method to measure changes in pH will be used in the future to distinguish the two possibilities.

      We added “although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein.” to Page 12 line 12 in the revised text.

      Conclusion 4

      "We suggest that asymmetric segregation of V-ATPase may cause distinct acidification levels in the two daughter cells, enabling asymmetric epigenetic inheritance that specifies their respective life-versus-death fates." (Abstract) Discussion and model Figure 6C.

      I consider the model premature and not based on any convincing data. In addition, the role of V-ATPase and acidification does not make sense. V-ATPase is involved in the acidification of the lysosomal system (lumen), and it is thought that cytosolic acidification in apoptotic cells is caused by lysosomal leakage. This is not consistent with the authors' model.

      This manuscript lacks a section describing details of statistical analyses and the rationale for the chosen test, sample sizes, exclusion criteria, and replication details. Although the sample size is relatively smaller (less than 30), the authors used "unpaired t-test" for most of the tests. They should describe which type of t-test they used (parametric or non-parametric test). They also should provide replication details for non-statistical data set, for example Fig 3F and Fig 4C.

      We used the unpaired two-tailed parametric t-test. We have now added information on the statistical tests to the revised methods and figure legends.
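
      For readers unfamiliar with the test named above, the following is a minimal sketch of the statistic behind an unpaired parametric (Student's) t-test, assuming equal variances in the two groups. The function name `unpaired_t` and the sample values are hypothetical illustrations and are not taken from the manuscript.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Return (t, df) for Student's unpaired t-test with pooled variance.

    Assumes two independent samples with equal population variances.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance: each group's variance weighted by its
    # degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    # Standard error of the difference between the two group means.
    se = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df


# Hypothetical measurements for two groups (illustrative only).
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = unpaired_t(group_a, group_b)
```

      The two-tailed p-value is then the probability, under a t distribution with `dof` degrees of freedom, of a value at least as extreme as `t_stat` in either direction; statistical packages such as SciPy (`scipy.stats.ttest_ind`) report it directly.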

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this study, single neurons were recorded, using tetrodes, from the parahippocampal cortex of 5 rats navigating a double-Y maze (in which each arm of a Y-maze forks again). The goal was located at any one of the 4 branch terminations, and rats were given partial information in the form of a light cue that indicated whether the reward was on the right or left side of the maze. The second decision point was uncued and the rat had no way of knowing which of the two branches was correct, so this phase of the task was more akin to foraging. Following the outbound journey, with or without reward, the rat had to return (inbound journey) to the start of the maze to begin again.

      Neuronal activity was assessed for correlations with multiple navigation-relevant variables including location, head direction, speed, reward side, and goal location. The main finding is that a high proportion of neurons showed an increase in firing rate when the animal made a wrong turn at the first branch point (the one in which the correct decision was signalled). This increase, which the authors call rate remapping, persisted throughout the inbound journey as well. It was also found that head direction neurons (assessed by recording in an open field arena) in the same location in the room were more likely to show the rate change. The overall conclusion is that "during goal-directed navigation, parahippocampal neurons encode error information reflective of an animal's behavioral performance" or are "nodes in the transmission of behaviorally relevant variables during goal-directed navigation."

      Overall I think this is a well-conducted study investigating an important class of neural representation: namely, the substrate for spatial orientation and navigation. The analyses are very sophisticated - possibly a little too much so, as the basic findings are relatively straightforward and the analyses take quite a bit of work to understand. A difficulty with the study is that it was exploratory (observational) rather than hypothesis-driven. Thus, the findings reveal correlations in the data but do not allow us to infer causal relationships.

      We would like to clarify that this report consists of hypothesis-driven experiments, with post-hoc exploratory analyses. We have now made the hypotheses more explicit in the text, and pointed out that the follow-up analyses were intended to understand how these effects came to be. We thank the reviewer for pointing out that our hypotheses were not explicit in the introduction. We believe our results open the door for investigating the causal role of these regions in the propagation or generation of error signals during navigational behavior. Those types of experiments are, however, outside the scope of the current work.

      That said, the observation of increased firing in a subset of neurons following an erroneous choice is potentially interesting. However, the effect seems small. What were the actual firing rate values in Hz, and what was the effect size?

      We thank the reviewer for the opportunity to clarify the effect size question. As there are multiple neurons in the analyses, differences in firing rate necessarily need to be normalized by a neuron's mean activity. For example, a difference of 1 spk/s is less meaningful when a neuron's base rate is 50 spk/s than when it is 10 spk/s. Furthermore, our reports are for population-level analyses, at which point comparing raw firing rate values and differences becomes more challenging. Nonetheless, we are including these raw metrics in two new supplemental figures (Figure 2 - figure supplement 4,5), where differences in individual neurons can be up to 15 spk/s. Additionally, the patterns and statistical results observed in the main text are preserved, with outbound Right Cue minus Left Cue showing left>stem>right (indicating error coding), and RW minus NRW showing negative values across all segments, indicating NRW>RW, or higher activity on inbound unrewarded trials. Statistics follow the corresponding main text results (Cue: segment LRT = 71.70; RW: segment LRT = 45.80).
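
      The normalization argument above can be illustrated with a short sketch. It simply divides a rate difference by the neuron's mean rate; the function name `normalized_difference` is hypothetical, and this is not the remapping score used in the manuscript, only an illustration of why the same 1 spk/s difference carries different weight at different base rates.

```python
def normalized_difference(rate_a, rate_b, mean_rate):
    """Difference between two firing rates (spk/s) as a fraction of the
    neuron's mean rate (spk/s)."""
    if mean_rate <= 0:
        raise ValueError("mean_rate must be positive")
    return (rate_a - rate_b) / mean_rate


# The same 1 spk/s difference for a neuron with a 50 spk/s base rate
# and for a neuron with a 10 spk/s base rate.
high_rate_neuron = normalized_difference(51.0, 50.0, 50.0)
low_rate_neuron = normalized_difference(11.0, 10.0, 10.0)
```

      Under this simple normalization the difference amounts to about 2% of the base rate for the first neuron and about 10% for the second.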

      I also feel we are lacking information about the underlying behavior that accompanies these firing rate effects. The authors say "one possibility is that the head-direction signal in the parahippocampal region reflects a behavioral state related to the navigational choice or the lack of commitment to a particular navigational route" which is a good thought and raises the possibility that on error trials, rats are more uncertain and turn their heads more (vicarious trial and error) and thus sample the preferred firing direction more thoroughly. Another possibility is that they run more slowly, which is associated with a higher firing rate in these cells. I think we, therefore, need a better understanding of how behavior differed between error trials in terms of running speed, directional sampling, etc.

      In terms of running speed, there was a small effect of mean running speed between correct and incorrect trials (across subjects LMEM: Cue correct>incorrect Z=2.3, p=0.02; RW Z=2.15, p=0.03). In most neurons, increases in speed will be accompanied by increases in firing rate. Thus, the differences in running speed cannot explain the observed results, as higher speed during correct trials would predict higher activity, which is the opposite of what we found.

      A few good, convincing raw-data plots showing a remapping neuron on an error trial and a correct trial on the same arm would also be helpful (the spike plots were too tiny to get a good sense of this: fewer, larger ones would be more helpful).

      Additional plots for individual units have been added, Figure 2 - figure supplement 3.

      It would be useful to know at what point the elevated response returned to baseline, how - was it when the next trial began, and was the drop gradual (suggesting perhaps a more neurohumoral response) or sudden.

      Due to the experimental design, this question cannot be addressed fully. Concretely, error trials incur a time penalty in which the rats need to wait an additional 10 seconds before the next trial, while a new trial would start immediately when the animal nose-poked the home well after a correct trial. Nonetheless, the data on Reward provide insight into this question. The magnitude of the responses on the left and right segments of the maze was larger than on the stem for Unrewarded (NRW) vs Rewarded (RW) trials on inbound trajectories (Fig. 4c). This suggests that while activity is still elevated post-incorrect throughout the maze, across units, this effect is smaller on the stem segment. Additionally, the analyses indicate that in the transition between outbound and inbound trajectories (Figure 4 - figure supplement 3), activity patterns are sustained (incorrect>correct). Together, these results indicate that the elevated "error-like" signals are slow in returning to baseline.

      Reviewer #2 (Public Review):

      This work recorded neurons in the parahippocampal regions of the medial entorhinal cortex (MEC) and pre- and para-subiculum (PrS, PaS) during a visually guided navigation task on a 'tree maze'. They found that many of the neurons reflected in their firing the visual cue (or the associated correct behavioral choice of the animal) and also the absence of reward in inbound passes (with increased firing rate). Rate remapping explained best these firing rate changes in both conditions for those cells that exhibited place-related firing. This work used a novel task, and the increased firing rate at error trials in these regions is also novel. The limitation is that cells in these regions were analyzed together.

      We acknowledge this limitation of our study, and we believe there might be interesting differences between these regions. Unfortunately, the post-mortem extraction of the recording implant micro-drive used for these experiments generated too much tissue damage for exact localization of the tetrodes. Nonetheless, given that the patterns were observed in all subjects, we are confident that at least the major finding of "error-like" signaling is present across the parahippocampal regions. At the same time, the distributions of functional cell types as defined in the open field are different across the PaS, PrS and MEC, leaving open the possibility of a more nuanced error coding scheme by region.

      Reviewer #3 (Public Review):

      The authors set out to explore how neurons in the rodent parahippocampal area code for environmental and behavioral variables in a complex goal-directed task. The task required animals to learn the association between a cue and a spatial response and to use this information to guide behavior flexibly on a trial-by-trial basis. The authors then used a series of sophisticated analytical techniques to examine how neurons in this area encode spatial location, task-relevant cues, and correct vs. incorrect responding. While these questions have been addressed in studies of hippocampal place cells, these questions have not been addressed in these upstream parahippocampal areas.

      Strengths:

      1) The study presents data from ensembles of simultaneously recorded neurons in the parahippocampal region. The authors use a sophisticated method for ensuring they are not recording from the same neurons in multiple sessions and yet still report impressive sample sizes.

      2) The use of the complex behavioral task guards against stereotyped behavior as rats need to continually pay attention to the relevant cue to guide behavior. The task is also quite difficult ensuring rats do not reach a ceiling level of performance which allows the authors to examine correct and incorrect trials and how spatial representations differ between them.

      3) The authors take the unusual approach of not pre-processing the data to group neurons into categories based on the type of spatial information that they represent. This guards against preconceived assumptions as to how certain populations of neurons encode information.

      4) The sophisticated analytical tools used throughout the manuscript allow the authors to examine spatial representations relative to a series of models of information processing.

      5) The most interesting finding is that neurons in this region respond to situations where rewards are not received by increasing their firing rates. This error or mismatch signal is most commonly associated with regions of the basal ganglia and so this finding will be of particular interest to the field.

      Weaknesses:

      1) The histological verification of electrode position is poor and while this is acknowledged by the authors it does limit the ability to interpret these data. Recent advances have enabled researchers to look at very specific classes of neurons within traditionally defined anatomical regions and examine their interactions with well-defined targets in other parts of the brain. The lack of specificity here means that the authors have had to group MEC, PaS, and PrS into a functional group; the parahippocampus. Their primary aim is then to examine these neurons as a functional group. Given that we know that neurons in these areas differ in significant ways, there is not a strong argument for doing this.

      See response to Reviewer 2.

      2) The analytical/statistical tools used are very impressive but beyond the understanding of many readers. This limits the reader's ability to understand these data in reference to the rest of the literature. There are lots of places where this applies but I will describe one specific example. As noted above the authors use a complex method to examine whether neurons are recorded on multiple consecutive occasions. This is commendable as many studies in the field do not address this issue at all and it can have a major impact as analyses of multiple samples of the same neurons are often treated as if they were independent. However, there is no illustration of the outputs of this method. It would be good to see some examples of recordings that this method classifies as clearly different across days and those which are not. Some reference to previously used methods would also help the reader understand how this new method relates to those used previously.

      We have added an additional Supplemental Figure (Figure 7 - figure supplement 1) that showcases the matching waveform approach. In the original manuscript, Fig. 7a provided an example output of the method.

      3) The effects reported are often subtle, especially at the level of the single neuron. Examples in the figures do not support the interpretations from the population-level analysis very convincingly.

      Additional plots for individual units have been added, Figure 2 - figure supplement 3. However, the effects, though small by unit, are consistent across neurons and subjects.

      The authors largely achieve their aims with an interesting behavioral task that rats perform well but not too well. This allows them to examine memory on a trial-by-trial basis and have sufficient numbers of error trials to examine how spatial representations support memory-guided behavior. They report ensemble recordings from the parahippocampus which allows them to make conclusions about information processing within this region. This aim is relatively weak though given that this collection of areas would not usually be grouped together and treated as a single unitary area. They largely achieve their aim of examining the mechanisms underlying how these neurons code task-relevant factors such as spatial location, cue, and presence of reward. The mismatch or error-induced rate remapping will be a particularly interesting target for future research. It is also likely that the analytical tools used in this study could be used in future studies.

      Reviewer #1 (Recommendations For The Authors):

      1) Typo: "attempted to addresses these challenges"

      We thank the reviewer for pointing this out; this has been fixed.

      2) "classified using tuning curve based metrics" - what does "tuning curve" mean in this context?

      We have clarified this sentence in the main text.

      3) "MEC neurons encode time-elapsed" should be "MEC neurons encode time elapsed" (no hyphen)

      We thank the reviewer for pointing this out; this has been fixed.

      4) "a phenomenon referred to as 'global remapping'." - I dislike this term because it has two meanings in the literature. On the one hand, it is used to contrast with rate remapping: that is, it refers to a change in the location of place fields. On the other hand, it refers to the remapping of the whole population of cells at once, as contrasted with partial remapping. I suggest calling them location remapping (vs. rate) and complete remapping (vs. partial)

      We agree that this is an overloaded term in the field. We have added 'location remapping' in the intro as a more specific term for global remapping.

      5) " tasks with no trial-to-trial predictability or experimenter-controlled cues and goals in the same environment." - ambiguously worded as it isn't clear whether the "no" refers to one or both of what follows. Also needs a hyphen after experimenter.

      We thank the reviewer for pointing this out; this sentence has been reworded for clarity.

      6) " neurons changed their firing activity as a function of cue identity" - this is confounded by behavior and reward contingency, both linked to cue identity, so the statement is slightly misleading.

      We thank the reviewer for pointing this out; however, we disagree that this wording is misleading. Neurons changed their activity as a function of cue identity and reward contingencies. Why neurons change their activity in such a manner is a different, more nuanced question that we addressed through our analyses by converging on the "error-like" signal that these neurons seem to carry.

      7) "remapping" - I am not fully comfortable with the use of this term in this context. It derives from the original reports of change in the firing location of place cells, and the proposal that these cells form a "map" with the change in activity reflecting recruitment of a new map. With observations of rate changes in some place cells, the new term "rate remapping" was introduced, and now the authors are using "rate remapping" to mean firing rate changes in non-spatial cells. The meaning is thus losing its value. "Re-coding" might be slightly better, although we can argue about whether "code" is much better than "map"

      While we agree with the reviewer that "remapping" has been coerced into a grab-all term, these are the accepted semantics in the field. Thus, we are disinclined to change the language.

      8) Figure 1 - it would be useful to indicate somehow that one of the decision points was cued and one was a free choice with a random outcome - it took me a while to work this out. Also, the choice of colors for the cues needs explaining - my understanding is that rats are very insensitive to these wavelengths. And what does Pse mean? I didn't understand those scatterplots at all.

      The section "Tree-Maze behavior and electrophysiological recordings" under Results goes into the details of the task. A reference and additional context for the selection of cues are now included in the "Behavioral Training" methods section. Rats possess dichromatic vision. The captions of Figures 1 and 2 include what Pse means: the performance of a subject for a given session. The scatter plots relate remapping to performance.

      9) Also on Figure 1 - in the examples shown, it looks like the animals always checked the two end arms in the same order. Was this position habit typical?

      We have added additional context in the "Behavioral Training" methods section. Well-trained rats do exhibit stereotyped behaviors (e.g., going to one well then the other).

      10) "...we hypothesized that the cue remapping score would be related to a subject's performance in the task." I am struggling to see why this doesn't follow trivially from the observation that remapping occurred on error trials.

      We thank the reviewer for pointing out that this could use further clarity. We have added that the magnitude of remapping is what should relate to performance. To further clarify, remapping does not occur on error trials, remapping as operationally defined in this work, is the difference of spatial maps as a function of Cue identity or Reward contingency. Thus, as a difference metric, remapping occurs because there is a difference in activity between correct and incorrect trials. The magnitude of that difference need not relate to how the subject performed on the task.

      11) "With this approach, found that incorrect coding units were more likely to overlap between cue and RW coding units than correct." Missing "we". I didn't understand this sentence - what does "overlap" mean?

      We have added a sentence to further clarify this point.

      12) "We found that incorrect>correct activity levels on outbound trajectories predicted incorrect>correct activity levels on inbound trajectories" - I don't understand how this can be the case given that many of these units were head direction tuned and therefore shouldn't even have been active in both directions.

      As seen in Figure 7b, we were able to match 217 units across tasks. Of those, "Cluster 0" with 98 units showed strong head-direction coding. While "Cluster 0" units showed strong remapping effects, there were a lot of other units that could have contributed to the "incorrect>correct" across (out/in)-bound segments. Further, head-direction coding is defined in the Open-field environment, and there's no constraint on what these neurons could do on the Tree Maze task.

      13). " Error or mismatch signals conform a fundamental computation" - should be "perform"

      Wording slightly changed, but "conform" as in "act in accordance to" is what we intend here.

      14) " provides it with the required stiffness and chemical resistivity"- what does "chemical resistivity" mean? To what chemicals?

      This is mostly in reference to rat waste and cleaning products (alcohol). We changed the wording to durability for simplicity.

      15) Supp Fig 1 shows that behavioral performance was very distinctly different for one of the animals. Was its neural data any different? What happens to the overall effect if this animal is removed from the analysis?

      Unless otherwise stated, all analyses are performed through linear mixed effects with "subject" as a random effect. Thus, the effects of individual subjects are accounted for.

      16) Histology - it would be useful to have a line drawing from the atlas alongside the micrographs to enable easier anatomical understanding.

      There was variability in the medial lateral location of the tetrodes across animals and in the histological images provided and thus, we felt this would not be useful information as a single line drawing will not encompass/apply to all histology photos.

      17) Supp. Fig. 5/6 I didn't understand what Left, Stem, and Right mean at the top. Also, the color keys are too tiny to be noticed

      An additional sentence has been added to the caption to clarify that left, stem, right refer to what segment was selected via the ranking of scores.

      Reviewer #2 (Recommendations For The Authors):

      Was there a particular reason why cells in these regions were analyzed together? Can some of the results be tested for cells of a particular region, especially the MEC? One major limitation of these results is that it is unclear which regions it applies to, e.g., one cannot be certain that data shows here that MEC cells have these firing properties.

      Damage due to the extraction of the recording tetrode bundle was extensive and we were not able to parcellate individual regions. We have added additional details on this in the "Histology" section of the methods.

      It is unclear how many cells in each region are included in each analysis. There is supplementary fig 3 but not sure how many of these met the criteria to be included in a certain analysis and it does not differentiate regions. Also, was any of the MUA included in the analyses?

      Isolated single units were included in all analyses, but we did not differentiate from what region each unit came from. Analyses that include MUA are separate from the main findings, and are included in supplemental figures as reference.

      Was the error trial defined as a trial when the animal did not make the right light-guided choice or did it also include cases in which the light-related arm choice was correct, but the animal first went to the unrewarded end arm? Nomenclature in the results is not explained well - what is an unrewarded trial or unrewarded trajectory or an error trial?

      We have added a new paragraph in the methods under Behavioral Training that addresses trial nomenclature. This methods section is now referenced twice in the initial paragraphs of the results section.

      Were any grid cells included in the data, especially could any cross-matched across the open field and the maze runs?

      This was indeed a question of interest to us; however, the number of grid cells recorded was not adequate for meaningful statistical inference. We further sought to avoid tuning-curve-based functional classifications of units.

      In general, the results section is difficult to read, and its accessibility could be improved.

      We thank the reviewer for this accessibility point. We hope that the small tweaks made as a product of this revision will improve the readability of the manuscript. We tried to have major takeaways for each result, but the nature of the analyses necessarily makes the text somewhat dense.

      Minor:

      One of the Figure 3f references should be Figure 3g and later, Figure 44 should be corrected.

      We thank the reviewer for noting this; it has been fixed.

      Reviewer #3 (Recommendations For The Authors):

      There are a number of issues that I think could be addressed to improve the manuscript:

      1) The figure could make it clearer where the LED panel is. Are the authors confident the rats see the cue on each trial?

      We have added a new supplemental figure to address this question (Figure 1 - figure supplement 1). The new figure shows the 3D geometry of the maze and the location of the Cue panel. The rats were able to see the cue; otherwise, task performance would have remained at chance levels.

      2) The same maze has been used in a series of studies of hippocampal place cells by Paul Dudchenko's group. They also went on to examine how these representations are affected in a very similar cued spatial response task. These studies should be acknowledged.

      We thank the reviewer for pointing out this oversight. We have added the Ainge et al. citation (https://doi.org/10.1523/JNEUROSCI.2011-07.2007) when first introducing the maze and in the methods.

      3) In a number of supplementary figures, the authors present neurons that are selective for different properties such as segment, cue, reward, and direction. However, the number of spatially and cue-selective cells and the criteria by which cells are designated as selective are not reported. The analyses of spatial remapping and response to cues are done at the population level so I'm not sure how these cells are classified or selected for the figures.

      The procedure for selection is included in the figure captions. Each unit is ranked based on the Uz score by segment as originally shown in Figures 2 and 4.

      4) Related to this, the example cells on the figures do not clearly represent the effects presented. For example, given the title of Figure 2, I assume that the cells in 2B significantly remap. However, they don't look like they remap - the cells in the top row show rate remapping in one segment of the maze while the cells in the bottom do not show clear rate remapping responses. I suspect that traditional rate map-based analyses using maps based on consistently sized pixels rather than large segments would show only very modest changes in correlations or rates across these different types of trials. It is important to report the findings in this way as the authors interpret their data relative to the rate-remapping studies which have used these analyses. Readers who do not have the time or expertise to examine the methods in detail will conclude that the effects reported here are the same as previous rate remapping studies which the examples suggest is not the case.

      Additional plots for individual units have been added to the supplement, Figure 2 - figure supplement 3. However, the effects, though small by unit, are consistent across neurons and subjects (Figure 2 - figure supplement 5).

      5) Why is there a bias on the stem in 2C? This is of similar size to the effect on the right side and so deserves discussion.

      The analysis in question is the across unit level bias in cue-coding by maze segment. The left segment shows elevated Right Cue coding, while the right segment shows elevated Left Cue coding. There was one reported statistical result, the main effect of segment in the Linear Mixed Effects model. We expand this result in the following two ways:

      1. Individual statistical results by segment

      a. Left Segment (Uz Coef. Estimate = 0.5, CI95%=[0.26, 0.75], p<1e-4)

      b. Stem Segment (Uz Coef. Estimate = 0.22, CI95%=[-0.01, 0.47], p=0.06)

      c. Right Segment (Uz Coef. Estimate = -0.27, CI95%=[-0.51, -0.03], p=0.03)

      2. Reporting the joint hypothesis test of left > stem > right by unit.

      a. χ²=90.45, p=2.28e-20

      b. The comparison of left>stem by unit:

      i. coefficient estimate = 0.28, CI95%=[0.11, 0.44], p=0.0008

      Although the reviewer is correct in pointing out the effect size similarity, the appropriate statistical comparisons within and across units support the stated conclusions. In terms of systematic coding bias, there is a small bias across units (60% of units) and animals (4 out of 5) for the Right Cue. Although interesting, this effect is orthogonal to the comparisons of interest (within-unit differences). To highlight this point, we have added the statistics of the joint hypothesis test of left>stem>right to the main manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Response to Reviewer 1 Comments (Public Review):

      Point 1: While the authors provided a large amount of data regarding the genes involved in the TOR pathway, it is highly descriptive and mostly confirmative data, as numerous papers have already shown that the TOR pathway plays essential roles in a myriad of biological processes in multiple fungi.

      Response 1: Thank you for your comment. The target of rapamycin (TOR) signaling pathway plays critical roles in various eukaryotic organisms. However, its specific role in controlling the development and virulence of opportunistic pathogenic fungi like A. flavus has remained unclear. Additionally, the underlying mechanism of the TOR pathway remains elusive in A. flavus. As such, our study provides a useful contribution, as it is the first to comprehensively investigate the majority of genes in the conserved TOR signaling pathway in A. flavus.

      Point 2: The authors seemed to perform a series of parallel studies in several genes involved in the TOR pathway in other fungi. However, their data are not properly interconnected to understand the TOR signaling pathway in this fungal pathogen. The authors frequently drew premature conclusions from basic phenotypic observations. For instance, based on their finding that sch9 mutant showed high calcium stress sensitivity, they concluded that Sch9 is the element of the calcineurin-CrzA pathway. Furthermore, based on their finding that the sch9 mutant show weak rapamycin sensitivity and increased Hog1 phosphorylation, they concluded that Sch9 is involved in TOR and HOG pathways. To make such conclusions, the authors should provide more detailed mechanistic data.

      Response 2: Yes, we agree with the reviewer's comment. We have carefully reviewed the manuscript and made the necessary revisions to eliminate arbitrary conclusions. For example, we have removed the statement that "Sch9 is the element of the calcineurin-CrzA pathway". Furthermore, we have rephrased our conclusions to better reflect our findings: "these results reflected that Sch9 regulates osmotic stress response via the HOG pathway in A. flavus" (Lines 279-280, page 13). We appreciate the reviewer's input, which has contributed to the clarity and accuracy of our work.

      Point 3: In the section "Tor kinase plays important roles in A. flavus", some parts of their data are confusing. The authors said they identified a single Tor kinase ortholog, which is orthologous to S. cerevisiae Tor2. They then said they failed to obtain a null mutant, but constructed a single copy deletion strain delta Tor1+/Tor2-. What does this mean? Does this mean an A. flavus diploid strain? So is this a heterozygous TOR/tor mutant? Otherwise, does the haploid A. flavus strain they used contain multiple copies of the TOR gene within its genome? What is the real name of A. flavus Tor kinase (Tor1 or Tor2?). "tor1+/tor2-" is the wrong genetic nomenclature. What is the identity of deltaTor1+/Tor2-? Please provide detailed information on how all these mutants were generated. A similar issue was found in the analysis of TapA, which is speculated to be essential (what is the deltaTapA1+/TapA2-?). I couldn't find any detailed information even in Materials and Methods. The authors should provide Southern blot data to validate all their mutants.

      Response 3: Thank you for your comments. We acknowledge the confusion in our presentation and will ensure that accurate genetic nomenclature is used consistently throughout the paper.

      In response to your queries, we have included a section in the Materials and Methods, titled "Detection of tor and tapA genes copy number in strains" (Lines 615-621, page 29), to provide details on how we determined the copy numbers of the tor and tapA genes in the strains. Our findings revealed that both the tor and tapA genes are present in double copies in our strains, which guided our decision to construct single-copy deletion strains using homologous recombination. We have verified these copy numbers using absolute quantification PCR (Table S1).

      The use of the abbreviation '+/-' for the single copy knockout strains, such as tor+/- and tapA+/-, is consistent with common fungal literature practice. We apologize for any confusion caused by this nomenclature.

      Although we did not use Southern blot data for validation, we conducted PCR and gene sequencing to confirm the mutants. We appreciate your comments, which helped improve the clarity and accuracy of our manuscript.

      Point 4: How were the FRB domain deletion mutants constructed? If the FKBP12-rapamycin binding (FRB) domain is specifically deleted in the Tor kinase allele, should it be insensitive and resistant to rapamycin? However, the authors showed that the FRB domain deleted TOR allele was indeed non-functional.

      Response 4: We appreciate the reviewer's attention to the construction of the Fkbp12-rapamycin binding (FRB) domain deletion mutants and the discrepancy between the expected and observed results.

      For the knockout of the FRB domain, we used the homologous recombination method, but because the tor gene is present in two copies, the FRB domain is also present in two copies. Despite our efforts, we encountered challenges in precisely determining the location of the other copy of the tor gene.

      We share the common expectation that deletion of the FRB domain should result in insensitivity and resistance to rapamycin, as it disrupts the binding site for Fkbp-rapamycin. However, we observed that the FRB domain-deleted mutant was more sensitive to rapamycin. This intriguing result suggests that there are additional factors or complexities involved in TOR signaling pathway regulation in A. flavus. We hypothesize that this result is related to the double copy of the tor gene. The reviewer's keen observation and comment have contributed to our efforts to better understand and explain this result.

      Point 5: In Figure 4C, the authors should monitor Hog1 phosphorylation patterns under stressed conditions, such as NaCl treatment, and provide quantitative measurements. Similar issues were found in the western blot analysis of Slt2 (Fig. 8D).

      Response 5: We agree with the reviewer that we should monitor Hog1 phosphorylation patterns under stressed conditions. In response to this valuable suggestion, we conducted additional experiments to examine Hog1 phosphorylation patterns under NaCl treatment for 30 minutes. The quantitative measurements of Hog1 phosphorylation levels under stress have been added to Figure 4E in the revised manuscript. Similarly, we have addressed the issue raised regarding Slt2 in Figure 8D.

      Point 6: For all the deletion mutants generated in this study, the authors should generate complemented strains to validate their data.

      Response 6: We appreciate the reviewer's suggestion to generate complemented strains for all the deletion mutants in our study to validate our data. However, due to the extensive number of genes involved in this research, it is hard to create complemented strains for each individual deletion mutant. As suggested by the reviewer, we have constructed complemented strains for several key deletion mutants, such as ΔsitA-C and Δppg1-C.

      Response to Reviewer 1 Comments (Recommendations For The Authors):

      Point 1: Overall, this manuscript was very poorly organized and not presented logically. It requires extensive English language editing.

      Response 1: We appreciate the reviewer's feedback regarding the organization and language quality of our manuscript. To address these concerns, we have restructured the manuscript to improve its logical flow and coherence. We thank the reviewer for their constructive criticism, which has been instrumental in the manuscript's refinement.

      Point 2: The authors did not present their figures in the order of description. For example, the authors suddenly described Figure 9A data in lines 128-130 in the middle of describing Figure 1. Furthermore, Figures 1D and 1F were described earlier than Figures 1B and 1C. In addition, Figure S2 was shown earlier than Figure S1. Please check this throughout the manuscript.

      Response 2: We thank the reviewer for their insightful observation. We acknowledge the importance of a logical and coherent figure sequence for reader comprehension. After careful review, we have rearranged the text and images throughout the entire document to enhance the reading experience. The revised manuscript now presents figures in a consistent and logical order, following the sequence of descriptions. We believe this improvement will enhance the overall readability and comprehension of our research.

      Point 3: The authors should follow the standard genetic nomenclature rules.

      Response 3: Thank you for your suggestion. We have revised our manuscript to ensure that we are following the standard genetic nomenclature rules throughout. This includes the correct naming of genes, proteins, and mutations, as well as the use of appropriate italicization and formatting. We follow the rules: gene symbols are typically composed of three lowercase italicized letters, while protein symbols are not italicized, with an initial capital letter followed by lowercase letters.

      Point 4: These are just a few examples. Besides the ones that I mentioned, I found numerous grammatically wrong or awkward sentences throughout the manuscript. So this manuscript requires extensive English proofreading.

      Response 4: We apologize for the language problems in our manuscript. We have asked a native English speaker to improve the overall language quality and readability of the text. We believe that these improvements will significantly enhance the manuscript's overall quality and make it more accessible to a broader audience.

      Response to Reviewer 2 Comments (Public Review):

      Point 1: However, findings have not been deeply explored and conclusions mostly are based on parallel phenotypic observations. In addition, there are some concerns that exist surrounding the conclusions.

      Response 1: We are grateful for the suggestion. We have conducted additional experiments and analyses to delve more deeply into our findings and ensure a more robust basis for our conclusions.

      Response to Reviewer 2 Comments (Recommendations For The Authors):

      Point 1: Verification for mutants: a single copy deletion strain ΔTor1+/Tor2- (containing one copy of the Tor gene); however, in the table of strain list, these appear to be null mutants. There are no further verifications of the relative genes' expression and no complementary strains.

      A. flavus ΔTor: Δku70; ΔniaD; ΔTor::pyrG

      A. flavus ΔTapA: Δku70; ΔniaD; ΔTapA::pyrG

      As described in pp208, "While we failed to obtained a null mutant, we constructed a single copy deletion strain ΔTor1+/Tor2- (containing one copy of the Tor gene) constructed by homologous recombination". But the authors think there was only one Tor kinase ortholog (AFLA_044350). This mutant is hard to understand. What is the evidence to verify that the phenotypes of the ΔTor1+/Tor2- strain resulted from deletion of Tor2? There is no detail on how the ΔTor1+/Tor2- strain was made.

      Response 1: Thank you for your important comments and suggestions. We apologize for the confusion caused by the genetic nomenclature. We have made the necessary corrections in the strain list table to accurately reflect the genotypes of the strains (Table S3).

      Multicopy variation of genes has not been explored in detail in fungi, especially in A. flavus, but is a commonly known phenomenon in mammalian genomes[1-2]. Yeast has two tor genes, tor1 and tor2, whereas higher eukaryotes such as plants, animals, and filamentous fungi have only one tor gene[3-4]. The homology comparison results show that the genome of A. flavus contains only one tor gene. However, the tor gene in A. flavus exhibited varying copy numbers, as was confirmed by absolute quantification PCR at the genome level (Table S1).

      In this study, we constructed a single copy deletion strain, tor+/-, through homologous recombination. This strain contains one copy of the tor gene. We have provided a more detailed and explicit description of the methods used to detect gene copy number in the strains (Lines 615-621, page 29). We thank the reviewer for pointing out these important issues.

      Point 2: For a point mutant strain TORS1904L, they found that the sensitivity to rapamycin is consistent with the WT strain, it could not tell anything. It should be moved to Suppl.

      Response 2: Thanks for your important comments. We acknowledge that these results may not provide significant insights. In response to this suggestion, we have deleted the data related to the TORS1904L point mutant strain and its sensitivity to rapamycin so that the main manuscript focuses on the most pertinent and informative findings. Corresponding modifications have been made in the revised manuscript.

      Point 3: For subtitle "Sch9 is correlate with the HOG and TOR pathways "What is the meaning for "correlate" similarly?

      Response 3: Thank you for this comment. We apologize for the unclear wording. To enhance clarity, we have revised the subtitle to convey this conclusion more explicitly: "The Sch9 kinase is involved in aflatoxin biosynthesis and the HOG pathway" (Line 242, page 12).

      Point 4: For the ΔTapA 1+/TapA 2- strain (containing one copy of the TapA gene), there should be a complementary strain to verify the specific role of TapA. In Fig. S1B, for ΔTOR and ΔTapA, one cannot tell whether the TOR gene has been edited. Did you test the mRNA of the TOR gene?

      Response 4: Thanks for your important comments. Due to the large number of genes involved, we did not perform a complementation experiment. However, we used PCR and sequencing to verify the editing of our gene. Additionally, we conducted copy number and mRNA analyses to verify its function. The transcriptional level of the tor gene in the tor+/- mutant was downregulated compared to the level in the wild-type strain (Fig. S6).

      Response to Reviewer 3 Comments (Public Review):

      Point 1: As for many results, I miss the re-complementation of the created mutants throughout the manuscript. This is standard praxis.

      Response 1: Thanks for your suggestions. We acknowledge that re-complementation is a standard practice for validating the effects of gene deletions. However, due to the large number of genes involved in our study, we have performed supplementary experiments on a selection of them, such as ΔsitA-C and Δppg1-C. We are grateful to the reviewer for your understanding of this practical consideration.

      Point 2: Fig. 1: cultures were grown for 48 h before measuring the transcript level. The authors show that brlA, abaA, and some sexual regulators are less expressed. In my opinion, this does not allow the conclusion that there is a direct control through rapamycin. Since the colonies grow very slowly in the presence of rapamycin, the authors should add rapamycin and follow gene expression after 15, 30, 60, 90 min. The figure legend needs to be more detailed. Which type of cultures were used, liquid, solid medium? Etc.

      Response 2: We deeply appreciate the reviewer’s suggestion. Since we found no significant differences in gene expression following shorter treatment times, we extended the treatment duration. We conducted additional experiments to examine gene expression levels at longer time intervals (3, 6, and 9 h) after the addition of rapamycin (Figure 1H-1J). These time points allow us to capture the dynamic changes in gene expression in response to rapamycin more effectively. Additionally, we have expanded the figure legend to provide a more comprehensive description that specifies the type of cultures used in the experiments.

      Point 3: Why in chapter one Fig. 9 is already cited? Those data should then be included in Fig. 1 for the general phenotype.

      Response 3: Thank you for the suggestion. We have reordered the figures in the updated version of the manuscript to ensure consistency and clarity.

      Point 4: The authors wrote that radial growth and conidiation were gradually reduced with increasing rapamycin concentrations. This is not true. There is no gradient! However, it should be tested if there is a gradient if lower concentrations are used. The current data imply that there is a threshold concentration, so either there is 100 % growth or a reduction to 25 %. This looks strange.

      Response 4: Thank you for underlining this deficiency. We agree that a threshold concentration versus a gradient is an important distinction that needs to be clarified. Our results show that the addition of excessive quantities of rapamycin does not increase the inhibition of A. flavus growth. In contrast, as the concentration of the FK506 drug increases, there is a gradual decrease in the growth and cell production of A. flavus. This difference could potentially be attributed to the different mechanisms of action of the two drugs. Therefore, we have revised these confusing sentences (Lines 120-121, page 5).

      Point 1: There are many wrong spellings:

      Fig. 1. Before washed, before washing; RelaTEtive gene expERSion should read relative gene expression. Sclerotial should be sclerotia. See also Fig. 5 F, H, Fig. 6 E. 6D colon diameter should be colony diameter.

      Fig. 4E. The expressED level... should read Expression level..... (also without article) Also in A, F, H.

      Fig. 6C. TLC detection of WT.... The authors mean AF detection in extracts of WT..... AF was extracted and analyzed by TLC.....

      Labelling of axes in one figure should be uniform.

      Response 1: Thank you for your reminder. We apologize for the oversights, and we have carefully addressed and corrected all the mentioned spelling issues to ensure the accuracy and clarity of the manuscript.

      Point 2: If the authors refer to the genes, I think they should be in small letters and italics, if it is the protein, the first letter should be capitalised tap1 (italics) and Tap1.

      Response 2: We appreciate this suggestion. We have carefully checked the entire manuscript and revised it to follow the standard genetic nomenclature rules. We follow the naming conventions for microbial genes and proteins, where gene symbols are typically composed of three lowercase italicized letters, and protein symbols are not italicized, with an initial capital letter followed by lowercase letters.

      Point 3: Very often articles are used where I would not use them.

      Response 3: Thanks for your careful checks. We are sorry for our carelessness. Based on your comments, we have corrected article usage so that it is consistent throughout the manuscript. We value the reviewer's feedback, which has contributed to the overall quality of our writing.

      References:

      [1] Handsaker R, Van Doren V, Berman J, et al. Large multiallelic copy number variations in humans. Nat Genet. 2015; 47: 296-303.

      [2] Wang Y, Wang S, Nie X, et al. Molecular and structural basis of nucleoside diphosphate kinase-mediated regulation of spore and sclerotia development in the fungus Aspergillus flavus. J Biol Chem. 2019; 294(33): 12415-12431.

      [3] Kim DH, Sarbassov DD, Ali SM, et al. mTOR interacts with raptor to form a nutrient-sensitive complex that signals to the cell growth machinery. Cell. 2002; 110(2): 163-75.

      [4] Fu L, Liu Y, Qin G, et al. The TOR-EIN2 axis mediates nuclear signalling to modulate plant growth. Nature. 2021; 591(7849): 288-292.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Thank you for submitting your article "New genetic tools for mushroom body output neurons in Drosophila" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the assessment has been overseen by a Reviewing Editor and Albert Cardona as the Senior Editor.

      eLife assessment:

      This work advances on two Aso et al 2014 eLife papers to describe further resources valuable for the field. This paper adds more MBON split-Gal4s convincingly describing their anatomy, connectivity and function.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript Rubin and Aso provide important new tools for the study of learning and memory in Drosophila. In flies, olfactory learning and memory occurs at the Mushroom Body (MB) and is communicated to the rest of the brain through Mushroom Body Output Neurons (MBONs). Previously, typical MBONs were thoroughly studied. Here, atypical MBONs that have dendritic input both within the MB lobes and in adjacent brain regions are studied. The authors describe new cell-type-specific GAL4 drivers for the majority of atypical MBONs (and other MBONs) and using an optogenetic activation screen they examined their ability to drive behaviors and learning.

      The experiments in this manuscript were carefully performed and the results are clear. The tools provided in this manuscript are of great importance to the field.

      Reviewer #2 (Public Review):

      In this study, Aso and Rubin generated new split-GAL4 lines to label Drosophila mushroom body output neurons (MBONs) that previously lacked specific GAL4 drivers. The MBONs represent the output channels for the mushroom body (MB), a computational center in the fly brain. Prior research identified 21 types of typical MBONs whose dendrites exclusively innervate the MB and 14 types of atypical MBONs whose dendrites also innervate brain regions outside the MB. These MBONs transmit information from the MB to other brain areas and form recurrent connections to dopaminergic neurons whose axonal terminals innervate the MB. Investigating the functions of the MBONs is crucial to understanding how the MB processes information and regulates behavior. The authors previously established a collection of split-GAL4 lines for most of the typical MBONs and one atypical MBON. That split-GAL4 collection has been an invaluable tool for researchers studying the MB. This work extends their previous effort by generating additional driver lines labeling the MBON types not covered by the previous split-GAL4 collection. Using these new driver lines, the authors also activated the labeled MBONs using optogenetics and assessed their role in learning, locomotion, and valence coding. The expression patterns of the new split-GAL4 lines and the behavioral analysis presented in this manuscript are generally convincing. I believe that these new lines will be a valuable resource for the fly community.

      Recommendations for the authors:

      Minor additional suggestions:

      1. Please ensure that the FlyLight links are provided for the new splitGal4s in the methods as well as results.

      We added the requested link to the methods.

      2. Correct a typo in 'ethyl lactate' in the learning assays section of the methods.

      Corrected.

      Reviewer #1 (Recommendations For The Authors):

      In the behavior assay, the authors use the same flies that were used for optogenetic olfactory conditioning and memory tests, to also examine the effects of activation in the absence of odors but with airflow. I think this may affect the interpretation of the results. If possible, it would be nice to show in the MBON types where a conditioning effect was found (i.e. MBON21, 29, 33) that performing the activation in the absence of odors but with airflow without previous conditioning yields the same results.

      We share the reviewer's concern that behavioral phenotypes during the later 10s LED sessions may be compromised by early optogenetic olfactory conditioning. Therefore, prior to running the experiment shown in Figure 2, we confirmed that the activation phenotypes of three positive control lines (MB011B and SS40755) could be observed after olfactory conditioning sessions. We added these data as Figure 2-figure supplement 2. For SS75200 and SS77383, a split-GAL4 driver for MBON33, we observed a loss of activation phenotype in the second trial of the LED ON/OFF binary choice assay (Figure 3H). Therefore, we reran the 10s LED activation experiments without a previous optogenetic olfactory conditioning assay; these data are now also included in Figure 2-figure supplement 2.

      Reviewer #2 (Recommendations For The Authors):

      Below, I list some comments and suggestions which I hope could help the authors further improve their manuscript.

      1. The authors identified 2 candidate lines for MBON28. It would be helpful if they could clarify how they determined whether a split-GAL4 correctly labels an MBON or is just a candidate line.

      We have added in the methods section an explanation of the criteria used.

      “The correspondence between the morphologies of EM skeletons and light microscopic images of GAL4 driver line expression patterns was used to assign GAL4 lines to particular cell types. This can be done with confidence when there are not multiple cell types with very similar morphology. However, in the case of MBON28 we were not able to make a definitive assignment because of the similarity in the morphologies of MBON16, MBON17 and MBON28.”

      1. The authors have previously shown that the expression pattern of a GAL4 driver is strongly influenced by the reporter used. The expression patterns of the split-GAL4 lines in this study are based on 20XUAS-Chrimson-mVenus trafficked (attp18), the expression strength of which may differ from other reporters or effectors. I suggest that the authors discuss this potential caveat in their manuscript. This will allow readers to be more cautious and check the expression patterns with their own reporters/effectors when using these new split-GAL4 lines.

      We added the sentences below to address this concern.

      “The expression patterns shown in this paper were obtained using an antibody against GFP which visualizes expression from 20xUAS-CsChrimson-mVenus in attP18. Directly visualizing the optogenetic effector is important since expression intensity, the number of labeled MBONs and off-targeted expression can differ when other UAS-reporter/effectors are used (for an example, see Figure 2—figure supplement 1 of Aso et al., 2014a).”

      1. For the kinematic parameters in Fig. 2C, it is important to also show the baseline value of the parameters (i.e., the value before the light stimulation). For example, if a group of flies moves slower during the baseline period, their slower speed during the light-on period may not be due to MBON activation.

      Figure 2 has been revised to include the z-scores for the 2s period just before turning on the LED. The source data includes the parameter values used to calculate the z-scores.

      1. For Methods and Materials, the authors mostly refer to previous papers or websites for details. However, it would be helpful if they could include in this manuscript key information essential for repeating their experiments, such as the reporter/effector transgenes, empty-split controls, and antibodies and their working concentrations. It would also be helpful if they could provide the manufacturers and catalog numbers for the reagents used in this study.

      We have added Appendix 1- Key Resource Table to list all the key reagents.

      1. The original studies that identified the reward or punishment dopaminergic neurons mentioned in this manuscript should be cited.

      We have added the following citations:

      “Total number of synaptic connections from each MBON type to DANs and OANs. Based on the valence of memory when activation of DANs is used as unconditioned stimulus in olfactory conditioning (Aso et al., 2012, 2010; Aso and Rubin, 2016; Claridge-Chang et al., 2009; Huetteroth et al., 2015; Ichinose et al., 2015; Lin et al., 2014; Liu et al., 2012; Yamada et al., 2023; Yamagata et al., 2016, 2015)”

    1. Author Response

      The following is the authors’ response to the original reviews.

      Response to comments of editor/s:

      • With regard to the comments on nonavailability of representative images/videos for Figures 1 A and B, in the revised manuscript we have added a representative video of GFP (-) and GFP (+) tracks in Supplemental video 1.

      Response to comments of reviewer 2:

      • With respect to the concern on figure 1, we have changed ‘% CD4+ T cell Migration’ to ‘% Proportion CD4+ T cell migration’ in Figures 1D & 1E in the revised manuscript. We also labelled the upper and lower panels of Figure 1I as ‘Untreated’ and ‘SDF1α’ respectively.

      Response to comments of reviewer 1:

      • With regard to the concern that ‘The transfection alone with siRNA may cause the lack of polarity’, we have added a comparison of 2D migration MSD between control EGFP siRNA and Piezo1 siRNA-transfected CD4+ T cells as Supplementary Figure 1E.

      • We have added new references as ref 42 and 43, with respect to PIEZO1 association with focal adhesions.

      • With regard to the concerns around co-localization of Piezo1 and focal adhesions, we have added a representative image of Piezo1 and pFAK co-localization upon chemokine treatment in revised Supplementary Fig. 3C. We have also used an additional focal adhesion marker, paxillin, to show that focal adhesion formation is not affected by Piezo1 KD (Revised Fig. 3E-3H). Upon comparing the mean pFAK and paxillin intensities, we observed no difference between Control and Piezo1 KD CD4+ T cells (Supplementary Figs. 3A, B).

      • All the minor concerns and suggestions have been taken care of in the revised manuscript.
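
      For readers unfamiliar with the 2D migration MSD comparison mentioned above (Supplementary Figure 1E), the following is a minimal sketch of how a time-averaged mean squared displacement can be computed for a single cell track. It is not the authors' analysis code: the function name, the track format (a list of x, y positions sampled at equal time intervals) and the example track are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mean_squared_displacement(track: Sequence[Point], max_lag: int) -> List[float]:
    """Time-averaged MSD of one 2D track for lags 1..max_lag (in frames).

    `track` is a hypothetical list of (x, y) positions sampled at equal
    time intervals; it does not correspond to any data in the manuscript.
    """
    if max_lag >= len(track):
        raise ValueError("max_lag must be smaller than the number of track points")
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement between every pair of points `lag` frames apart
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


# Illustrative track only: steady movement of 1 unit per frame along x
straight_track = [(float(i), 0.0) for i in range(6)]
print(mean_squared_displacement(straight_track, 3))
```

      For a perfectly directed track such as this one, the MSD grows with the square of the lag, whereas for purely random motion it grows roughly linearly; comparing such curves between conditions is one common way to assess whether a treatment alters migration.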

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The manuscript is very well written. Although the study is well conducted, the authors should be more convincing on how bacteria residing in tissues do not induce death. The association with IL-10 cytokine production appears weak, and more experiments are needed to make it more robust.

      Reviewer #2 (Public Review):

      Iske et al. provide experimental data that NAD+ lessens disease severity in bacterial sepsis without impacting on the host pathogen load. They show that in macrophages, NAD+ prevents Il1b secretion potentially mediated by Caspase11.

      While the in vivo and in vitro data is interesting and hints towards a crucial role of NAD+ to promote metabolic adaptation in sepsis, the manuscript has shortcomings and would profit from several changes and additional experiments that support the claims.

      Conceptually, the definition of sepsis is outdated. Sepsis is not SIRS, as in sepsis-2. Sepsis-3 defines sepsis as infection-associated organ dysfunction. This concept needs to be taken into account for the introduction and when describing the potential effects of NAD+ in sepsis. Also, LPS application cannot be considered a sepsis model, since it only recapitulates the consequence of TLR-4 activation. It is a model of endotoxemia. Also, the LPS data does not allow to draw conclusions about bacterial clearance (L135).

      The authors state that protective effects by NAD were independent of the host pathogen load. This clearly indicates that NAD confers protection via enhancing a disease tolerance mechanism, potentially via reducing immunopathology. This aspect is not considered by the authors. The authors should incorporate the concept of disease tolerance in their work, cite the relevant literature on the topic and discuss their findings in light of the published evidence for metabolic alterations and adaptations in sepsis.

      For the in vitro data, the manuscript would benefit from additional experiments using in vitro infection models.

      In the merged manuscript, the authors provide two different versions of the figures. In one, bar plots are shown without individual data and in the other with scatter plots. All bar plots need to be provided as scatter plots showing individual values.

      The authors should show further serology data for kidney and liver failure etc. as well as further cytokine data such as IL-6 and TNF to better characterize their models.

      Careful revision of the entire manuscript, the figure legends and figures is required. The figure legend should not repeat the methods and materials section. The nomenclature for mouse protein and genes needs to be thoroughly revised.

      L350. The authors write that they dissect the capacity of NAD+ to dampen auto- and alloimmunity. In this work, no data that supports this statement is shown and experiments with autoantigens or alloantigens are not performed.

      L163. The authors describe pyroptosis but in the figure legend call it apoptosis. Specific markers for each type of cell death should be measured to determine which cell death mechanism is involved.

      Animal data comes from an infection model and LPS application. The RNAseq data is obtained from cells primed with Pam3CSK4 and subsequently subjected to LPS. It is unclear how the cell culture model reflects the animal model. As such, the link between IFN signaling and the bacterial infection/LPS model is not convincing and needs to be further elaborated.

      Figure 5: It is unclear how many independent survival experiments were done, how many mice per group were used and whether the difference between groups was statistically significant. This information should be added.

      Further experiments with primary cells from Il10 k.o. and Caspase11 k.o. animals should be provided that support the findings in macrophages.

      Author Response:

      Reviewer #1 (Public Review):

      “The manuscript is very well written. Although the study is well conducted, the authors should be more convincing on how bacteria residing in tissues do not induce death. The association with IL-10 cytokine production appears weak and more experiments are needed to make it more robust.”

      Thank you very much for your thoughtful and constructive feedback on our manuscript. We appreciate your positive assessment of the writing quality and the acknowledgment of the well-conducted nature of the study.

      In regard to the reviewer's comment that "The association with IL-10 cytokine production appears weak," we would like to provide a comprehensive response based on the findings and insights presented in our study (Fig 5). We would like to emphasize several key points to further elucidate this association:

      The established knowledge underscores IL-10's capacity to hinder the activation and proliferation of macrophages, thereby safeguarding against an overly aggressive immune-inflammatory reaction (as referenced). In our earlier investigations, we demonstrated that NAD+ orchestrates a systemic generation of IL-10, which assumes a pivotal function in curtailing proinflammatory responses across various conditions, such as autoimmune diseases (as referenced), alloimmunity (as referenced), and bacterial infections (as referenced). In our latest research, we show that the introduction of NAD+ leads to an elevated occurrence of IL-10-producing CD4+ T cells, CD8+ T cells, and macrophages, although not dendritic cells (depicted in Figure 5B and C). Furthermore, our comprehensive analyses have substantiated that NAD+ administration thwarts pyroptosis by specifically targeting the non-canonical inflammasome pathway. Intriguingly, our in vitro outcomes suggest that the neutralization of the autocrine IL-10 signaling pathway through a neutralizing antibody and an IL-10 receptor antagonist partially reverses the NAD+-mediated blockage of pyroptosis. These in vitro results imply that NAD+ induces the production of IL-10 cytokines by macrophages, contributing to the suppression of pyroptosis. To corroborate our in vitro conclusions, we employed IL-10 knockout mice and wild-type mice, both treated with either NAD+ or a placebo solution. The wild-type mice treated with NAD+ displayed a survival rate exceeding 80%, whereas the IL-10 knockout mice exhibited a survival rate of "only" 40%. These in vivo findings align with our in vitro discoveries, underscoring the crucial role of NAD+-mediated IL-10 cytokine production in impeding pyroptosis and shielding against septic shock. Drawing from our prior and current investigations, we respectfully disagree with the reviewer's characterization of our work as "weak."

      Recommendations for the authors

      “I suggest that animals subject to E. coli infection need to be followed up for longer and sacrificed at later time points. It is too difficult to believe that mice are surviving with full resting bacteria in tissues. Do results suggest a full shut-down of the mechanism? What was the level of infiltration of the tissues by neutrophils?”

      “I have difficulty agreeing with the survival results of the IL-10(-/-) mice of Figure 5E. Can the authors provide the p-values and follow up for longer? Why do the WT and the IL-10(-/-) mice survive the same?”

      Thank you for your thoughtful and constructive comments on our manuscript. We appreciate your valuable insights, and we have carefully considered your suggestions.

      We thank the reviewer for this comment. We have indeed followed up mice subjected to E. coli infection and LPS (54 mg/kg) for a longer period. Mice infected and treated with NAD+ recovered fully after 10 days and survived for at least a year following infection. We have now included a sentence regarding long-term survival in the results section of Figure 1 entitled “NAD+ protects mice against septic shock not via bacterial clearance but via inflammasome blockade”. A figure illustrating the level of neutrophil infiltration of the tissues was added to the supplementary data as Supplementary Figure 4.

      In contrast, WT and IL-10-/- mice failed to withstand E. coli or LPS (54 mg/kg) administration when treated with a placebo solution. To our knowledge, our investigation represents the first instance of successfully conferring protection against lethal doses of E. coli and LPS administered to animals. Considering the potent immunosuppressive nature of IL-10, we anticipated that IL-10-/- mice would manifest an exacerbated inflammatory response subsequent to LPS administration, in contrast to WT mice. Our in vivo findings indeed corroborate this assumption, revealing that IL-10-/- mice succumbed more swiftly to LPS administration, displaying statistically significant disparities in survival rates compared to WT mice (p value of 0.0154). The pertinent p-value has been included in Figure 5E of our study.

      Reviewer #2 (Public Review):

      “Iske et al. provide experimental data that NAD+ lessens disease severity in bacterial sepsis without impacting on the host pathogen load. They show that in macrophages, NAD+ prevents Il1b secretion potentially mediated by Caspase11.

      While the in vivo and in vitro data is interesting and hints towards a crucial role of NAD+ to promote metabolic adaptation in sepsis, the manuscript has shortcomings and would profit from several changes and additional experiments that support the claims.

      Conceptually, the definition of sepsis is outdated. Sepsis is not SIRS, as in sepsis-2. Sepsis-3 defines sepsis as infection-associated organ dysfunction. This concept needs to be taken into account for the introduction and when describing the potential effects of NAD+ in sepsis. Also, LPS application cannot be considered a sepsis model, since it only recapitulates the consequence of TLR-4 activation. It is a model of endotoxemia. Also, the LPS data does not allow to draw conclusions about bacterial clearance (L135).

      The authors state that protective effects by NAD were independent of the host pathogen load. This clearly indicates that NAD confers protection via enhancing a disease tolerance mechanism, potentially via reducing immunopathology. This aspect is not considered by the authors. The authors should incorporate the concept of disease tolerance in their work, cite the relevant literature on the topic and discuss their findings in light of the published evidence for metabolic alterations and adaptations in sepsis.

      For the in vitro data, the manuscript would benefit from additional experiments using in vitro infection models.

      In the merged manuscript, the authors provide two different versions of the figures. In one, bar plots are shown without individual data and in the other with scatter plots. All bar plots need to be provided as scatter plots showing individual values.

      The authors should show further serology data for kidney and liver failure etc. as well as further cytokine data such as IL-6 and TNF to better characterize their models.

      Careful revision of the entire manuscript, the figure legends and figures is required. The figure legend should not repeat the methods and materials section. The nomenclature for mouse protein and genes needs to be thoroughly revised.

      L350. The authors write that they dissect the capacity of NAD+ to dampen auto- and alloimmunity. In this work, no data that supports this statement is shown and experiments with autoantigens or alloantigens are not performed.

      L163. The authors describe pyroptosis but in the figure legend call it apoptosis. Specific markers for each type of cell death should be measured to determine which cell death mechanism is involved.

      Animal data comes from an infection model and LPS application. The RNAseq data is obtained from cells primed with Pam3CSK4 and subsequently subjected to LPS. It is unclear how the cell culture model reflects the animal model. As such, the link between IFN signaling and the bacterial infection/LPS model is not convincing and needs to be further elaborated.

      Figure 5: It is unclear how many independent survival experiments were done, how many mice per group were used and whether the difference between groups was statistically significant. This information should be added.

      Further experiments with primary cells from Il10 k.o. and Caspase11 k.o. animals should be provided that support the findings in macrophages.”

      Thank you for taking the time to review our manuscript. We appreciate your insightful comments and valuable feedback regarding our study on the protective role and underlying mechanisms of NAD+ in septic shock.

      “While the in vivo and in vitro data is interesting and hints towards a crucial role of NAD+ to promote metabolic adaptation in sepsis, the manuscript has shortcomings and would profit from several changes and additional experiments that support the claims.”

      We would like to point out that our current study does not underscore a metabolic adaptation in sepsis but more an immune regulation and a specific blockade of the non-canonical inflammasome signaling machinery.

      “Conceptually, the definition of sepsis is outdated. Sepsis is not SIRS, as in sepsis-2. Sepsis-3 defines sepsis as infection-associated organ dysfunction. This concept needs to be taken into account for the introduction and when describing the potential effects of NAD+ in sepsis. Also, LPS application cannot be considered a sepsis model, since it only recapitulates the consequence of TLR-4 activation. It is a model of endotoxemia. Also, the LPS data does not allow to draw conclusions about bacterial clearance (L135).”

      Our study uses highly lethal doses of E. coli or LPS. These doses have been shown to result in multiple organ failure (1, 2). For many decades, innumerable studies have used LPS as a model of sepsis (3, 4, 5). We used the LPS animal model based on a study published in 2013 by Kayagaki et al. (1), in which the authors reported a novel TLR4-independent mechanism mediated via activated caspase-11. We used the same animal model to demonstrate the specific role of NAD+ in targeting this TLR4-independent, caspase-11-mediated mechanism and to underscore NAD+’s mode of protection.

      Moreover, we used not only LPS but also bacterial infection with E. coli. We have also previously published a research article demonstrating a protective effect against Listeria monocytogenes (6). The only model we did not use in the current study is the cecal ligation and puncture (CLP) model, which is another common animal model for sepsis.

      Our conclusions regarding bacterial clearance are based not only on the LPS results but also on the bacterial load measurements in different tissues (kidney and liver) and on survival (Figure 1B&C) following E. coli administration.

      “The authors state that protective effects by NAD were independent of the host pathogen load. This clearly indicates that NAD confers protection via enhancing a disease tolerance mechanism, potentially via reducing immunopathology. This aspect is not considered by the authors. The authors should incorporate the concept of disease tolerance in their work, cite the relevant literature on the topic and discuss their findings in light of the published evidence for metabolic alterations and adaptations in sepsis.”

      We respectfully disagree with the reviewer’s comment and do not believe that NAD+ enhances disease tolerance. We have supporting data indicating that NAD+ mediates protection via a specific blockade of the non-canonical inflammasome pathway, which prevents an over-zealous immune response that results in organ damage and multiple organ failure (MOF). Moreover, we demonstrate that NAD+ not only mediates protection via a specific blockade of the non-canonical inflammasome pathway but also prevents septic shock-induced death through additional immunosuppression mediated by the systemic production of IL-10.

      Both Caspase-11 and IL-10 pathways are crucial in NAD+-mediated protection against lethal doses of E. coli and LPS. Figure 5A indicates that caspase-11-/- mice treated with PBS have a modest survival rate (~40% survival) when compared to the group of mice treated with NAD+ (>80% survival). These data indicate that NAD+ promotes survival via a caspase-11-independent mechanism. Similarly, wild-type mice subjected to NAD+ administration exhibited >80% survival, while NAD+ administration to IL-10-/- mice resulted in only a 40% survival rate. Based on these findings, we believe that NAD+ mediated protection against septic shock via a blockade of caspase-11 and via IL-10 cytokine production that dampened the overzealous immune response, rather than via disease tolerance.

      “For the in vitro data, the manuscript would benefit from additional experiments using in vitro infection models.”

      In the current study we used two in vivo models, using LPS and E. coli, a gram-negative bacterium. We have also previously reported the protective role of NAD+ in the context of Listeria monocytogenes (6), a gram-positive bacterium. In the current study, our aim was to demonstrate the inhibitory role of NAD+ on the non-canonical pathway specifically. We believe that additional in vitro experiments are out of scope for this study.

      “In the merged manuscript, the authors provide two different versions of the figures. In one, bar plots are shown without individual data and in the other with scatter plots. All bar plots need to be provided as scatter plots showing individual values.”

      As requested by reviewer #2 all bar plots are now provided as scatter plots showing individual values.

      “The authors should show further serology data for kidney and liver failure etc. as well as further cytokine data such as IL-6 and TNF to better characterize their models.”

      We did not perform further serology analysis, but we did measure IL-6 and TNFα in mice treated with NAD+ or PBS. Mice treated with NAD+ had reduced systemic levels of both IL-6 and TNFα. We have now added these data (Figure 1F). In addition, we performed a long-term survival analysis: all mice treated with NAD+ recovered fully after 10 days and survived over a year after infection. The mice that survived following NAD+ treatment died of old age.

      “Careful revision of the entire manuscript, the figure legends and figures is required. The figure legend should not repeat the methods and materials section. The nomenclature for mouse protein and genes needs to be thoroughly revised.”

      A careful revision of the entire manuscript has been performed.

      “L350. The authors write that they dissect the capacity of NAD+ to dampen auto- and alloimmunity. In this work, no data that supports this statement is shown and experiments with autoantigens or alloantigens are not performed.”

      We thank the reviewer for this comment. We have now re-phrased the last sentence of the discussion and included references to our previous work. We now state: “We have previously reported that NAD+ administration can block auto- (7) and allo-immunity (8) via IL-10 cytokine production. Here, we unveiled the capacity of NAD+ to protect against sepsis-induced death via a specific blockade of the non-canonical inflammasome pathway and a robust immunosuppression mediated by IL-10 cytokine production.”

      “L163. The authors describe pyroptosis but in the figure legend call it apoptosis. Specific markers for each type of cell death should be measured to determine which cell death mechanism is involved.”

      We thank the reviewer for this comment. We focused on pyroptosis-mediated cell death and not apoptosis. We have now replaced the term “apoptosis” with “pyroptosis-mediated cell death”.

      “Animal data comes from an infection model and LPS application. The RNAseq data is obtained from cells primed with Pam3CSK4 and subsequently subjected to LPS. It is unclear how the cell culture model reflects the animal model. As such, the link between IFN signaling and the bacterial infection/LPS model is not convincing and needs to be further elaborated.”

      Our findings, depicted in Figure 3, pertain exclusively to in vitro investigations rather than in vivo examinations. Our research has demonstrated the selective inhibition of the non-canonical inflammasome pathway by NAD+, with a primary focus on unraveling the specific signaling pathway influenced by NAD+. Our in vitro outcomes indicate that the introduction of recombinant IFN-β counteracted the inhibitory effect of NAD+ on the non-canonical pathway. However, it is important to note that we have not evaluated the IFN-β pathway within our E. coli and LPS in vivo models. Our primary intention was to decipher the roles of IFN-β and NAD+ in the context of inhibiting the non-canonical inflammasome, without extending our investigation to the broader in vivo scenarios.

      “Figure 5: It is unclear how many independent survival experiments were done, how many mice per group were used and whether the difference between groups was statistically significant. This information should be added.”

      We have now included the number of experiments, p values and number of animals used in Figure 5.

      “Further experiments with primary cells from Il10 k.o. and Caspase11 k.o. animals should be provided that support the findings in macrophages.”

      We concur with the reviewer's suggestion regarding the need for further experiments involving primary cells from IL-10-/- and Caspase-11-/- mice. However, we are uncertain about the potential contribution of these experiments in generating novel or supplementary findings to the existing study.

      Recommendations For The Authors:

      Besides the comments made in the public section, there are further issues that need to be considered by the authors.

      “It is unclear what signifies ‘impressive’ (L106) or ‘dramatic’ (L257).”

      “Impressive” meant that we were surprised by the results since, to the best of our knowledge, prior to this study there was no report claiming such survival (>80%) following such a high dose of E. coli. In this respect, the protective effects of NAD+ are unique. As for “dramatic”, we (8) and others (9, 10) have previously used this term to describe a robust increase in cytokine production.

      “L116. The authors describe ‘symptoms’. It should be clarified what symptoms they observed and the data should be shown. If only temperature is available, then this should be said. It would be interesting to see effects of NAD+ on the glucose levels of the animals during sepsis.”

      We thank the reviewer for this comment. We measured only temperature. We believe that measuring glucose levels is beyond the scope of this study.

      “L29. Sepsis is not restricted to bacterial and viral pathogens. Also fungi and protozoa can cause sepsis.”

      We have now included fungi and protozoa.

      “Suppl.Fig.1. A scale should be added.”

      A scale has been added.

      “L822. Lethal dose of LPS would mean that this was lethal for all mice. However, the data suggests that NAD+ treated animals would not have died. This should be clarified.”

      Here we meant lethal dose in absence of NAD+ treatment. Our study focuses on the protective role of NAD+ in a lethal context (bacterial and LPS).

      “L823/824. The part of the sentence: ... IHC was performed staining for H&E.. is incomplete.”

      We thank the reviewer for this comment. We have re-phrased the sentence.

      “L804. IL-10 is not a pathway. This should be revised.”

      We have replaced “pathway” with “mechanism”.

      “The graphical abstract should be the last figure summarizing all findings.”

      Figure 4 isn't the final illustration, as it doesn't encompass an overarching graphical summary of our discoveries. Instead, it exclusively highlights the findings related to NAD+'s impact on noncanonical inflammasome inhibition. Notably, this figure omits NAD+-mediated IL-10 cytokine generation and its crucial role in mitigating septic shock.

      “The authors report that they used a dosage of 54mg/kg LPS (l.502). This is a rather unusual concentration. How was this determined?”

      This was initially based on the first study reporting the role of caspase-11 in septic shock-induced death, published in 2013 by Kayagaki et al. (1). Many others have used this dosage in animal models of septic shock-induced death (11, 12, 13).

      References:

      1. Kayagaki N, et al. Noncanonical inflammasome activation by intracellular LPS independent of TLR4. Science 341, 1246-1249 (2013).

      2. Qin, X., Jiang, X., Jiang, X. et al. Micheliolide inhibits LPS-induced inflammatory response and protects mice from LPS challenge. Sci Rep 6, 23240 (2016).

      3. Li Z, Qu W, Zhang D, Sun Y, Shang D. The antimicrobial peptide chensinin-1b alleviates the inflammatory response by targeting the TLR4/NF-κB signaling pathway and inhibits Pseudomonas aeruginosa infection and LPS-mediated sepsis. Biomed Pharmacother. 2023 Aug 1; 165:115227.

      4. Ramani V, Madhusoodhanan R, Kosanke S, Awasthi S. A TLR4-interacting SPA4 peptide inhibits LPS-induced lung inflammation. Innate Immun. 2013 Dec;19(6):596-610.

      5. Zhang Y, Lu Y, Ma L, Cao X, Xiao J, Chen J, Jiao S, Gao Y, Liu C, Duan Z, Li D, He Y, Wei B, Wang H. Activation of vascular endothelial growth factor receptor-3 in macrophages restrains TLR4-NF-κB signaling and protects against endotoxin shock. Immunity. 2014 Apr 17;40(4):501-14.

      6. Rodriguez Cetina Biefer H, Heinbokel T, Uehara H, Camacho V, Minami K, Nian Y, Koduru S, El Fatimy R, Ghiran I, Trachtenberg AJ, de la Fuente MA, Azuma H, Akbari O, Tullius SG, Vasudevan A, Elkhal A. Mast cells regulate CD4+ T-cell differentiation in the absence of antigen presentation. J Allergy Clin Immunol. 2018 Dec;142(6):1894-1908.e7.

      7. Tullius SG, Biefer HR, Li S, Trachtenberg AJ, Edtinger K, Quante M, Krenzien F, Uehara H, Yang X, Kissick HT, Kuo WP, Ghiran I, de la Fuente MA, Arredouani MS, Camacho V, Tigges JC, Toxavidis V, El Fatimy R, Smith BD, Vasudevan A, Elkhal A. NAD+ protects against EAE by regulating CD4+ T-cell differentiation. Nat Commun. 2014 Oct 7;5:5101.

      8. Elkhal A, et al. NAD(+) regulates Treg cell fate and promotes allograft survival via a systemic IL‐10 production that is CD4(+) CD25(+) Foxp3(+) T cells independent. Sci Rep 6, 22325 (2016).

      9. Natalia Garcia-Becerra, Marco Ulises Aguila-Estrada, Luis Arturo Palafox-Mariscal, Georgina Hernandez-Flores, Adriana Aguilar-Lemarroy, Luis Felipe Jave-Suarez, FOXP3 Isoforms Expression in Cervical Cancer: Evidence about the Cancer-Related Properties of FOXP3Δ2Δ7 in Keratinocytes, Cancers, 15, 2, (347), (2023).

      10. Estelle Bettelli, Maryam Dastrange, Mohamed Oukka. Foxp3 interacts with nuclear factor of activated T cells and NF-κB to repress cytokine gene expression and effector functions of T helper cells. Proceedings of the National Academy of Sciences. 2005.102; 14; 5138-5143.

      11. Han Gyung Kim, Chaeyoung Lee, Ji Hye Yoon, Ji Hye Kim, Jae Youl Cho. BN82002 alleviated tissue damage of septic mice by reducing inflammatory response through inhibiting AKT2/NF-κB signaling pathway. Biomedicine & Pharmacotherapy, Volume 148, 2022, 112740.

      12. Tao Q, Zhang Z-D, Qin Z, Liu X-W, Li S-H, Bai L-X, Ge W-B, Li J-Y and Yang Y-J (2022) Aspirin eugenol ester alleviates lipopolysaccharide-induced acute lung injury in rats while stabilizing serum metabolites levels. Front. Immunol. 13:939106.

      13. Chen, N, Ou, Z, Zhang, W, Zhu, X, Li, P, Gong, J. Cathepsin B regulate non-canonical NLRP3 inflammasome pathway by modulating activation of caspase-11 in Kupffer cells. Cell Prolif. 2018; 51:e12487.

    1. Author Response:

      Reviewer #1:

      1. This is a complex paper and would benefit from a schematic depicting the key findings.

      This comment is appreciated. Unfortunately, due to time constraints, we were not able to depict our findings graphically.

      2. The paper would benefit from additional supporting evidence. Would it be possible to measure fatty acid oxidation by metabolic tracing here, in IRG-deficient cells or in response to 4-OI? Although changes in protein level for Cpt1A are seen, this is correlated with fatty acid oxidation rather than direct demonstration. This may be challenging but would strengthen the manuscript.

      This is a great comment. While we did not directly measure fatty acid flux in our manuscript, Weiss et al. (Nature Metabolism, 2023) performed these studies in primary hepatocytes and showed increased palmitate incorporation into citrate.

      3. The aspect concerning body temperature regulation is confusing. Would Itaconate not promote fatty acid oxidation to increase or maintain body temperature? Itaconate must therefore not be involved in the hypothermic response? Bringing UCP1 into the finding is confusing and needs to be better explained. Again a diagram would help, but enhanced BAT fatty acid oxidation and UCP1 expression appear linked here, with both being affected by Itaconate. This needs clarifying.

      We appreciate this comment. The rationale is that if itaconate is stabilizing fatty acid oxidation, it would be necessary to fuel thermogenesis, a process dependent on fatty acid utilization. Our data support a role for itaconate in stabilizing body temperature following inflammation, potentially through enhanced fatty acid oxidation. This is evidenced by the hypothermic response to LPS in Acod1 KO mice. Furthermore, Mills et al. Nature 2018 show 4-OI injection boosts body temperature following LPS stimulation.

      Reviewer #2:

      Some conclusions involving the Irg1 knockout mice require important controls and clarifications to be fully convincing and some controls are missing.

      We appreciate the need for appropriate controls. Negative controls were omitted when baseline phenotypes were not observed. Due to time and resource limitations, we were unable to repeat the experiments.

  3. Dec 2023
    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors explored correlations between taste features of botanical drugs used in ancient times and therapeutic uses, finding some potentially interesting associations between intensity and complexity of flavors and therapeutic potential, plus some more specific associations described in the discussion sections. I believe the results could be of potential benefit to the drug discovery community, especially for those scientists working in the field of natural products.

      Strengths:

      Owing to its eclectic and somehow heterodox nature, I believe the article might be of interest to a general audience. In fact, I have enjoyed reading it and my curiosity was raised by the extensive discussion.

      The idea of revisiting a classical vademecum with new scientific perspectives is quite stimulating.

      The authors have undertaken a significant amount of work, collecting 700 botanical drugs and exploring their taste and association with known uses via eleven trained panelists.

      Weaknesses:

      I have some methodological concerns. Was subjective bias within the panel of participants explored or minimized in any manner?

      Yes, in all models we included ‘panellist’ as a random effect and therefore any biased perception by a single panellist across drugs or differences among panellists for an individual drug was accounted for. We now make this clearer in our methods.

      Were the panelists exposed to the drugs blindly and on several occasions to assess the robustness of their perceptions?

      The study was double blind, but blinding was not possible with the more well-known drugs (e.g., almonds, walnuts, thyme, mint). A random number generator was used to assign the drugs to the panellists, and according to the random distribution, some drugs were presented to the same panellist more than once. Robustness of panellists’ perception was not assessed specifically. We have added some text to the methods to clarify.

      Judging from the total number of taste assessments recorded and from Supplementary Material, it seems that not every panelist tasted every drug. Why?

      Because there were many drugs and panellists had time constraints. Overall, 3973 individual sensory trials were conducted, with an average of 361±153 trials per panellist and 5.7±1.3 trials per botanical drug.
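      As a rough consistency check, the reported means can be recomputed from the total number of trials. The sketch below assumes the eleven panellists and roughly 700 botanical drugs mentioned in the reviewer's summary; the exact number of drugs tasted is an assumption, so the per-drug figure is only approximate, and the standard deviations cannot be checked without the raw data.

```python
# Sanity check of the reported trial averages (illustrative only).
total_trials = 3973   # total individual sensory trials, as reported
n_panellists = 11     # "eleven trained panelists" in the review summary
n_drugs = 700         # assumption: approximate drug count from the review summary


def mean_per_unit(total: int, units: int) -> float:
    """Return the average number of trials per unit (panellist or drug)."""
    return total / units


trials_per_panellist = mean_per_unit(total_trials, n_panellists)
trials_per_drug = mean_per_unit(total_trials, n_drugs)

print(round(trials_per_panellist), round(trials_per_drug, 1))
```

      Under these assumptions the recomputed means agree with the reported 361 trials per panellist and 5.7 trials per drug.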

      It may be a good idea to explore the similarity in the assessments of the same botanical drug by different volunteers. If a given descriptor was reported by a single volunteer, was it used anyway for the statistical analysis or filtered out?

      All responses were used as reported by the panellists, including potential ‘outliers’. As described above, the inclusion of ‘panellist’ as a random effect means that if one individual gave an unusual description of a particular drug in comparison to other individuals, this would be less impactful on any parameter estimates.

      The idea of "versatility" is repeatedly used in the manuscript, but the authors do not clearly define what they call "versatile".

      In line with suggestions made by reviewers, we have slightly adjusted the definition of therapeutic versatility and have now clearly defined the term on first use. Here, we define therapeutic versatility as the number of therapeutic ‘categories’ a drug is used for (the 25 broad categories are represented by shared iconography in Figure 1). Our revised results include analyses using this definition – which are qualitatively identical to our previous results which defined versatility using the 46 individual therapeutic uses.
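      The difference between the two definitions can be illustrated with a minimal sketch. The use names, category names and mapping below are hypothetical placeholders, not the study's actual 46 uses or 25 categories; only the counting logic reflects the definition given above.

```python
# Hypothetical mapping from individual therapeutic uses to broad categories.
USE_TO_CATEGORY = {
    "cough": "respiratory",
    "asthma": "respiratory",
    "wounds": "skin",
    "ulcers": "skin",
    "diuretic": "urinary",
}


def versatility_by_category(uses, use_to_category=USE_TO_CATEGORY):
    """Revised definition: number of distinct broad categories a drug is used for."""
    return len({use_to_category[use] for use in uses})


def versatility_by_use(uses):
    """Earlier definition: number of distinct individual therapeutic uses."""
    return len(set(uses))


example_uses = ["cough", "asthma", "wounds"]  # hypothetical drug
print(versatility_by_category(example_uses), versatility_by_use(example_uses))
```

      In this made-up example the drug has three individual uses but covers only two categories, so the category-based count is the more conservative of the two measures.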

      The introduction should be expanded. There are plenty of studies and articles out there exploring the evolution of bitter taste receptors, and associating it with a hypothetical evolutionary advantage since bitter plants are more likely to be poisonous.

      We agree. Bitter is arguably the most frequent chemosensory attribute of plants and botanical drugs perceived by humans. Our data show that ‘poisons’ are not associated with bitterness but are positively associated with ‘aromatic’, ‘sweet’ and ‘soapy’ qualities and negatively associated with ‘salty’ qualities.

      We have added this paragraph to the introduction:

      "The perception of taste and flavour (a combination of taste, smell and chemesthesis), here also referred to as chemosensation, has evolved to meet nutritional requirements and is particularly important in omnivores for seeking out nutrients and avoiding toxins (Rozin and Todd, 2016; Breslin, 2013; Glendinning, 2022). The rejection of bitter stimuli has generally been associated with the avoidance of toxins (Glendinning, 1994; Lindemann, 2001; Breslin, 2013), but to date no clear relationship between bitter compounds and toxicity at a nutritionally relevant dose could be established (Glendinning, 1994; Nissim et al., 2017). While bitter tasting metabolites occurring in fruits and vegetables have been linked with a lower risk for contracting cancer and cardiovascular diseases (Drewnowski and Gomez-Carneros, 2000), the avoidance of pharmacologically active compounds is probably the reason why many medicines, including botanical drugs, taste bitter (Johns, 1990; Mennella et al., 2013)."

      And expanded in the discussion:

      "Though many bitter compounds are toxic, not all bitter plant metabolites are (Glendinning, 1994; Drewnowski and Gomez-Carneros, 2000; e.g., iridoids, flavonoids, glucosinolates, bitter sugars). In part, this may be the outcome of an arms race between plant defence and herbivorous mammals’ bitter taste receptor sensitivities, resulting in the synthesis of metabolites capable of repelling herbivores and confounding the perception of potential nutrients by mimicking tastes of toxins. Here, poisons showed no association with bitter but positive associations with aromatic (px = 0.041), sweet (px = 0.022) and soapy (px = 0.025) as well as a negative association with salty (px = 0.046) qualities."

      Since plant secondary metabolites are one of the most important sources of therapeutic drugs and one of their main functions is to protect plants from environmental dangers (e.g., animals), this evolutionary interplay should be at least briefly discussed in the introductory section.

      This is now referred to in the introduction as well as in the discussion.

      Since the authors visit some classical authors, Paracelsus' famous quote "All things are poison and nothing is without poison. Solely the dose determines that a thing is not a poison" may be relevant here. Also note that some authors have explored the relationship between taste receptors and pharmacological targets (e.g., Bioorg Med Chem Lett. 2012 Jun 15;22(12):4072-4).

      We agree that pharmacologic action is determined by the dose. We now refer to the dose in the introduction: “…to date no clear relationship between bitter compounds and toxicity at a nutritionally relevant dose could be established (Glendinning, 1994; Nissim et al., 2017)”.

      We are aware that several authors have explored the relationship between taste receptors as targets and their similarity with other targets. We use many examples from the literature to explain our data. Our analysis did not, however, highlight any association between sweet tastes and epilepsy (as reported in Bioorg Med Chem Lett. 2012 Jun 15;22(12):4072-4). We are not able to explain all associations, and we acknowledge that there may be more associations between chemosensory receptors and therapeutic effects than those found and discussed here.

      Reviewer #2 (Public Review):

      Summary:

      This is an unusual, but interesting approach to link the "taste" of plants and plant extracts to their therapeutic use in ancient Graeco-Roman culture. The authors used a panel of 11 trained tasters to test ~700 different medicinal plants and describe them in terms of 22 "taste" descriptors. They correlated these descriptors with the plant's medical use as reported in the De Materia Medica (DMM 1st Century, CE). Correcting for some of the plants' evolutionary phylogenetic relationships, the authors found that taste descriptors along with intensity measures were correlated with the "versatility" and/or specific therapeutic use of the medicine. For example, simple but intense tastes were correlated with the versatility of a medicine. Specific intense tastes were linked to versatility while others were not; intense bitter, starchy, musky, sweet, cooling, and soapy were associated with versatility, but sour and woody were negatively associated. Also, some specific tastes could be associated with specific uses - both positive and negative associations. Some of these findings make sense immediately, but others are somewhat surprising, and the authors propose some links between taste and medicinal use (both historical and modern use) in the discussion. The authors state that this study allows for a re-evaluation of pre-scientific knowledge, pointing toward a central role of taste in medicine.

      Strengths:

      The real strength of this study is the novelty of this approach - using modern-day tasters to evaluate ancient medicinal plants to understand the potential relationships between taste and therapeutic use, lending some support to the idea that the "taste" of a medicine is linked to its effectiveness as a treatment.

      Weaknesses:

      While I find this study very interesting and potentially insightful into the development and classification of certain botanical drugs for specific medicinal use, I would encourage the authors to revise the manuscript and the accompanying figures significantly to improve the reader's understanding of the methods, analyses, and findings. A more thorough discussion of the limitations of this particular study and this general type of approach would also be very important to include.

      The figures were revised: one was deleted (former Fig. 3) and another was moved to the supplementary material (former Fig. 4, now Figure supplement 1). We now acknowledge limitations in the final paragraph.

      The metric of versatility seems somewhat arbitrary. It is not well explained why versatility is important and/or its relationship with taste complexity or intensity.

      We have modified the definition of versatility in line with reviewers’ comments. We have provided a detailed explanation of this in our response to reviewer #1 but for ease of reference, we paste this again here:

      Here, we define therapeutic versatility as the number of therapeutic ‘categories’ a drug is used for (the 25 broad categories are represented by shared iconography in Figure 1). Our revised results include analyses using this definition – which are qualitatively identical to our previous results which defined versatility using the 46 individual therapeutic uses.

      Our focus was not the importance of versatility itself but the impact of taste intensity and complexity on versatility. We hypothesize that associations between the perceived complexity and intensity of chemosensory qualities and the versatility of botanical drug use provide insights into the development of empirical pharmacological knowledge and therapeutic behaviour (now included in the introduction).

      Similarly, the rationale for examining the relationships between individual therapeutic uses and taste intensity/complexity is not well explained, and given that a similar high intensity/low complexity relationship is common for most of the therapeutic uses, it restates the same concepts that were covered by the initial versatility comparison.

      The examination of the relationships between individual therapeutic uses and taste intensity/complexity fine-tunes the overall analysis and shows that this concept is applicable in general. However, the reviewer is correct that this is not our main focus. We therefore moved the analysis, including the figure, to the supplementary material and state in the discussion: “We also detected nuances in significance, and complete absence of significance across the relationships between individual therapeutic uses and complexity/intensity magnitudes for which we lack, however, more specific explanations (Figure supplement 1).”

      There are multiple issues with the figures - the use of icons is in many cases counterproductive and other representations are not clear or cause confusion (especially Figure 3).

      We have excluded former Fig. 3. Otherwise, the use of iconography is to facilitate graphical representation and cross-referencing between figures without over-cluttering. We provide all text and numeric values in the supporting information if individual detail is required.

      The phylogenetic information about the botanicals is missing. Also missing is any reference/discussion about how that analysis was able to disambiguate the confounding effects of shared uses and tastes of drugs from closely related species.

      This is explained in the methods (sections: ‘Phylogenetic tree’ and ‘statistical procedure’). We highlight that all models showed high heritability which means that shared ancestry has a statistical influence on the model. The trees themselves are now represented in our modified Figure 2.

      Reviewer #1 (Recommendations For The Authors):

      Besides the points already covered in my public review, I believe it would be interesting to assess and discuss the differences between the category "food" (how many drugs were allocated there?) and the drugs used for therapeutic purposes. In this manner, the food category could serve as a retrospective negative control to test the authors' hypotheses. Does the food category include drugs of weak flavor? Does it include drugs of complex flavor?

      All drugs in this database are associated with therapeutic uses. Only 96 are specifically mentioned to be also used as food while in total at least 152 are also used as food (many of the most obvious food drugs are not labelled as such in DMM). It is difficult to use the food category as a negative control (for testing whether food drugs have weaker tastes), because spices are included in the food category. If at all, only staples should be used for such an analysis. But this would be another study.

      In the context of the present analyses, we agree that this is of interest, and we have therefore added a small section to our manuscript: The 96 botanical drugs specifically mentioned as also used for food (though there are more than 150 edible drugs in our dataset; Supplementary file 1) show positive associations with starchy (px = 0.005), nutty (px = 0.002) and salty (px = 0.001) and negative associations with bitter (px = 0.007), woody (px = 0.001) and stinging (px = 0.033) tastes and flavours.

      Please replace "plant defence" with "plant defense".

      Currently the whole manuscript is written in British English. We are happy to revise on the basis of editorial policy.

      Reviewer #2 (Recommendations For The Authors):

      1. I would encourage replacing "taste" with "flavor" throughout the manuscript and in the title because this paper addresses "taste here defined as a combination of taste, odour and chemesthesis" which essentially is the definition of flavor, and should not be simplified to taste. Flavor is the more precise word, and there is no need to confuse readers by defining "taste" in this way when taste means just the gustatory aspect of flavor.

      We now define flavour as a combination of taste, smell and chemesthesis and use ‘taste’ when referring to a specific taste quality. We use the terms ‘chemosensory’ (perception, quality) and ‘chemosensation’ for addressing the perception of both taste and flavour qualities together. The abstract now reads: “The perception of taste and flavour (a combination of taste, smell and chemesthesis) here referred to as chemosensation, enables animals to find high-value foods and avoid toxins.”

      We prefer to leave the title as it is, in accordance with standard books (e.g., “Pharmacology of Taste” by Palmer and Servant) which address all kinds of chemosensory interactions, and because we conducted a ‘tasting panel’ (and not a ‘flavour panel’). In addition, flavour as a concept is used only in English and maybe also in French, and even in English it is not used consistently: ‘taste’ is the term preferred by native English speakers for describing perception where, in a strict sense, ‘flavour’ would be the correct term (see Rozin P. "Taste-smell confusions" and the duality of the olfactory sense. Percept Psychophys. 1982 Apr;31(4):397-401).

      1. Methods - A much more detailed description of how the samples were prepared for the taste tests is needed. Were they sampled as a dry powder?

      No, they were sampled as dried pieces. We have added more information to our methods section to clarify.

      Why is there such a big range in the amount provided (0.1 to 2 g)?

      Because certain drugs are highly toxic (aconitum, opium), we could only provide a relatively small amount (that still permitted the perception of taste qualities). For practical reasons, half a walnut was dispensed. We have added more information to our methods section to clarify.

      Also: "Panelists were instructed to spit, rinse their mouth with drinking water and to take a break before tasting the next sample". The spitting and rinsing suggest that the samples were dissolved in a liquid, but this is not clear. Also, how long was the break between samples?

      Panellists were instructed to chew the amount of sample necessary for taste perception, to annotate their perception, to spit out residues of samples and finally to rinse their mouth with drinking water. The breaks between tasting different samples depended on chemosensory persistence. We have added more information to our methods section to clarify.

      How many samples were tested per day?

      The number of samples tasted differed from panellist to panellist, depending on available time frames. On average, each panellist tasted 17.2 drugs per hour over 10.5 sessions (of 18 sessions in total) lasting approximately two hours each. We have added more information to our methods section to clarify.
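
      As a rough plausibility check, the tasting rate and session figures can be multiplied out and compared with the average of 361 trials per panellist reported in the public response. The sketch assumes that all three figures are per-panellist averages; the variable names are illustrative.

```python
# Rough check: tasting rate x sessions attended x hours per session.
# Assumption: all three values are per-panellist averages.
drugs_per_hour = 17.2
sessions_per_panellist = 10.5
hours_per_session = 2.0  # "approximately two hours each"

expected_trials = drugs_per_hour * sessions_per_panellist * hours_per_session
reported_mean_trials = 361  # average trials per panellist in the response

print(f"expected trials per panellist: {expected_trials:.1f}")
print(f"difference from reported mean: {expected_trials - reported_mean_trials:.1f}")
```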

      Did individual panelists get repeated samples?

      Random assignment meant that individual panellists could also receive repeated samples. We have added more information to our methods section to clarify.

      1. Methods - Phylogenetic tree - Where is the output of this tree? It should be included in the figures and referred to in the results/discussion where the authors claim that they have been able to disambiguate phylogenetic closeness with taste and medicinal use.

      We did not ‘build’ a phylogenetic tree, rather we modified an existing one. Therefore, the wording of that section in the methods has been adjusted for clarity. We refer to the tree in the results pertaining to phylogenetic relatedness by explicitly quantifying the extent of phylogenetic signal using the widely used heritability (h2) statistic. This means that shared ancestry has a statistical influence on the model. We have also added to our Figure 2 representations of the phylogenetic tree we used in our analysis, limited to the species for which we have data, also displaying the data (in this case, intensity and complexity) at the tips.

      1. Taste intensity ratings should be better explained. Since the panelists are evaluating different amounts of samples (.1 to 2g) wouldn't the intensity of taste also depend on the amount of the substance?

      The panellists were not told to place the entire sample in their mouth but just enough to perceive the taste qualities clearly (explanation given in methods). For example, one black peppercorn is normally enough to perceive the taste and flavour of pepper, while the same amount of hazelnut would be insufficient.

      Or is this measure a relative value - "woodiness" vs "sourness" for example within the sample is strong/weak?

      Chemosensation and sensory perception in general are always relative. (For instance, I can currently hear the birds singing outside. Were there music playing in my room, I would not be able to hear them.)

      Because of this - are samples with strong tastes less likely to seem complex because the intensity of one stimulus masks the other?

      Yes, we argue that drugs with strong tastes/flavours are less likely to be perceived as being complex (fewer individual qualities perceived), arguably because strong stimuli overshadow weaker ones. We currently address this in the discussion and have made some modifications in line with the comment below.

      This issue was presented briefly in the discussion when addressing the finding that samples with intense, but fewer tastes were more versatile, but this was highly confusing.

      The authors presented both sides of the problem without referring to any of their own experiments to resolve the issue, or to highlight this as a potential limitation of the study at hand.

      Yes, stronger tastes mask weaker tastes which addresses both sides of the problem.

      We have modified the first paragraph of the discussion to make this clearer.

      It now reads: "Unexpectedly, botanical drugs eliciting fewer but intense chemosensations were more versatile (Fig. 2). People often associate complexity with intensity, and taste complexity is popularly interpreted as reflecting a higher complexity of ingredients (Spence and Wang, 2018). However, simple tastes can be associated with complex chemistry when intense tastes mask weaker tastes, or when tastants are blended (Breslin and Beauchamp, 1997; Green et al., 2010). For example, starchy flavours or sweet tastes can be sensed when bitter and astringent antifeedant compounds are present below a certain threshold, while salts enhance overall flavour by suppressing the perception of bitter tastants (Breslin and Beauchamp, 1997; Johns, 1990). On the other hand, combinations of different tastants or olfactory stimuli do not necessarily result in increased perceived complexity (Spence and Wang, 2018; Weiss et al., 2012)."

      It would be useful to understand the parameters a bit more - a data visualization of the relationships of intensity and complexity across all samples would be a welcome addition to Figure 2.

      Shared ancestry has a statistical influence on the model. We have now also added to our Figure 2 representations of the phylogenetic tree we used in our analysis, limited to the species for which we have data, also displaying the data (in this case, intensity and complexity) at the tips.

      1. "Therapeutic Versatility" is a measure of how many different therapeutic uses a given botanic is listed in the DMM. This is one of the primary comparisons of this study, but the authors do not provide much of a rationale for using this metric. Also, there are 46 therapeutic uses, but many are interrelated such as gastric, gynecology, muscle, neurological, respiratory, skin, and kidney. It is not clear in my reading of the methods if this was also treated in some type of "phylogeny" as well or not. I would assume a real therapeutic versatility metric should be higher for something used for cough, ulcers, gout, and menses rather than something that was used for 4 different, but skin-related complaints.

      The reviewer is correct, and we appreciate this comment. We have modified the definition of versatility in line with the suggestions laid out here. We have provided a detailed explanation of this in our public responses but for ease of reference, we paste this again here:

      Here, we define therapeutic versatility as the number of therapeutic ‘categories’ a drug is used for (the 25 broad categories are represented by shared iconography in Figure 1). Our revised results include analyses using this definition – which are qualitatively identical to our previous results which defined versatility using the 46 individual therapeutic uses.

      We repeated our original ‘versatility’ analyses using the 25 broader categories rather than the 46 individual uses. The results remained largely the same.

      1. Use of icons/pictorial representations in figures. Overall, the use of icons is not necessary - words could be used, and then readers would not need to keep going back and forth to the key in Figure 1 to identify the taste/use. I am very confused by Figure 3. How is the strength of taste shown in this figure? The use of the balance is a confusing representation since I don't associate strength/intensity with weight. Also there are specific tastes that are used more, and others that are used less (but the numbers of those are also more/less). I do not think this figure accomplishes the goal of relaying these findings.

      Whilst we agree that iconography is not strictly necessary, we think it is a good way of graphically representing the results without over-crowding the figures or introducing text sizes too small to read in print. All values are provided in the supporting information if any individual detail is required.

      On the basis of these comments, we have decided to exclude former Fig. 3 and to move former Fig. 4 to the supplementary material (Figure supplement 1). We hope that the removal of this figure and clearer signposting towards the text and numerical tables in the supplementary information alleviate the reviewer’s concerns.

      1. Similarly, figure 4 is unclear. This could be better represented in a table with words and p values listed. But a larger issue is that this shows essentially the same overarching relationship across the therapeutic use cases - high intensity, low complexity. Only the pink kidney (other?) case differs from this pattern. In the discussion, several therapeutic uses are discussed that could need intense tasting medicine - but these are not related directly back to the relationships shown in Figure 4.

      Yes, we agree with the reviewer and have now moved Fig. 4 to the supplementary material (Figure supplement 1).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Note to all Reviewers

      We appreciate the reviewers’ comments and suggestions for improving the manuscript. Below is a summary of new data added and a brief description of the major new results. A detailed point-by-point response follows.

      New data:

      • Figure 1f

      • Figure 2b, f, g

      • Figure 4b

      • Figure S7

      • Figure S8

      • Figure S9

      Summary of major new results/edits:

      • At the request of Reviewer #1 we have updated the name of the degradation tag to be more specific and we now call it the “LOVdeg” tag.

      • We have added new controls demonstrating that light stimulation does not cause photobleaching or toxicity issues (Fig. S7).

      • We now show that LOVdeg can function at various points in the growth cycle, demonstrating robust degradation (Fig. 1f, Fig. S8).

      • We have included relevant controls for the AcrB-LOVdeg efflux pump results (Fig. 2f-g).

      • We have included important benchmarking controls, such as an EL222-only control and SsrA tag control to provide a clearer view of how LOVdeg performance compares to other systems (Fig. S9, Fig. 4b).

      Additional note:

      • While repeating experiments during the revision process we found that the results for the combined action of EL222 and the LOVdeg tag were not as dramatic as in our original measurements, though the overall findings are consistent with our original results. Specifically, we still find that the combination of EL222 and the LOVdeg tag produces a lower signal than either on their own. We have updated these data in the revised manuscript (Fig. 4b).

      Reviewer #1:

      Public Review:

      Specifically controlling the level of proteins in bacteria is an important tool for many aspects of microbiology, from basic research to protein production. While there are several established methods for regulating transcription or translation of proteins with light, optogenetic protein degradation has so far not been established in bacteria. In this paper, the authors present a degradation sequence, which they name "LOVtag", based on iLID, a modified version of the blue-light-responsive LOV2 domain of Avena sativa phototropin I (AsLOV2). The authors reasoned that by removing the three C-terminal amino acids of iLID, the modified protein ends in "-E-A-A", similar to the "-L-A-A" C-terminus of the widely used SsrA degradation tag. The authors further speculated that, given the light-induced unfolding of the C-terminal domain of iLID and similar proteins, the "-E-A-A" C-terminus would become more accessible and, in turn, the protein would be more efficiently degraded in blue light than in the dark.

      Indeed, several tested proteins tagged with the "LOVtag" show clearly lower cellular levels in blue light than in the dark. While the system works efficiently with mCherry (10-20x lower levels upon illumination), the effect is rather modest (2-3x lower levels) in most other cases. Accordingly, the authors propose to use their system in combination with other light-controlled expression systems and provide data validating this approach. Unfortunately, despite the claim that the "LOVtag" should work faster than optogenetic systems controlling transcription or translation of protein, the degradation kinetics are not consistently shown; in the one case where this is done, the response time and overall efficiency are similar or slightly worse than for EL222, an optogenetic expression system.

      The manuscript and the figures are generally very well-composed and follow a clear structure. The schematics nicely explain the underlying principles. However, limitations of the method in its main proposed area of use, protein production, should be highlighted more clearly, e.g., (i) the need to attach a C-terminal tag of considerable size to the protein of interest, (ii) the limited efficiency (slightly less efficient and slower than EL222, a light-dependent transcriptional control mechanism), and (iii) the incompletely understood prerequisites for its application. In addition, several important controls and measurements of the characteristics of the systems, such as the degradation kinetics, would need to be shown to allow a comparison of the system with established approaches. The current version also contains several minor mistakes in the figures.

      We thank reviewer #1 for the feedback and suggestions to strengthen the manuscript. We have addressed these comments in the points that follow and now include important controls and benchmarks for our molecular tool.

      Major points

      1. The quite generic name "LOVtag" may be misleading, as there are many LOV-based tags for different purposes.

      We appreciate that it would be beneficial to have a more specific name. We have updated the name to “LOVdeg” tag, which captures both the inclusion of LOV and the degradation function of the tag.

      Updated throughout the manuscript and figures

      1. Throughout the manuscript, the authors use "expression levels". As protein degradation is a post-expression mechanism, "protein levels" should be used instead.

      We have transitioned to using “protein levels” at many points in the manuscript.

      Updated throughout the manuscript

      1. Degradation dynamics (time course experiments) should be shown. The only time this is done in the current version (in Fig. 4), degradation appears to be in the same range (even a bit slower) than for EL222, which does not support the claim that the "LOVtag" acts faster than other optogenetic systems controlling protein levels.

      In the revised manuscript, time course data are now shown at multiple points. These include new data in Fig. 1f and Fig. S8 that demonstrate degradation at various stages of growth. Fig. S4 also shows the dynamics of degradation when comparing to the addition of exogenously expressed ClpA. We have added text in the results section to point the reader to these data. In addition, we have made minor modifications to the text in the Introduction to avoid making claims about speed comparisons. Fig. 1f, Fig. S8, Fig. S4

      Results: Design and characterization of the AsLOV2-based degradation tag, Introduction

      1. "Frequency" is used incorrectly for Fig. 3. A series of 5 seconds on, 5 seconds off corresponds to a frequency of 0.1 Hz (1 illumination round / 10 s), not of 0.5 Hz. What the authors indicate as "frequency" is the fraction of illumination time. However, the (correct) frequency should be given, as this is likely the more important factor.

      We have changed how we calculate frequency to use the proposed definition of one pulse per time period. We updated the values in the text and in the figure. Fig. 3c

      Results: Tuning frequency response of the LOVdeg tag
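
      The distinction the reviewer draws between pulse frequency and the fraction of time illuminated (duty cycle) can be made concrete with a short sketch. The function names are illustrative and do not come from the manuscript.

```python
def pulse_frequency_hz(on_s: float, off_s: float) -> float:
    """Pulses per second: one illumination round per (on + off) period."""
    return 1.0 / (on_s + off_s)


def duty_cycle(on_s: float, off_s: float) -> float:
    """Fraction of each period during which the light is on."""
    return on_s / (on_s + off_s)


# The reviewer's example: 5 seconds on, 5 seconds off.
on_s, off_s = 5.0, 5.0
print(f"frequency: {pulse_frequency_hz(on_s, off_s):.2f} Hz")
print(f"duty cycle: {duty_cycle(on_s, off_s):.2f}")
```

      For the 5 s on, 5 s off pattern this gives 0.1 Hz with a duty cycle of 0.5, which is why the value originally labelled as a frequency of 0.5 Hz was in fact the illumination fraction.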

      1. To properly evaluate the system, several additional controls are needed:

      a. To test for photobleaching of mCherry by blue light illumination, untagged controls should be shown for the mCherry-based experiments. Fluorescence always seems to be lower upon illumination, except for the AsLOV2*(546) data, where it cannot be excluded that fluorescence readings are saturated. Relatedly, the raw data for OD and fluorescence should be included. Showing a Western blot against mCherry in at least one case would allow to separate the effects of photobleaching and degradation.

      We appreciate the suggestion and have conducted these important controls. We now include new data demonstrating that light induction does not change fluorescence levels using an untagged mCherry control, nor does it significantly affect endpoint OD levels. Based on these results, we did not perform a Western blot because there were no effects to separate. Fig. S7

      b. In Fig. 2b, light + IPTG should be shown to estimate the activity of the system at higher expression levels.

      We have added these to the figure. Light + IPTG modestly increases expression compared to IPTG only, likely due to the saturating level of IPTG added, which achieves near full induction. Fig. 2b

      c. In Fig. 4, EL222 alone should be shown to allow a comparison with the LOVtag. From the data presented, it looks like EL222 is both slightly faster and more efficient than the LOVtag.

      We have added the EL222-only case for comparison with LOVdeg only and EL222 + LOVdeg. We note that Reviewer #3 raised a similar concern. Fig. 4b

      d. The effect of the used light on bacterial viability under exponential and stationary conditions should be shown.

      In this revision, we have added new data on light exposure at various points during exponential and stationary phase (Fig. 1f, Fig. S8). These OD data show that growth curves are similar for all cultures, regardless of the time light is applied during the growth phase. Additionally, we also now include ODs for the photobleaching experiments. These data also show that growth is not significantly altered under continuous light exposure. Figure 1f, Fig. S7b

      1. The claim that "Post-translational control of protein function typically requires extensive protein engineering for each use case" is not correct. The authors should discuss alternative options, e.g. based on dimerization, more extensively and in a less biased manner.

      We have toned down the language in this location and at other points in the manuscript. However, we maintain that other types of post-translational control, such as dimerization or LOV2 domain insertion, require more protein engineering than inserting a degradation tag. For example, we and others have directly demonstrated this in previous work (e.g. DOI: 10.1021/acssynbio.9b00395, 10.1101/2023.05.26.542511, 10.1038/s41467-023-38993-6), where numerous split site or insertion variants need to be screened and fine-tuned for successful light control. In contrast, a degradation mechanism has the potential to require less fine tuning to achieve a light response. We have included the above sources to clarify this point. Introduction, Results: Modularity of the LOVdeg tag

      Minor points

      1. In Suppl. Fig. 1, amino acid numbers seem to be off. Also, the alterations in iLID (compared to AsLOV2) that are not used in "LOVtag" appear to be missing and the iLID sequence incorrect, as a consequence.

      Thank you for catching this. The number indices in Fig. S1 have been corrected. We also realized we were reporting the iLID(C530M) variant in our amino acid sequence and have reverted the 530M back to C. Fig. S1

      1. Why is AsLOV2*(543) more efficiently degraded than AsLOV2(543) (blue column in Fig. 1d) when the dark state should be stabilized in AsLOV2*(543)?

      We are not sure of the exact reason for the increased degradation response in the AsLOV2*(543) variant. It may be that the dark-state stabilizing mutations introduced also have more favorable interactions with degradation machinery, although this is highly speculative.

      1. Why does the addition of EL222 reduce protein levels so strongly in the dark for CpFatB1* (Fig. 5)?

      We believe this effect stems from the EL222 responsive promoter (PEL222). With LOVdeg only, CpFatB1* is expressed from an IPTG inducible promoter (PlacUV5) whereas EL222 responsive constructs necessitate a promoter switch containing an EL222 binding site. We have clarified this point and expanded our discussion of these results.

      Results: Optogenetic control of octanoic acid production

      1. Fig. 2f / S10 are difficult to interpret. Why does illumination only lead to a significant effect at 2.5 and 5 µg/ml and not at lower concentrations, where the degradation system would be expected to be most efficient?

      We have expanded our discussion on these results to explain that this likely stems from basal protein levels of AcrB-LOVdeg in the light that can provide resistance at low antibiotic concentrations. We have also added new controls to this figure to show the chloramphenicol sensitivity of a ΔacrB strain and a ΔacrB strain with an IPTG-inducible version of acrB with no induction, demonstrating the lowest achievable chloramphenicol resistance from a standard inducible system.

      Results: Modularity of the LOVdeg tag, Fig. 2f-g

      1. Fig. 2f / S10 do not measure the MIC (which is a clearly defined value), but the sensitivity to Chloramphenicol.

      We have changed the text to use the term chloramphenicol sensitivity instead of MIC. Results: Modularity of the LOVdeg tag

      1. "***" in Fig. S1 should be explained.

      We have removed the ‘***’ to avoid confusion. Fig. S1

      1. The fold-change differences between light and dark, indicated in some selected cases, should be listed for all figures.

      We have added fold-change values where appropriate. Fig 1d, Fig. 2b
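      As a minimal sketch of how such a fold-change value can be computed, the example below takes the ratio of mean dark-state signal to mean light-state signal. The function name and the fluorescence readings are hypothetical illustrations, not values or code from the manuscript, and the manuscript may normalize its data differently.

```python
from statistics import mean


def fold_change(dark, light):
    """Return the ratio of the mean dark-state signal to the mean light-state signal."""
    light_mean = mean(light)
    if light_mean <= 0:
        raise ValueError("light-state mean must be positive")
    return mean(dark) / light_mean


# Hypothetical fluorescence readings (arbitrary units), not data from the manuscript.
dark_reads = [900.0, 1000.0, 1100.0]
light_reads = [190.0, 200.0, 210.0]

print(fold_change(dark_reads, light_reads))
```

      A ratio above 1 indicates lower signal in the light, which is the direction expected for a light-induced degradation tag.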

      Reviewer #2:

      Public Review:

      In this manuscript the authors present and characterize LOVtag, a modified version of the blue-light-sensitive AsLOV2 protein, which functions as a light-inducible degron in Escherichia coli. Light has been shown to be a powerful inducer in biological systems as it is often orthogonal and can be controlled in both space and time. Many optogenetic systems target regulation of transcription, however in this manuscript the authors target protein degradation to control protein levels in bacteria. This is an important advance in bacteria, as inducible protein degradation systems in bacteria have lagged behind eukaryotic systems due to protein targeting in bacteria being primarily dependent on primary amino acid sequence and thus more difficult to engineer. In this manuscript, the authors exploit the fact that the J-alpha helix of AsLOV2, which unwinds into a disordered domain in response to blue light, contains an E-A-A amino acid sequence which is very similar to the C-terminal L-A-A sequence in the SsrA tag which is targeted by the unfoldases ClpA and ClpX. They truncate AsLOV2 to create AsLOV2(543) and combine this truncation with a mutation that stabilizes the dark state to generate AsLOV2*(543) which, when fused to the C-terminus of mCherry, confers light-induced degradation. The authors do not verify the mechanism of degradation due to LOVtag, but evidence from deletion mutants contained in the supplemental material hints that there is a ClpA dominated mechanism. They demonstrate modularity of this LOVtag by using it to degrade the LacI repressor, CRISPRa activation through degradation of MCP-SoxS, and the AcrB protein which is part of the AcrAB-TolC multidrug efflux pump. In all cases, measurement of the effect of the LOVtag is indirect as the authors measure reduction in LacI repression, reduction in CRISPRa activation, and drug resistance rather than directly measuring protein levels. Nevertheless, the evidence is convincing, although the effect is seemingly smaller than in the case of mCherry degradation; a direct comparison is difficult because different endpoints are measured. The authors further modify LOVtag to contain a known photocycle mutation that slows its reversion time in the dark, so that LOVtag is more sensitive to short pulses of light which could be useful in low light conditions or for very light sensitive organisms. They also demonstrate that combining LOVtag with a blue-light transcriptional repression system (EL222) can decrease protein levels an additional 269-fold (relative to 15-fold with LOVtag alone). Finally, the authors apply LOVtag to a metabolic engineering task, namely reducing expression of octanoic acid by regulating the enzyme CpFatB1, an acyl-ACP thioesterase. The authors show that tagging CpFatB1 with LOVtag allows light-induced reduction in octanoic acid titer over a 24 hour fermentation. In particular, by comparing control of CpFatB1 with EL222 transcriptional repression alone, LOVtag, or both the authors show that light-induced protein degradation is more effective than light-induced transcriptional repression. The authors suggest that this is because transcriptional repression is not effective when cells are at stationary phase (and thus there is no protein dilution due to cell division), however it is not clear from the available data that the cells were in stationary phase during light exposure. Overall, the authors have generated a modular, light-activated degron tag for use in Escherichia coli that is likely to be a useful tool in the synthetic biology and metabolic engineering toolkit.

      We thank Reviewer #2 for the constructive feedback. In the updated manuscript, we now include data demonstrating degradation at different growth stages and address other points brought up in the review to improve understanding of the degradation tag.

      Overall, the authors present a well written manuscript that characterizes an interesting and likely very useful tool for bacterial synthetic biology and metabolic engineering. I have a few suggestions that could improve the presentation of the material.

      Major Comments:

      • Could the authors clarify, perhaps through OD measurements, that the cultures in the octanoic acid experiment are actually in stationary phase during the relevant light induction. It isn't clear from the methods.

      We have updated the Methods to clarify that the cells are entering stationary phase (OD600 = 0.6) when light is either kept on or turned off for production experiments. Production is continued for the following 24 hours. Note that we now show OD measurements in a separate set of experiments (Fig. 1f, Fig. S8).

      Methods: Octanoic acid production experiment. Fig. 1f, Fig. S8

      • Can the authors clarify why there is an overall decrease in protein in the clpX deletion? And is it this initial reduction that is the source of the change in fold in 1C? Similarly, for hslU is it because overall protein levels are higher with the tag? In general, I feel that the interpretation of Supplemental Figures S6-S10 could be moved in more detail to the main text, or at least the main takeaway points. But this is a personal preference, and not necessary to the major flow of the story which is about the utility of the LOVtag tool.

      As shown in Fig. S5, expression of mCherry without any degradation tag is decreased in a clpX knockout strain compared to wild type. This difference may be the result of reduced cell health, and we now note this in the text. The strains shown in Fig. 1c are in wild type cells with normal expression, so this is not the source of the fold change. As for hslU, we agree it is interesting that expression seems to increase. However, the increase is modest and could stem from gene network regulation differences in that strain compared to wild type and may not be related to LOVdeg tag degradation. Each endogenous protease is involved in a wide range of functions within the cell, and it is unknown how global gene expression is impacted. We acknowledge the suggestion of moving the protease results to the main text, but we have ultimately elected to keep these data in the Supplementary Information to maintain the flow in the manuscript. However, we have added additional text pointing the reader to the Supplemental Text and include a brief summary of the findings in the main text.

      Results: Design and characterization of the AsLOV2-based degradation tag

      • What is the source of the poor repression in Figure 2D?

      Presumably, this stems from low levels of the CRISPRa MCP-SoxS activator, even in the presence of light. We have added this point to the text.

      Results: Modularity of the LOVdeg tag

      • In general, it would be nice to have light-only controls for many of the experiments to validate that light is not affecting the indicated proteins or their function.

      We thank the reviewer for this suggestion and note that Reviewer #1 raised a similar concern. We have now included light-only data for a strain containing IPTG-inducible mCherry without the LOVdeg tag (Fig. S7). These data show that light itself, at the levels used in this study, does not affect mCherry expression or cell growth. This strain serves as a direct control for data presented in Fig. 1 and Fig. 2b, as the systems are identical except for the addition of the LOVdeg tag onto either mCherry or the LacI repressor. Additionally, the control translates to other experiments since mCherry is used as a reporter for other systems in this study. Fig. S7

      • It would be nice to directly measure the function of the tool at different phases of E. coli growth to show directly that protein degradation works at stationary phase, rather than the more indirect measurements used in the octanoic acid experiment.

      We thank the reviewer for this suggestion, which significantly strengthens our results. We have added an experiment that tests the LOVdeg tag at different phases of growth (Fig. 1f, Fig. S8). In this experiment, cultures are grown from early exponential to stationary phase, and light is introduced at various points. Exposure windows of 4 hours, ranging from early exponential to stationary phase, all show functional light-inducible degradation. Fig. 1f, Fig. S8.

      Results: Design and characterization of the AsLOV2-based degradation tag

      Minor Comments:

      • It would be nice to make clear that the data in S6d and S7 is repeated, but with the HslUV data in S7.

      We clarified this point in the caption of Fig. S4 (the former Fig. S7 in the original manuscript). Fig. S4 caption

      • Why was 5s picked for the frequency response in Figure 3?

      We picked 5s because 1) it is a substantially shorter timescale than overall degradation dynamics seen for the LOVdeg tag, and 2) we found that shorter pulses could not be reliably achieved with the light stimulation hardware and software we used (Light Plate Apparatus with Iris software). To ensure high fidelity pulses, we opted for 5 second pulses that we empirically determined to be stable throughout long experiments. We have added text clarifying this. Results: Tuning frequency response of the LOVdeg tag

      Reviewer #3:

      Public Review:

      The authors present the mechanism, validation, and modular application of LOVtag, a light-responsive protein degradation tag that is processed by the native degradosome of Escherichia coli. Upon exposure to blue light, the C-terminal alpha helix unfolds, essentially marking the protein for degradation. The authors demonstrate the engineered tag is modular across multiple complex regulatory systems, which shows its potential widespread use throughout the synthetic biology field. The step-by-step rational design of identifying the protein that was most dark stabilized as well as most light-responsive for degradation was useful in terms of understanding the key components of this system. The most compelling data shows that the engineered LOVtag can be fused to multiple proteins and achieve light-based degradation without affecting the original function of the fused protein; however, results are not benchmarked against similar degradation tagging and optogenetic control constructs. Creating fusion proteins that do not alter either of the original functions is often difficult to achieve, and the novelty of this should be expanded upon to drive further impact.

      We appreciate the feedback from Reviewer #3 to improve the manuscript. We have included important controls and benchmarking experiments to address the reviewer’s concerns, which are detailed in the points below.

      Benchmarking:

      The similarity between the L-A-A sequence of SsrA and the E-A-A sequence of LOVtag is one of the pieces of evidence that led the authors to their current protein design. The differences in degradation efficiency between the SsrA degradation tag and LOVtag are not shown, and benchmarking against SsrA would be a valuable way to demonstrate the utility of this construct relative to an established protein tagging tool.

      We thank the reviewer for suggesting an experiment to benchmark performance. We have added new experimental data where a full-length SsrA tag is added to a fusion protein of nearly identical size (mCherry-iLID), allowing us to directly compare performance to mCherry-LOVdeg (Fig. S9). These results show that light-inducible control with the LOVdeg tag decreases protein expression levels to near those achieved with the native SsrA tag. Fig. S9.

      Results: Design and characterization of the AsLOV2-based degradation tag

      Additionally, there is a lack of an EL222-only control presented in Figure 4b and in the results section beginning with "Integrating the LOVtag with EL222...". Without benchmarking against this control the claim that "EL222 and the LOVtag work coherently to decrease expression" is unsubstantiated. No assumptions of synergy can be made.

      We appreciate this comment and note that Reviewer #1 raised a similar concern. We have added data to Fig. 4b with an EL222-only control for comparison. Fig. 4b

      The dramatic change in dark octanoic acid titer between the EL222, LOVtag and combined conditions is surprising, especially in comparison to the lack of change in the dark mCherry expression shown in Figure 4b. These are the only data to suggest that LOVtag may perform better than EL222. However, the inconsistencies in dark state regulation presented in the two experiments, and between conditions in this experiment, bring the latter claim into question. A recommendation is that the authors either repeat this experiment, or comment on the observed discrepancy in dark state octanoic acid titers in their discussion.

      First, a key difference between the data presented in Fig. 4 and Fig. 5 is that the production experiment is conducted over a long time period (24 hours) and the EL222/LOVdeg reporter experiment is conducted over 5 hours. Likely, performance differences between EL222 and the LOVdeg tag become more pronounced as protein accumulation occurs. Second, the LOVdeg-only construct is expressed from a non-EL222 promoter which is able to achieve higher expression (see response to Reviewer #1, Minor point #3). Lastly, a confounding factor is that the relationship between expression of CpFatB1 and octanoic acid production is not completely linear, and there are likely thresholds or expression windows that result in similar endpoint titers. We agree a more detailed examination of how CpFatB1 changes over the course of the production period would be very interesting. However, this is beyond the scope of the present study, whose goal is to introduce and showcase the utility of the LOVdeg tag as a tool. We have added new discussion on this in the Results section to clarify some of these points. We have also repeated all experiments in Fig. 4 and consistently see the LOVdeg tag performing as well as or better than EL222. As noted in the remarks to all reviewers, these data have been updated in the revised manuscript.

      Results: Optogenetic control of octanoic acid production. Fig. 4d

      Based on the methodology presented, no change in the duration in light exposure was tested, even though this may be an important part of the system response. The on/off, for example in Figure 4b, is either all light or all dark, but they claim that their system is beneficial especially at stationary phase. The authors should consider showing the effects of shifting from dark to light at set intervals. (i.e. 1 hr dark then light, 2hr dark until light, etc.) This data would also aid in supporting the utility of this tag for controlling expression during different growth phases, where light may be used after the cells have reached a certain phase.

      We have added new data showing the effect of light stimulation at different times in the growth cycle (see response to Reviewer #2, bullet point #5). These data demonstrate that the LOVdeg tag performs well at various points in the growth cycle. Fig. 1f, Fig. S8.

      Results: Design and characterization of the AsLOV2-based degradation tag

      Minor Revisions Figures:

      • Figure 1:

      • More clarity is needed in the naming conventions for this figure and in the body of the text. For example, a different convention than 546 and 543 should be used to refer to the full and truncated lengths of the tag. It would greatly aid understanding for this to be made more clear. The authors could simply continue to use "full" and "truncated" to refer to them. In addition, the term "stabilizing mutations" in 1c could be changed to read "dark state stabilizing mutations" to aid in clarity.

      When describing the design of the LOVdeg tag, we favored a technically accurate description over simplicity in order to make our engineering process easily comparable to other LOV2 systems. As such, we kept the number-based nomenclature (543 or 546) to represent the domain within the phototropin 1 protein from Avena sativa (AsLOV2). The domain used in this study, and in many other studies, comprises only amino acids 404-546, i.e. not the full sequence, thus saying simply ‘full’ or ‘truncated’ is not technically accurate. We believe the detailed nomenclature, which is limited to one section, is important to provide clarity on exactly what we used for protein engineering. In the revised version we introduce the nickname “LOVdeg” tag earlier and use it throughout the rest of the manuscript.

      Results: Design and characterization of the AsLOV2-based degradation tag

      • 1b It is not clear that this is the dark state stabilized structure in the figure, but is referred to as such only in the body of the text.

      We have added text in the manuscript to clarify this is AsLOV2, not iLID, and have labeled it in the figure caption as well.

      Results: Design and characterization of the AsLOV2-based degradation tag

      • 1d. Fold change is reported in Figure 2d, and may be relevant to include those values in 1d as well.

      Done. Fig. 1d

      • 1e. It is not clear which tag is being used in this bar plot. Please specify that this is the dark state stabilized, truncated tag.

      We have added a title to the plot and language to the caption, both of which clarify this point. Fig. 1e

      • In addition, the microscopy images provided in supplemental material should be included in the first figure as it adds a compelling observation of LOVtag activity.

      We are pleased to hear that the microscopy results are beneficial; however, we elected to leave them in the Supplementary Information to preserve the flow of the manuscript in the text surrounding Fig. 1.

      • Figure 2:

      • 2d. It is unclear what the 2.5x fold change is relative to (the baseline or the dark)

      We have added a line in the figure to clarify the comparison being made. Fig. 2d

      • 2f. More discussion can be added to describe what concentration of chloramphenicol is biologically/bioreactor relevant.

      Our previous studies on the relationship between AcrAB expression and mutation rate (cited in the text) were carried out at a concentration within the range in which the LOVdeg tag is effective (5 μg/ml), suggesting this range to be relevant to tolerance and resistance.

      • Figure 3:

      • We recommend that this data and discussion are better suited for supplementary figures. The results shown here essentially recapitulate the same findings of Zoltowski et al., 2009. In addition, the paper describing this mutation should be cited in this figure caption in addition to the body of the text

      Although these results are in line with previous findings, we believe this dataset is important for several reasons. First, the agreement with known mutations validates the unfolding-based mechanism for degradation control. Second, degradation that is contingent on unfolding of LOV2 offers a direct actuating mechanism of photocycle properties. Other studies, such as Zoltowski et al., examine properties of purified proteins but lack a mechanism to translate those properties into an effect in live cells. This figure demonstrates how degradation can do so and lays the groundwork for degradation-based frequency processing circuits. Last, there are discrepancies between photocycle kinetics in situ, as reported by Li et al. (DOI: 10.1038/s41467-020-18816-8), and in cell-free studies such as in Zoltowski et al. The studies use different methods of measuring photocycle kinetics (in situ vs cell-free). This dataset substantiates relaxation times from Li et al. and suggests cell-free relaxation time constants are overestimated relative to our live cell results.

      • Figure 4:

      • There is a lack of an EL222-only control presented in Figure 4b. Without this data present, the claim that "EL222 and the LOVtag work coherently to decrease expression" is unsubstantiated. No assumptions of synergy can be made.

      We have added EL222-only data to the figure; we note that Reviewer #1 made a similar request. Figure 4b

      Manuscript

      Results

      • Design and characterization...

      • Due to the extensive discussion of ClpX at the beginning of this section, more of the results on evaluating the binding partners and mechanism of LOVtag degradation should be presented in the main body of the manuscript and not in supplementary materials.

      To maintain the flow of the manuscript and the focus on how the LOVdeg tag works as a synthetic biology tool, we have opted to keep this section in the Supplementary Information, but have added several lines in the text related to Fig. 1 that point the reader to this material. Results: Design and characterization of the AsLOV2-based degradation tag

      • In the second paragraph of this section, the authors theorize that the C-terminal truncated E-A-A sequence will "remain caged as part of the folded helix". How did the authors determine this? Was there any evidence to suggest that the truncated state would be any more responsive than the full length sequence? More data or rationale may need to be introduced to support the overall hypothesis presented in this paragraph.

      We determined this by examining the crystal structure which shows that the E-A-A sequence is part of the folded helix. As seen in Fig. 1b, addition of amino acids after the EAAKEL sequence would not be part of the folded helix which ends prior to the terminal leucine. We added text to clarify our logic.

      Results: Design and characterization of the AsLOV2-based degradation tag

      • The similarity between the L-A-A sequence of SsrA and the E-A-A sequence of LOVtag is one of the pieces of evidence that brought the authors to their current protein design. The differences in degradation efficiency between the SsrA degradation tag and LOVtag are not clear, and benchmarking against SsrA would be a valuable way to demonstrate the utility of this construct relative to an established protein tagging tool.

      We added an SsrA comparison to benchmark the system. Fig. S9

      Results: Design and characterization of the AsLOV2-based degradation tag

      • Tuning frequency and response...

      • Overall, the results presented in this section essentially recapitulate the effects that the mutations presented in Zoltowski et al., 2009 have on AsLOV2 dark state recovery. Although this is a useful observation of LOVtag performance, a recommendation is to move this into a supplementary section.

      See above response to Fig. 3 comment.

      • Integrating the LOVtag with EL222...

      • The claim is made in this section that LOVtag and EL222 work synergistically, however the experiments presented do not test repression due to EL222 activity alone. Without benchmarking against this control, the claim of synergy is not supported and we recommend that the authors perform this experiment again with the EL222-only control.

      We have added this important control. Fig. 4b

      Discussion

      • The statement "the LOVtag can easily be integrated with existing optogenetic systems to enhance their function" is not substantiated without benchmarking LOVtag against an EL222-only control. As mentioned above this condition should be included in the experiments discussed in Figure 4 and in the section "Integrating the LOVtag with EL222..."

      We added EL222-only regulation to benchmark the LOVdeg tag and LOVdeg + EL222 experiments. Fig. 4b

      Experiments

      Applications:

      The application of this tag to the metabolic control of octanoic acid production could be more impactful. For instance, the LOVtag could be used with two different enzymes to change the composition of long/short chain fatty acids with light induction, or it could be integrated into a switch to activate production. However, the authors address that "decreasing titers is not the overall goal in metabolic engineering" in their discussion, and therefore the pursuit of this additional experiment is up to the authors' discretion.

      We appreciate the suggestions for further applications of the LOVdeg tag. We envision that follow up studies will focus on the application of the LOVdeg tag in metabolic engineering. However, this will require significant development of production systems. We believe this to be out of the scope of this work, where the goal is to present the design and function of the LOVdeg tag as a tool.

    1. Author Response

      We are very thankful to the reviewers for a thorough review of our manuscript, and we are confident that we can address all identified weaknesses in the revised version. At the current point, we believe that it is important to mention the following:

      1. The review by reviewer 1 contains factual errors. For example, the reviewer writes "There is much important information missing. For instance: how many animals were used per group and how was the breeding done?" Both animal numbers and the breeding scheme are described in detail in the manuscript.

      2. Reviewer 3 criticizes our choice of animal ages used for the analysis of sperm DNA methylation aging. The reviewer suggests that the sperm of our younger group may contain spermatozoa from the 1st wave of spermatogenesis, while our older group cannot be considered chronologically old mice. We have experimental data demonstrating that DNA regions that undergo methylation change with age show a linear association between methylation levels and age across the mouse lifespan (including the ages used in our study). Thus, age-dependent changes in DNA methylation may be analyzed using any two ages, as long as they are different enough to detect the changes. We will include these experimental data in our resubmitted manuscript.
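
      The reasoning in point 2 can be illustrated with a minimal sketch: if methylation at a region changes linearly with age, the rate of change estimated from any two distinct ages is the same. The intercept, slope, ages, and units below are hypothetical values chosen for illustration only; they are not data from the study, and real measurements would carry noise that makes a larger age gap easier to detect.

```python
def methylation(age_weeks, intercept, slope):
    """Hypothetical linear model: methylation (%) as a function of age."""
    return intercept + slope * age_weeks


def slope_from_two_ages(age_a, meth_a, age_b, meth_b):
    """Estimate the rate of methylation change from measurements at two distinct ages."""
    if age_a == age_b:
        raise ValueError("the two ages must differ")
    return (meth_b - meth_a) / (age_b - age_a)


# Assumed parameters for illustration only (not from the manuscript).
INTERCEPT = 20.0  # % methylation at age 0
SLOPE = 0.5       # % methylation gained per week

# Two different pairs of ages recover the same slope under the linear model.
wide_pair = slope_from_two_ages(
    8, methylation(8, INTERCEPT, SLOPE), 40, methylation(40, INTERCEPT, SLOPE)
)
narrow_pair = slope_from_two_ages(
    12, methylation(12, INTERCEPT, SLOPE), 20, methylation(20, INTERCEPT, SLOPE)
)
print(wide_pair, narrow_pair)
```

      Under this assumption, the choice of ages affects only the size of the methylation difference (slope times the age gap), not its direction or rate.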

    1. Author Response

      Reviewer #1 (Public Review):

      Question 1: The experiment that utilizes lactose or glucose supplementation to infer the importance of carbohydrate recognition by galectin-9 cannot be interpreted unequivocally owing to the growth-enhancing effect of lactose supplementation on Mtb during liquid culture in vitro.

      Response: Thanks for this very constructive comment. We will repeat this experiment and lower the concentration of lactose in order to attenuate its effect on Mtb growth, thereby highlighting the reversed mycobacterial growth inhibition by galectin-9.

      Question 2: Similar to the comment above, the apparent dose-independent effect of galectin-9 on Mtb growth in vitro is difficult to reconcile with the interpretation that galectin is functioning as claimed.

      Response: We thank the reviewer for the correction. Indeed, as the reviewer pointed out, galectin-9 inhibits Mtb growth in a dose-independent manner. We will correct the claim in the revised manuscript.

      Question 3: The claimed differences in galectin-9 concentration in sera from tuberculin skin test (TST)-negative or TST-positive non-TB cases versus active TB patients are not immediately apparent from the data presented.

      Response: We appreciate the reviewer’s concern. We will perform the detection of galectin-9 in sera in another independent cohort of active TB patients and healthy donors in China.

      Question 4: Neither fluorescence microscopy nor electron microscopy analyses are supported by high-quality, interpretable images which, in the absence of supporting quantitative data, renders any claims of anti-AG mAb specificity (fluorescence microscopy) or putative mAb-mediated cell wall swelling (electron microscopy) highly speculative.

      Response: We appreciate the reviewer’s concern. We will improve the immunofluorescence assay procedure to obtain high-quality, interpretable images with quantitative data. For the electron microscopy analyses, we will add a more precise label indicating the cell wall in the revised manuscript.

      Question 5: Finally, the absence of any discussion of how anti-AG antibodies (similarly, galectin-9) gain access to the AG layer in the outer membrane of intact Mtb bacilli (which may additionally possess an extracellular capsule/coat) is a critical omission - situating these results in the context of current knowledge about Mtb cellular structure (especially the mycobacterial outer membrane) is essential for plausibility of the inferred galectin-9 and anti-AG mAb activities.

      Response: Indeed, AG is hidden by mycolic acids in the outer layer of the Mtb cell wall. As discussed in the Discussion section of the previous manuscript (line 286), we speculate that during Mtb replication, cell wall synthesis is active and AG becomes exposed, thereby facilitating its binding to galectin-9 or AG antibody and leading to Mtb growth arrest. It is highly possible that galectin-9 or AG antibody targets replicating Mtb. We will describe this point more clearly.

      Reviewer #2 (Public Review):

      Question 1: In light of other observations that cleaved galectin-9 levels in the plasma are a biomarker for severe infection (Padilla A et al. Biomolecules 2021 and Iwasaki-Hozumi H et al. Biomolecules 2021), it is difficult to reconcile the authors' interpretation that the elevated gal-9 in Active TB patients (Figure 1E) contributes to the maintenance of latent infection in humans. The authors should consider incorporating these observations in the interpretation of their own results.

      Response: Thank you for these very insightful comments. We observed elevated levels of galectin-9 in the serum of active TB patients, consistent with reports indicating that cleaved galectin-9 levels in the serum serve as a biomarker for severe infection (Iwasaki-Hozumi et al., 2021; Padilla et al., 2020). We interpret this to mean that elevated levels of galectin-9 in serum of active TB are an indicator of the host immune response to Mtb infection. However, the magnitude of elevated galectin-9 is insufficient to control Mtb infection thereby maintaining latent infection. This is comparable to other protective immune factors such as interferon gamma, which is considered protective and elevated in active TB, as well (El-Masry et al., 2007; Hasan et al., 2009).

      Question 2: The anti-AG titers were measured only in individuals with active TB (Figure 3C), generally thought to be a less protective immunological state. The speculation that individuals with anti-AG titers have some protection is not founded. Further only 2 mAbs were tested to demonstrate restriction of Mtb in culture. It is possible that clones of different affinities for AG present within a patient's polyclonal AG-antibody responses may or may not display a direct growth restriction pressure on Mtb in culture. The authors should soften the claims about the presence of AG-titers in TB patients being indicative of protection.

      Response: We appreciate the reviewer’s concern. As per the reviewer’s suggestion, we will soften the claim that anti-AG antibodies in the sera of TB patients indicate protection.

      References

      El-Masry, S., Lotfy, M., Nasif, W.A., El-Kady, I.M., and Al-Badrawy, M. (2007). Elevated serum level of interleukin (IL)-18, interferon (IFN)-gamma and soluble Fas in patients with pulmonary complications in tuberculosis. Acta microbiologica et immunologica Hungarica 54, 65-77.

      Hasan, Z., Jamil, B., Khan, J., Ali, R., Khan, M.A., Nasir, N., Yusuf, M.S., Jamil, S., Irfan, M., and Hussain, R. (2009). Relationship between circulating levels of IFN-gamma, IL-10, CXCL9 and CCL2 in pulmonary and extrapulmonary tuberculosis is dependent on disease severity. Scandinavian journal of immunology 69, 259-267.

      Iwasaki-Hozumi, H., Chagan-Yasutan, H., Ashino, Y., and Hattori, T. (2021). Blood Levels of Galectin-9, an Immuno-Regulating Molecule, Reflect the Severity for the Acute and Chronic Infectious Diseases. Biomolecules 11.

      Padilla, S.T., Niki, T., Furushima, D., Bai, G., Chagan-Yasutan, H., Telan, E.F., Tactacan-Abrenica, R.J., Maeda, Y., Solante, R., and Hattori, T. (2020). Plasma Levels of a Cleaved Form of Galectin-9 Are the Most Sensitive Biomarkers of Acquired Immune Deficiency Syndrome and Tuberculosis Coinfection. Biomolecules 10.

    1. Author Response

      We thank the reviewers for taking the time to read and review our manuscript. The reviewers raise good points regarding the sample size and the low exposure in the South Asian cohort owing to their unique cultural and social practices. We recognize these as limitations of the paper and will discuss these more extensively in the revised version. With respect to sample size, we are not attempting discovery but rather application of mDNA scores derived from external, large discovery samples. As such, though our sample sizes (n = 300–500) seem low for a typical EWAS, they are in a similar range as replication samples in other studies.

      We would also like to take this opportunity to emphasize there is no possible overfitting as the score was tested in studies (FAMILY and START) independent of the discovery set (Joubert et al., 2016; n > 5,000) and the LASSO validation (CHILD; n = 352). In other words, the same participants used for LASSO validation were not used in testing. This is precisely to leverage the larger sample size from external studies to select more plausible CpGs as candidates to include in the model. In fact, the discovery sample size in Reese et al., (2017) was only n = 1,057 in comparison.

      The validated score was then used for further testing in new datasets (FAMILY and START), where FAMILY achieved a more significant association than in the original validation sample (CHILD). At the same time, the mean squared error for the continuous smoking severity outcome (0 for no smoking, 1 for quit before pregnancy, 2 for quit during pregnancy, and 3 for current smoker) was 0.68 in CHILD and 1.43 in FAMILY, which indicates good fit, while the AUC for predicting current vs. non-smoker was 0.86 in CHILD and 0.9 in FAMILY. Taken together, these suggest that the MRS constructed was not overfitted, where overfitting means “failing to fit to additional data or predict future observations reliably”.
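
      To make the two fit metrics quoted above concrete, the following is a minimal sketch of how a mean squared error on the 0 to 3 smoking severity scale and an AUC for current vs. non-smoker can be computed. The function names and the six toy values are illustrative assumptions only; they are not data from CHILD, FAMILY, or START, and this is not the authors' analysis code.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from any cohort): severity coded 0-3 as in the text.
severity = [0, 0, 1, 2, 3, 3]
score = [0.2, 0.5, 1.0, 1.5, 2.5, 3.0]

mse = mean_squared_error(severity, score)

# Binary outcome: current smoker (severity 3) vs. everyone else.
current_smoker = [1 if s == 3 else 0 for s in severity]
auc_value = roc_auc(current_smoker, score)
```

      An AUC computed this way depends only on the ranking of the scores, whereas the mean squared error also depends on their scale, which is one reason the two metrics are reported side by side.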

      In terms of value, our derived score contained 11 CpGs that only overlapped 2 out of the 28 CpGs in the score that was derived in the reference provided (Reese, EHP, 2017, PMID 27323799), but they shared four genes that contributed the most weight to the score (MYO1G, CYP1A1, AHRR, and GFI1). In fact, using the 7 CpGs of the score derived in Reese that were present in all cohorts, we obtained slightly worse performance in CHILD (validation cohort; ANOVA p = 4.1E-5, AUC 0.74), and it was not associated with smoking history in FAMILY (testing cohort; p = 0.13). However, we do agree with the reviewer that including more CpGs will improve the performance. Using 24/28 CpGs available in CHILD (HM450K), we obtained slightly better results (ANOVA p = 3.8E-7, AUC 0.94), but these were mostly due to the 14/24 CpGs that showed evidence of association with maternal smoking according to EWAS catalog. In conclusion, we believe our score captures the core genes with robust evidence of association and is more parsimonious for applying to external data, but it can also benefit from a larger sample size to capture CpGs that are moderately associated with maternal smoking.

    1. Author Response

      Reviewer #1 (Public Review):

      Overall, the magnitude of the effect size due to FNDC5 deficiency in both male and female mice is rather modest. Looking at the data from a qualitative perspective, it is clear that knockout females still lose bone during lactation and on the low calcium diet (LCD). It is difficult to assess the physiologic consequence of the modest quantitative 'protection' seen in FNDC5 mutants since the mutants still show clear and robust effects of lactation and LCD on all parameters measured. Similarly, the magnitude of the 'increased' cortical bone loss in FNDC5 mutant males is also modest and perhaps could be related to the fact that these mice are starting with slightly more cortical bone. Since the authors do not provide a convincing molecular explanation for why FNDC5 deficiency causes these somewhat subtle changes, I would like to offer a suggestion for the authors to consider (below, point #2) which might de-emphasize the focus of the manuscript on FNDC5. If the authors chose not to follow this suggestion, the manuscript could be strengthened by addressing the consequences of the modest changes observed in WT versus FNDC5 KO mice.

      We agree that the magnitude of the effect size due to FNDC5 deficiency is modest with regard to the quantitative cortical bone parameters. However, if one examines the changes in osteocyte lacunar size and the mechanical properties of these bones, the differences are greater. As shown in Figure 3E, the lacunar area of the WT females on a low calcium diet increases by over 30% and the KO by less than 20%, while in the males it is approximately 38% in WT compared to 46% in KO mice. According to Sims and Buenzli (PMID: 25708054), a potential total loss of ~16,000 mm³ (16 mL) of bone occurs through lactation in the human skeleton. This was based on our measurements in lactation-induced murine osteocytic osteolysis (Qing et al., PMID: 22308018). They used our 2D section of tibiae from lactating mice showing an increase in lacunar size from 38 to 46 µm². In that paper we also showed that canalicular width is increased with lactation. Therefore, this would suggest a dramatic decrease in intracortical porosity due to the osteocyte lacunocanalicular system in female KO on a low calcium diet compared to WT females and a dramatic increase in KO males compared to WT males. Also, PTH was higher in the serum of female WT compared to female KO mice on a low calcium diet, and the opposite was true for males, in order to maintain normal calcium levels (see Table 1). Based on these data, using the FNDC5 null animals, we would speculate that the product of FNDC5, irisin, is having a highly significant effect on the ultrastructure of bone in both males and females challenged with a low calcium diet.
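
      As a small worked example of the arithmetic behind the lacunar size comparison, the sketch below converts the 38 to 46 square micrometre increase cited above into a relative change. The helper name is an assumption introduced for illustration; only the two area values come from the response.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Lacunar area in lactating mice as quoted in the response (square micrometres).
lactation_change = percent_change(38.0, 46.0)  # roughly a 21% increase
```

      Expressing each group as a relative change is what allows the WT and KO responses to be compared despite different baseline areas.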

      2) The bone RNA-seq findings reported in Figures 4-6 are quite interesting. Although Youlten et al previously reported that the osteocyte transcriptome is sex-dependent, the work here certainly advances that notion to a considerable degree and likely will be of high interest to investigators studying skeletal biology and sexual dimorphism in general. To this end, one direction for the authors to consider might be to refocus their manuscript toward sexually-dimorphic gene expression patterns in osteocytes and the different effects of LCD on male versus female mice. This would allow the authors to better emphasize these major findings, and to then use FNDC5 deficiency as an illustrative example of how sexually-dimorphic osteocytic gene expression patterns might be affected by deletion of an osteocyte-acting endocrine factor. Ideally, the authors would confirm RNA-seq data comparing male versus female mice in osteocytes using in situ hybridization or immunostaining.

      Thank you for this suggestion. We have compared the different effects of LCD on male versus female mice in our revised version and have added a figure containing this information.

      3) Along the lines of point #2 (above), the presentation of the RNA-seq studies in Figures 4-6 is somewhat confusing in that the volcano plot titles seem to be reversed. For example, Figure 4A is titled "WT M: WT F", but the genes in the upper right quadrant appear to be up-regulated in female cortical bone RNA samples. Should this plot instead be titled "WT F: WT M"? If so, then all other volcano plots should be re-titled as well.

      We have now ensured that the plots are appropriately labeled.

      4) Have the authors compared male versus female transcriptomes of LCD mice?

      We have now compared the male vs female transcriptomes of LCD mice and added an additional figure.

      5) It would be appreciated if the authors could provide additional serum parameters (if possible) to clarify incomplete data in both lactation and low-calcium diet models: RANKL/OPG ratio, Ctx, PTHrP, and 1,25-dihydroxyvitamin D levels.

      It is not possible to quantitate each of these as the serum has been exhausted. We have checked the RANKL/OPG ratio in the RNA-seq and qPCR data using osteocyte-enriched bone chips and found no difference.

      6) Lastly, the data that overexpressing irisin improved bone properties in Fig 2G was somewhat confusing. Based on Kim et al.'s (2018) work, irisin injection increased sclerostin gene expression and serum levels, thus reducing bone formation. Were sclerostin levels affected by irisin overexpression in this study? Was irisin's role in modulating sclerostin levels attenuated with additional calcium deficiency?

      We have not observed any differences in the osteocyte Sost mRNA expression between WT and KO normal and low-calcium-diet male and female mice in our RNAseq and qPCR data. As such, we did not check the Sost levels for the 2G experiment.

      Reviewer #2 (Public Review):

      Summary:

      The goal of this study was to examine the role of FNDC5 in the response of the murine skeleton to either lactation or a calcium-deficient diet. The authors find that female FNDC5 KO mice are somewhat protected from bone loss and osteocyte lacunar enlargement caused by either lactation or a calcium-deficient diet. In contrast, male FNDC5 KO mice lose more bone and have a greater enlargement of osteocyte lacunae than their wild-type controls. Based on these results, the authors conclude that in males irisin protects bone from calcium deficiency but that in females it promotes calcium removal from bone for lactation.

      While some of the conclusions of this study are supported by the results, it is not clear that the modest effects of FNDC5 deletion have an impact on calcium homeostasis or milk production.

      Specific comments:

      1) The authors sometimes refer to FNDC5 and other times to irisin when describing causes for a particular outcome. Because irisin was not measured in any of the experiments, the authors should not conclude that lack of irisin is responsible. Along these lines, is there any evidence that either lactation or a calcium-deficient diet increases the production of irisin in mice?

      The global FNDC5 KO mice used for our experiments do not produce or secrete irisin; therefore, we have extrapolated that the observed effects are due to a lack of circulating irisin. This does not rule out that Fndc5 itself could have a function, but any such function would most likely be in muscle and not in the osteocyte, as we do not detect significant levels of irisin in either primary osteoblasts or primary osteocytes compared to muscle and C2C12 cells. As such, we concluded that the phenotypical differences we saw in our experiments are due to a lack of irisin. We now address the reviewer’s point in the discussion. The measurement of irisin in the circulation with lactation or with low calcium diet of normal mice has not been performed.

      2) The results of the irisin-rescue experiment shown in figure 2G cannot be appropriately interpreted without normal diet controls. In addition, some evidence that the AAV8-irisin virus actually increased irisin levels in the mice would strengthen the conclusion.

      We do not have the normal diet controls at this time. We have now added the quantitative data for tagged irisin in these mice, showing highly significant expression.

      3) There is insufficient evidence to support the idea that the effect of FNDC5 on bone resorption and osteocytic osteolysis is important for the transfer of calcium from bone to milk. Previous studies by others have shown that bone resorption is not required to maintain milk or serum calcium when dietary calcium is sufficient but is critical if dietary calcium is low (Endo. 156:2762-73, 2015). To support the conclusions of the current study, it would be necessary to determine whether FNDC5 is required to maintain calcium levels when lactating mice lack sufficient dietary calcium.

      We agree that it would be important to measure calcium levels in the milk to test the hypothesis that FNDC5 is important to maintain calcium levels in milk. However, as the calcium levels are normal in the serum, we are assuming they are normal in milk. This would require future experiments.

      4) The amount of cortical bone loss due to lactation is very similar in both WT and FNDC5 KO mice. The results of the statistical analysis of the data presented in figure 1B are surprising given the very similar effect size of lactation. The key result from the 2-way ANOVA is whether there is an effect of genotype on the effect size of lactation (genotype-lactation interaction). The interaction terms were not provided. Similar concerns are noted for the results shown in figure 1G and H.

      We agree, thanks. We will now add the interaction terms in the figure legends.
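
      The genotype-lactation interaction the reviewer asks for can be illustrated with a minimal sketch of the interaction F statistic in a balanced 2 x 2 design. The function name, group labels, and measurements below are invented for illustration and are not the study's data; a real analysis would use a statistics package and would also report the p-value.

```python
from statistics import mean


def interaction_f(cells):
    """F statistic for the interaction term of a balanced two-way ANOVA.

    `cells` maps (genotype, condition) to a list of measurements; every
    cell must hold the same number of observations.
    """
    genotypes = sorted({g for g, _ in cells})
    conditions = sorted({c for _, c in cells})
    n = len(next(iter(cells.values())))  # observations per cell

    cell_mean = {key: mean(values) for key, values in cells.items()}
    # In a balanced design the grand mean equals the mean of the cell means.
    grand = mean(list(cell_mean.values()))
    g_mean = {g: mean([cell_mean[(g, c)] for c in conditions]) for g in genotypes}
    c_mean = {c: mean([cell_mean[(g, c)] for g in genotypes]) for c in conditions}

    # Interaction sum of squares: what remains of each cell mean after
    # removing both main effects.
    ss_interaction = n * sum(
        (cell_mean[(g, c)] - g_mean[g] - c_mean[c] + grand) ** 2
        for g in genotypes
        for c in conditions
    )
    # Error sum of squares: within-cell scatter.
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, values in cells.items() for x in values
    )

    df_interaction = (len(genotypes) - 1) * (len(conditions) - 1)
    df_error = len(cells) * (n - 1)
    return (ss_interaction / df_interaction) / (ss_error / df_error)


# Hypothetical measurements (arbitrary units), two animals per group.
toy = {
    ("WT", "virgin"): [10, 12],
    ("WT", "lactation"): [6, 8],
    ("KO", "virgin"): [10, 12],
    ("KO", "lactation"): [8, 10],
}
f_value = interaction_f(toy)
```

      A large interaction F relative to its null distribution would indicate that the effect of lactation differs by genotype, which is the comparison the reviewer identifies as the key result.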

      5) It is not clear what justifies the term 'primed' or 'activated' for resorption. Is there evidence that a certain level of TRAP expression lowers the threshold for osteocytic osteolysis in response to a stimulus?

      The number of TRAP-positive osteocytes in female KO mice is lower than in female WT. The number of TRAP-positive osteocytes is lower in WT males compared to WT females. We propose that irisin plays a role in the number of TRAP-positive osteocytes in normal, WT females by readying or preparing these cells to rapidly respond to low calcium. We will use the term ‘primed’ and will not use the term ‘activated’. We are open to any terminology or description as to why this is observed and what irisin could be doing to the osteocyte.

      Reviewer #3 (Public Review):

      Summary: Irisin has previously been demonstrated to be a muscle-secreted factor that affects skeletal homeostasis. Through the use of different experimental approaches, such as genetic knockout models, recombinant Irisin treatment, or different cell lines, the role of Irisin on skeletal homeostasis has been revealed to be more complex than previously thought and this warrants further examination of its role. Therefore, the current study sought to rigorously examine the effects of global Irisin knockout (KO) in male and female mouse bone. Authors demonstrated that in calcium-demanding settings, such as lactation or low-calcium diet, female Irisin KO mice lose less bone compared to wild-type (WT) female mice. Interestingly male Irisin KO mice exhibited worse skeletal deterioration compared to WT male mice when fed a low-calcium diet. When examined for transcriptomic profiles of osteocyte-enriched cortical bone, authors found that Irisin KO altered the expression of osteocytic osteolysis genes as well as steroid and fatty acid metabolism genes in males but not in females. These data support the authors' conclusion that Irisin regulates skeletal homeostasis in sex-dependent manner.

      Strengths: The major strength of the study is the rigorous examination of the effects of Irisin deletion in the settings of skeletal maturity and increased calcium demands in female and male mice. Since many of the common musculoskeletal disorders are dependent on sex, examining both sexes in the preclinical setting is crucial. Had the investigators only examined females or males in this study, the conclusions from each sex would have contradicted each other regarding the role of Irisin on bone. Also, the approaches are thorough and comprehensive that assess the functional (mechanical testing), morphological (microCT, BSEM, and histology), and cellular (RNA-seq) properties of bone.

      Weaknesses: One of the weaknesses of this study is a lack of detailed mechanistic analysis of why Irisin has a sex-dependent role on skeletal homeostasis. This absence is particularly notable in the osteocyte transcriptomic results where such data could have been used to further probe potential candidate pathways between LC females vs. LC males.

      Our future studies will focus on understanding the molecular mechanism behind the sex-dependent effects of irisin. Our RNA seq data shows a significant difference in the lipid, steroid, and fat metabolism pathways between male and female mice, as well as between WT and KO mice. Future studies will focus on these pathways.

      Another weakness is that the authors did not present data that convincingly demonstrate that Irisin secretion is altered in the skeletal muscle between female vs. male WT mice in response to calcium restriction. The supplementary skeletal muscle data only present functional and electrophysiological outcomes. Since Itgav or Itgb5 were not different in any of the experimental groups, it is assumed that the changes in the level of Irisin are responsible for the phenotypes observed in WT mice. Assessing Irisin expression will further strengthen the conclusion based on observing skeletal changes that occur in Irisin KO male and female mice.

      The problem is that the commercial assays for irisin are not dependable, and results can differ widely across and beyond the physiologic range of 1-10 ng/ml. In part this is due to the nature of the polyclonal antibodies used and the resultant cross reactivity with other proteins. It was shown in Islam et al, 2021 (Nature Metabolism) that the commercial ELISAs were completely unreliable in mice and the only reliable method of measuring circulating irisin is mass spectrometry.

    1. Author Response

      Reviewer #1 (Public Review):

      Strengths:

      1. In my assessment, the data sufficiently demonstrate that a modified version of Pertuzumab can bind both the wild-type and S310 mutant forms of ERBB2.

      2. The engineering strategy employed is rational and effectively combines computational and experimental techniques.

      3. Given the clinical activity of HER2-targeting ADCs, antibodies unaffected by ERBB2 mutations would be desired.

      Weaknesses:

      1. There is no data showing that the engineered antibody is equally specific as Pertuzumab, i.e. that it does not bind to other (non-ERBB2) proteins.

      Showing the specificity of the engineered antibodies is indeed important. We did not address it in the current ms, but it can be tested in the future.

      2. There is no data showing that the engineered antibody has the desired pharmacokinetics/pharmacodynamics properties or efficacy in vivo.

      In this ms we did not conduct in-vivo experiments. When moving forward, pharmacokinetics/pharmacodynamics properties and efficacy will be tested as well.

      3. Computational approaches are only used to design a phage-screen library, but not used to prioritize mutations that are likely to improve binding (e.g. based on predicted impact on the stability of the interaction). A demonstration of how computational pre-screening or lead optimization can improve the time-intensive process would be a welcome advance.

      Thank you for this important comment. In the present ms we indeed used a computational approach for prioritizing residues to be mutated, but we did not prioritize the mutations that are likely to improve binding. In the initial library design, we did prioritize the mutations. However, due to limitations of the experimental approach with codon selection for the library, we decided to allow all possible residues in each position, knowing that the selection would remove non-binding variants.

      Context:

      The conflict of interest statement is inadequate. Most authors of the study (but not the first author) are employees of Biolojic, a company developing multi-specific antibodies, but the statements do not clarify whether the presented antibodies represent Biolojic IP, whether the company sponsored the research, and whether the company is further developing the specific antibodies presented.

      The Conflict-of-Interest statement will be revised as follows: The Biolojic Design authors are employees of Biolojic Design and have stock options in Biolojic Design. The company did not sponsor the research, does not hold IP for the presented antibodies, and is not further developing the presented antibodies.

      Reviewer #2 (Public Review):

      Strengths:

      1. Deep computational analyses of large datasets of clinical data provide useful information about HER2 mutations and their potential relevance to antibody therapy resistance.

      2. There is valuable information analyzing the residues within or near the interface between the antigen HER2 and the Pertuzumab antibody (heavy chain). The experimental antibody library screening obtained 90+ clones from 3.86×10^11 sequences for further functional validation.

      Weaknesses:

      1. There is a lack of assessment for antibody variant functions in cancer cell phenotypes in vitro (proliferation, cell death, motility) or in vivo (tumor growth and animal survival). The only assay was the western blotting of phospho-HER3 in Figure 4. However, HER2 levels and phospho-HER2 were not analyzed.

      We indeed did not assess the engineered antibodies' function in cancer cells. Regarding signaling assessment, previous works [1-3] also measured the signaling activation following HER2-HER3 dimerization by measuring pHER3, and we relied on them in this ms.

      2. There is a misleading impression from the title of computational engineering of a therapeutic antibody and the statement in the abstract "we designed a multi-specific version of Pertuzumab that retains original function while also binding these HER2 variants" for a few reasons:

      a. The primary method used for variant antibody identification for HER2 mutant binding is rather traditional experimental screening based on yeast display instead of the computational design of a multi-specific version of Pertuzumab.

      b. There is insufficient or lack of computational power in the antibody design or prioritization in choosing variant residues for the library construction of 3.86×10^11 sequences. It seems random combinations from 6 residues out of 4 groups with 20 amino acid options.

      c. The final version of the tri-binding variant is a combination of screened antibody clones instead of computation design from scratch.

      d. There is incomplete experimental evidence about the therapeutic values of newly obtained antibody clones.

      Thank you for this relevant comment. When addressing relevant residues to be mutated, the number of potential variants is enormous. The computational approach was aimed at identifying the most preferable residues, in which variation can improve binding and is not likely to harm important interactions. Although an initial smaller number of residues could be chosen, we decided to broaden our view and create a larger library, with the aim of combining the computational selection with an experimental selection. This indeed is not a computational design from scratch, but rather an interplay between the computer and the lab that yielded the presented results.

      3. Figures can be improved with better labeling and organization. Some essential pieces of data such as Supplementary Figure 1B on HER2 mutations in S310 that abrogated its binding to Pertuzumab should be placed in the main figures.

      Thank you for this comment, the relevant figures will be moved to the main text, and the labels will be revised.

      4. It is recommended to provide a clear rationale or flowchart overview in the main Figure 1. Figure 2A can be combined with Figure 1 to list the targeted residues.

      Figures 1 and 2 will be divided differently, and the rationale will be detailed in the revised text.

      5. The quality of Figures such as Figure 2B-C flow data needs to be improved.

      This will be corrected in the revised text.

      1. Diwanji, D., et al., Structures of the HER2-HER3-NRG1β complex reveal a dynamic dimer interface. Nature, 2021. 600(7888): p. 339-343.

      2. Yamashita-Kashima, Y., et al., Mode of action of pertuzumab in combination with trastuzumab plus docetaxel therapy in a HER2-positive breast cancer xenograft model. Oncol Lett, 2017. 14(4): p. 4197-4205.

      3. Kang, J.C., et al., Engineering multivalent antibodies to target heregulin-induced HER3 signaling in breast cancer cells. MAbs, 2014. 6(2): p. 340-53.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      This reviewer found the paper of very high interest, well supported, and well written. I have only a few suggestions to the authors for further improvement:

      1. TRAIL mutants carrying individual mutations of basic residues R119, R122 and K125 were tested, but a TRAIL mutant lacking all three residues was not. This combined mutant protein would have allowed to test whether all heparin binding is abolished (e.g. that no other residues contribute to HS binding) and could have also been used as an independent control replacing heparin and heparinase treatment in binding/apoptosis studies. Given that the DR4/5 and heparin binding sites of TRAIL do not overlap, this form would be useful in determining the extent to which HS contributes to, or serves as a prerequisite for TRAIL binding to its receptor and cell death. Moreover, if bound to the receptor, this mutant TRAIL is expected to completely prevent HS-mediated receptor internalization. The added value of this experiment therefore is that it may provide an answer to the controversial debate on whether DR receptor internalization promotes or inhibits apoptosis.

      In Fig. 5C, we provided data showing that the binding of the R115A mutant of hTRAIL (equivalent to the murine R119A mutant) to MB-453 cells was very similar to the binding of WT hTRAIL to heparin lyase-treated cells. This finding suggests that nearly all HS-dependent binding to cell surface HS was abolished by mutating R115. Since a single mutant is sufficient, we felt there is little point in combining multiple mutations. We also used the R115A mutant as an independent control replacing heparin and heparinase treatment in the apoptosis assay in Fig. 7E. With regard to using the mutant in the internalization assay, we thank the reviewer for this excellent suggestion and will incorporate it into our future study, as we intend to perform a more in-depth investigation of the exact mechanism of internalization.

      2. The domain data is interesting, but its physiological significance remains obscure and it also somewhat distracts from the main theme of the study. It may be removed from a revised manuscript.

      We partially agree with the reviewer’s assessment, but we felt that this discovery is of sufficient novelty and should be made known to the whole community.

      3. TUNEL data is shown as a picture in Figure 6, but quantification is lacking.

      We have included the statistics of the TUNEL data in the final version as Fig. 6D.

      4. Is the HS20 antibody a well-suited pan-anti-HS antibody? Why was this antibody used instead of heparinase digestion followed by the use of HS "stub" antibodies that were previously used as a reliable readout for overall sulfation?

      The HS20 mAb has been very well characterized by Dr. Mitchell Ho's group (Gao et al., 2016). We have also done a side-by-side comparison of HS20 and the most commonly used anti-HS mAb, 10E4, by immunostaining and FACS. In nearly all tissues and cells tested, HS20 gave better sensitivity and lower background (after heparin lyase treatment) compared to 10E4. The staining patterns of the two mAbs are usually identical, but the signal/noise ratio of HS20 is much better than that of 10E4. The HS “stub” antibody can be useful in certain applications, but it is used mainly as an indicator of the distribution/abundance of HSPGs, rather than a readout of overall sulfation.

      5. The discussion should be stripped from expressions such as interestingly, curiously, unexpectedly, certainly, undoubtedly and the like to improve readability. The manuscript should be checked for typos (for example surface plasma resonance line 473, was served line 481).

      We thank the reviewer for the suggestions and many of these expressions were removed in the final version.

      6. Last but not least: to test the physiological relevance of these findings, it would be of the highest interest to use a mouse model harboring a tumor cell line of choice and derived lines with impaired or increased HS expression, as outlined in my public comments, and to test tumor responsiveness to TRAIL treatment. If already planned, I wish you Good Luck with the experiments!

      We thank the reviewer for this excellent suggestion and we have indeed planned to do exactly that!

      Reviewer #2 (Recommendations For The Authors):

      1. The authors showed in Fig.2 that 12mer HS forms complex with TRAIL homotrimer. Please clarify if 12mer HS binding leads to the formation of the TRAIL homotrimer or TRAIL can form homotrimer in the absence of HS binding. Do the TRAIL mutations that affect HS binding, such as R115A, also impact the homotrimer formation?

      TRAIL automatically forms a homotrimer independent of HS. It is known that formation of the homotrimer critically depends on a zinc ion, which is located on the threefold axis of the trimer and is bound by cysteine 240. We have also verified that all TRAIL mutants remain homotrimeric by size exclusion chromatography.

      1. Does 12mer HS also suppress TRAIL-mediated apoptosis in MDA-MB-453 cells?

      We thank the reviewer for this question but felt that performing this experiment would not add any more insight to the main conclusion. Most likely, the result would be similar to what we saw in Fig. 7D, where we found that 12mer significantly inhibits TRAIL-induced apoptosis, but less efficiently than heparin.

      1. The authors nicely showed the correlation between surface HS level and sensitivity to TRAIL-induced apoptosis in MM cell lines and implicated that such correlation could be related with the difference in the expression level of SDC1. This is an interesting point worth further validation. Does ectopic SDC1 expression in IM-9 cells lead to increase cell surface HS and sensitivity to TRAIL treatment? On the other hand, will depletion of SDC1 expression in U266 or RPMI8226 cells decrease their sensitivity to TRAIL treatment?

      We agree that this would be an excellent experiment to try and have actually attempted to overexpress SDC1 in IM-9 cells. But we found IM-9 cells are very difficult to transfect and we only managed to convert a small percentage of SDC1-negative cells to positive cells. Also, the level of SDC1 expression on the SDC1-positive cells was not changed after overexpression. We have not tried depleting SDC1 expression in U266 and RPMI8226 cells because such an experiment might change the properties of these cells in unexpected ways, which would make result interpretation impossible. A previous report has shown that knocking down SDC1 could enhance clustering of TRAIL receptors in H929 cells (Wu et al., J Immunol 2012), which actually led to slightly increased apoptosis.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study extends insights on NAFLD and NASH regarding the role of plasma lactate levels using mice haplo-insufficient for the gene encoding lactate transporter MCT-1. While the evidence is largely convincing and the work significantly advances our understanding of the roles of distinct hepatic cell types in steatosis, a number of issues require attention and would best be solved by further experimentation.

      RESPONSE: We agree with this assessment by eLife, and appreciate the reviewers’ view that the study is important and extends insights into liver disease.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors put forth the hypothesis that hepatocyte and/or non-parenchymal liver MCT1 may be responsible for physiologic effects (lower body weight gain and less hepatic steatosis) in MCT1 global heterozygote mice. They generate multiple tools to test this hypothesis, which they combine with mouse diets that induce fatty liver, steatohepatitis and fibrosis. Novel findings include that deletion of hepatocyte MCT1 does not change liver lipid content, but increases liver fibrosis. Deletion of hepatic stellate cell (HSC) MCT1 does not substantially affect any liver parameter, but concomitant HSC MCT1 deletion does reverse fibrosis seen with hepatocyte MCT1 knockout or knockdown. In both models, plasma lactate levels do not change, suggesting that liver MCT1 does not substantially affect systemic lactate. In general, the data match the conclusions of the manuscript, and the studies are well-conducted and well-described. Further work would be necessary to dissect mechanism of fibrosis with hepatocyte MCT1, and whether this is due to changes in local lactate (as speculated by the authors) or another MCT1 substrate. This would be important to understand this novel potential cross-talk between hepatocytes and HSCs.

      A parallel and perhaps more important advance is the generation of new methodology to target HSC in mice, using modified siRNA and by transduction of AAV9-Lrat-Cre. Both methods would reduce the need to cross floxed mice with the Lrat-Cre allele, saving time and resources. These tools were validated to an extent by the authors, but not sufficiently to ensure that there is no cross-reactivity with other liver cell types. For example, AAV9-LratCre-transduced MCT1 floxed mice show compelling HSC but not hepatocyte Mct1 knockdown, but other liver cell types should be assessed to ensure specificity. This is particularly important as overall liver Mct1 decreased by ~30% in AAV9-Lrat-Cre-transduced mice, which may exceed HSC content of these mice, especially when considering a 60-70% knockdown efficiency. This same issue also affects Chol-MCT1-siRNA, which the authors demonstrate to affect hepatocytes and HSC, but likely affects other cell types not tested. As this is a new and potentially valuable tool, it would be important to assess Mct1 expression across more non-parenchymal cells (i.e. endothelial, cholangiocytes, immune cells) to determine penetration and efficacy.

      RESPONSE: We appreciate the reviewer’s view that the new methods we describe represent an important advance. To ensure the specificity of our novel AAV-Lrat-Cre construct, it would be fair to test its distribution among all possible hepatic cell types, including endothelial cells, cholangiocytes, and other immune cells, as suggested. Our efforts in this study were primarily focused on the major cell types thought to contribute to NASH, namely hepatocytes, Kupffer cells, and in particular hepatic stellate cells. The reasons for this focus were:

      1) Our primary goal was to investigate the role of MCT1 in hepatic fibrogenesis. According to Mederacke et al. (2013, Nature Comm), hepatic stellate cells account for the dominant proportion (82-96%) of myofibroblast progenitors, which produce collagen fibers. While there may be interesting roles of MCT1 in those other cell types, to elucidate MCT1's role in fibrogenesis, focusing on the dominant fibrogenic cell type, hepatic stellate cells, was the most appropriate approach for this goal.

      2) Considering the proportion of each hepatic cell type in the liver, hepatocytes constitute the majority (60-70%), followed by endothelial cells (15%), immune cells (10%), and stellate cells (5%), among others.

      3) The AAV-Cre system is highly specific to its promoter, in this case, Lrat, which has been well established in multiple previous studies to exhibit high specificity for hepatic stellate cells in the liver. We will certainly conduct more comprehensive biodistribution studies in the future, as we believe that our AAV-Lrat-Cre system could be a valuable tool in this field.
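
      The reviewer's concern about the ~30% whole-liver decrease can be illustrated with back-of-the-envelope arithmetic, using the ~5% stellate cell proportion from point 2 and the 60-70% knockdown efficiency the reviewer cites. The sketch below is only a rough illustration: the function names are hypothetical, and the default assumption that every liver cell carries the same amount of Mct1 mRNA is a simplification that is not established in the text.

```python
def hsc_share_of_liver_mct1(hsc_fraction, relative_expression=1.0):
    """Share of whole-liver Mct1 mRNA contributed by stellate cells (HSCs).

    relative_expression is a hypothetical parameter: per-cell Mct1 mRNA in
    HSCs relative to the average non-HSC liver cell (1.0 = equal expression).
    """
    hsc_signal = hsc_fraction * relative_expression
    other_signal = 1.0 - hsc_fraction
    return hsc_signal / (hsc_signal + other_signal)


def expected_whole_liver_decrease(hsc_fraction, knockdown, relative_expression=1.0):
    """Fractional drop in whole-liver Mct1 if only HSCs are knocked down."""
    return hsc_share_of_liver_mct1(hsc_fraction, relative_expression) * knockdown


# ~5% of liver cells are HSCs; knockdown efficiency of 60-70% within HSCs.
for knockdown in (0.60, 0.70):
    decrease = expected_whole_liver_decrease(0.05, knockdown)
    print(f"HSC knockdown {knockdown:.0%}: whole-liver decrease {decrease:.1%}")
```

      Under the equal-expression assumption the expected whole-liver decrease is only a few percent, so a ~30% decrease would imply either much higher per-cell Mct1 expression in stellate cells or recombination in additional cell types.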

      Reviewer #2 (Public Review):

      In this study, the authors seek to answer two main questions: 1) Whether interfering with lactate availability in hepatocytes through depletion of hepatocyte specific MCT-1 depletion would reduce steatosis, and 2) Whether MCT-1 in stellate cells promote fibrogenesis. While the first question is based on the observation that haploinsufficiency of MCT-1 makes mice resistant to steatosis, the rationale behind how MCT-1 could impact fibrogenesis in stellate cells is not clear. A more detailed discussion regarding how lactate availability would regulate two different processes in two different cell types would be helpful. The authors employ several mouse models and in vitro systems to show that MCT1 inhibition in hepatic stellate cells reduces the expression of COL-1. The significance of the findings is moderately impacted due to the following considerations:

      RESPONSE: We have included additional in vitro data in order to provide a more comprehensive discussion of MCT1's potential role in regulating collagen production. Please refer to the new Figure 8, Supplementary Figure 6, and the results section (Potential Mechanism). Also note that our original hypothesis was that depleting MCT1 specifically in hepatocytes would protect mice with MCT1 haploinsufficiency from liver lactate overload and NAFLD. Furthermore, we postulated that this protection might prevent NASH progression since lipotoxicity-driven hepatocyte damage is a central factor in NASH pathogenesis. However, our findings did not support this hypothesis. We found only one brief article (2015, Z Gastroenterol et al., "Functional effects of monocarboxylate transporter 1 expression in activated hepatic stellate cells") that discussed the potential role of MCT1 depletion in hepatic stellate cells in regulating collagen production or fibrosis, as mentioned in their abstract. Unfortunately, the DOI for this article is not functional, and the data cannot be located. Moreover, when we attempted to replicate their results, we were unable to do so, leading us to report our own findings in the current paper.

      a. Fibrosis in human NAFLD is a significant problem as a predictor of liver related mortality and is associated with type 1 and type 3 collagen. However, the reduction in COL1 in stellate cells did not amount to a reduction in liver fibrosis even in cell specific KO (in Fig 7E, there is no indication of whether Sirius red staining was different between HSC KO and control mice- the authors mention a downward trend in the text). The authors postulate that type 1 COL may not be the more predominant form of fibrosis in the model. This does not seem likely, since the same ob/ob mouse model was used to determine that fibrosis was enhanced with hepatocyte specific MCT-1 KO and decreased with Chol MCT-1KO. Measurements of different types of collagens in their model and the effect of MCT-1 on different types could be more informative. In particular, although collagens are the structural building blocks for hepatic fibrosis, fibrosis can also be controlled by matrix remodeling factors such as Timp1, Serpine 1, PAI-1 and Lox.

      RESPONSE: We monitored the expression levels of matrix remodeling factors, such as Timp1 (Figure 5C, 5F). There was no change in expression upon Chol-MCT1-siRNA treatment, while a significant increase was observed upon GN-MCT1-siRNA treatment. This trend was similar to collagen expression in both cases. Regarding the different types of collagen, instead of measuring each individual type of collagen, we conducted Sirius red and trichrome staining, which enabled us to detect multiple types of collagen simultaneously (Figure 5G, Figure 7D).

      b. The authors use multiple animal models including cell specific KO to conclude that stellate cell MCT-1 inhibition decreases COL-1. However, the mechanisms behind this reduced expression of COL-1 are not discussed or explored, making it descriptive.

      RESPONSE: We agree that the mechanisms involved are not fully defined but have added new data (Figure 8, Supplement Figure 6) and text to discuss possibilities.

      c. Different types of diets are used in this study which could impact lactate availability. Choline deficiency diets are reported to cause weight loss, and importantly have none of the metabolic features of human NASH. Therefore, their utility is doubtful, especially for this study which proposes to investigate if metabolic dysregulation and substrate availability could be a tool for therapy.

      RESPONSE: Unfortunately, none of the rodent models used to study NASH completely replicate the condition in human patients, each having its own set of advantages and drawbacks. In line with the concern raised by reviewer #2, there has been a shift away from the use of severely detrimental methionine and choline-deficient diets in contemporary NASH research. Instead, diets that combine methionine and other amino acids with choline-deficient diets, in conjunction with high-fat diets, have become more popular. The diet we employed in our study consists of a high-fat diet combined with a choline-deficient diet. We believe that our findings, which are consistent and established across two distinct NASH pathogenesis models and genetic backgrounds, lend additional robustness to our results.

      d. Hepatocyte specific MCT-1 KO mice seem to have increased COL-1 production, despite no noticeable difference in hepatocyte steatosis. The reasons for this are not discussed. Fibrosis in NASH is thought to be from stellate cell activation secondary to signals from hepatocellular damage. There is no evidence that there was a difference in either of these parameters in the mouse models used.

      RESPONSE: While lipotoxicity-driven liver damage remains a central aspect of NASH pathogenesis, the traditional two-hit theory has become less tractable, giving way to the multi-hit theory in the NASH field. The current debate revolves around whether steatosis is a decisive factor and requirement for NASH fibrogenesis. Our previous publication (Yenilmez et al., 2022, Mol Ther) demonstrated that nearly complete resolution of steatosis did not prevent other NASH features like inflammation and fibrosis, indicating the existence of multiple factors beyond steatosis in NASH pathogenesis. We believe that steatosis and fibrosis influence each other but can also develop independently.

      e. The authors report that serum lactate levels did not rise after MCT-1 silencing, but the reasons behind this are unclear. There is insufficient data about lactate production and utilization in this model, which would be useful to interpret data regarding steatosis and fibrosis development. For example, does the MCT-1 KO prevent hepatocyte and stellate cell net import or export of lactate? What is the downstream metabolic consequence in terms of pyruvate, acetylCoA and the NAD/NADH levels. Does the KO have downstream effects on mitochondrial TCA cycling?

      RESPONSE: Due to both biological and technical challenges (which are described in the new draft), conducting a comprehensive metabolomics study comparing hepatocyte MCT1 KO to hepatic stellate cell MCT1 KO was not feasible. It is important to note that MCT1 can also transport other substrates that are often overlooked, including pyruvate, short-chain fatty acids, and ketone bodies. Also, in addition to MCT1, there are at least two other functional isoforms of MCT: MCT2 and MCT4. Given these complications, a comprehensive metabolomic analysis would be extremely complicated to conduct and difficult to interpret. Nevertheless, some insights are gained from a study involving MCT1 chaperone protein Basigin/CD147 knockout (KO) mice in a high-fat diet-induced hepatic steatosis model. Basigin acts as an auxiliary protein for MCT1, and its absence leads to improper localization and stabilization of MCT1, effectively simulating a state of MCT1 deficiency. In this context, hepatic lactate levels were reduced by half, and other metabolites such as pyruvate, citrate, α-ketoglutarate, fumarate, and malate were significantly decreased. While we must exercise caution when extrapolating these findings to our MCT1 study, they suggest that multiple metabolites, particularly pyruvate, may play a crucial role in the context of MCT1 deficiency.

      f. MCT-1 protein expression is measured only in the in vitro assay. Similar quantitation through western blot is not shown in the animal models.

      RESPONSE: We monitored MCT1 protein expression with either Western blot (Fig 2D, 2E (in vitro)) or immune-histology (Fig 4B, 4C (in vivo, ob/ob + GAN diet NASH model), Sup Fig 5F, 5G (in vivo, MCT1 f/f + CDHFD model)).

      Reviewer #3 (Public Review):

      A major finding of this work is that loss of monocarboxylate transporter 1 (MCT1), specifically in stellate cells, can decrease fibrosis in the liver. However, the underlying mechanism whereby MCT1 influences stellate cells is not addressed. It is unclear if upstream/downstream metabolic flux within different cell types leads to fibrotic outcomes. Ultimately, the paper opens more questions than it answers: why does decreasing MCT1 expression in hepatocytes exacerbate disease, while silencing MCT1 in fibroblasts seems to alleviate collagen deposition? Mechanistic studies in isolated hepatocytes and stellate cells could enhance the work further to show the disparate pathways that mediate these opposing effects. The work highlights the complexity of cellular behavior and metabolism within a disease environment but does little to mechanistically explain it.

      RESPONSE: Described above to Reviewer #2

      The observations presented are compelling and rigorous, but their impact is limited by the nearly complete lack of mechanistic insight presented in the manuscript. As also mentioned elsewhere, it is important to know whether lactate import or export (or the transport of another molecule-like ketone bodies, for example) is the decisive role of MCT1 for this phenotype. Beyond that, it would be interesting, albeit more difficult, to determine how that metabolic change leads to these fibrotic effects.

      RESPONSE: Described above to Reviewer #2

      Kupffer cells are initially analyzed and targeted. These cells may play a major role in fibrotic response. It will be interesting to determine the effects of lactate metabolism in other cells within the microenvironment, like Kupffer cells, to gain a complete understanding of how metabolism is altered during fibrotic change.

      RESPONSE: To address the potential involvement of inflammatory cells, we added new data to the manuscript (Supplement Figure 4). Given the distinct hepatic cellular distribution of Chol-MCT1-siRNA and GN-MCT1-siRNA, the opposite fibrogenic phenotype observed may be attributed to MCT1’s role in non-hepatocyte cell types such as the inflammatory Kupffer cells and the fibrogenic hepatic stellate cells. To determine which hepatic cell type drives the opposite fibrotic phenotypes, we first hypothesized that GN-MCT1-siRNA activates M2 pro-fibrogenic macrophages more than Chol-MCT1-siRNA does. The representative M1/M2 macrophage polarization gene markers were monitored in Kupffer cells. However, GN-MCT1-siRNA treatment caused comparable M1/M2 macrophage activation levels to Chol-MCT1-siRNA treatment (Supplement Figure 4A, 4B). These data suggest that the opposite fibrotic phenotypes caused by the different siRNA constructs are not due to M1/M2 macrophage polarization.

      The timing of MCT1 depletion raises concern, as this is a largely prophylactic experiment, and it remains unclear if altering MCT1 would aid in the regression of established fibrosis. Given the proposal for translation to clinical practice, this will be an important question to answer.

      RESPONSE: Agree these are important experiments for future evaluation.

      Reviewer #1 (Recommendations For The Authors):

      As above, in general, the conclusions match the data presented. The one exception is the authors discussion point that these data show the importance of lactate flux in fibrosis. As MCT1 has other substrates, it does not seem this is definitively due to lactate flux. It would be helpful to have additional experiments to clarify mechanism by which loss of hepatocyte MCT1 leads to increased fibrosis, while loss of HSC MCT1 reverses this finding. This may aid in concluding that altered fibrosis is in fact due to lactate flux in these cell types.

      RESPONSE: Described above to Reviewer #2

      In addition, it is unclear why the authors switched NASH models for the two tools generated (GAN diet for siRNA, CDHFD for AAV). Similarly, methodology to assess fibrosis switched between these two experiments - i.e. Sirius Red staining for siRNA-treated GAN diet-fed mice vs. Trichrome staining for AAV-transduced CDHFD-fed mice. These changes make it difficult to perform cross-comparisons of the data, to explain (for example), why GN-siRNA to Mct1 reduced body weight but AAV8-TBG-Cre did not. Similarly, GN-siRNA increased liver Col1a1 protein but AAV8-TBG-Cre did not. These differences could be explained by model system, or tool efficacy/off-target effects.

      RESPONSE: We agree that different model systems can explain difference in results, but there is also an advantage of using different models and various methodologies as preclinical tests of consistency of data on NASH under different conditions. There are no perfect mouse models for human NASH.

      • Phenotyping is also incomplete for the latter experiment, in particular amount of liver lipid content –

      RESPONSE: We estimated lipid content by H&E (Fig 6E, F). In some experiments, we focused mostly on COL1 protein expression, as this rather than mRNA is the functional aspect of fibrosis.

      Reviewer #2 (Recommendations For The Authors):

      This study could benefit from standardization of the types of diet used across all animal models and a more comprehensive focus on the metabolic/substrate availability and utilization aspects of NAFLD and NASH affected in the mouse models with MCT-1 dependent lactate transport deficiency. Since hepatic fibrogenesis in NASH is impacted by signals following hepatocyte damage, the extent of cell death in these models could also be better characterized.

      RESPONSE: Our ALT data provide indirect insight into hepatocyte damage. Our histology images did not reveal significant changes in cell morphology or integrity, and there were no notable changes in caspase protein levels.

      Other comments:

      In Fig 4G, there is an increase in the number of lipid droplets with Chol-MCT-1 siRNA compared to GN-MCT1-siRNA, suggesting that the stellate cell component might be responsible for this finding. The possible reasons for this are not discussed.

      RESPONSE: The effects in Fig 4G were exceedingly small and there is no difference in total TG in these experiments, so it is hard to interpret these data and provide logical explanations.

      In Fig 5A. A western Blot for aSMA and COL 1 is shown but the sample labeling is unclear i.e, do the lanes belong to different mice of the same condition? HFD mice vs Ctr mice?

      RESPONSE: Both groups of ob/ob mice were fed a GAN diet. The graph in Fig 5 is a direct comparison between NTC-siRNA and MCT1-siRNA. To enhance clarity, this is now indicated in the figure legends. The data in Fig 5 are a continuation of the data presented in Fig 4.

      In Fig 5E, COL1 densitometry data should also be provided for non-silenced mice on HFD and Chow diet for appropriate comparison

      RESPONSE: Both groups of ob/ob mice were fed a GAN diet. The graph in Fig 5 represents a comparison between NTC-siRNA and MCT1-siRNA. It's important to note that, typically, ob/ob mice fed either a chow diet or a high-fat diet do not exhibit fibrogenic phenotypes within this time frame (3 weeks of dietary intervention).

      There are many mis-statements throughout the text. Page 6 - "MCT1 silencing significantly inhibited Tgf1β-stimulated ACTA2 mRNA expression as well as collagen 1 protein production" but it is not stated that COL1A1 mRNA is unchanged in Fig 1C.

      RESPONSE: We observed no change in COL1A1 mRNA levels (Fig 1C), so we focused on collagen 1 protein production (Fig 1B) on page 6. Given the consistent trend observed in Chol-MCT1-siRNA (Fig 5C), we proposed the possibility of MCT1's influence on collagen translation or protein turnover on page 11.

      Page 7- ".......our Chol-MCT1-siRNA does not require transfection reagents as it is fully chemically modified". What does fully chemically modified mean, and what does this mean in terms of transfection efficiency?

      RESPONSE: One of the primary challenges in utilizing RNAi as a therapeutic approach has been achieving effective in vivo delivery, particularly with respect to stability and longevity against systemic nucleases. Recent developments in siRNA duplex chemical modification strategies, such as 2'-Fluoro and 2'-O-Methyl ribose substitutions, as well as phosphorothioate backbone replacements, have addressed these challenges (please see Figure 3). In our current study, we employed 'chemically fully modified' siRNA, featuring several key modifications: (1) every single ribose is chemically modified to 2'-F or 2'-OMe ribose, (2) phosphorothioate backbone replacement, (3) modification of the 5'-end of the antisense strand to (E)-Vinyl-phosphonate, and (4) linkers at the 3'-end of the sense strand such as Cholesterol or Tri-N-Acetyl-galactosamine. These chemical enhancements significantly improve transfection efficiency, longevity, and selectivity, setting it apart from traditional siRNA lacking such chemical modifications. A prior study from the Khvorova lab has demonstrated substantial efficiency differences between partially and fully modified siRNA in vivo.

      Page 7- the results presented for Fig 2 ignore Fig. 2C; if this is important it needs to be described; if not, please delete.

      RESPONSE: The dose-response potency results, crucial for identifying the most potent Chol-MCT1-siRNA compound, are depicted in Figure 2C. The wording "(Figure 2C)" has been inserted in the sentence as follows. “The silencing effect on Mct1 mRNA was monitored after 72 hours (Figure 2B). Several compounds elicited a silencing effect greater than 80% compared to the NTC-siRNA. The two most potent Chol-MCT1-siRNA, Chol-MCT1-2060 (IC50: 59.6nM, KD%: 87.2), and Chol-MCT1-3160 (IC50: 32.4nM, KD%: 87.7) (Figure 2C) were evaluated for their inhibitory effect on MCT1 protein levels (Figure 2D, 2E). Based on its IC50 value and silencing potency, Chol-MCT1-3160 construct was chosen for further studies in vivo (Table 2).”

      Supplement Fig 1A-F should be analyzed by multiple comparisons not by paired t-tests.

      RESPONSE: We performed t-tests for every comparison between two groups. However, for Sup Fig 1A-F, which involved a comparison among three different groups, we applied one-way ANOVA.
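
      For readers less familiar with the distinction, the following is a minimal sketch of how a one-way ANOVA F statistic is computed for three groups. The function name and the numbers are hypothetical illustrations, not data from the study; in practice a statistics package (for example `scipy.stats.f_oneway`) would also return the p-value.

```python
def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more groups of values."""
    k = len(groups)                                # number of groups
    n_total = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)              # df_between = k - 1
    ms_within = ss_within / (n_total - k)          # df_within = N - k
    return ms_between / ms_within


# Hypothetical readouts for three treatment groups (arbitrary units).
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
group_c = [3.0, 4.0, 5.0]
print(one_way_anova_f(group_a, group_b, group_c))
```

      A single F test across all three groups avoids the inflated false-positive rate that comes from running several pairwise t-tests on the same data.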

      The x-axes in supplement Fig 2A and B are not labeled, and I assume they are in weeks. The Fig 2B x-axis numbers are also mislabeled and should be 0-3, not 10-13.

      RESPONSE: The x-axis is now appropriately labeled.

      Page 10 - the description of supplement Fig 4A is not accurate. Srebf1 mRNA is unchanged by the GN-MCT1-siRNA treatment and Mlxipl mRNA is unchanged by Chol-MCT1-siRNA treatment. Is this total Mlxipl mRNA or can you distinguish between the alpha and beta variants.

      RESPONSE: We adhered to NCBI nomenclature, where 'SREBP1' and 'ChREBP' represent proteins, not mRNA. The Mlxipl mRNA we tested pertains to total Mlxipl mRNA. Original draft shown below.

      “To investigate the underlying mechanism by which lipid droplet morphological dynamics change, we monitored the effect of hepatic MCT1 depletion on DNL-related gene expression. Both GN-MCT1-siRNA and Chol-MCT1-siRNA strongly decreased the mRNA and protein levels related to representative DNL genes (Supplement Figure 4A-4D). Intriguingly, both modes of hepatic MCT1 depletion also inhibited expression of the upstream regulatory transcription factors SREBP1 and ChREBP.”

      There are no molecular weight markers in supplement Fig 4C and D. Is the Srebp1c blot for the nuclear or precursor form?

      RESPONSE: The Srebp1c blot presented represents the precursor form. I have edited the figure legend accordingly. It's worth noting that the cleaved form of Srebp1c either exhibited significantly lower expression compared to its precursor form or displayed comparable expression between the control group and the MCT1 depletion group.

      Changes in mRNA and protein do not always reflect changes in activity (allosteric regulation). If you want to draw any conclusions about de novo lipogenesis you need to directly measure fatty acid synthesis rates from a carbohydrate precursor.

      RESPONSE: We completely agree. Therefore, in the current study, we emphasized two key points: (1) hepatic MCT1 depletion affects the expression levels of representative DNL genes, and (2) however, this regulation was insufficient to resolve the steatosis phenotypes in our NASH model. We have added the text “while recognizing that the decreased expression of DNL genes does not necessarily indicate inhibited fatty acid synthesis rate” on page 15.

      Reviewer #3 (Recommendations For The Authors):

      Figure 1 - Are there changes to fibroblast phenotype with TGF-beta stimulation and are these changes reversed with MCT1 siRNA-mediated silencing, or is this purely an expression phenomenon?

      RESPONSE: This study was designed to assess the preventative effect of MCT1 silencing on Tgf1β-induced fibrosis, rather than a reversal study. As detailed in the methods section, LX2 cells were initially cultured in DMEM/high glucose media with 2% FBS. The following day, we transfected the cells with either NTC-siRNA or MCT1-siRNA (IDT, cat 308915476) using Lipofectamine RNAi Max (ThermoFisher, cat 13778075) for 6 hours in serum-reduced Opti-MEM media (ThermoFisher, cat 31985062). Subsequently, the cells were maintained in serum-starved media, with or without 10ng/ml of recombinant human Tgf1β (R&D Systems, cat 240-B/CF), for 48 hours before harvesting.

      Is lactate import/export itself responsible for this phenotype? It is presumed that MCT1 depletion alters import/export of lactate and subsequently modulates this phenotype, but this is never shown experimentally. Does lactate accumulate in these cells or in the medium in culture? The foundation of the paper rests on this hypothesis, so we believe that this is critical to establish. This is particularly relevant as MCT1 has been proposed to function primarily as a lactate importer, so the availability of medium lactate could be easily modulated to determine whether that mimics MCT1 loss.

      RESPONSE: To address the underlying mechanism of MCT1/Lactate in stellate cells, we added a new figure to the manuscript (Figure 8). We had previously conducted an experiment to determine whether MCT1 depletion in LX2 cells in vitro influences extracellular lactate concentrations in DMEM/high glucose (25mM glucose) media supplemented with 1mM sodium pyruvate but without sodium lactate. Interestingly, we found no significant difference in extracellular glucose and lactate concentrations, which remained at 25mM and 5mM, respectively. These concentrations were comparable between groups, regardless of MCT1 loss. Additionally, we investigated the effects of MCT1 silencing in the presence of potent fibrogenic inducer TGF-β1. Intriguingly, MCT1 depletion effectively prevented TGF-β1-induced collagen production, irrespective of lactate (+/- pyruvate) supply in the media. LX2 cells with MCT1 depletion exhibited reduced collagen 1 production when lactate was solely generated by endogenous glycolysis (Figure 8F) and when exogenous lactate was supplied (Figure 8G).

      Figure 2 - It is compelling that the Chol-MCT1-siRNA compounds are effective at targeting MCT1. However, is it clear how specific the siRNA target is? Are other MCT genes affected as well (if the siRNAs target areas of homology, for example)? Given that this siRNA strategy is used going forward and proposed as a therapeutic, it would be important to discuss and perhaps characterize off-target effects. A simple BLAST search for homology for the chosen siRNAs could help answer this question.

      RESPONSE:

      1) We designed the siRNA to specifically avoid any potential off-target effects on MCT1's 14 isoforms, and this approach aligns with the results obtained from the NCBI-BLAST analysis.

      2) While there are 14 isoforms of MCTs, only the first four are functional. To assess the off-target effect of Chol-MCT1-siRNA on MCT2 and MCT4 (MCT3 was excluded due to its limited expression in retinal pigment epithelium), we conducted in vivo experiments in ob/ob mice, which demonstrated a highly selective MCT1 silencing effect. We have also included MCT1, MCT2, and MCT4 rt-qPCR data in the manuscript (Supplement Figure 2A, 2B).

      3) We plan to further optimize and validate the human MCT1-targeting siRNA sequence for use in humanized mouse studies. It's important to note that the MCT1-siRNA used in this study was designed for mice.

      Supplemental Figure 1 - brain would be one other highly metabolic tissue wherein it would be important to show lack of activity/accumulation.

      RESPONSE: Undoubtedly, the brain is one of the most metabolically active tissues, playing a pivotal role in regulating signaling pathways and metabolism in other tissues. However, it poses a significant challenge in terms of targeting due to the presence of the blood-brain barrier (BBB). Overcoming BBB penetration remains one of the foremost challenges in the field of therapeutic siRNA delivery. For many therapeutic oligonucleotides, including Cholesterol-conjugated siRNAs, systemic administration alone is normally insufficient to achieve BBB penetration. Direct local injection or transient disruption of the BBB is normally required.

      Figure 4 - The image shown for chol-MCT1-siRNA seems to show variation in lipid droplet size. Is this just this single image? The authors quantify smaller lipid droplets in this group, so the image may not be representative as there are many large droplets. Ultimately, additional mechanisms as to how alterations in lactate metabolism could mediate this phenotype are missing. This hypothesis also rests upon the assumption that MCT1 is modulating lactate, which is not shown experimentally, as discussed above.

      RESPONSE: We changed the representative images (Fig 4B). We agree this aspect of the study is not resolved, and we have related text in the manuscript on this point: “neither GNMCT1-siRNA nor Chol-MCT1-siRNA decreased total hepatic TG levels (Figure 4H), although quantitative analysis of H&E images showed a small decrease in mean lipid droplet size and increased number of lipid droplets upon MCT1 silencing (Figure 4F, 4G). These data suggest the possibility that hepatic MCT1 depletion either 1) inhibits formation or fusion of lipid droplets, or 2) enhances lipolysis to diminish lipid droplet size.”

      Figure 5 provides evidence that Chol-MCT1-siRNA expression decreases fibrosis, but this is attributed to the effects on stellate cells, while GN-MCT1-siRNA and subsequent MCT1 silencing in hepatocytes have the opposite effect. The cell population that is not discussed, however, is the Kupffer cell. Could MCT1 silencing in this cell population be mediating part of the phenotype observed? How does MCT1 silencing affect Kupffer cell phenotype and activity?

      This extends into Figure 6 where Kupffer cells are not given consideration in targeted experiments.

      RESPONSE: Addressed above in our response to Reviewer #3.

      Figures 6 and 7 use a different model to show that stellate cell depletion of MCT1, specifically, decreases collagen 1 protein levels in NASH, which reinforces the authors' claims. Given the cell specificity of this experiment, it is more compelling data. It would be nice to show that Kupffer cell depletion of MCT1 does not have any effect (or perhaps show that it does).

      RESPONSE: We agree, but Kupffer cell-selective depletion is not possible with this siRNA technology. Please see the response above for our most recent attempt to address this question.

      Figure 7 shows that even with decreased collagen deposition, there is no effect on liver stiffness or chronic liver injury as measured by ALT. This may suggest that the decreased level of fibrosis is either not significant to overall clinical outcome or that there are other fibroinflammatory mechanisms compensating for lack of COL1 deposition. Is there increased reticulin fibrosis when MCT1 is knocked down? This could be assessed with IHC or monitoring type 3 collagen (COL3A1).

      RESPONSE: Reticulin fibrosis results from the excessive deposition of reticular fibers, primarily composed of type 3 collagen. However, based on our observation of trichrome staining in whole liver histology data (Fig 7D-E), which exhibited nearly identical trends to collagen type 1 expression (Fig 7A-C), it seems unlikely that type 3 collagen compensated for the decrease in type 1 collagen protein expression upon hepatic stellate cell MCT1 KO. We plan to perform detailed analysis of a more comprehensive list of ECM proteins including type 3 collagen in our humanized mouse model with engrafted human liver cells in future experiments.

      Additional considerations:

      It may be useful to know if inhibition of fibrosis affects survival/progression in these NASH models over a longer timeframe, although this may understandably be beyond the scope of the current work. The timing of MCT1 depletion is prophylactic and given the proposal to translate this research, it would be important to determine whether MCT1 inhibition reversed fibrosis, and if so, by what metabolic mechanism?

      RESPONSE: We have observed that extending the duration of the NASH model increases the likelihood of hepatocarcinoma development. Including survival, disease progression, and reversal of fibrosis as endpoints will be important in future experiments.

      Summary of new Figures and Figures modified:

      • Fig 1B: added "and" (significance) between the first and the third group, and the second and the last group.

      • Fig 4B: replaced images with more representative ones as the mean lipid size was questioned by the reviewer.

      • Fig 7D: made the images bigger (original images cropped and enlarged → 5X)

      • Fig 8: newly created to explain the underlying pathway by which lactate and MCT1 regulate collagen production. Please see the Results section.

      • Sup Fig 2A, B: newly added to show our compounds’ selective silencing effect.

      • Sup Fig 2C-D: added missing x-axis (moved from previous Figure 2A, 2B).

      • Sup Fig 2E-F: moved from Sup Fig 3 so as not to have too many supplementary figures.

      • Sup Fig 3C-D: showed both precursor and cleaved form of SREBP1 bands as requested (moved from previous sup Figure 4)

      • Sup Fig 4: newly created, because the effect on Kupffer cells and other inflammatory cells was raised several times.

      • Sup Fig 6: newly created to explain the potential underlying mechanism of MCT1 depletion on collagen production.

      • Sup Fig 7: moved from previous sup Fig 6.

      • Sup Fig 8: moved from previous sup Fig 7.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors have previously employed micrococcal nuclease tethered to various Mcm subunits to cut the DNA to which the Mcm2-7 double hexamers (DH) bind. Using this assay, they found that Mcm2-7 DH are located on many more sites in the S. cerevisiae genome than previously shown. They then demonstrated that these sites have characteristics consistent with origins of DNA replication, including the presence of ARS consensus sequences, the location of very inefficient sites of initiation of DNA replication in vivo, and for the most part are free of nucleosomes. They contain a G-C skew and they locate to intergenic regions of the genome. The authors suggest, consistent with published single molecule results, that there are many more potential origins in the S. cerevisiae genome than previously annotated, but also conclude that many of the newly discovered Mcm2-7 DH are very infrequently used as active origins of DNA replication.

      The results are convincing and are consistent with prior observations. The analysis of the origin associated features is informative.

      Specific Comments:

      1. Page 8. The addition of an estimate of the most active origins using Southern blotting is fine for highly active origins, but how was Southern blotting used to calculate that 1-2% of cells in the eighth cohort have an Mcm complex loaded?

      We used a combination of Southern blotting and qPCR to measure licensing at the most active origins and then used our abundance curve to extrapolate these values to the less abundant cohorts. We expand on this point below, and we have changed the text to clarify this issue.

      Reviewer #3 (Public Review):

      By mapping the sites of the Mcm2-7 replicative helicase loading across the budding yeast genome using high-resolution chromatin endogenous cleavage or ChEC, Bedalov and colleagues find that these markers for origins of DNA replication are much more broadly distributed than previously appreciated. Interestingly, this is consistent with early reconstituted biochemical studies that showed that the ACS was not essential for helicase loading in vitro (e.g. Remus et al., 2009, PMID: 19896182). To accomplish this, they combined the results of 12 independent assays to gain exceptionally deep coverage of Mcm2-7 binding sites. By comparing these sites to previous studies mapping ssDNA generated during replication initiation, they provide evidence that at least a fraction of the 1600 most robustly Mcm2-7-bound sequences act as origins. A weakness of the paper is that the group-based (as opposed to analyzing individual Mcm2-7 binding sites) nature of the analysis prevents the authors from concluding that all of the 1,600 sites mentioned in the title act as origins. The authors also show that the locations of Mcm2-7 after loading are highly similar in the top 500 binding sites, although the mobile nature of loaded Mcm2-7 double hexamers prevents any conclusions about the location of initial loading. Interestingly, by comparing subsets of the Mcm2-7 binding sites, they find that there is a propensity of at least a subset of these sites to be nucleosome depleted, to overlap with at least a partial match to the ACS sequence (found at all of the most well-characterized budding yeast origins), and to show a GC-skew centered around the site of Mcm loading. Each of these characteristics is related to previously characterized S. cerevisiae origins of replication.

      Overall, this manuscript greatly broadens the number of sites that are capable of loading Mcm2-7 in budding yeast cells and shows that a subset of these additional sites act as replication origins. Although these studies show that the sequence specificity of S. cerevisiae replication origins still sets it apart from metazoan origins, the ability to license and initiate replication from sites with increasing sequence divergence suggests a previously unappreciated versatility.

      Specific points:

      1. The authors need to come up with a consistent name for loaded Mcms at an origin. In the manuscript they variously use 'MCM'(page 3), 'Mcm complexes' (page 4), 'MCM double hexamer' (page 6), and 'double-helicase' (page 8) to describe the Mcm2-7 complexes detected in their ChEC experiments. They should pick one name (Mcm2-7 double hexamer or MCM double hexamer would be the most accurate and clear) and stick with it throughout the manuscript.

      We appreciate the criticism and agree that consistency is important for clarity, thus we tried using the term "Mcm2-7 double hexamer" in every instance in which we refer to Mcm loaded at an origin. However, upon reading the resulting manuscript, we felt that these changes hurt readability more than they helped with clarity, so we left the manuscript in its original form.

      1. The authors state that "It is notable that, when Mcm is present, it is present predominantly as a single double-hexamer (right panel of Figure 3A), and that this remains true across the entire range of abundance shown in Figure 3A." This statement would be improved by prefacing it with "Based on the size of the protected regions" or some other clarifying statement that lets the reader know what they should be looking for in the data in 3A.


      We thank the reviewer for the helpful suggestion. We have added the underlined words to the text to clarify this point.

      It is notable that, when Mcm is present, it is present predominantly as a single double-hexamer (based on the size of the protected region in the right panel of Figure 3A), and that this remains true across the entire range of abundance shown in Figure 3A.

      1. The revised statements that "We have previously used Southern blotting to demonstrate that approximately 90% of the DNA at one of the most active known origins (ARS1103) is cut by Mcm-MNase (Foss et al., 2021), and to thereby infer that 90% of cells have a double-helicase loaded at this origin. Using this as a benchmark, we estimate that ~1-2% cells have an Mcm complex loaded at the Mcm binding sites in the eighth cohort (ranks 1401-1600)." partially clarify how the authors came to the 1-2% number; however, the calculation is still unclear. Based on Figure 1A, there are at least three logs (1,000-fold) difference in the number of CMBSs between the best origins (which is what they state the 90% comes from) to anywhere close to the 1400-1600 rank. Seems like the number should be at best 0.1% and probably less. Either way, the authors need to explain this calculation either in the text or in the text. This sort of number tends to get thrown around later and without a clear explanation readers cannot evaluate its credibility.

      We apologize for insufficiently clarifying how we arrived at our estimate of licensing. We believe that we have now remedied this, both by incorporating more measurements of licensing to improve our accuracy and by expanding the text to make our calculation unambiguous. We have added a supplemental figure showing the linear regression, based on 7 qPCR-based measurements of licensing, that we used to determine the median level of licensing of the first cohort of 200, and the altered text in the main text reads as follows:

      We have previously used Southern blotting to demonstrate that approximately 90% of the DNA at one of the most active known origins (ARS1103) is cut by Mcm-MNase (Foss et al. 2021), and to thereby infer that 90% of cells have a double-helicase loaded at this origin. Combining this measurement with 6 additional measurements of licensing in cohort 1, we used linear regression (r2=0.7) to infer a median value of 69% for cohort 1. Because the median abundance in the 8th cohort is 1.5% of that in the first cohort, we estimate that CMBSs in the 8th cohort are typically licensed in 1% of cells in the population (69% x 0.015 = 1.0%).
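
      The extrapolation quoted above can be written out as a short calculation. The following is a minimal sketch, assuming (as the quoted text does) that licensing scales in proportion to median abundance; the function and variable names are hypothetical, and only the two numeric inputs (69% and 1.5%) come from the response.

```python
# Illustrative sketch only: the function and variable names are assumptions;
# the two numeric inputs are the values quoted in the response above.

def extrapolate_licensing(reference_licensing: float, relative_abundance: float) -> float:
    """Scale a reference licensing fraction by a relative abundance.

    Assumes the fraction of licensed cells is proportional to the
    measured abundance, which is the premise of the quoted estimate.
    """
    if not 0.0 <= reference_licensing <= 1.0:
        raise ValueError("reference_licensing must be a fraction between 0 and 1")
    if relative_abundance < 0.0:
        raise ValueError("relative_abundance must be non-negative")
    return reference_licensing * relative_abundance


# Median licensing of cohort 1, inferred from the linear regression (69%).
cohort1_median_licensing = 0.69
# Median abundance in cohort 8 relative to cohort 1 (1.5%).
cohort8_relative_abundance = 0.015

cohort8_licensing = extrapolate_licensing(
    cohort1_median_licensing, cohort8_relative_abundance
)
print(f"Estimated licensing in cohort 8: {cohort8_licensing:.1%}")
```

      With these inputs the product is 0.01035, which rounds to the 1.0% stated in the revised text.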

      1. The authors make the point in the introduction and discussion that recent single-molecule studies of replication origins indicate that as many as 20% of the origins identified are outside of known origins. This is very interesting but there seems to be a missed opportunity of comparing the location of these origins with the CMBSs. It would improve the manuscript to include some sort of comparison rather than using only the much older and less accurate ssDNA analysis.

      Unfortunately, the coverage and resolution of nanopore-based single-molecule methods preclude such an analysis.

      1. The authors state at the end of the first paragraph on page 6 that the ChEC data is "very reproducible", which does seem to be the case, but it is a little confusing for the knowledgeable reader since one would expect quite different results for an HU-arrested strain versus an asynchronous or G1-arrested strain. This is hidden in the analysis in Figure S1 since 13 experiments are compared against one in each plot; however, if one-by-one comparisons were done there would certainly be substantial differences (or if there are not, there is a problem with the data - e.g. HU-arrested cells should lack licensing at early firing origins).

      It may appear counterintuitive that one could obtain high r2 values when comparing G1 and HU-arrested samples. However, HU arrest was performed by transferring log phase cultures to 200 mM HU and harvesting cells after just 50 minutes. In this situation, most cells will be in G1 or very early S phase. Presumably, increasing times of incubation in HU would cause r2 values to decline.

      1. On page 8 the authors state, "First, clear peaks of ssDNA extend down to the eighth cohort..." This seems to be stretching the data. There are clear peaks for the first five cohorts and then there is a notable change with any peak being much broader, extending over at least 10,000 bp. The authors should reconsider their statement here as it is not well supported by the data.

      We have softened our language to the following: First, peaks of ssDNA signal, as judged by higher signal at the midpoints than the edges, extend down to the eighth cohort (brown line), which corresponds to CMBSs ranked 1401-1600.

      1. There is one last missing reference. Wherever Eaton et al, 2010 is referenced Berbenetz, et al, 2010 (full ref below) should also be referenced as they come to very similar conclusions.

      Berbenetz, N. M., Nislow, C. & Brown, G. W. Diversity of eukaryotic DNA replication origins revealed by genome-wide analysis of chromatin structure. PLoS Genet 6, (2010).

      We have added this reference at all 4 instances in which we reference Eaton et al., 2010.

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      There are missing references in several places:

      All references are included, and the references in point 3 have been split according to the reviewer's suggestion.

      1. "For example, 15 of the 56 genes that contained a high abundance site have been implicated in meiosis and sporulation and are not expressed during vegetative growth (~5 out of 56 expected from random sampling), consistent with previous observations (Mori and Shirahige, 2007)." Should include Blitzblau et al., 2012 (PMC3355065) which showed that Mcm2-7 loading was impacted by differences in meiotic and mitotic transcription.

      2. "In contrast to the low abundance sites, the most abundant 500 sites showed a preference for convergent over divergent transcription (left of vertical dotted line in Figure 4B), in agreement with a previous report (Li et al., 2014)." This preference was first pointed out in MacAlpine and Bell, 2005 (PMID: 15868424).

      3. "This sequence is recognized by the Origin Recognition Complex (Orc), a 6-protein complex that loads MCM (Broach et al., 1983; Deshpande and Newlon, 1992; Eaton et al., 2010; Kearsey, 1984; Newlon and Theis, 1993; Singh and Krishnamachari, 2016; Srienc et al., 1985)." This list should include a reference to Bell and Stillman, 1992 (PMID: 1579162), which first described ORC and showed that it recognized the ACS. It would also be more helpful to the reviewer to distinguish the references that identified the ACS from those concerning ORC binding to it.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      On behalf of all the authors, I'd like to thank you for your insightful comments and valuable suggestions, which fully reflect your high level of scientific thinking and point the direction of our research and help us and other future researchers in the field to more comprehensively study and interpret the toxic effects of imidacloprid on honey bee larvae and its potential mechanisms, as well as the mechanisms of larval resistance and adaptations to imidacloprid. We have addressed each of the questions and revised the manuscript point-by-point in response to your comments. Below are detailed point-by-point responses to each question.

      Public Review:

      This study provides evidence of the ability of sublethal imidacloprid doses to affect growth and development of honeybee larvae. While checking the effect of doses that do not impact survival or food intake, the authors found changes in the expression of genes related to energy metabolism, antioxidant response, and metabolism of xenobiotics. The authors also identified cell death in the alimentary canal, and disturbances in levels of ROS markers, molting hormones, weight and growth ratio. The study's strengths come from exploring different aspects and impacts of imidacloprid exposure on honeybee juvenile stages, and for that it demonstrates potential for assessing the risks posed by pesticides. The study's weaknesses come from the lack of in-depth investigation and an incomplete methodological design. For instance, many of the study conclusions are based on RT-qPCR, which shows only a partial snapshot of gene expression and was performed at a single time point using whole larvae. There is no understanding of how different organs/tissues might respond to exposure and how they change over time. That makes it difficult to understand the mechanisms of damage caused by the pesticide in the situation studied here. There is no investigation of what happens after pupation. The authors show that the doses tested have no impact on survival, food consumption and time to pupation, and the growth index drops from ~0.96 to ~0.92 in exposed larvae, raising the question of its biological significance. The origin of ROS is not investigated, nor do the authors investigate whether the larvae recover from the damage observed in the gut after pupation. That is important as it could affect the adult workers' health. One of the study's central claims is that the reduced growth index is due to the extra energy used to overexpress P450s and antioxidant enzymes, but that is based on RT-qPCR only. Other options are not well explored, and whether the gut damage could be causing nutrient absorption problems, or the oxidative stress could be impairing mitochondrial energy production, is not investigated. These alternatives may also affect the growth index. The authors also state that honeybee larvae have 7 instars, which is incorrect, as Apis mellifera has 5 larval instars. It is not clear from the methods which precise stage of larval development was used for gut preparations. That information is important because prior to pupation larvae defecate and undergo shedding of the gut lining. That could profoundly affect some of the results in case gut preparations for microscopy were made close to this stage. A more in-depth investigation and a more complete methodological design that investigates the mechanisms of damage and whether the exposures tested could affect adult bees may demonstrate the damage of low insecticide doses to a vital pollinator insect species.

      Recommendations for the authors:

      This study presents a useful investigation on changes in gene expression by real time PCR and some of the physiological consequences of sublethal exposures to the neonicotinoid insecticide imidacloprid in honeybee larvae. It offers preliminary evidence of imidacloprid impacts on the development of bee larvae by interfering with molting and metabolism. Whereas the study provides evidence that small doses of imidacloprid affect larval growth rate, there is no investigation on whether that could affect the overall colony health, and some of the results open the possibility that the larvae may overcome some of the impacts of the exposure. As the authors state, the doses tested show no impact on larvae survival, food consumption or time to pupation. The investigation and methodological design lack the depth needed to explain the findings and provide incomplete evidence to support the authors' conclusions. The study would benefit from a more thorough mechanistic characterization to better sustain the findings and demonstrate their biological relevance.

      Response: I would like to express, on behalf of all the authors, our sincere appreciation for your insightful comments and suggestions, which significantly enhanced the quality of the manuscript. Your incisive insights point the way for future research in the field of bee biology on the mechanisms underlying imidacloprid-induced delays in larval development.

      In this study, we investigated the effects of imidacloprid on honey bee larval development, including macro and micro changes and possible causes. This is the first of its kind in the field of honeybee biology research. However, we found that the underlying mechanism is extremely complex. The effects of toxic substances on animals and their interactions with larval development are complex and far-reaching. They include oxidative stress and damage; disruption of nutrient metabolic homeostasis; inhibition of detoxification and immunity; adverse effects on the nervous, circulatory, and digestive systems; inflammation, disease, and even organ failure; and subsequent effects on physiological activities such as development, reproduction, and behavior, and even death. These toxic effects interact in complex ways with the development of young animals, with some effects directly or indirectly affecting development while others do not.

      Addressing this complex mechanistic issue based solely on the results of this study is a formidable challenge, which leads to some limitations of our study as pointed out by the reviewers. Although our study is not comprehensive enough in terms of mechanistic analysis and does not fully elucidate the mechanism, we believe it is an important and valuable first step in this area.

      In the future, we will follow the reviewers' suggestions and redesign the experiments to focus on the issues they raised. These include examining the effects of larval developmental delay on adult and colony health, investigating the post-pupal situation, identifying the source of ROS, and determining whether the observed larval gut damage recovers after pupation.

      In accordance with the reviewers' comments and suggestions, we have revised the manuscript to improve its rigor and scientific quality. We sincerely ask the reviewers to understand and accept this modification from us!

      Next is our response to each of the questions and valuable suggestions provided by reviewers:

      Recommendations For The Authors:

      1. The authors found a reduction in growth index and body mass, but document no impact on survival, food consumption or time to pupariation. How much exactly is the reduction in growth index? It seems to be from ~0.96 to ~0.93. Is this biologically relevant? Would that be enough to impact the colony health?

      Response: Thank you for your comments. In this study, we observed a gradual decrease in larval growth index from day 4, which stabilized by day 6. At the 4th, 5th and 6th instars, the growth indices of the imidacloprid-treated groups were significantly lower than those of the control group, by an average of 1.35%, 4.49% and 2.76%, respectively (Figure 1, source data 8). Statistical analysis confirmed the significance of these differences. We have incorporated the above description into the red text on lines 148-152 of the Results section. Regarding the reviewer's question on colony health: imidacloprid delayed larval development and reduced the growth index and body weight without affecting survival, food consumption, or time to pupation. Because we do not currently have the technical capability to rear larvae to adulthood in laboratory incubators, we could not investigate how this delayed larval development affects adult and colony health. However, this is a very important scientific question for future colony health, and we will design experiments to address it in a follow-up study.

      1. The authors find that P450s can help in detoxifying mechanisms to mitigate imidacloprid impacts. That however is a well-known fact. What is new about this claim?

      Response: The point at which the ability to detoxify toxic substances is acquired during early development varies widely among animals. Although many studies have reported that the detoxification function of P450s helps mitigate the effects of imidacloprid in adult honey bees, there is no conclusive evidence as to whether or not honey bee larvae have acquired this ability at early stages of development. This ability is critical to the defense and health of honey bee larvae. Therefore, it is incumbent upon this study to clarify this issue, which is important in explaining the effects of imidacloprid on honey bee larvae.

      1. Some references are cited incorrectly. The first and last name are swapped, for instance Charles et al.

      Response: Thank you very much for pointing out this error, which we have corrected. Please see lines 92 and 889 in our revised version.

      1. I still encounter important methodological flaws. The authors acknowledge my previous suggestions but only address a small fraction of them. The most relevant points regarding the understanding of the mechanisms behind the delayed growth rate remain unexplored. The expression levels of other nAChRs target of imidacloprid in honeybees were not investigated. The expression analyses are still based on a single time point and using whole larvae, which only superficially explore the problem and may lead to misinterpretations. I do not understand the authors claim that a technological breakthrough is required to address these issues, when performing more PCRs and doing dissections should cover the matter.

      Response: Thank you very much for your important comment. You point out several unexplored issues related to understanding the mechanisms behind the delayed growth rate: the expression levels of other nAChRs targeted by imidacloprid in honeybees were not investigated, and the expression analyses are still based on a single time point and whole larvae. Please allow me to explain. Honeybees (Apis mellifera) have nine different α-subunits, Amelα1-9, and two β-subunits, Amelβ1-2. Amelα5, Amelα7, and Amelα8 are expressed in MB Kenyon cells and AL neurons, and the Amelβ2 subunit is present in Kenyon cells. Amelα2, Amelα3, and Amelα7-2 are expressed in the optic lobes. The aim of this study was to investigate whether imidacloprid induces larval neurotoxicity. Based on the above information, we selected the two most representative nAChRs (Alph1 and Alph2) for analysis. The results showed that exposure to imidacloprid increased the expression of the Alph2 gene and inhibited AChE activity, indicating that imidacloprid is neurotoxic to larvae. This result answered our question of whether imidacloprid induces neurotoxicity in larvae. Therefore, we did not further analyze the expression levels of other nAChRs. We believe that this does not affect the understanding of the mechanism behind the delayed growth rate and that it is not necessary to analyze all 11 nAChRs to find an answer. We sincerely hope that the reviewers will understand and agree with this.

      Furthermore, regarding the expression analysis based on a single time point and whole larvae: in this study, 72 h after imidacloprid exposure (Fig. 1J, 5 days of age) was chosen for sampling because this is when imidacloprid has the greatest and most representative effect on larval development. Therefore, analyzing samples at this time point did not interfere with our exploration of the mechanisms by which imidacloprid causes larval developmental retardation. We used whole larvae rather than individual tissues, which is a limitation of our study. This was mainly due to technical challenges: we were unable to obtain pure single tissues through dissection. Nevertheless, we will work to overcome these technical limitations in the future so that we can sample and compare different tissues and developmental stages to obtain more comprehensive and accurate data. Thank you again for raising this important issue and for your valuable suggestions.

      1. The authors could in many different ways explore what the origin of ROS is. That is important to further develop their hypothesis on reduced energy levels.

      Response: Thank you very much for your insightful comment and suggestion. Mitochondria are the main producers of ATP for cellular metabolism, accounting for approximately 90% of the total. However, mitochondria are also involved in the generation of reactive oxygen species (ROS). Excessive accumulation of ROS in mitochondria leads to oxidative stress, which in turn damages mitochondria and further increases ROS levels, creating a vicious cycle (Boovarahan and Kurian, 2018). In the present study, we found that imidacloprid exposure led to increased ROS and MDA levels in larvae (Figure 5A and Figure 5-source data 14), indicating that imidacloprid induced severe oxidative stress and lipid damage, which may damage mitochondria and in turn affect mitochondrial ATP production, resulting in insufficient energy supply for larval development. This factor may also be an important explanation for the larval developmental delay caused by imidacloprid. We have included the above text in our revised manuscript. Please see lines 432-442 in the revised manuscript.

      1. If there is gut damage, is it restored in the adults? It is not clear from the methods which precise stage of larval development was used for gut preparations. That information is important because prior to pupation larvae defecate for the first time and undergo shedding of the gut lining. That could profoundly affect some of the results in case gut preparations for microscopy were made close to this stage. If no food residues are found in the gut of control larvae, does it mean that they are close to pupation? Could the apoptosis found in gut of exposed larvae be the natural shedding of gut lining prior to pupation? All these possibilities have to be discussed and authors should clarify the precise larval stage used in every assay.

      Response: Thank you for your important comments. In this study, all samples used for the assays were larvae that had developed to 5 days of age after oral administration of imidacloprid at 2 days of age. This is described in detail in the Materials and Methods. See lines 507 and 517-521 in the revised manuscript. In general, bee larvae cease feeding at 6 days of age and begin their first defecation at approximately 7 days of age. However, in our study, intestinal sections were prepared from 5-day-old larvae that had not fasted or defecated, when the intestinal mucosa was normal and not undergoing shedding. Under these conditions, we found that imidacloprid caused damage to intestinal structures, apoptosis of intestinal cells, incomplete formation of the peritrophic membrane, and undigested food residues in the intestine. We believe that these results are objective and reliable.

      1. Honeybees have 5 larval instars, not 7 (Figure 1). That creates confusion about which larval stage the authors used.

      Response: Thank you very much for pointing out this editorial error, which we have corrected, please see Figure 1.

      1. The Results section states neither the amounts by which the measured parameters changed nor the significance values. How large is the impact on growth index, body mass, gene fold change, etc.?

      Response: Thank you very much for pointing out this important problem. We have revised the Results section according to your suggestions. Please see the revised manuscript.

      1. Mention figures in order (5c comes before 5b in the text)

      Response: Thank you very much for the comment. We have revised according to your suggestions. Please see the lines 208-212 in the revised manuscript.

      1. Paraquat is a herbicide not a pesticide

      Response: Thank you for pointing out the loose wording. We have revised according to your suggestions. Please see the lines 316-319 in the revised manuscript.

      1. What is the evidence that imidacloprid reduces growth index by inhibiting 20E? The authors provide real time data and discuss the data in terms of correlation. But correlation does not mean causation. Reduction in growth index could come from multitude of factors such as ROS affecting mitochondrial energy metabolism.

      Response: We deeply appreciate your insightful comments and valuable suggestions. In this study, although we conducted an in-depth analysis of ecdysone regulation, which is crucial for insect larval development, and found some clues, as you pointed out, this is not the sole reason for larval developmental delay. In fact, animal growth and development are collectively regulated by numerous physiological, biochemical, and genetic factors. The decline in the growth index may be due to other factors as you mentioned, such as oxidative stress impairing mitochondria, a dysregulated neuro-endocrine axis caused by imidacloprid targeting neurons, poor nutrient absorption, impaired movement, etc. We have incorporated this understanding into the revised manuscript. Please see lines 389-394 in the revised manuscript.

      1. The authors state that "digestion and breakdown of nutrients is impaired by imidacloprid", the evidence discussed in the paragraph however supports only that imidacloprid impairs some of the genes involved in these processes.

      Response: Thank you for your comments and valuable insights. In this paragraph, a lack of clarity and completeness in our writing may have led to the misconception that the evidence discussed only demonstrates the effects of imidacloprid on specific genes in these processes. In fact, our intent in this paragraph was to analyze and discuss the effects of imidacloprid on nutrient digestion and breakdown in larvae and to explore the causes of larval developmental delay. We demonstrated this using tissue sections, qRT-PCR and correlation analysis, which showed that the intestinal structure was disrupted and the expression of genes involved in nutrient digestion and catabolism was suppressed, resulting in defects in the catabolic utilization of food and consequently the presence of many food residues. In addition, there was a positive correlation between these genes and larval developmental delay. Together, these effects may be another important factor contributing to imidacloprid-induced larval developmental delay. We have revised the text and incorporated the above logic into the revised manuscript. Please see lines 407-431 in the revised manuscript.

      1. There is no evidence for the claim that overexpressing P450s and antioxidant enzymes cause a reduction in growth index. No transcriptome analysis was performed so it is unknown under the circumstances presented here how all the other P450s, antioxidant genes and overall gene profiles are responding. Surely, some genes will be repressed. Reduction in growth index could stem from, oxidative stress impairing mitochondria, dysregulated neuro-endocrine axis caused by imidacloprid targeting neurons, poor nutrient absorption, impaired movement, etc.

      Response: Thank you for your comments and valuable insights. Indeed, as you have pointed out, concluding that antioxidants and detoxification are significant contributors to larval developmental retardation solely on the basis of correlation analysis is inherently flawed and lacks critical support, especially in the absence of P450 and antioxidant enzyme overexpression experiments and a comprehensive transcriptome analysis of other P450s, antioxidant genes, and the overall gene expression profile. We have revised the text accordingly. Please see lines 461-467 (red text) in the revised manuscript. We have also incorporated the above logic into the revised manuscript. Please see lines 407-431 in the revised manuscript.

      1. How come the decreased ATP and glycogen levels have no effect on time to pupation? Extra time points for gene expression, measurements of gut damage, ATP levels, ROS, etc, are vital to answer how the exposed larvae eventually catch up with the unexposed group. Also, it is vital to understand whether these larval impacts translate to impacts on adults.

      Response: We sincerely thank you for your insightful comments and suggestions! These important scientific issues you've raised are a good example of your high-level scientific thinking, and they will help us and other future researchers in the field to more comprehensively study and interpret the toxic effects of imidacloprid on honey bee larvae and their potential mechanisms, as well as the mechanisms of larval resistance and adaptation to imidacloprid. According to your comments, we will adapt our experiments and conduct more thorough research in the future to address the above issues.

      1. I am confused about the author's definition of developmental rate; rate gives the notion of speed to achieve something. But the authors use developmental rate as a measure of viability (number of larvae that successfully pupated). There seems to be a significant decrease in their developmental rate plot (Fig 1i), but at the same time the authors show in Figure 1c (and mention throughout the manuscript) that there is no difference in probability of survival. This is quite confusing and the method section regarding these data is too concise and does little to help explain what the authors were trying to measure. The whole section on developmental traits would benefit of more details on how experiments were conducted and equipment used.

      Response: Thank you so much for your valuable comments. Yes, as you can see, there appears to be a significant decrease in developmental rate but no difference in survival probability, which is an intriguing finding of this study. This finding suggests that the 377 ppb imidacloprid dose is not as harmful to the larvae as previously thought. Imidacloprid appeared to limit the larval ability to molt and develop only to a certain extent, but had no effect on the developmental process, let alone survival. The underlying mechanism is worth investigating, and we have included this question in the design of future studies. In addition, following your suggestion, we have revised the corresponding part of the Materials and Methods to describe the experimental method in more detail. For more information, please see lines 530-541 in the revised manuscript.

      1. The authors should try to make it clear what percentage of exposed larvae become adults? I am confused because the plot called developmental rate might be trying to convey this message, but developmental rate and viability are very distinct traits. What is the difference, if any, in the time it takes for exposed larvae to become adults in comparison to non-exposed ones? Is there a difference in adult body weight? The answers to these last two questions are important to start understanding if the impacts of imidacloprid on larvae alimentation would still impact these same individuals once they become adults, i.e., would there be impacts for the colony and workers activity?

      Response: Thank you very much for your insightful comments. Unfortunately, this is where the research falls short. Culturing larvae to adulthood in 24-well cell culture plates is a significant technical challenge that we have yet to overcome. As a result, we do not yet have answers to the important questions you raise: What percentage of exposed larvae become adults? How does the time to adulthood differ (if at all) for exposed versus non-exposed larvae? Is there a difference in adult weight? Do the effects of imidacloprid on larval feeding persist after these individuals reach adulthood? Does imidacloprid damage to larvae affect colony and adult activity? We are aware that answers to these questions will help people better understand how serious the effects of imidacloprid environmental residues are on honey bee larvae and adults, as well as on bee colonies as a whole, and will draw sufficient attention to them. We intend to overcome the technical bottleneck of culturing larvae to adulthood in future studies and to incorporate the above scientific questions into our next research design. Thank you again for your insightful comments, which give us new research ideas.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important paper builds on a method, previously conceptualized and validated, of genetic control for insect populations. The method, called pgSIT, uses integrated CRISPR-Cas9 based constructs to generate, in certain combinations of genotypes, mutations that cause both male sterility and female inviability. Release of such genotypes in sufficiently large numbers can lead to an inundation of a local insect population with sterile males and this can lead to localised population suppression, which represents an important method of control for problematic insect populations. The data are convincing and will be valuable to anyone working on vector control strategies.

      Public Reviews:

      Reviewer #1 (Public Review):

      Precision guided sterile insect technology (pgSIT) is a means of mosquito vector control that aims to simultaneously kill females while generating sterile males for field release. These sterile males are expected to mate with 'wild' females resulting in very few eggs being laid or low hatching rates. Repeated releases are expected to result in the suppression of the mosquito population. This method avoids cumbersome sex-sorting while generating the sterile males. Importantly, until release, the two genetic elements that bring about female lethality and male sterility - the Cas9 and the gRNA carrying mosquitoes - are maintained as separate lines. They are crossed only prior to release, and therefore, this approach is considered to be more safe than gene drives.

      The authors had made a version of this pgSIT in their 2021 paper where they targeted β-Tubulin 85D, which is only expressed in the male testes and whose loss-of-function results in male sterility. In that pgSIT, they did not have female lethality, but generated flightless females by simultaneously targeting myosin heavy chain, which is expressed only in the female wings. Here the authors argue that the survival of females is not ideal, and so modify their 2021 approach to achieve female lethality/sterility.

      To do this, they target two genes - the female specific isoform of Dsx and intersex. They use multiple gRNAs against these genes and validate their ability to cause female lethality/sterility. Having verified that these do indeed affect female fertility, they combine gRNAs against Dsx and ix to generate female lethality/sterility and use β-Tubulin 85D to generate male sterility (previously validated). When these gRNA mosquitoes are crossed to Cas9 and the progeny crossed to WT (the set-up for pgSIT), they find that very few eggs are laid, larval death is high, and what emerges are males or intersex progeny that are sterile.

      As this is the requirement for pgSIT, the authors then test if it is able to induce population suppression. To do this, they conduct cage trials and find that only when they use 20:1 or 40:1 ratio of pgSIT:WT cages, does the population crash in 4-5 generations. They model this pgSIT's ability to suppress a population in the wild. Unfortunately, I was not able to assess what parameters from their pgSIT were used in the model and therefore the predicted efficacy of their pgSIT, (though the range of 0-.1 is not great, given that the assessment is between 0-0.15).

      We express our sincere appreciation for the valuable comments received. A wide range of ♀ viability and ♂ fertility values were explored in the model. The results determined that: “Achieving a ≥90% probability of elimination places slightly tighter restrictions on ♀ viability and ♂ fertility - a safe ballpark being ♀ viability and ♂ fertility both in the range 0-0.10, given a release scheme of ~26 releases of 250 pgSIT eggs per wild adult (Fig. 4B). These results suggest a target product profile for pgSIT to be ♀ viability and ♂ fertility both in the range 0-0.10.” A subsequent sentence has been added pointing out how the described pgSIT strain falls within this range: “The pgSIT strain described here falls well within these bounds, with ♀ viability of 0 and ♂ fertility of ~0.01.” The parameters of the described pgSIT strain are also listed throughout the paper and quoted here: “Cas9 in combination with gRNAdsx,ix,βTub induces either the lethality or transformation of pgSIT ♀’s into sterile unfit ⚥’s.” And: “Firstly, we determined that pgSIT males were not 100% sterile, with an estimated ~1% still producing some progeny.”
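      The comparison described above can be written as a small check. This is only an illustrative sketch, not the model used in the paper: the function name and the inclusive treatment of the bounds are assumptions, while the 0-0.10 range and the strain values (♀ viability of 0, ♂ fertility of ~0.01) are the figures quoted in the response.

```python
def within_target_profile(female_viability, male_fertility, upper_bound=0.10):
    """Return True if both traits lie in the quoted target range [0, upper_bound].

    Hypothetical helper: the bounds are treated as inclusive, which is an
    assumption made for this sketch.
    """
    return (0.0 <= female_viability <= upper_bound
            and 0.0 <= male_fertility <= upper_bound)


# Values quoted in the response for the described pgSIT strain.
strain_ok = within_target_profile(female_viability=0.0, male_fertility=0.01)
print(strain_ok)
```

      A strain with either trait above 0.10 would fail this check, which corresponds to falling outside the quoted target product profile.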

      Finally, they also develop a SENSR with a rapid fluorescence read-out for detecting the transgene in the field. They show that this sensor is specific and sensitive, detecting low copy numbers of the transgene. This would be important for monitoring any release.

      Overall, the data are clear and well presented. The manuscript is well written (albeit likely dense for the uninitiated!). I had concerns about the efficacy of generating the pgSIT animals - the overall number of eggs hatched from the gRNA (X) Cas9 cross appears to be low, therefore, very large numbers of parental animals would have to be reared and crossed to obtain enough sterile males for the SIT. In addition to this, I was concerned about the intersex progeny that can blood-feed. These could potentially contribute to the population and it would be useful to see the data that suggest that these numbers are low and the animals will not be competent in the field.

      Reviewer #2 (Public Review):

      This is a thorough and convincing body of work that represents an incremental but significant improvement on iterations of this method of CRISPR-based Sterile Insect Technique ('pgSIT'). In this version, the authors target more genes than previously, in order to induce both female inviability (targeting the genes intersex and doublesex, compared to fem-myo previously) and male sterility (targeting a beta-tubulin, as previously) in the release generation. The characterization of the lines is extensive and this data will be useful to the field. However, what is lacking is some context as to how this formulation compares to the previous iteration. Mention is made of the possible advantage of removing most females, compared to just making them flightless (as previously) but there is no direct comparison, either experimental or theoretical, i.e. imputing the life history traits into a model. For me this is a weakness, yet easily addressed. In a similar vein, much is made in alluding to the 'safety concerns of gene drive' and how this is a more palatable half-way house, just because it has a CRISPR component within it; it is not. It would be much more sensible, and more informative, to compare this pgSIT technology to RIDL (release of insects carrying a dominant lethal), which is essentially a transgene-based version of the Sterile Insect Technique, as is the work presented here.

      We express our sincere appreciation for the valuable comments received. A wide range of ♀ viability and ♂ fertility values were explored in the model. Given the intricate nature of this study and taking into account the recommendations provided by multiple reviewers and the editor, we have eliminated superfluous comparisons among various methodologies.

      The authors achieve impressive results and show that these strains, under a scenario of high levels of release ratios compared to WT, could achieve significant local suppression of mosquito populations. The sensitivity analysis that examines the effect of changing different biological or release parameters is well performed and very informative.

      The authors are honest in acknowledging that there are still challenges in bringing this to field release, namely in developing sexing strains and optimizing release strategies - a question I have here is how to actually release eggs, and could variability in the efficiency of this aspect be modelled in the sensitivity analysis? It seems to me like this could be a challenge and inherently very variable.

      We really appreciate the comments. Several approaches are available to release eggs - either in pre-existing breeding sites in the field, or in artificial breeding sites (e.g., cups). We have added a sentence in the Discussion section to highlight that this is an area requiring further research: “Secondly, studies are required to determine the survival and mating competitiveness of released pgSIT males under field conditions, and to optimize their release protocol.” Regarding the efficiency of egg releases, the following sentence in the modeling results section has been added: “We assume released eggs have the same survival probability as wild-laid eggs; however if released eggs do have higher mortality, this would be equivalent to considering a smaller release.” As stated in the modeling results (and depicted in Figure 4 and Supplementary Figure 5): “Suppression outcomes were found to be most sensitive to release schedule parameters (number, size and interval of releases), ♂ fertility and ♀ viability.” It follows that suppression outcomes are equivalently sensitive to the efficiency of an egg release.
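      The equivalence stated in the quoted sentence (higher mortality of released eggs behaves like a smaller release) can be illustrated with simple arithmetic. This is a minimal sketch under stated assumptions, not the authors' model: the function name and the relative survival value of 0.5 are hypothetical, and only the release size of 250 eggs per wild adult comes from the response.

```python
def effective_release_size(eggs_per_wild_adult, relative_survival):
    """Scale a release by the survival of released eggs relative to wild-laid eggs.

    relative_survival = 1.0 reproduces the assumption of equal survival;
    values below 1.0 represent higher mortality of released eggs.
    """
    if not 0.0 <= relative_survival <= 1.0:
        raise ValueError("relative_survival must be between 0 and 1")
    return eggs_per_wild_adult * relative_survival


# 250 eggs per wild adult is the release size quoted in the response;
# a relative survival of 0.5 is a hypothetical value chosen for illustration.
effective = effective_release_size(250, 0.5)
print(effective)
```

      Under this reading, variability in egg-release efficiency could be explored by varying the release size parameter that the sensitivity analysis already covers.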

      Reviewer #3 (Public Review):

      Summary and Strengths:

      The manuscript by Li et al. presents an elegant application of sterile insect technology (pgSIT) utilizing a CRISPR-Cas9 system to suppress mosquito vector populations. The pgSIT technique outlined in this paper employs a binary system where Cas9 and gRNA are conjoined in experimental crosses to yield sterile male mosquitoes. Employing a multiplexed strategy, the authors combine multiple gRNA to concurrently target various genes within a single locus. This approach successfully showcases the disruption of three distinct genes at different genomic positions, resulting in the creation of highly effective sterile mosquitoes for population control. The pioneering work of the Akbari lab has been instrumental in developing this technology, previously demonstrating its efficacy in Drosophila and Aedes aegypti. By targeting the female-specific splice isoform (exon-5) of doublesex in conjunction with intersex and β-tubulin, the researchers induce female lethality, leading to a predominance of sterile male mosquitoes. This innovation is particularly noteworthy as the deployment of sterile mosquitoes on a large scale typically requires substantial investment in sex sorting. However, this study circumvents this challenge through genetic manipulation.

      Weaknesses:

      One notable concern arising from this manuscript pertains to the absence of data regarding the potential off-target effects of the gRNA. Given the utilization of multiple gRNA, the risk of unintended mutations in non-target areas of the genome increases. With around 1% of males still capable of producing fertile offspring, understanding the frequency of unintended genome targeting becomes crucial. Such mutations could potentially become fixed within the natural population.

      We express our sincere appreciation for the valuable comments received and fully agree with the reviewer regarding the importance of understanding the frequency of unintended genome targeting. However, the likelihood of off-target effects becoming fixed within the population is exceedingly low. To mitigate potential negative impacts, we employed CHOPCHOP V3.0.0 (https://chopchop.cbu.uib.no) for the selection of gRNAs, specifically to minimize the occurrence of genomic off-target cleavage events. Furthermore, our releasing process will be carried out in multiple rounds. In the event that an undesired mutant is introduced into the local population, the mutated gene will either be quickly eradicated through subsequent rounds of releases or be eliminated by natural selection over time.

      The experiments are well-conceived, featuring suitable controls and repeated trials to yield statistically significant data. However, a primary issue with the manuscript lies in its data presentation. The authors' graphical representations are intricate and demand considerable attention to discern the nuances, especially due to the striking similarity between the symbols representing different genotypes. As it stands, the manuscript primarily caters to experts within the field, thereby warranting improvements in data visualization for broader comprehension.

      We appreciate the comment. However, as this work is indeed complex and intricate, and as there are limitations imposed by the publisher on data visualizations (e.g., the number of figures in the main text), we have tried our best to present our data in full.

      All three reviewers were appreciative of the work presented in this manuscript. There were some common concerns that we shared, that the authors could consider revising. They are listed below.

      Essential revisions:

      1. Formal comparison with the previous/other methods: The authors make many statements that compare this pgSIT with their previous method, gene drives, or with RIDL. We suggest that they focus their comparisons within the scope of data and avoid comparisons between RIDL, gene drive, and pgSIT that are based on perceptions of these methods. It would be useful if, for example, they could impute life history traits and demonstrate this pgSIT's efficacy over their previous versions.

      We express our sincere appreciation for the valuable comments received. We have removed the unnecessary comparisons between different methods, please review the revised version.

      1. Writing and presentation of figures: The authors should please take advantage of the eLife format and unpack each sentence/figure so that it's accessible to readers outside this field.

      We appreciate your comment, and we have implemented some necessary changes based on your suggestions.

      1. Data to support claims made in passing: There are many instances, such as detailed in the reviews (and the entire second paragraph in the discussion) that are not supported by data. The authors should either provide that data or not make these claims.

      Thank you for the comment. We have removed these claims.

      1. Off-target effects: There is the formal possibility that off-target effects might get fixed in the population. Could the authors please address this in the discussion?

      We appreciate the comment and fully agree with the reviewer regarding the importance of understanding the frequency of unintended genome targeting. However, the likelihood of off-target effects becoming fixed within the population is exceedingly low. We have addressed this in the discussion.

      “Even though mutations could potentially become fixed within the natural population, the likelihood of off-target effects becoming fixed within the population is exceedingly low. To mitigate potential negative impacts, we employed CHOPCHOP V3.0.0 (https://chopchop.cbu.uib.no) for the selection of gRNAs, specifically to minimize the occurrence of genomic off-target cleavage events. Furthermore, our releasing process will be carried out in multiple rounds. Even in the event that an undesired mutant is introduced into the local population, it will either be completely eradicated through subsequent rounds of releases or be naturally eliminated through the process of natural selection over time.”

      Aside from this, we ask that the authors please pay attention to the detailed reviews.

      Reviewer #1 (Recommendations For The Authors):

      The writing: Each sentence is packed with information and while this is fine for those immersed in the field, it might be dense for those who are not. There are a lot of nuances in such an approach and clearly laying it out for the reader is important. The authors should unpack some of these sentences to make their work more accessible.

      Thank you for the comment. We have unpacked some of the sentences; please review the revised version.

      It will help to have a schematic linked to the introduction about how these mosquitoes are designed to be used. Which strains would be scaled up in the lab, which ones (and what stage) could be released, and in which animal/generation they expect sterility or lethality. This would be useful while interpreting the schematics of the genetic crosses in the rest of the figures (1B, 2B). Li et al 2021 has something to this effect. I say this particularly because in the text, 'pgSIT' is used to refer to both the lab stocks and the F1s.

      We really appreciate the suggestion to incorporate a schematic into the introduction to clarify the intended use of these mosquitoes. Taking into account all the suggestions, we would like to keep textual descriptions and context provided within the manuscript, which, together with Figures 1B and 2B, illustrate our intentions. Nevertheless, we value your input and have taken other feedback into account to improve the overall quality of the content.

      Because Figure 1A depicts all the gRNAs I thought that's what they were testing in the first results section. But the legend seems to suggest that the individual gRNAs have been tested. Such issues will be sorted with attention to the writing. It would also be nice to have Figure 2A here.

      We apologize for any misunderstanding. Figure 1A displays two gRNA constructs: one for dsx (comprising 4 gRNAs) and another for ix (with 2 gRNAs). All of these gRNAs were tested in the initial results section. Subsequently, we engineered the final gRNA construct, denoted as gRNAdsx,ix,βTub, which combines the effective gRNAs described earlier (3 targeting dsx and 1 targeting ix, as illustrated in Supplementary Figure 2).

      It wasn't clear to me how egg laying percentages were calculated or what it means.

      We appreciate your comment. Female fecundity depends on the egg output (egg laying percentage) and the egg hatching rate, since insect females can lay unfertilized eggs that do not hatch. Egg laying percentages were calculated by dividing the number of eggs laid by a test female group by the number laid by the control female group with the highest egg count. This procedure is called normalization and enables relative comparison of the numbers of eggs laid.

      How is hatching at times more than laying?

      This occurs when a female group laid a small number of eggs but a high percentage of those eggs hatched.
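      The two quantities discussed above can be sketched with made-up numbers to show how a hatching percentage can exceed a laying percentage. This is an illustration under stated assumptions, not the authors' analysis code: all counts are hypothetical, the function names are invented for this sketch, and the hatching percentage is assumed to be computed relative to the eggs laid by the same group.

```python
def egg_laying_percentage(eggs_laid, max_control_eggs):
    """Normalize a group's egg count to the control group that laid the most eggs."""
    return 100.0 * eggs_laid / max_control_eggs


def hatching_percentage(eggs_hatched, eggs_laid):
    """Share of a group's own eggs that hatched (assumed definition)."""
    if eggs_laid == 0:
        return 0.0
    return 100.0 * eggs_hatched / eggs_laid


# Hypothetical counts, chosen only to illustrate the point.
control_max_eggs = 1000   # eggs laid by the best-laying control group
test_eggs_laid = 100      # eggs laid by a test group
test_eggs_hatched = 90    # of those, eggs that hatched

laying = egg_laying_percentage(test_eggs_laid, control_max_eggs)
hatching = hatching_percentage(test_eggs_hatched, test_eggs_laid)
print(f"laying: {laying:.1f}%, hatching: {hatching:.1f}%")
```

      In this example the test group lays few eggs relative to the control, yet most of its eggs hatch, so the hatching percentage is higher than the laying percentage because the two values use different denominators.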

      Calling something 'intersex': The authors are assessing intersex by malformed genitalia, maxillary palps and ovaries. But the genitalia defects in Fig1D were not clear to me. Can the authors show better images? While the MP and ovary phenotypes were clear, it would be nice to see these quantified - what proportion of the females have each/some/all of these phenotypes? (They have some of this in the supplementary table).

      We express our gratitude for the comment received and acknowledge the issue regarding the clarity of the images. It is important to note that these photographs represent the highest level of clarity achieved thus far. We value your interest in the quantification of the observed phenotypes. However, due to certain constraints, we were unable to quantify the proportions for all the females, and we did not retain all the samples needed for this specific quantification.

      It's interesting that 50% of the intersex don't blood-feed - is this because they do not have appropriately formed stylets? It would be important to quantify the number of hatch-able eggs. This is particularly important in the context of field application and should ideally be included in the mathematical modelling. In the discussion, the authors mention that they are not able to host-seek and a variety of other behaviours - these data should be presented as it would be important for assessing the efficacy of the pgSIT.

      Thank you for the comment. We did not observe mutant stylets in these intersex mosquitoes. We agree with the reviewer that the number of hatchable eggs is particularly important in the context of field application. Indeed, the number of hatchable eggs is what was considered in the mathematical modeling. We performed a blood-feeding assay (small cage and big cage) to assess host-seeking behavior. The data are presented in Supplementary Table 5.

      At the end of the first results section, the authors state, "Taken together, these findings reveal that ♀-specific lethality and/or ⚥..." But I don't see data that show female-specific lethality until Figure 2C.

      Thank you for pointing this out. In order to describe our results clearly, we have deleted “♀-specific lethality and/or”.

      In the combined gRNA mosquito (the pgSIT), they find that the cross between the gRNA and Cas9 results in very few eggs being laid, high larval death, and what emerges are males. This suggests that it would be a poor pgSIT, right? You'd have to set up huge crosses to get enough males emerging in the wild to mate with WT females to bring about population suppression. Could the authors comment on this?

      We appreciate the comment. Even with imperfections, such as reduced egg production from the gRNA and Cas9 cross and the need for extensive mating to obtain an adequate number of males, population suppression with pgSIT is very promising, both in its potential to eliminate a mosquito population and in its potential to suppress it to an extent that would largely interrupt disease transmission. It is worth noting that our current efforts serve as a validation of the system before its potential large-scale application: we have demonstrated that removing females by disrupting sex determination genes is possible with pgSIT, which can inform the development of such systems in other species in the future.

      If I'm reading Figure 2C right, the authors have combined the results from two types of crosses in the last two plots: 1) the Cas9 (X) gRNA mosquitoes and 2) the progeny from these crossed to WTs. This is not ideal. I would suggest the authors unpack the text around this data and plot it separately.

      We really appreciate the comment. Panel 2C depicts the phenotypic data of the F1 progeny generated by the cross of the parents indicated below the X axis: egg-to-adult survival, larval death, sex ratios, and fertility. The fertility of the F1 progeny is the major phenotypic feature for the project. To assess the fertility of the surviving F1 progeny, we had to cross the F1 females and males to WT males and females, respectively, and assess the hatching rate of the eggs produced before sacrificing emerged larvae and unhatched eggs. It is important to note that mosquito females can lay unfertilized eggs that fail to hatch.

      The text around 2F needs to be more explanatory. There are lots of labels in the figure that are not referred to, making it difficult to follow the data.

      We have gone through and expanded many of the figure legends and modified some figures to help make them more understandable.

      The supplementary figure numbering is off.

      We really appreciate the comment. The supplementary figure numbering has been fixed.

      I cannot comment on Figure 4 as this is outside my expertise. However, I do feel that some attention to the writing might help make the approach more accessible to the invested advanced lay-person.

      We appreciate the comment, and we re-wrote some of the sentences describing Figure 4.

      Reviewer #2 (Recommendations For The Authors):

      Line 49 'resistances' is a strange plural.

      Corrected. Thank you so much!

      The genitive, used with the sex symbols throughout, looks very weird, e.g. lines 60, 66 etc. Also, the intersex symbol, on my copy at least, just prints as a square.

      These have been fixed in the revised version. Thank you so much!

      Line 74 syntax (...: the spread of...") seems off

      Corrected. Thank you for pointing this out.

      Line 80-81 " to address some of the challenges with gene drives, pgSIT also leverages....." this is a straw man/red herring argument, and simply does not follow. It is this element that I raised above in the public review. See also line 84 'gene drive safety concerns'.

      Thank you. We have rewritten the paragraph.

      Line 128 "the induced phenotypes were especially strong in intersex individuals" - this is a curious statement since, if intersex, they are by definition already showing a strongly induced phenotype

      We apologize for the lack of clarity and have updated the text. We have deleted “the induced phenotypes were especially strong in intersex individuals” and, to be more explicit, now state: “These gRNAdsx/+; Cas9/+ ⚥ exhibited multiple malformed morphological features, such as mutant maxillary palps, abnormal genitalia, and malformed ovaries.”

      The extent and completeness of the supplementary data is appreciated, but some statistical tests need to be applied to back up statements like 'showed normal fertility' (line 138) or wing lengths 'were a bit larger'. None seem to have been applied.

      We appreciate the comment. We've removed these sentences in the new version.

      Supp Fig 4 - on left of panel C there is a small blue square at dsx locus that is unexplained. What is this?

      Thank you for pointing this out. It was a mistake, and we have removed the small blue square from Supp Fig 4.

      Line 182 the reduction in flight activity in release genotype of pgSIT males - is it only those coming with the maternal source of Cas9 that are plotted (only pink dots)?

      We appreciate the comment. pgSIT males, regardless of whether they originate from a maternal or paternal source of Cas9, exhibit a similar reduction in flight activity compared to wild-type (WT) males.

      Figure 3A legend - I think there is a typo that says males were fed

      Corrected. Thank you for pointing this out.

      “♂’s” to “♀’s”

      On the window of protection (WOP) plots (e.g. supp fig 12) what is the unit on Y-axis for WOP? It goes from 0-1, as if it were probability, but I was expecting some duration.

      Thanks for the comment. The y-axis for WOP in Supp Fig 12 had been normalized unnecessarily. It has now been corrected to span from 0 to 5 years.

      Fig 4B blue (line) on blue(shading) is impossible to decipher on my copy

      Thank you for pointing this out. We have changed the colors of the traces (population dynamics), made the window of protection line thicker, and have made the shading less opaque to make the population dynamics in this figure clearer.

      Line 250 and 252: supp Fig 13 (not 12)

      Corrected. Thank you for pointing this out.

      Line 279 "potentially a more widespread effect of sex determination genes than previously expected" - I simply don't see how this is so, or why there is the need to make such a claim. Dsx is known to underpin almost all somatic determination of sex-specific morphologies, in a range of insects.

      We appreciate the comment. We have deleted the sentence:

      “Taken together, these observations indicate a potentially more widespread effect of sex determination genes than previously expected, though regardless.”

      Line 320 "We would expect pgSIT to be regulated similarly to Oxitec's RIDL" because they are similar, which goes to my main point above about more appropriate context, and this warrants some direct attention to a comparison of the efficacy.

      We appreciate the comment. We have deleted these sentences:

      “We would expect pgSIT to be regulated similarly to Oxitec's RIDL technology (Spinner et al., 2022), which has already been successfully deployed in numerous locations, including the United States.”

      Was there a minimal performance advantage with strain #1 with the triple locus g-RNA suite, over the other two strains? Am just curious as to why one was chosen over the other

      We appreciate the comment. There was no performance advantage with the strain #1 over the other two strains.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      In this manuscript, Hagihara et al. characterized the relationship between the changes in lactate and pH and the behavioral phenotypes in different animal models of neuropsychiatric disorders at a large-scale level. The authors have previously reported that increased lactate levels and decreased pH are commonly observed in the brains of five genetic mouse models of schizophrenia (SZ), bipolar disorder (BD), and autism spectrum disorder (ASD). In this study, they expanded the detection range to 109 strains or conditions of animal models, covering neuropsychiatric disorders and neurodegenerative disorders. Through statistical analysis of the first 65 strains/conditions of animal models which were set as exploratory cohort, the authors found that most strains showed decreased pH and increased lactate levels in the brains. There was a significant negative correlation between pH and lactate levels both at the strain/condition level and the individual animal level. Besides, only working memory was negatively correlated with brain lactate levels. These results were successfully duplicated by studying the confirmative cohort, including 44 strains/conditions of animal models. In all strains/conditions, the lactate levels were not correlated with age, sex, or storage duration of brain samples.

      Strengths

      1. The manuscript is well-written and structured. In particular, the discussion is really nice, covering many potential mechanisms for the altered lactate levels in these disease models.

      2. Tremendous efforts were made to recruit a huge number of various animal models, giving the conclusions sufficient power.

      We are grateful to Reviewer #1 for the positive evaluation of our manuscript. As indicated in the responses that follow, we have taken all the comments and suggestions made by the reviewer into account in the revised version of our paper.

      Weaknesses

      1. The biggest concern of this study is the limited novelty. The point of "altered pH and/or lactate levels in the brains from human and rodent animals of neuropsychiatric disorders" has been reported by the same lab and other groups in many previous papers.

      The previous study mentioned by the reviewer evaluated a small number of animal models of psychiatric disorders. The novelty of this study is underscored by two key findings: 1) the generality of changes in brain pH and lactate levels across a diverse range of disease models, and 2) the association of these phenomena with specific behaviors. First, this large-scale animal model study revealed that alterations in brain pH/lactate levels can be found in approximately 30% of the animal models examined. This generality suggests a common basis in the neuropathophysiology of not only schizophrenia, bipolar disorder, and ASD, but also of Alzheimer’s disease (APP-J20 Tg mice), Down’s syndrome (Ts1Cje mice), Mowat–Wilson syndrome (Zeb2 KO mice), Dravet syndrome (Scn1a-A1783V KI mice), tuberous sclerosis complex (Tsc2 KO mice), Ehlers-Danlos syndrome (Tnxb KO mice), and comorbid depression in diabetes (streptozotocin-treated mice) and colitis (dextran sulfate sodium-treated mice). Second, this study demonstrated that these phenomena in the brain are primarily associated with working memory impairment rather than with depression- and anxiety-related behaviors. Importantly, developing these hypotheses in an exploratory cohort of animals and confirming them in an independent cohort within this study enhances the robustness and reliability of our hypotheses, which we believe are as crucial as their novelty. Accordingly, we have revised the discussion section as follows (page 31, line 7):

      Original text

      "We performed a large-scale analysis of brain pH and lactate levels in 109 animal models of neuropsychiatric disorders, which revealed the diversity of brain energy metabolism among these animal models. Some strains of mice that were considered models of different diseases showed similar patterns of changes in pH and lactate levels. Specifically, the SZ/ID models (Ppp3r1 KO, Nrgn KO mice, and Hivep2 KO mice), BD/ID model (Camk2a KO mice), ASD model (Chd8 KO mice), depression models (mice exposed to social defeat stress, corticosterone-treated mice, and Sert KO mice), AD model (APP-J20 Tg mice), and DM model (Il18 KO and STZ-treated mice) commonly exhibited decreased brain pH and increased lactate levels."

      Revised text

      "We performed a large-scale analysis of brain pH and lactate levels in 109 animal models of neuropsychiatric disorders, which revealed the diversity of brain energy metabolism among these animal models. The key findings of this study are as follows: 1) the generality of changes in brain pH and lactate levels across a diverse range of disease models, and 2) the association of these phenomenon with specific behaviors. First, this large-scale animal model study revealed that alterations in brain pH/lactate levels can be found in approximately 30% of the animal models examined. This generality suggests a common basis in the neuropathophysiology of not only schizophrenia, bipolar disorder, and ASD, but also of Alzheimer’s disease (APP-J20 Tg mice), Down’s syndrome (Ts1Cje mice), Mowat–Wilson syndrome (Zeb2 KO mice), Dravet syndrome (Scn1a-A1783V KI mice), tuberous sclerosis complex (Tsc2 KO mice), Ehlers-Danlos syndrome (Tnxb KO mice), and comorbid depression in diabetes (streptozotocin-treated mice) and colitis (dextran sulfate sodium-treated mice). Secondly, this study demonstrated that these phenomenon in the brain are primarily associated with working memory impairment over depression- and anxiety-related behaviors. Importantly, developing these hypotheses in an exploratory cohort of animals and confirming them in an independent cohort within this study enhances the robustness and reliability of our hypotheses."

      1. This study is mostly descriptive, lacking functional investigations. Although a larger cohort of animal models were studied which makes the conclusion more solid, limited conceptual advance is contributed to the relevant field, as we are still not clear about what the altered levels of pH and lactate mean for the pathogenesis of neuropsychiatric disorders.

      We agree with the reviewer’s comment. To address this issue, it is necessary to comprehensively identify brain regions and cell types responsible for pH and lactate changes in each strain/condition of animals, as these may differ among them. Subsequently, based on such findings, we can then proceed with functional investigations that specifically target the identified brain regions/cell types. However, conducting such investigations would require a significant amount of time to complete, approximately 2–3 years, and is beyond the scope of this study. Therefore, we would like to conduct such studies in the future. We have mentioned this limitation by revising the discussion section of this study as follows (page 43, line 5):

      Original text

      "Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). The brain region-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Further studies are needed to address this issue by measuring microdissected brain samples and performing in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011) and MRS (Davidovic et al., 2011)."

      Revised text:

      "The major limitations of this study include the absence of analyses specific to brain regions or cell types and the lack of functional investigations. Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. It is known that certain molecular expression profiles and signaling pathways display brain region-specific alterations, and in some cases, even exhibit opposing changes in neuropsychiatric disease models (Hosp et al., 2017; Floriou-Servou et al. 2018; Reim et al., 2017). Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). Additionally, it has been reported that the basal intracellular pH differs between neurons and astrocytes (lower in astrocytes than in neurons), and their responsiveness to conditions simulating neural hyperexcitation and the metabolic acidosis in terms of intracellular pH also varies (Raimondo et al., 2016; Salameh et al., 2017). It would also be possible that the brain region/cell type-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Given the assumption that the brain regions and cell types responsible for pH and lactate changes vary across different strains/conditions, comprehensive studies are needed to thoroughly examine this issue for each animal model individually. 
      This can be achieved through techniques such as evaluating microdissected brain samples, conducting in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011), and MRS (Davidovic et al., 2011). Subsequently, based on such findings, it is also necessary to conduct functional analyses for each model animal by manipulating pH or lactate levels in specific brain regions/cell types and evaluating behavioral phenotypes relevant to neuropsychiatric disorders."

      1. The experimental procedure is also a concern. The brains from animal models were acutely collected without cardiac perfusion in this study, which suggests that resident blood may contaminate the brain samples. Lactate is enriched in the blood, making it a potential confounding factor that could affect the lactate levels as well as the pH of the brain samples.

      We thank the reviewer for pointing this out. We have discussed this issue as follows (page 45, line 4):

      We also note that there are several potential confounding factors in this study. The brain samples analyzed in this study contained cerebral blood. The cerebral blood volume is estimated to be approximately 20–50 μl/g in human and feline brains (Leenders et al., 1990; van Zijl et al., 1998). When we extrapolate these values to murine brains, it would imply that the proportion of blood contamination in the brain homogenates analyzed is 0.2–0.6%. Additionally, lactate concentrations in the blood are two to three times higher than those in the brains of mice (Béland-Millar et al., 2017). Therefore, even if there were differences in the amount of resident blood in the brains between control and experimental animals, the impact of such differences on the lactate measurements would likely be minimal.

      1. The lactate and pH levels may also be affected by other confounding factors, such as circadian period and locomotor activity before the mice were sacrificed. This should also be discussed in the paper.

      Following the reviewer’s suggestion, we have discussed the matter as follows (page 45, line 12): Other confounding factors include circadian variation and locomotor activity before the brain sampling. Lactate levels are known to exhibit circadian rhythm in the rodent cortex, transitioning gradually from lower levels during the light period to higher levels during the dark period (Dash et al., 2012; Shram et al., 2002; Wallace et al., 2022). Variation in the time of day of sample collection was kept to a minimum within each strain/condition of animals. However, the sample collection times were not explicitly matched across the different laboratories, which may contribute to variations in the baseline control levels of pH and lactate among different strains/conditions of animals (Table S3). In addition, motor activity and wake/sleep status immediately before brain sampling can also influence brain lactate levels (Neylor et al., 2012; Shram et al., 2002). These factors have the potential to act as confounding variables in the measurement of brain lactate and pH in animals.

      1. Another concern is the animal models. Although previous studies have demonstrated that dysfunctions of these genes could cause related phenotypes for certain disorders, many of them are not acknowledged by the field as reliable disease models. Besides, gene deficiency could also cause many known or unknown unrelated phenotypes, which may contribute to the altered levels of lactate and pH, too. In this circumstance, the conclusion "pH and lactate levels are transdiagnostic endophenotype of neuropsychiatric disorders" is somewhat overstated.

      We thank the reviewer for pointing this out. We should have taken this issue into consideration. Accordingly, we have discussed this issue as the limitation of this study in the discussion section as follows (page 34, line 14):

      "While we analyzed 109 strains/conditions of animals, we included both those that are widely recognized as animal models for specific neuropsychiatric disorders and those that are not. For example, while interleukin 18 (Il18) KO mice and mitofusin 2 (hMfn2-D210V) Tg mice exhibited changes in pH and lactate levels, the evidence that these genes are associated with specific neuropsychiatric disorders is limited. However, these strains of mice exhibited behavioral abnormalities related to neuropsychiatric disorders, such as depressive-like behaviors and impaired working memory (Ishikawa et al., 2019, 2021; Yamanishi et al., 2019). Furthermore, these mice showed maturation abnormality in the hippocampal dentate gyrus and neuronal degeneration due to mitochondrial dysfunction, respectively, suggesting conceptual validity for utilization as animal models for neuropsychiatric and neurodegenerative disorders (Cunnane, et al., 2021; Burté et al., 2015; Hagihara et al., 2013, 2019). In contrast, mice with heterozygous KO of the synaptic Ras GTPase-activating protein 1 (syngap1), whose mutations have been identified in human patients with ID and ASD, showed an array of behavioral abnormalities relevant to the disorders (Komiyama et al., 2002; Nakajima et al., 2019), but did not show changes in brain pH or lactate levels. Therefore, while changes in brain pH and lactate levels could be transdiagnostic endophenotypes of neuropsychiatric disorders, they might occur depending on the subpopulation due to the distinct genetic and environmental causes or specific disease states in certain disorders."

      Regarding the latter point suggested by the reviewer, we consider that alterations in brain pH and lactate levels occur whether they are direct, known consequences of genetic modifications or indirect, unknown ones. We have proposed that genetic modifications, along with environmental stimulations, may induce various changes, which subsequently converge toward specific endophenotypes in the brain, such as neuronal hyperexcitation, inflammation, and maturational abnormalities (Hagihara et al., 2013; Yamasaki et al., 2008). The findings of this study, demonstrating the commonality of alterations in brain pH and lactate levels, align with this concept, suggesting that these alterations could serve as brain endophenotypes in multiple neuropsychiatric disorders. We have revised the discussion section as follows (page 42, line 8):

      Original text

      "These findings suggest that the observed increase in lactate production and subsequent decrease in pH in whole-brain samples may be attributed to the hyperactivity of specific neural circuits in a subset of the examined animal models."

      Revised text

      "These findings suggest that neuronal hyperexcitation may be one of the common factors leading to increased lactate production and decreased pH in the brain. We consider that alterations in brain pH and lactate levels occur, whether they are a direct and known consequence or indirect and unknown ones of genetic modifications. We have proposed that genetic modifications, along with environmental stimulations, may induce various changes, which subsequently converge toward specific endophenotypes in the brain, such as neuronal hyperexcitation, inflammation, and maturational abnormalities (Hagihara et al., 2013; Yamasaki et al., 2008). The findings of this study, demonstrating the commonality of alterations in brain pH and lactate levels, align with this concept and suggest that these alterations could serve as brain endophenotypes in multiple neuropsychiatric disorders."

      1. The negative correlation between pH and lactate is rather convincing. However, how much lactate contributes to pH is not tested. In addition, regarding pH and lactate, which factor contributes most to the pathogenesis of neuropsychiatric disorders is also unclear. These questions may need to be addressed in future studies.

      To estimate the degree of contribution of lactate to pH, we determined the contribution ratio using the regression coefficient within a linear regression model applied to a combined cohort. The results showed that 33.2% of changes in pH may be explained by changes in lactate level. We have added the following text in the Results section (page 28, line 7).

      The contribution ratio of lactate to pH, calculated based on the regression coefficient in a linear regression model, was 33.2% at the individual level, suggesting a moderate level of contribution.
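
      As an illustration only: the response does not give the exact formula, but one common reading of a "contribution ratio" in a simple linear regression is the coefficient of determination (R²), the share of variance in pH accounted for by lactate. The sketch below assumes that reading. The function name and the toy values are hypothetical and are not the study's data.

```python
import numpy as np

def contribution_ratio(lactate, ph):
    """Return R^2 of a simple linear regression of pH on lactate.

    Assumption: "contribution ratio" is read here as the coefficient
    of determination; the source does not state the exact formula.
    """
    lactate = np.asarray(lactate, dtype=float)
    ph = np.asarray(ph, dtype=float)
    # Least-squares fit of ph = slope * lactate + intercept
    slope, intercept = np.polyfit(lactate, ph, 1)
    predicted = slope * lactate + intercept
    ss_res = np.sum((ph - predicted) ** 2)   # unexplained variation
    ss_tot = np.sum((ph - ph.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical toy values (not from the study): higher lactate, lower pH
toy_lactate = [2.0, 2.5, 3.0, 3.5, 4.0]
toy_ph = [7.20, 7.12, 7.15, 7.05, 7.02]
print(f"Contribution ratio: {contribution_ratio(toy_lactate, toy_ph):.1%}")
```

      Under this reading, a value of 33.2% would mean that roughly one third of the individual-level variance in pH is accounted for by lactate, leaving the remainder to other factors.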

      Regarding the latter suggestion, we would like to address the issue in a future study. Accordingly, we have added the following sentence in the discussion section (page 40, line 11):

      Original text

      "Further studies are needed to address these hypotheses by chronically inducing deficits in mitochondrial function to manipulate endogenous lactate levels in a brain region-specific manner and to analyze their effects on working memory."

      Revised text

      "Further studies are needed to address these hypotheses by chronically inducing deficits in mitochondrial function to manipulate endogenous lactate levels in a brain region-specific manner and to analyze their effects on working memory. It is also important to consider whether pH or lactate contributes more significantly to the observed behavioral abnormalities."

      1. The authorship is open to question. Most authors listed in this paper may only provide mice strains or brain samples. Maybe it is better just to acknowledge them in the acknowledgments section.

      In light of the current circumstances, wherein there is no universally agreed definition of authorship (the Committee on Publication Ethics¹), we acknowledge the reviewer’s concern. Collecting a comprehensive range of mouse strains and brain samples is a fundamental principle of this study. Maintaining mouse lines, breeding mice, genotyping, drug administration, and preparation of brain samples each require specialized expertise. Therefore, the scientific and technical contributions of individuals who only provided mouse strains or brain samples were also crucial for obtaining the data essential to this study. In accordance with the authorship guidelines outlined by the journal, which stipulate that “We recommend that all researchers who made substantial or important contributions to the design of a work, or the acquisition, analysis or interpretation of the data used in the paper, be included as authors.”, we would like to retain their authorship status. Furthermore, we ensured that all authors had read and approved the manuscript before submission, using Google Forms.

      1. GUIDELINES ON GOOD PUBLICATION PRACTICE, Committee on Publication Ethics (COPE), https://publicationethics.org/files/u7141/1999pdf13.pdf

      1. The last concern is about the significance of this study. Although the majority of strains showed increased lactate, some still showed decreased lactate levels in the brains. These results suggested that lactate or pH is an endophenotype for neuropsychiatric disorders, but it is hard to serve as a good diagnostic index as the change is not unidirectional in different disorders. In other words, the relationship between lactate level and neuropsychiatric disorders is not exclusive.

      As pointed out by the reviewer, whether brain pH and lactate levels increase or decrease could vary among animal models. Such variation may represent subpopulations of patients or specific disease states. Considering both increases and decreases in pH and lactate levels could be important for distinguishing such subpopulations and disease states. Accordingly, we have revised the text as follows:

      Added text (page 33, line 12)

      "Detecting changes in brain pH and lactate levels, whether resulting in an increase or decrease due to their potential bidirectional alterations, using techniques such as MRS may help the diagnosis, subcategorization, and identification of specific disease states of these biologically heterogeneous and spectrum disorders, as has been shown for mitochondrial diseases (Lin et al., 2003)."

      Added text (page 35, line 14)

      "Therefore, while changes in brain pH and lactate levels could be transdiagnostic endophenotypes of neuropsychiatric disorders, they might occur depending on the subpopulation due to the distinct genetic and environmental causes or specific disease states in certain disorders."

      Reviewer #2 (Public Review):

      Hagihara et al. conducted a study investigating the correlation between decreased brain pH, increased brain lactate, and poor working memory. They found altered brain pH and lactate levels in animal models of neuropsychiatric and neurodegenerative disorders. Their study suggests that poor working memory performance may predict higher brain lactate levels.

      However, the study has some significant limitations. One major concern is that the authors examined whole-brain pH and lactate levels, which might not fully represent the complexity of disease states. Different brain regions and cell types may have distinct protein and metabolite profiles, leading to diverse disease outcomes. For instance, certain brain regions like the hippocampus and nucleus accumbens exhibit opposite protein/signaling pathways in neuropsychiatric disease models.

      We want to thank the reviewer for the valuable suggestions. To address this issue, it is necessary to comprehensively identify brain regions and cell types responsible for pH and lactate changes in each strain/condition of animals, as these may differ among them. Subsequently, based on such findings, we can then proceed with functional investigations that specifically target the identified brain regions/cell types. However, conducting such investigations would require a significant amount of time to complete, approximately 2–3 years, and is beyond the scope of this study. Therefore, we would like to conduct such studies in the future. We have mentioned this limitation by revising the discussion section of this study as follows (page 43, line 5):

      Original text

      "Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). The brain region-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Further studies are needed to address this issue by measuring microdissected brain samples and performing in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011) and MRS (Davidovic et al., 2011)."

      Revised text

      "The major limitations of this study include the absence of analyses specific to brain regions or cell types and the lack of functional investigations. Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. It is known that certain molecular expression profiles and signaling pathways display brain region-specific alterations, and in some cases, even exhibit opposing changes in neuropsychiatric disease models (Hosp et al., 2017; Floriou-Servou et al. 2018; Reim et al., 2017). Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). Additionally, it has been reported that the basal intracellular pH differs between neurons and astrocytes (lower in astrocytes than in neurons), and their responsiveness to conditions simulating neural hyperexcitation and the metabolic acidosis in terms of intracellular pH also varies (Raimondo et al., 2016; Salameh et al., 2017). It would also be possible that the brain region/cell type-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Given the assumption that the brain regions and cell types responsible for pH and lactate changes vary across different strains/conditions, comprehensive studies are needed to thoroughly examine this issue for each animal model individually. 
This can be achieved through techniques such as evaluating microdissected brain samples, conducting in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011), and MRS (Davidovic et al., 2011). Subsequently, based on such findings, it is also necessary to conduct functional analyses for each model animal by manipulating pH or lactate levels in specific brain regions/cell types and evaluating behavioral phenotypes relevant to neuropsychiatric disorders."

      Moreover, the memory tests used in the study are specific to certain brain regions, but the authors did not measure lactate levels in those regions. Without making lactate measurements in the brain regions and cell types involved in these diseases, any conclusions regarding the role of lactate in CNS diseases are premature.

      Regarding the point about “lactate measurements in brain-regions and cell types involved in these diseases,” please refer to our responses provided above.

      Additionally, evidence suggests that exogenous treatment with lactate has positive effects, such as antidepressant effects in multiple disease models (Carrard et al., 2018, Carrard et al., 2021, Karnib et al., 2019, Shaif et al., 2018). It also promotes learning, memory formation, neurogenesis, and synaptic plasticity (Suzuki et al., 2011, Yang et al., 2014, Weitian et al., 2015, Dong et al., 2017, El Hayek et al., 2019, Wang et al., 2019, Lu et al., 2019, Lev-Vachnish et al., 2019, Descalzi G et al., 2019, Herrera-López et al., 2020, Ikeda et al., 2021, Zhou et al., 2021, Roumes et al., 2021, Frame et al., 2023, Akter et al., 2023).

      We thank the reviewer for pointing out many references regarding the effects of lactate that were not cited in our paper. We have since included these studies and discussed in more detail the effect of lactate at molecular, cellular, and behavioral levels (page 39, line 11).

      Original text

      "Moreover, increased lactate may have a positive or beneficial effect on memory function to compensate for its impairment, as lactate administration with an associated increase in brain lactate levels attenuates cognitive deficits in human patients (Bisri et al., 2016) and rodent models (Rice et al., 2002) of traumatic brain injury. In addition, lactate administration exerts antidepressant effects in a mouse model of depression (Carrard et al., 2016)."

      Revised text

      "Moreover, increased lactate may have a positive or beneficial effect on memory function to compensate for its impairment, as lactate administration with an associated increase in brain lactate levels attenuates cognitive deficits in human patients (Bisri et al., 2016) and rodent models (Rice et al., 2002) of traumatic brain injury. In addition, lactate administration exerts antidepressant effects in a mouse model of depression (Carrard et al., 2021, 2016; Karnib et al., 2019; Shaif et al., 2018). Lactate has also been shown to promote learning and memory (Descalzi G et al., 2019; Dong et al., 2017; Hayek et al., 2019; Lu et al., 2019; Roumes et al., 2021; Suzuki et al., 2011), synaptic plasticity (Herrera-López et al., 2020; Yang et al., 2014; Zhou et al., 2021), adult hippocampal neurogenesis (Lev-Vachnish et al., 2019), and mitochondrial biogenesis and antioxidant defense (Akter et al., 2023), while its effects on adult hippocampal neurogenesis and learning and memory are controversial (Ikeda et al., 2021; Lev-Vachnish et al., 2019; Wang et al., 2019)."

      In conclusion, the relevance of total brain pH and lactate levels as indicators of the observed correlations is controversial, and evidence points towards lactate having more positive rather than negative effects. It is important that the authors perform studies looking at brain-region-specific concentrations of lactate and that they modulate lactate levels (decrease) in animal models of disease to validate their conclusions. It is also important to consider the above-mentioned studies before concluding that "altered brain pH and lactate levels are rather involved in the underlying pathophysiology of some patients with neuropsychiatric disorders" and that "lactate can serve as a potential therapeutic target for neuropsychiatric disorders".

      Regarding the points about positive effects of lactate, measurement of brain-region-specific lactate concentrations, and modulation of lactate levels, please refer to our responses provided earlier. The points raised by the reviewer are important and should be addressed in future studies.

      Reviewer #2 (Recommendations For The Authors):

      • Measure lactate in specific brain regions. The whole brain measurements are not relevant to the disease states.

      We thank the reviewer for pointing this out. We totally agree with the reviewer’s comment and recognize that the lack of investigations in specific brain regions is one of the major limitations of this study. To address this issue, it is necessary to comprehensively identify brain regions and cell types responsible for pH and lactate changes in each strain/condition of animals, as these may differ among them. Subsequently, based on such findings, we can then proceed with functional investigations that specifically target the identified brain regions/cell types. However, conducting such investigations would require a significant amount of time to complete, approximately 2–3 years, and is beyond the scope of this study. Therefore, we would like to conduct such studies in the future. We have mentioned this limitation by revising the discussion section of this study as follows (page 43, line 5):

      Original text

      "Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). The brain region-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Further studies are needed to address this issue by measuring microdissected brain samples and performing in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011) and MRS (Davidovic et al., 2011)."

      Revised text

      "The major limitations of this study include the absence of analyses specific to brain regions or cell types and the lack of functional investigations. Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. It is known that certain molecular expression profiles and signaling pathways display brain region-specific alterations, and in some cases, even exhibit opposing changes in neuropsychiatric disease models (Hosp et al., 2017; Floriou-Servou et al. 2018; Reim et al., 2017). Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). Additionally, it has been reported that the basal intracellular pH differs between neurons and astrocytes (lower in astrocytes than in neurons), and their responsiveness to conditions simulating neural hyperexcitation and the metabolic acidosis in terms of intracellular pH also varies (Raimondo et al., 2016; Salameh et al., 2017). It would also be possible that the brain region/cell type-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Given the assumption that the brain regions and cell types responsible for pH and lactate changes vary across different strains/conditions, comprehensive studies are needed to thoroughly examine this issue for each animal model individually. 
This can be achieved through techniques such as evaluating microdissected brain samples, conducting in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011), and MRS (Davidovic et al., 2011). Subsequently, based on such findings, it is also necessary to conduct functional analyses for each model animal by manipulating pH or lactate levels in specific brain regions/cell types and evaluating behavioral phenotypes relevant to neuropsychiatric disorders."

      • Discuss in detail the studies that show the neuroprotective effects of lactate and reconcile these with the authors' conclusions.

      As suggested by the reviewer, we have discussed in more detail the positive effect of lactate at molecular, cellular, and behavioral levels as below (page 39, line 11):

      Original text

      "Moreover, increased lactate may have a positive or beneficial effect on memory function to compensate for its impairment, as lactate administration with an associated increase in brain lactate levels attenuates cognitive deficits in human patients (Bisri et al., 2016) and rodent models (Rice et al., 2002) of traumatic brain injury. In addition, lactate administration exerts antidepressant effects in a mouse model of depression (Carrard et al., 2016)."

      Revised text

      "Moreover, increased lactate may have a positive or beneficial effect on memory function to compensate for its impairment, as lactate administration with an associated increase in brain lactate levels attenuates cognitive deficits in human patients (Bisri et al., 2016) and rodent models (Rice et al., 2002) of traumatic brain injury. In addition, lactate administration exerts antidepressant effects in a mouse model of depression (Carrard et al., 2021, 2016; Karnib et al., 2019; Shaif et al., 2018). Lactate has also been shown to promote learning and memory (Descalzi G et al., 2019; Dong et al., 2017; Hayek et al., 2019; Lu et al., 2019; Roumes et al., 2021; Suzuki et al., 2011), synaptic plasticity (Herrera-López et al., 2020; Yang et al., 2014; Zhou et al., 2021), adult hippocampal neurogenesis (Lev-Vachnish et al., 2019), and mitochondrial biogenesis and antioxidant defense (Akter et al., 2023), while its effects on adult hippocampal neurogenesis and learning and memory are controversial (Ikeda et al., 2021; Lev-Vachnish et al., 2019; Wang et al., 2019)."

      • Conduct experiments whereby you decrease/deplete/modulate lactate levels in animal models and show that there is amelioration of the symptoms.

      Regarding this point, kindly refer to the responses we provided in the first comment from the reviewer. We have mentioned this limitation by revising the discussion section of this study as follows (page 43, line 5):

      Original text

      "Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). The brain region-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Further studies are needed to address this issue by measuring microdissected brain samples and performing in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011) and MRS (Davidovic et al., 2011)."

      Revised text

      "The major limitations of this study include the absence of analyses specific to brain regions or cell types and the lack of functional investigations. Because we used whole brain samples to measure pH and lactate levels, we could not determine whether the observed changes in pH and/or lactate levels occurred ubiquitously throughout the brain or selectively in specific brain region(s) in each strain/condition of the models. It is known that certain molecular expression profiles and signaling pathways display brain region-specific alterations, and in some cases, even exhibit opposing changes in neuropsychiatric disease models (Hosp et al., 2017; Floriou-Servou et al. 2018; Reim et al., 2017). Indeed, brain region-specific increases in lactate levels were observed in human patients with ASD in an MRS study (Goh et al., 2014). Furthermore, while increased lactate levels were observed in whole-brain measurements in mice with chronic social defeat stress (Figure S7) (Hagihara et al., 2021a), decreased lactate levels were found in the dorsomedial prefrontal cortex (Yao et al., 2023). Additionally, it has been reported that the basal intracellular pH differs between neurons and astrocytes (lower in astrocytes than in neurons), and their responsiveness to conditions simulating neural hyperexcitation and the metabolic acidosis in terms of intracellular pH also varies (Raimondo et al., 2016; Salameh et al., 2017). It would also be possible that the brain region/cell type-specific changes may occur even in animal models in which undetectable changes were observed in the present study. This could be due to the masking of such changes in the analysis when using whole-brain samples. Given the assumption that the brain regions and cell types responsible for pH and lactate changes vary across different strains/conditions, comprehensive studies are needed to thoroughly examine this issue for each animal model individually. 
This can be achieved through techniques such as evaluating microdissected brain samples, conducting in vivo analyses using pH- or lactate-sensitive biosensor electrodes (Marunaka et al., 2014; Newman et al., 2011), and MRS (Davidovic et al., 2011). Subsequently, based on such findings, it is also necessary to conduct functional analyses for each model animal by manipulating pH or lactate levels in specific brain regions/cell types and evaluating behavioral phenotypes relevant to neuropsychiatric disorders."

      Other corrections

      Title page and Acknowledgements:

      We have revised the affiliation information for the following co-authors (affiliation numbers in parentheses): Drs. Anja Urbach (8), Mohamed Darwish (19, 20), Keizo Takao (20, 22), Bong-Kiun Kaang (53, 54), Michihiro Igarashi (74, 75), Rie Ohashi (87-89), and Nobuyuki Shiina (87-89).

      Page 56, line 12:

      The term ‘The International Brain pH Consortium’ has been corrected to ‘The International Brain pH Project Consortium’.

      Supplementary Table 1: Supplementary References:

      1. Oota-Ishigaki A, Takao K, Yamada D, Sekiguchi M, Itoh M, Koshidata Y, et al. (2022): Prolonged contextual fear memory in AMPA receptor palmitoylation-deficient mice. Neuropsychopharmacology 47: 2150–2159.

      We have updated the name of the mouse strain from “patDp” to “15q dup” throughout the manuscript.

      We have made the following revisions to enhance readability.

      Page 24, line 9: According to a simple correlation analysis, working memory measures (correct responses in the maze test) were significantly negatively correlated with brain lactate levels (r = -0.76, P = 1.93 × 10⁻⁵; Figure 1F).

      Page 27, line 1:

      Revised text

      "We found that working memory measures (correct responses in the maze test) were the most frequently selected behavioral measures for constructing a successful prediction model (Figure 2E), which is consistent with the results of the exploratory study (Figure 1E)."

      Figure 1 legend:

      Revised text

      "(F–H) Scatter plot showing correlations between actual brain lactate levels and measures of working memory (correct responses in the maze test) (F), the number of transitions in the light/dark transition test (G), and the percentage of immobility in the forced swim test (H)."

      Figure 2 legend:

      Revised text

      "(F–H) Scatter plots showing correlations between actual brain lactate levels and working memory measures (correct responses in the maze test) (F), the acoustic startle response at 120 dB (G), and the time spent in dark room in the light/dark transition test (H)."

      Page 30, line 2:

      Original text

      "The high to moderate-high pH/low to moderate-low lactate group included mouse models of ASD or developmental delay, such as Shank2 KO, Fmr1 KO, BTBR, Stxbp1 KO, Dyrk1 KO, Auts2 KO, and patDp mice (Table S1, Figure S7)."

      Revised text

      "The high pH/low lactate group and moderate-high pH/moderate-low lactate group included mouse models of ASD or developmental delay, such as Shank2 KO, Fmr1 KO, BTBR, Stxbp1 KO, Dyrk1 KO, Auts2 KO, and 15q dup mice (Table S1, Figure S7)."

      Page 40, line 7:

      Original text

      "Moreover, increased lactate levels may also be involved in behavioral changes other than memory deficits such as anxiety."

      Revised text

      "Moreover, increased lactate levels may also be involved in behavioral changes other than memory deficits, such as anxiety."

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The experimental design presented cannot clearly show that the effect of passive exposure was due to the specific exposure to task-relevant stimuli since there is no control group exposed to irrelevant stimuli.

      We acknowledge the possibility that exposure to task-irrelevant stimuli could result in improvements in learning. Testing this possibility would be a worthwhile goal of future experiments, but it is outside the scope of our current study. We have been careful in our paper to only draw conclusions about the effects of exposure to task-relevant stimuli compared to no exposure. We have added a discussion of this point and relevant references to the literature in the Discussion section of our manuscript.

      The conclusion that "passive exposure influences responses to sounds not used during training" (line 147) does not seem fully supported by the authors' analysis. The authors show that there is an increase in accuracy for intermediate sweep speeds despite the fact that this is the first time the animals encounter them in the active session. However, it seems impossible to exclude that this effect is simply due to the increased accuracy on the extreme sounds that the animals had been trained on.

      We have modified this sentence to emphasize that it refers to “intermediate” sounds. Regarding the reviewer’s concern, the conclusion is drawn from Figure 3, in which we show that mice exhibit an improvement on non-extreme stimuli after training on extreme stimuli. Panel 3D illustrates that the observed improvements are not just changes in psychometric performance driven by the extreme sounds. In the context of this result, the conclusion relates to generalization in performance on task-relevant stimuli that are closely related to the training stimuli. In our view, it was not entirely obvious a priori that this result would have to occur, since it is possible that performance could improve at the extremes without improving at the intermediate stimuli.

      In the modelling section, the authors adjusted the hyper-parameters to maximize the difference between pure active and passive/active learning. This makes a comparison of learning rates between models somewhat confusing.

      We apologize for the confusion. None of our conclusions are based on comparisons of learning speed between models, but perhaps this was not pointed out sufficiently clearly. The relevant comparisons between conditions for each specific model are made using the same hyperparameters. We have clarified this point in the modeling section of our manuscript.

      The description of the sound does not state whether when reducing the slope of the sweeps the center or the onset frequency of the sounds is preserved.

      Frequency modulated sounds of different FM slopes were generated such that the center frequency was always the same. This is now clarified in the updated version of the manuscript.
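
      The center-frequency constraint described above can be illustrated with a short sketch. This is not the authors' stimulus code: it assumes a logarithmic sweep, a slope expressed in octaves per second, and "center" meaning the geometric midpoint of the start and end frequencies. The function name `fm_sweep` and all parameter names are hypothetical.

```python
import math

def fm_sweep(center_hz, slope_oct_per_s, duration_s, sample_rate=44100):
    """Return (samples, f_start, f_end) for a logarithmic FM sweep.

    Assumption: frequency follows f(t) = f_start * 2**(slope * t), and the
    sweep is placed so that its geometric center equals center_hz for any slope.
    """
    # Half of the total excursion (in octaves) lies on each side of the center.
    half_span_oct = slope_oct_per_s * duration_s / 2.0
    f_start = center_hz * 2.0 ** (-half_span_oct)
    f_end = center_hz * 2.0 ** (half_span_oct)

    n_samples = int(round(duration_s * sample_rate))
    k = slope_oct_per_s * math.log(2.0)  # exponential rate in 1/s
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        if abs(k) < 1e-12:
            # Zero slope: a pure tone at the center frequency.
            phase = 2.0 * math.pi * f_start * t
        else:
            # Phase is the integral of 2*pi*f(t) from 0 to t.
            phase = 2.0 * math.pi * f_start * (math.exp(k * t) - 1.0) / k
        samples.append(math.sin(phase))
    return samples, f_start, f_end
```

      Changing only `slope_oct_per_s` moves the start and end frequencies symmetrically (on a log axis) around `center_hz`, so steeper and shallower sweeps, upward or downward, share the same center.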

      Reviewer #1 (Recommendations for the authors):

      As mentioned, the specificity of the stimuli presented during the passive period is not explicitly addressed in either modelling or behaviour. For modelling, this could be quite straightforward to assess by manipulating the input stimuli during passive episodes. For the behaviour, this would require repeating the experiment with passive sessions during which unrelated sounds are presented (for example varying in frequency or intensity instead of frequency slope). I mainly include this suggestion to clarify my previous comment because this would require a huge amount of work.

      We agree that varying the extent to which the presented passive stimuli are related to the task is an interesting point to study in future experiments. However, doing so for the experiments is outside the scope of the current study, and we believe exploring this only in the modeling part would add little value to the current study, because the outcome will highly depend on the details of the implementation.

      Reviewer #2 (Public Review):

      One limitation here is that the presented analysis is somewhat simplistic, does not include any detailed psychometric analysis (bias, lapse rates etc), and primarily focuses on learning speed.

      In our preliminary analyses of trials that included extreme and intermediate stimuli after animals had learned the task (Figure 3), we investigated some metrics of the type that the reviewer suggests here. However, since such additional psychometric analyses were somewhat tangential to our main results (which are about learning speed and responses to sounds not included during training), we did not include these in our manuscript. In agreement with the reviewer’s concern, a main limitation of our study is that the available data does not allow for an analysis of psychometrics during the initial learning stages, since only the extreme stimuli were presented during the task.

      Reviewer #2 (Recommendations for the authors):

      The International Brain Lab has shown quite nicely that psychometric curves continue to improve (increased slope, decreased bias) across learning. This was not really discussed or presented in your data - is this observed during the S4 training portion?

      We indeed saw improvements in the psychometric performance during stage S4, in particular for the active-only learners, as can be seen in Figure 3. We quantified these changes (now presented in the Results section), and added a discussion to the main text.

      Why use a linear fit to extract the various quantities of interest? All of these quantities could be extracted from the raw behavioral data itself.

      Because of the large variations in performance from day to day, a linear fit allowed us to extract a more reliable estimate of quantities like “Time to achieve 70%” and “Performance at 21 days” for each animal.
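
      A minimal sketch of how such quantities could be read off a linear fit is given below. It assumes an ordinary least-squares line fitted to one animal's daily percent-correct values; the response does not specify the authors' fitting procedure, and the function names are hypothetical.

```python
def linear_fit(days, performance):
    """Ordinary least-squares fit of performance = slope * day + intercept."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(performance) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, performance))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def time_to_threshold(slope, intercept, threshold=70.0):
    """Day at which the fitted line reaches the threshold (None if it never rises to it)."""
    if slope <= 0:
        return None
    return (threshold - intercept) / slope

def performance_at(slope, intercept, day=21):
    """Value of the fitted line on the given day."""
    return slope * day + intercept
```

      Because both estimates come from the fitted line rather than from any single session, one unusually good or bad day shifts them only slightly. Note that a fitted line is unbounded, so in this sketch values extrapolated far beyond the training period can exceed 100%.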

      The analysis presented was focussed primarily on the fast learners. What about the slow learners? Are the ANN models able to recapitulate different aspects of their behavior?

      We agree with the reviewer that the observation that the learners clustered into two groups calls for further investigation. In this study, we focused on the mice that learned more efficiently, because those allowed us to address our main research question about the influence of passive exposure. We believe the slow learners could be modeled with ANNs that start with a less-easily discriminable input representation, which limits the performance that the trained network is ultimately able to achieve. This additional analysis is outside the scope of the current manuscript, but we hope to address these questions in the future.

      Although I appreciate the thoroughness of the modeling, I was not entirely convinced by the narrative underlying models 1-5, since none of these models were able to successfully recapitulate your core findings. Would it not make more sense to focus primarily on the final model?

      By starting with the simplest possible model that incorporates supervised and unsupervised learning, we were able to determine which ingredients were necessary to capture the behavioral data. We believe this could not have been clearly established by considering the final model alone.

      Reviewer #3 (Public Review):

      The first [major weakness] is that even Model 5 differs from their data. For example, the A+P (passive interleaved condition) learning curve in Figure 7 seems to be non-monotonic, and has some sort of complex eigenvalue in its decay to the steady state performance as trials increase. This wasn't present in their experimental data (Figure 2D), and implies a subtle but important difference. There also appear to be differences in how quickly the initial learning (during early trials) occurs for the A+P and A:P conditions. While both A+P and A:P conditions learn faster than A only in M5, A+P and A:P seem to learn in different ways, which isn't supported in their data.

      The reviewer is correct that there are subtle differences between the two learning curves produced by Model 5. Due to expected variability in the experimental data, however, it is difficult to conclude whether such subtle distinctions also appear in the learning curves of the mice. Further, the slight overshoot of the learning curve that the reviewer mentions is not constrained by the experimental data due to different mice reaching asymptotic performance at different times, and many of them not having even reached asymptotic performance by the end of the training period.

      However, even if there are minor discrepancies between the learning curves produced by the final version of the model and by the mice, we do not see this as being especially surprising or problematic. As in any model, there are a large number of potentially important features that are not included in any of our models, such as realistic spectrotemporal neural responses, nonlinearity in neural activations, and heterogeneity across mice, among many others. The aim of our modeling was to choose a space of possible models (which is inevitably restricted) and show which model version within that space best captures our experimental observations. Expanding the space of possible models that we considered to capture further nuances in the data will be a task for future work.

      The second major weakness is that the authors also don't generate any predictions with M5. Can they test this model of learning somehow in follow-up behavioural experiments in mice? ... Without follow-up experiments to test their mechanism of why passive exposure helps in a schedule-independent way, the impact of this paper will be limited.

      Although testing predictions from our models was beyond the scope of the current study, model M5 does generate specific predictions, in particular about neural representations and the ways in which they evolve through learning. We hope to test these predictions in future work.

      I believe the authors need to place this work in the context of a large amount of existing literature on passive (unsupervised) and active (supervised) learning interactions. This field is broad both experimentally and computationally. For example, there is an entire sub-field of machine learning, called semi-supervised learning that is not mentioned at all in this work.

      We thank the reviewer for pointing this out. The Discussion section of the updated manuscript now includes a discussion on how our results fit in with this literature.

      Reviewer #3 (Recommendations for the authors):

      All points made by the reviewer in their Recommendations For The Authors are associated with those presented in the Public Review and they are addressed in our response above.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This is a valuable study of Eph-Ephrin signaling mechanisms generating pathological changes in amyotrophic lateral sclerosis. There are exciting findings bearing on the role of glial cells in this pathology. The study emerges with solid evidence for a novel astrocyte-mediated mechanism for disease propagation. It may help identify potential therapeutic targets.

      Response to Editor’s decision letter: Drs. Huang and Zaidi: Thank you for considering this re-revision of our manuscript for potential publication in eLife. We have addressed the remaining comments of reviewer #2. We have included detailed response-to-reviewer comments below to address each of these remaining specific points from reviewer #2, and we have highlighted all the changes in the manuscript text (using a red font color) made in response to these comments. Based on the reviewers’ critiques, we feel our re-working of the manuscript has made for a greatly improved study.

      Reviewer #1 (Recommendations For The Authors):

      Reviewer comment: All questions/concerns have been addressed.

      Response: We thank Reviewer #1 for the previous helpful comments that we used to improve our manuscript. As Reviewer #1 has no new comments, we have provided no additional responses to address this reviewer’s input. Instead, we only focus (in this new “Response to Reviewer Comments” document) on the remaining points from Reviewer #2 below.

      Reviewer #2 (Recommendations For The Authors):

      Overall, the authors have addressed most concerns raised in the prior review. A couple of very minor points remain, which would improve the clarity of the report.

      Reviewer comment 1: The abstract has not been edited and still emphasizes that astrocyte-mediated upregulation in ephrinB2 signaling underlies pathogenicity in mutant SOD1-associated ALS. There is certainly sufficient evidence to suggest a large role for astrocytes, however, without a thorough investigation of other key cell types in the spinal cord, this cannot be concluded specifically. Especially given that a non-specific promoter (U6) was employed in the viral constructs.

      Response: We apologize for this mistake. In response to the reviewer’s previous comment in the first round of review, we made changes throughout the manuscript to address this issue; however, we failed to do this in the Abstract. In this re-revised manuscript, we now also make the necessary changes to the Abstract.

      Reviewer comment 2: It is interesting to note that a non-specific promoter, U6, exhibited such large specificity to astrocytes in the cord as compared to neurons (Fig 2M). This is worth discussing briefly in the discussion and how this result compares to those in the literature.

      Response: We have now added a brief discussion of this issue to the Discussion section, including a description of our previous studies that used the Gfa2 promoter to achieve astrocyte-specific transduction when employing viral vectors in the rodent spinal cord.

      Reviewer comment 3: I appreciate the authors including a supplemental figure on the expression of ephrinA4 receptors in the cervical ventral horn. Unfortunately, the quality of this image is very poor in conveying the receptor expression. The detailed discussion point on the expression of EphB receptors in the cervical ventral horn should be sufficient for readers to take into consideration.

      Response: We have now removed this supplemental figure and keep only the text from the re-revised manuscript.

      Reviewer comment 4: A few instances of motor neuron diameter being attributed to a 200 μm² size remain (e.g. pg 14).

      Response: We have corrected this issue throughout the re-revised manuscript. The correct information is: somal diameter greater than 20 μm.

      Reviewer comment 5: It is still a little unclear in the result text as to when assessment of lentiviral transduction was conducted following intraspinal injections.

      Response: We have now added this detail about the time point of assessing transduction to both the Results section and the Materials/Methods section.

      Reviewer comment 6: Some figures are missing markers of significance (e.g. Fig 2M).

      Response: Below are our comments about significance markers for each graph in all figures.

      Figure 1:

      Panel E: We have now added asterisks for any statistically-significant comparisons. In addition, we provide the details of this statistical analysis in the text of the re-revised manuscript.

      Figure 2:

      Panel M: We have now added asterisks for statistical comparisons, as well as details in the text.

      Panel N: The asterisk was already shown in the previous version of the figure.

      Figure 3:

      Panels B and G: The asterisks were already shown in the previous version of the figure.

      Figure 4:

      All panels: There are no significant differences; therefore, no asterisks are needed.

      Figure 5:

      Panel F and G: The asterisks were already shown in the previous version of the figure.

      Panel H: The difference is not statistically significant.

      Figure 6: No graphs are shown in this figure.

      Reviewer comment 7: Since a wild type mouse control has not been included in the quantification of diaphragm NMJ innervation with and without ephrin knock-down, it would be useful to include a description or discussion on the phenotype of NMJ denervation exhibited in the SOD1G93A mouse model of ALS.

      Response: We have now added a description of the diaphragm NMJ denervation that occurs in SOD1G93A mice, in particular at the age/time point of our NMJ analysis.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable manuscript investigates the roles of DKK3 in AD synapse integrity. Although previous work has identified the involvement of Wnt and DKK1 in synaptic physiology, this study provides compelling evidence that suppression of DKK3 rescues the changes in excitatory synapse numbers, as well as memory deficits, in an established AD mouse model. The authors provide both gain and loss of function data that support the main conclusion and advance our understanding of the mechanisms by which the Wnt pathway mediates early synaptic dysfunction in AD models.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, Nuria Martin-Flores, Marina Podpolny and colleagues investigate the role of Dickkopf-3 (DKK3), a Wnt antagonist in synaptic dysfunction in Alzheimer's disease. Loss of synapses is a feature of Alzheimer's and other forms of dementia such as frontotemporal dementia and linked amyotrophic lateral sclerosis (FTD). The authors utilise a broad range of experimental approaches. They show that DKK3 levels are increased in Alzheimer's disease and that this occurs early in disease. This is an important finding since early disease changes are believed to be the most important. They also show increases in DKK3 in transgenic mouse models of Alzheimer's disease and that DKK3 knockdown restores synapse number and memory in one such model. Finally, they link these DKK3 increases to loss of excitatory synapses via the blockade of the Wnt pathway and subsequent activation of GSK3B; GSK3B is strongly linked to both Alzheimer's disease and FTD. The quality of the data is good and the conclusions well supported by these data. There are no major weaknesses. The findings support studies that target the Wnt pathway as a potential therapeutic for Alzheimer's disease.

      Reviewer #2 (Public Review):

      This manuscript by Martin-Flores et al., has examined the role of DKK3 in Alzheimer's disease, focusing on the regulation of synaptic numbers. By using human AD brain databases and tissue samples, the authors showed that DKK3 protein and mRNA levels are increased in the brains of AD patients. DKK3 is expressed in the excitatory neurons in WT mouse brains and accumulates at atrophic neurites around amyloid plaques in AD mouse brains. Interestingly, secretion of DKK3 appears to be regulated by NMDAR antagonist as well as chemical LTD. Through gain and loss of function studies, the authors showed that DKK3 regulates the number of excitatory as well as inhibitory synapses with distinct downstream pathways. Finally, the authors investigated the contribution of DKK3 to synaptic changes in AD and found that DKK3 loss of function rescues both the excitatory and inhibitory synaptic defects, resulting in the improvement of memory function in J20 mice.

      Overall, the data is clearly presented and deals with novel roles of DKK3 in controlling excitatory and inhibitory synapses. The finding that shRNA expression of DKK3 in AD model mice rescues synaptic phenotypes and memory impairment is potentially interesting and may provide a new strategy for AD treatment.

      We would like to thank the Editors and the Reviewers for their very insightful suggestions. We are delighted to receive very positive reviews of our manuscript. In response to the comments made by the reviewers, we have carried out an extensive revision of our manuscript. In the revised manuscript, we have addressed all the comments made by the reviewers.

      Recommendations for the authors:

      Reviewer #1:

      My only comment regards the role of GSK3B activation in synaptic dysfunction and its targets. GSK3B is a Tau kinase but is also involved in IP3 receptor delivery of Ca2+ to mitochondria. This delivery is a major regulator of mitochondrial ATP production, and synaptic function is heavily dependent on ATP. Both Alzheimer's disease and FTD insults have been linked to GSK3B activation (see, e.g., Szabo EMBO R 2023, Gomez-Suaga Aging Cell 2022). It might be valuable to readers for the authors to speculate briefly on potential GSK3B synaptic targets in the Discussion.

      We thank the reviewer for this suggestion. In the Discussion, we now describe how GSK3β may contribute to synaptic dysfunction and loss in the context of increased DKK3 levels and in Alzheimer’s disease.

      Reviewer #2:

      1. In Fig 1B, the authors showed that soluble DKK3 levels were increased in Braak 1-3 patients, while no changes were observed in Braak 4-5. If the secretion of DKK3 is dependent on NMDAR activity, does this data imply that Braak 4-5 patients have reduced NMDAR activity in general, resulting in the reduced DKK3 release even with the increased mRNA levels? It would be interesting to test this hypothesis in a mouse AD model.

      In Figure 1B, we analyzed the levels of soluble and insoluble DKK3 in the hippocampus of AD patients at different disease stages based on their Braak stages. As the reviewer indicated, soluble levels of DKK3 were increased in patients with Braak I-III but not at later stages. Importantly, DKK3 levels were also elevated in Braak IV-VI patients, but only in the insoluble fraction (Figure 1C), suggesting that DKK3 could accumulate within Aβ aggregates. Based on these findings, we cannot conclude that DKK3 release is reduced at later stages of the disease in patients.

      To explore the underlying mechanisms regulating DKK3 levels, we used cultured hippocampal neurons and AD mouse brain slices. In mouse models, we have demonstrated that extracellular DKK3 levels (secreted DKK3 fraction) depend on NMDAR activation early in the disease progression (Figure 2E, F). Moreover, we also provide new data showing that antagonizing NMDAR partially blocks the increase of DKK3 extracellular levels induced by oligomeric Aβ (see response to question 4 of this reviewer and Figure S2G, H). It is well established that oligomeric Aβ promotes hyperexcitability through, in part, the aberrant activation of NMDAR (Li S et al., 2011, PMID: 21543591; Mucke L and Selkoe DJ, 2012, PMID: 22762015). In line with this, NMDAR blockers prevent Aβ-induced synapse loss and improve cognition in AD models (Hu NW et al., 2009, PMID: 19918059; Ye C et al., 2004, PMID: 15288443). In addition, an NMDAR antagonist is currently approved as a drug treatment for AD patients (Cumming J 2021, PMID: 33441154). Together, our findings in dissociated neurons, AD mouse brain and human samples indicate that soluble Aβ oligomers promote the release of DKK3 through NMDAR activation and suggest that this mechanism might also be occurring in the brain of AD patients.

      2. Recent work (Yuan et al., 2022, Nature) has shown that dystrophic neurites/axonal spheroids found around Aβ deposits are filled with neuronal endolysosomes. Are DKK3 in ThioS positive amyloid plaques located in endolysosomes of these axonal spheroids? If so, does this data mean that DKK3 in Fig 2B-D represents the entrapped DKK3 protein population that fails to be secreted from dystrophic neurites?

      The reviewer raises an interesting question. Our results show that secretion of DKK3 is increased in two AD models before substantial plaque load. Later in the disease, DKK3 accumulates in dystrophic neurites (visualized as axonal spheroids) surrounding amyloid plaques. To address whether DKK3 protein is located in vesicles of the endolysosomal pathway within axonal spheroids, we performed co-localization analyses of DKK3 and the endolysosomal marker LAMP1. We found that DKK3 colocalized with LAMP1 (Figure 2D), indicating the presence of DKK3 in axonal spheroids. These results indeed suggest that DKK3 is present in abnormally enlarged vesicles in dystrophic neurites around Aβ plaques. This could affect the axonal transport of DKK3. Given that proteins present in dystrophic neurites have been correlated with defects in bidirectional transport in the axon (Stokin GB et al., 2005, PMID: 15731448; Sadleir KR et al., 2016, PMID: 26993139), both DKK3 turnover and secretion could be affected.

      3. Why does only LTD induce DKK3 release? Why not general activation of neuronal activity? It would be important to test the relationship between DKK3 secretion and neuronal activity with optogenetics and chemogenetics.

      We tested whether neuronal activity triggered increased extracellular DKK3 levels by subjecting neurons to chemical long-term potentiation (cLTP) or long-term depression (cLTD). However, only cLTD increased extracellular DKK3, which we then confirmed in brain slices (Figure S3). This finding is not unexpected, as it is well described that different patterns of activity can lead to different molecular outcomes. For example, high-frequency stimulation (HFS; an activity pattern that resembles LTP) and low-frequency stimulation (LFS; a different activity pattern resembling LTD) lead to opposing effects on surface levels of the Wnt receptor Frizzled-5 (Fz5) (Sahores M et al., 2010, PMID: 20530549). Furthermore, cLTP increases Fz5 S-acylation, an important post-translational modification that regulates the surface levels of Fz5, whereas cLTD decreases it (Teo S et al., 2023, PMID: 37557176). Another example is the BDNF receptor TrkB. Surface TrkB is increased by tetanic stimulation, which, like HFS or cLTP, also induces LTP, but not by LFS (Du J et al., 2000, PMID: 10995446). Our findings suggest that DKK3 might contribute to synaptic changes underlying cLTD. Future experiments using chemogenetics or optogenetics might elucidate the role of DKK3 in activity-induced synaptic changes.

      4. Are Abeta oligomer treatment-dependent increases in DKK3 protein levels in the cellular lysate and the extracellular fraction also suppressed by APV?

      Our results in AD mice indicate that increased DKK3 release is dependent on NMDAR activation. To investigate whether amyloid-β oligomers (Aβo) increase DKK3 levels in the cell lysate and extracellular fractions through NMDAR, we blocked these receptors in hippocampal neurons using AP-V (Figure S2G, H). In these experiments, we used a lower concentration of Aβo (200 nM of Aβ1-42) to avoid any potential cytotoxic effect. In line with our previous results using a higher concentration of Aβo, we observed that Aβo markedly increased DKK3 levels both in the cell lysate and in the extracellular fraction compared to the reverse Aβ42-1 control peptide. Kruskal-Wallis with Dunn’s test showed a trend toward reduced levels of DKK3 in the extracellular fraction when we compared neurons treated with Aβo and APV with those neurons treated with Aβ and vehicle (p = 0.0726). However, this reduction in DKK3 levels in the extracellular fraction reached statistical significance using a t-test (p = 0.0384). No differences were observed between the reverse control peptide and Aβo and APV conditions. These results suggest that blockade of the NMDAR partially occludes the ability of Aβo to increase DKK3 levels in the extracellular fraction.

      5. Why does DKK3 shRNA only downregulate inhibitory synapses but not excitatory synapses in the WT brain slice? Does this mean that in the WT brain, other DKK proteins (without changes in their expression as shown in Fig S6) are sufficiently expressed and compensate for the roles of DKK3 in excitatory synapse integrity?

      The reviewer points out an interesting result. In J20 mice, DKK3 knockdown affects both excitatory and inhibitory synapse density (Figure 6B, C). In Figure 3B, D, we show that in vivo downregulation of DKK3 leads to an increased number of inhibitory synapses without affecting excitatory ones in the brain of WT animals. These results indicate that in a healthy brain (WT), DKK3 is required for the maintenance of inhibitory synapses but not for excitatory synapses under our experimental conditions. Furthermore, DKK3 partially shares the mechanism of action with DKK1, as both DKK proteins promote excitatory synapse loss through the Wnt/GSK3β pathway (Figure 4A-C) (Marzo A et al., 2016, PMID: 27593374). Therefore, it is possible that endogenous DKK1 levels in the hippocampus could compensate for the reduced expression of DKK3, resulting in the lack of changes in excitatory synapse number when DKK3 is knocked down in WT animals.

      6. Manipulating DKK3 in WT brains only affects Gephyrin but not VGAT, but in J20, both Gephyrin and VGAT seem to be affected by DKK3 shRNA (Fig 6). The authors need to provide the pre vs post synapse number in Fig 6 and discuss the potential differences.

      We have now included the quantification of excitatory and inhibitory pre- and postsynaptic puncta for 4-month-old (Figure S6B, C) and 9-month-old (Figure S6D, E) WT and J20 mice. At 4 months of age, the density of Homer1 puncta for excitatory synapses and of both vGAT and Gephyrin for inhibitory synapses was increased and decreased, respectively, by knocking down DKK3 in the J20 mice. At 9 months, strong trends were observed in all the synaptic markers when downregulating DKK3, but significance was only reached for Homer1 puncta.

      7. Where are the Wnt receptors expressed? Are they exclusively expressed in neurons? Can the authors exclude the potential involvement of glial cells in this process?

      In neurons, Wnt receptors can be expressed in the synaptic terminals. For example, Wnt receptor Frizzled-5 is located at the presynaptic terminal and the dendritic shaft but not at spines (Sahores M et al., 2010, PMID: 20530549; McLeod F et al., 2018, PMID: 29694885), whereas Frizzled-7 is located at the dendritic shaft and spines (McLeod F et al., 2018, PMID: 29694885). In addition, the Wnt co-receptor LRP6 is present at both pre- and postsynaptic sites in excitatory synapses (Jones ME et al., 2023, PMID: 36638182). Kremen1, another receptor for Dkk proteins, is also highly expressed in the brain and our unpublished superresolution results show that this receptor is present in both pre- and postsynaptic sites of 53% of excitatory and 30% of inhibitory synapses. However, these receptors are not exclusively expressed in neurons and many of them are also highly expressed in astrocytes (Zhang Y et al., 2016, PMID: 25186741). Based on the literature and our findings, we cannot rule out the possibility that DKK3 may signal to other cell types such as astrocytes, which could also contribute to changes in synapse density. However, recombinant DKK3 induces structural and functional changes in excitatory and inhibitory synapses within 3-4h (Figure 3), suggesting that DKK3 acts on neurons leading to synaptic changes.

      8. Does the shRNA treatment of DKK3 affect the size and number of amyloid plaques in the AD mice?

      We thank the reviewer for raising this very important question. We have now evaluated the impact of DKK3 knockdown on Aβ pathology in the J20 mice. We did not observe differences in Aβ coverage or in the average number and size of Aβ plaques when DKK3 was silenced in the CA3 (Figure S6F). Therefore, the changes we observe in excitatory and inhibitory synapse density around plaques after knocking down DKK3 are unlikely to be due to changes in Aβ plaques.

    1. Author Response

      eLife assessment

      This study presents a valuable finding on the distinct subpopulation of adipocytes during brown-to-white conversion in perirenal adipose tissue (PRAT) at different ages. The evidence supporting the claims of the authors is convincing, although specific lineage tracing of this subpopulation of cells and mechanistic studies would expand the work. The work will be of interest to scientists working on adipose and kidney biology.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors performed single nucleus RNA-seq for perirenal adipose tissue (PRAT) at different ages. They concluded that a distinct subpopulation of adipocytes arises through brown-to-white conversion and can convert to a thermogenic phenotype upon cold exposure.

      Strengths:

      PRAT adipose tissue has been reported as an adipose tissue that undergoes browning. This study confirms that brown-to-white and white-to-beige conversions also exist in PRAT, as previously reported in the subcutaneous adipose tissue.

      We did not observe any white-to-beige conversion in PRAT under regular conditions. The adipocyte population that arises from brown-to-white conversion (mPRAT-ad2) can respond to cold and restore its UCP1 expression. However, brown adipocytes that arise from the mPRAT-ad2 subpopulation after cold exposure have a transcriptome distinct from that of cold-induced beige adipocytes in iWAT (Figure S6K) and are more closely related to iBAT brown adipocytes (Figure 6E).

      Weaknesses:

      1. There is overall a disconnection between single nucleus RNA-seq data and the lineage chasing data. No specific markers of this population have been validated by staining.

      We are not sure what “this population” refers to. We suspect it is the Ucp1-&Cidea+ mPRAT-ad2 adipocyte subpopulation. If so, we did not identify specific markers for these adipocytes, as shown in Figure 1H and stated in the Discussion. mPRAT-ad2 is negative for Ucp1 and Cyp2e1, which are markers for mPRAT-ad1 and mPRAT-ad3&4, respectively. Therefore, we plan to stain the mPRAT with Ucp1, Cyp2e1 and Perilipin (a pan-adipocyte marker) antibodies. Cells that are Perilipin+&Ucp1-&Cyp2e1- will represent the mPRAT-ad2 subpopulation.

      2. It would be nice to provide more evidence to support the conclusion shown in lines 243 to 245 "These results indicated that new BAs induced by cold exposure were mainly derived from UCP1- adipocytes rather than de novo ASPC differentiation in puPRAT". Pdgfra-negative progenitor cells may also contribute to these new beige adipocytes.

      Our sequencing data and many previous studies (Angueira et al., 2021; Burl et al., 2022; Dong et al., 2022) have shown that Pdgfra is a marker for all ASPCs. We will also check the adipocyte labelling pattern of mPRAT in the PdgfraCre;Ai14 mice. If all adipocytes are Tomato+, it suggests that adipocytes in mPRAT are all derived from Pdgfra-expressing cells. Also, the cold-induced adipocytes in mPRAT more closely resemble the brown adipocytes of iBAT than the beige adipocytes of iWAT (Figure 6E and S6K).

      Angueira, A.R., Sakers, A.P., Holman, C.D., Cheng, L., Arbocco, M.N., Shamsi, F., Lynes, M.D., Shrestha, R., Okada, C., Batmanov, K., et al. (2021). Defining the lineage of thermogenic perivascular adipose tissue. Nat Metab 3, 469-484. 10.1038/s42255-021-00380-0.

      Burl, R.B., Rondini, E.A., Wei, H., Pique-Regi, R., and Granneman, J.G. (2022). Deconstructing cold-induced brown adipocyte neogenesis in mice. Elife 11. 10.7554/eLife.80167.

      Dong, H., Sun, W., Shen, Y., Balaz, M., Balazova, L., Ding, L., Loffler, M., Hamilton, B., Kloting, N., Bluher, M., et al. (2022). Identification of a regulatory pathway inhibiting adipogenesis via RSPO2. Nat Metab 4, 90-105. 10.1038/s42255-021-00509-1.

      3. The UCP1Cre-ERT2; Ai14 system should be validated by showing Tomato and UCP1 co-staining right after the Tamoxifen treatment.

      We will inject Ucp1CreERT2;Ai14 mice at 1 and 6 months of age with tamoxifen and collect tissue one day after the last injection to check the overlap between the Tomato signal and UCP1 immunofluorescent staining.

      Reviewer #2 (Public Review):

      Summary:

      In the present manuscript, Zhang et al utilize single-nuclei RNA-Seq to investigate the heterogeneity of perirenal adipose tissue. The perirenal depot is interesting because it contains both brown and white adipocytes, a subset of which undergo functional "whitening" during early development. While adipocyte thermogenic transdifferentiation has been previously reported, there remain many unanswered questions regarding this phenomenon and the mechanisms by which it is regulated.

      Strengths:

      The combination of UCP1-lineage tracing with the single nuclei analysis allowed the authors to identify four populations of adipocytes with differing thermogenic potential, including a "whitened" adipocyte (mPRAT-ad2) that retains the capacity to rapidly revert to a brown phenotype upon cold exposure. They also identify two populations of white adipocytes that do not undergo browning with acute cold exposure.

      Anatomically distinct adipose depots display interesting functional differences, and this work contributes to our understanding of one of the few brown depots present in humans.

      Weaknesses:

      The most interesting aspect of this work is the identification of a highly plastic mature adipocyte population with the capacity to switch between a white and brown phenotype. The authors attempt to identify the transcriptional signature of this ad2 subpopulation, however, the limited sequencing depth of single nuclei somewhat lessens the impact of these findings. Furthermore, the lack of any form of mechanistic investigation into the regulation of mPRAT whitening limits the utility of this manuscript. However, the combination of well-executed lineage tracing with comprehensive cross-depot single-nuclei presented in this manuscript could still serve as a useful reference for the field.

      The sequencing depth of our data is comparable to, if not better than, that of previously published snRNA-seq studies on adipose tissue (Burl et al., 2022; Sarvari et al., 2021; Sun et al., 2020). Therefore, the depth of our data has reached the limit of the 3’ sequencing methods. Unfortunately, due to the size limitation of the adipocytes, it is also not feasible to sort them for Smart-seq.

      Burl, R.B., Rondini, E.A., Wei, H., Pique-Regi, R., and Granneman, J.G. (2022). Deconstructing cold-induced brown adipocyte neogenesis in mice. Elife 11. 10.7554/eLife.80167.

      Sarvari, A.K., Van Hauwaert, E.L., Markussen, L.K., Gammelmark, E., Marcher, A.B., Ebbesen, M.F., Nielsen, R., Brewer, J.R., Madsen, J.G.S., and Mandrup, S. (2021). Plasticity of Epididymal Adipose Tissue in Response to Diet-Induced Obesity at Single-Nucleus Resolution. Cell Metab 33, 437-453 e435. 10.1016/j.cmet.2020.12.004.

      Sun, W., Dong, H., Balaz, M., Slyper, M., Drokhlyansky, E., Colleluori, G., Giordano, A., Kovanicova, Z., Stefanicka, P., Balazova, L., et al. (2020). snRNA-seq reveals a subpopulation of adipocytes that regulates thermogenesis. Nature 587, 98-102. 10.1038/s41586-020-2856-x.

    1. Author Response

      Public Reviews:

      Roget et al. build on their previous work developing a simple theoretical model to examine whether ageing can be under natural selection, challenging the mainstream view that ageing is merely a byproduct of other biological and evolutionary processes. The authors propose an agent-based model to evaluate the adaptive dynamics of a haploid asexual population with two independent traits: fertility timespan and mortality onset. Through computational simulations, their model demonstrates that ageing can give populations an evolutionary advantage. Notably, this observation arises from the model without invoking any explicit energy tradeoffs, commonly used to explain this relationship.

      The model’s results are based on both numerical simulations and formal mathematical analysis.

      Additionally, the theoretical model developed here indicates that mortality onset is generally selected to start before the loss of fertility, irrespective of the initial values in the population. The selected relationship between the fertility timespan and mortality onset depends on the strength of fertility and mortality effects, with larger effects resulting in the loss of fertility and mortality onset being closer together. By allowing for a trans-generational effect on ageing in the model, the authors show that this can be advantageous as well, lowering the risk of collapse in the population despite an apparent fitness disadvantage in individuals. Upon closer examination, the authors reveal that this unexpected outcome is a consequence of the trans-generational effect on ageing increasing the evolvability of the population (i.e., allowing a more effective exploration of the parameter landscape), reaching the optimum state faster.

      The simplicity of the proposed theoretical model represents both the major strength and weakness of this work. On one hand, with an original and rigorous methodology, the logic of their conclusions can be easily grasped and generalised, yielding surprising results. Using just a handful of parameters and relying on direct competition simulations, the model qualitatively recapitulates the negative correlation between lifespan and fertility without requiring energy tradeoffs. This alone makes this work an important milestone for the rapidly growing field of adaptive dynamics, opening many new avenues of research, both theoretically and empirically.

      We thank the reviewers and editor for highlighting the importance of the work presented here.

      On the other hand, the simplicity of the model also makes its relationship with living organisms difficult to gauge, leaving open questions about how much the model represents the reality of actual evolution in a natural context.

      We presented, in both the Results and the Discussion, how the mathematical trade-offs between fertility and survival time give rise to (xb, xd) configurations representative of existing aging modes.

      In particular, a more explicit discussion of how the specifics of the model can impact the results and their interpretation is needed. For example, the lack of mechanistic details on the trans-generational effect on ageing makes the results difficult to interpret.

      We discussed the role of the transgenerational Lansing effect in terms of its function; no particular mechanism is needed beyond that function of a transgenerational negative effect. We reinforce this in the Discussion by adding the following sentence: “Regarding the nature of the transgenerational effect, our model is agnostic and the mere transmission of any negative effect would be sufficient to exert the function.”

      Even if analytical results are obtained, most of the observations appear derived from simulations as they are currently presented. Also, the choice of parameters for the simulations shown in the paper and how they relate to our biological knowledge are not fully addressed by the authors.

      The long-time limit of the system with and without the Lansing effect is based on analytical results later confirmed using numerical simulations. The choice of parameters is explained in the introduction as being the minimum set for defining a living organism. As for the parameters’ values, our numerical analysis gives a solution for any ib, id, xb and xd on R+, making the choice of initial values arbitrary.

      Finally, the conclusions of evolvability are insufficiently supported, as the authors do not show if the wider genotypic variability in populations with the ageing trans-generational effect is, in fact, selected.

      We neither show nor claim that evolvability per se is selected for; rather, the apparent advantage given by this transgenerational effect seems to be mediated by an increased genotypic/phenotypic variability conferred on the lineage, which we interpreted as evolvability.

    1. Author Response

      Reviewer #1 (Public Review):

      De Seze et al. investigated the role of guanine exchange factors (GEFs) in controlling cell protrusion and retraction. In order to causally link protein activities to the switch between the opposing cell phenotypes, they employed optogenetic versions of GEFs which can be recruited to the plasma membrane upon light exposure and activate their downstream effectors. Particularly the RhoGEF PRG could elicit both protruding and retracting phenotypes. Interestingly, the phenotype depended on the basal expression level of the optoPRG. By assessing the activity of RhoA and Cdc42, the downstream effectors of PRG, the mechanism of this switch was elucidated: at low PRG levels, RhoA is predominantly activated and leads to cell retraction, whereas at high PRG levels, both RhoA and Cdc42 are activated but PRG also sequesters the active RhoA, therefore Cdc42 dominates and triggers cell protrusion. Finally, they create a minimal model that captures the key dynamics of this protein interaction network and the switch in cell behavior.

      We thank reviewer #1 for this assessment of our work.

      The conclusions of this study are strongly supported by data. Perhaps the manuscript could include some further discussion to for example address the low number of cells (3 out of 90) that can be switched between protrusion and retraction by varying the frequency of the light pulses to activate opto-PRG.

      The low number of cells being able to switch can be explained by two different reasons:

      1) First, we were looking for clear inversions of the phenotype, with clear ruffles in the case of protrusion and clear retractions in the other case. We therefore discarded cells that showed in-between phenotypes, because we had no quantitative parameter to compare how protrusive or retractile they were. This reduced the number of switching cells.

      2) Second, we were limited by the dynamics of the optogenetic dimer used here. The control of the frequency was limited by the unbinding dynamics of the optogenetic dimer. This recruitment timescale (~20 s) is comparable to the deactivation dynamics of RhoA and Cdc42. Thus, the differences in frequency are smoothed, and we could not vary the frequency enough to increase the number of switches. Thanks to the model, we can predict that decreasing the unbinding rate of the optogenetic tool should allow us to increase the number of switching cells.
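      The smoothing argument in point 2 can be illustrated with a minimal sketch. This is not the authors' model: it assumes simple first-order recruitment kinetics, dR/dt = k_on * light(t) * (1 - R) - R / tau_off, and every name and value in it (recruitment_trace, ripple, k_on, the 10% duty cycle, the 5 s pulse period, the 2 s comparison timescale) is hypothetical, apart from the ~20 s timescale quoted above. Under these assumptions, a tool with a slow off-timescale should show less pulse-to-pulse modulation of membrane recruitment than a faster one driven by the same light pattern.

```python
def recruitment_trace(period_s, tau_off_s, duty=0.1, k_on=1.0,
                      t_end_s=600.0, dt=0.05):
    """Euler-integrate dR/dt = k_on*light(t)*(1 - R) - R/tau_off.

    R is the recruited fraction (0..1); light(t) is a square pulse train
    with the given period and duty cycle. All parameters are illustrative
    assumptions, not values fitted to any data.
    """
    steps_per_period = max(1, round(period_s / dt))
    on_steps = round(duty * steps_per_period)
    r = 0.0
    trace = []
    for i in range(int(t_end_s / dt)):
        # Light is on during the first `on_steps` steps of each period.
        light = 1.0 if (i % steps_per_period) < on_steps else 0.0
        r += dt * (k_on * light * (1.0 - r) - r / tau_off_s)
        trace.append(r)
    return trace


def ripple(trace, dt=0.05, tail_s=200.0):
    """Peak-to-trough swing of R over the final `tail_s` seconds."""
    tail = trace[-int(tail_s / dt):]
    return max(tail) - min(tail)


if __name__ == "__main__":
    # Same pulse train, two hypothetical off-timescales (seconds).
    for tau in (20.0, 2.0):
        print(tau, round(ripple(recruitment_trace(5.0, tau)), 3))
```

      In this toy picture the pulse pattern is only weakly imprinted on recruitment when the off-timescale is long relative to the pulse period, which is one way to read the statement that differences in frequency are smoothed.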

      We will add further discussion of this aspect to the manuscript.

      Also, the authors could further describe their "Cell finder" software solution that allows the identification of positive cells at low cell density, as this approach will be of interest for a wide range of applications.

      There is a detailed explanation of the ‘Cell finder’ in the Methods section. It is also available on GitHub at https://github.com/jdeseze/cellfinder and is currently being developed to be more user-friendly and properly commented.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript builds from the interesting observation that local recruitment of the DHPH domain of the RhoGEF PRG can induce local retraction, protrusion, or neither. The authors convincingly show that these differential responses are tied to the level of expression of the PRG transgene. This response depends on the Rho-binding activity of the recruited PH domain and is associated with and requires (co?)-activation of Cdc42. This begs the question of why this switch in response occurs. They use a computational model to predict that the timing of protein recruitment can dictate the output of the response in cells expressing intermediate levels and found that, "While the majority of cells showed mixed phenotypes irrespectively of the activation pattern, in few cells (3 out of 90) we were able to alternate the phenotype between retraction and protrusion several times at different places of the cell by changing the frequency while keeping the same total integrated intensity (Figure 6F and Supp Movie)."

      Strengths:

      The experiments are well-performed and nicely documented. However, the molecular mechanism underlying the shift in response is not clear (or at least clearly described). In addition, it is not clear that a prediction that is observed in ~3% of cells should be interpreted as confirming a model, though the fit to the data in 6B is impressive.

      Overall, the main general biological significance of this work is that RhoGEF can have "off target effects". This finding is significant in that an orthologous GEF is widely used in optogenetic experiments in Drosophila. It's possible that these findings may likewise involve phenotypes that reflect the (co-)activation of other Rho family GTPases.

      We thank reviewer #2 for assessing our work. Indeed, the main finding of this work is the change in the GEF function upon its change in concentration, which can be explained with a simple model supported by quantitative data. We think that the mechanism of the switch is quite clear, supported by the data showing the double effect of the PH domain and the activation of Cdc42. The few cells that are able to switch phenotype should be seen as honest data confirming that 1) concentration is indeed the main determinant of the protein’s function, and the switch is hard to obtain (which is also predicted by the model), and 2) the two underlying networks are activated at different timescales, which leaves some room for differential activation in the same cell. We are limited here by the dynamics of the optogenetic tool, as explained in the response to reviewer #1, and by the intrinsic cell-to-cell variability.

      Regarding the interpretation of our results as RhoGEF “off target effects”, we think that this might be too reductive. As stated in the discussion, we propose that the dual role of the RhoGEF could have physiological implications for the induction of front protrusions and rear retractions. While we do not demonstrate it here, it opens the door to further investigation.

      Weaknesses:

      The manuscript makes a number of untested assumptions and the underlying mechanism for this phenotypic shift is not clearly defined.

      We may not have been clear in our manuscript, but we think that the underlying mechanism for this phenotypic shift is clearly explained and backed up by the data and the literature. It relies on 1) the ability of PRG to activate both RhoA and Cdc42 and 2) the ability of the PH domain to directly bind to active RhoA (which is, as shown in the manuscript, necessary but not sufficient for protrusions to happen). The model succeeds in reproducing the RhoA data with only one free parameter and two independently fitted ones. It has long been known that activation of RhoA and Cdc42 leads to retraction and protrusion, respectively. Thus, we think that the switch is clearly and quantitatively explained.

      This manuscript is missing a direct phenotypic comparison of control cells to complement that of cells expressing RhoGEF2-DHPH at "low levels" (the cells that would respond to optogenetic stimulation by retracting); and cells expressing RhoGEF2-DHPH at "high levels" (the cells that would respond to optogenetic stimulation by protruding). In other words, the authors should examine cell area, the distribution of actin and myosin, etc in all three groups of cells (akin to the time zero data from figures 3 and 5, with a negative control). For example, does the basal expression meaningfully affect the PRG low-expressing cells before activation e.g. ectopic stress fibers? This need not be an optogenetic experiment, the authors could express RhoGEF2-DHPH without SspB (as in Fig 4G).

      We thank reviewer #2 for this suggestion. PRG-DHPH is known to affect the phenotype of the cell, as shown in Valon et al., 2017. We therefore focused on the changes caused by the change in optoPRG expression, in order to understand the difference in phenotype. However, we agree that these could be interesting data to add, and we will do the experiments for the revised version of the manuscript.

      Relatedly, the authors seem to assume ("recruitment of the same DH-PH domain of PRG at the membrane, in the same cell line, which means in the same biochemical environment." supplement) that the only difference between the high and low expressors are the level of expression. Given the chronic overexpression and the fact that the capacity for this phenotypic shift is not recruitment-dependent, this is not necessarily a safe assumption. The expression of this GEF could well induce e.g. gene expression changes.

      We agree with reviewer #2 that there could be changes in gene expression. We had specified this in the next point of this supplementary note, by saying “that overexpression has an influence on cell state, defined as protein basal activity or concentration before activation.” We are sorry if it was not clear and will change this sentence in the new version.

      One of the strengths of the model is that it does not require any change in absolute concentrations, besides that of the GEF. The model is designed to be minimal; it fits and explains the data well with very few parameters. We do not show that there is no change in concentration, but we show that it is not necessary to invoke one.

      We will add in the revised version of the manuscript a paragraph discussing this question.

      The third paragraph of the introduction, which begins with the sentence, "Yet, a large body of works on the regulation of GTPases has revealed a much more complex picture with numerous crosstalks and feedbacks allowing the fine spatiotemporal patterning of GTPase activities" is potentially confusing to readers. This paragraph suggests that an individual GTPase may have different functions whereas the evidence in this manuscript demonstrates, instead, that a particular GEF can have multiple activities because it can differentially activate two different GTPases depending on expression levels. It does not show that a particular GTPase has two distinct activities. The notion that a particular GEF can impact multiple GTPases is not particularly novel, though it is novel (to my knowledge) that the different activities depend on expression levels.

      We thank the reviewer for this remark; we did not intend to confuse the readers. Indeed, we think that this manuscript confirms the canonical view of the GTPases (as most optogenetic experiments have done in past years). We show here that it is more complicated at the level of the GEF. We agree that this is not particularly novel. However, to our knowledge, there is no example of such clear phenotypic control, explained solely by the change in concentration.

      We think that the last paragraph of the introduction is quite clear that it is the GEF itself that switches its function, and not the Rho-GTPases, but we will reconsider the phrasing of this paragraph for the revised version.

      Concerning the overall model summarizing the authors' observations, they "hypothesized that the activity of RhoA was in competition with the activity of Cdc42"; "At low concentration of the GEF, both RhoA and Cdc42 are activated by optogenetic recruitment of optoPRG, but RhoA takes over. At high GEF concentration, recruitment of optoPRG lead to both activation of Cdc42 and inhibition of already present activated RhoA, which pushes the balance towards Cdc42."

      These descriptions are not precise. What is the nature of the competition between RhoA and Cdc42? Is this competition for activation by the GEFs? Is it a competition between the phenotypic output resulting from the effectors of the GEFs? Is it competition from the optogenetic probe and Rho effectors and the Rho biosensors? In all likelihood, all of these effects are involved, but the authors should more precisely explain the underlying nature of this phenotypic switch. Some of these points are clarified in the supplement, but should also be explicit in the main text.

      We will make these descriptions more precise in the revised version of the manuscript. The competition between RhoA and Cdc42 was conceived as a competition between the retraction due to the protein network triggered by RhoA (through ROCK-Myosin and mDia-bundled actin) and the protrusion triggered by Cdc42 (through PAK-Rac-ARP2/3-branched actin). We will make this explicit in the main text.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      The findings of this study are valuable as they provide new insights into the role of acetylcholine in modulating sensory processing in the auditory cortex. This paper reports a systematic measurement of cell activity in the auditory cortex before and after applying ACh during an oddball and cascade sequence of auditory stimuli in anesthetized rats. The results presented are solid given the rigorous experimental design and statistical analysis. The conclusions are provocative and will interest researchers in auditory neuroscience and neuromodulation, as well as clinicians and individuals with auditory processing disorders. However, the findings support multiple interpretations, beyond that offered by the authors.

      Our reply: First and foremost, we would like to thank the editors and reviewers for their constructive criticisms, as well as their thoughtful and thorough evaluations of our manuscript. We greatly appreciate their assessment of the novelty and general significance of our study and have revised the manuscript according to their recommendations. Below, we include detailed responses and revisions based on the reviewers’ recommendations.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study examined the impact of exogenous microapplication of acetylcholine (Ach) on metrics of novelty detection in the anesthetized rat auditory cortex. The authors found that the majority of units showed some degree of modulation of novelty detection, with roughly similar numbers showing enhanced novelty detection, suppressed novelty detection, or no change. Enhanced novelty responses were driven by increases in repetition suppression. Suppressed novelty responses were driven by deviance suppression. There were no compelling differences seen between auditory cortical subfields or layers, though there was heterogeneity in the Ach effects within subfields. Overall, these findings are important because they suggest that fluctuations in cortical Ach, which are known to occur during changes in arousal or attentional states, will likely influence the capacity of individual auditory cortical neurons to respond to novel stimuli.

      Strengths:

      The work addresses an important problem in auditory neuroscience. The main strengths of the study are that the work was systematically done with appropriate controls (cascaded stimuli) and utilizes a classical approach that ensures that drug application is isolated to the micro-environment of the recorded neuron. In addition, the authors do not isolate their study to only the primary auditory cortex, but examine the impact of Ach across all known auditory cortical subfields.

      Our reply: Thank you very much for these supportive comments and the appreciation of our work.

      Weaknesses:

      1. As acknowledged by the authors, this study explicitly examines a phenomenon of high relevance to active listening but is done in anesthetized animals, limiting its applicability to the waking state.

      Our reply: We agree. This weakness was already recognized in the original manuscript and is now emphasized in the discussion.

      2. The authors do not make any attempt to determine, by spike shape/duration, if their units are excitatory or inhibitory, which may explain some of the variance of the data.

      Our reply: This is a very interesting question, and in fact, we have previously estimated whether neurons are excitatory or inhibitory based on the spike shape (Pérez-González et al., 2021). We originally sought to implement a similar analysis here, but when we tried to perform it, we found that in many cases the recordings had captured occasional spikes from other neurons. This contamination altered the average spike shape and thus precluded an accurate categorization. Therefore, we decided to discard this analysis for the sake of correctness. This weakness is further commented on in the discussion.

      3. The application of exogenous Ach, potentially in supra-physiological amounts, makes this study hard to extrapolate to a behaving animal. A more compelling design would be to block Ach, particularly at particular receptor types, to determine the effect of endogenous Ach.

      Our reply: We agree again with the reviewer. This weakness was already acknowledged, and it is now further highlighted in the discussion, where we comment that future studies should analyze the effect of muscarinic and nicotinic receptors and block them to potentially observe more physiologically comparable effects. Moreover, this issue is also related to a comment raised by Reviewer #2 on a possible ‘dose-response relationship’.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors investigate the effect of ACh on neuronal responses in the auditory cortex of anesthetized rats during an auditory oddball task. The paradigm consisted of two pure tones (selected from the frequency responses at each recording site) presented in a pseudo-random sequence. One tone was presented frequently (the "standard" tone) and the other infrequently (the "deviant" tone). The authors found that ACh enhances the detection of unexpected stimuli in the auditory environment by increasing or decreasing the neuronal responses to deviant and standard tones.

      Strengths:

      The study includes the use of appropriate and validated methodology in line with the current state-of-the-art, rigorous statistical analysis, and the demonstration of the effects of acetylcholine on auditory processing.

      Our reply: Thank you very much for these supportive comments and the appreciation of our work.

      Weaknesses:

      The study was conducted in anesthetized rats, and further research is needed to determine the behavioral relevance of these findings.

      Our reply: We agree. This weakness was already recognized and is now emphasized in the discussion.

      Reviewer #1 (Recommendations For The Authors):

      As outlined above, breaking out the units into those that are putative excitatory or inhibitory cells would be helpful, if possible. Other critiques are minor:

      1. "Acetylcholine", "ACh" and "Ach" are used throughout the manuscript. Please define the chosen abbreviation at first use, and be consistent.

      2. Line 116, remove comma after "ACh".

      3. Line 123, I would add "in the rat" at the end of the first sentence since the species was not mentioned up to this point.

      4. Fig 2 - it would be useful in the Figure (not just in the text) to label red as being the deviant tone and blue as being the standard.

      5. In many Figures (e.g., Fig 5), the term "effect" is found in the legend rather than "ACh". It would seem more intuitive to label these as "ACh".

      6. The AUC and MI interpretations are not clear. Both are metrics that quantify similarity but the authors state that when these values decrease the neurons are less able to discriminate between them (i.e., they are more similar). Some clarifying text would be useful.

      7. L276 - should "SI increase" be "SI decrease"?

      8. L285 - would replace "solely" with "primarily".

      9. Fig 7 - the authors may consider indicating with a label what the difference is between A and C compared to B and D.

      10. L634 - why were only females used?

      11. L646 - "bran" should be "brain".

      12. L649 - "homoeothermic" should be "homeothermic".

      13. L661 - "allowed to generate" should be "allowed the generation of".

      14. L670 - no need for both "about" and "approximately".

      15. L681 - please state what the search stimuli were.

      16. L688 - should be "closed-field".

      17. L754 - add a hyphen to "time-consuming".

      Our reply: Thank you very much for the detailed proofreading of the manuscript and for the suggestions. All of them have been clarified or implemented and corrected in the text.

      Reviewer #2 (Recommendations For The Authors):

      The authors could investigate the effects of different doses of ACh on auditory processing to determine if there is a dose-response relationship.

      Our reply: We agree that this is an interesting question. It is also related to a matter raised by Reviewer #1 concerning the issue of ‘exogenous ACh’.

      The study only investigated the effects of ACh on neuronal responses during an auditory oddball task. It would be interesting to investigate the effects of ACh on other aspects of auditory processing, such as sound localization or the discrimination of tones.

      Our reply: We agree that, while these aspects of auditory processing are very fascinating, they were outside the scope of the study, and not directly related to predictive coding and precision, so each one of these characteristics would be a full, future project in itself.

      The authors could provide more context on the significance of their findings for individuals with auditory processing disorders.

      Our reply: Thanks for the suggestion. It remains unclear how abnormal brainstem and cortical processing associated with auditory processing disorders arises (Moore, 2006, 2012). While we are not aware of any known direct connection between auditory processing disorders and acetylcholine, individuals with auditory processing disorders do have difficulties with auditory selective attention, so perhaps one could speculate that ACh, by modulating SSA/prediction error, could have some impact on encoding salient events, and if disrupted could lead to problems with selective attention. Moore (2012) speculated that auditory processing disorders may arise from unbalanced processing in bottom-up and top-down contributions.

      Since ACh has been implicated in some neurodegenerative diseases and neurodevelopmental disorders, we have also added to the Discussion a passage on a possible relationship between the modulatory effect of ACh on predictive coding (which involves bottom-up and top-down contributions) and auditory processing disorders. We also cite the recent work by Felix and colleagues (2019), which is the only study we have found on the effects of ACh on auditory processing disorders. They analyzed altered temporal processing at the level of the brainstem in mice deficient in the α7 subunit of the nicotinic acetylcholine receptor (α7-nAChR). After studying α7-nAChR knockout mice of both sexes and wild-type colony controls, they concluded that malfunction of the CHRNA7 gene, which encodes the α7-nAChR, may contribute to degraded spike timing in the midbrain, which may underlie the observed timing delay in the ABR signals. These authors propose that their findings are consistent with a role for the α7-nAChR in types of neurodevelopmental and auditory processing disorders. There is also evidence that cholinergic system dysfunction is related to the pathophysiology of Alzheimer’s disease (Pérez-González et al., 2022). For instance, dysfunction of the synapses of cholinergic neurons in the hippocampus and nucleus basalis of Meynert, as well as decreased choline acetyltransferase activity, is associated with memory disorders in Alzheimer’s disease (Hampel et al., 2018). Also, Alzheimer’s disease patients show reduced amounts of the vesicular ACh transporter in some brain areas (Aghourian et al., 2017). Finally, cholinesterase inhibitors seem to have some favorable effect in the treatment of Alzheimer’s disease patients (Sharma, 2019).

      Aghourian M, Legault-Denis C, Soucy J-P, Rosa-Neto P, Gauthier S, Kostikov A, et al. 2017. Quantification of brain cholinergic denervation in Alzheimer’s disease using PET imaging with [18F]-FEOBV. Mol. Psychiatry 22:1531–1538. doi: 10.1038/mp.2017.183

      Felix RA 2nd, Chavez VA, Novicio DM, Morley BJ, Portfors CV. 2019. Nicotinic acetylcholine receptor subunit α7-knockout mice exhibit degraded auditory temporal processing. J Neurophysiol. 122(2):451-465. doi: 10.1152/jn.00170.2019.

      Hampel H, Mesulam M-M, Cuello AC, Khachaturian AS, Vergallo A, Farlow MR, et al. 2018. Revisiting the Cholinergic Hypothesis in Alzheimer’s Disease: Emerging Evidence from Translational and Clinical Research. J. Prev. Alzheimers Dis. 6:1–14. doi: 10.14283/jpad.2018.43

      Moore DR. 2006. Auditory processing disorder (APD)-potential contribution of mouse research. Brain Res. 1091:200–206.

      Moore DR. 2012. Listening difficulties in children: bottom-up and top-down contributions. J Commun Disord. 45:411–418.

      Pérez-González D, Parras GG, Morado-Díaz CJ, Aedo-Sánchez C, Carbajal GV, Malmierca MS. 2021. Deviance detection in physiologically identified cell types in the rat auditory cortex. Hear Res. 399:107997. doi: 10.1016/j.heares.2020.107997

      Pérez-González D, Schreiner TG, Llano DA and Malmierca MS. 2022. Alzheimer’s Disease, Hearing Loss, and Deviance Detection. Front. Neurosci. 16:879480. doi: 10.3389/fnins.2022.879480

      Sharma K. 2019. Cholinesterase inhibitors as Alzheimer’s therapeutics. Mol. Med. Rep. 20:1479–1487. doi: 10.3892/mmr.2019.10374

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Heyndrickx et al describes protein crystal formation and function that bears similarity to Charcot-Leyden crystals made of galectin 10, found in humans under similar conditions. Therefore, the authors set out to investigate CLP crystal formation and their immunological effects in the lung. The authors reveal the crystal structure of both Ym1 and Ym2 and show that Ym1 crystals trigger innate immunity, activated dendritic cells in the lymph node, enhancing antigen uptake and migration to the lung, ultimately leading to induction of type 2 immunity.

      Strengths:

      We know a lot about expression levels of CLPs in various settings in the mouse but still know very little about the functions of these proteins, especially in light of their ability to form crystal structures. As such data presented in this paper is a major advance to the field.

      Resolving the crystal structure of Ym2 and the comparison between native and recombinant CLP crystals is a strength of this manuscript that will be a very powerful tool for further evaluation and understanding of receptor, binding partner studies including the ability to aid mutant protein generation.

      The ability to recombinantly generate CLP crystals and study their function in vivo and ex vivo has provided a robust dataset whereby CLPs can activate innate immune responses, aid activation and trafficking of antigen presenting cells from the lymph node to the lung and further enhances type 2 immunity. By demonstrating these effects the authors directly address the aims for the study. A key point of this study is the generation of a model in which crystal formation/function an important feature of human eosinophilic diseases, can be studied utilising mouse models. Excitingly, using crystal structures combined with understanding the biochemistry of these proteins will provide a potential avenue whereby inhibitors could be used to dissolve or prevent crystal formation in vivo.

      The data presented flows logically and formulates a well constructed overall picture of exactly what CLP crystals could be doing in an inflammatory setting in vivo. This leaves open a clear and exciting future avenue (currently beyond the scope of this work) for determining whether targeting crystal formation in vivo could limit pathology.

      Weaknesses:

      Although resolving the crystal structure of Ym2 in particular is a strength of the authors work, the weaknesses are that further work or even discussion of Ym2 versus Ym1 has not been directly demonstrated. The authors suggest Ym2 crystals will likely function the same as Ym1, but there is insufficient discussion (or data) beyond sequence similarity as to why this is the case. If Ym1 and Ym2 crystals function the same way, from an evolutionary point, why do mice express two very similar proteins that are expressed under similar conditions that can both crystalise and as the authors suggest act in a similar way. Some discussion around these points would add further value.

      We agree with the reviewer. We have further elaborated the discussion section to include these points, stating clearly that more research needs to be done using Ym2 crystals before we can draw parallels in vivo.

      Additionally, the crystal structure for Ym1 has been previously resolved (Tsai et al 2004, PMID 15522777) and it is unclear whether the data from the authors represents an advance in the 3D structure from what is previously known.

      The crystal structure of Ym1 has indeed been previously solved, and we refer to that paper. In addition, we also provide the crystal structure of in vitro grown Ym1, showing biosimilarity. This is a major finding for the field of crystallography, since it validates the concept that crystal structures generated in vitro can reflect in vivo grown structures. Moreover, the in vivo crystallization of Ym2 was unknown prior to this work, and is now clear as revealed by the ex vivo X-ray crystallography. The strength of our story is that we can now compare Ym1 and Ym2 crystal structures in detail.

      Whilst also generating a model to understand Charcot-Leyden crystals (CLCs), the authors fail to discuss whether crystal shape may be an important feature of crystal function. CLCs are typically needle like, and previous publications have shown using histology and TEM that Ym1 crystals are also needle like. However, the crystals presented in this paper show only formation of plate like structures. It is unclear whether these differences represent different methodologies (ie histology is 2D slides), or differences in CLP crystals that are intracellular versus extracellular. These findings highlight a key question over whether crystal shape could be important for function and has not been addressed by the authors.

      In contrast to the bipyramidal, needle-like CLC crystals formed by human galectin-10 protein (hexagonal space group P6522), the in vivo grown Ym1 and Ym2 crystals we were able to isolate for X-ray diffraction experiments had a plate-like morphology with crystallographic parameters identical to those of recombinant Ym1/Ym2 crystals (space group P21). We note that, depending on the viewing orientation, the thin plate-like Ym1 crystals may appear needle-like in histology and TEM images. In addition, we cannot fully exclude that Ym1 or Ym2 may crystallize in vivo in different space groups (which could result in different crystal morphologies for Ym1/Ym2), but we have no data to support this. Finally, it is also possible that plate-like structures break up in vivo along a long axis as a result of mechanical forces, and end up as rod- or needle-like shapes.

      Ym1/Ym2 crystals are often observed in conditions where strong eosinophilic inflammation is present. However, soluble Ym1 delivery in naïve mice shows crystal formation in the absence of a strong immune response. There is no clear discussion as to the conditions in which crystal formation occurs in vivo and how results presented in the paper in terms of priming or exacerbating an immune response align with what is known about situations where Ym1 and Ym2 crystals have been observed.

      Although Ym1 and Ym2 crystals are often observed in mice at sites of eosinophilic inflammation, they are not made by eosinophils, but mainly by macrophages and epithelial cells, respectively. In vitro, protein crystallization typically starts from supersaturated solutions that support crystal nucleation. Several factors such as temperature and pH can affect the solubility of Ym1 and Ym2 in vivo and thus affect the nucleation and crystallization process. For Ym1 and Ym2 we noticed in vitro that a small drop in pH facilitates the crystallization process. Although the physiological pH is 7.4, during inflammation, there is a drop in pH. This drop in pH is the result of the infiltration and activation of inflammatory cells in the tissue, which leads to an increased energy and oxygen demand, accelerated glucose consumption via glycolysis and thus increased lactic acid secretion. In addition, we cannot exclude that in vivo, the nucleation process for Ym1/Ym2 is facilitated by interaction with ligands in the extracellular space (e.g. polysaccharide ligands or other – yet to be identified – specific ligands to Ym1/Ym2).

      Reviewer #2 (Public Review):

      Summary:

      This interesting study addresses the ability of Ym1 protein crystals to promote pulmonary type 2 inflammation in vivo, in mice.

      Strengths:

      The data are extremely high quality, clearly presented, significantly extending previous work from this group on the type 2 immunogenicity of protein crystals.

      Weaknesses:

      There are no major weaknesses in this study. It would be interesting to see if Ym2 crystals behave similarly to Ym1 crystals in vivo. Some additional text in the Introduction and Discussion would enrich those sections.

      We agree that this would be interesting to investigate; however, we have chosen not to include recombinant Ym2 crystal data in this report. We have further elaborated the discussion section to include this point.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved experiments and to strengthen findings:

      I think additional data on the ability of Ym2 crystals to induce an immune response would be advantageous. I'm not by any means suggesting the authors repeat all the experiments with Ym2 crystals, but even just the ability to show that Ym2 could promote type 2 immunity in the acute OVA model, would help to strengthen the argument that these crystals in general function in a similar way. Alternatively, a discussion on whether these protein crystals may function in different scenarios/tissues or conditions could help in light of additional data

      We agree that this is an interesting point to investigate; however, we have chosen not to include recombinant Ym2 crystal data in this report. We have further elaborated the discussion section to include this point.

      Measuring IL-33 in lung tissue is difficult to interpret as cells will express intracellular IL-33 that is not active and may explain why the results in Fig 2D are not overly convincing. It could just be that Ym1 crystals are changing the number of cells expressing IL-33 (e.g. macrophages or type 2 pneumocytes). Did the authors also measure active IL-33 release in the BAL fluid, which may give a better indication of Ym1's ability to activate DAMPs?

      We also measured active IL-33 release in the BAL fluid, but due to the limited sample availability we could only measure this in one of the two repeat experiments, resulting in non-significant results for the BAL fluid. However, certainly for the 6h timepoint we saw a similar trend in the BAL fluid as in the lung tissue, meaning higher levels of IL-33 in the Ym1 crystal group compared to the PBS and soluble Ym1 group.

      Crystals in Fig 2F staining with Ym1 appear to be brighter in the soluble Ym1 group. Is this related to increased packing of Ym1 in the crystals formed in vivo as opposed to those formed in vitro? Aside from reduced amount of crystals that form when you give soluble Ym1, could the type of crystal also be influencing the ability of soluble Ym1 crystals to generate an immune response?

      Our X-ray diffraction experiments show that the packing of Ym1 is identical for in vivo and in vitro grown crystals. Possibly the apparent difference in brightness is caused by stochastic staining by the antibody. In this regard, we note that the crystals formed from soluble Ym1 after 24h can also appear less bright, similar to recombinant Ym1 crystals.

      Overall, the data and writing of the manuscript are presented to a very high standard.

      A few minor points:

      • Fig 2F - a little unsure what the number in the left top corner of the images represented.

      These numbers represent the picture numbers generated by the software, but as they don’t have any added value for the story, we removed these numbers from the images.

      • Not clear why two different expression vectors were used - one for Ym1 and one for Ym2?

      Because we observed that recombinant Ym2 is more poorly secreted in the mammalian cell culture supernatant as compared to recombinant Ym1, we produced Ym2 with an N-terminal hexahistidine-tag followed by a Tobacco Etch Virus (TEV)-protease cleavage site to facilitate its purification.

      Reviewer #2 (Recommendations For The Authors):

      The authors briefly outline in their Introduction potential sources of Ym1/2 in vivo, highlighting monocytes, M2 macrophages, alveolar macrophages, neutrophils and epithelial cells. Do DCs also make detectable/meaningful amounts of Ym1/2 in vivo, particularly in type 2 settings?

      In the introduction we only highlighted the main cellular sources of Ym1 and Ym2, but there is literature available stating/showing that Ym1/2 is not only expressed by macrophages, neutrophils, monocytes and epithelial cells, but can also be induced in DCs and mast cells. We added the word ‘mainly’ to this sentence in the introduction, to make clear that macrophages, neutrophils and monocytes are not the only sources of Ym1.

      Given the nicely demonstrated similarity of recombinant Ym1 and Ym2 crystals, I think it is important for the authors to include at least initial data on the outcome of recombinant Ym2 crystal admin to mice, in comparison to their Ym1 data.

      We agree that this is an interesting point to investigate; however, we have chosen not to include recombinant Ym2 crystal data in this report. We have further elaborated the discussion section to include this point.

      Given the generation of crystals following in vivo administration of soluble Ym1, albeit at a lower level than when crystals were administered, it would be interesting to see if increased concentrations of soluble material show a dose dependent increase in lung inflammation readouts.

      We agree that this would be an interesting point to investigate. Alongside this we could also titrate down the crystal dose, to see if there is a dose dependent decrease in lung inflammation readouts. However, at this time, we choose to not investigate this further.

      I couldn't easily follow the authors' Discussion about potential ability of anti Ym-1/2 Abs to dissolve Ym1/2 crystals (similar to what they have demonstrated for Abs vs Gal10 crystals). Have they addressed this possibility experimentally? If so, addition of such data to the manuscript would be extremely interesting, given the obvious potential Ym1/2 crystal dissolving Abs for investigation of the role of these in a range of different murine models of type 2 inflammation.

      We agree that the phrasing of this part of the discussion can be unclear/confusing. We rephrased this part to make it clearer. However, we did not address the possibility of Ym1/2 crystal dissolving antibodies experimentally.

      In the Results section, the authors briefly comment on the pro-type 2 nature of Ym1 crystals in relation to their previous work with uric acid and Gal10 crystals, proposing that the pulmonary type 2 response may be a 'generic response to crystals of different chemical composition'. The Discussion would be enriched by deeper exploration of this comment.

      We have further elaborated the discussion section including this point.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their thorough reading of the manuscript and insightful comments. We have responded to both the “public review” and the “recommendations” and feel that the manuscript is now significantly strengthened.

      Public Review comments

      Reviewer #1:

      Weaknesses:

      1. The abstract does not discuss the reduction of E-gel consumption that occurs after multiple days of exposure to the THC formulation, but rather implies that a new model for chronic oral self-administration has been developed. Given that only two days of consumption was assessed, it is not clear if the model will be useful to determine THC effects beyond the acute measures presented here. The abstract should clarify that there was evidence of reduced consumption/aversive effects with repeated exposures.

      Thank you for your observation. We have added language to address this in the manuscript and the abstract. The model developed in the manuscript is an acute exposure model, with the intention of further chronic exposure adaptations to be developed separately (page 2, line 29).

      1. In the results section, the authors sometimes describe effects in terms of the concentration of gel as opposed to the dose consumed in mg/kg, which can make interpretation difficult. For example, the text describing Figure 1i states that significant effects on body temperature were achieved at 4 mg CTR-gel and 5 mg THC-gel, but were essentially equivalent doses consumed? It would be helpful to describe what average dose of THC produced effects given that consumption varied within each group of mice assigned to a particular concentration.

      We thank the reviewer for this comment and have edited our text to clarify our results. For example, this point is further emphasized by the correlation of the data in Figure 1l-n showing the relationship between individual consumption and behavioral readouts (page 11, line 225-226).

      1. The description of the PK data in Figure 3 did not specify if sex differences were examined. Prior studies have found that males and females can exhibit stark differences in brain and plasma levels of THC and metabolites, even when behavioral effects are similar. However, this does depend on species, route, timing of tissue collection. It would be helpful to describe the PK profile of males and females separately.

      We did compare sex-dependent effects and found no significant effects after THC E-gel consumption. We’ve added additional language to address this point in the discussion (Supplementary tables T1 and T2).

      1. In Figure 5, it is unclear how the predicted i.p. THC dose could be 30 mg/kg when 30 mg/kg was not tested by the i.p. route according to the figure, and if it had been it would have likely been almost zero acoustic startle, not the increased startle that was observed in the 2 hr gel group. It seems more likely that it would be equivalent to 3 mg/kg i.p. Could there be an error in the modeling, or was it based on the model used for the triad effects? This should be clarified.

      We apologize for the confusion created by that data, and it has now been updated for clarity. The original ~30mg/kg was not a predicted dose consumed, but rather an expected dose consumed based on individual male v. female consumption data in Supplemental Figure S1b. For clarity on the figure, we’ve instead placed dashed lines that draw attention only to the predicted startle response expected from our THC-E-gel model. We have also updated the text which hopefully makes this clearer.

      Reviewer #2:

      Weaknesses:

      Certainly, more THC mediated behavioral outcomes could have been tested, but the work presents a proof-of-concept study to investigate acute THC treatment.

      It would have been interesting to know whether this form of application is also possible for a chronic treatment regimen.

      We agree that a chronic treatment regimen and additional behavioral outcomes are the next, most exciting steps for expanding this oral THC-E-gel consumption model, and something we are actively pursuing.

      Reviewer #3:

      Weaknesses:

      The main weaknesses of the manuscript revolve around clarification of the Methods section. All of these weaknesses are described in the "Recommendations to authors" section. Revising the manuscript would account for many of these weaknesses.

      Thank you for carefully reading through our methodology. We have made edits according to everything brought up in the recommendation section of reviewer comments.

      Recommendations for Authors

      Reviewer #1:

      Minor edits to the text:

      Abstract: "intraperitoneal contingent" should be "intraperitoneal noncontingent".

      Line 221, this sentence needs editing for clarity.

      Lines 249-250, incomplete sentence.

      Line 284, the word "activity" is missing from "locomotor between mice".

      Lines 299-301, incomplete sentence.

      Thank you for finding these mistakes. All these recommendations have been incorporated into the final publication.

      Reviewer #2:

      1. The typical THC tetrad includes catalepsy. Why was this behavioral outcome not monitored?

      We felt that locomotion, analgesia, and body temperature were robust behavioral readouts for monitoring cannabimimetic responses and that acoustic startle served as an additional, novel means of understanding THC-E-gel effects.

      1. Please specify the exact substrain of C57BL/6 (i.e., J or N or some other)

      C57BL/6J mice were used for the publication. This clarification has been made in the methods section.

      1. Figure S3 is not mentioned in the result part, but only in the discussion.

      Figure S3 is now referenced in the main body of the Results section.

      1. It might be interesting to follow up the issue that the individual THC consumption is considerable, as depicted in Fig. 1e (at high dose). This will presumably also lead to different behavioral responses. Or is there individual metabolism, also difference male vs. female?

      Thank you for the suggestion. We agree that the distribution of THC doses consumed (calculation based on weight) would be worth further investigating and have now included language about this (page 20, line 436). Please note that we did not find a sex difference (Supplemental Figure S1b), but it would be exciting to discover some biologically relevant cause such as individual absorption or metabolism.

      Reviewer #3:

      Major

      1. Methods: Were the observers of experiments blinded to animal treatment? Why or why not?

      Multiple investigators performed the behavioral measurements and were not blinded to mouse treatments, but they remained blind to the dose consumed by each mouse. Thus, because animals consumed THC gelatin of their own volition while having ad libitum access, we performed the correlational analysis presented in Figure 1 l-n.

      1. Methods: The authors could consider relating their study design to the ARRIVE guidelines and providing a statement as to whether their study adheres to these guidelines. Related to this, were mice provided with any environmental enrichment during the study?

      We followed the ARRIVE guidelines with the exception of investigator blinding (described above). Please note that mice were not provided with additional environmental enrichment during the study, a point that we specified in our methods (page 5, line 91).

      1. Methods / Results: In the Methods it is stated that the triad of cannabimimetic behaviors was measured 1 h post-injection or immediately after gelatin exposure. Why were these timepoints chosen? Perhaps this wording should be revised because measurements of cannabimimetic effects were taken several times after drug exposure. Peak i.p. drug may occur earlier than 1 h whereas peak oral drug effect is likely to occur over a longer time period (i.e., not immediately after) due to delays of absorption and first pass metabolism. Is it possible that the authors have underestimated oral drug effects by selecting these timepoints? Please discuss.

      We observed a reduction in locomotor activity starting 1 h after the beginning of exposure to the gelatin (Figure 2), suggesting initial cannabimimetic changes. Based on this observable response, we chose to measure all cannabimimetic behaviors immediately following gelatin exposure. The exposure timeline for i.p. injection (1 h post-injection) was selected based on a standard published protocol (Metna-Laurent et al, 2017).

      a. Pharmacodynamics: Related to this and because the aim of this paper is to establish a rodent oral dose model, could the authors discuss the need for better characterization of the time course of drug effects? For example, how might anti-nociception or locomotor activity vary following THC E-gel consumption? This is somewhat addressed in the locomotion time course in Figure 2G but could be elaborated on or discussed in more detail.

      We agree that future studies should include additional time points measuring behavioral changes. This important point is now emphasized in the discussion (page 21, line 455).

      b. Pharmacokinetics: Related to this point above, have the authors considered collecting blood or tissue samples from their i.p.-injected animals to assess drug pharmacokinetics as they relate to drug effect and as compared to oral THC consumption? I am not suggesting the authors conduct a completely new study for this manuscript; however, this could be raised as a future study and/or as a weakness of the current study.

      We did not measure blood and tissue concentrations after i.p. administration because several studies by our co-author, Dr. Daniele Piomelli, have already reported these values and established these pharmacokinetic measures. Thus, we chose to reference these studies. Please note that repeating such measurements would be labor intensive and an unnecessary use of federal NIH resources and animals, while being highly redundant with the existing literature.

      c. Minor, but related to these points: In the results, page 14 line 299: the first sentence of this paragraph is confusing as written. The Reviewer recognizes that the authors are relating the pharmacokinetic work to previously published findings, but still thinks that measuring and comparing THC levels from their cohort of i.p.-injected animals would have benefitted the present study.

      Thank you, this edit has been made in the manuscript.

      1. Methods, Histology: The methods as described do not contain sufficient detail regarding THC and THC metabolite quantification. In addition, it is not clear from this section what Histology was performed and how (no histology results appear in the manuscript). Please add more detail to this section of the Methods.

      We apologize for this typo and have corrected it in the methods section of the manuscript.

      1. Methods / Results: The statistics section requires additional detail regarding the rationale for tests being performed on different datasets. In addition, a description of the curve fitting used for data in figures 1H-J, 4B-D, and S4 would be helpful to the reader.

      Thank you, we have updated and provided more information regarding the curve fitting that was used in the methods and results section for the respective figure panels (page 9, line 183-184).

      Minor

      1. Throughout: The use of the phrase "high" dose is somewhat arbitrary and not defined relative to other doses of the THC formulation throughout the manuscript. The Reviewer suggests simply stating that THC was used, specifying the dose, or justifying in the Abstract and/or Introduction the classification of "high" based on relevant literature.

      Thank you for the observation. We have removed this ambiguity by specifically mentioning the dose that was consumed (e.g., abstract page 2, line 20).

      1. Abstract: define "CB1" in the abstract. Although this is a common abbreviation within the field, its use should be defined.

      We have added this definition in the abstract for clarification.

      1. Figure 2: why are the consumption panels B, C, and D given separate labels but the locomotor data are all labeled together as panel G?

      Thank you for the observation. We have adjusted the labeling so that it is consistent across both sets of panels.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Thank you very much for forwarding these two important reviews on our paper. Please find below our point-by-point responses addressing the ideas, arguments and points of concern raised by the reviewers. We explain how these points have been incorporated in the paper.

      We feel the review process has been a useful exercise and that the paper has greatly benefited in terms of clarity and accessibility. It is our hope that our findings may ignite renewed interest in unexplored and “unexpected” aspects of great ape vocal communication, inspire novel research, and invite bold new advances on the long-standing puzzle of language origins and evolution. In several relevant sections, we have also sought to explicitly address the point of doubt raised in eLife’s editorial assessment, published alongside the reviewed preprint of our paper. The editorial assessment stated that “…However the evidence provided to support the major claims of the paper is currently incomplete. Specifically, it is not yet clear how the rhythmic structuring found in these long calls is more similar to human language recursion per se rather than isochrony as a broader, more common phenomenon.” To directly clarify this point, we now provide various examples of how recursion is distinct from repetition, using everyday objects for an intuitive understanding (e.g., lines 43-51). We have also expanded the discussion to better contextualise and clarify the implications of our findings on language evolution theory. We hope this will help address the implicit request for clarification in the previous editorial assessment.

      Thank you very much for your kind and dedicated attention in the processing of our study.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study investigates the structuring of long calls in orangutans. The authors demonstrate long calls are structured around full pulses, repeated following a regular tempo (isochronic rhythm). These full pulses are themselves structured around different sub-pulses, themselves repeated following an isochronic rhythm. The authors argue this patterning is evidence for self-embedded, recursive structuring in orangutan long calls.

      The analyses conducted are robust and compelling and they support the rhythmicity the authors argue is present in the long calls. Furthermore, the authors went above and beyond and confirmed acoustically the sub-categories identified were accurate.

      We thank the reviewer for this important support regarding our methods and findings.

      However, I believe the manuscript would benefit from a formal analysis of the specific recursive patterning occurring in the long call. Indeed, as of now, it is difficult for the reader to identify what the authors argue to be recursion and distinguish it from simple repetitions of motifs, which is essential.

      We agree with the reviewer that the distinction between repetition and recursion is very important for the adequate interpretation of our findings. Following the reviewer’s point (and the Editorial Assessment), we have now rephrased several passages in the initial paragraph of the paper for added clarity, where recursion is introduced and explained. We now also provide various new examples of recursion in everyday life and popular culture to better illustrate in an easy and accessible way the fundamental nature of recursion. We then use two of these common examples (computer folders and Russian dolls) to specifically distinguish repetition from recursion.
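
      As a minimal illustrative sketch of the Russian-doll example mentioned above (it is not taken from the paper or its analyses, and the variable names and the `nesting_depth` helper are hypothetical), repetition places identical items side by side, whereas recursion places an item inside another item of the same kind:

```python
# Repetition: the same item placed side by side; no item contains another.
repeated_dolls = ["doll", "doll", "doll"]

# Recursion: each doll contains another doll of the same kind (Russian dolls).
# A doll is modelled here as a dict holding an optional "inside" doll.
nested_dolls = {"inside": {"inside": {"inside": None}}}


def nesting_depth(doll):
    """Count how many dolls are nested inside one another."""
    if doll is None:
        return 0
    # The function calls itself on the doll found inside the current one.
    return 1 + nesting_depth(doll["inside"])


print(len(repeated_dolls))          # number of dolls standing in a row
print(nesting_depth(nested_dolls))  # number of dolls nested within each other
```

      Both structures hold the same number of dolls, but only the second has a doll within a doll, which is the property the distinction turns on.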

      Although the authors already discuss briefly why linear patterning is unlikely, the reader would benefit from expanding on this discussion section and clarifying the argument here (a lay terminology might help).

      Corrected accordingly.

      I believe an illustration here might help. In the same logic, I believe a tree similar to the trees used in linguistics to illustrate hierarchical structuring would help the reader understand the recursive patterning in place here. This would also help get the "big picture", as Fig 1A is depicting a frustratingly small portion of the long call.

      We completely understand the reviewer’s concern here. As proposed by the reviewer, and in addition to changes in the Introduction (see above) and Discussion (see below), we have now added a new figure in the Discussion to help the reader get the “big picture” of our findings.

      We have also made revisions throughout the Introduction and Discussion to simplify the text, clarify our exposition and help the reader understand more intuitively the nature and relevance of our results.

      Notwithstanding these comments, this paper would provide crucial evidence for recursion in the vocal production of a non-human ape species. The implication it would have would represent a key shift in the field of language evolution. The study is very elegant and well-constructed. The paper is extremely well written, and the point of view adopted is original, well-argued and compelling.

      We are humbled by the reviewer’s words, and we thank the reviewer for attributing these qualities to our paper. This feedback reassures us of the disruptive potential that these and similar future findings may have on our understanding of language evolution.

      Reviewer #2 (Public Review):

      I am not qualified to judge the narrow claim that certain units of the long calls are isochronous at various levels of the pulse hierarchy. I will assume that the modelling was done properly. I can however say that the broad claims that (i) this constitutes evidence for recursion in non-human primates, (ii) this sheds light on the evolution of recursion and/or language in humans are, when not made trivially true by a semantic shift, unsupported by the narrow claims. In addition, this paper contains errors in the interpretation of previous literature.

      We report the first confirmed case of “vocal sequences within vocal sequences” in a wild nonhuman primate, namely a great ape. The currently prevailing models of language evolution often rest on the (purely theoretical) premise that such structures do not exist in any animal bar humans. We find the discovery of such structures in a wild great ape exciting, remarkable, and promising. We regret that the reviewer does not share this sentiment with us. We feel that the statement that these findings are trivial and narrow is unfounded.

      In order to clarify and better communicate the significance of our findings, we now explain in more detail in the Introduction and Discussion how the discovery of nested isochrony in wild orangutans promises to stimulate new series of studies in nature and captivity. Our findings dovetail nicely with previous captive studies that have shown that animals can learn how to recognise recursive patterns, and they invite new research efforts to investigate recursive abilities in nonhuman primates in the wild, in the absence of human priming.

      The main difficulty when making claims about recursion is to understand precisely what is meant by "recursion" (arguably a broader problem with the literature that the authors engage with). The authors offer some characterization of the concept which is vague enough that it can include anything from "celestial and planetary movement to the splitting of tree branches and river deltas, and the morphology of bacteria colonies". With this appropriately broad understanding, the authors are able to show "recursion" in orangutans' long calls. But they are, in fact, able to find it everywhere.

      The reviewer is correct in highlighting that recursion is ubiquitous in nature and this is something that we explicitly state in the paper. This only makes it all the more surprising that, when it comes to vocal combinatorics, recursion has only been described in human language and music, but in no other animals. If studies providing such evidence are known to the reviewer, we kindly request their corresponding references.

      In the new revised version, we have paid attention to this aspect raised by the reviewer, and we have sought to disambiguate that our observations pertain to temporal recursion. This clarification will hopefully allow a better understanding of our results.

      The sound of a plucked guitar string, which is a sum of self-similar periodic patterns, counts as recursive under their definition as well.

      The example pointed out here by the reviewer is factually correct; sound harmonics represent a recursive pattern of a fundamental frequency. (In fact, we explain this phenomenon in the Discussion.) The reviewer’s comment seems to offer an analogy to oscillatory phenomena in the physiology of the vocal folds, and so, it is misplaced with regard to our present study, which focused on vocal sequences. Admittedly, this misinterpretation may have been implicitly caused by our wording and we apologise for this. We now refer to “vocal combinatorics” instead of “vocal production” throughout the paper to avoid the reader considering that our findings pertain to the physiology of the vocal folds.

      One can only pick one's definition of recursion, within the context of the question of interest: evolution of language in humans. One must try to name a property which is somewhat specific to human language, and not a ubiquitous feature of the universe we live in, like self-similarity. Only after having carved out a sufficiently distinctive feature of human language, can we start the work of trying to find it in a related species and tracing its evolutionary history. When linguists speak of recursion, they speak of in principle unbounded nested structure (as in e.g., "the doctor's mother's mother's mother's mother ..."). The author seems to acknowledge this in the first line of the introduction: "the capacity to iterate a signal within a self-similar signal" (emphasis added). In formal language theory, which provides a formal and precise definition of one notion of recursivity appropriate for human language, unbounded iteration makes a critical difference: bounded "nested structures" are regular (can be parsed and generated using finite-state machines), unbounded ones are (often) context-free (require a more sophisticated automaton). The hierarchy of pulses and sub-pulses only has a fixed number of layers, moreover the same in all productions; it does not "iterate".
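
      The formal-language distinction invoked in this comment can be illustrated with a toy sketch. It is an assumption-laden illustration only, not an analysis from the paper or the review: the letters "a" and "b" are hypothetical stand-ins for opening and closing one level of nesting, and the function names are invented for this example. Nesting capped at a fixed depth is a finite set of patterns that a regular expression can list, whereas nesting of arbitrary depth requires matching counts that no fixed list of patterns covers.

```python
import re

# Bounded nesting (here at most 3 levels): the set of valid strings is finite,
# so a regular expression, i.e. a finite-state machine, can recognise it.
BOUNDED = re.compile(r"(?:ab|aabb|aaabbb)")


def is_nested_bounded(s):
    """Accept a^n b^n only for n = 1, 2 or 3."""
    return BOUNDED.fullmatch(s) is not None


def is_nested_unbounded(s):
    """Accept a^n b^n for any n >= 1 by comparing the two halves.

    Matching the number of a's to the number of b's for arbitrary n is the
    classic example of a context-free language that is not regular.
    """
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


# Four levels of nesting: beyond the bounded pattern, fine for the unbounded one.
print(is_nested_bounded("aaaabbbb"))
print(is_nested_unbounded("aaaabbbb"))
```

      In these terms, the reviewer's argument is that a hierarchy with a fixed number of layers resembles the first recogniser rather than the second.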

      The reviewer explains here how recursion, in its fully fledged form in modern language(s), is defined by linguistics. We fully agree and do not contest such descriptions and definitions in any way. These descriptions and definitions aim to describe how recursion operates today, not how it evolved. Nor do these descriptions and definitions generate data-driven, testable predictions about precursors or proto-states of recursion as used by modern language-able humans. This is scientifically problematic and heuristically unsatisfying regarding the open question of language evolution.

      Following human-specific definitions for recursion, as proposed by the reviewer, cannot per se be used to undertake a comparative approach to evolution because they leave nothing to compare recursion with in other (wild) species. Using human-specific definitions unavoidably leads to black-and-white notions that language is always absolutely present in humans and always absolutely absent in other animals, regardless of their degree of relatedness to humans. It is unpreventable that these descriptions flout foundational principles of evolution, such as descent with modification and shared ancestry.

      This conceptual problem is not new. Less than a century ago, it was believed that humans were the only tool users (thousands of examples are known today in nonhuman animals, including fish and invertebrates), and later, that humans were the only cultural animal (today it is known that migrating caribou and fruit flies can establish traditions based on social learning). We must follow in the footsteps of those who have helped redefine human nature in the past. As famously stated by Louis Leakey when presented with evidence for chimpanzee tool-use collected by Jane Goodall, “Now we must redefine tool, redefine man, or accept chimpanzees as human”. Therefore, as a matter of course, we must redefine recursion, embracing empirical (rather than purely theoretical) definitions that allow recursion to take on forms and functions different from those of modern language-able humans.

      Another point is that the authors don't show that the constraints that govern the shape of orangutan long calls are due to cognitive processes.

      The reviewer is indeed correct. This does not, however, refute our findings. We do not directly show that cognitive processes govern recursion in orangutan long calls. Instead, we show that the observed patterns cannot be explained by simple bodily or motoric processes, excluding therefore low-level explanations. With more than 50 years of accumulated field experience in primatology, this was the only possible way that our team found to go about conducting research and analyses on natural behaviour, in the wild, with a critically endangered primate. We would be very interested in learning from the reviewer what ethical and non-invasive methods, specific locations in the wild, and type of behavioural or socio-ecological data could be otherwise viably used to demonstrate what the reviewer requests. If other scientists believe that the patterns observed in wild orangutan long calls – three independent, but simultaneously-occurring recursive motifs – can be generated based on low-level physiological mechanisms alone, the burden of proof resides with them.

      Any oscillating system will, by definition, exhibit isochrony.

      We disagree with this statement. The example provided above by the reviewer themselves disproves the statement: a guitar string when struck is an oscillating system, but it is neither isochronic nor combinatorial. Isochrony cannot be established with single events, only with event sequences (in practice, ideally >3).

      For instance, human trills produce isochronous or near-isochronous pulses. No cognitive process is needed to explain this; this is merely the physics of the articulators. Do we know that the rhythm of the pulses and sub-pulses in orangutans is dictated by cognition as opposed to the physics of the articulators?

      The reviewer seems to misinterpret our results here. Our focus is on vocal combinatorics, not vocal fold oscillation (see previous response). We have now reworded all instances where the text could be unclear.

      Even granting the authors' unjustified conclusion that wild orangutans have "recursive" structures and that these are the result of cognition, the conclusions drawn by the authors are too often fantastic leaps of induction. Here is a cherry-picked list of some of the far-fetched conclusions: - "our findings indicate that ancient vocal patterns organized across nested structural strata were likely present in ancestral hominids". Does finding "vocal patterns organized across nested structural strata" in wild orangutans suggest that the same were present in ancestral hominids?

      Following the reviewer’s comment, we have now rephrased and toned down this passage, stating that such structures “may have been present” in ancestral hominids. We are grateful to the reviewer for this comment.

      • "given that isochrony universally governs music and that recursion is a feature of music, findings (sic.) suggest a possible evolutionary link between great ape loud calls and vocal music". Isochrony is also a feature of the noise produced by cicadas. Does this suggest an evolutionary link between vocal music and the noise of cicadas?

      We apologise, but it is unclear what the reviewer is exactly suggesting or proposing here. It seems as though it is believed that cicadas are as phylogenetically related to humans as great apes are. Our last common ancestor with great apes diverged about 10mya, but with cicadas 600mya. The last common ancestor with great apes was a great ape (or hominid). The human-cicada last common ancestor would have looked like a worm (it is probable it would already have a nervous bulge at the head, or “brain”). In order to avoid similar misinterpretations, we have now clarified in several instances that our study and interpretation of results are based on shared ancestry within the Hominid family.

      It seems that the reviewer may also be misinterpreting our findings. We do not simply report isochrony in a wild great ape (multiple references for isochronous calls in primates are provided in the Discussion). We report isochrony within isochrony in three non-exclusive rhythmic arrangements. In case the reviewer knows of a study on cicadas, or any non-human species, showing recursive sound combinatorics of this nature, we kindly request the citation. We can only hope that such new cases may be gradually unveiled in wild animals to help propel our general understanding of how incipient recursive vocal combinatorics in ancient hominids could have given rise to recursion as used today by language-able modern humans.

      Finally, some passages also reveal quite glaring misunderstandings of the cited literature. For instance:

      • "Therefore, the search for recursion can be made in the absence of meaning-base operations, such as Merge, and more generally, semantics and syntax". It is precisely Chomsky's (disputable) opinion that the main operation that governs syntax, Merge, has nothing to do with semantics. The latter is dealt with within a putative conceptual-intentional performance system (in Chomsky's terminology), which is governed by different operations.

      Following the reviewer’s comment, we have now removed “meaning-base operations, such as Merge, and more generally” from the target sentence in order to avoid confusion. Thank you.

      • "Namely, experimental stimuli have consisted of artificial recursive signal sequences organized along a single temporal scale (though not structurally linear), similarly with how Merge and syntax operate". The minimalist view advocated by Chomsky assumes that mapping a hierarchical structure to a linear order (a process called linearization) is part of the articulatory-perceptual system. This system is likewise not governed by Merge and is not part of "syntax" as conceived by the Chomskyan minimalists.

      Following the reviewer’s comment, we have now omitted the target sentence for added clarity.

      Reviewer #1 (Recommendations For The Authors):

      L55-67: I feel there is a step missing in the logic of the argumentation here. The studies cited by the authors here are mostly about syntactic-like structuring but not recursion. Hence when the authors mention in the next sentence that these studies investigate the perception of recursive signalling, it seems incorrect. I agree with the logic, but the references do not seem appropriate. I would further suggest that if there are no other references, that would make the introduction of the study here even easier: there is very little work investigating this capacity in non-human animals, let alone on a production perspective, therefore, the study conducted here is paramount and fills this important gap in the literature.

      We are grateful to Reviewer #1 for these comments, and we are honoured to hear that our findings are filling a literature gap. We have now carefully revised the manuscript, hopefully streamlining our line of reasoning and improving the paper’s overall readability. We agree that there is very little work investigating the spontaneous “production” of recursion in nonhuman animals. We decided to better detail the logic of our paper by clarifying the difference between recursion and repetition and clarifying that the motifs that we identify in wild orangutans represent a case of "temporal recursion".

      L59: Johan J should be removed (same in discussion).

      Removed, thanks.

      L60: For example is repeated twice, here and L55.

      We have rephrased this part of the manuscript, thanks.

      L72-73: If we consider the Watson et al., 2020 study an example of recursive perception (which I do not think is true), this was conducted using a passive design - i.e. with no active training.

      We have rephrased this part of the manuscript, thanks.

      L240-241: Again, non-adjacent dependency processing does not equal recursion.

      We agree that non-adjacent dependency processing does not equal recursion. We have now clarified this section accordingly.

      L269: one of the most.

      Corrected, thanks.

      L296: add space after settings.

      Corrected, thanks.

      Reviewer #2 (Recommendations For The Authors):

      In addition to the public portion of the review, I advise the authors to substantially alter their style of writing. The language used is not accurate and the intended meaning is often not clear. This makes it hard for any reader to follow the authors' reasoning fully. Below I list only a few of the egregious examples, but examples abound:

      • "this hints at a neuro-cognitive or neuro-computational transformation in the human brain": what meaning do the authors assign to "neuro-cognitive" and "neuro-computational"? What difference do they place between the two (so that they would be disjoined)? What "transformation" are we talking about? From what to what?

      • "However, recursive signal structures can also unfold in other manners, such as across nested temporal scales and in the absence of semantics (Fitch, 2017a), as in music.": what is meant here by nested temporal scales?

      • "The simultaneous occurrence of non-exclusive recursive patterns excludes the likelihood that orangutans concatenate long calls and their subunits in linear structure without any recursive processes": isn't there a more straightforward way to say "excludes the likelihood"? What is meant by "non-exclusive recursive patterns"?

      It seems that Reviewer #2 does not share our writing style. Nonetheless, we have tried to meet the reviewer halfway, clarifying throughout the new revised version our definitions, our line of argument, our motivations, our results, the context of our findings in what is known about recursion in animals, and the implication of our discovery for language evolution theory.

    1. Author Response

      The following is the authors’ response to the current reviews.

      We agree with the reviewer that the statistics are buried in a dense Excel file without a read-me page. We will address this by making a summary Excel page for p-values during the production process.


      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study uses genomically-engineered glypican alleles to demonstrate convincingly that Dally (not Dally-like protein [Dlp]) is the key contributor to formation of the Dpp/BMP morphogen gradient in the wing disc of Drosophila. The authors provide solid genetic evidence that, surprisingly, the core domain of Dally appears to suffice to trap Dpp at the cell surface. They conclude with a model according to which Dally modulates the range of Dpp signaling by interfering with Dpp's internalization by the Dpp receptor Thickveins.

      Public Reviews:

      Reviewer #1 (Public Review):

      How morphogens spread within tissues remains an important question in developmental biology. Here the authors revisit the role of glypicans in the formation of the Dpp gradient in wing imaginal discs of Drosophila. They first use sophisticated genome engineering to demonstrate that the two glypicans of Drosophila are not equivalent despite being redundant for viability. They show that Dally is the relevant glypican for Dpp gradient formation. They then provide genetic evidence that, surprisingly, the core domain of Dally suffices to trap Dpp at the cell surface (suggesting a minor role for GAGs). They conclude with a model that Dally modulates the range of Dpp signaling by interfering with Dpp's degradation by Tkv. These are important conclusions, but more independent (biochemical/cell biological) evidence is needed.

      As indicated above, the genetic evidence for the predominant role of Dally in Dpp protein/signalling gradient formation is strong. In passing, the authors could discuss why overexpressed Dlp has a negative effect on signaling, especially in the anterior compartment. The authors then move on to determine the role of GAG (=HS) chains of Dally. They find that in an overexpression assay, Dally lacking GAGs traps Dpp at the cell surface and, counterintuitively, suppresses signaling (fig 4 C, F). Both findings are unexpected and therefore require further validation and clarification, as outlined in a and b below.

      a. In loss of function experiments (dallyDeltaHS replacing endogenous dally), Dpp protein is markedly reduced (fig 4R), as much as in the KO (panel Q), suggesting that GAG chains do contribute to trapping Dpp at the cell surface. This is all the more significant that, according to the overexpression essays, DallyDeltaHS seems more stable than WT Dally (by the way, this difference should also be assessed in the knock-ins, which is possible since they are YFP-tagged). The authors acknowledge that HS chains of Dally are critical for Dpp distribution (and signaling) under physiological conditions. If this is true, one can wonder why overexpressed dally core 'binds' Dpp and whether this is a physiologically relevant activity.

      According to the overexpression assay, DallyDeltaHS seems more stable than WT Dally (Fig. 4B’, E’, 5A’, B’). As the reviewer suggested, we addressed the difference using the two knock-in alleles and found that DallyDeltaHS is more stable than WT Dally (Fig.4 L, M inset), further emphasizing the insufficient role of core protein of Dally for extracellular Dpp distribution.

      In summary, we showed that, although Dally interacts with Dpp mainly through its core protein from the overexpression assay (Fig. 4E, I), HS chains are essential for extracellular Dpp distribution (Fig. 4R). Thus, the core protein of Dally alone is not sufficient for extracellular Dpp distribution under physiological conditions. These results raise a question about whether the interaction of core protein of Dally with Dpp is physiologically relevant. Since the increase of HS upon dally expression but not upon dlp expression resulted in the accumulation of extracellular Dpp (Fig. 2) and this accumulation was mainly through the core protein of Dally (Fig. 4E, I), we speculate that the interaction of the core protein of Dally with Dpp gives ligand specificity to Dally under physiological conditions.

      To understand the importance of the interaction of core protein of Dally with Dpp under physiological conditions, it is important to identify a region responsible for the interaction. Our preliminary results overexpressing a dally mutant lacking the majority of core protein (but keeping the HS modified region intact) showed that HS chains modification was also lost. Although this is consistent with our results that enzymes adding HS chains also interact with the core protein of Dally (Fig. 4D), the dally mutant allele lacking the core protein would hamper us from distinguishing the role of core protein of Dally from HS chains.

      Nevertheless, we can infer the importance of the interaction of core protein of Dally with Dpp using dally[3xHA-dlp, attP] allele, where dlp is expressed in dally expressing cells. Since Dally-like is modified by HS chains but does not interact with Dpp (Fig. 2, 4), dally[3xHA-dlp, attP] allele mimics a dally allele where HS chains are properly added but interaction of core protein with Dpp is lost. As we showed in Fig.3O, S, the allele could not rescue dallyKO phenotypes, consistent with the idea that interaction of core protein of Dally with Dpp is essential for Dpp distribution and signaling and HS chain alone is not sufficient for Dpp distribution.

      b. The authors infer that dallycore (at least if overexpressed) can bind Dpp. This assertion needs independent validation by a biochemical assay, ideally with surface plasmon resonance or similar so that an affinity can be estimated. I understand that this will require a method that is outside the authors' core expertise but there is no reason why they could not approach a collaborator for such a common technique. In vitro binding data is, in my view, essential.

      We agree with the reviewer that a biochemical assay such as SPR would help us characterize the interaction of core protein of Dally and Dpp (if the interaction is direct), although the biochemical assay also would not demonstrate the interaction under physiological conditions.

      However, SPR has never been applied to Dpp, probably because functional refolded Dpp dimer purified from bacteria has previously been found to be stable only at low pH and to precipitate in normal-pH buffer (Groppe J, et al., 1998)(Matsuda et al., 2021). As the reviewer suggests, collaborating with experts is an important step in the future.

      Nevertheless, SPR was applied for the interaction between BMP4 and Dally (Kirkpatrick et al., 2006), probably because BMP4 is more stable in the normal buffer. Although the binding affinity was not calculated, SPR showed that BMP4 directly binds to Dally and this interaction was only partially inhibited by molar excess of exogenous HS, suggesting that BMP4 can interact with core protein of Dally as well as its HS chains. In addition, the same study applied Co-IP experiments using lysates of S2 cells and showed that Dpp and core protein of Dally are co-immunoprecipitated, although it does not demonstrate if the interaction is direct.

      In a subsequent set of experiments, the authors assess the activity of a form of Dpp that is expected not to bind GAGs (DppDeltaN). Overexpression assays show that this protein is trapped by DallyWT but not dallyDeltaHS. This is a good first step validation of the deltaN mutation, although, as before, an in vitro binding assay would be preferable.

      Our overexpression assays actually showed that DppDeltaN is trapped by DallyWT and by dallyDeltaHS at similar levels (Fig. 5C), indicating that interaction of DppDeltaN and HS chains of Dally is largely lost but DppDeltaN can still interact with core protein of Dally.

      We thank the reviewer for suggesting the in vitro experiment. Although we decided not to develop biophysical experiments such as SPR for Dpp in this study due to the reasons discussed above, we would like to point out that our result is consistent with a previous Co-IP experiment using S2 cells showing that DppDeltaN loses interaction with heparin (Akiyama2008).

      However, in contrast to our results, the same study also proposed, based on Co-IP experiments using S2 cells, that DppDeltaN loses interaction with Dally (Akiyama2008). It is hard to draw a conclusion from that experiment, since the western blot was saturated and lacked loading controls and normalization (Fig. 1C in Akiyama 2008), and negative in vitro experiments do not necessarily demonstrate the lack of interaction in vivo. One explanation for why the interaction was missed in the previous study is that some factors required for the interaction of DppDeltaN with core protein of Dally are missing in S2 cells. In this case, the in vivo interaction assay we used in this study has the advantage of robustly detecting the interaction.

      Nevertheless, the authors show that DppDeltaN is surprisingly active in a knock-in strain. At face value (assuming that DeltaN fully abrogates binding to GAGs), this suggests that interaction of Dpp with the GAG chains of Dally is not required for signaling activity. This leads the authors to suggest (as shown in their final model) that GAG chains could be involved in mediating the interactions of Dally with Tkv (and not with Dpp). This is an interesting idea, which would need to be reconciled with the observation that the distribution of Dpp is affected in dallyDeltaHS knock-ins (item a above). It would also be strengthened by biochemical data (although more technically challenging than the experiments suggested above). In an attempt to determine the role of Dally (GAGs in particular) in the signaling gradient, the paper next addresses its relation to Tkv. They first show that reducing Tkv leads to Dpp accumulation at the cell surface, a clear indication that Tkv normally contributes to the degradation of Dpp. From this they suggest that Tkv could be required for Dpp internalisation although this is not shown directly. The authors then show that a Dpp gradient still forms upon double knockdown (Dally and Tkv). This intriguing observation shows that Dally is not strictly required for the spread of Dpp, an important conclusion that is compatible with early work by Lander suggesting that Dpp spreads by free diffusion. These results show that Dally is required for gradient formation only when Tkv is present. They suggest therefore that Dally prevents Tkv-mediated internalisation of Dpp. Although this is a reasonable inference, internalisation assays (e.g. with anti-Ollas or anti-HA Ab) would strengthen the authors' conclusions especially because they contradict a recent paper from the Gonzalez-Gaitan lab.

      Thanks for suggesting the internalization assay. As we discussed in the discussion, our results suggest that extracellular Dpp distribution is severely reduced in dally mutants due to Tkv mediated internalization of Dpp (Fig. 6). Thus, extracellular Dpp available for labelling with nanobody is severely reduced in dally mutants, which can explain the reduced internalization of Dpp in dally mutants in the internalization assay. Therefore, we think that the nanobody internalization assay would not distinguish the two contradicting possibilities.

      The paper ends with a model suggesting that HS chains have a dual function of suppressing Tkv internalisation and stimulating signaling. This constitutes a novel view of a glypican's mode of action and possibly an important contribution of this paper. As indicated above, further experiments could considerably strengthen the conclusion. Speculation on how the authors imagine that GAG chains have these activities would also be warranted.

      Thank you very much!

      Reviewer #2 (Public Review):

      The authors are trying to distinguish between four models of the role of glypicans (HSPGs) on the Dpp/BMP gradient in the Drosophila wing, schematized in Fig. 1: (1) "Restricted diffusion" (HSPGs transport Dpp via repetitive interaction of HS chains with Dpp); (2) "Hindered diffusion" (HSPGs hinder Dpp spreading via reversible interaction of HS chains with Dpp); (3) "Stabilization" (HSPGs stabilize Dpp on the cell surface via reversible interaction of HS chains with Dpp that antagonizes Tkv-mediated Dpp internalization); and (4) "Recycling" (HSPGs internalize and recycle Dpp).

      To distinguish between these models, the authors generate new alleles for the glypicans Dally and Dally-like protein (Dlp) and for Dpp: a Dally knock-out allele, a Dally YFP-tagged allele, a Dally knock-out allele with 3HA-Dlp, a Dlp knock-out allele, a Dlp allele containing 3-HA tags, and a Dpp lacking the HS-interacting domain. Additionally, they use an OLLAS-tag Dpp (OLLAS being an epitope tag against which extremely high affinity antibodies exist). They examine OLLAS-Dpp or HA-Dpp distribution, phospho-Mad staining, adult wing size.

      They find that over-expressed Dally - but not Dlp - expands Dpp distribution in the larval wing disc. They find that the Dally[KO] allele behaves like a Dally strong hypomorph Dally[MH32]. The Dally[KO] - but not the Dlp[KO] - caused reduced pMad in both anterior and posterior domains and reduced adult wing size (particularly in the Anterior-Posterior axis). These defects can be substantially corrected by supplying an endogenously tagged YFP-tagged Dally. By contrast, they were not rescued when a 3xHA Dlp was inserted in the Dally locus. These results support their conclusion that Dpp interacts with Dally but not Dlp.

      They next wanted to determine the relative contributions of the Dally core or the HS chains to the Dpp distribution. To test this, they over-expressed UAS-Dally or UAS-Dally[deltaHS] (lacking the HS chains) in the dorsal wing. Dally[deltaHS] over-expression increased the distribution of OLLAS-Dpp but caused a reduction in pMad. Then they write that after they normalize for expression levels, they find that Dally[deltaHS] only mildly reduces pMad and this result indicates a major contribution of the Dally core protein to Dpp stability.

      Thanks for the comments. We actually showed that compared with Dally overexpression, Dally[deltaHS] overexpression only mildly reduces extracellular Dpp accumulation (Fig. 4I). This indicates a major contribution of the Dally core protein to interaction with Dpp, although the interaction is not sufficient to sustain extracellular Dpp distribution and signaling gradient.

      The "normalization" is a key part of this model, yet how the normalization was done is not mentioned. When they do the critical experiment, making the Dally[deltaHS] allele, they find that loss of the HS chains is nearly as severe as total loss of Dally (i.e., Dally[KO]). Additional experimental approaches are needed here to prove the role of the Dally core.

      Since the expression level of Dally[deltaHS] is higher than that of Dally when overexpressed, we normalized extracellular Dpp distribution (α-Ollas staining) against GFP fluorescent signal (Dally or Dally[deltaHS]). To do this in the previous version, we first extracted both signals along the A-P axis from the same ROI. The ratio was calculated by dividing the intensity of α-Ollas staining by the intensity of GFP fluorescent signal at a given position x. The average profile from each normalized profile was generated and plotted using the script described in the method (wingdisc_comparison.py), as for other pMad or extracellular staining profiles.

      Although this analysis provides normalized extracellular Dpp accumulation at different positions along the A-P axis, we are more interested in the total amount of Dpp or DppDeltaN accumulation upon Dally or dallyDeltaHS expression. Therefore, in the revised ms, we decided to normalize total amount of extracellular Dpp against the level of Dally or Dally[deltaHS] by dividing total signal intensity of extracellular Dpp staining (ExOllas staining) by total GFP fluorescent signal (Dally or Dally[deltaHS]) around the Dpp producing cells in each wing disc. Statistical analysis showed that accumulation of extracellular Dpp is only slightly reduced without HS chains (Fig.4I), indicating that Dally interacts with Dpp mainly through its core protein.
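
      The two normalizations described above can be sketched as follows. This is an illustrative sketch only and not the wingdisc_comparison.py script mentioned in the method; the function names and the use of plain lists of intensity values are assumptions made for the example.

```python
from statistics import mean


def ratio_profile(dpp, gfp):
    """Per-position normalization: Dpp staining intensity divided by
    GFP intensity at each position x along the A-P axis."""
    if len(dpp) != len(gfp):
        raise ValueError("profiles must cover the same positions")
    return [d / g for d, g in zip(dpp, gfp)]


def average_profile(profiles):
    """Average several normalized profiles position by position."""
    return [mean(column) for column in zip(*profiles)]


def total_ratio(dpp, gfp):
    """Whole-region normalization: total Dpp staining intensity divided
    by total GFP intensity within the same region of one wing disc."""
    return sum(dpp) / sum(gfp)
```

      The per-position ratio yields a normalized profile along the axis, whereas the total ratio yields one value per wing disc, which is the form suited to a between-genotype statistical comparison.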

      We agree with the reviewer that additional experimental approaches are needed to address the role of the core protein of Dally. As we discussed in the response to the reviewer1, to understand the importance of the interaction of core protein of Dally with Dpp, it is important to identify a region responsible for the interaction. Our preliminary results overexpressing a dally mutant lacking the majority of core protein (but keeping the HS modified region intact) showed that HS chains modification was also lost. Although this is consistent with our results that enzymes adding HS chains also interact with the core protein of Dally (Fig. 4D), the dally mutant allele lacking the core protein would hamper us from distinguishing the role of the core protein of Dally from HS chains.

      Nevertheless, we can infer the importance of the interaction of core protein of Dally with Dpp using dally[3xHA-dlp, attP] allele, where dlp is expressed in dally expressing cells. Since Dally-like is modified by HS chains but does not interact with Dpp (Fig. 2, 4), dally[3xHA-dlp, attP] allele mimics a dally allele where HS chains are properly added but interaction of core protein with Dpp is lost. As we showed in Fig.3O, S, the allele could not rescue dallyKO phenotypes, consistent with the idea that interaction of core protein of Dally with Dpp is essential for Dpp distribution and signaling.

      Prior work has shown that a stretch of 7 amino acids in the Dpp N-terminal domain is required to interact with heparin but not with Dpp receptors (Akiyama, 2008). The authors generated an HA-tagged Dpp allele lacking these residues (HA-dpp[deltaN]). It is an embryonic lethal allele, but they can get some animals to survive to larval stages if they also supply a transgene called “JAX” containing dpp regulatory sequences. In the JAX; HA-dpp[deltaN] mutant background, they find that the distribution and signaling of this Dpp molecule is largely normal. While over-expressed Dally can increase the distribution of HA-dpp[deltaN], over-expression of Dally[deltaHS] cannot. These latter results support the model that the HS chains in Dally are required for Dpp function but not because of a direct interaction with Dpp.

      Our overexpression assays actually showed that both Dally and Dally[deltaHS] can accumulate Dpp upon overexpression and the accumulation of Dpp is comparable after normalization (Fig. 5C), consistent with the idea that interaction of DppdeltaN and HS chains are largely lost. As the reviewer pointed out, these results support the model that the HS chains in Dally are required for Dpp function but not because of a direct interaction with Dpp.

      In the last part of the results, they attempt to determine if the Dpp receptor Thickveins (Tkv) is required for Dally-HS chains interaction. The 2008 (Akiyama) model posits that Tkv activates pMad downstream of Dpp and also internalizes and degrades Dpp. A 2022 (Romanova-Michaelides) model proposes that Dally (not Tkv) internalizes Dpp.

      To distinguish between these models, the authors deplete Tkv from the dorsal compartment of the wing disc and found that extracellular Dpp increased and expanded in that domain. These results support the model that Tkv is required to internalize Dpp.

      They then tested the model that Dally antagonizes Tkv-mediated Dpp internalization by determining whether the defective extracellular Dpp distribution in Dally[KO] mutants could be rescued by depleting Tkv. Extracellular Dpp did increase in the D vs V compartment, potentially providing some support for their model. However, no statistics are reported, and these are needed for full confidence in the results. The lack of statistics is particularly problematic (1) when they state that extracellular Dpp does not rise in ap>tkv RNAi vs ap>tkv RNAi, dally[KO] wing discs (Fig. 6E) or (2) when they state that the extracellular Dpp gradient expanded in the dorsal compartment when tkv was dorsally depleted in dally[deltaHS] mutants (Fig. 6I). These last two experiments are important for their model but the differences are assessed only visually. In fact, extracellular Dpp in ap>tkv RNAi, dally[KO] (Fig. 6B) appears to be lower than extracellular Dpp in ap>tkv RNAi (Fig. 6A), and the histogram of Dpp in ap>tkv RNAi, dally[KO] is actually a bit lower than Dpp in ap>tkv RNAi. But the authors claim that there is no difference between the two. Their conclusion would be strengthened by statistical analyses of the two lines.

      We provided statistics for all the quantifications for pMad and extracellular Dpp distribution as supplementary data. In the previous version, we argued that extracellular Dpp level in ap>tkvRNAi, dallyKO (Fig.6B) does not increase compared with that in ap>tkvRNAi (Fig.6A). Statistical analysis (t-test) showed that the extracellular Dpp level in Fig. 6B is similar to or lower than that in Fig. 6A (Fig. 6E), confirming our conclusion. Statistical analysis (t-test) also confirmed that extracellular Dpp distribution expanded when tkv was knocked down in dally[deltaHS] mutants (Fig. 6I).

      Strengths:

      1. New genomically-engineered alleles

      A considerable strength of the study is the generation and characterization of new Dally, Dlp and Dpp alleles. These reagents will be of great use to the field.

      Thanks. We hope that these resources are indeed useful to the field.

      2. Surveying multiple phenotypes

      The authors survey numerous parameters (Dpp distribution, Dpp signaling (pMad) and adult wing phenotypes) which provides many points of analysis.

      Thanks!

      Weaknesses:

      1. Confusing discussion regarding the Dally core vs HS in Dpp stability. They don't provide any measurements or information on how they "normalize" for the level of Dally vs Dally[deltaHS]. This is an important part of their model that currently is not supported by any measurements.

      We explained how we normalized in the above section and updated the method section in the revised ms.

      2. Lacking quantifications and statistical analyses:

      a. Why is statistical significance for the histograms (pMad and Dpp distribution) not supplied? These histograms provide the key results supporting the authors' conclusions but no statistical tests/results are presented. This is a pervasive shortcoming in the current study.

      Thanks. We provided t-test analyses together with the raw data as supplementary data.

      b. dpp[deltaN] with JAX transgene - it would strengthen the study to supply quantitative data on the percent survival/lethal stage of dpp[deltaN] mutants with or without the JAX transgene.

      In this study, we are interested in the role of dpp[deltaN] during wing disc development. Therefore, we decided not to perform a detailed analysis of the percent survival/lethal stage of dpp[deltaN] mutants with or without the JAX transgene in the current study. Nevertheless, the fact that the dpp[deltaN] allele is maintained as a balanced stock while the JAX;dpp[deltaN] allele can be maintained as a homozygous stock indicates that the lethality of the dpp[deltaN] allele arises at early stages. Indeed, our preliminary results showed that pMad signal is severely lost in the dpp[deltaN] embryo without JAX (data not shown), indicating that the allele is lethal at early embryonic stages.

      c. The graphs on wing size etc should start at zero.

      Thanks. We corrected this in the current ms.

      d. The sizes of histograms and graphs in each figure should be increased so that the reader can properly assess them. Currently, they are very small.

      Thanks. We changed the sizes in the current ms.

      The authors' model is that Dally (not Dlp) is required for Dpp distribution and signaling but that this is not due to a direct interaction with Dpp. Rather, they posit that Dally-HS antagonize Tkv-mediated Dpp internalization. Currently the results of the experiments could be considered consistent with their model, but as noted above, the lack of statistical analyses of some parameters is a weakness.

      Thanks. We have now performed and provided the statistical analyses in the revised ms.

      One problematic part of their result for me is the role of the Dally core protein (Fig. 7B). There is a mis-match between the over-expression results and Dally allele lacking HS (but containing the core). Finally, their results support the idea that one or more as-yet unidentified proteins interact with Dally-HS chains to control Dpp distribution and signaling in the wing disc.

      Our results simply suggest that Dpp can interact with Dally mainly through core protein but this interaction is not sufficient to sustain extracellular Dpp gradient formation under physiological conditions (dallyDeltaHS) (Fig. 4Q). We find that the mis-match is not problematic if the role of Dally is not simply mediated through interaction with Dpp. We speculate that interaction of Dpp and core protein of Dally is transient and not sufficient to sustain the Dpp gradient without HS chains of Dally stabilizing extracellular Dpp distribution by blocking Tkv-mediated Dpp internalization.

      There is much debate and controversy in the Dpp morphogen field. The generation of new, high quality alleles in this study will be useful to the Drosophila community, and the results of this study support the concept that Tkv but not Dally regulates Dpp internalization. Thus the work could be impactful and fuel new debates among morphogen researchers.

      Thanks.

      The manuscript is currently written in a manner that really is only accessible to researchers who work on the Dpp gradient. It would be very helpful for the authors to re-write the manuscript and carefully explain in each section of the results (1) the exact question that will be asked, (2) the prior work on the topic, (3) the precise experiment that will be done, and (4) the predicted results. This would make the study more accessible to developmental biologists outside of the morphogen gradient and Drosophila communities.

      Thanks. We modified the text and changed the order of Fig. 5. We hope that the changes make this study more accessible to developmental biologists outside of the field.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their feedback. Our response and a summary of the changes made to the manuscript are shown below. In addition to the changes made in response to the reviewers’ comments, we made the following changes to improve the manuscript:

      • We updated figures 8 and 9 using data with improved preprocessing and source reconstruction. We now also include graphical network plots. This helps in the cross-method (figure 8 vs 9) and cross-dataset (figure 9 vs 10) comparisons.

      • We added funding acknowledgments and a credit author statement.

      Reviewer #1 (Public Review):

      Summary:

      These types of analyses use many underlying assumptions about the data, which are not easy to verify. Hence, one way to test how the algorithm is performing in a task is to study its performance on synthetic data in which the properties of the variable of interest can be fixed a priori. For example, for burst detection, synthetic data can be generated by injecting bursts of known durations and checking whether the algorithm is able to pick them up. Burst detection is difficult in the spectral domain since direct spectral estimators have high variance (see Subhash Chandran et al., 2018, J Neurophysiol). Therefore, detected burst lengths are typically much lower than injected burst lengths (see Figure 3). This problem can be solved by doing burst estimation in the time domain itself, for example, using Matching Pursuit (MP). I think the approach presented in this paper would also work since this model is also trained on data in the time domain. Indeed, the synthetic data can be made more "challenging" by injecting multiple oscillatory bursts that are overlapping in time, for which a greedy approach like MP may fail. It would be very interesting to test whether this method can "keep up" as the data is made more challenging. While showing results from brain signals directly (e.g., Figure 7) is nice, it will be even more impactful if it is backed up with results obtained from synthetic data with known properties.

      We completely agree with the reviewer that testing the methods using synthetic data is an important part of validating such an approach. Each of the original papers that applies these methods to a particular application does this. The focus of this manuscript is to present a toolbox for applying these methods rather than to introduce/validate the methods themselves. For a detailed validation of the methods, the reader should see the citations. For example, the following paper introduces the HMM as a method for oscillatory burst detection:

      • A.J. Quinn, et al. “Unpacking transient event dynamics in electrophysiological power spectra”. Brain topography 32.6 (2019): 1020-1034. See figures 2 and 3 for an evaluation of the HMM’s performance in detecting single-channel bursts using synthetic data.

      We have added text to paragraph 2 in section 2.5 to clarify this burst detection method has been validated using simulated data and added references.
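
      As a rough illustration of the kind of synthetic-data check discussed above, the following sketch injects oscillatory bursts of known duration into noise and compares them with the bursts recovered by a simple amplitude threshold. It is not the HMM-based detector from the toolbox or from Quinn et al.; the function names, the 20 Hz burst frequency, the 250 Hz sampling rate, the noise level and the threshold detector itself are all assumptions made for illustration.

```python
import math
import random


def simulate_bursts(n_samples, fs, freq, bursts, noise_sd=0.1, seed=0):
    """Gaussian noise plus unit-amplitude sinusoidal bursts at known (start, stop) samples."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, noise_sd) for _ in range(n_samples)]
    for start, stop in bursts:
        for t in range(start, stop):
            signal[t] += math.sin(2 * math.pi * freq * t / fs)
    return signal


def moving_rms(signal, window):
    """Centred moving root-mean-square, used as a crude amplitude envelope."""
    half = window // 2
    envelope = []
    for t in range(len(signal)):
        segment = signal[max(0, t - half): t + half + 1]
        envelope.append(math.sqrt(sum(v * v for v in segment) / len(segment)))
    return envelope


def detect_bursts(signal, window, threshold, min_len):
    """Return (start, stop) ranges where the envelope exceeds the threshold
    for at least min_len consecutive samples."""
    envelope = moving_rms(signal, window)
    found, start = [], None
    for t, value in enumerate(envelope + [0.0]):  # sentinel closes a trailing burst
        if value > threshold and start is None:
            start = t
        elif value <= threshold and start is not None:
            if t - start >= min_len:
                found.append((start, t))
            start = None
    return found


fs = 250  # assumed sampling rate (Hz)
injected = [(500, 750), (1500, 1600)]  # known burst positions (samples)
x = simulate_bursts(3000, fs, freq=20.0, bursts=injected)
detected = detect_bursts(x, window=25, threshold=0.4, min_len=10)
for (a, b), (c, d) in zip(injected, detected):
    print(f"injected {(b - a) / fs:.3f} s, detected {(d - c) / fs:.3f} s")
```

      Comparing injected and detected durations in this way gives a direct measure of a detector's bias, and the same harness could in principle be reused with overlapping bursts or with a different detector in place of the threshold rule.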

      I was wondering about what kind of "synthetic data" could be used for the results shown in Figure 8-12 but could not come up with a good answer. Perhaps data in which different sensory systems are activated (visual versus auditory) or sensory versus movement epochs are compared to see if the activation maps change as expected. We see similarities between states across multiple runs (reproducibility analysis) and across tasks (e.g. Figure 8 vs 9) and even methods (Figure 8 vs 10), which is great. However, we should also expect the emergence of new modes specific to sensory activation (say auditory cortex for an auditory task). This will allow us to independently check the performance of this method.

      The following papers study the performance of the HMM and DyNeMo in detecting networks using synthetic data:

      • D. Vidaurre, et al. “Spectrally resolved fast transient brain states in electrophysiological data”. Neuroimage 126 (2016): 81-95. See figure 3 in this paper for an evaluation of the HMM’s performance in detecting oscillatory networks using simulation data.

      • C. Gohil, et al. “Mixtures of large-scale dynamic functional brain network modes”. Neuroimage 263 (2022): 119595. See figures 4 and 5 for an evaluation of DyNeMo performance in detecting overlapping networks and long-range temporal structure in the data.

      We have added text to paragraph 2 in section 2.5 to clarify these methods have been well tested on simulated data and added references.

      The authors should explain the reproducibility results (variational free energy and best run analysis) in the Results section itself, to better orient the reader on what to look for.

      Considering the second reviewer’s comments, we moved the reproducibility results to the supplementary information (SI). This means the reproducibility results are no longer part of the main figures/text. However, we have added some text to help the reader understand what aspects indicate the results are reproducible in section 2 of the SI.

      Page 15: the comparison across subjects is interesting, but it is not clear why sensory-motor areas show a difference and the mean lifetime of the visual network decreases. Can you please explain this better? The promised discussion in section 3.5 can be expanded as well.

      It is well known that the frequency and amplitude of neuronal oscillations change with age. E.g. see the following review: Ishii, Ryouhei, et al. "Healthy and pathological brain aging: from the perspective of oscillations, functional connectivity, and signal complexity." Neuropsychobiology 75.4 (2018): 151-161. We observe older people have more beta activity and less alpha activity. These changes are seen in time-averaged calculations, i.e. the amplitude of oscillations is calculated using the entire time series for each subject.

      The dynamic analysis presented in the paper provides further insight into how changes in the time-averaged quantities can occur through changes in the dynamics of frequency-specific networks. The sensorimotor network, which is a network with high beta activity, has a higher fractional occupancy. This indicates the change we observe in time-average beta power may be due to a longer amount of time spent in the sensorimotor network. The visual network, which is a network with high alpha activity, shows reduced lifetimes, which can explain the reduced time-averaged alpha activity seen with ageing.

      We hope the improved text in the last paragraph of section 3.5 clarifies this. It should also be taken into account that the focus of this manuscript is the tools rather than an in-depth analysis of ageing. We use the age effect as an example of the potential analysis this toolbox enables.
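
      To make the two summary statistics used in this response concrete, the following sketch computes fractional occupancy (the fraction of time points assigned to a state) and mean lifetime (the average duration of an uninterrupted visit to a state) from a hard state time course. The function names, the toy state sequence and the 250 Hz sampling rate are hypothetical and are not taken from the toolbox.

```python
from itertools import groupby


def fractional_occupancy(states, n_states):
    """Fraction of time points spent in each state."""
    return [sum(1 for s in states if s == k) / len(states) for k in range(n_states)]


def mean_lifetimes(states, n_states, fs):
    """Mean duration in seconds of uninterrupted visits to each state."""
    visits = {k: [] for k in range(n_states)}
    for state, run in groupby(states):
        visits[state].append(sum(1 for _ in run))
    return [
        (sum(visits[k]) / len(visits[k])) / fs if visits[k] else 0.0
        for k in range(n_states)
    ]


# Toy state time course (one state label per sample), assumed sampled at 250 Hz
states = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 2, 1]
fo = fractional_occupancy(states, n_states=3)
lt = mean_lifetimes(states, n_states=3, fs=250)
print("fractional occupancy:", fo)
print("mean lifetime (s):", lt)
```

      The two quantities are not redundant: a state can gain fractional occupancy because it is visited more often while its mean lifetime stays the same, which is why a change in one network's occupancy and in another network's lifetime can both be reported.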

      Reviewer #2 (Public Review):

      Summary:

      The authors have developed a comprehensive set of tools to describe dynamics within a single time-series or across multiple time-series. The motivation is to better understand interacting networks within the human brain. The time-series used here are from direct estimates of the brain's electrical activity; however, the tools have been used with other metrics of brain function and would be applicable to many other fields.

      Strengths:

      The methods described are principled, and based on generative probabilistic models.

      This makes them compact descriptors of the complex time-frequency data.

      Few initial assumptions are necessary in order to reveal this compact description.

      The methods are well described and demonstrated within multiple peer-reviewed articles.

      This toolbox will be a great asset to the brain imaging community.

      Weaknesses:

      The only question I had was how to objectively/quantitatively compare different network models. This is possibly easily addressed by the authors.

      We thank the reviewer for his/her comments. We address the weaknesses in our response in the “Recommendations For The Authors” section.

      Reviewer #1 (Recommendations For The Authors):

      Figure 2 legend: Please add the acronym for LCMV also.

      We have now done this.

      Section 2.5.1 page 8: the pipeline is shown in Figure 4, not 3.

      This has been fixed.

      Reviewer #2 (Recommendations For The Authors):

      This is a great paper outlining a resource that can be applied to many different fields. I have relatively minor comments apart from one.

      How does one quantitatively compare network descriptors (from DyNeMo and TDE-HMM for example)? At the moment the word 'cleaner' (P17) is used, but is there any non-subjective way? (eg Free energy/ cross validation etc). At the moment it is useful that one method gives a larger effect size (in a comparison between groups).. but could the authors say something about the use of these methods as more/less faithful descriptors of the data? Or in other words, do all methods generate datasets (from the latent space) that can be quantitatively compared with the original data?

      In principle, the variational free energy could be used to compare models. However, because we use an approximate variational free energy (an exact measure is not attainable) for DyNeMo and an exact free energy for the HMM, it is possible that any differences we see in the variational free energy between the HMM and DyNeMo are caused by errors in the DyNeMo approximation. This makes it unreliable for comparing across models. That said, we can still use the variational free energy to compare within models. Indeed, we use the variational free energy for quantitative model comparisons when we select the best run to analyse from a set of 10.

      One viable approach for comparing models is to assess their performance on downstream tasks. In this manuscript, examples of downstream tasks are the evoked network response and the young vs old group difference. We argue a better performance in the downstream task indicates a more useful model within that context. This performance is a quantitative measure. Note, there is no straightforward answer to which is the best model. It is likely different models will be useful for different downstream tasks.

      In terms of which model provides a more faithful description of the data, the more flexible generative model for DyNeMo means it will generate more realistic data. However, this doesn’t necessarily mean it’s the best model (for a particular downstream task). Both the HMM and DyNeMo provide complementary descriptions that can be useful.

      We have clarified the above in paragraph 5 of section 4.

      Other comments:

      • Footnote 6 - training on concatenated group data seems to be important. It could be more useful in the main manuscript where the limitations of this could be discussed.

      By concatenating the data across subjects, we learn a group-level model. By doing this, we pool information across all subjects to estimate the networks. This can lead to more robust estimates. We have moved this footnote to the main text in paragraph 1 of section 2.5 and added further information.

      • In the TDE burst detection section- please expand on why/how a specific number of states was chosen.

      As with the HMM dynamic network analysis, the number of states must be pre-specified. For burst detection, we are often interested in an on/off type segmentation, which can be achieved with a 2-state HMM. However, if there are multiple burst types, these will all be combined into a single ‘on’ state. Therefore, we might want to increase the number of states to model multiple burst types. Three states were chosen as a trade-off to stay close to the on/off description but allow the model to learn more than one burst type. We have added text discussing this in paragraph 4 of section 4.

      • Normally the value of free energy is just a function of the data - and only relative magnitude is important. I think figures (eg 7c) would be clearer if the offset could be removed.

      We agree only the relative magnitude is important. We added text clarifying this in section 2 of the SI. We think it would still be worthwhile to include the offset so that future users can be sure they have correctly trained a model and calculated the free energy.

      • Related to the above- there are large differences in model evidence shown between sets. Yet all sets are the same data, and all parameter estimates are more or less the same. Could the authors account for this please (i.e. is there some other parameter that differentiates the best model in one set from the other sets, or is the free energy estimate a bit variable).

      We would like to clarify that only the model parameters for the best run are shown in the group-level analysis. This is the run with the lowest variational free energy, which is highlighted in red. We have now clarified this in the caption of each figure. The difference in free energy for the best runs (across sets) is relatively small compared to the variation across runs within a set. If we were to plot the model parameters for each of the 10 runs in a set, we would see more variability. We have now clarified this in section 2 of the SI.

      Also note, the group analysis usually involves taking an average. Small differences in the variational free energy could reflect small differences in subject-specific model parameters, which are then averaged out, giving virtually identical group effects.
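
      The best-run selection described in this response amounts to picking, within a set of runs of the same model, the run with the lowest variational free energy. A minimal sketch follows; the helper name is hypothetical and the free energy values are made-up placeholders, not results from the manuscript.

```python
def best_run(free_energies):
    """Index of the run with the lowest variational free energy."""
    return min(range(len(free_energies)), key=free_energies.__getitem__)


# Made-up free energies for 10 runs of the same model with different initialisations
free_energies = [1523.4, 1519.8, 1530.1, 1518.2, 1525.0,
                 1521.7, 1519.1, 1527.3, 1520.5, 1522.9]
idx = best_run(free_energies)
spread = max(free_energies) - min(free_energies)
print(f"best run: {idx}, spread across runs: {spread:.1f}")
```

      Because only differences in free energy are meaningful, the spread across runs within a set is the natural scale against which to judge differences between the best runs of different sets.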

      • And related once again, if the data are always the same, I wonder if the free-energy plots and identical parameter estimates could be removed to free up space in figures?

      The reproducibility results have now been moved to the supplementary information (SI).

      • When citing p-values please specify how they are corrected (and over what please eg over states, nodes, etc?). This would be useful didactically as I imagine most users will follow the format of the presentation in this paper.

      We now include in the caption further details of how the permutation significance testing was done.

      • Not sure of the value of tiny power maps in 9C. Would consider making it larger or removing it?

      The scale of these power maps is identical to part (A.I). We have moved the reproducibility analysis to the SI, enlarged the figure and added colour bars. We hope the values are now legible.

      • Figure 3. I think the embedding in the caption doesn't match the figure (+-5 vs +-7 lags). Would be useful to add in the units of covariance (cii).

      The number of embeddings in the caption has been fixed. Regarding the units for the covariances, as this is simulated data there aren’t really any units. Note, there is already a colour bar to indicate the values of each element.
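
      For readers unfamiliar with the embedding mentioned here, time-delay embedding with lags from -L to +L replaces each sample of a channel with a window of 2L + 1 neighbouring samples, so ±5 lags gives 11 embedded channels and ±7 lags gives 15. The sketch below shows this for a single channel; the function name is hypothetical, and the toolbox's own ordering and edge handling may differ.

```python
def time_delay_embed(x, n_lags):
    """Embed a 1-D series using lags -n_lags..+n_lags.

    Each row is the window of 2 * n_lags + 1 samples centred on one time point;
    time points without a complete window are dropped.
    """
    width = 2 * n_lags + 1
    return [x[t: t + width] for t in range(len(x) - width + 1)]


x = list(range(20))
emb5 = time_delay_embed(x, n_lags=5)  # 11 embedded channels
emb7 = time_delay_embed(x, n_lags=7)  # 15 embedded channels
print(len(emb5), len(emb5[0]))
print(len(emb7), len(emb7[0]))
```

      The covariance of the embedded data is then a (2L + 1) by (2L + 1) matrix per original channel pair, which is why the lag count in a caption must match the size of the plotted covariance.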

      • Minimize variational free energy - it may be confusing for some readers that other groups maximize the negative free energy. Maybe a footnote?

      We thank the reviewer for their suggestion. We have added a footnote (1).

      • Final question- and related to the Magnetoencephalography (MEG) data presented. These data are projected into source space using a beamformer algorithm (with its own implicit assumptions and vulnerabilities). Would be interested in the authors' opinion on what is standing between this work and a complete generative model of the MEG data - i.e. starting with cortical electrical current sources with interactions modeled and a dynamic environmental noise model (i.e. packing all assumptions into one model)?

      In principle, there is nothing preventing us from including the forward model in the generative model and training on sensor level MEG data. This would be a generative model starting from the dipoles inside the brain to the MEG sensors. This is under active research. If the reviewer is referring to a biophysical model for brain activity, the main barrier for this is the inference of model parameters. However, note that the new inference framework presented in the DyNeMo paper (Gohil, et al. 2022) actually makes this more feasible. Given the scope of this manuscript is to present a toolbox for studying dynamics with existing methods, we leave this topic as future work.

    1. Author Response

      We would like to express our sincere gratitude to the editors and reviewers for their helpful comments and valuable suggestions, which provided us with an opportunity to further improve our research. Prior to submitting our final revision, we provide here our preliminary responses to the comments. Please find our detailed responses to the reviewers’ recommendations below.

      Reviewer #1 (Public Review):

      Summary:

      This study examines the spatial and temporal patterns of occurrence and the interspecific associations within a terrestrial mammalian community along human disturbance gradients. They conclude that human activity leads to a higher incidence of positive associations.

      Strengths:

      The theoretical framework of the study is brilliantly introduced. Solid data and sound methodology. This study is based on an extensive series of camera trap data. Good review of the literature on this topic.

      Weaknesses:

      The authors use the terms associations and interactions interchangeably.

      Response: This is not the case. In fact, we state specifically that "... interspecific associations should not be directly interpreted as a signal of biotic interactions between pairs of species…" However, co-occurrence can be an important predictor of likely interactions, such as competition and predation. We stand by our original text.

      It is not clear what the authors mean by "associations". A brief clarification would be helpful.

      Response: Our specific definition of what is meant here by spatial association can be found in the Methods section. To clarify, the calculation of the index of associations is based on the covariance for the two species of the residuals (epsilon) after consideration of all species-specific responses to known environmental covariates. These covariances are modelled to allow them to vary with the level of human disturbance, measured as human presence and human modification. After normalization, the final index of association is a correlation value that varies between -1 (complete disassociation) and +1 (complete positive association).
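
      The normalization step in this definition can be sketched as follows: each residual covariance is divided by the product of the two species' residual standard deviations, giving a correlation between -1 and +1. The function name and the covariance values are hypothetical, and the sketch covers a single disturbance level only; it is not the fitting procedure of the JSDM itself.

```python
import math


def cov_to_corr(cov):
    """Normalise a covariance matrix so that every entry lies in [-1, 1]."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
        for i in range(len(cov))
    ]


# Hypothetical residual covariance for three species at one disturbance level
omega = [
    [4.0, 1.0, -2.0],
    [1.0, 1.0, 0.0],
    [-2.0, 0.0, 4.0],
]
corr = cov_to_corr(omega)
for row in corr:
    print([round(v, 2) for v in row])
```

      In this toy case the first and second species have a positive association, the first and third a negative one, and the second and third none, after their shared environmental responses have already been removed.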

      Also, the authors do not delve into the different types of association found in the study. A more ecological perspective explaining why certain species tend to exhibit negative associations and why others show the opposite pattern (and thus, can be used as indicator species) is missing.

      Response: Suggesting the ecological underpinnings of the associations observed here would mainly be speculation at this point, but the associations demonstrated in this analysis do suggest promising areas for the more detailed research suggested.

      Also, the authors do not distinguish between significant (true) non-random associations and random associations. In my opinion, associations are those in which two species co-occur more or less than expected by chance. This is not well addressed in the present version of the manuscript.

      Response: Results were considered to be non-random if correlation coefficients (for spatial association) or overlap (for temporal association) fell outside of 95% Confidence Intervals. This is now stated clearly in the Methods section. In Supplementary Figures S2 and S3, p<0.01 levels are also presented.

      The obtained results support the conclusions of the study.

      Anthropogenic pressures can shape species associations by increasing spatial and temporal co-occurrence, but above a certain threshold, the positive influence of human activity in terms of species associations could be reverted. This study can stimulate further work in this direction.

      Reviewer #2 (Public Review):

      Summary:

      This study analyses camera trapping information on the occurrence of forest mammals along a gradient of human modification of the environment. The key hypotheses are that human disturbance squeezes wildlife into a smaller area or their activity into only part of the day, leading to increased co-occurrence under modification. The method used is joint species distribution modelling (JSDM).

      Strengths:

      The data source seems to be very nice, although since very little information is presented, this is hard to be sure of. Also, the JSDM approach is, in principle, a nice way of simultaneously analysing the data.

      Weaknesses:

      The manuscript suffers from a mismatch of hypotheses and methods at two different levels.

      1. At the lower level, we first need to understand what the individual species do and "like" (their environmental niche). That information is not presented, and the methods suggest that the representation of each species in the JSDM is likely to be extremely poor.

      Response: The response of each species to the environmental covariates provides a window into their environmental niche, encapsulated in the beta coefficients for each environmental covariate. This information is presented in Figure 2.

      2. The hypothesis clearly asks for an analysis of the statistical interaction between human disturbance and co-occurrence. Yet, the model is not set up this way, and the authors thus do a lot of indirect exploration, rather than direct hypothesis testing.

      Response: Our JSDM model is set up specifically to examine the effect of human disturbance on co-occurrence, after controlling for shared responses to environmental variables. It directly tests the first hypothesis, since, if an increase in indices of human disturbance had not tended to increase the measured spatial correlations between species as detected by the model, we would have rejected our stated hypothesis that human modification of habitats results in increased positive spatial associations between species.

      Even when the focus is not the individual species, but rather their association, we need to formulate what the expectation is. The hypotheses point towards presenting the spatial and the temporal niche, and how it changes, species for species, under human disturbance. To this, one can then add the layer of interspecific associations.

      Response: Examining each species one by one and how each one responds to human disturbance would miss the effects of any meaningful interactions between species. The analysis presented provides a means to highlight associations that would have been overlooked. Future research could go on to analyze the strongest associations in the community and the strongest effects of human disturbance so as to uncover the underlying interactions that give rise to them and the mechanisms of human impact. We believe that this will prove to be a much more productive approach than trying to tackle this problem species by species and pair by pair.

      The change in activity and space use can be analysed much more simply, by looking at the activity times and spatial distribution directly. It remains unclear what the contribution of the JSDM is, unless it is able to represent this activity and spatial information, and put it in a testable interaction with human disturbance.

      The topic is actually rather complicated. If biotic interactions change along the disturbance gradient, then observed data are already the outcome of such changed interactions. We thus cannot use the data to infer them! But we can show, for each species, that the habitat preferences change along the disturbance gradient - or not, as the case may be.

      Then, in the next step, one would have to formulate specific hypotheses about which species are likely to change their associations more, and which less (based e.g. on predator-prey or competitive interactions). The data and analyses presented do not answer any of these issues.

      Response: We suggest that the so-called “simpler” approach described above is anything but simple, and this is precisely what the Joint Species Distribution Model improves upon. As pointed out in the Introduction, simply examining spatial overlap is not enough to detect a signal of meaningful biotic interaction, since overlap could be the result of similar responses to environmental variables. With the JSDM approach, this would not be considered a positive association and would then not imply the possible existence of meaningful interaction.

      Another more substantial point is that, according to my understanding of the methods, the per-species models are very inappropriate: the predictors are only linear, and there are no statistical interactions (L374). There is no conceivable species in the world whose niche would be described by such an oversimplified model.

      Response: While interaction terms can be included in the JSDM, this would considerably increase the complexity of the models. In previous work, we have found no strong evidence for the importance of interaction terms and they do not improve the performance of the models.

      We have no idea of even the most basic characteristics of the per-species models: prevalences, coefficient estimates, D2 of the model, and analysis of the temporal and spatial autocorrelation of the residuals, although they form the basis for the association analysis!

      Response: The coefficient estimates for response to environmental variables used in the JSDM are provided in Figure 2.

      Why are times of day and day of the year not included as predictors IN INTERACTION with niche predictors and human disturbance, since they represent the temporal dimension on which niches are hypothesised to change?

      Also, all correlations among species should be shown for the raw data and for the model residuals: how much does that actually change and can thus be explained by the niche models?

      The discussion has little to add to the results. The complexity of the challenge (understanding a community-level response after accounting for species-level responses) is not met, and instead substantial room is given to general statements of how important this line of research is. I failed to see any advance in ecological understanding at the community level.

      Response: We agree that the community-level response to human disturbance is a complex topic, and we believe it is also a very important one. This research and its support of the spatial compression hypothesis, while not providing definitive answers to detailed mechanisms, open up new lines of inquiry that make it an important advance. For example, the strong effects of human disturbance on certain associations that were detected here could now be examined with the kind of detailed species-by-species and pair-by-pair analysis that this reviewer appears to demand.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The manuscript has helped address a long-standing mystery in splicing regulation: whether splicing occurs co- or post-transcriptionally. Specifically, the authors (1) uniquely combined smFISH, expansion microscopy, and live cell imaging; (2) revealed the ordering and spatial distribution of splicing steps; and (3) discovered that nascent, not-yet-spliced transcripts move more slowly around the transcription site and undergo splicing as they move through the clouds. Based on the experimental results, the authors suggest that the observation of co-transcriptional splicing in previous literature could be due to the limitation of imaging resolution, meaning that the observed co-transcriptional splicing might actually be post-transcriptional splicing occurring in proximity to the transcription site. Overall, the work presented here clearly provides a comprehensive picture of splicing regulation.

      Major points:

      1. Linearity of expansion microscopy. For Figure 2B, it would be helpful to display the same sample before and after expansion, just like Supplementary Figure 3, but with a transcription site and "cloud". In the current version, the transcription site looks quite different in the not-expanded (more green dots on the left) and expanded image (more green dots on the top).

      We thank the reviewer for this comment on linearity of expansion. Based on our prior manuscript (Chen et al 2015 Nature Methods. PMID: 27376770), we expect expansion microscopy to yield isotropic expansion. Indeed, as shown in Supplemental Figure 3, we confirmed that expansion of nuclei (3B, top) and transcripts (3B, bottom) is isotropic. Additionally, before splicing inhibition, we demonstrated the linearity of expansion for a transcription site (3B, left), shown at standard resolution with intron stain. The images shown in Figure 2B are meant solely to illustrate the change in resolution upon expansion, and are not meant to imply spatial matching between the expanded and unexpanded image. We apologize for the confusion and have clarified this in the figure legend for Figure 2.

      We also point the reader towards Supplemental Figure 4, in which we validate the use of expansion microscopy in these findings. We show that transcription sites in expanded samples were the same size as those imaged using stochastic optical reconstruction microscopy (STORM), demonstrating that expansion did not significantly alter the morphology of the site.

      1. FISH dot colocalization. What is the colocalization rate of FISH dots in general under experimental conditions? In addition, in Figures 2C and 2G, why do some 3'exon dots not have co-localized 5'exon dots?

      We thank the reviewer for asking for these important clarifications. Under standard (non-expanded) conditions, our colocalization of 3’ and 5’ spots varies by gene, but more than 75% of intron spots colocalize with exon spots for the vast majority of transcripts we evaluated. The percentage of colocalization for each gene and intron can be found in column 4 of Table 1.

      Regarding the second point—these individual images may not reflect the actual quantitative number of spot counts at the site, as these transcription sites have a sizable Z dimension that is difficult to capture in one image, and certain dyes are more easily visually distinguished in contrasted images than others. These factors may cause some 3’ spots to appear without a corresponding co-localized 5’ spot in these images. We refer the reviewer to Supplemental Figure 4C for quantitative spot counting of an expanded transcription site, for which there are a similar number of 3’ end and 5’ end spots within the entire Z-stacked image. Importantly, these transcription site clouds contain longer, unspliced transcripts, potentially leading to further separation between the 5’ and 3’ ends of a single transcript when compared to a cytoplasmic, spliced transcript (quantified in Figure 2I).
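
      The colocalization percentage discussed above can be illustrated with a minimal sketch. It assumes spots are given as 3D coordinates and that two spots count as colocalized when they lie within a chosen distance; the function name `colocalized_fraction`, the coordinates, and the distance threshold are all hypothetical and are not taken from the manuscript.

```python
from math import dist

def colocalized_fraction(spots_a, spots_b, max_dist):
    """Fraction of spots in channel A that have at least one
    channel-B spot within max_dist (same units as the coordinates)."""
    if not spots_a:
        return 0.0
    hits = sum(
        any(dist(a, b) <= max_dist for b in spots_b)
        for a in spots_a
    )
    return hits / len(spots_a)

# Hypothetical (x, y, z) coordinates for two channels.
intron_spots = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
exon_spots = [(0.5, 0.0, 0.0)]
fraction = colocalized_fraction(intron_spots, exon_spots, max_dist=1.0)
print(f"colocalized fraction: {fraction:.2f}")
```

      Because the search runs over all three coordinates, a spot pair separated mainly in Z is still counted, which is the situation a single image plane cannot capture.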

      1. It would be helpful if the authors uploaded a few examples of live cell imaging movies.

      Certainly! Please refer to the new Supplementary Movies 1-3 for representative examples of live cell imaging data.

      1. It is recommended to double-check the text for errors.

      We apologize for errors in the original manuscript, and have made the appropriate corrections.

      Reviewer #2 (Public Review):

      Allison Coté et al. investigated the ordering and spatial distribution of nascent transcripts in several cells using smFISH, expansion microscopy, and live-cell imaging. They find that pre-mRNA splicing occurs post-transcriptionally at the clouds around the transcription start site, termed the transcription site proximal zone. They show that pre-mRNA may undergo continuous splicing when they pass through the zone after transcription. These data suggest a unifying model for explaining previously reported co-transcriptional splicing events and provide a direction for further study of the nature of the slow-moving zone around the transcription start site.

      This paper is well-written. The findings are very important, and the data supports the conclusions well. However, some aspects of the image and description need to be clarified and revised.

      The authors describe Figure 4E and 4F results in the main text as that "we performed RNA FISH simultaneously with immunofluorescence for SC35, a component of speckles, and saw that this compartmentalized pre-mRNA did indeed appear near nuclear speckles both before (Supplementary Figure 6C) and after (Figure 4E) splicing inhibition." However, no SC35 staining is shown in the Figure 4E. A similar situation happened in describing Figure 4F.

      We thank the reviewer for noting this error. We mistakenly referred in the text to Figure 4E when we meant Figure 4G, which shows combined RNA FISH and SC35 immunofluorescence and demonstrates compartmentalization within nuclear speckles. Figures 4E and 4F do not show SC35 immunofluorescence. We have altered the text and figure captions accordingly.

      Recommendations for the authors: please note that you control which revisions to undertake from the public reviews and recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Minor points:

      1. For Figures, it would be better to mark co-transcriptional and proximal post-transcriptional splicing in a clearer way. Like in Figure 1A, the simulated RNA FISH signals are almost identical across two conditions, which is a bit confusing. Overlapping and close proximity shall be better illustrated in related figures.

      We thank the reviewer for these suggestions. We have iterated these figures through multiple revisions and have found that these diagrams tend to resonate the most, so we have elected to keep them as is, but we do appreciate the suggestion.

      1. May include some details of expansion microscopy in the last paragraph of the Introduction. For example, why introduce expansion microscopy? To what level it can help overcome the diffraction limit?

      We thank the reviewer for this comment, and have added additional text to this paragraph to further set up the use of expansion microscopy.

      1. Double-check the formatting. Some sub-titles are in Bold, some in Italic.

      We apologize for any formatting errors, and have made the appropriate corrections.

      1. Please double-check the writing. I find many incompatible parts across the manuscript. For example, as described in the Figure 1D caption, there aren't "first" and "second" graphs in the figure. Moreover, some writings require additional refinement. For instance, in the Introduction part, the paragraph discussing RNA imaging, various techniques (such as FISH and live imaging), and concerns (such as microscopy resolution, chromatin fraction, and limitations related to reporter genes) are intertwined without clear indexing or logical structuring. Similar cases in other paragraphs too. Last but not least, I can even find repetitive sentences across the manuscript. For instance, I believe that the authors forgot to delete "By distinguishing the separate fluorescent signals from probes bound to exons and introns, we could visualize splicing intermediates (represented by colocalized intron and exon spots) relative to the site of transcription (represented by bright colocalized intron and exon spots) and fully spliced products (represented by exon spots alone)." in the first paragraph of the Results part, as the exact same sentence re-occurs right after. I've only listed a few examples here. Please refine the manuscript.

      We apologize for any errors in the original manuscript, and have made the appropriate corrections.

      Reviewer #2 (Recommendations For The Authors):

      1. The sentence "By distinguishing the separate fluorescent signals from probes bound to exons and introns, we could visualize splicing intermediates (represented by colocalized intron and exon spots) relative to the site of transcription (represented by bright colocalized intron and exon spots) and fully spliced products (represented by exon spots alone)." is accidentally repeated twice, one of them should be deleted.

      We apologize for this duplication, and have made the appropriate correction.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the two reviewers very much for their careful review and valuable comments. In response to these comments, the following revisions have been made. First, we have performed a new analysis on human accelerated regions (HARs) recently reported by the Zoonomia Project. Second, we have presented more data on experimentally detected and computationally predicted DBSs of MALAT1, NEAT1, and MEG3. Third, we have added details on the RNA-seq data processing and subsequent differential expression testing to the Materials and Methods section. Fourth, we have clarified some details on the human ancestor sequence and the use of parameters and thresholds. Six new citations are added. In addition, we have carefully polished the main text. We hope these revisions, together with the Responses-to-Reviewers, will help the reader better understand the paper.

      eLife assessment

      In this valuable manuscript, the authors attempt to examine the role of long non-coding RNAs (lncRNAs) in human evolution, through a set of population genetics and functional genomics analyses that leverage existing datasets and tools. Although the methods are at times inadequate - for example, suitable methods and/or relevant controls are lacking at many points, and selection is inferred sometimes too quickly - the results nonetheless point towards a possible contribution of long non-coding RNAs to the evolution of human biology and they suggest clear directions for future, more rigorous study.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      While DNA sequence divergence, differential expression, and differential methylation analysis have been conducted between humans and the great apes to study changes that "make us human", the role of lncRNAs and their impact on the human genome and biology has not been fully explored. In this study, the authors computationally predict HSlncRNAs as well as their DNA Binding sites using a method they have developed previously and then examine these predicted regions with different types of enrichment analyses. Broadly, the analysis is straightforward and after identifying these regions/HSlncRNAs the authors examined their effects using different external datasets.

      Strengths/weaknesses

      By and large, the analysis performed is dependent on their ability to identify HSlncRNAs and their DBS. I think that they have done a good job of showing the performance metrics of their methods in previous publications. Thereafter, they perform a series of enrichment-type analyses that have been used in the field for quite a while now to look at tissue-specific enrichment, or region-specific enrichment, or functional enrichment, and I think these have been carried out well. The authors achieved the aims of their work. I think one of the biggest contributions that this paper brings to the field is their annotation of these HSlncRNAs. Thus a major revisionary effort could be spent on applying their method to the latest genomes that have been released so that the community could get a clean annotation of newly identified HSlncRNAs (see comment 2).

      Comments

      1. Though some of their results about certain HSlncRNAs having DBSs in all genes is rather surprising/suspicious, I think that broadly their process to identify and validate DBSs is robust, they have multiple lines of checks to identify such regions, including functional validation. These predictions are bound to have some level of false positive/negative rate and it might be nice to restate those here and on what experiment/validation data these were conducted. However, the rest of their analysis comprises different types of enrichment analysis which shouldn't be affected by outlier HSlncRNAs if indeed their FPR/FNR are low.

      2. There are now several new genomes available as part of the Zoonomia consortium and 240 Primate consortium papers released. These papers have re-examined some annotations such as Human Accelerated Regions (HARs) and found with a larger dataset as well as better reference genomes, that a large fraction of HARs were actually incorrectly annotated - that is that they were also seen in other lineages outside of just the great apes. If these papers have not already examined HSlncRNAs, the authors should try and re-run the computational predictions with this updated set and then identify HSlncRNAs there. This might help to clarify their signal and remove lncRNAs that might be present in other primates but are somehow missing in the great apes. This might also help to mitigate some results that they see in section 3 of their paper in comparing DBS distances between archaics and humans.

      Responses:

      (1) Thanks for the good suggestion. We have checked the Zoonomia reported genomes and found that new primate genomes are monkeys and lemurs but not apes (Zoonomia Consortium. Nature 2023. https://doi.org/10.1038/s41586-020-2876-6), and the phylogenetic relationships between monkeys and humans are much more remote than those between apes and humans. In addition, the Zoonomia project did target identifying new lncRNA genes.

      (2) We have examined the Zoonomia-reported HARs (Keough et al. Science 2023. DOI: 10.1126/science.abm1696). Of the 312 HARs reported by Keough et al, 8 overlap 26 DBSs of 14 HS lncRNAs; moreover, DBSs greatly outnumber HARs, suggesting that HAR and DBS are different sequences with different functions.

      (3) In the revised manuscript, a new paragraph (the second one) has been added to the section “HS lncRNAs regulate diverse genes and transcripts” to describe the HAR analysis result.

      1. The differences between the archaic hominins in their DBS distances to modern humans are a bit concerning. At some level, we expect these to be roughly similar when examining African modern humans and perhaps the Denisovan being larger when examining Europeans and Asians, but they seem to have distances that aren't expected given the demography. In addition, from their text for section 3, they begin by stating that they are computing two types of distances but then I lost track of which distance they were discussing in paragraph 3 of section 3. Explicitly stating which of the two distances in the text would be helpful for the reader.

      Responses:

      (1) Based on the archaic human genomes, the genomic distances from the three modern humans are shorter to the Denisovan than to the Altai Neanderthal; however, according to the related studies we cite, the three modern humans are phylogenetically more remote from the Denisovan than from the Altai Neanderthal. Thus, the finding that 2514 and 1256 DBSs have distances >0.034 in Denisovans and Altai Neanderthals is not unreasonable. The numbers of DBSs, of course, depend on the cutoff of 0.034, which is somewhat subjective but not unreasonable.

      (2) The second paragraph is added to the Discussion, discussing parameters and cutoffs.

      (3) Regarding the two types of distance, the distances computed in the first way were not further analyzed because, as we note, “This anomaly may be caused by that the human ancestor was built using six primates without archaic humans”.

      1. Isn't the correct control to examine whether eQTLs are more enriched in HSlncRNA DBSs a set of transcription factor binding sites? I don't think using just promoter regions is a reasonable control here. This does not take away from the broader point however that eQTLs are found in DBSs and I think they can perform this alternate test.

      Responses:

      Indeed, TFBSs are more comparable to DBSs than promoters. However, many more methods have been developed to predict TFBSs than to predict DBSs, making us concerned about TFBS prediction's reliability. Since most QTLs in DBSs are mQTLs (Supplementary Table 13), but many QTLs in TFBSs are eQTLs (Flynn et al. PLoS Genetics 2021. DOI: 10.1371/journal.pgen.1009719), it is pretty safe to conclude that DBSs are enriched in mQTLs.

      1. In the Discussion, they highlight the evolution of sugar intake, which I'm not sure is appropriate. This comes not from GO enrichment but rather from a few genes that are found at the tail of their distribution. While these signals may be real, the evolution of traits is often highly polygenic and they don't see this signal in their functional enrichment. I suggest removing that line. Moreover, HSlncRNAs are ones that are unique across a much longer time frame than the transition to agriculture which is when sugar intake rose greatly. Thus, it's unlikely to see enrichment for something that arose in the past 6000-7000 years would in the annotation that is designed to detect human-chimp or human-neanderthal level divergence.

      Responses:

      (1) The Discussion on human adaptation to high sugar intake is based on both enriched GO terms (Supplementary Table 4, 7) and a set of genes in modern humans with the most SNP-rich DBSs (Table 2). These glucose-related GO terms are not at the tail of the list because, of the 614 enriched GO terms (enriched in genes with strongest DBSs), glucose metabolism-related ones are ranked 208, 212, 246, 264, 504, 522, 591, and of the 409 enriched GO terms (enriched in the top 1256 genes in Altai Neanderthals), glucose metabolism-related ones are ranked 152 and 217.

      (2) Indeed, there are other top-ranked enriched GO terms; some (e.g., neuron projection development (GO:0031175) and cell projection morphogenesis (GO:0048858)) have known impact on human evolution, but the impact of others (e.g., cell junction organization (GO:0034330)) remain unclear. We specifically report human adaptation to high sugar intake because the DBSs in related genes show differences in modern humans (Table 2).

      Reviewer #2 (Public Review):

      Lin et al attempt to examine the role of lncRNAs in human evolution in this manuscript. They apply a suite of population genetics and functional genomics analyses that leverage existing data sets and public tools, some of which were previously built by the authors, who clearly have experience with lncRNA binding prediction. However, I worry that there is a lack of suitable methods and/or relevant controls at many points and that the interpretation is too quick to infer selection. While I don't doubt that lnc RNAs contribute to the evolution of modern humans, and certainly agree that this is a question worth asking, I think this paper would benefit from a more rigorous approach to tackling it.

      At this point, my suggestions are mostly focused on tightening and strengthening the methods; it is hard for me to predict the consequence of these changes on the results or their interpretation, but as a general rule I also encourage the authors to not over-interpret their conclusions in terms of what phenotype was selected for when as they do at certain points (eg glucose metabolism).

      Responses:

      (1) Now, we use more cautious wording to describe the results.

      (2) A paragraph (the second one) is added to Discussion to explain parameters and cutoffs.

      (3) We add a caution at the end of the third paragraph: “We note that these are findings instead of conclusions, and they indicate, suggest, or support something revealing the primary question of what genomic differences critically determine the phenotypic differences between humans and apes and between modern and archaic humans”.

      I note some specific points that I think would benefit from more rigorous approaches, and suggest possible ways forward for these.

      1. Much of this work is focused on comparing DNA binding domains in human-unique long-noncoding RNAs and DNA binding sites across the promoters of genes in the human genome, and I think the authors can afford to be a bit more methodical/selective in their processing and filtering steps here. The article begins by searching for orthologues of human lncRNAs to arrive at a set of 66 human-specific lncRNAs, which are then characterised further through the rest of the manuscript. Line 99 describes a binding affinity metric used to separate strong DBS from weak DBS; the methods (line 432) describe this as being the product of the DBS or lncRNA length times the average Identity of the underlying TTSs. This multiplication, in fact, undoes the standardising value of averaging and introduces a clear relationship between the length of a region being tested and its overall score, which in turn is likely to bias all downstream inference, since a long lncRNA with poor average affinity can end up with a higher score than a short one with higher average affinity, and it's not quite clear to me what the biological interpretation of that should be. Why was this metric defined in this way?

      Responses:

      (1) Binding affinity and length of all DBSs of HS lncRNAs are given in Supplementary Table 2 and 3. Since a triplex (say, 100 bp in length) may have 50% or 70% of nucleotides bound, it is necessary to differentiate binding affinity and length, and the two measures can differentiate DBSs of the same length but with different binding affinity and DBSs with the same binding affinity but different length.

      (2) Differentiating DBSs into strong and weak ones is somewhat subjective, accurately differentiating them demands experimental data that are currently unavailable, and it is advisable to separately analyze strong and weak DBSs because they may likely influence different aspects of human evolution.
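
      The reviewer's concern about the length-weighted score can be illustrated with a minimal sketch. It assumes only what the comment above states, namely that the score is the region length multiplied by the average identity; the function name `affinity_score` and the example values are hypothetical.

```python
def affinity_score(length_bp: int, mean_identity: float) -> float:
    """Score as described in the review comment:
    region length multiplied by the average identity."""
    return length_bp * mean_identity

# Hypothetical regions: one long with low identity, one short with high identity.
long_weak = affinity_score(200, 0.5)
short_strong = affinity_score(100, 0.9)

# The longer region scores higher despite its lower average identity.
print(long_weak > short_strong)
```

      Reporting length and average identity as separate columns, as the response describes for Supplementary Tables 2 and 3, lets a reader distinguish these two cases.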

      1. There is also a strong assumption that identified sites will always be bound (line 100), which I disagree is well-supported by additional evidence (lines 109-125). The authors show that predicted NEAT1 and MALAT1 DBS overlap experimentally validated sites for NEAT1, MALAT1, and MEG3, but this is not done systematically, or genome-wide, so it's hard to know if the examples shown are representative, or a best-case scenario.

      Responses:

      (1) We do not assume/think that identified sites will always be bound. Instead, lncRNA/DBS binding is highly context-dependent (including tissue-specific).

      (2) An extra supplementary table (Supplementary Table 15) is added to show what predicted DBSs overlap experimentally detected DBSs for NEAT1, MALAT1, and MEG3. By the way, it is more accurate to say “experimentally detected” than “experimentally validated”, because experimental data have true/false positives and true/false negatives, and different sequencing protocols (for detecting lncRNA/DNA binding) may generate somewhat different results.

      It's also not quite clear how overlapping promoters or TSS are treated - are these collapsed into a single instance when calculating genome-wide significance? If, eg, a gene has five isoforms, and these differ in the 3' UTR but their promoter region contains a DBS, is this counted five times, or one? Since the interaction between the lncRNA and the DBS happens at the DNA level, it seems like not correcting for this uneven distribution of transcripts is likely to skew results, especially when testing against genome-wide distributions, eg in the results presented in sections 5 and 6. I do not think that comparing genes and transcripts putatively bound by the 40 HS lncRNAs to a random draw of 10,000 lncRNA/gene pairs drawn from the remaining ~13500 lncRNAs that are not HS is a fair comparison. Rather, it would be better to do many draws of 40 non-HS lncRNAs and determine an empirical null distribution that way, if possible actively controlling for the overall number of transcripts (also see the following point).

      Responses:

      (1) We analyzed each and every GENCODE-annotated transcript (Supplementary Table 2). For example, if a gene has N TSS and N transcripts, DBSs are predicted in N promoter regions. When analyzing gene expression in tissues, each and every transcript is analyzed.

      (2) Ideally, it would be better to do many draws, but statistically, a huge number is needed due to the number of total genes in the human genome.

      (3) We feel that doing many draws of 40 non-HS lncRNAs and determining an empirical null distribution is not as straightforward as comparing HS lncRNA-target transcript pairs (45% show significant expression correlation) with random lncRNA-random transcript pairs (2.3% show significant expression correlation).
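
      The empirical null distribution proposed by the reviewer could be sketched as follows. This is a minimal illustration, not the analysis performed in the manuscript: it assumes each non-HS lncRNA has been summarized by one number (for example, the fraction of its target transcripts with significant expression correlation), and the function name `empirical_p`, the pool values, and the number of draws are hypothetical.

```python
import random

def empirical_p(observed, pool, draw_size, n_draws, seed=0):
    """One-sided empirical p-value: how often the mean of a random
    draw of `draw_size` values from `pool` reaches `observed`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(pool, draw_size)  # draw without replacement
        if sum(sample) / draw_size >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_draws + 1)

# Hypothetical per-lncRNA summaries for the non-HS pool.
pool_rng = random.Random(1)
non_hs_pool = [pool_rng.uniform(0.0, 0.1) for _ in range(500)]
p_value = empirical_p(observed=0.45, pool=non_hs_pool, draw_size=40, n_draws=999)
print(p_value)
```

      Each draw has the same size as the set being tested, so the comparison is between like-sized groups; the smallest p-value this procedure can report is 1/(n_draws + 1).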

      1. Thresholds for statistical testing are not consistent, or always well justified. For instance, in line 142 GO testing is performed on the top 2000 genes (according to different rankings), but there's no description of the background regions used as controls anywhere, or of why 2000 genes were chosen as a good number to test? Why not 1000, or 500? Are the results overall robust to these (and other) thresholds? Then line 190 the threshold for downstream testing is now the top 20% of genes, etc. I am not opposed to different thresholds in principle, but they should be justified.

      Responses:

      (1) The over-representation analysis using g:Profiler was applied to the top and bottom 2000 genes with the whole genome as the background. The number “2000” was chosen somewhat subjectively. If more or fewer genes were chosen, more or fewer enriched GO terms would be identified, but GO terms with adjusted P-values <0.05 would be quite stable.

      (2) A paragraph (the second one) is added to the Discussion to explain parameters and cutoffs.
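
      Over-representation analysis of a gene list against a background is commonly based on a hypergeometric tail probability. The sketch below shows that calculation only; it is not a reproduction of g:Profiler, it omits multiple-testing correction, and the function name `hypergeom_sf` and the example counts are hypothetical.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which carry the annotation."""
    total = comb(N, n)
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 10 background genes, 3 annotated, 4 selected,
# all 3 annotated genes among the selected ones.
print(hypergeom_sf(k=3, N=10, K=3, n=4))
```

      Changing the list size n (for example 500, 1000, or 2000 genes) changes both k and the tail probability, which is why robustness to this choice is worth reporting.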

      Likewise, comparing Tajima's D values near promoters to genome-wide values is unfair, because promoters are known to be under strong evolutionary constraints relative to background regions; as such it is not surprising that the results of this comparison are significant. A fairer comparison would attempt to better match controls (eg to promoters without HS lncRNA DBS, which I realise may be nearly impossible), or generate empirical p-values via permutation or simulation.

      Responses:

      We examined Tajima’s D in DBSs (Supplementary Figure 9) and in HS lncRNA genes (Supplementary Figure 18). We compared the Tajima’s D values with the genome-wide background in both cases.

      1. There are huge differences in the comparisons between the Vindija and Altai Neanderthal genomes that to me suggest some sort of technical bias or the such is at play here. e.g. line 190 reports 1256 genes to have a high distance between the Altai Neanderthal and modern humans, but only 134 Vindija genes reach the same cutoff of 0.034. The temporal separation between the two specimens does not seem sufficient to explain this difference, nor the difference between the Altai Denisovan and Neanderthal results (2514 genes for Denisovan), which makes me wonder if it is a technical artefact relating to the quality of the genome builds? It would be worth checking.

      Responses:

      (1) The cutoff of 0.034 was chosen because the DBSs in the top 20% (4248) of genes in chimpanzees have distances larger than this cutoff; accordingly, 4248, 1256, 2514, and 134 genes have DBS distances >0.034 in chimpanzees, Altai Neanderthals, Denisovans, and Vindija Neanderthals, respectively. These numbers of genes qualitatively agree with the phylogenetic distances from chimpanzees and archaic humans to modern humans. If a percentage larger or smaller than 20% (e.g., 10% or 30%) were chosen, giving a different cutoff X, the numbers of genes with DBS distance >X would not be 4248, 1256, 2514, and 134, but could still qualitatively agree with the phylogenetic distances from chimpanzees and archaic humans to modern humans.

      (2) The second paragraph in the Discussion now explains the parameters and cutoffs.
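
      The cutoff logic described in this response (fix a cutoff from the top fraction of one genome, then count genes above it in the others) can be sketched as below. The function names and the distance values are hypothetical; only the procedure follows the description above.

```python
def top_fraction_cutoff(distances, fraction):
    """Largest value that falls just outside the top `fraction`,
    so that the top values are strictly greater than the cutoff."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    ranked = sorted(distances, reverse=True)
    k = int(len(ranked) * fraction)
    return ranked[k]

def count_above(distances, cutoff):
    """Number of values strictly greater than the cutoff."""
    return sum(d > cutoff for d in distances)

# Hypothetical per-gene distances for a reference genome and another genome.
reference = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
other = [0.01, 0.02, 0.09, 0.03, 0.05]

cutoff = top_fraction_cutoff(reference, 0.2)
print(cutoff, count_above(reference, cutoff), count_above(other, cutoff))
```

      Re-running the two functions with fractions such as 0.1 or 0.3 gives a quick check of whether the ordering of counts across genomes is stable.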

      1. Inferring evolution: There are some points of the manuscript where the authors are quick to infer positive selection. I would caution that GTEx contains a lot of different brain tissues, thus finding a brain eQTL is a lot easier than finding a liver eQTL, just because there are more opportunities for it. Likewise, claims in the text and in Tables 1 and 2 about the evolutionary pressures underlying specific genes should be more carefully stated. The same is true when the authors observe high Fst between groups (line 515), which is only one possible cause of high Fst - population differentiation and drift are just as capable of giving rise to it, especially at small sample sizes.

      Responses:

      (1) We analyzed brain tissues separately instead of taking the whole brain as a tissue, see Supplementary Table 12 and Figure 3.

      (2) We add a caution at the end of the third paragraph: “We note that these are findings instead of conclusions, and they indicate, suggest, or support something revealing the primary question of what genomic differences critically determine the phenotypic differences between humans and apes and between modern and archaic humans”.

      Reviewer #1 (Recommendations For The Authors):

      Some figures are impossible to see/read so I wasn't able to evaluate them - Fig, 1B, 1E, 1F are small and blurry.

      Responses:

      High-quality figures are provided.

      Typo in line 178: in these archaic humans, the distances of HS lncRNAs are smaller than the distances of DBSs.

      Responses:

      This is not a typo. We use “distance per base” to measure whether HS lncRNAs or their DBSs have evolved more from archaic humans to modern humans. See also Supplementary Note 4 and 5.

      Reviewer #2 (Recommendations For The Authors):

      1. There's some inconsistency in the genome builds and the database versions used, eg, sometimes panTro4 is used and sometimes panTro5 (line 456). Likewise, the version of GENCODE used is very old (18), the current version is 43. The current version contains 19928 lncRNAs, which is a big difference relative to what is being tested!

      Responses:

      (1) panTro4 was used to search orthologues of human lncRNAs; this time-consuming work started several years ago when the version of GENCODE was V18 (see Lin et al., 2019).

      (2) Regarding “the version of GENCODE used is very old (V18)”, we have later examined the 4396 human lncRNAs reported in GENCODE V36 and found that the set of 66 HS lncRNAs remains the same.

      (3) The counterparts of HS lncRNAs’ DBSs in chimpanzees were predicted recently using panTro5.

      1. Table 1: What does 'mostly' mean in this context? I understand that it refers to sequence differences between humans and the other genomes, but what is the actual threshold, and how is it defined?

      Responses:

      The title of Table 1 is “Genes with strongest DBSs and mostly changed sequence distances from modern humans to archaic humans and chimpanzees”. Instead of using two cutoffs, choosing genes with the two features seems easy and sensible.

      1. Line 117: The methods do not include information on the RNA-seq data processing and subsequent DE testing.

      Responses:

      The details are added to the section “Experimentally validating DBS prediction” (The reads were aligned to the human GRCh38 genome using HISAT2 (Kim et al., 2019), and the resulting SAM files were converted to BAM files using Samtools (Li et al., 2009). StringTie was used to quantify gene expression levels (Pertea et al., 2015). Fold change of gene expression was computed using the edgeR package (Robinson et al., 2010), and significant up- and down-regulation of target genes after DBD knockout was determined upon |log2(fold change)| > 1 with FDR < 0.1).
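The thresholding step described above can be illustrated with a minimal Python sketch. It assumes that the log2 fold change and FDR for each gene have already been computed upstream (in the response, by edgeR); the `GeneResult` type, the `classify` function, and the gene names and values are hypothetical and serve only to show how the two cutoffs combine.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2_fold_change: float
    fdr: float


def classify(result: GeneResult, lfc_cutoff: float = 1.0, fdr_cutoff: float = 0.1) -> str:
    """Label a gene as 'up', 'down', or 'unchanged'.

    A gene is called only when |log2(fold change)| > lfc_cutoff AND FDR < fdr_cutoff,
    mirroring the criteria quoted in the response.
    """
    if result.fdr >= fdr_cutoff or abs(result.log2_fold_change) <= lfc_cutoff:
        return "unchanged"
    return "up" if result.log2_fold_change > 0 else "down"


# Illustrative values only; these are not data from the study.
results = [
    GeneResult("GENE_A", 1.8, 0.01),   # large increase, low FDR
    GeneResult("GENE_B", -2.3, 0.05),  # large decrease, low FDR
    GeneResult("GENE_C", 0.4, 0.001),  # significant but small effect
    GeneResult("GENE_D", 3.0, 0.30),   # large effect but high FDR
]
calls = {r.gene: classify(r) for r in results}
```

Requiring both conditions means that a gene with a large but noisy fold change, or a statistically robust but small one, is left uncalled.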

      1. Line 180: I looked at the EPO alignment and it's not clear to me what 'human ancestor' means, but it may well explain the issues the authors have with calculating distances (I agree those numbers are weird). Is it the reconstructed ancestral state of humans at around 300-200,000 years ago (coalescence of most human uniparental lineages), or the inferred sequence of the human-chimpanzee most recent common ancestor? If it's the former, it's not surprising it skews results towards shorter distances for modern humans, since the tree distance from that point to archaic hominins is significantly larger than to modern humans.

      Responses:

      The “human ancestor” is constructed by the EBI team from the genomes of six primates on the Ensembl website. We find that the reconstructed ancestral state of humans is unlikely to date to around 300,000-200,000 years ago, and may be much earlier. We also find that many DNA sequences of the “human ancestor” are low-confidence calls (i.e., the ancestral states are supported by only one primate’s sequence).

      1. Line 221: SNP-rich DBS: Is this claim controlled for the length of the DBS?

      Responses:

      No. Long DBSs tend to have more SNPs. When comparing the same DBS in modern humans, archaic humans, and chimpanzees, both the length and SNP number reflect evolution, so it is not necessary to control for the length.

      1. Given that GTEx is primarily built off short-read data and it is impossible to link binding of a lncRNA to a DBS with its impact with a specific transcript

      Responses:

      As written in the section “Examining the tissue-specific impact of HS lncRNA-regulated gene expression”, we calculated the pairwise Spearman's correlation coefficient between the expression of an HS lncRNA (the representative transcript, median TPM value > 0.1) and the expression of each of its target transcripts (median TPM value > 0.1) using the scipy.stats.spearmanr program in the scipy package. The expression of an HS lncRNA gene and a target transcript was considered to be significantly correlated if |Spearman's rho| > 0.3, with Benjamini-Hochberg FDR < 0.05.
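The significance rule in this response can be sketched in a few lines of Python. The sketch assumes that each lncRNA-transcript pair already has a Spearman's rho and p-value (in the response, from scipy.stats.spearmanr) and that the Benjamini-Hochberg correction is applied across all pairs passed in together, which is an assumption about how the tests were grouped. The function names, transcript identifiers, and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_pairs(pairs, rho_cutoff=0.3, fdr_cutoff=0.05):
    """Keep transcripts with |rho| > rho_cutoff and BH-adjusted p < fdr_cutoff.

    `pairs` is a list of (transcript_id, rho, p_value) tuples (hypothetical layout).
    """
    q_values = benjamini_hochberg([p for _, _, p in pairs])
    return [
        transcript
        for (transcript, rho, _), q in zip(pairs, q_values)
        if abs(rho) > rho_cutoff and q < fdr_cutoff
    ]


# Illustrative values only; these are not data from the study.
pairs = [
    ("T1", 0.62, 0.01),
    ("T2", -0.45, 0.04),
    ("T3", 0.10, 0.03),
    ("T4", 0.35, 0.50),
]
kept = significant_pairs(pairs)
```

In this toy input only T1 passes both filters: T2 has a sufficiently large |rho| but its adjusted p-value rises above 0.05 after correction, which shows why the two criteria are checked jointly.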

      1. Line 429: should TTO be TFO?

      Responses:

      Here TTO should be TFO; the typo is corrected.

      1. Methods, section 7: Some of the text in this section should perhaps be moved to the results section?

      Responses:

      Each of the two paragraphs in Methods’ section 7 is quite large, and some contents in Supplementary Notes are also very relevant. Thus, moving them to the Results section could make the Results too lengthy and specific.

      1. Line 587: GTEx is built from samples of primarily European ancestry and has poor representation of African ancestry and negligible representation of Asian ancestry (see the GTEx v8 paper supplement). This means that it is basically impossible to find a non-European population-specific eQTL in GTEx, which in turn impacts these results.

      Responses:

      (1) Indeed, this is a serious issue of data analysis, and this issue cannot be solved until more Africans are sequenced.

      (2) Nevertheless, one can still find a considerable number of African-specific eQTLs in GTEx, such as rs28540058 (with frequency of 0, 0, 0.13 in CEU, CHB, YRI) and rs58772997 (with frequency of 0, 0, 0.12 in CEU, CHB, YRI) (see Supplementary Table 12 and Supplementary Figure 22).

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The finding that Fusicoccin (FC-A) promotes locomotor recovery after spinal cord injury is useful, and the idea of harnessing small molecules that may affect protein-protein interactions to promote axon regeneration is interesting and worthy of study. However, the main methods, data, and analyses are inadequate to support the primary claim of the manuscript that a 14-3-3-Spastin complex is necessary for the observed FC-A effects.

      Response: We appreciate the eLife editorial and review team for consideration and evaluation of our manuscript. In light of the feedback from the editors and reviewers, we recognize that certain aspects of the title and key conclusions require further refinement. We have shown that 14-3-3, through its interaction with phosphorylated spastin, inhibits the degradation of spastin. Also, we have demonstrated that 14-3-3 can enhance spastin's microtubule-severing ability in cell lines. Furthermore, our work has illustrated the significant roles of 14-3-3 and spastin in the repair process of spinal cord injury. However, there is currently insufficient direct evidence to confirm the cooperation between 14-3-3 and spastin during axon regeneration and the recovery of spinal cord injury. Moreover, we have not provided conclusive evidence of their simultaneous action in injured axons, mediating changes in microtubule dynamics. Consequently, we have re-evaluated the manuscript's title and primary conclusions, and have made relevant modifications. For more detailed information, please refer to the reviewer's comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present work establishes 14-3-3 proteins as binding partners of spastin and suggests that this binding is positively regulated by phosphorylation of spastin. The authors show evidence that 14-3-3 - spastin binding prevents spastin ubiquitination and final proteasomal degradation, thus increasing the availability of spastin. The authors measured microtubule severing activity in cell lines and axon regeneration and outgrowth as a proxy for spastin activity. By using drugs and peptides that separately inhibit 14-3-3 binding or spastin activity, they show that both proteins are necessary for axon regeneration in cell culture and in vivo models in rats.

      The following is an account of the major strengths and weaknesses of the methods and results.

      Major strengths

      -The authors performed pulldown assays on spinal cord lysates using GST-spastin, then analyzed pulldowns via mass spectrometry and found 3 peptides common to various forms of 14-3-3 proteins. In co-expression experiments in cell lines, recombinant spastin co-precipitated with all 6 forms of 14-3-3 tested. The authors could also co-immunoprecipitate spastin-14-3-3 complexes from spinal cord samples and from primary neuronal cultures.

      -By protein truncation experiments they found that the Microtubule Binding Domain of spastin contained the binding capability to 14-3-3. This domain contained a putative phosphorylation site, and substitutions that cannot be phosphorylated cannot bind to 14-3-3.

      -Overexpression of GFP-spastin shows a turn-over of about 12 hours when protein synthesis is inhibited by cycloheximide. When 14-3-3 is co-overexpressed, GFP-spastin does not show a decrease by 12 hours. When S233A is expressed, a turn-over of 9 hours is observed, suggesting that phosphorylation increases the stability of the protein. In support of that notion, the phospho-mimetic S233D makes it more stable, lasting as much as the over-expression of 14-3-3.

      -By combining FC-A with spastazoline, the authors claim that the FC-A-induced increase in regeneration is due to increased spastin activity. In various models of neurite outgrowth and regeneration, in cell culture and in vivo, the authors show impressive results on the positive effect of FC-A on regeneration, and that this effect is abolished when spastin is inhibited.

      Major weaknesses

      1. The present manuscript suggests that 14-3-3 and spastin work in the same pathway to promote regeneration. Although the manuscript contains valuable evidence in support of a role of 14-3-3 and spastin in regeneration, conclusive evidence is difficult to generate and is missing in the present manuscript. For example, there are simpler explanations for the combined effect of FC-A and spastazoline. The FC-A mechanism of action can be very broad, since it will increase the binding of all 14-3-3 proteins with presumably all their substrates, hence the pathways affected can rise to the hundreds. The fact that spastazoline abolishes the FC-A effect may not be because of their direct interaction, but because spastin is a necessary component of the execution of the regeneration machinery further downstream, in line with the fact that spastazoline alone prevented outgrowth and regeneration, and in agreement with previous work showing that normal spastin activity is necessary for regeneration.

      With this in mind, I consider the title and most major conclusions of the manuscript related to these two proteins acting together for the observed effects are overstated.

      Response: We appreciate and acknowledge the reviewers' considerations. Our results demonstrated that the spastin inhibitor, spastazoline, almost completely inhibited axon regeneration and the spinal cord injury repair process. This, in turn, leads to the disappearance of any promoting effect on spinal cord injury repair when spastin function is compromised. While we have provided evidence that the expression levels of spastin are moderately increased at the injury site in mice after treatment with FC-A following spinal cord injury, the conclusion that FC-A promotes spinal cord injury repair through the direct interaction between 14-3-3 and spastin still lacks direct evidence. Therefore, we have made appropriate modifications to the manuscript's title and main conclusions.

      1. Authors show that S233D increases MT severing activity, and explain that it is related to increased binding to 14-3-3. An alternative explanation is that phosphorylation at S233 by itself could increase MT severing activity. The authors could test if purified spastin S233D alone could have more potent enzymatic activity.

      Response: We appreciate the considerations of the reviewer. We believe that supplementing in vitro experiments to assess whether S233D affects spastin's microtubule severing function can more intuitively demonstrate whether phosphorylation of spastin at S233 affects its microtubule severing function; however, spastin forms hexamers through its AAA domain to exert ATPase activity and cut microtubules. Current research has reported that mutation sites leading to changes in microtubule severing function are mainly located within spastin's AAA domain (affecting spastin's ATPase activity, amino acids 342-599), such as E356A, G370R, N386K, K388R, E442Q, K427R and R562Q. Furthermore, studies have shown that mutating 11 phosphorylation sites in spastin's MIT and MTBD regions to alanine does not affect spastin's microtubule severing function, including human S268 (Rat Ser233) (Phosphorylation mutation impairs the promoting effect of spastin on neurite outgrowth without affecting its microtubule severing ability. Eur J Histochem. doi: 10.4081/ejh.2023.3594). Additionally, we also provided supplementary experiments in cell lines which showed that both spastin S233A and S233D could effectively sever microtubules (Fig.S2).

      1. The interpretation of the authors cannot explain how Spastin can engage in MT severing while bound to 14-3-3 using its Microtubule Binding Domain.

      Response: We appreciate the considerations of the expert reviewer. The IP experiments with truncated fragments suggest that the binding region of 14-3-3 with spastin is located within the region (215-336 amino acids) in spastin. Furthermore, experiments involving site-directed mutagenesis confirm that the actual binding site of 14-3-3 with spastin is the S233 site, rather than its MTBD region (270-328). Therefore, we have made corrections in the manuscript. We also indicate that 14-3-3 enhances spastin's protein levels by binding to the S233 site, which may be due to 14-3-3 masking the ubiquitination sites near spastin S233 (K206 or K254). Our further experiments also demonstrate that 14-3-3 inhibits the ubiquitination degradation pathway of phosphorylated spastin.

      1. Also, the term "microtubule dynamics", which is present in the title and in other major conclusions, is overstated. Although authors show, in cell lines, changes in microtubule content, it is far from evidence for changes in "MT dynamics" in the settings of interest (i.e. injured axons).

      Response: We appreciate and acknowledge the rigorous feedback. While our manuscript demonstrated the regulatory role of 14-3-3 and spastin in microtubule dynamics in cell lines, we lack direct evidence of these changes in microtubule dynamics within injured axons. Therefore, we have made appropriate modifications to the title, main conclusions, and related statements in our manuscript.

      1. Along the same lines, the manuscript lacks evidence for changes in MT content and/or dynamics as a function of the proposed 14-3-3 - Spastin pathway.

      Response: We appreciate and concur with the opinions of the expert reviewer. The observed changes in microtubule dynamics in spinal cord injury were related to the overall alterations in microtubule dynamics within the spinal cord injury site. We still lack direct evidence that 14-3-3, in conjunction with spastin, alters the microtubule dynamics within axons during the process of regeneration. Therefore, we have made modifications to the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The idea of harnessing small molecules that may affect protein-protein interactions to promote axon regeneration is interesting and worthy of study. In this manuscript Liu et al. explore a 14-3-3-Spastin complex and its role in axon regeneration.

      Strengths:

      Some of the effects of FC-A on locomotor recovery after spinal cord contusion look interesting.

      Weaknesses:

      The manuscript falls short of establishing that a 14-3-3-Spastin complex is important for any FC-A-dependent effects and there are several issues with data quality that make it difficult to interpret the results. Importantly, the spastin inhibitor has a major impact on neurite outgrowth, suggesting that cells simply cannot grow in the presence of the inhibitor and raising serious questions about any selectivity for FC-A-dependent growth. Aspects of the histology following spinal cord injury were not convincing.

      Response: We appreciate the rigorous review by the expert reviewers. As noted in our response to reviewer 1, we lack direct evidence to demonstrate that the reparative effect of FC-A on spinal cord injury is mediated by the combined action of 14-3-3 and spastin. We have accordingly made the necessary changes to our manuscript. Additionally, due to upload limitations, the resolution of the spinal cord injury tissue slice images in the manuscript is relatively low. To address this, we have added enlarged versions of the relevant images to the supplementary materials (Fig. S7-9). The original confocal files and images have also been uploaded.

      Furthermore, our manuscript does not suggest that the reparative effect of FC-A in spinal cord injury selectively impacts the interaction between 14-3-3 and spastin. Therefore, we have modified our claims (title and conclusions) to ensure a more precise statement. Despite the fact that our axonal markers do not fully align, our evidence still strongly supports the role of FC-A in promoting nerve regeneration after spinal cord injury. Additionally, we will further optimize our immunohistochemistry methods.

      Reviewer #3 (Public Review):

      Summary:

      The current manuscript shows that 14-3-3 proteins are binding partners of spastin, preventing its degradation. It is additionally shown, using complementary methods, that both 14-3-3 and spastin are necessary for axon regeneration in vitro and in vivo. While interesting in vitro and in vivo data are provided, some of the claims of the authors are not convincingly supported.

      Major strengths:

      Very interesting effect of FC-A in functional recovery after spinal cord injury.

      Major Weaknesses:

      Some of the in vitro data, including colocalizations and the analysis of microtubule severing, fall short of supporting the claims of the authors.

      The in vivo selectivity of FC-A towards spastin is not adequately supported by the data presented. There are aspects of the spinal cord injury site histology that are unclear.

      Response: Reviewer 3's comments align with those of Reviewers 1 and 2.

      Reviewer #1 (Recommendations For The Authors):

      -The new blots presented in Fig. 3N lack corresponding labels for the antibodies used for IP and IB and for molecular weight markers.

      Response: We appreciate the reviewer's feedback. We have made the corresponding modifications in the figure.

      Reviewer #2 (Recommendations For The Authors):

      The authors have addressed many of the specific concerns shared with the authors in the first round of review but several issues remain with the manuscript.

      1. Fig. 1D - the interpretation that spastin co-localizes with 14-3-3 proteins in hippocampal neurons is still tenuous since 14-3-3 uniformly labels the cell.

      Response: We appreciate the reviewer's consideration. Upon re-examining the source files, we found that the predominant reason for 14-3-3 showing a ubiquitous cellular distribution was excessive brightness and insufficient contrast. After appropriate adjustments, we discovered that 14-3-3 exhibits characteristic distribution in axons, including aggregation at growth cone and specific locations in the axon shaft. We have made the relevant changes in the revised version.

      1. Line 336. The meaning of the following statement is unclear: "To further identify which isoform of 14-3-3 interacts with spastin, we generated six 14-3-3 isoforms in rats (β, γ, ε, ζ, η, θ), then purified GST fusion 14-3-3 proteins (Figure 1G)."

      Response: We apologize for the confusing statement. We obtained gene fragments of six 14-3-3 isoforms from rat brain cDNA and inserted these fragments into the pGEX-5X-3 vector. Subsequently, GST 14-3-3 fusion proteins were expressed and purified in vitro. We have made the corresponding revisions in the revised version.

      1. Line 341. The authors still fall short of showing that spastin and 14-3-3 interact directly thus it may be more accurate to say that they form a complex.

      Response: Thank you for the reviewer's advice. We have made the corresponding corrections in the manuscript.

      1. Line 388. Please clarify 2th and the meaning of "moderately" - "S233D) was moderately expressed in primary hippocampal neurons at 2th DIV." While it is specified that the transfection dosage and duration were meticulously controlled - it is unclear what the criteria was for establishing the appropriate moderate dosage.

      Response: We apologize for the mistake; it should be "2nd" instead of "2th". In order to establish a model for overexpressing spastin to promote neuronal neurite growth, we transfected 0.2 µg of plasmid into 1 well (1×10⁴ cells/cm², 24-well plate), with a transfection duration controlled at 24 hours.

      1. Line 395 - It is unclear if S233D is toxic as there seem to be no measurements of cell survival.

      Response: We have supplemented relevant experiments (see our response to comment 6) and found that spastin S233D can promote neuronal neurite growth. The corresponding descriptions have been revised.

      1. The pro-growth effects of S233A still do not seem to fit the narrative and the results would have been more convincing if dosage was better controlled to establish any differences between WT and S233A Spastin.

      Response: We appreciate the constructive comments from the reviewer. In order to better illustrate the role of spastin S233 in neuronal growth, we have made appropriate adjustments to our experimental conditions based on previous experiments. Cells were transfected with plasmids expressing non-fused GFP and spastin and the relevant S233 mutants at a transfection dose of 0.2 µg into 1 well (1×10⁴ cells/cm², 24-well plate), and the duration was controlled at 12 hours. Due to the low expression state of the overexpressed protein, GFP (ab290 antibody for IF) was then stained to trace neuronal morphology. The experimental results demonstrate that spastin promotes neuronal neurite growth, and the dephosphorylation mutant of spastin (spastin S233A) significantly attenuates its neurite-promoting effect compared to wild-type spastin. Conversely, the phosphorylation mutant spastin S233D further enhances the promotion of neuronal neurite growth. We have also made corrections to the relevant statements in the manuscript.

      1. The reason for examining protection in response to glutamate is not well rationalized based on known spastin functions. The interpretation of this experiment is unclear with respect to effects on protection vs repair.

      Response: Thank you for the reviewer's consideration. We suppose that spastin may be involved in both protective and repair processes. Existing studies suggest that spastin can control store-operated calcium entry (SOCE) by altering endoplasmic reticulum morphology (doi: 10.1093/brain/awac122, doi: 10.3389/fphys.2019.01544), which may indicate its role in regulating calcium overload. Additionally, due to the critical role of spastin in axon growth, it is also essential for neuronal repair after injury. Therefore, we have not strictly distinguished between these two concepts here.

      1. It is unclear if Spastazoline simply blocks any type of growth and it is thus difficult to conclude that FC-A functions through a 14-3-3-spastin effect based on the current data.

      Response: We have re-evaluated and modified the title and main conclusions of the manuscript based on the reviewer's comments and the existing evidence, as responded to in reviewer 1's comments.

      1. The access of FC-A to the CNS with the current protocol has not been clearly established and the effects of FC-A on spastin expression seem to mirror the profile of the control condition.

      Response: We agree with the reviewer's comments. The expression trend of spastin after FC-A treatment is consistent with that of the control group, with a slight increase in its expression level compared to the control group.

      1. The NF and 5-HT staining is not convincingly labelling fibres.

      Response: We appreciate the reviewer's comments. We believe that the reason for the incomplete axon staining is closely related to the thickness of the tissue sections. In our future research, we will further optimize our axon labeling methods.

      Reviewer #3 (Recommendations For The Authors):

      Figure 1D: Both spastin and 14-3-3 label the entire neuron which is rather unusual. Conditions of immunofluorescence should be improved. As it is, this image should not be used to claim colocalization.

      Response: We appreciate the reviewer's consideration. In response to comment 1 from the expert reviewer 2, we have re-examined the source files and identified that the primary reason for the overall cell-wide distribution of 14-3-3 and spastin is due to excessive brightness and a lack of sufficient contrast. After making appropriate adjustments, we found that 14-3-3 and spastin exhibit characteristic localization within the axon (concentrated in a particular region of the axon shaft and the growth cone). We have made corresponding revisions in the revised version of the manuscript.

      Figure S2: The experimental setup and data provided is not adequate to infer microtubule severing.

      Response: We appreciate the reviewer's guidance. We have improved the relevant experiments and used a 100X objective lens to observe the microtubule structures more clearly.

      Figure 2 I-K: The functional effect of spastin S233A and S233D on neurite outgrowth does not correlate with a function of 14-3-3 and thus does not support the central hypothesis of the manuscript. Minor: The images selected as representative show differences in neurite length and branching that are not portrayed in the graphs.

      Response: Thank you for the reviewer’s comment. Similar to the response to reviewer 2's comment 6, in order to better illustrate the role of spastin S233 in neurite outgrowth, we made corresponding adjustments to our experimental conditions. Cells were transfected with plasmids expressing non-fused GFP and spastin and the relevant S233 mutants at a transfection dose of 0.2 µg into 1 well (1×10⁴ cells/cm², 24-well plate), and the duration was controlled at 12 hours. Due to the low expression state of the overexpressed protein, GFP (ab290 antibody for IF) was then stained to trace neuronal morphology. The experimental results demonstrate that spastin promotes neuronal neurite growth, and the dephosphorylation mutant of spastin (spastin S233A) significantly attenuates its neurite-promoting effect compared to wild-type spastin. Conversely, the phosphorylation mutant spastin S233D further enhances the promotion of neuronal neurite growth. We have also made corrections to the relevant statements in the manuscript.

      Figure 5 J and L: The quality, resolution and size of the images is insufficient to support the claims of the authors. As it is, one cannot interpret the data. It is very hard to envisage, even considering the explanation provided by the authors, that spinal cords where spastazoline was used correspond to contusion as a complete discontinuity between the rostral and caudal spinal cord tissue is present.

      Response: Due to limitations in file uploads, we encountered issues with the resolution of the tissue slices related to spinal cord injury. To address this, we have adjusted the size and resolution of the corresponding images in the supplementary materials (Fig. S7-S9) and included the original confocal files and images.

      Additionally, it's important to note that the tissue slices we presented do not represent all layers of the spinal cord, and not all layers exhibit discontinuity. Our slices are taken longitudinally at the dorsal site of the lesion area. The dorsal slices represent areas closer to the injury site, while deeper slices correspond to areas distant from the injury site. Therefore, we selected areas closer to the injury site to reflect the repair process following injury.

      Figure 7B: Similar comment to spinal cord images provided in Figure 5. NF and MBP are not supposed to colocalize as they label different cell types...

      Response: We appreciate the comments from the expert reviewer, and we agree with their suggestions. We will further optimize our axon labeling methods. The excessive brightness and lack of contrast primarily led to the non-specific labeling of other cell types with the MBP antibody. In fact, our primary goal was to highlight the injured areas by enhancing the fluorescence intensity of the images, which inadvertently resulted in neglecting the exclusion of non-specific staining. Therefore, we have made appropriate adjustments to the images to better visualize the distribution of myelin sheaths.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The importance of the role of sexual behavior, specifically ejaculation rates, is worth emphasizing for the formation of pair bonds in prairie voles. It suggests that the role of sexual behavior in contributing to the strength of pair bonds should be explored more. It is also important to add that males and females in the study were screened for sexual receptivity. It would therefore be important to identify characteristics of animals that did not mate under the laboratory conditions used that may add depth and complexity to what was identified in the current study. The identification of brain regions for pair bond maintenance centered around the amygdala was also intriguing.

      Thank you for pointing out some interpretations of our findings that can be emphasized in the Discussion. We added the following sentences to the Discussion:

      “Our findings, along with this previous work, support the hypothesis that sexual behavior plays a key role in driving pair-bond strength. However, the current study focused on animals that were screened for sexual receptivity, which may have limited variation in sexual behavior across pairs. An intriguing direction for future research will be to test how this variation contributes to bond strength.”

      We also emphasized amygdala in relation to pair bond maintenance. We added the following sentence to the Discussion:

      “These brain regions, and especially amygdala, will be important candidates for future research on neural regulation of pair-bond maintenance.”

      The issue of the lack of a strong presence of the reward circuitry (nucleus accumbens) in the final models is also worth more discussion. Perhaps it has been overly emphasized in the past, but there are strong results from other studies pointing to the importance of reward circuitry.

      Thank you for this suggestion. There is a section in the Results that analyses accumbens in more detail than other brain regions. Accumbens did not survive our corrections for multiple statistical tests; however, it was significant at early timepoints without these corrections. This Results paragraph states the following:

      “Although the nucleus accumbens did not survive multiple test corrections in our ROI analysis (q=0.17), it was significant in univariate analysis (p=0.03), particularly when focused on the 2.5 and 6h timepoints (two sample t-test: t=2.53, p=0.01, Video 2). Furthermore, voxel-level comparisons revealed significant sites within the ventral striatum and the posterior nucleus accumbens (Figure 2A, Figure Supplement 1b-c, Video 2).”

      We added Supplementary File 4, which contains model comparison results for accumbens and all other ROIs. We also added more detail on nucleus accumbens to the Discussion:

      “Pairing drove increased c-Fos expression in the ventral pallidum, a major node in reward circuitry, as well as in the paraventricular nucleus and the medial preoptic area, modulators of reward. This is consistent with a large body of work implicating neuropeptide actions on reward circuits in the formation of bonds (Walum & Young, 2018; Young & Wang, 2004). Conspicuously missing from our list, however, is significant pairing induced c-Fos induction in the nucleus accumbens. One possibility is that an absence of significant accumbens IEG induction reflects the limitations of using c-Fos and other immediate early genes as indicators of neural activity. It is known that some neuronal populations can be active without expressing c-Fos (Sheng & Greenberg, 1990). Indeed, although a variety of studies implicate the accumbens in bond formation (Amadei et al., 2017; Aragona et al., 2006; Scribner et al., 2020), previous work finds only weak c-Fos induction in the prairie vole accumbens during bonding (Curtis & Wang, 2003). Another possibility is that there was heterogeneous activation in the accumbens that was not captured by the precision of our atlas. Consistent with this interpretation, we found that the accumbens was significant in univariate tests, as well as in voxel-level analyses. Overall, our results do not conflict with pharmacological, electrophysiological, and calcium-imaging data on the role of the nucleus accumbens in prairie vole bonding (Amadei et al., 2017; Aragona et al., 2006; Scribner et al., 2020). Instead, the absence of significant effects at the level of the entire nucleus accumbens together with the presence of anatomically restricted voxel-level significance suggests substantial anatomical heterogeneity in the contributions of the nucleus accumbens to bond formation.”

      Please discuss the consequences of creating the behavioral data for pair bond formation by subtracting same-sex pairs interactions from the opposite-sex interactions. What sources of information are removed by using this approach?

      One limitation of our study’s approach is that we are unable to fully separate information related to social novelty from mating experience. Thank you for pointing out that we should touch on this sort of caveat in the paper. We added several sentences to the Discussion:

      “It seems likely that sensory and motor areas were important for social processes related to both pair-bonding and reunion with same-sex cagemates, such as investigation and recognition. Our study design, however, highlights differences between treatments, and in order to detect such effects, it might be necessary to compare mating and bonding pairs to animals left in complete isolation.”

      We reiterate the point in a new paragraph we added to the Discussion to explicitly provide caveats regarding our data:

      “Before offering a synthesis of our findings, it would be useful to acknowledge a few caveats. First, as noted above, IEG induction does not capture all relevant neural activity. Second, the design of our experiment, which controlled for social interaction, likely excluded many circuits important to both pair bonding and sibling social interactions. Third, c-Fos activity within a given brain region may nevertheless rely on distinct cell types, and so the absence of sex differences in c-Fos immunoreactivity does not definitively rule out the sexually dimorphic circuits hypothesized in the “dual function hypothesis” (de Vries, 2004). Lastly, the current study focused on animals that were screened for sexual receptivity, which may have limited the variation in sexual behavior across opposite-sex pairs.”

      Time 0 is when the barrier is removed after a two-hour exposure. Please speculate on what is going on during the two-hour exposure. Time zero is potentially more than the time of mating. Is it possible that aggression is being decreased during this timepoint that represents mating? Could it also be a measure of the outcome of an initial compatibility assessment by the male and female?

      Thank you for this interesting observation. While the opaque divider prevented physical social interactions, it is possible that animals picked up on auditory or olfactory cues. We did not detect group differences in movement patterns and vocalization rates from the 0 h timepoint group (Figure 2). These findings suggest that potential partner detection and assessment occurred in a similar way for both experiment groups. It is unlikely that this period represents a decrease in aggression, since unbonded prairie voles are not known to be aggressive towards conspecifics. However, the idea that animals may potentially use olfactory or auditory cues to assess each other is an interesting idea, and one that we cannot rule out. We added a brief statement to the Methods “Experiment Design” section about the possibility that the two hours prior to divider removal (0 h timepoint) could represent more than an acclimation period:

      “It is important to note that the opaque divider in the acclimation period prevented physical interactions, but it is possible that animal pairs may have detected each other through olfactory or auditory cues.”

      We also mention this in the revised Discussion in the context of the PFC cluster, which not only differed between mating and non-mating groups, but also showed differences between isolated (0h) and socially interacting animals (sibs and mates, 2.5h-22h):

      “A fourth cluster (“PFC,” green) is composed of prelimbic, infralimbic and olfactory cortex; activity in the vole prefrontal cortex is known to be modulated by hypothalamic oxytocin, and to shape bonding through projections to the nucleus accumbens (Amadei et al., 2017; Burkett et al., 2016; Horie et al., 2020). The pattern of activity in this cluster, however, indicates that it was due in part to differences between the isolated animals (0h) and other time points (Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Because animals in the isolated condition were in a compartment adjacent to either an opposite sexed individual or a familiar former cagemate, we cannot rule out that olfactory or auditory cues may have made animals aware of the presence of a potential social partner. Indeed, we interpret this dimension as capturing appetitive aspects of behaviors associated with investigation of the animal isolated from the subject by the barrier.”

      Reviewer #2 (Public Review):

      An important caveat to this study not mentioned by the authors is that c-fos provides a snapshot of neural activity and that important populations of neurons could be active and not express c-fos. Thus observed correlations are likely to be robust, but the absence of differences (in say accumbens) may just reflect the limits of c-fos estimation of neural activity. Similarly, highly coordinated neural activity between males and females might still be driven by different mechanisms if different cell types were activated within a specific region.

      We now discuss limitations of c-Fos in the Discussion paragraph that focuses on accumbens:

      “The absence of significant accumbens IEG induction may reflect the limitations of using c-Fos and other immediate early genes as indicators of neural activity. It is known that some neuronal populations can be active without expressing c-Fos (Sheng & Greenberg, 1990). Indeed, although a variety of studies implicate the accumbens in bond formation (Amadei et al., 2017; Aragona et al., 2006; Scribner et al., 2020), previous work finds only weak c-Fos induction in the prairie vole accumbens during bonding (Curtis & Wang, 2003).”

      We also include the following sentence in a new Discussion paragraph that focuses on caveats to our findings:

      “Before offering a synthesis of our findings, it would be useful to acknowledge or reiterate a few caveats. First, as noted above, IEG induction does not capture all relevant neural activity (Sheng & Greenberg 1990). Second, the design of our experiment, which controlled for social interaction, likely excluded many circuits important to both pair bonding and sibling social interactions. Third, c-Fos activity within a given brain region may nevertheless rely on distinct cell types, and so the absence of sex differences in c-Fos immunoreactivity does not definitively rule out the sexually dimorphic circuits hypothesized in the “dual function hypothesis” (de Vries, 2004). Lastly, the current study focused on animals that were screened for sexual receptivity, which may have limited the variation in sexual behavior across opposite-sex pairs.”

      Recommendations for the authors:

      It appears as if df is missing from some statistical reporting.

      Thank you for pointing this out. We went through the manuscript and added in sample sizes to statistical reporting.

      Reviewer #1 (Recommendations for the authors):

      It is surprising that the cortex was not more extensively identified as being involved in pair bonding, but perhaps this is because the emphasis for choosing brain areas in the cortical region is biased towards olfactory regions. Please discuss. It may also be worth noting that brain regions associated with perception may be important in all of these processes, but selected out because of the design.

      Thank you for this observation. We agree that some cortical regions may not have been identified due to the study design. For example, social processes related to both pair bonding and cagemate recognition likely rely on overlapping circuits. It is also important to note here that our analysis approach identified the “most” significant regions. This means that several candidate regions did not survive the statistical threshold used to select regions. We now discuss the cortex in more detail in the Discussion, where we also identify the regions that approached significance but did not survive multiple test corrections:

      “Although the PFC and other olfactory cortical areas formed a cluster, we did not find widespread c-Fos induction throughout the cortex in response to pairing. It seems likely that sensory and motor areas were important for social processes related to both pair-bonding and reunion with same-sex cagemates, such as investigation and recognition. Our study design, however, highlights differences between treatments, and in order to detect such effects, it might be necessary to compare mating and bonding pairs to animals left in complete isolation. Moreover, several cortical regions that did not survive corrections for multiple tests may have been identified in a less stringent analysis. Several subregions within the isocortex, hippocampal formation, and cortical subplate had statistical models that approached significance (i.e., p-values < 0.1) prior to multiple test corrections. These subregions were found within primary somatosensory area, primary auditory area, dorsal and ventral auditory areas, primary visual area, anteromedial visual area, agranular insular area, temporal association areas, ectorhinal area, postsubiculum, and basomedial amygdala. Frontal cortex subregions were within the agranular insular area and orbital area, as well as additional subregions in prelimbic and infralimbic areas of the PFC.”

      Same-sex siblings were isolated for 4-5 days and then re-paired. This is a creative way of dealing with this, but was any aggression displayed in the same-sex pairs? Are there bonds or preferences among same-sex individuals? Could the isolation have set the stage for neural changes associated with migrating from the natal group? 4-5 days of isolation is not trivial.

      Thank you for these questions. We did not witness aggression between same-sex pairs. We had recorded ‘aggression’ events (lunges and chases) during the 1 h behavioral observation epochs and found that these rates were nearly zero for all sibling timepoint groups (events/h per focal animal in mean ± sd: 2.5 h group = 0.58 ± 1.53, 6 h group = 0.17 ± 0.48, 22 h group = 0.25 ± 0.44).

      The question about peer relationships is a good one. Previous literature does suggest that prairie voles can develop preferences for familiar same-sex individuals (e.g., Beery et al. 2018 Front. Behav. Neuro., Lee et al. 2019 Front. Behav. Neuro). Thus, we want to reiterate here that our study design tests for differences between these baseline levels of affiliation with pair bonding in a reproductive context.

      It is possible that the period of isolation prior to experiments may have set the stage for neural changes associated with migration from the natal group. Testing this possibility is outside the scope of the current study. We want to point out here that animals were separated from their natal groups several weeks prior to the experiment. Animals were weaned at 21 days and put into same-sex cages, and then experiments occurred between 8-12 weeks of age. All experiment groups went through the same weaning and co-housing conditions.

      Pg 26, Line 655: "better" is listed twice in the sentence and only one is needed

      Thank you for catching this typo. This is fixed.

      Reviewer #2 (Recommendations for the authors):

      Why was it necessary to bring voles into estrus when they are induced ovulators? The authors need to state how voles were brought into estrus.

      Thank you for this suggestion. We explained estrus induction in the Methods, but this explanation could be missed because it was within the “Behavioral procedures” section. We put the paragraph about estrus induction into a new section called “Estrus induction and animal selection”. We also elaborated on the final sentences of this paragraph to provide a clearer rationale:

      “We used this mating assay to restrict study subjects to voles that showed lordosis (females) or mounting behavior (males). By selecting voles who showed sexual behavior, we could control the estrus state and timing of mating across the 0, 2.5, 6 and 22 h study groups. This selection process also ensured that animals assigned to the same-sex sibling pair and opposite-sex mating pair groups had similar sexual motivation and experience.”

      I assume in the final manuscript the authors will release the availability of the atlas? Making the atlas public seems to be in the spirit of the eLife publishing model.

      The prairie vole reference brain, atlas, and atlas annotation labels, are now included on the Figshare repository site. We updated the Data and code availability section to clarify this.

      Reviewer #3 (Recommendations for the authors):

      Please clarify in the Methods if same-sex sibling females were also estrogen primed. If not, could the estrogen exposure cause Fos differences?

      Thank you for this suggestion. All females were estrogen primed. We refined the Methods section “Estrus induction and animal selection” to make this part of the study design clearer. We edited one of the sentences to say “During this isolation period, all females were induced into estrus[...]” We also added a couple sentences at the end of this paragraph:

      “By selecting voles who showed sexual behavior, we could control the estrus state and timing of mating across the 0, 2.5, 6 and 22 h study groups. This selection process also ensured that animals assigned to the same-sex sibling pair and opposite-sex mating pair groups had similar sexual motivation and experience.”

    1. Author Response:

      We thank the reviewers for their careful comments. We agree with the comments from both reviewers and recognize that the term “cell transplantation”, used throughout the manuscript including the title, was confusing. We will revise the manuscript to clarify the aim of the study and to state the conclusion more directly.

      Response to reviewers:

      We interpret the data of the present study as indicating that the color of each RPE cell is a transient condition that does not necessarily represent the quality of the cells (e.g. for cell transplantation). We consider that this may apply not only in vitro but also in vivo, although we do not know whether RPE shows heterogeneous levels of pigmentation in vivo.

      Because our concern with iPSC-RPE is always their quality for cell transplantation, we may not have fairly evaluated the scientific significance of the present study.

      We also recognize that using the term “cell transplantation” to explain what we meant by the “quality” of the cells was confusing. The aim of the study was not to show how the pigmentation level of transplanted RPE affects the outcome of cell transplantation, but to show the heterogeneous gene expression of iPSC-derived RPE cells and the weak correlation of that heterogeneity with pigmentation level. We will go through the manuscript, including the title, so that it leads more directly to this conclusion: the degree of pigmentation had some, but weak, correlation with the expression levels of functional genes, and the correlation may be weak because color is a transient condition (as we interpret from the data) that differs from more stable characteristics of the cells.

      We agree that “cell transplantation” in the title (and other parts) was misleading. We will therefore change the title and remove the phrasing that suggested the aim of the study was to show something about cell transplantation or in vivo results.

      Also, to give appropriate weight to the scientifically significant results of the present study, we will discuss the correlation of pigmentation level with some functional genes in more detail and present this as one of the conclusions of the manuscript.

    1. Author Response:

      Reviewer #1 (Public Review):

      [...] Weaknesses:

      1. I feel the authors need to justify why flow-crushing helps localization specificity. There is an entire family of recent papers that aim to achieve higher localization specificity by doing the exact opposite. Namely, MT or ABC fMRI aims to increase the localization specificity by highlighting the intravascular BOLD by means of suppressing non-flowing tissue. To name a few:

      Priovoulos, N., de Oliveira, I.A.F., Poser, B.A., Norris, D.G., van der Zwaag, W., 2023. Combining arterial blood contrast with BOLD increases fMRI intracortical contrast. Human Brain Mapping hbm.26227. https://doi.org/10.1002/hbm.26227.

      Pfaffenrot, V., Koopmans, P.J., 2022. Magnetization Transfer weighted laminar fMRI with multi-echo FLASH. NeuroImage 119725. https://doi.org/10.1016/j.neuroimage.2022.119725

      Schulz, J., Fazal, Z., Metere, R., Marques, J.P., Norris, D.G., 2020. Arterial blood contrast (ABC) enabled by magnetization transfer (MT): a novel MRI technique for enhancing the measurement of brain activation changes. bioRxiv. https://doi.org/10.1101/2020.05.20.106666

      Based on this literature, it seems that the proposed method will make the vein problem worse, not better. The authors could make it clearer how they reason that making GE-BOLD signals more extra-vascular weighted should help to reduce large vein effects.

      The empirical evidence for the claim that flow crushing helps with the localization specificity should be made clearer. The response magnitude with and without flow crushing looks pretty much identical to me (see Fig. 6d). It's unclear to me what to look for in Fig. 5. I cannot discern any layer patterns in these maps. It's too noisy. The two maps of TE=43ms look like identical copies of each other. Maybe an editorial error?

      The authors discuss bipolar crushing with respect to SE-BOLD where it has been previously applied. For SE-BOLD at UHF, a substantial portion of the vein signal comes from the intravascular compartment. So I agree that for SE-BOLD, it makes sense to crush the intravascular signal. For GE-BOLD however, this reasoning does not hold. For GE-BOLD (even at 3T), most of the vein signal comes from extravascular dephasing around large unspecific veins, and the bipolar crushing is not expected to help with this.

      The authors would like to clarify that the velocity-nulling gradient is NOT designed to suppress all the contributions from intravascular blood. Instead, we tried to find a balance so that the VN gradient maximally suppressed the macrovascular signal in unspecific veins but minimally attenuated the microvascular signal in the specific capillary bed. We acknowledge the reviewer's concern regarding the potential extravascular contributions from large, non-specific vessels. This aspect will be thoroughly evaluated and addressed in the revised manuscript. Additionally, we will clarify other parts of the manuscript that may have caused the reviewer’s misunderstanding.

      2. The bipolar crushing is limited to one single direction of flow. This introduces a lot of artificial variance across the cortical folding pattern. This is not mentioned in the manuscript. There is an entire family of papers that perform layer-fMRI with black-blood imaging that solves this with a 3D contrast preparation (VAPER) that is applied across a longer time period, thus killing the blood signal while it flows across all directions of the vascular tree. Here, the signal crushing is happening with a 2D readout as a "snap-shot" crushing. This does not allow the blood to flow in multiple directions. VAPER also accounts for BOLD contaminations of larger draining veins by means of a tag-control sampling. The proposed approach here does not account for this contamination.

      Chai, Y., Li, L., Huber, L., Poser, B.A., Bandettini, P.A., 2020. Integrated VASO and perfusion contrast: A new tool for laminar functional MRI. NeuroImage 207, 116358. https://doi.org/10.1016/j.neuroimage.2019.116358

      Chai, Y., Liu, T.T., Marrett, S., Li, L., Khojandi, A., Handwerker, D.A., Alink, A., Muckli, L., Bandettini, P.A., 2021. Topographical and laminar distribution of audiovisual processing within human planum temporale. Progress in Neurobiology 102121. https://doi.org/10.1016/j.pneurobio.2021.102121

      If I would recommend anyone to perform layer-fMRI with blood crushing, it seems that VAPER is the superior approach. The authors could make it clearer why users might want to use the unidirectional crushing instead.

      We acknowledge that the degree of velocity nulling varies across the cortical folding pattern. We intend to discuss potential solutions to address this variance, and these may be implemented in the revised manuscript as appropriate. Furthermore, we will provide a comprehensive discussion on the advantages and disadvantages of both CBV-based and BOLD-based approaches.

      3. The comparison with VASO is misleading. The authors claim that previous VASO approaches were limited by TRs of 8.2s. The authors might be advised to check the latest literature of the last years. Koiso et al. performed whole brain layer-fMRI VASO at 0.8mm at 3.9 seconds (with reliable activation), 2.7 seconds (with unconvincing activation pattern, though), and 2.3 seconds (without activation). Also, whole brain layer-fMRI BOLD at 0.5mm and 0.7mm has been previously performed by the Juelich group at TRs of 3.5s (their TR definition is 'fishy' though).

      Koiso, K., Müller, A.K., Akamatsu, K., Dresbach, S., Gulban, O.F., Goebel, R., Miyawaki, Y., Poser, B.A., Huber, L., 2023. Acquisition and processing methods of whole-brain layer-fMRI VASO and BOLD: The Kenshu dataset. Aperture Neuro 34. https://doi.org/10.1101/2022.08.19.504502

      Yun, S.D., Pais‐Roldán, P., Palomero‐Gallagher, N., Shah, N.J., 2022. Mapping of whole‐cerebrum resting‐state networks using ultra‐high resolution acquisition protocols. Human Brain Mapping. https://doi.org/10.1002/hbm.25855

      Pais-Roldan, P., Yun, S.D., Palomero-Gallagher, N., Shah, N.J., 2023. Cortical depth-dependent human fMRI of resting-state networks using EPIK. Front. Neurosci. 17, 1151544. https://doi.org/10.3389/fnins.2023.1151544

      The authors are correct that VASO is not advised as a turn-key method for lower brain areas, incl. Hippocampus and subcortex. However, the authors use this word of caution that is intended for inexperienced "users" as a statement that this cannot be performed. This statement is taken out of context. This statement is not from the academic literature. It's advice for the 40+ user base that wants to perform layer-fMRI as a plug-and-play routine tool in neuroscience usage. In fact, sub-millimeter VASO is routinely being performed by MRI-physicists across all brain areas (including deep brain structures, hippocampus etc). E.g. see Koiso et al. and an overview lecture from a layer-fMRI workshop that I had recently attended: https://youtu.be/kzh-nWXd54s?si=hoIJjLLIxFUJ4g20&t=2401

      Thus, the authors could embed this phrasing into the context of their own method that they are proposing in the manuscript. E.g. the authors could state whether they think that their sequence has the potential to be disseminated across sites, considering that it requires slow offline reconstruction in Matlab? Do the authors think that the results shown in Fig. 6c are suggesting turn-key acquisition of a routine mapping tool? In my humble opinion, it looks like random noise, with most of the activation outside the ROI (in white matter).

      These references will be cited and discussed in the revised manuscript. Furthermore, we are considering the exclusion of the LGN results presented in Figure 6, as they may divert attention from the primary focus of the study.

      We are enthusiastic about sharing our imaging sequence, provided its usefulness is conclusively established. However, it's important to note that without an online reconstruction capability, such as the ICE, the practical utility of the sequence may be limited. Unfortunately, we currently don’t have the manpower to implement the online reconstruction. Nevertheless, we are more than willing to share the offline reconstruction codes upon request.

      4. The repeatability of the results is questionable. The authors perform experiments about the robustness of the method (line 620). The corresponding results are not suggesting any robustness to me. In fact, the layer profiles in Fig. 4c vs. Fig 4d are completely opposite. The location of peaks turns into locations of dips and vice versa. The methods are not described in enough detail to reproduce these results. The authors mention that their image reconstruction is done "using in-house MATLAB code" (line 634). They do not post a link to GitHub, nor do they say if they share this code.

      It is not trivial to get good phase data for fMRI. The authors do not mention how they perform the respective coil-combination. No data are shared for reproduction of the analysis.

      There may have been a misunderstanding regarding the presentation in Figure 4, which illustrates the impact of TEs and the VN gradient. To enhance clarity and avoid further confusion, we will redesign this figure for improved comprehension.

      The authors are open to sharing the MATLAB codes associated with our study. However, we were limited by manpower for refining and enhancing the readability of these codes for broader use.

      Regarding the coil combination, we utilized an adaptive coil combination approach as described in the paper by Walsh DO, Gmitro AF, and Marcellin MW, titled 'Adaptive reconstruction of phased array MR imagery' (Magnetic Resonance in Medicine 2000; 43:682-690). The MATLAB code for this method was implemented by Dr. Diego Hernando. We will include a link for downloading this code in the revised manuscript for the convenience of interested readers.

      5. The application of NORDIC is not validated. Previous applications of NORDIC at 3T layer-fMRI have resulted in mixed success. When not adjusted for the right SNR regime it can result in artifactual reductions of beta scores, depending on the SNR across layers. The authors could validate their application of NORDIC and confirm that the average layer-profiles are unaffected by the application of NORDIC. Also, the NORDIC version should be explicitly mentioned in the manuscript.

      Akbari, A., Gati, J.S., Zeman, P., Liem, B., Menon, R.S., 2023. Layer Dependence of Monocular and Binocular Responses in Human Ocular Dominance Columns at 7T using VASO and BOLD (preprint). Neuroscience. https://doi.org/10.1101/2023.04.06.535924

      Knudsen, L., Guo, F., Huang, J., Blicher, J.U., Lund, T.E., Zhou, Y., Zhang, P., Yang, Y., 2023. The laminar pattern of proprioceptive activation in human primary motor cortex. bioRxiv. https://doi.org/10.1101/2023.10.29.564658

      During our internal testing, we observed that the NORDIC denoising process did not alter the activation patterns. These findings will be incorporated into the revised manuscript. The details of NORDIC will be provided as well.

      Reviewer #2 (Public Review):

      [...] The well-known double peak feature in M1 during finger tapping was used as a test-bed to evaluate the spatial specificity. They were indeed able to demonstrate two distinct peaks in group-level laminar profiles extracted from M1 during finger tapping, which was largely free from superficial bias. This is rather intriguing as, even at 7T, clear peaks are usually only seen with spatially specific non-BOLD sequences. This is in line with their simple simulations, which nicely illustrated that, in theory, intravascular macrovascular signals should be suppressible with only minimal suppression of microvasculature when small b-values of the VN gradients are employed. However, the authors do not state how ROIs were defined making the validity of this finding unclear; were they defined from independent criteria or were they selected based on the region mostly expressing the double peak, which would clearly be circular? In any case, results are based on a very small sub-region of M1 in a single slice - it would be useful to see the generalizability of superficial-bias-free BOLD responses across a larger portion of M1.

      Given the individual variations in the location of the M1 region, we opted for manual selection of the ROI. In the revised manuscript, we plan to explore and implement an independent criterion for ROI selection to enhance the objectivity and reproducibility of our methodology.

      As repeatedly mentioned by the authors, a laminar fMRI setup must demonstrate adequate functional sensitivity to detect (in this case) BOLD responses. The sensitivity evaluation is unfortunately quite weak. It is mainly based on the argument that significant activation was found in a challenging sub-cortical region (LGN). However, it was a single participant, the activation map was not very convincing, and the demonstration of significant activation after considerable voxel-averaging is inadequate evidence to claim sufficient BOLD sensitivity. How well sensitivity is retained in the presence of VN gradients, high acceleration factors, etc., is therefore unclear. The ability of the setup to obtain meaningful functional connectivity results is reassuring, yet, more elaborate comparison with e.g., the conventional BOLD setup (no VN gradients) is warranted, for example by comparison of tSNR, quantification and comparison of CNR, illustration of unmasked-full-slice activation maps to compare noise-levels, comparison of the across-trial variance in each subject, etc. Furthermore, as NORDIC appears to be a cornerstone to enable submillimeter resolution in this setup at 3T, it is critical to evaluate its impact on the data through comparison with non-denoised data, which is currently lacking.

      We appreciate the reviewer’s comments. Those issues will be addressed carefully.
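
      One of the comparisons the reviewer requests, temporal SNR with and without VN gradients, can be sketched as follows. This is a minimal illustration assuming tSNR is defined as the temporal mean of a voxel's signal divided by its temporal standard deviation; the function name and the two time series are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def temporal_snr(time_series):
    """Temporal SNR of one voxel: temporal mean divided by temporal SD."""
    sd = stdev(time_series)  # sample standard deviation over time points
    if sd == 0:
        return float("inf")  # a perfectly constant signal has unbounded tSNR
    return mean(time_series) / sd


# Hypothetical voxel time series (arbitrary units), illustration only.
bold_series = [100.0, 102.0, 98.0, 100.0]  # assumed acquisition without VN gradients
vn_series = [80.0, 83.0, 77.0, 80.0]       # assumed acquisition with VN gradients

tsnr_bold = temporal_snr(bold_series)
tsnr_vn = temporal_snr(vn_series)
```

      In practice the same calculation would be applied voxel by voxel after any detrending, and the resulting maps compared between acquisitions.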

      Reviewer #3 (Public Review):

      [...] Weaknesses:

      • Although the VASO acquisition is discussed in the introduction section, the VN-sequence seems closer to diffusion-weighted functional MRI. The authors should make it more clear to the reader what the differences are, and how results are expected to differ. Generally, it is not so clear why the introduction is so focused on the VASO acquisition (which, curiously, lacks a reference to Lu et al 2013). There are many more alternatives to BOLD-weighted imaging for fMRI. CBF-weighted ASL and GRASE have been around for a while, ABC and double-SE have been proposed more recently.

      The principal distinction between DW-fMRI and our methodology lies in the level of the b-value employed. DW-fMRI typically measures cellular swelling by utilizing a b-value greater than 1000 s/mm^2 (e.g. 1800). Conversely, our Velocity Nulling functional MRI (VN-fMRI) approach continues to assess hemodynamic responses, utilizing a smaller b-value specifically for the suppression of signals from draining veins. In addition, other layer-fMRI methods will be discussed.
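
      To make the role of the b-value concrete, the sketch below evaluates the textbook relations for an idealized bipolar pair of rectangular gradient lobes: b = γ²G²δ²(Δ − δ/3) and a velocity-induced phase φ = γGδΔv. The rectangular-lobe assumption, the function names, and the gradient timings are illustrative assumptions and do not describe the actual VN-fMRI sequence.

```python
import math

# Proton gyromagnetic ratio in rad/s/T (approximately 2*pi * 42.577 MHz/T).
GAMMA = 2 * math.pi * 42.577e6


def b_value(g, delta, big_delta):
    """b-value in s/mm^2 for two rectangular lobes of opposite polarity.

    g: gradient amplitude in T/m; delta: lobe duration in s;
    big_delta: time between the lobe onsets in s.
    """
    b_si = (GAMMA * g * delta) ** 2 * (big_delta - delta / 3)  # s/m^2
    return b_si * 1e-6  # convert s/m^2 to s/mm^2


def velocity_phase(g, delta, big_delta, velocity):
    """Phase in radians accrued by spins moving at constant velocity (m/s)."""
    return GAMMA * g * delta * big_delta * velocity


# Hypothetical settings: 40 mT/m lobes lasting 5 ms, onsets 10 ms apart.
b_example = b_value(0.04, 0.005, 0.010)
phase_example = velocity_phase(0.04, 0.005, 0.010, 0.01)  # blood at 1 cm/s
```

      With these assumed settings the b-value is in the tens of s/mm², far below the values above 1000 s/mm² quoted for DW-fMRI, while spins moving at 1 cm/s already accrue several radians of phase.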

      • The comparison in Figure 2 for different b-values shows % signal changes. However, as the baseline signal changes dramatically with added diffusion weighting, this is rather uninformative. A plot of t-values against cortical depth would be much more insightful.
      • Surprisingly, the %-signal change for a b-value of 0 is not significantly different from 0 in the gray matter. This raises some doubts about the task or ROI definition. A finger-tapping task should reliably engage the primary motor cortex, even at 3T, and even in a single participant.
      • The BOLD weighted images in Figure 3 show a very clear double-peak pattern. This contradicts the results in Figure 2 and is unexpected given the existing literature on BOLD responses as a function of cortical depth.

      In our study, the TE in Figure 2 is shorter than that in Figure 3 (33 ms versus 43 ms). It has been reported in the literature that BOLD fMRI with a shorter TE tends to include a greater intravascular contribution. Acknowledging this, we plan to repeat the experiments with a controlled TE to ensure consistency in our results.

      • Given that data from Figures 2, 3, and 4 are derived from a single participant each, order and attention effects might have dramatically affected the observed patterns. Especially for Figure 4, neither BOLD nor VN profiles are really different from 0, and without statistical values or inter-subject averaging, these cannot be used to draw conclusions.

      The order of the experiments was randomized to ensure unbiased results.

      It is important to note that the error bars presented in Figures 2, 3, and 4 do not represent the standard deviation of the residual fitting error. Instead, they illustrate the variation across voxels within a specific layer. This approach may lead to the error bars being influenced by the selection of the Region of Interest (ROI). In light of this, we intend to refine our statistical methodologies in the revised manuscript to address this issue.

      • In Figure 5, a phase regression is added to the data presented in Figure 4. However, for a phase regression to work, there has to be a (macrovascular) response to start with. As none of the responses in Figure 4 are significant for the single participant dataset, phase regression should probably not have been undertaken. In this case, the functional 'responses' appear to increase with phase regression, which is counterintuitive and deserves an explanation.
      • Consistency of responses is indeed expected to increase by a removal of the more variable vascular component. However, the microvascular component is always expected to be smaller than the combination of microvascular + macrovascular responses. Note that the use of %signal changes may obscure this effect somewhat because of the modified baseline. Another expected feature of BOLD profiles containing both micro- and macrovasculature is the draining towards the cortical surface. In the profiles shown in Figure 7, this is completely absent. In the group data, no significant responses to the task are shown anywhere in the cortical ribbon.
      • Although I'd like to applaud the authors for their ambition with the connectivity analysis, I feel that acquisitions that are so SNR starved as to fail to show a significant response to a motor task should not be used for brain wide directed connectivity analysis.

      We agree that exploring brain-wide directed functional connectivity may be overly ambitious at this stage, particularly before the VN-fMRI technique has been comprehensively evaluated and validated. In the revised manuscript, we will focus more on examining the characteristics of the layer-dependent BOLD signal rather than delving into layer-dependent functional connectivity.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors developed computational models that capture the electrical and Ca2+ signaling behavior in mesenteric arterial cells from male and female mice. A baseline model was first formulated with eleven transmembrane currents and three calcium compartments. Sex-specific differences in the L-type calcium channel and two voltage-gated potassium channels were then tuned based on experimental measurements. To incorporate the stochastic ion channel openings seen in smooth muscle cells under physiological conditions, noise was added to the membrane potential and the sarcoplasmic Ca2+ concentration equations. Finally, the models were assembled into 1D vessel representations and used to investigate the tissue-level electrical response to an L-type calcium channel blocker.

      Strengths:

      A major strength of the paper is that the modeling studies were performed on three different scales: individual ionic currents, whole-cell, and 1D tissue. This comprehensive computational framework can help provide mechanistic insight into arterial myocyte function that might be difficult to achieve through traditional experimental methods.

      The authors aimed to develop sex-specific computational models of mesenteric arterial myocytes and demonstrate their use in drug-testing applications. Throughout the paper, model behavior was both validated by experimental recordings and supported by previously published data. The main findings from the models suggested that sex-specific differences in membrane potential and Ca2+ handling are attributable to variability in the gating of a small number of voltage-gated potassium channels and L-type calcium channels. This variability contributes to a higher Ca2+ channel blocker sensitivity in female arterial vessels. Overall, the study successfully met the aims of the paper.

      Thank you for your insightful review and for recognizing the strengths of our study. We appreciate your encouraging comment regarding our multi-scale approach. Indeed, we believe that by systematically connecting these scales—individual ionic currents, whole-cell, and 1D tissue—we can integrate and reconcile experimental and clinical data. We anticipate that this approach will not only provide mechanistic insights into arterial myocyte function that may not be easy to glean from traditional experimental methods but will also facilitate the translation of this information into the development of therapeutic interventions.

      Weaknesses:

      A main weakness of the paper, as addressed by the authors, is the simplicity of the 1D vessel model; it does not take into account various signaling pathways or interactions with other cell types which could impact smooth muscle electrophysiology.

      Thank you for highlighting areas for improvement in our study. The strength of computational modeling lies in its iterative nature, allowing us to introduce and examine variables in a systematic manner. While our current model is simplified and does not contain all details, the modular nature of the build will allow continuous expansion to add the important elements described by the reviewer. We are enthusiastic about progressively enriching the model in subsequent studies, introducing signaling pathways in a step-by-step manner, and ensuring their validation with rigorous experimental data.

      Another potential shortcoming is the use of mouse data for optimizing the model, as there could be discrepancies in signaling behavior that limit the translatability to human myocyte predictions.

      We appreciate this important comment. Our model was parametrized using data from mouse mesenteric artery smooth muscle cells as initial proof of concept. Mouse arteries are a good representation of human arteries, as they have similar intravascular pressure-myogenic tone relationships, resting membrane potentials, and express similar ionic channels (e.g., CaV1.2, BK channels, RyRs, etc) (PMID: 28119464, PMID: 29070899, PMID: 23232643). In response to the reviewer, we have modified the discussion section of the manuscript to specifically note the mouse is not identical to the human but does share some common important features that make mice a good approximate model.

      Reviewer #2 (Public Review):

      In this study, Hernandez-Hernandez et al developed a gender-dependent mathematical model of arterial myocytes based on a previous model and new experimental data. The ionic currents of the model and its sex difference were formulated based on patch-clamp experimental data, and the model properties were compared with single-cell and tissue scale experimental results. This is a study that is of importance for the modeling field as well as for experimental physiology.

      Thank you for the comment. In fact, we developed a model that incorporates sex-dependent differences that allowed for male and female models. It’s an important distinction as sex is a biological variable and gender is a self-ascribed characteristic.

      Reviewer #3 (Public Review):

      Summary:

      This hybrid experimental/computational study by Hernandez-Hernandez sheds new light on sex-specific differences between male and female arterial myocytes from resistance arteries. The authors conduct careful experiments in isolated myocytes from male and female mice to obtain the data needed to parameterize sex-specific models of two important ionic currents (i.e., those mediated by CaV1.2 and KV2.1). Available experimental data suggest that KV1.5 channel currents from male and female myocytes are similar, but simulations conducted in the novel Hernandez-Hernandez sex-specific models provide a more nuanced view. This gives rise to the first of the authors' three key scientific claims: (1) In males, KV1.5 is the dominant current regulating membrane potential; whereas, in females, KV2.1 plays a primary role in voltage regulation. They further show that (2) the latter distinction drives sex-specific differences in intracellular Ca2+ and cellular excitability. Finally, working with one-dimensional models comprising several copies of the male/female myocyte models linked by resistive junctions, they use simulations to (3) predict that the sensitivity of arterial smooth muscle to Ca2+ channel-blocking drugs commonly used to treat hypertension is heightened in female compared to male cells.

      Strengths:

      The Methodology is described in exquisite detail in straightforward language that will be easy to understand for most if not all peer groups working in computational physiology. The authors have deployed standard protocols (e.g., parameter fitting as described by Kernik et al., sensitivity analysis as described by Sobie et al.) and appropriate brief explanations of these techniques are provided. The manoeuvre used to represent stochastic effects on voltage dynamics is particularly clever and something I have not personally encountered before. Collectively, these strengthen the credibility of the model and greatly enrich the manuscript.

      We appreciate your comment highlighting the robustness of our methodology. Your acknowledgment of our approach to represent stochastic effects on voltage dynamics is especially encouraging. Indeed, noise is a fundamental component of physiological systems, including in vascular myocytes.

      Broadly speaking, the Results section describes findings that robustly support the three key scientific claims outlined in my summary. While there is certainly room for further discussion of some nuanced points as outlined below, it is evident these experiments were carefully designed and carried out with care and intentionality. In the present version of the manuscript, there are a few figures in which experimental data is shown side-by-side with outputs from the corresponding models. These are an excellent illustration of the power of the authors' novel sex-specific computational simulation platform. I think these figures will benefit from some modest additional quantitative analysis to substantiate the similarities between experimental and computational data, but there is already clear evidence of a good match.

      We sincerely appreciate your constructive feedback on the Results section. We agree with the reviewer on the potential value of a more quantitative assessment and have included additional quantitative analysis to substantiate the similarities between experimental and computational data. We have updated the figure to include an in-depth analysis that provides greater insight and solidifies the power of our simulation predictions when compared to experimental results. A detailed analysis of the male and female data, as well as the male and female simulations, is summarized in the text as follows:

      Baseline membrane potential is -40 mV in male myocytes compared to -30 mV in female myocytes. The frequency of hyperpolarization transients (THs) is 1 Hz in male and 2.5 Hz in female cells for the specific baseline membrane potential shown in Figure 5 A-B. In the range of membrane potentials from -50 mV to -30 mV, the frequency increases from 1 to 2.8 Hz, which is identical to the experimental frequency range.
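
      One simple way to obtain an event frequency like those quoted above is to count downward threshold crossings in a membrane potential trace and divide by the trace duration. The sketch below is a generic illustration, not the analysis used in the manuscript; the function name, the threshold, and the synthetic trace are all assumptions.

```python
def transient_frequency(vm, dt, threshold):
    """Count hyperpolarizing transients as downward crossings of `threshold`.

    vm: list of membrane potentials (mV) sampled every `dt` seconds.
    Returns events per second (Hz).
    """
    events = 0
    below = vm[0] < threshold
    for v in vm[1:]:
        now_below = v < threshold
        if now_below and not below:
            events += 1  # the trace has just dipped below the threshold
        below = now_below
    duration = dt * (len(vm) - 1)
    return events / duration

# Synthetic 3 s trace (hypothetical): baseline -40 mV with three brief dips to -55 mV.
dt = 0.1
trace = [-40.0] * 31
for idx in (5, 15, 25):
    trace[idx] = -55.0

freq_hz = transient_frequency(trace, dt, threshold=-50.0)
print(f"{freq_hz:.2f} Hz")
```

      A real recording would normally need filtering and a minimum event duration before counting, since noise near the threshold can otherwise produce spurious crossings.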

      Areas for Improvement:

      The authors used experimental data from a prior publication to calibrate their model of the BKCa current. As indicated in the manuscript, these data are for channel activity measured in a heterologous expression system (Xenopus oocytes). A similar principle applies to other major ion channels/pumps/etc. Is it possible there might be relevant sex-specific differences in these players as well? In the context of the present work, this feels like an important potential caveat to highlight, in case male/female differences in the activity of BKCa or other currents might influence model-predicted differences (e.g., the relative importance of KV1.5 and KV2.1). This should be discussed, and, if possible, related to the elegant sensitivity analysis presented in Fig. 5C (which shows, for example, that the models are relatively insensitive to variation in GBK).

      We fully agree with the reviewer - an important caveat to highlight is the unknown sex-specific differences in all the other players regulating membrane potential and calcium signaling. While our initial assessments indicated that the contribution of BKCa channels to the total voltage-gated K+ current (IKvTOT) was small within the physiological range of -50 mV to -30 mV, further analysis of spontaneous transient outward currents revealed sex-specific variations. We have investigations underway to explore if BKCa channel expression and organization may be also sex-dependent.

      The authors state that their model can be expanded to 2D/3D applications, "transitioning seamlessly from single-cell to tissue-level simulations". I would like to see more discussion of this. For example, given the modest complexity of the cell-scale model, how considerable would the computational burden be to implement a large network model of a subset of the human female or male arterial system? Are there sex-specific differences in vessel and/or network macro-structure that would need to be considered? How would this influence feasibility? Rather than a 1D cable as implemented here, I imagine a multi-scale implementation would involve the representation of myocytes wrapped around vessels. How would the behavior of such a system differ from the authors' presented work using a 1D representation of 100 myocytes coupled end-to-end? Could these differences partially explain why the traces in Fig. 8D are smoother than those in Fig. 8C? From my standpoint, discussing these points would enrich the paper.

      We appreciate the reviewer’s thoughtful and forward-looking ideas! Indeed, we are very interested in extending the model to incorporate a number of these important items.

      Our choice for the 1D cable model was driven by its anatomical relevance to the structure of third and fourth-order mesenteric arteries. These arteries possess a single layer of vascular myocytes encircling the lumen in a cylindrical arrangement. When we conceptualize this structure as unrolled or viewed laterally, it aligns with a flat, rectangular form, closely paralleling our 1D cable implementation. One option is to expand this into a 2D representation by connecting multiple 1D cables together. Another option would be to connect the 1D cable end-to-end to create a ring to represent a cross section. While these approaches would appear to be different geometries, in either case the dynamics will remain consistent because the cells comprising the tissue are the same. There is no propagating impulse (and even if there were, a planar wave in a homogeneous 2D tissue is identical to its 1D counterpart), so the only effect will be an increase in electrotonic load (sink) from neighboring cells, which can readily be approximated in 1D by increasing coupling or by modifying the boundary conditions.
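
      The geometric argument can be made concrete with a minimal sketch of nearest-neighbour gap-junction coupling, written here as a discrete Laplacian. This is an assumed, generic formulation (it is not Equation 14 of the manuscript); the function name, the sign convention (positive current depolarizes the cell), and the no-flux cable ends are illustrative choices.

```python
def coupling_currents(v, g_gap, ring=False):
    """Net gap-junction current into each cell from its two neighbours.

    v: list of membrane potentials (mV); g_gap: coupling conductance
    (arbitrary units). ring=False gives a cable with sealed (no-flux) ends;
    ring=True joins the last cell to the first.
    """
    n = len(v)
    currents = []
    for i in range(n):
        if ring:
            left, right = v[(i - 1) % n], v[(i + 1) % n]
        else:
            left = v[i - 1] if i > 0 else v[i]       # sealed end
            right = v[i + 1] if i < n - 1 else v[i]  # sealed end
        currents.append(g_gap * (left - 2.0 * v[i] + right))
    return currents

# Identical cells at the same potential exchange no current in either geometry.
uniform = [-40.0] * 5
print(coupling_currents(uniform, 1.0), coupling_currents(uniform, 1.0, ring=True))

# One hyperpolarized cell is pulled back toward its neighbours (electrotonic load).
perturbed = [-40.0, -40.0, -60.0, -40.0, -40.0]
print(coupling_currents(perturbed, 1.0))
```

      In this toy formulation the cable and the ring differ only at the two end cells, which is one way to see why a change of geometry can be approximated by adjusting coupling strength or boundary conditions.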

      We totally agree that future investigation should include exploration into the potential sex-specific differences in vessel and/or network macro-structure, as these factors may critically impact predictions and indeed the difference in traces observed between Fig. 8D and Fig. 8C may well involve “insulating” effects of vessel layers and interaction between various cell types and other structural factors. In particular, the contribution of endothelial cells in modulating membrane potential in vascular myocytes might be one such influential factor. In future studies, we are also keen to investigate blood flow regulation where a 3D configuration might become necessary.

      The nifedipine data presented in Fig. 9 are quite compelling, and a nice demonstration of the potential power of the new models. How does this relate to what is known about the clinical male/female responses to nifedipine? Are there sex differences in drug efficacy?

      Thank you for your comment regarding Fig. 9.

      It is well known that sex-specific differences in pharmacokinetics and pharmacodynamics influence antihypertensive drug responses [PMID: 8651122, PMID: 22089536]. Previous studies, notably by Kloner et al., have illustrated this point quantitatively, highlighting a more pronounced diastolic BP response in women (91.4%) compared to men (83%) when treated with dihydropyridine-type channel blockers, such as amlodipine/nifedipine. Importantly, this distinction persisted even after adjusting for confounding factors such as baseline BP, age, weight, and dosage per kilogram [PMID: 8651122]. An interesting observation from Kajiwara et al. emphasizes that vasodilation-related adverse symptoms occur significantly more frequently in younger women (<50 years) compared to their male counterparts, suggesting a heightened sensitivity to dihydropyridine-type calcium channel blockers [PMID: 24728902].

      While our findings resonate with clinical observations, a word of caution is in order. Our data suggest that, in the mouse model, nifedipine elicits distinct sex-specific effects. Importantly, future research should test the direct translatability and implications of these observations in human subjects.

      Reviewer #1 (Recommendations For The Authors):

      1. Cellular simulations with noise: It might be useful to also include in this section how noise was introduced specifically into the [Ca]SR equations.

      We agree. The manuscript now includes an expanded explanation of how noise was incorporated into the model. This includes the addition of Equation 6 into section 2.4 "Cellular simulations with noise" to describe how noise was specifically integrated into the [Ca]SR equations. Please see LINE 355.
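
      Equation 6 itself is not reproduced in this response. For readers unfamiliar with the general idea, a common way to add noise to a differential equation is an Euler–Maruyama step, in which a Gaussian term scaled by the square root of the time step is added to the deterministic update. The sketch below shows that generic scheme only; the drift function, noise amplitude, and all names are assumptions and should not be read as the form used in the manuscript.

```python
import math
import random

def euler_maruyama(x0, drift, sigma, dt, n_steps, seed=0):
    """Integrate dx = drift(x) dt + sigma dW with additive Gaussian noise."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    x = x0
    trace = [x]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)  # standard normal sample
        x = x + drift(x) * dt + sigma * math.sqrt(dt) * xi
        trace.append(x)
    return trace

def relax(x):
    """Hypothetical drift: relaxation toward 1.0 with a 0.5 s time constant."""
    return -(x - 1.0) / 0.5

deterministic = euler_maruyama(0.0, relax, sigma=0.0, dt=0.01, n_steps=100)
noisy = euler_maruyama(0.0, relax, sigma=0.2, dt=0.01, n_steps=100)
print(deterministic[-1], noisy[-1])
```

      Setting sigma to zero recovers the ordinary forward-Euler solution, which makes it easy to compare noisy and noise-free runs of the same model.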

      1. For equation 14, the description might be confusing. RCG and Ri are not explicitly included.

      Thank you – this has been corrected.

      1. In the paragraph starting with, "Having explored the regulation of graded membrane potential..." , the references to Figure 7C-D do not seem to match the content of the text. Namely, the figures show female versus male responses to nifedipine, which is not introduced until the next paragraph. Additionally, the graphs in 7C-D do not have the panels titled and the y-axes labeled.

      We apologize for the error. We have modified the text and figures to address these issues.

      1. Perhaps give more detail on how the effects of nifedipine were mathematically simulated at the ionic current level.

      Good suggestion. Briefly, previous studies [PMID: 1329564] have shown that at the therapeutic dose of nifedipine (i.e., about 0.1 μM) L-type CaV1.2 channel currents are reduced by about 70%. Accordingly, we decreased ICaL in our mathematical simulations by the same extent. It is known that dihydropyridine-type channel blockers exhibit a voltage-dependent behavior, predominantly binding to the inactivated state. In smooth muscle cells, these blockers initiate inhibition quickly within a voltage range of -60 to -40 mV. This range aligns with the membrane potential baseline of vascular muscle cells (PMID: 8388295), ensuring the blockers are effective without the need to induce significant depolarization. Therefore, the voltage dependency of dihydropyridine-type channel blockers can be neglected.
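
      A voltage-independent block of this kind amounts to scaling the channel conductance by the unblocked fraction. The following sketch illustrates that scaling with a simple ohmic current; the ohmic form, the parameter values, and the function names are assumptions for illustration and are not the ICaL formulation of the model.

```python
def blocked_conductance(g_max, fraction_blocked):
    """Scale a maximal conductance by the fraction of channels left unblocked."""
    if not 0.0 <= fraction_blocked <= 1.0:
        raise ValueError("fraction_blocked must lie between 0 and 1")
    return g_max * (1.0 - fraction_blocked)

def l_type_current(g_max, open_prob, v_m, e_ca, fraction_blocked=0.0):
    """Ohmic L-type current (assumed form): g * Po * (Vm - ECa)."""
    g = blocked_conductance(g_max, fraction_blocked)
    return g * open_prob * (v_m - e_ca)

# Hypothetical parameters: unit conductance, Po = 0.1, Vm = -40 mV, ECa = +60 mV.
control = l_type_current(1.0, 0.1, -40.0, 60.0)
with_nifedipine = l_type_current(1.0, 0.1, -40.0, 60.0, fraction_blocked=0.70)
print(control, with_nifedipine)
```

      Because the block is treated as independent of voltage, the ratio of blocked to control current is the same at every membrane potential.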

      1. For the simulations with 400 uncoupled myocytes, the methods stated that the "gap junctional resistance [was set] to zero". Did the authors mean to use "conductivity" or am I misunderstanding?

      Thank you for bringing up this issue with the term "gap junctional resistance." We now state that the "gap junctional conductivity" was set to zero to indicate no electrical communication/coupling.

      1. Address whether there are differences-such as in cell geometry, degree of sex-based ionic current changes, and frequency of spontaneous hyperpolarization-between mice and human smooth muscle myocytes that could limit the predictive capability of the model.

      Excellent point. Our model was parametrized using data from mouse mesenteric artery smooth muscle cells as initial proof of concept. In general terms, mouse arteries are a good animal model for human arteries, as they have similar intravascular pressure-myogenic tone relationships, resting membrane potentials, and express similar ionic channels (e.g., CaV1.2, BK channels, RyRs, etc) (PMID: 28119464, PMID: 29070899). Unfortunately, these studies have largely been done in male arteries and myocytes. Thus, while we recognize that the physiological distinctions between mice and humans could introduce variances in the model's outcomes, our model offers valuable insights into the sex-specific mechanisms of KV2.1 and CaV1.2 channels in controlling membrane potential and Ca2+ dynamics in mice. It has been shown that sex-specific differences in pharmacokinetics and pharmacodynamics influence antihypertensive drug responses [PMID: 8651122, PMID: 22089536]. Previous studies, notably by Kloner et al., have illustrated this point quantitatively, highlighting a more pronounced diastolic BP response in women (91.4%) compared to men (83%) when treated with dihydropyridine-type channel blockers, such as amlodipine/nifedipine. Importantly, this distinction persisted even after adjusting for confounding factors such as baseline BP, age, weight, and dosage per kilogram [PMID: 8651122]. An interesting observation from Kajiwara et al. emphasizes that vasodilation-related adverse symptoms occur significantly more frequently in younger women (<50 years) compared to their male counterparts, suggesting a heightened sensitivity to dihydropyridine-type calcium channel blockers [PMID: 24728902].

      While our findings resonate with clinical observations, a word of caution is in order. Our data suggest that, in the mouse model, nifedipine elicits distinct sex-specific effects. Importantly, future research should test the direct translatability and implications of these observations in human subjects.

      1. "A virtual drug-screening system that can model drug-channel interactions" (pg 32) sounds very novel.

      Thank you for highlighting this. We recognize the typo in our manuscript and have made the necessary corrections to ensure clarity and accuracy.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript is well written. I only have some minor comments:

      1. In the patch clamp experiments, there is no information on the recovery of the ionic currents. Is recovery important or not in arterial myocytes? This question is related to the results shown in Figs 5-7. In Fig.5, is the oscillation caused by noise alone or a spontaneous oscillation (such as the oscillation in Figs.6-7) modulated by noise? In general, recovery is an important parameter for the frequency of spontaneous oscillations. It seems to me that the spontaneous oscillations in Fig.8 are mainly noise-driven since they disappear after the cells are coupled through gap junctions.

      One important aspect of the oscillatory behavior of the smooth muscle cells is the very long timescales, with fluctuations occurring on the order of seconds. But the majority of ion channels are operating and recovering on the order of milliseconds, so a reasonable approximation is that most ion channels in the cell are operating at steady state at low voltages.

      Oscillations in Fig.5: Both the intrinsic oscillations and the noise play key roles in shaping the oscillations.

      The intrinsic deterministic dynamics of the model cells are oscillatory (as seen in Figures 6-7), but the noise can trigger sparks early or delay them, which leads to substantial fluctuations in the inter-spark intervals. Therefore, the spontaneous oscillations are technically modulated by the noise rather than driven by the noise. Nevertheless, in both cases, recovery dynamics play an essential role in shaping the oscillations and determining their frequency.

      Note however that, when an excitable system is around the bifurcation for oscillations and noise is included, the "firing" statistics in the oscillatory state and the non-oscillatory state are indistinguishable for moderate to high levels of noise.

      Noise Exclusion in Figures 6-7: To offer a clear and undistracted interpretation of the results, noise was intentionally omitted from Figures 6-7. This was done to ensure that the primary phenomena under investigation were not obscured. While we recognize the significance of incorporating all elements, including noise, in simulating biological systems, in this case we prioritized clarity.

      Oscillations in Fig.8: Your observation regarding Fig.8 is insightful. Here, uncoupled cells indeed display a spontaneous oscillatory behavior. As documented in previous research, this behavior is not an artifact resulting from cell isolation from the vessel but represents an intrinsic characteristic vital for maintaining electrical signals. The noise in the cells leads to substantial fluctuations in the inter-spike intervals. Because the noise in each cell is uncorrelated, it acts to desynchronize the activity of the cells. Therefore, instead of synchronizing the activity of the cells, the gap junction coupling quenches the large-scale oscillations (the spikes), creating lower amplitude irregular oscillations.

      1. The calcium level is much higher in women than in men as shown in Figs.7 and 9. Do women have higher arterial pressure than men?

      We thank the reviewer for the observation regarding the calcium levels in Figs.7 and 9. All data presented comes from both male and female C57BL/6J animal models, forming the foundation of our experimental framework.

      From earlier studies by the Santana lab (PMID: 32015129), distinct sex-specific differences were found between male and female vascular mesenteric vessels. When the endothelium was removed from small arteriole segments and these segments were subsequently pressurized within a range of 20–120 mmHg, the female arterioles exhibited a pronounced myogenic response in comparison to the male ones. This brings to the forefront the marked sex-based differences, especially in the context of vascular smooth muscle activity.

      Yet, when examining the behavior of whole, intact vessels, a different picture emerges. Despite clear sex-specific differences in conditions with the endothelium removed, these distinctions become less pronounced in whole, intact vessels. In essence, both male and female mice exhibit analogous arterial pressure patterns. This suggests possible compensatory mechanisms related to the caliber and structure of the small vessels.

      To address the core issue: Despite our data showing higher calcium levels in female samples, it doesn't necessarily imply females consistently exhibit higher arterial pressure across all physiological scenarios.

      1. In Fig.9, where is the intravascular pressure (a variable or a parameter) in the mathematical model?

      In our model, the intravascular pressure effects are implicitly introduced by modulating the conductance of the non-selective cation currents (INSCC). Specifically, the increase in INSCC is our way of simulating the effects of pressure-induced membrane depolarization. This approach allows us to capture the physiological response to intravascular pressure changes without explicitly introducing it as a separate parameter in the model. We have modified the manuscript to ensure that this rationale is clarified.

      1. In Eq.14, the given units of Rmyo (Ohm·cm) and Rg (Ohm·cm·cm) are different, but Eq.14 implies they should have the same unit.

      We sincerely appreciate the reviewer's meticulous observation regarding the units discrepancy in Eq.14. We have revised the manuscript to correct the error.

      Reviewer #3 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses:

      Fig. 5 A-B: This is a beautiful qualitative comparison between experimental and simulation data! I think it would be even more impactful if the authors carried out some quantitative analysis of the similarity between male/female experimental/simulation data. For example, the "resting" Vm levels (approx. -30 mV and -40 mV for females and males, respectively) and the peak levels of Vm hyperpolarization could be compared, as well as the frequency of transient hyperpolarization events. It seems like the female model is much more prone to intervals of relative quiescence (i.e., absence of transient hyperpolarization events - e.g., from ~5-6.5 s). Is this consistent with the duration of such ranges in the experimental data (e.g., from 0 to 2.5 s in Fig. 5A).

      Thank you for your positive remarks concerning the qualitative comparison in Fig. 5 A-B. We are indeed enthusiastic about the parallels we've identified between experimental and simulation outcomes. We agree with the reviewer on the potential value of a more quantitative assessment. As such, we have updated the figure to include an in-depth analysis that provides greater insight and solidifies the power of our simulation predictions when compared to experimental results. A detailed analysis of the male and female data, as well as the male and female simulations, is summarized in the text as follows:

      Baseline membrane potential is -40 mV in male myocytes compared to -30 mV in female myocytes. The frequency of hyperpolarization transients (THs) is 1 Hz in male and 2.5 Hz in female cells for the specific baseline membrane potential shown in Figure 5 A-B. In the range of membrane potentials from -50 mV to -30 mV, the frequency increases from 1 to 2.8 Hz, which is identical to the experimental frequency range.

      • Fig. 7 C-D: Likewise, it would be helpful to quantitatively characterize male/female differences in the model's response to simulated Ca channel blockade (e.g., rate of transient hyperpolarization events, relative levels of ICa and [Ca]i).

      Thank you for the constructive feedback on Fig. 7 C-D. We appreciate the emphasis on a quantitative approach to solidify our understanding and have modified the results as follows:

      Next, we simulated the effects of the calcium channel blocker nifedipine on ICa at a steady membrane potential of -40 mV in male and female simulations. Briefly, previous studies (ref. 70) have shown that at the therapeutic dose of nifedipine (i.e., about 0.1 μM) L-type CaV1.2 channel currents are reduced by about 70%. Accordingly, we decreased ICa in our mathematical simulations by the same extent. In Figure 7C-D, we show the predicted male (gray) and female (pink) time course of membrane voltage at -40 mV (top panel), ICa (middle panel), and [Ca2+]i (lower panel). First, we observed that in both males and females 0.1 μM nifedipine reduces the frequency of oscillation in the membrane potential. Second, both male and female simulations (middle panels) show that 0.1 μM nifedipine caused a reduction of ICa to levels that are very similar in male and female myocytes following treatment. Consequently, the reduction of ICa causes both male and female simulations to reach a very similar baseline [Ca2+]i of about 85 nM (lower panels). As a result, simulations provide evidence supporting the idea that CaV1.2 channels are the predominant regulators of intracellular [Ca2+] entry in the physiological range from -40 mV to -20 mV. Importantly, these predictions also suggest that clinically relevant concentrations of nifedipine cause larger overall reductions in Ca2+ influx in female than in male arterial myocytes.

      Recommendations for improving the writing and presentation:

      When I accessed the GitHub repository linked in section 2.7 (Aug 17, 13:30 PT) it only contained a LICENSE file and none of the described codes and model equations appeared to be publicly available. I would like to access and examine these files. Based on the Clancy lab's excellent track record for making their work publicly available, I have no doubt that the published files will be complete, thoroughly documented, and ready for implementation in studies to reproduce or extend the work described in this manuscript.

      https://github.com/ClancyLabUCD/sex-specific-responses-to-calcium-channel-blockers-in-mesenteric-vascular-smooth-muscle

      We sincerely apologize for the omission regarding the GitHub repository. It was never our intention to omit the crucial files that should accompany our manuscript. We deeply regret any inconvenience this may have caused in your review process.

      We deeply value transparency and the importance of making our work accessible to fellow researchers and the wider community. As you rightly pointed out, the Clancy lab has always been committed to ensuring that our work is available publicly, and this instance is no exception. Please find all codes and documentation here:

      Minor corrections to the text and figures:

      The introduction is somewhat lengthy, and some of the material contained therein might be more suitable to be merged into the Discussion instead (e.g., paragraphs on negative feedback regulation and the recent study by O'Dwyer et al.).

      Thank you – we have updated the introduction but left some foundational work descriptions intact.

      • Page 6, section 1.1: There is a missing word (mice?) in the first sentence.

      • Page 11, under Eqn. 7: Luo is misspelled as Lou. (Also twice on Page 20.)

      Thank you – these have been corrected.

      Figs. 2-3: As a colorblind person, it was somewhat challenging for me to differentiate between the red and black lines. Choosing a higher-contrast colour pairing would be beneficial. For some reason, this is not so much of an issue for other figures that use the red/black scheme later in the manuscript (e.g., Figs. 5, 7-8).

      We truly appreciate your feedback on the color contrast used in our figures. Accessibility and clarity are crucial to us, and we regret any difficulty you encountered due to the color choices. Based on your valuable feedback, we have included different color pairings in our visual representations to ensure they are comprehensible to all readers, including those who are colorblind.

      Fig. 2-3: I am also confused about the use of symbols to indicate significant differences in these plots. In Fig. 2, ** is defined in the legend but not used in the figure. In both figures, the symbols are placed above/below specific sets of points, but it is unclear whether large differences for other x-axis values are statistically significant (e.g., -20 mV in Fig. 3B, +40 mV in Fig. 2C, etc.) This should be clarified.

      Thank you – we now have included all the significant differences in the data discussed in the manuscript.

      Page 22: The authors state that they "introduced noise into the [Ca]SR..." but the specifics of this approach are not described. As with other aspects of the Methods section, it would be suitable to provide a brief description of the technique used in ref. 40, perhaps added to section 2.4.

      Thank you – it has been corrected.

      Fig.7 C-D: Axis labels and units are missing. Even though the labels and units will be inferred by most readers, it would be helpful to include them here (at least in C).

      Thank you for pointing out the inconsistency between the textual references and Figure 7C-D. We have added the corrected figure.

      Page 32: "...the first step toward the development of a virtual drug-screaming system..." I think the authors mean drug-screening. As a side note, this is immediately in the running for the best typo I've ever seen as a peer reviewer.

      <good laugh> Thank you for pointing out this error, and we sincerely appreciate your sense of humor about it. You are indeed correct; the intended word is "drug-screening." We have corrected this typo in the manuscript. We're grateful for your thorough review and the light-hearted way you brought this to our attention.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for their strong interest in our studies and their excellent suggestions for improvement.

      Reviewer #1:

      Weaknesses:

      Comment 1. The authors identified NPR-15 and ASJ neurons that are involved in both molecular and behavioral responses to pathogen attack. This finding, by itself, is significant. However, how the NPR-15/ASJ circuit regulates the interplay between the two defense strategies was not explored. Therefore, emphasizing the interplay in the title and the abstract is misleading.

      Response to comment 1. We have removed the word “interplay.”

      Comment 2. Although the discovery of a single GPCR regulating both immunity and avoidance behavior is significant and novel, NPR-15 is not the first GPCR identified with these functions. Previously, the same lab reported that the GPCR OCTR-1 also regulates immunity and avoidance behavior through ASH and ASI neurons respectively (PMID: 29117551). This point was not mentioned in the current manuscript.

      Response to Comment 2. We’d like to clarify that it remains unclear whether OCTR-1 itself controls both immunity and behavior (PMID: 29117551). The reference study showed that OCTR-1-expressing neurons ASH and ASI control immunity and behavior, respectively. We modified the manuscript to make this point clearer: “While OCTR-1-expressing neurons ASI play a role in avoidance (34), the specific role of OCTR-1 in ASH and ASI neurons remains unclear.”

      Comment 3. The authors discovered that NPR-15 regulates avoidance behavior via the TRPM gene, GON-2. Only two factors (GON-2 and GTL-2) were examined in this study, and GON-2 happens to function through the intestine.

      Response to comment 3. We studied GON-2 and GTL-2 because a recent screen of intestinal TRPM genes showed that they are the only two involved in the control of pathogen avoidance. We modified the manuscript to make this rationale clearer: “Because transient receptor potential melastatin (TRPM) ion channels, GON-2 and GTL-2, are required for pathogen avoidance (32), we studied whether they may be part of the NPR-15 pathway that controls pathogen avoidance”

      Comment 3b. It is possible that NPR-15 may broadly regulate multiple effectors in multiple tissues. Confining the regulation to the amphid sensory neuron-intestinal axis, as stated in the title and elsewhere in the manuscript, is not accurate.

      Response to comment 3b. We agree that NPR-15 may broadly regulate multiple effectors in different tissues. Indeed, we have shown that the transcriptional activity of ELT-2, HLH-30, DAF-16, and PMK-1 is higher in npr-15 than in WT animals. We found that expression of NPR-15 only in ASJ cells rescues both the survival and behavioral phenotypes of npr-15 animals (Figs. 4F and 5C).

      Comment 4. The C. elegans nervous system is simple, and hermaphrodites only have 302 neurons. Individual neurons possessing multiple regulatory functions is expected. Whether this is conserved in mammals and other vertebrates is unknown, because in higher animals, neurons and neuronal circuits could be more specialized.

      Response to Comment 4. We agree. We have removed the statements discussing conservation in that manner.

      Comment 5. A key question, that is, why would NPR-15 suppress immunity (which is bad for defense) but enhance avoidance behavior (which is good for defense), is not addressed or explained. This could be due to temporal regulation, for example, upon pathogen exposure, NPR-15 could regulate behavior to avoid the pathogen, but after infection, NPR-15 could suppress excessive immune responses or quench the responses for the resolution of infection.

      Response to comment 5. We found that NPR-15 controls the expression of immune genes in the absence of an infection. Without further experiments, we think it would be too speculative to discuss the possibility of a temporal regulation. However, we modified the manuscript to address the control of both molecular and behavioral immunity by NPR-15. The revised discussion reads: “Our findings shed light on the role of NPR-15 in the control of the immune response. NPR-15 seems to suppress specific immune genes while activating pathogen avoidance behavior to minimize potential tissue damage and the metabolic energy cost associated with activating the molecular immune response against pathogen infections. Overall, the control of immune activation is essential for maintaining homeostasis and preventing excessive tissue damage caused by an overly aggressive and energy-costly response against pathogens (60-63).”

      Comment 6. Discussion appears timid in scope and contains some repetitive statements. Point 5 can be addressed in the Discussion.

      Response to comment 6. We have removed repetitive concepts and modified the discussion as mentioned in the response to point 5.

      Comment 7. Overall, the authors presented an impactful study that identified specific molecules and neuronal cells that regulate both molecular and behavioral immune responses to pathogen attack. Most conclusions are supported by solid evidence. However, some statements are overreaching, for example, regulation of the interplay between molecular and behavioral immune responses was emphasized but not explored. Nonetheless, this study reported a significant and novel discovery and has laid a foundation for investigating such an interplay in the future.

      Response to comment 7: We removed the statements that may have appeared to be overreaching and addressed the weakness raised by the reviewer. The revised discussion reads “Our findings shed light on the role of NPR-15 in the control of the immune response. NPR-15 seems to suppress specific immune genes while activating pathogen avoidance behavior to minimize potential tissue damage and the metabolic energy cost associated with activating the molecular immune response against pathogen infections. Overall, the control of immune activation is essential for maintaining homeostasis and preventing excessive tissue damage caused by an overly aggressive and energy-costly response against pathogens (60-63).”

      Recommendations for the authors:

      Recommendations 1. The title, abstract and some statements in the main text need to be re-written to reflect the fact that regulation of the interplay between molecular and behavioral immune responses was not explored in this study.

      Response to recommendations 1. We modified the title and abstract accordingly.

      Recommendations 2. It should be mentioned in the manuscript that OCTR-1 is the first GPCR that was identified to regulate both immunity and avoidance behavior.

      Response to recommendation 2. We addressed this issue as discussed in the response to comment 2.

      Recommendations 3. Repetitive statements should be removed from Discussion.

      Response to recommendations 3. The statements were removed.

      Recommendations 4. It is surprising to see that pmk-1 RNAi did not affect the survival of npr-15(tm12539) animals against S. aureus because PMK-1 has a general role in defense against S. aureus infection.

      Response to recommendations 4. We agree. However, the RNAi studies were validated using mutants (Fig. S3B).

      Recommendations 4b. Also, the rationale for using skn-1 RNAi as a control was not given. These need to be explained adequately in the manuscript.

      Response to recommendations 4b. There’s no need to include skn-1 RNAi and we removed the data.

      Recommendations 5. The conclusion that the lack of avoidance behavior by NPR-15 loss-of-function is independent of immunity and neuropeptide genes was drawn entirely based on experiments with RNAi of individual genes. Functional redundancy among genes could render RNAi of individual genes ineffective, thus masking the dependence of avoidance behavior on these genes. More experiments are needed to support this conclusion, or the wording of the conclusion needs to be changed.

      Response to recommendations 5. We modified the conclusion to address this issue: “Given the possibility of functional redundancy among these genes, we cannot rule out the possibility that different combinations may play a role in controlling avoidance behavior.”

      Recommendations 6. What is representation factor in Fig. 2B and 2C?

      Response to recommendations 6. Figure 2B shows significantly enriched terms with a Q value < 0.1, sorted by P values. Figure 2C shows the representation factor, which is calculated using a tool, http://nemates.org/MA/progs/overlap_stats.html. The calculation is based on the number of genes in set 1, the number of genes in set 2, the overlap between set 1 and set 2, and the number of genes in the genome.

      We corrected the Figure legends and included the corresponding information in Material and Methods.
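
      As a minimal sketch of the calculation described above, assuming the common definition of the representation factor (observed overlap divided by the overlap expected for two independent gene sets drawn from the same genome), the arithmetic can be written as follows. The function name and the example counts are hypothetical; they are not taken from the manuscript or from the source code of the tool.

```python
def representation_factor(n_set1, n_set2, n_overlap, n_genome):
    """Observed overlap divided by the overlap expected by chance.

    Assumed definition: expected overlap = n_set1 * n_set2 / n_genome.
    Values above 1 indicate more overlap than expected for independent sets,
    values below 1 indicate less.
    """
    if min(n_set1, n_set2, n_genome) <= 0:
        raise ValueError("set sizes and genome size must be positive")
    expected_overlap = n_set1 * n_set2 / n_genome
    return n_overlap / expected_overlap


# Hypothetical counts, for illustration only.
rf = representation_factor(n_set1=200, n_set2=300, n_overlap=30, n_genome=20000)
print(f"representation factor = {rf:.1f}")
```

      With these illustrative counts, the expected overlap is 3 genes, so an observed overlap of 30 genes corresponds to a tenfold enrichment.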

      Recommendations 7. The legend of Fig. 6 was wrong and should be changed to 'GPCR/NPR-15 suppressed immune response and enhanced avoidance behavior via sensory neurons'.

      Response to recommendations 7. Thank you for pointing this out. We changed the legend.

      Reviewer #2:

      Comments 1. There is some variance in lawn occupancy of wt strains between the different trials in WT animals (e.g. in Fig. 1: 25 for wt vs 60% for npr mutant; S1c 5% for wt and 60% for npr mutant).

      Response to comment 1. We appreciate the observation. We did notice some variation in both the WT and npr-15(tm12539) animals during our study, and the variation appeared to be greater in the WT than in the npr-15(tm12539) animals. We calculated the means, standard deviations, and standard errors across the different experimental trials presented in the manuscript (new Table S2). Importantly, these variations did not significantly affect the observed differences in lawn occupancy between the wild-type (WT) and npr-15 mutant strains.

      We addressed this issue in the revised manuscript: “Interestingly, we noticed that the variation in lawn occupancy is greater in WT than in npr-15(tm12539) animals across experiments (Table S2), which suggests that the strong lack of avoidance of npr-15(tm12539) somehow counteracts the experimental variation”

      Comment 2. Does this reflect rates of migration or re-occupancy in WT?

      Response to comment 2. We did not observe any re-occupancy in either the WT or npr-15 animals at 24-hour time points (which we mostly use in this study) or beyond. To address the comment, we performed a new experiment and found that the re-occupancy of npr-15 mutants is comparable to that of WT animals at 4 hours post-exposure (Figure S1B).

      Comment 3. Does pathogen avoidance persist and/or the rate of avoidance differ in npr mutant worms?

      Response to comment 3. As illustrated in new Figure S1B, the avoidance behavior in response to pathogens remained consistent even when we extended our observations up to 48 hours (Figure S1B).

      Comment 4. If animals were exposed then re-exposed, could the authors determine whether a learned avoidance was similarly affected by this mutation by assessing rate changes?

      Response to comment 4. We conducted the proposed experiment and observed that WT animals, but not npr-15(tm12539) mutants, learned to avoid the pathogen (Figure S1C). The revised manuscript reads: “We also found that npr-15(tm12539) exhibited reduced learned avoidance compared to WT animals (Figure S1C).”

      Comment 5: Is there any difference in gene expression of animals that have migrated off the lawn to those remaining on the lawn (e.g. in partial lawn experiments?).

      Response to comment 5. This is an interesting question that has not been addressed in the field yet. While we think the study is exciting, we believe that it is outside the scope of our work. All the gene expression studies performed here are in non-avoiding conditions.

      Comment 6. No concerns but the P values in the legends are a pain to read. Why not put them in figures as in above figures.

      Response to comment 6. We included the P values as suggested.

      Recommendations for the authors:

      Recommendation 1. Fig. 1/S1. Comments: There is some variance in lawn occupancy of wt strains between the different trials in WT animals (e.g. in Fig. 1: 25 for wt vs 60% for npr mutant; S1c 5% for wt and 60% for npr mutant).

      Response to recommendation 1. We addressed this issue as discussed in the response to comment 1.

      Recommendation 2. Fig. 1/S1. Comments. Does this reflect rates of migration or re-occupancy in WT?

      Response to recommendation 2. We have responded to this issue in comment 2.

      Recommendations 3. Fig. 1/S1. Comments. Does pathogen avoidance persist and/or the rate of avoidance differ in npr mutant worms.

      Response to recommendation 3. We have responded to this issue in comment 3.

      Recommendation 4. Fig. 1/S1. Comments B. and if animals were exposed then re-exposed, could the authors determine whether a learned avoidance was similarly affected by this mutation by assessing rate changes?

      Response to recommendation 4: We have responded to this issue in comment 4 above.

      Recommendation 5. Fig. 2/S2. Comment: Is there any difference in gene expression of animals that have migrated off the lawn to those remaining on the lawn (e.g. in partial lawn expts?).

      Response to recommendation 5. We have responded to this issue in comment 5 above.

      Recommendation 6. Fig. 3/S3. Comment. No concerns but the P values in the legends are a pain to read. Why not put them in figures as in above figures.

      Response to recommendation 6. We included the P values.

      Recommendation 7. Fig. 5. Comments: The authors suggest that the ASJ/NPR15 effect to limit avoidance acts via inhibition of GON-2 in the intestine. The observation that GON-2 inhibition effects on pathogen avoidance occur independently of neurons could suggest that it is a redundant way of accomplishing the same thing, which then makes one wonder whether, and what, connection exists between the neuron and the gut. The effect of ASJ via NPR on pathogen avoidance is not neuropeptide dependent, which they show. So how does the neuronal-gut communication work? Specific transmitters... perhaps.

      Response to Recommendation 7 Fig. 5. Thanks for this observation. To address the recommendation, we modified the discussion: “Our research additionally indicates that the regulation of NPR-15-mediated avoidance is not influenced by intestinal immune and neuropeptide genes. Given the potential for functional redundancy and our focus on genes upregulated in the absence of NPR-15, we cannot entirely rule out the possibility that unexamined immune effectors or neuropeptides, not transcriptionally controlled by NPR-15, might be involved. Different intestinal signals may also participate in the NPR-15 pathway that controls pathogen avoidance.”

      Recommendation 8. Comment. Since ASJ neurons control entry into dauer, perhaps it isn't surprising that DAF-16 showed up as an NPR-15-induced factor (and dauer worms are resistant to a lot of stressors); that said, dauer hormones might be involved as well. Is there any evidence that DAF-16 down-regulates GON-2 expression (see Murphy, Kenyon et al. 2005), and along these lines would GON-2 RNAi work in a DAF-16 mutant? I think addressing these issues is the subject of future studies.

      Response to recommendation 8. We checked the data in the study by Murphy, Kenyon et al., and found that the gon-2 gene was not downregulated.

      Recommendation 9. Minor: Regarding the description to Fig. 5. "Consistently with our previous findings, we found that only " The adverb form of consistent should not be used here.

      Response to recommendation 9. Thank you for pointing this out. The description of Figure 5 was corrected.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      A weakness of the paper is that the power of the model is illustrated for only one specific set of parameters, added in a stepwise manner and the comparison to one specific empirical TGM, assumed to be prototypical; And that this comparison remains descriptive. (That is could a different selection of parameters lead to similar results and is there TGM data which matches these settings less well.)

      The fact that the comparisons in the paper are descriptive is a central point of criticism from both reviewers. As mentioned in my preliminary response, I intentionally did not optimise the model to a specific TGM or show an explicit metric of fitness. As I now explicitly mention in the new experimental section of the paper:

      “The previous analyses were descriptive in the sense that they did not quantify how much the generated TGMs resembled a specific empirical TGM. This was deliberate, because empirical TGMs vary across subjects and experiments, and I aimed at characterising them as generally as possible by looking at some characteristic features in broad terms. For example, while TGMs typically have a strong diagonal and horizontal/vertical bars of high accuracy, questions such as when these effects emerge and for how long are highly dependent on the experimental paradigm. For the same reason, I did not optimise the model hyperparameters, limiting myself to observing the behaviour of the model across some characteristic configurations”

      And, in the Discussion:

      “The demonstrations here are not meant to be tailored to a specific data set, and are, for the most part, intentionally qualitative. TGMs do vary across experiments and subjects; and the hyperparameters of the model can be explicitly optimised to specific scientific questions, data sets, and even individuals. In order to explore the space of configurations effectively, an automatic optimisation of the hyperparameter space using, for instance, Bayesian optimisation (Lorenz, et al., 2017) could be advantageous. This may lead to the identification of very specific (spatial, spectral and temporal) features in the data that may be neurobiologically interpreted.”

      Nonetheless, it is possible to fit the model to a specific TGM by using an explicit metric of fitness. For illustration, this is what I did in the new experimental section Fitting an empirical TGM, where I used correlation with an empirical TGM to optimise two temporal parameters: the rise slope and the fall slope. As can be seen in Figure 8, the correlation with the empirical TGM was as high as 0.7, even though I did not fit the other parameters of the model. As mentioned in the paragraph above, more sophisticated techniques such as Bayesian optimisation might be necessary for a more exhaustive exploration, but this would be beyond the scope of the current paper.
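
      The idea of fitting two temporal parameters by maximising correlation with a target TGM can be sketched as a plain grid search. This is an illustrative toy under stated assumptions, not the genephys implementation: `envelope`, `simulate_tgm`, and the parameter grids are hypothetical stand-ins, and the "empirical" TGM here is itself simulated so that the example is self-contained.

```python
import math
from itertools import product


def envelope(t, rise, fall, t_peak=0.4):
    """Toy response envelope: rises linearly with slope `rise`, is capped at 1,
    and is limited by a linear fall with slope `fall` after `t_peak`."""
    return max(0.0, min(1.0, rise * t, 1.0 - fall * (t - t_peak)))


def simulate_tgm(rise, fall, n_times=20):
    """Toy TGM: chance level (0.5) plus the product of the envelopes at the
    training time (rows) and the testing time (columns)."""
    times = [k / (n_times - 1) for k in range(n_times)]
    env = [envelope(t, rise, fall) for t in times]
    return [[0.5 + 0.4 * a * b for b in env] for a in env]


def flatten(matrix):
    return [value for row in matrix for value in row]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant matrix
    return cov / math.sqrt(vx * vy)


def fit_by_grid_search(target_tgm, rise_grid, fall_grid):
    """Return the (rise, fall) pair whose simulated TGM correlates best with the target."""
    target = flatten(target_tgm)
    best_params, best_r = None, -2.0
    for rise, fall in product(rise_grid, fall_grid):
        r = pearson(flatten(simulate_tgm(rise, fall)), target)
        if r > best_r:
            best_params, best_r = (rise, fall), r
    return best_params, best_r


# Stand-in for an empirical TGM: simulated with known slopes.
empirical_like = simulate_tgm(rise=4.0, fall=1.0)
best_params, best_r = fit_by_grid_search(
    empirical_like, rise_grid=[1.0, 2.0, 4.0, 8.0], fall_grid=[0.5, 1.0, 2.0, 4.0]
)
print(best_params, round(best_r, 3))
```

      An exhaustive grid is only practical for a couple of parameters; for a larger hyperparameter space, a method such as Bayesian optimisation would replace the double loop.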

      I would also like to point out that fitting the parameters in a step-wise manner was a necessity for interpretation. As a comparison, I suggest thinking of the way we use F-tests in regression analyses: if we want to know how important a feature is, we compare the model with and without this feature and see how much we lose.
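
      The F-test analogy can be made concrete with a small sketch that compares a regression model with and without a single feature. It assumes ordinary least squares with one predictor, so the reduced model is intercept-only; the function name and the data are hypothetical and serve only to illustrate the comparison.

```python
def nested_f_statistic(x, y):
    """F statistic comparing an intercept-only model (reduced) against
    intercept + slope on x (full), both fitted by ordinary least squares."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares without the feature (prediction = mean of y).
    rss_reduced = sum((yi - mean_y) ** 2 for yi in y)
    # Residual sum of squares with the feature.
    rss_full = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    extra_params = 1          # the slope is the only added parameter
    residual_dof = n - 2      # n observations minus intercept and slope
    return ((rss_reduced - rss_full) / extra_params) / (rss_full / residual_dof)


x = [0, 1, 2, 3, 4, 5]
y_informative = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]  # y tracks x closely
y_uninformative = [1, -1, 1, -1, 1, -1]         # y is nearly unrelated to x

f_informative = nested_f_statistic(x, y_informative)
f_uninformative = nested_f_statistic(x, y_uninformative)
print(f"F (informative feature)   = {f_informative:.1f}")
print(f"F (uninformative feature) = {f_uninformative:.2f}")
```

      A large F means that removing the feature costs a lot of explained variance, which is the same logic as adding effects to the generative model one at a time and observing how the resulting TGM changes.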

      It further remained unclear to me, which implications may be drawn from the generative model, following from the capacities to mimic this specific TGM (i) for more complex cases, such as the comparison between experimental conditions, and (ii) about the complex nature of neural processes involved.

      Following on the previous points, the object of this paper (besides presenting the model and the associated toolbox) was not to mimic a specific TGM, but to characterise the main features that we generally see across studies in the field. To clarify this, I have added Figure 2 (previously a Supplemental Information figure), and added the following to the Results section:

      “Figure 2 shows a TGM for an example subject, where some archetypal characteristics are highlighted. In the experiments below, specifically, I focus on the strong narrow diagonal at the beginning of the trial, the broadening of accuracy later in the trial, and the vertical/horizontal bars of higher-than-chance accuracy. Importantly, this specific example in Figure 2 is only meant as a reference, and therefore I did not optimise the model hyperparameters to this TGM (except in the last subsection), or showed any quantitative metric of similarity.”

      I mention the possibility of using the model to explore more complex cases in the Introduction, although doing so here would be out of scope:

      “Other experimental paradigms, including motor tasks and decision making, can be investigated with genephys”

      Towards this end, I would appreciate (i) a more profound explanation of the conclusions that can be drawn from this specific showcase, including potential limitations, as well as wider considerations of how scientists may empower the generative model to (ii) understand their experimental data better and (iii) which added value the model may have in understanding the nature of underlying brain mechanism (rather than a mere technical characterization of sensor data).

      To better illustrate how to use genephys to explore a specific data set, I have added a section (Fitting an empirical TGM) where I show how to fit specific hyperparameters to an empirical TGM in a simple manner.

      In the Introduction, I briefly mentioned:

      “This (not exhaustive) list of effects was considered given previous literature (Shah, et al., 2004; Mazaheri & Jensen, 2006; Makeig, et al., 2002; Vidaurre, et al., 2021), and each effect may be underpinned by distinct neural mechanisms. For example, it is not completely clear the extent to which stimulus processing is sustained by oscillations, and disentangling these effects can help resolving this question”

      In the Discussion, I have further commented:

      “Genephys has different available types of effect, including phase resets, additive damped oscillations, amplitude modulations, and non-oscillatory responses. All of these elements, which may relate to distinct neurobiological mechanisms, are configurable and can be combined to generate a plethora of TGMs that, in turn, can be contrasted to specific empirical TGMs. This way, we can gain insight on what mechanisms might be at play in a given task.

      The demonstrations here are not meant to be tailored to a specific data set, and are, for the most part, intentionally qualitative. TGMs do vary across experiments and subjects; and the hyperparameters of the model can be explicitly optimised to specific scientific questions, data sets, and even individuals. In order to explore the space of configurations effectively, an automatic optimisation of the hyperparameter space using, for instance, Bayesian optimisation (Lorenz, et al., 2017) could be advantageous. This may lead to the identification of very specific (spatial, spectral and temporal) features in the data that may be neurobiologically interpreted.”

      On p. 15 "Having a diversity of frequencies but not of latencies produces another regular pattern consisting of alternating, parallel bands of higher/lower than baseline accuracy. This, shown in the bottom left panel, is not what we see in real data either. Having a diversity of latencies but not of frequencies gets us closer to a realistic pattern, as we see in the top right panel." The terms frequency and latency seem to be confused.

      The Reviewer is right. I have corrected this now. Thank you.

      Reviewer #2:

      The results of comparisons between simulations and real data are not always clear for an inexperienced reader. For example, the comparisons are qualitative rather than quantitative, making it hard to draw firm conclusions. Relatedly, it is unclear whether the chosen parameterizations are the only/best ones to generate the observed patterns or whether others are possible. In the case of the latter, it is unclear what we can actually conclude about underlying signal generators. It would have been different if the model was directly fitted to empirical data, maybe of different cognitive conditions. Finally, the neurobiological interpretation of different signal properties is not discussed. Therefore, taken together, in its currently presented form, it is unclear how this method could be used exactly to further our understanding of the brain.

      This critique coincides with that of Reviewer 1. In the current version, I have made clearer that I am not fitting a specific empirical TGM, and why; instead, I am referring to general features that appear broadly throughout the literature. See more detailed changes below.

      Regarding whether the chosen parameterizations are the only/best ones to generate the observed patterns, the Discussion reflects this limitation:

      “Also importantly, I have shown that standard decoding analysis can differentiate between these explanations only to some extent. For example, the effects induced by phase-resetting and the use of additive oscillatory components are not enormously different in terms of the resulting TGMs. In future work, alternatives to standard decoding analysis and TGMs might be used to disentangle these sources of variation (Vidaurre, et al., 2019). ”

      And

      “Importantly, the list of effects that I have explored here is not exhaustive …”

      Of course, since the list of signal features I have explored is not exhaustive, it cannot be claimed without a doubt that these features are the ones generating the properties we observe in real TGMs. The model, however, is a step forward in that direction, as it provides us with a tool to at least rule out some causes.

      Firstly, it was not entirely clear to me from the introduction what gap exactly the model is supposed to fill: is it about variance in neural responses in general, about which signal properties are responsible for decoding, or about capturing stability of signals? It seems like it does all of these, but this needs to be made clearer in the introduction. It would be helpful to emphasize exactly what insights the model can provide that are unable to be obtained with the current methods.

      I have now made this explicit in the Introduction, as suggested:

      “To gain insight into what aspects of the signal underpin decoding accuracy, and therefore the most stable aspects of stimulus processing, I introduce a generative model”

      To help illustrate what insights the model can provide, I have added the following sentence as an example:

      “For example, it is not completely clear the extent to which stimulus processing is sustained by oscillations, and disentangling these effects can help resolving this question.”

      Furthermore, I was unclear on why these specific properties were chosen (lines 71 to 78). Is there evidence from neuroscience to suggest that these signal properties are especially important for neural processing? Or, if the logic has more to do with signal processing, why are these specific properties the most important to include?

      To clarify this the text now reads:

      “In the model, when a channel responds, it can do it in different ways: (i) by phase-resetting the ongoing oscillation to a given target phase and then entraining to a given frequency, (ii) by an additive oscillatory response independent of the ongoing oscillation, (iii) by modulating the amplitude of the stimulus-relevant oscillations, or (iv) by an additive non-oscillatory (slower) response. This (not exhaustive) list of effects was considered given previous literature (Shah, et al., 2004; Mazaheri & Jensen, 2006; Makeig, et al., 2002; Vidaurre, et al., 2021), and each effect may be underpinned by distinct neural mechanisms”

      The general narrative and focus of the paper could also be improved. It might help to start off with an outline of what the goal is at the start of the paper and then explicitly discuss how each of the steps works toward that goal. For example, I got the idea that the goal was to capture specific properties of an empirical TGM. If this was the case, the empirical TGM could be placed in the main body of the text as a reference picture for all simulated TGMs. For each simulation step, it could be emphasized more clearly exactly which features of the TGM is captured and what that means for interpreting these features in real data.

      Thank you. To clarify the purpose of the paper better, I have brought Figure 2 to the front (before a Supplementary Figure), and in the first part of Results I have now added:

      “Figure 2 shows a TGM for an example subject, where some archetypal characteristics are highlighted. In the experiments below, specifically, I focus on the strong narrow diagonal at the beginning of the trial, the broadening of accuracy later in the trial, and the vertical/horizontal bars of higher-than-chance accuracy. Importantly, this specific example in Figure 2 is only meant as a reference, and therefore I did not optimise the model hyperparameters to this TGM (except in the last subsection), or showed any quantitative metric of similarity. ”

      I have enunciated the goals more clearly in the Introduction:

      “To gain insight into what aspects of the signal underpin decoding accuracy, and therefore the most stable aspects of stimulus processing, …”

      Relatedly, it would be good to connect the various signal properties to possible neurobiological mechanisms. I appreciate that the author tries to remain neutral on this in the introduction, but I think it would greatly increase the implications of the analysis if it is made clearer how it could eventually help us understand neural processes.

      The Reviewer is right in pointing out that I preferred to remain neutral on this. While I have still kept that tone of neutrality throughout the paper, I have now included the following sentence as an example of a neurobiological question that could be investigated with the model:

      “For example, it is not completely clear the extent to which stimulus processing is sustained by oscillations, and disentangling these effects can help resolving this question.”

      And, more generally,

      “Genephys has different available types of effect, including phase resets, additive damped oscillations, amplitude modulations, and non-oscillatory responses. All of these elements, which may relate to distinct neurobiological mechanisms, are configurable and can be combined to generate a plethora of TGMs that, in turn, can be contrasted to specific empirical TGMs. This way, we can gain insight on what mechanisms might be at play in a given task. ”

      Line 57: this sentence is very long, making it hard to follow, could you break up into smaller parts?

      Thank you. The sentence is fragmented now.

      Please replace angular frequencies with frequencies in Hertz for clarity.

      Here I have preferred to keep angular frequencies, which are more general than Hertz: expressing frequencies in Hertz would entail assuming a specific sampling frequency. I think doing so would create exactly the sort of confusion that I am trying to dispel in this revision, whose point is that these results are not specific to one TGM but reflect general features that we see broadly in the literature.

      There are quite a few typos throughout the paper; please recheck.

      Thank you. I have revised the text and done my best to remove them.

    2. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      A weakness of the paper is that the power of the model is illustrated for only one specific set of parameters, added in a stepwise manner, and by comparison to one specific empirical TGM, assumed to be prototypical; and that this comparison remains descriptive. (That is, could a different selection of parameters lead to similar results, and are there TGM data which match these settings less well?)

      The fact that the comparisons in the paper are descriptive is a central point of criticism from both reviewers. As mentioned in my preliminary response, I intentionally did not optimise the model to a specific TGM or show an explicit metric of fitness. As I now explicitly mention in the new experimental section of the paper:

      “The previous analyses were descriptive in the sense that they did not quantify how much the generated TGMs resembled a specific empirical TGM. This was deliberate, because empirical TGMs vary across subjects and experiments, and I aimed at characterising them as generally as possible by looking at some characteristic features in broad terms. For example, while TGMs typically have a strong diagonal and horizontal/vertical bars of high accuracy, questions such as when these effects emerge and for how long are highly dependent on the experimental paradigm. For the same reason, I did not optimise the model hyperparameters, limiting myself to observing the behaviour of the model across some characteristic configurations”

      And, in the Discussion:

      “The demonstrations here are not meant to be tailored to a specific data set, and are, for the most part, intentionally qualitative. TGMs do vary across experiments and subjects; and the hyperparameters of the model can be explicitly optimised to specific scientific questions, data sets, and even individuals. In order to explore the space of configurations effectively, an automatic optimisation of the hyperparameter space using, for instance, Bayesian optimisation (Lorenz, et al., 2017) could be advantageous. This may lead to the identification of very specific (spatial, spectral and temporal) features in the data that may be neurobiologically interpreted.”

      Nonetheless, it is possible to fit the model to a specific TGM by using an explicit metric of fitness. For illustration, this is what I did in the new experimental section Fitting an empirical TGM, where I used correlation with an empirical TGM to optimise two temporal parameters: the rise slope and the fall slope. As can be seen in Figure 8, the correlation with the empirical TGM was as high as 0.7, even though I did not fit the other parameters of the model. As mentioned in the paragraph above, more sophisticated techniques such as Bayesian optimisation might be necessary for a more exhaustive exploration, but this would be beyond the scope of the current paper.
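
      A minimal sketch of this kind of fit, under stated assumptions: `toy_tgm` is a hypothetical stand-in for a simulator (it is not the genephys API), the "empirical" TGM is synthetic, and a plain grid search over the two slopes replaces whatever optimiser was actually used. It only illustrates the idea of choosing the rise and fall slopes that maximise the correlation with a target TGM.

```python
import numpy as np

def toy_tgm(rise, fall, n_time=80, onset=10, offset=35):
    """Toy TGM built from one response envelope (hypothetical, not genephys).

    The envelope ramps up from `onset` with slope `rise`, saturates at 1,
    and ramps down from `offset` with slope `fall`.
    """
    t = np.arange(n_time)
    env = np.minimum(rise * (t - onset), 1.0 - fall * (t - offset))
    env = np.clip(env, 0.0, 1.0)
    # Accuracy is at chance (0.5) where either time point has no response.
    return 0.5 + 0.5 * np.outer(env, env)

def tgm_correlation(a, b):
    """Pearson correlation between two TGMs, treated as flat vectors."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def fit_slopes(empirical, rise_grid, fall_grid):
    """Grid search for the (rise, fall) pair with the highest correlation."""
    best = (None, None, -np.inf)
    for rise in rise_grid:
        for fall in fall_grid:
            r = tgm_correlation(toy_tgm(rise, fall), empirical)
            if r > best[2]:
                best = (rise, fall, r)
    return best

# Synthetic "empirical" TGM: known slopes plus noise (illustrative values).
rng = np.random.default_rng(0)
empirical = toy_tgm(0.1, 0.05) + 0.05 * rng.standard_normal((80, 80))

rise, fall, r = fit_slopes(empirical, [0.05, 0.1, 0.2, 0.5], [0.02, 0.05, 0.1])
print(f"best rise = {rise}, best fall = {fall}, correlation = {r:.2f}")
```

      Because the Pearson correlation ignores overall scale and offset, the toy envelope saturates at 1 so that the two slopes change the shape of the TGM rather than only its magnitude; otherwise they would not be separately identifiable under this metric.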

      I would also like to point out that fitting the parameters in a step-wise manner was a necessity for interpretation. As a comparison, I suggest thinking of the way we use F-tests in regression analyses: if we want to know how important a feature is, we compare the model with and without this feature and see how much we lose.
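
      The regression analogy can be made concrete with a nested-model F statistic. The sketch below uses simulated data with illustrative values; the variable names are assumptions introduced here, and nothing in it comes from the paper's analyses.

```python
import numpy as np

def nested_f_statistic(X_full, X_reduced, y):
    """F statistic comparing a full linear model with a nested reduced one."""
    def rss(X):
        # Residual sum of squares of the least-squares fit of y on X.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    n = len(y)
    p_full, p_red = X_full.shape[1], X_reduced.shape[1]
    rss_full, rss_red = rss(X_full), rss(X_reduced)
    # Loss of fit per dropped parameter, relative to the full model's noise.
    return ((rss_red - rss_full) / (p_full - p_red)) / (rss_full / (n - p_full))

rng = np.random.default_rng(1)
n = 200
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
y = 1.0 + 2.0 * x1 + 0.5 * rng.standard_normal(n)  # x2 has no true effect

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2])
f_drop_x1 = nested_f_statistic(X_full, np.column_stack([ones, x2]), y)
f_drop_x2 = nested_f_statistic(X_full, np.column_stack([ones, x1]), y)
print(f"dropping x1: F = {f_drop_x1:.1f}; dropping x2: F = {f_drop_x2:.1f}")
```

      Dropping the feature that drives the response costs far more fit than dropping the irrelevant one, which is the same logic as adding effects to the generative model one at a time and observing what changes in the TGM.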

      It further remained unclear to me, which implications may be drawn from the generative model, following from the capacities to mimic this specific TGM (i) for more complex cases, such as the comparison between experimental conditions, and (ii) about the complex nature of neural processes involved.

      Following on the previous points, the object of this paper (besides presenting the model and the associated toolbox) was not to mimic a specific TGM, but to characterise the main features that we generally see across studies in the field. To clarify this, I have added Figure 2 (previously a Supplemental Information figure), and added the following to the Results section:

      “Figure 2 shows a TGM for an example subject, where some archetypal characteristics are highlighted. In the experiments below, specifically, I focus on the strong narrow diagonal at the beginning of the trial, the broadening of accuracy later in the trial, and the vertical/horizontal bars of higher-than-chance accuracy. Importantly, this specific example in Figure 2 is only meant as a reference, and therefore I did not optimise the model hyperparameters to this TGM (except in the last subsection), or showed any quantitative metric of similarity.”

      I mention the possibility of using the model to explore more complex cases in the Introduction, although doing so here would be out of scope:

      “Other experimental paradigms, including motor tasks and decision making, can be investigated with genephys”

      Towards this end, I would appreciate (i) a more profound explanation of the conclusions that can be drawn from this specific showcase, including potential limitations, as well as wider considerations of how scientists may empower the generative model to (ii) understand their experimental data better and (iii) which added value the model may have in understanding the nature of underlying brain mechanism (rather than a mere technical characterization of sensor data).

      To better illustrate how to use genephys to explore a specific data set, I have added a section (Fitting an empirical TGM) where I show how to fit specific hyperparameters to an empirical TGM in a simple manner.

      In the Introduction, I briefly mentioned:

      “This (not exhaustive) list of effects was considered given previous literature (Shah, et al., 2004; Mazaheri & Jensen, 2006; Makeig, et al., 2002; Vidaurre, et al., 2021), and each effect may be underpinned by distinct neural mechanisms. For example, it is not completely clear the extent to which stimulus processing is sustained by oscillations, and disentangling these effects can help resolving this question”

      In the Discussion, I have further commented:

      “Genephys has different available types of effect, including phase resets, additive damped oscillations, amplitude modulations, and non-oscillatory responses. All of these elements, which may relate to distinct neurobiological mechanisms, are configurable and can be combined to generate a plethora of TGMs that, in turn, can be contrasted to specific empirical TGMs. This way, we can gain insight on what mechanisms might be at play in a given task.

      The demonstrations here are not meant to be tailored to a specific data set, and are, for the most part, intentionally qualitative. TGMs do vary across experiments and subjects; and the hyperparameters of the model can be explicitly optimised to specific scientific questions, data sets, and even individuals. In order to explore the space of configurations effectively, an automatic optimisation of the hyperparameter space using, for instance, Bayesian optimisation (Lorenz, et al., 2017) could be advantageous. This may lead to the identification of very specific (spatial, spectral and temporal) features in the data that may be neurobiologically interpreted.”

      On p. 15 "Having a diversity of frequencies but not of latencies produces another regular pattern consisting of alternating, parallel bands of higher/lower than baseline accuracy. This, shown in the bottom left panel, is not what we see in real data either. Having a diversity of latencies but not of frequencies gets us closer to a realistic pattern, as we see in the top right panel." The terms frequency and latency seem to be confused.

      The Reviewer is right. I have corrected this now. Thank you.

      Reviewer #2:

      The results of comparisons between simulations and real data are not always clear for an inexperienced reader. For example, the comparisons are qualitative rather than quantitative, making it hard to draw firm conclusions. Relatedly, it is unclear whether the chosen parameterizations are the only/best ones to generate the observed patterns or whether others are possible. In the case of the latter, it is unclear what we can actually conclude about underlying signal generators. It would have been different if the model was directly fitted to empirical data, maybe of different cognitive conditions. Finally, the neurobiological interpretation of different signal properties is not discussed. Therefore, taken together, in its currently presented form, it is unclear how this method could be used exactly to further our understanding of the brain.

      This critique coincides with that of Reviewer 1. In the current version, I have made clearer that I am not fitting a specific empirical TGM, and why; instead, I am referring to general features that appear broadly throughout the literature. See more detailed changes below.

      Regarding whether the chosen parameterizations are the only/best ones to generate the observed patterns, the Discussion reflects this limitation:

      “Also importantly, I have shown that standard decoding analysis can differentiate between these explanations only to some extent. For example, the effects induced by phase-resetting and the use of additive oscillatory components are not enormously different in terms of the resulting TGMs. In future work, alternatives to standard decoding analysis and TGMs might be used to disentangle these sources of variation (Vidaurre, et al., 2019). ”

      And

      “Importantly, the list of effects that I have explored here is not exhaustive …”

      Of course, since the list of signal features I have explored is not exhaustive, it cannot be claimed without a doubt that these features are the ones generating the properties we observe in real TGMs. The model, however, is a step forward in that direction, as it provides us with a tool to at least rule out some causes.

      Firstly, it was not entirely clear to me from the introduction what gap exactly the model is supposed to fill: is it about variance in neural responses in general, about which signal properties are responsible for decoding, or about capturing stability of signals? It seems like it does all of these, but this needs to be made clearer in the introduction. It would be helpful to emphasize exactly what insights the model can provide that are unable to be obtained with the current methods.

      I have now made this explicit in the Introduction, as suggested:

      “To gain insight into what aspects of the signal underpin decoding accuracy, and therefore the most stable aspects of stimulus processing, I introduce a generative model”

      To help illustrate what insights the model can provide, I have added the following sentence as an example:

      “For example, it is not completely clear the extent to which stimulus processing is sustained by oscillations, and disentangling these effects can help resolving this question.”

      Furthermore, I was unclear on why these specific properties were chosen (lines 71 to 78). Is there evidence from neuroscience to suggest that these signal properties are especially important for neural processing? Or, if the logic has more to do with signal processing, why are these specific properties the most important to include?

      To clarify this, the text now reads:

      “In the model, when a channel responds, it can do it in different ways: (i) by phase-resetting the ongoing oscillation to a given target phase and then entraining to a given frequency, (ii) by an additive oscillatory response independent of the ongoing oscillation, (iii) by modulating the amplitude of the stimulus-relevant oscillations, or (iv) by an additive non-oscillatory (slower) response. This (not exhaustive) list of effects was considered given previous literature (Shah, et al., 2004; Mazaheri & Jensen, 2006; Makeig, et al., 2002; Vidaurre, et al., 2021), and each effect may be underpinned by distinct neural mechanisms”
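
      The four response types listed above can be illustrated on a single simulated channel. This is a toy sketch under stated assumptions: the function `toy_channel`, its parameters, and the decay constants are hypothetical and are not the genephys API; for simplicity the phase-reset case continues at the ongoing frequency rather than a separate entrainment frequency.

```python
import numpy as np

def toy_channel(effect, n_time=200, stim=50, omega=0.2, target_phase=0.0, seed=0):
    """One trial of one channel under one of four toy stimulus effects.

    Hypothetical illustration only; not the genephys implementation.
    `omega` is an angular frequency in radians per sample.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_time)
    after = t >= stim                      # True from stimulus onset onwards
    post = np.maximum(t - stim, 0)         # samples elapsed since the stimulus
    phase = rng.uniform(0, 2 * np.pi) + omega * t  # ongoing, random start phase
    amp = np.ones(n_time)
    extra = np.zeros(n_time)

    if effect == "phase_reset":
        # (i) jump to the target phase at onset, then keep oscillating
        phase = np.where(after, target_phase + omega * post, phase)
    elif effect == "additive_oscillation":
        # (ii) damped oscillation added on top of the ongoing one
        extra = after * np.exp(-post / 40.0) * np.cos(target_phase + omega * post)
    elif effect == "amplitude_modulation":
        # (iii) transient increase in the amplitude of the ongoing oscillation
        amp = 1.0 + after * np.exp(-post / 40.0)
    elif effect == "additive_slow":
        # (iv) slower, non-oscillatory response peaking 30 samples after onset
        extra = after * (post / 30.0) * np.exp(1.0 - post / 30.0)
    else:
        raise ValueError(f"unknown effect: {effect}")
    return amp * np.cos(phase) + extra

for name in ("phase_reset", "additive_oscillation",
             "amplitude_modulation", "additive_slow"):
    x = toy_channel(name)
    print(name, x.shape, f"value at onset = {x[50]:.3f}")
```

      Only the phase-reset case makes the signal at stimulus onset identical across trials regardless of the random ongoing phase; in the other three cases the ongoing oscillation is untouched and the stimulus-locked part is carried by the added or modulated component.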

      The general narrative and focus of the paper could also be improved. It might help to start off with an outline of what the goal is at the start of the paper and then explicitly discuss how each of the steps works toward that goal. For example, I got the idea that the goal was to capture specific properties of an empirical TGM. If this was the case, the empirical TGM could be placed in the main body of the text as a reference picture for all simulated TGMs. For each simulation step, it could be emphasized more clearly exactly which features of the TGM is captured and what that means for interpreting these features in real data.

      Thank you. To clarify the purpose of the paper better, I have brought Figure 2 to the front (before a Supplementary Figure), and in the first part of Results I have now added:

      “Figure 2 shows a TGM for an example subject, where some archetypal characteristics are highlighted. In the experiments below, specifically, I focus on the strong narrow diagonal at the beginning of the trial, the broadening of accuracy later in the trial, and the vertical/horizontal bars of higher-than-chance accuracy. Importantly, this specific example in Figure 2 is only meant as a reference, and therefore I did not optimise the model hyperparameters to this TGM (except in the last subsection), or showed any quantitative metric of similarity. ”

      I have enunciated the goals more clearly in the Introduction:

      “To gain insight into what aspects of the signal underpin decoding accuracy, and therefore the most stable aspects of stimulus processing, …”

      Relatedly, it would be good to connect the various signal properties to possible neurobiological mechanisms. I appreciate that the author tries to remain neutral on this in the introduction, but I think it would greatly increase the implications of the analysis if it is made clearer how it could eventually help us understand neural processes.

      The Reviewer is right in pointing out that I preferred to remain neutral on this. While I have still kept that tone of neutrality throughout the paper, I have now included the following sentence as an example of a neurobiological question that could be investigated with the model:

      “For example, it is not completely clear the extent to which stimulus processing is sustained by oscillations, and disentangling these effects can help resolving this question.”

      And, more generally,

      “Genephys has different available types of effect, including phase resets, additive damped oscillations, amplitude modulations, and non-oscillatory responses. All of these elements, which may relate to distinct neurobiological mechanisms, are configurable and can be combined to generate a plethora of TGMs that, in turn, can be contrasted to specific empirical TGMs. This way, we can gain insight on what mechanisms might be at play in a given task. ”

      Line 57: this sentence is very long, making it hard to follow, could you break up into smaller parts?

      Thank you. The sentence is fragmented now.

      Please replace angular frequencies with frequencies in Hertz for clarity.

      Here I have preferred to keep angular frequencies, which are more general than Hertz: expressing frequencies in Hertz would entail assuming a specific sampling frequency. I think doing so would create exactly the sort of confusion that I am trying to dispel in this revision, whose point is that these results are not specific to one TGM but reflect general features that we see broadly in the literature.
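
      The dependence on sampling rate can be shown with a short conversion, assuming the angular frequency is expressed in radians per sample, so that f = ω·fs/(2π). The sampling rates and the value of ω below are arbitrary illustrative choices.

```python
import math

def angular_to_hz(omega, fs):
    """Convert an angular frequency in radians per sample to Hz at rate fs."""
    return omega * fs / (2 * math.pi)

omega = 0.1 * math.pi  # radians per sample (illustrative value)
for fs in (100, 250, 1000):  # hypothetical sampling rates in Hz
    print(f"fs = {fs} Hz -> {angular_to_hz(omega, fs):.1f} Hz")
```

      The same angular frequency corresponds to 5, 12.5, or 50 Hz depending on the sampling rate, so quoting Hertz would tie the simulations to one particular acquisition setup.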

      There are quite a few typos throughout the paper; please recheck.

      Thank you. I have revised the text and done my best to remove them.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We are grateful for the comments and suggestions from the reviewers and have followed the recommendation in producing our revised manuscript. We have modified the text and performed additional statistical analysis as detailed below, which we believe has improved the overall manuscript.

      Reviewer #1 (Public Review):

      Establishing direct links between the neuronal connectivity information of connectomics datasets and circuit physiology and behavior is an exciting current research area in neurobiology. Until recently, studies of aggression in Drosophila had been conducted largely in males, and many of the neurons involved in this behavior are male-specific clusters. Since the currently available fly brain connectomes come from female brains, their applicability for the study of the circuitry underlying aggressive behavior is very limited.

      The authors have previously used the Janelia hemibrain connectome paired with behavior analysis to show that activating either the aIPg or pC1d cell types can induce short-term aggression in females, while activation of other PC1 clusters (a-c and e) does not. Here they expand on those findings, showing that optogenetic stimulation of aIPg neurons was sufficient to promote an aggressive internal state lasting at least 10 minutes following a 30-second activation. In addition, the authors show that while stimulation of PC1d alone is not sufficient to induce this persistent aggressive state, simultaneous activation of PC1d + PC1e is, suggesting a synergistic effect. Connectomics analysis performed in the authors' previous study had shown that PC1d and aIPg are interconnected. However, silencing pC1d neuronal activity did not reduce aIPg-evoked persistent aggression, indicating that the aggressive state did not depend on pC1d-aIPg recurrent connectivity.

      The conclusions are well supported by the data, and the results presented in this manuscript represent an important contribution to our understanding of the neuronal circuitry underlying female aggression.

      Reviewer #1 (Recommendations For The Authors):

      1. Previously, the authors have shown that the activation of PC1e alone does not induce female aggression. In this study, they investigate the role of aIPg, PC1d, or PC1d+e on aggression persistence, but they do not explore the effect of activation of PC1e alone. It is possible that PC1e activation may not produce an immediate short-term effect but could lead to a gradual increase in aggression over time, potentially explaining at least in part the observed effect upon PC1d+e activation. Incorporating an examination of the long-term impact of PC1e activation on aggression could provide valuable information.

      We did perform mixed-pair experiments with the pC1e-SS1 line from the Schretter et al. (2020) paper and did not find any significant changes in aggression over time in this setup either. We have now added a reference to these experiments in the revised submission in lines 135 to 136.

      1. Some important controls are missing: flies with the genetic combinations employed in the activation experiments shown in Figure 2 but in the absence of activation and under the exact same conditions and for a similar observation period.

      For Figure 2, we used an empty split-Gal4 driver as a genetic control for our activation paradigms. As these flies contain the same number of copies of mini-white while not labeling the targeted cell types, we believe that they provide an appropriate control for these experiments. The control information is specified in all figure legends as well.

      1. The quantification shown in Fig 3- Supplementary Figure 1 shows no effect during stimulation (13 s + 15s), but based on the plots of Figure 3, there may be an effect of silencing PC1d on aIPg-induced aggression during the initial 13 second period. Those two time periods (13 s vs 15 s) could be quantified separately to determine if this is the case.

      We examined the two stimulation periods separately and did not find any significant differences in either period (13s period, p = 0.2978; 15s period, p = 0.6650). We have now added this into the figure legend for Figure 3 and Figure 3 supplement 1.

      1. Expression of Kir2.1 in pC1d neurons while aIPg neurons were activated did not suppress aggression after aIPg stimulation, suggesting that connections from pC1d neurons are not necessary for the persistent aggressive state promoted by aIPg. Since previously the authors have shown that TNT-mediated inhibition of aIPg reduces aggression, the reciprocal experiment would be informative: determining if stimulation of PC1d+e no longer produces persistent aggression when aIPg neurons are silenced.

      In this manuscript, we were primarily testing if the connections from aIPg to pC1d were necessary for the persistent aggressive state induced by aIPg activation. Therefore, we believe the suggested experiment is beyond the scope of the current manuscript.

      1. How many times was each experiment repeated? This is important information and should be in the methods section for each type of experiment or in each figure legend.

      We have now added this information in the appropriate figure legends.

      1. Determining the effect on persistent aggression of silencing sNPF (for example via RNAi or CRISPR-Cas9-mediated mutagenesis) in aIPg neurons would be an important addition to the manuscript. If peptidergic signaling is underlying the persistence phenotype of aIPg neurons, that would explain why the recurrent connectivity found between those cells and the PC1 cluster does not play a role.

      We agree with the reviewer that this would be a logical next step in extending this work.

      Reviewer #2 (Public Review):

      The mechanisms that mediate female aggression remain poorly understood. Chiu, Schretter, and colleagues employed circuit dissection techniques to tease apart the specific roles of particular doublesex- and fruitless-expressing neurons in the fly Drosophila in generating a persistent aggressive state. They find that activating the fruitless-positive aIPg neurons generated an aggressive state that persisted for >10 min after the stimulation ended. Similarly, activating the doublesex-positive pC1d+e neurons also generated a persistent state. Activating pC1d or pC1e individually did not induce a persistent state. Interestingly, while neural activation of aIPg and pC1d+e neurons induced persistent behavioural states, it did not induce persistent activity in the neurons being activated.

      The conclusions of this paper are well supported by the data, there were only a few points where clarification might help:

      1. Figure 3 is a little confusing. This is a circuit behavioural epistasis experiment where the authors activate alPg with CsChrimson while inhibiting pC1d with Kir2.1. In Fig. 2 flies were separated for 10 min following stimulation which allowed for identification of a persistent state. However, in Fig 3 it appears as if flies were allowed to freely interact during and immediately post-stimulation. It is unclear why flies were not separated as in Fig. 2, which makes it difficult to compare the two results. Some discussion of this point would help. Also, from the rasters it appears as if inhibition of pC1d reduced aggression induced by alPg during the stimulation period. Is this true?

      We thank the reviewer for pointing out the need for clarification and we have modified the legend in Figure 3 to address the points raised. The flies were allowed to freely interact during the experiments shown in Figure 3 and we have added this information to the figure legend. To obtain a high level of aggressive behavior that would make it easier to observe a suppression of aggression, the epistasis experiments were performed with freely moving same-genotype pairs. The level of aggression triggered by the generation 1 LexA line labeling aIPg was lower than that observed when using the aIPg-SS GAL4 line. The experiment was performed as in Schretter et al. (2020), where we found that aIPg activation induced persistent fighting in same-genotype pairs. We have added a brief explanation in lines 152 to 155.

      Inhibition of pC1d does not significantly reduce the overall aggression induced by aIPg stimulation in the 13s + 15s period. We also examined the differences within the two stimulation periods and did not find any significant differences (13s period, p = 0.2978; 15s period, p = 0.6650). We have now added this information to the figure legends for Figure 3 and Figure 3 supplement 1.

      1. pC1e neurons also have recurrent connectivity with alPg neurons. It might help to also discuss the potential role of this arm of the microcircuit.

      We thank the reviewer for this suggestion. The number of synapses that aIPg sends back to pC1e is a very low proportion of its total output (0.177%). However, based on the experiments that we have performed, we cannot rule out that this microcircuit might contribute to maintaining persistence. We have added this point into the discussion in lines 210 to 211.

      Reviewer #2 (Recommendations For The Authors):

      1. Line 129-130: A citation for group-housed flies showing lower aggression would be helpful.

      We have now added in the reference to Chiu et al. (2021), as they showed this effect for females, in line 130.

      1. Figure 2 - figure supplement 1: In the legend, change "when pC1d neurons were stimulation" to "when pC1d neurons were stimulated".

      We thank the reviewer for finding this error and have now corrected this.

      Reviewer #3 (Public Review):

      Two studies published in 2020 independently identified the aIPg, pC1d, and pC1e neurons to be involved in initiating and maintaining a state of aggression in female Drosophila. Both studies combined behavioural analyses, optogenetic manipulation of neurons, and connectomics. One of these studies proposed that the extensive interconnections seen between the aIPg and pC1d+e neurons might represent a recurrent motif known to support persistent behavioural states in other systems. In this manuscript, the authors test this idea and report that their data do not support it. Specifically, they report that aIPg or pC1d+e (but not pC1d alone) can initiate a persistent state of aggression. But they find that the persistent aggressive state is maintained even when the pC1d neurons are inactivated. Finally, they show that neither of these neurons themselves sustains neuronal activity upon stimulation, nor does either of them induce persistent activity in the other. Together, their data suggest that the recurrent connection between aIPg and pC1d is not what supports the persistent state. The data underlying these claims are convincing. A possibility to explore before ruling out recurrent motifs (at this circuit level) in maintaining aggression is that the connections between aIPg and pC1e can compensate for the loss of pC1d. Overall, the study is important and will be of interest to those who study the circuit basis of persistent behavioural states, but also to neuroscientists in general.

      Reviewer #3 (Recommendations For The Authors):

      I enjoyed reading this manuscript for its clarity in writing and data presentation.

      I would like the authors to comment on the possibility that pC1e can compensate for the loss of pC1d. Is it possible that if they silence both pC1d and pC1e in the context of aIPg activation, the persistent aggression is lost?

      We agree with the reviewer that this is an intriguing hypothesis. In order to examine if pC1e does compensate for pC1d, we would need to also activate pC1e while inhibiting pC1d. However, such an experiment is not currently possible as we do not have a LexA line that specifically labels either pC1d or pC1e alone.

      For the pC1d+e silencing experiments, we were primarily testing to see if the most prominent recurrent connection, which is between pC1d and aIPg, was responsible for the behavioral persistence. We agree with the reviewer that this would be a logical follow-up experiment to be performed in the future.

      Have the authors looked for activity in the pC1e neuron upon stimulation of aIPg? (Deutsch et al. 2020 observed many regions in the brain that maintained sustained activity upon pC1d+e stimulation.)

      We have not examined this activity. We agree that this would be a good follow-up experiment; however, we believe it is beyond the scope of the current work.

      Would the more appropriate experiment in Figure 4c be the co-stimulation of pC1d+e while imaging from aIPg?

      For these experiments, we were testing to see if the most prominent recurrent connection, which is between pC1d and aIPg, was responsible for the behavioral persistence. We agree with the reviewer that this would be a good follow-up experiment.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      This study uses whole genome sequencing to characterise the population structure and genetic diversity of a collection of 58 isolates of E. coli associated with neonatal meningitis (NMEC) from seven countries, including 52 isolates that the authors sequenced themselves and a further 6 publicly available genome sequences. Additionally, the study used sequencing to investigate three case studies of apparent relapse. The data show that in all three cases, the relapse was caused by the same NMEC strain as the initial infection. In two cases they also found evidence for gut persistence of the NMEC strain, which may act as a reservoir for persistence and reinfection in neonates. This finding is of clinical importance as it suggests that decolonisation of the gut could be helpful in preventing relapse of meningitis in NMEC patients.

      Strengths:

      The study presents complete genome sequences for n=18 diverse isolates, which will serve as useful references for future studies of NMEC. The genomic analyses are high quality, the population genomic analyses are comprehensive and the case study investigations are convincing.

      We agree.

      Weaknesses:

      The NMEC collection described in the study includes isolates from just seven countries. The majority (n=51/58, 88%) are from high-income countries in Europe, Australia, or North America; the rest are from Cambodia (n=7, 12%). Therefore it is not clear how well the results reflect the global diversity of NMEC, nor the populations of NMEC affecting the most populous regions.

      The virulence factors section highlights several potentially interesting genes that are present at apparently high frequency in the NMEC genomes; however, without knowing their frequency in the broader E. coli population it is hard to know the significance of this.

      We acknowledged the limitations of our NMEC collection in the Discussion. We agree the prevalence of virulence factors in our collection is interesting. The limited size of our collection prevented further evaluation of the prevalence of these virulence factors in a broader E. coli population.

      Reviewer #2 (Public Review):

      Summary:

      In this work, the authors present a robust genomic dataset profiling 58 isolates of neonatal meningitis-causing E. coli (NMEC), the largest such cohort to be profiled to date. The authors provide genomic information on virulence and antibiotic resistance genomic markers, as well as serotype and capsule information. They go on to probe three cases in which infants presented with recurrent febrile infection and meningitis and provide evidence indicating that the original isolate is likely causing the second infection and that an asymptomatic reservoir exists in the gut. Accompanying these results, the authors demonstrate that gut dysbiosis coincides with the meningitis.

      Strengths:

      The genomics work is meticulously done, utilizing long-read sequencing.

      The cohort of isolates is the largest to be sampled to date.

      The findings are significant, illuminating the presence of a gut reservoir in infants with repeating infection.

      We agree.

      Weaknesses:

      Although the cohort of isolates is large, there is no global representation, entirely omitting Africa and the Americas. This is acknowledged by the group in the discussion, however, it would make the study much more compelling if there was global representation.

      We agree. In the Discussion we state this is likely a reflection of the difficulty in acquiring isolates causing neonatal meningitis, in particular from countries with limited microbiology and pathology resources.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Schembri et al performed a molecular analysis by WGS of 52 E. coli strains identified as "causing neonatal meningitis" from several countries and isolated from 1974 to 2020. Sequence types, virulence gene content, and antibiotic resistance genes are described. In the second part, they also described three cases of relapse and analysed their respective strains as well as the microbiome of three neonates during their relapse. For one patient the same E. coli strain was found in blood and stool (this patient had no meningitis). For two patients microbiome analysis revealed a severe dysbiosis.

      Major comments:

      Although the authors announce in their title that they study E. coli that cause neonatal meningitis and in methods stipulate that they had a collection of 52 NMEC, we found in Supplementary Table 1, 29 strains (therefore most of the strains) isolated from blood and not CSF. This is a major limitation since only strains isolated from CSF can be designated with certainty as NMEC even if pleocytosis is observed in the CSF. A very troubling finding is the description of patient two with a relapse infection. As stated in the text line 225, CSF microscopy was normal and culture was negative for this patient! Therefore it is clear that a patient without meningitis has been included in this study.

      We have reviewed the clinical data for our 52 NMEC isolates, noting that for some of the older Finnish isolates we relied on previous publications. This data is shown in Table S1. To address the Reviewer’s comment, we have added the following text to the methods section (new text underlined).

      ‘The collection comprised 42 isolates from confirmed meningitis cases (29 cultured from CSF and 13 cultured from blood) and 10 isolates from clinically diagnosed meningitis cases (all cultured from blood).’

      Patient 2 was initially diagnosed with meningitis based on a positive blood culture in the presence of CSF pleocytosis (>300 WBCs, >95% polymorphs). We understand there may be some confusion with reference to a relapsed infection, which we now more accurately describe as recrudescent invasive infection in the revised manuscript.

      Another major limitation (not stated in the discussion) is the absence of clinical information on neonates, especially the weeks of gestation. It is well known that the risk of infection is dramatically increased in preterm neonates due to their immature immunity. Therefore E. coli causing infection in preterm neonates are not comparable to those causing infection in term neonates, notably in their virulence gene content. Indeed, it is mentioned that at least eight strains did not possess a capsule; we can speculate that neonates were preterm, but this information is lacking. The ages of neonates are also lacking. The possible source of infection is not mentioned, notably urinary tract infection. This may also have an impact on the content of VF.

      We agree. In the Discussion we now note the following (new text underlined):

      ‘… we did not have clinical data on the weeks of gestation for all patients, and thus could not compare virulence factors from NMEC isolated from preterm versus term infants.’

      Submission to medRxiv, a requirement for review of our manuscript at eLife, necessitated the removal of some patient identifying information, including precise age and detailed medical history.

      Sequence analysis reveals the predominance of ST95 and ST1193 in this collection. The high incidence of ST95 is not surprising and has been well described previously; therefore, the concluding sentence at line 132 indicating that ST95 E. coli should exhibit specific virulence features associated with their capacity to cause NM does not add anything. On the contrary, the high incidence of ST1193 is of interest and should have been discussed in more detail. Which specific virulence factors do they harbor? Any hypothesis explaining their emergence in neonates?

      We compared the virulence factors of ST95 and ST1193 and summarized this information in Figure 4. We also discussed how the K1 polysialic acid capsule in ST95 and ST1193 could contribute to the emergence of these STs in NM. Specifically, we stated the following: ‘We speculate this is due to the prevailing K1 polysialic acid capsule serotype found in ST95 and the newly emerged ST1193 clone [22, 37] in combination with other virulence factors [15, 28, 29] (Figure 4) and the immature immune system of preterm infants.’

      In the paragraph describing the VFs, it is only stated that ST95 contained significantly more VF than the ST1193 strains. And so what? By the way, "significantly" is not documented: n=?, p=?

      We compared the prevalence of known virulence factors between ST95 and ST1193, and showed that ST95 strains in our collection contained significantly more virulence factors than the ST1193 strains. The P-value and the statistical test used were included in Supplementary Figure 3. To address the reviewer’s concern, we have now also added this to the main manuscript text as follows (new text underlined):

      ‘Direct comparison of virulence factors between ST95 and ST1193, the two most dominant NMEC STs, revealed that the ST95 isolates (n = 20) contained significantly more virulence factors than the ST1193 isolates (n=9), p-value < 0.001, Mann-Whitney two-tailed unpaired test (Supplementary Table 1, Supplementary Figure 3).’
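      The kind of comparison described in this added sentence can be sketched in a few lines of Python. This is not the authors' analysis code. The function name `mann_whitney_u`, the use of the normal approximation (with tie correction and without continuity correction), and the per-isolate counts below are illustrative assumptions only; the real per-isolate data are those referred to in Supplementary Table 1.

```python
import math

def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided

# Hypothetical virulence-factor counts per isolate (placeholders, not study data).
group_a = [41, 39, 44, 40, 42, 38, 43]
group_b = [30, 33, 29, 31, 34]
u_stat, p_value = mann_whitney_u(group_a, group_b)
print(f"U = {u_stat}, two-sided p = {p_value:.4f}")
```

      For groups as small as the placeholders above, an exact test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch short.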

      The complete sequence of 18 strains is not clear. Results of Supplementary Table 2 are presented in the text and are not discussed.

      NMEC isolates that were completely sequenced in this study are indicated in bold and marked with an asterisk in Figure 1. This information is indicated in the figure legend and was provided in the original submission. All information regarding genomic island composition and location, virulence genes and plasmid and prophage diversity is included in Supplementary Table 2. This information is highly descriptive and thus we elected not to include it as text in the main manuscript.

      46 years is a very long time for such a small number of strains, making it difficult to put forward epidemiological or evolutionary theories. In the analysis of antibiotic resistance, there are no ESBLs. However, Ding's article (reference 34) and other authors showed that ESBLs are emerging in E. coli neonatal infection. These strains are a major threat that should be studied, unfortunately, the authors haven't had the opportunity to characterize such strains in their manuscript.

      We agree 46 years is a long time-span. The study by Ding et al examined 56 isolates comprised of 25 different STs isolated in China from 2009-2015, with ST1193 (n=12) and ST95 (n=10) the most common. Our study examined 58 isolates comprised of 22 different STs isolated in seven different geographic regions from 1974-2020, with ST1193 (n=9) and ST95 (n=20) the most common. Thus, despite differences in the geographic regions from which isolates in the two studies were sourced, there are similarities in the most common STs identified. The fact that we observed less antibiotic resistance, including a lack of ESBL genes, in ST1193 is likely due to the different regions from which the isolates were sourced. We acknowledged and discussed the potential of ST1193 harbouring multidrug resistance including ESBLs in our manuscript as follows:

      ‘Concerningly, the ST1193 strains examined here carry genes encoding several aminoglycoside-modifying enzymes, generating a resistance profile that may lead to the clinical failure of empiric regimens such as ampicillin and gentamicin, a therapeutic combination used in many settings to treat NM and early-onset sepsis [35, 36]. This, in combination with reports of co-resistance to third-generation cephalosporins for some ST1193 strains [22, 34], would limit the choice of antibiotic treatment.’

      Second part of the manuscript:

      The three patients who relapsed had a late neonatal infection (> 3 days) with respective ages of 6 days, 7 weeks, and 3 weeks. We do not know whether they are former preterm newborns (no term specified) or whether they have received antibiotics in the meantime.

      As noted above, patient ages were not disclosed to comply with submission to medRxiv, a requirement for review of our manuscript at eLife.

      Patient 1: Although this patient had pleocytosis in the CSF, the culture was negative, which is surprising, and no explanation is provided. Therefore, the diagnosis of meningitis is not certain. Pleocytosis without meningitis has been previously described in neonates with severe sepsis. Line 215: no immunological abnormalities were identified (no details are given).

      We respectfully disagree with the reviewer. The diagnosis of meningitis is made unequivocally by the presence of a clearly abnormal CSF microscopy (2430 WBCs) and an invasive E. coli from blood culture. This does not seem controversial to the authors. We had believed it unnecessary to include this corroborative evidence, but have added the following to support our assertion:

      ‘The child was diagnosed with meningitis based on a cerebrospinal fluid (CSF) pleocytosis (>2000 white blood cells; WBCs, low glucose, elevated protein), positive CSF E. coli PCR and a positive blood culture for E. coli (MS21522).’

      On the contrary, the authors are surprised by the statement that CSF pleocytosis occurs in neonatal sepsis ‘without meningitis’ and do not know of any definitions of neonatal meningitis that are not tied to the presence of a CSF pleocytosis. Furthermore, the later isolation of E. coli from the CSF during the relapsed infection reinforces the initial diagnosis.

      Patient 2: This patient had a recurrence of bacteremia without meningitis (line 225: CSF microscopy was normal and culture negative!). This case should be deleted.

      In a similar vein to the previous comment, we respectfully assert that this patient has clear evidence of meningitis (330 WBCs in the CSF, taken 24h after initiation of antibiotic treatment). In this case, molecular testing was not performed as, under the principle of diagnostic stewardship, it was not considered necessary by the clinical microbiologists and treating clinicians following the culture of E. coli in the bloodstream. We agree that this is not a case of recurrent meningitis, but our intention was to highlight the recrudescence of an invasive infection (urinary sepsis requiring admission to hospital and intravenous antibiotics) which we hypothesise has arisen from the intestinal reservoir. We did not state that all patients suffered from relapsed meningitis.

      Despite this, to address this reviewer’s concern, we have changed all references to ‘relapsed infection’ to now read ‘recrudescent invasive infection’ in the revised manuscript.

      Patient 3: This patient had two relapses, which is exceptional and may suggest the existence of a congenital malformation or a neurological complication such as abscess or empyema; therefore, "imaging studies" should be detailed.

      This patient underwent extensive imaging investigation to rule out a hidden source. This included repeated MRI imaging of head and spine, CT imaging of head and chest, USS imaging of abdomen and pelvis and nuclear medicine imaging to detect a subtle meningeal defect and CSF leak. All tests were normal, and no abscess or empyema was found.

      We have modified the text to include this information:

      Text in original submission: ‘Imaging studies and immunological work-up were normal.’

      New text in revised manuscript (underlined): ‘Extensive imaging studies including repeated MRI imaging of the head and spine, CT imaging of the head and chest, ultrasound imaging of abdomen and pelvis, and nuclear medicine imaging did not show a congenital malformation or abscess. Immunological work-up did not show a known primary immunodeficiency. At two years of age, speech delay is reported but no other developmental abnormality.’

      The authors suggest a link between intestinal dysbiosis and relapse in three patients. However, the fecal microbiomes of patients without relapse were not analysed, so no comparison is possible. Moreover, dysbiosis after several weeks of antibiotic treatment in a patient hospitalized for a long time is not unexpected. Therefore, it's impossible to make any assumption or draw any conclusion. This part of the manuscript is purely descriptive. Finally, the authors should be more prudent when they state in line 289 "we also provide direct evidence to implicate the gut as a reservoir [...] antibiotic treatment". Indeed, the gut colonization of the mothers with the same strain may also be a reservoir (as stated in the discussion line 336). Finally, the authors do not discuss the potential role of ceftriaxone vs cefotaxime in the dysbiosis observed. Ceftriaxone may have a major impact on the microbiota due to its digestive elimination.

      We addressed the limitations of our study in the Discussion, including that we did not have access to urine or stool samples from the mother of the infants that suffered recrudescence, and thus cannot rule out mother-to-child transmission as a mechanism of reinfection. We have now added that we did not have clinical data on the weeks of gestation for all patients, and thus could not compare virulence factors from NMEC isolated from preterm versus term infants. The limitations of our study are summarised as follows in the Discussion (new text underlined):

      ‘This study had several limitations. First, our NMEC strain collection was restricted to seven geographic regions, a reflection of the difficulty in acquiring strains causing this disease. Second, we did not have access to a complete set of stool samples spanning pre- and post-treatment in the patients that suffered NM and recrudescent invasive infection. This impacted our capacity to monitor E. coli persistence and evaluate the effect of antibiotic treatment on changes in the microbiome over time. Third, we did not have access to urine or stool samples from the mother of the infants that suffered recrudescence, and thus cannot rule out mother-to-child transmission as a mechanism of reinfection. Finally, we did not have clinical data on the weeks of gestation for all patients, and thus could not compare virulence factors from NMEC isolated from preterm versus term infants.’

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Shibl et al. studied the possible role of the dicarboxylate metabolite azelaic acid (Aze) in modulating the response of different bacteria: it was used as a carbon source by Phycobacter and was possibly toxic for Alteromonas. The experiments were well conducted using transcriptomics, transcriptional factor coexpression networks, uptake experiments, and chemical methods to unravel the uptake, catabolism, and toxicity of Aze on these two bacteria. They identified a putative Aze TRAP transporter in bacteria and showed that Aze is assimilated through fatty acid degradation in Phycobacter. Meanwhile, in Alteromonas it is suggested that Aze inhibits the ribosome and/or protein synthesis, and that efflux pumps shuttle Aze outside the cytoplasm. Further on, they demonstrate that seawater amended with Aze selects for microbes that can catabolize Aze.

      Major strengths:

      The manuscript is well written and very clear. Through the combination of gene expression, transcriptional factor co-expression networks, uptake experiments, and chemical methods, Shibl et al. showed that Aze has a different response in two bacteria.

      Major weakness:

      There is no confirmation of the Aze TRAP transporters through mutagenesis.

      Impact on the field:

      Metabolites exert a significant influence on microbial communities in the ocean, playing a crucial role in their composition, dynamics, and biogeochemical cycles. This research highlights the intriguing capacity of a single metabolite to induce contrasting responses in distinct bacterial species, underscoring its role in shaping microbial interactions and ecosystem functions.

      We thank the reviewer for their comments on the paper and we appreciate their suggestion to confirm the activity of Aze TRAP transporters through mutagenesis. We agree that this would be a valuable addition to the study, and we mention in the text that “Despite numerous attempts, our efforts to knock-out azeTSL in Phycobacter failed.”

      Mutagenesis experiments are often time-consuming and have a low success rate. There are a few reasons why our knock-out experiments with Phycobacter have not been successful. Despite using several modified protocols for electroporation, no Phycobacter colonies grew on the antibiotic plate. We then tried the homologous recombination approach for conjugation but were not successful in selecting for Phycobacter cells, even when grown in high salinity conditions that favor Phycobacter and disfavor the carrier, E. coli. While we would love to include a mutant to confirm the function of this cluster, the task seems to be unattainable at the moment.

      Reviewer #2 (Public Review):

      This study explores the breadth of effects of one important metabolite, azelaic acid, on marine microbes, and reveals in-depth its pathway of uptake and catabolism in one model bacterial strain. This compound is known to be widely produced by phytoplankton and plants, and to have complex effects on associated microbiomes.

      This work uses transcriptomics to assay the response of two strains that show contrasting responses to the metabolite: one catabolizes the compound and assimilates the carbon, while the other shows growth inhibition and stress response. A highly induced TRAP transporter, adjacent to a previously identified regulator, is inferred to be the specific uptake system for azelaic acid. However, the transport function was not directly tested via genetic or biochemical methods. Nevertheless, this is a significant finding that will be useful for exploring the distribution of azelaic acid uptake capability across metagenomes and other bacteria.

      The authors use pulse-chase style metabolomics experiments to beautifully demonstrate the fate of azelaic acid through catabolic pathways. They also measure an assimilation rate per cell, though it remains unclear how this measured rate relates to natural systems. The metabolomics approach is an elegant way to show carbon flux through cells, and could serve as a model for future studies.

      The study seeks to extend the results from two model strains to complex communities, using seawater mesocosm experiments and soil/Arabidopsis experiments. The seawater experiments show a community shift in mesocosms with added azelaic acid. However, the mechanisms for the shift were not determined; further work is necessary to demonstrate which community members are directly assimilating the compound vs. benefitting indirectly or experiencing inhibition. In my opinion the soil and Arabidopsis experiments are quite preliminary. I appreciate the authors' desire to broaden the scope beyond marine systems, but I believe any conclusions regarding different modes of action in aquatic vs terrestrial microbial communities are speculative at this stage.

      This work is a nice illustration of how we can begin to tease apart the effects of chemical currencies on marine ecosystems. A key strength of this work is the combination of transcriptomics and metabolomics methods, along with assaying the impacts of the metabolite on both model strains of bacteria and whole communities. Given the sheer number of compounds that probably play critical roles in community interactions, a key challenge for the field will be navigating the tradeoffs between breadth and depth in future studies of metabolite impacts. This study offers a good compromise and will be a useful model for future studies.

      We thank the reviewer for their thoughtful comments on the manuscript. We appreciate their feedback on the breadth of effects of Aze on marine microbes, and their insights into the strengths and limitations of our study.

      We agree that the specific mechanisms underlying community-level shifts in seawater mesocosm experiments with added Aze are not yet fully understood and we believe such work is beyond the scope of this paper and warrants an in-depth study of its own. This can perhaps be conducted at a larger scale by using a combination of meta-omics and targeted enrichment to identify the community members directly assimilating Aze, as well as those that are benefitting indirectly or experiencing inhibition.

      We also agree that the soil and Arabidopsis experiments are exploratory. However, we believe that these experiments are a valuable first step in highlighting the potential for Aze to have different modes of action in aquatic versus terrestrial microbial communities. Our interest in contrasting bacterial molecular responses in terrestrial plant rhizospheres and marine algal phycospheres stems from the fact that both environments share similar molecules and related bacteria, yet exhibit significantly different evolutionary histories and fluid dynamic profiles (Seymour et al 2017, Nature Microbiol). Although more is known about Aze in Arabidopsis than phytoplankton, there are still gaps in this knowledge. For example, recent work has shown that Aze and derivatives can be secreted into soil (Korenblum et al 2020, PNAS), but whether Aze directly influences microbial communities in soil as we have shown in seawater has not been explored. Thus, we feel our preliminary experiments in soil are important to provide such a distinction with seawater. Additional studies in these systems to further investigate the importance of Aze, which were beyond the scope of this current work, would be quite beneficial.

      Reviewer #1 (Recommendations For The Authors):

      General comments:

      A complete supplemental file of differentially expressed genes should be provided in the supplemental. Please add tables with the entire DESeq output for Aze additions in the genomes of Phycobacter (0.5 and 8 h) and Alteromonas (0.5 h). While it makes sense to focus the paper on Aze related genes, the full dataset should be made available in a more curated form than just the raw reads in the SRA.

      We thank the reviewer for this suggestion. We have included three more sheets in Supplementary Table 1 file where readers can find the entire DESeq outputs of Phycobacter (0.5 and 8 h) and Alteromonas (0.5 h) experiments.

      Specific comments:

      • L82 indicates the TRAP transporter for Aze. Looking at the table for gene expression of Phycobacter there are 26 significantly enriched transport genes at 0.5 h other than the putative Aze TRAP transporter. Even though the TRAP transporter is likely transporting Aze, it would be good to let the readers know that other transporters showed transcript enrichment.

      Thank you for this helpful comment. We modified the sentence accordingly to read as follows: “Among 26 enriched transporter genes in our dataset, a C4-dicarboxylate tripartite ATP-independent periplasmic (TRAP) transporter substrate-binding protein (INS80_RS11065) was the most and the third most upregulated gene in Phycobacter grown on Aze at 0.5 and 8 hours, respectively.”

      • Figure 1: There are many genes enriched from -1 to 1. Is there a cut off, p-val (can you add it to the caption)? It would be good to have a dashed line or something that indicates the -1 and 1 log2 fold change in the figure.

      We thank the reviewer for this suggestion. We added the following sentence to the legend of Fig. 1: “Genes were considered DE with a p-adjusted value of < 0.05 and a log2 fold-change of ≥ ±0.50.”
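      Read literally, the cutoff quoted in the legend amounts to a two-part filter on each row of a DESeq results table. The sketch below is one possible reading, not the authors' pipeline: the field names `padj` and `log2FoldChange` follow the column names that DESeq2 results normally carry, "≥ ±0.50" is interpreted as an absolute log2 fold-change of at least 0.50, and the gene identifiers and values are made up for illustration.

```python
def is_differentially_expressed(row, padj_cutoff=0.05, lfc_cutoff=0.50):
    """Apply the legend's rule: adjusted p < 0.05 and |log2 fold-change| >= 0.50."""
    padj = row.get("padj")
    lfc = row.get("log2FoldChange")
    # Rows without an adjusted p-value (reported as NA, e.g. after
    # independent filtering) cannot pass the cutoff.
    if padj is None or lfc is None:
        return False
    return padj < padj_cutoff and abs(lfc) >= lfc_cutoff

# Hypothetical rows (placeholders, not values from the study).
rows = [
    {"gene": "gene_A", "log2FoldChange": 3.2, "padj": 1e-10},   # up, significant
    {"gene": "gene_B", "log2FoldChange": -0.7, "padj": 0.01},   # down, significant
    {"gene": "gene_C", "log2FoldChange": 0.3, "padj": 0.001},   # effect too small
    {"gene": "gene_D", "log2FoldChange": 2.0, "padj": 0.20},    # not significant
    {"gene": "gene_E", "log2FoldChange": 1.5, "padj": None},    # no adjusted p-value
]
de_genes = [r["gene"] for r in rows if is_differentially_expressed(r)]
print(de_genes)
```

      Under this reading, genes with a log2 fold-change between -1 and 1 can still be called DE, which is consistent with the reviewer's observation about Figure 1.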

      • Supplementary tables: Add a title on all the supplementary tables. It's hard to tell what each one of the tables means without looking at the text and content of each table.

      A short descriptive title is now added to all supplementary tables.

      • Not sure if it matters, but Table S1 was not available in the attached files, though it is in the complete pdf.

      Table S1 is now in the attached files and the DESeq output has been added to it as suggested in the general comment above.

      Reviewer #2 (Recommendations For The Authors):

      Here I offer some more specific suggestions and comments on the methods and presentation.

      I recommend being careful throughout with the language regarding conclusions. For instance, the study does not directly demonstrate the activity of the TRAP transporter (as mentioned above), and does not directly demonstrate that the bacteria that increase in abundance in the mesocosm experiments are actually assimilating azelaic acid.

      We thank the reviewer for this comment. We agree that further studies are required to get definitive answers regarding the direct activity of the transporter genes and direct assimilation of Aze by bacteria in the mesocosm. These complex experiments would require establishing a reproducible workflow for knocking out genes and further isotope labeling experiments to track Aze assimilation in a natural setting. To that end, we were keen on using language throughout the manuscript indicating that transporter activity is putative. We went through the manuscript again to make sure it was clear that the transporter activity is putative at this time and is not confirmed. For the mesocosms, we cannot rule out that the changes in community structure are due to factors other than Aze. We have added a sentence in the discussion of the mesocosm experiments to indicate that the observed changes in microbial community cannot be directly attributed to Aze activity and may be a byproduct of other mechanisms.

      Additionally, I find the soil and plant experiments to be very preliminary, and would personally recommend removing them from the manuscript. This is of course the authors' choice, but I find they detract from an otherwise more solid story. I wonder whether 16 hours was sufficient to see community changes and whether adding azelaic acid directly into the plant is necessary or relevant. The study does not measure any plant immune responses so I caution against drawing conclusions about the mechanism. It seems the connection to plant immunity was already shown in the literature, in which case I'm not sure whether these experiments presented here really add anything new to the paper.

      We thank the reviewer for these comments. Our 16-hour sampling time point (similar to the seawater experiment) represents an overnight incubation period that should allow sufficient change in the natural microbial composition yet avoids the long-term succession of microbes with high metabolic capacities that may outcompete the rest of the community at long incubation periods. Deciding on this length of incubation was also informed by the uptake rate of Aze and its influence on either bacteria assimilating it as a carbon source or being inhibited by it.

      Since no significant changes were observed in the soil, it was necessary to test the hypothesis that the plant host might be indirectly influencing the rhizosphere microbial communities by infiltrating A. thaliana leaves with Aze. As the reviewer mentions, the association between Aze and plant immunity was previously shown; however, the overall influence on the microbial community has not been fully explored yet. The soil and plant experiments were meant to serve an exploratory purpose and we find them necessary to keep in the manuscript as a first step in comparing the mode of action of Aze within marine and terrestrial ecosystems. They are by no means the answer to what role Aze plays in soil systems, but rather they are the starting point. We hope that our results encourage some readers to investigate similar common metabolites to further elucidate the molecular underpinnings of microbial modulation in both environments.

      Regarding the transcriptomics data, I am not clear on why the "expression ratio" -- i.e. the fraction of pathway genes that were differentially abundant -- was used. I would not expect all transcripts in a pathway to behave the same way in response to a perturbation, due to variation in half-life/stability, post-transcriptional and post-translational regulation, etc. I recommend removing the expression ratio (right panel) from Figure 1. The left panel shows the data more clearly and more directly.

      We thank the reviewer for their insight and we agree that not all transcripts in a pathway behave the same way. However, we find the expression ratio panel visually informative to highlight the importance of a pathway in response to Aze, taking into consideration the total number of key genes involved in a pathway. For example, despite the larger number of DE genes associated with the Amino Acid Metabolism & Degradation pathway compared to the Fatty Acid Degradation pathway, the expression ratio for the former in each transcriptome is lower than its Fatty Acid Degradation counterpart, indicating that the response of key fatty acid degradation genes to Aze is more pronounced. We have qualified the reasons for including expression ratios in the Figure 1 legend.
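      The point about expression ratios can be illustrated with a minimal sketch. It assumes the ratio is simply the number of differentially expressed (DE) genes divided by the total number of key genes in the pathway, as the reviewer describes it. The gene counts below are hypothetical and are not taken from the manuscript.

```python
def expression_ratio(n_de_genes, n_pathway_genes):
    """Fraction of a pathway's key genes that are differentially expressed."""
    if n_pathway_genes <= 0:
        raise ValueError("a pathway needs at least one gene")
    return n_de_genes / n_pathway_genes


# Hypothetical counts: (DE genes, total key genes in the pathway).
pathways = {
    "Amino Acid Metabolism & Degradation": (12, 60),
    "Fatty Acid Degradation": (8, 16),
}

ratios = {name: expression_ratio(de, total) for name, (de, total) in pathways.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

      With these made-up counts, the pathway with more DE genes (12 versus 8) still has the lower ratio (0.20 versus 0.50), which is the situation the response describes.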

      Overall I enjoyed reading the manuscript and applaud the authors on a nice contribution to this important field.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Peng et al. develop a computational method to predict/rank transcription factors (TFs) according to their likelihood of being pioneer transcription factors--factors that are capable of binding nucleosomes--using ChIP-seq for 225 human transcription factors, MNase-seq and DNase-seq data from five cell lines. The authors developed relatively straightforward, easy to interpret computational methods that leverage the potential for MNase-seq to enable relatively precise identification of the nucleosome dyad. Using an established smoothing approach and local peak identification methods to estimate positions together with identification of ChIP-seq peaks and motifs within those peaks which they referred to as "ChIP-seq motifs", they were able to quantify "motif profiles" and their density in nucleosome regions (NRs) and nucleosome depleted regions (NDRs) relative to their estimated nucleosome dyad positions. Using these profiles, they arrived at an odds-ratio-based motif enrichment score along with a Fisher's exact test to assess the odds and significance that a given transcription factor's ChIP-seq motifs are enriched in NRs compared to NDRs, hence, its potential to be a pioneer transcription factor. They showed that known pioneer transcription factors had among the highest enrichment scores, and they could identify a number of relatively novel pioneer TFs with high enrichment scores and relatively high expression in their corresponding cell line. They used multiple validation approaches including (1) calculating the ROC-AUC and Matthews correlation coefficient (MCC) and generating ROC and precision-recall curves associated with their enrichment score based on 32 known pioneer TFs among their 225 TFs which they used as positives and the remaining TFs (among the 225) as negatives; (2) use of the literature to note that known pioneer TFs that acted as key regulators of embryonic stem cell differentiation had the highest enrichment scores; (3) comparison of their enrichment scores to three classes of TFs defined by protein microarray and electromobility shift assays (1. strong binder to free and nucleosomal DNA, 2. weak binder to free and nucleosomal DNA, 3. strong binding to free but not nucleosomal DNA); and (4) correlation between their calculated TF motif nucleosome end/dyad binding ratio and relevant data from an NCAP-SELEX experiment. They also characterize the spatial distribution of TF motif binding relative to the dyad by (1) correlating TF motif density and nucleosome occupancy and (2) clustering TF motif binding profiles relative to their distance from the dyad and identifying 6 clusters.
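      As a rough illustration of the scoring idea summarized above, the sketch below computes an odds ratio and a one-sided Fisher's exact test on a 2x2 table. The table layout (motif sites versus other sites, in NRs versus NDRs) and all counts are assumptions made for illustration. The manuscript's Equation 3 is not reproduced in this review, so this is not the authors' exact score.

```python
from math import comb


def odds_ratio(nr_motif, nr_other, ndr_motif, ndr_other):
    """Odds of a motif site in NRs divided by the odds in NDRs.

    Returns infinity when the denominator of the ratio is zero.
    """
    denominator = nr_other * ndr_motif
    if denominator == 0:
        return float("inf")
    return (nr_motif * ndr_other) / denominator


def fisher_exact_greater(nr_motif, nr_other, ndr_motif, ndr_other):
    """One-sided Fisher's exact test p-value for enrichment in NRs.

    Sums hypergeometric probabilities of tables with at least
    `nr_motif` motif sites in NRs, keeping all margins fixed.
    """
    row_nr = nr_motif + nr_other        # all sites in NRs
    col_motif = nr_motif + ndr_motif    # all motif sites
    total = row_nr + ndr_motif + ndr_other
    upper = min(row_nr, col_motif)
    favourable = sum(
        comb(row_nr, k) * comb(total - row_nr, col_motif - k)
        for k in range(nr_motif, upper + 1)
    )
    return favourable / comb(total, col_motif)


# Hypothetical counts for one TF (not taken from the manuscript).
score = odds_ratio(30, 970, 10, 990)
p_value = fisher_exact_greater(30, 970, 10, 990)
print(f"odds ratio = {score:.2f}, one-sided p = {p_value:.2e}")
```

      An odds ratio above 1 with a small p-value would indicate that motif sites fall in NRs more often than expected from the NDR background, under the assumed table layout.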

      The strengths of this paper are the use of MNase-seq data to define relatively precise dyad positions and ChIP-seq data together with motif analysis to arrive at relatively accurate TF binding profiles relative to dyad positions in NRs as well as in NDRs. This allowed them to use a relatively simple odds ratio based enrichment score which performs well in identifying known pioneer TFs. Moreover, their validation approaches either produced highly significant or reasonable, trending results.

      The weaknesses of the paper are relatively minor, and the authors do a good job describing the limitations of the data and approach.

      Reviewer #2 (Public Review):

      In this study, the authors utilize a compendium of public genomic data to identify transcription factors (TF) that can identify their DNA binding motifs in the presence of nucleosome-wrapped chromatin and convert the chromatin to open chromatin. This class of TFs is termed Pioneer TFs (PTFs). A major strength of the study is the concept, whose premise is that motifs bound by PTFs (assessed by ChIP-seq for the respective TFs) should be present in both "closed" nucleosome wrapped DNA regions (measured by MNase-seq) as well as open regions (measured by DNAseI-seq) because the PTFs are able to open the chromatin. Use of multiple ENCODE cell lines, including the H1 stem cell line, enabled the authors to assess if binding at motifs changes from closed to open. Typical, non-PTF TFs are expected to only bind motifs in open chromatin regions (measured by DNaseI-seq) and not in regions closed in any cell type. This study contributes to the field a validation of PTFs that are already known to have pioneering activity and presents an interesting approach to quantify PTF activity.

      For this reviewer, there were a few notable limitations. One was the uncertainty regarding whether expression of the respective TFs across cell types was taken into account. This would help inform if a TF would be able to open chromatin. Another limitation was the cell types used. While understandable that these cell types were used, because of their deep epigenetic phenotyping and public availability, they are mostly transformed and do not bear close similarity to lineages in a healthy organism. Next, the methods used to identify PTFs were not made available in an easy-to-use tool for other researchers who may seek to identify PTFs in their cell type(s) of interest. Lastly, some terms used were not defined explicitly (e.g., meaning of dyads) and the language in the manuscript was often difficult to follow and contained improper English grammar.

      Reviewer #3 (Public Review):

      Peng et al. designed a computational framework for identifying pioneer factors using epigenomic data from five cell types. The identification of pioneer factors is important for our understanding of the epigenetic and transcriptional regulation of cells. A computational approach toward this goal can significantly reduce the burden of labor-intensive experimental validation.

      The authors have addressed my previous comments.

      The main issue identified in this re-review is based on the authors' additional experiments to investigate the reproducibility of the pioneer factors identified in the previous analysis that anchored on H1 ESCs.

      The additional analysis that uses the other four cell types (HepG2, HeLa-S3, MCF-7, and K562) as anchors reveals the low reproducibility/concordance and high dependence on the selection of anchor cell type in the computational framework. In particular, now several stem cell related TFs (e.g. ESRRB, POU5F1) are ranked markedly higher when H1 ESC is not used as the anchor cell type as shown in Supplementary Figure 5.

      Of note, the authors have now removed the shape labels that denote Yamanaka factors in Figure 2c (revised manuscript) that were presented in the main Figure 2a in the initial submission. The NFYs and ESRRB labels in Supplementary Figure 4a are also removed, and the boxplot comparing NFYs and ESRRB with other TFs is also removed from this figure. Removing these results effectively hides the issues of the computational framework we identified in this revision. Please justify why this was done.

      In summary, these new results reveal significant limitations of the proposed computational framework for identifying pioneer factors. The current identifications appear to be highly dependent on the choice of cell types.

      Response: We thank all reviewers for their thoughtful and constructive comments and suggestions, which helped us to strengthen our paper. Following the suggestions, we have further addressed the reviewer’s comments and the detailed responses are itemized below.

      Reviewer #1 (Recommendations For The Authors):

      The following few minor mistakes/discrepancies/omissions should be addressed:

      1. In Figure 3, the Nucleosome Occupancy curves and legend are orange and the Binding Motif Profiles are blue; however, the y-axis label for Nucleosome occupancy profile is blue, and the y-axis label of Binding motif profile is orange. The colors seem to be switched, or I'm missing something.

      Response: We thank the reviewer for pointing this out. We have changed the colors to make them consistent.

      2. The text at the bottom of p. 11 of the main manuscript describing Supplementary Fig. 5 states: "If we repeat our anaysis by redefining differentially open regions as those closed in differentiated cell lines and open in H1 embryonic cell line, then ESSRB and Yamanaka pioneer transcription factor POU5F1 (OCT4) showed significantly higher enrichment scores (Supplementary Figure 5)." However, Supplementary Fig. 5 legend states: "Enrichment analysis of different TFs using the differentially open from one cell line (shown in the title) and conserved open regions from other four cell lines.". These two descriptions of the differential chromatin criteria used in the analysis don't appear to match. The description in the text is the one that makes much more sense to me. The legend should be written a little more clearly and reflect the statement in the main text. One can see from the cut and paste the "analysis" is also misspelled.

      Response: We have rewritten the legend of Supplementary Figure 5 to make it clear and consistent. The misspelling has also been corrected.

      3. It might be helpful to add that a random classifier would yield a constant precision recall (PR) curve (as a function of Recall) with the Precision = P/(P+N) or the fraction of positives for all plotted PR curves which in the case of Fig. 2a is 32/225 = 0.142, for example.

      Response: We thank the reviewer for the suggestion. We have added the fraction of positives to Figure 2.

      4. On p. 17 line 513, the authors refer to "Supplementary 7, 9 and 13". I'm assuming it's "Supplementary Tables 7, 9 and 13".

      Response: It has been corrected.

      5. On p. 18 line 539, "essays" should be "assays".

      Response: It has been corrected.
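      The precision baseline mentioned in the recommendation on PR curves can be reproduced with a few lines. This is only a worked check of the reviewer's arithmetic (32 positives among 225 TFs). The function name is hypothetical.

```python
def random_classifier_precision(n_positives, n_total):
    """Expected precision of a classifier that ranks items at random.

    For a random ranking, the expected precision at any recall level
    equals the fraction of positives, P / (P + N).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_positives / n_total


baseline = random_classifier_precision(32, 225)
print(f"{baseline:.3f}")  # prints 0.142
```

      A horizontal line at this value on each PR plot gives readers a reference against which to judge the enrichment score.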

      Reviewer #2 (Recommendations For The Authors):

      We are satisfied with the revisions in this version of the manuscript.

      Reviewer #3 (Public Review):

      The main issue identified in this re-review is based on the authors' additional experiments to investigate the reproducibility of the pioneer factors identified in the previous analysis that anchored on H1 ESCs.

      The additional analysis that uses the other four cell types (HepG2, HeLa-S3, MCF-7, and K562) as anchors reveals the low reproducibility/concordance and high dependence on the selection of anchor cell type in the computational framework. In particular, now several stem cell related TFs (e.g. ESRRB, POU5F1) are ranked markedly higher when H1 ESC is not used as the anchor cell type as shown in Supplementary Figure 5.

      Of note, the authors have now removed the shape labels that denote Yamanaka factors in Figure 2c (revised manuscript) that were presented in the main Figure 2a in the initial submission. The NFYs and ESRRB labels in Supplementary Figure 4a are also removed, and the boxplot comparing NFYs and ESRRB with other TFs is also removed from this figure. Removing these results effectively hides the issues of the computational framework we identified in this revision. Please justify why this was done.

      In summary, these new results reveal significant limitations of the proposed computational framework for identifying pioneer factors. The current identifications appear to be highly dependent on the choice of cell types.

      Response: We would like to clarify that our enrichment score used for TF classification, defined by Equation 3, is expected to be cell-type specific. The value of the enrichment score is modulated by a number of factors beyond the property of a TF to act as a PTF, such as the abundance of a given TF in a given cell line, cell type-specific nucleosome binding maps and interactions with other TFs. Thus, it is expected that the enrichment scores calculated for the same TF in different cell lines should be quantitatively different. Following the initial suggestion of Reviewer 3, we have diversified our analysis by using different cell lines as anchors. This analysis showed that most PTFs that we identified could be confirmed based on different cell lines, when comparing the relative enrichment scores within each cell line. On the other hand, it is not expected that the values of enrichment scores of a given TF should be similar across different cell lines.

      Regarding the specific comment about ESRRB and POU5F1, these TFs are known pioneer factors with roles in reprogramming somatic cells into induced pluripotent stem cells and in suppressing cell differentiation. They have the ability to open closed chromatin regions in the differentiated cell lines. Therefore, if we redefine the differentially open regions as those closed in differentiated cell lines and open in the H1 embryonic cell line, these pioneer factors are expected to have high enrichment scores. Indeed, our new results validated the roles of these PTFs in cell reprogramming. As mentioned above, their enrichment scores in different cell lines are not expected to be the same.

      We also would like to clarify that no results were removed during the update of the figures, and all modifications of the manuscript following the suggestions of the reviewers were only made to improve the figures and make them clearer and the message more straightforward.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      We would like to thank the Editors and Reviewers for their additional comments and constructive feedback on our manuscript. We have made minor adjustments to the figures and texts based on their suggestions, including improved images in Figure 1 and correction of figure labels.

      Reviewer #1 (Public Review):

      In their previous paper (Lari et al., 2019; Azra Lari, Arvind Arul Nambi Rajan, Rima Sandhu, Taylor Reiter, Rachel Montpetit, Barry P Young, Chris JR Loewen, Ben Montpetit (2019) A nuclear role for the DEAD-box protein Dbp5 in tRNA export. eLife 8:e48410) as well as in the current manuscript, the authors state that Dbp5 is involved in the export of tRNA that is independent of and parallel to Los1. They state that Dbp5 binds to the tRNA independent of known tRNA export proteins. The obtained conclusion is both intriguing and innovative, since it suggests that there are other variables, beyond the ones previously identified as tRNA factors, that might interact with Dbp5 to facilitate the export process. In order to find additional factors aiding this process, the authors may employ total RNA-associated protein purification (TRAPP) experiments (Shchepachev et al., 2019; Shchepachev V, Bresson S, Spanos C, Petfalski E, Fischer L, Rappsilber J, Tollervey D. Defining the RNA interactome by total RNA-associated protein purification. Mol Syst Biol. 2019 Apr 8;15(4):e8689. doi: 10.15252/msb.20188689. PMID: 30962360; PMCID: PMC6452921) to identify extra factors involved in conjunction with Dbp5. This approach could elucidate hitherto uninvestigated tRNA export components that function in conjunction with Dbp5.

      Author Response: We greatly appreciate this suggestion and agree with the reviewer that identification of the composition of the export competent Dbp5 containing tRNA complex is a critical next step for understanding the mechanism of Dbp5 mediated tRNA export, which will form the foundation of a future investigation in the laboratory and warrants its own study.

      Reviewer #1 (Public Review):

      Various reports suggest that eukaryotic translation elongation factor 1A (eEF1A) is involved in tRNA export (Bohnsack et al., 2002; Bohnsack MT, Regener K, Schwappach B, Saffrich R, Paraskeva E, Hartmann E, Görlich D. Exp5 exports eEF1A via tRNA from nuclei and synergizes with other transport pathways to confine translation to the cytoplasm. EMBO J. 2002 Nov 15;21(22):6205-15. doi: 10.1093/emboj/cdf613. PMID: 12426392; PMCID: PMC137205; Grosshans et al., 2000; Grosshans H, Hurt E, Simos G. An aminoacylation-dependent nuclear tRNA export pathway in yeast. Genes Dev. 2000 Apr 1;14(7):830-40. PMID: 10766739; PMCID: PMC316491). Mutations in eEF1A have been seen to hinder the nuclear export of all transfer RNAs (tRNAs). eEF1A has been shown to interact with Los1, aiding in tRNA export. The authors could also explore the crosstalk between Dbp5 and eEF1A in this study. Additionally, suppressor screening analysis in dbp5R423A, los1∆dbp5R423A, and los1∆msn5∆dbp5R423A could shed more light on this.

      Author Response: Thank you for this suggestion and for raising an important possible role for Dbp5 in eEF1A mediated tRNA export. Based on more recent investigation of eEF1A function in tRNA export (PMID: 25838545), it is likely that eEF1A functions in re-export of charged tRNAs specifically (likely in conjunction with Msn5). The current manuscript has largely focused on the role of Dbp5 in pre-tRNA export, but a more careful mechanistic characterization of Dbp5 and re-export will be conducted in follow-up studies given the physical interaction between Dbp5 and spliced tRNAs we previously reported. Similarly, suppressor screens with the Dbp5 and los1Δmsn5Δ mutants will likely be a useful tool in identifying additional tRNA export factors and we thank the reviewer for this suggestion.

      Reviewer #1 (Public Review):

      The addition of Gle1 is potentially novel but it's unclear why the authors didn't address the potential involvement of IP6.

      Author Response: The text has been revised to highlight the importance of InsP6 in Gle1 mediated activation of Dbp5. This includes referencing InsP6 throughout the manuscript during discussions of Gle1 activation of Dbp5 and lines 401-404 discussing the potential role for the small molecule in regulating mRNA and tRNA export in different cellular contexts (e.g., stress and disease).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses:

      Reviewer comment: Here, the activity of SWIFT molecules was assessed in single cell types with or without BKlotho expression. Ultimately, the ability of the SWIFT molecules to activate Wnt signaling in a cell type-specific manner should be tested in the context of many different cellular identities that express BKlotho to different extents. It would be good to demonstrate that Wnt activation by SWIFT correlates with BKlotho expression level in multiple cell types - such data would strengthen the claim of cell-type specificity.

      Response: We agree with the reviewer’s comment; it would be interesting to correlate the signaling level with the expression level of βKlotho. The tools to carry out such an experiment are not currently available, as this would require a culture system that allows efficient growth of different cell types, and the reagents to detect both the receptor protein levels of βKlotho (as well as FZD/LRP) and signaling levels. We did perform an additional experiment to further support this targeting approach using a 2-layered (transwell) cell culture system. In this culture system, one cell type is placed in the top well and the other cell type in the bottom well. The molecules to be tested were added to the shared medium, through which they diffuse freely between the two cell types. In this 2-layer cell system, the results again demonstrate the ability of the SWIFT molecules to specifically induce signaling only in βKlotho-expressing hepatoma Huh7 cells and not in non-targeting HEK293 cells. These new data are included as Fig. 3H in the revised manuscript.

      Reviewer comment: The study does not address whether the targeted cells express FGFR1c/2c/3c and whether the FGF21 full-length moiety or the 39F7 IgG moiety of SWIFT molecules could unintentionally activate FGF signaling in these cells.

      Response: We agree with the reviewer’s comment. The receptor βKlotho and its binders (FGF21 and 39F7) were used to test the BRAID/SWIFT concept; the effects on FGF signaling were not the focus of the current study. This comment has now been added to the discussion in the revised manuscript. The inclusion of αGFP controls in the study also suggests that the reporter activity observed in the targeted scenario is unlikely to be caused indirectly by unexpected FGF signaling.

      Reviewer #2 (Public Review):

      Weaknesses:

      Reviewer comment: The study shows the SWIFT approach works in vitro using cell lines, primary human hepatocytes, and human intestinal organoids, but it lacks an in vivo animal model or clinical validation. The applicability of this approach to therapy is still unknown.

      Response: The βKlotho binder, 39F7, is specific to the human receptor and does not cross-react with the mouse receptor. Unfortunately, we are therefore not able to test these SWIFTs in a mouse model.

      Reviewer comment: The success of SWIFT depends on the presence and expression of the bridging receptor (βKlotho) on target cells. The approach may fail if the target receptor is not expressed or available.

      Response: We agree with the reviewer. The SWIFT molecules should not induce signaling in cells where the bridging receptor is not expressed, thereby achieving target cell specificity. As pointed out by the reviewer, finding the right bridging receptor on the target cell is critical.

      Reviewer #1 (Recommendations For The Authors):

      Reviewer comment 1: One way to further validate the specificity of SWIFT molecules would be to apply them to a mix of different cell types and quantify BKlotho level and Wnt reporter activity at the single cell level, potentially through imaging, FACS, or transcriptomics.

      Response: We agree with the reviewer’s comment; it would be interesting to correlate the signaling level with the expression level of βKlotho. The tools to carry out such an experiment are not currently available, as this would require a culture system that allows efficient growth of different cell types, and the reagents to detect both the receptor protein levels of βKlotho (as well as FZD/LRP) and signaling levels. We did perform an additional experiment to further support this targeting approach using a 2-layered (transwell) cell culture system. In this culture system, one cell type is placed in the top well and the other cell type in the bottom well. The molecules to be tested were added to the shared medium, through which they diffuse freely between the two cell types. In this 2-layer cell system, the results again demonstrate the ability of the SWIFT molecules to specifically induce signaling only in βKlotho-expressing hepatoma Huh7 cells and not in non-targeting HEK293 cells. These new data are included as Fig. 3H in the revised manuscript.

      Reviewer comment 2: The experiments presented demonstrate activation of one signaling pathway in cells specifically expressing a target receptor rather than demonstrating "the feasibility of combining different signaling pathways" as stated in the abstract.

      Response: We thank the reviewer for pointing this out and have adjusted the sentence accordingly.

      Reviewer comment 3: What are the biological consequences of activating Wnt signaling in cells expressing BKlotho and why is that of interest? Could these biological outcomes be used as an additional, perhaps more consequential, readout for SWIFT activity?

      Response: βKlotho is expressed on several different cell types, including hepatocytes, WAT, BAT, and certain regions of the CNS. Our studies here focused on the WNT signaling pathway, and the βKlotho/FGF21/39F7 receptor-ligand system was used to illustrate the BRAID/SWIFT cell targeting concept. Whether these molecules may additionally modulate endocrine FGF signaling and metabolic homeostasis, and whether there is any interaction between the βKlotho and Wnt signaling pathways, could be the subject of future studies. This is now added to the revised manuscript.

      Reviewer comment 4: The manuscript would benefit from a careful review to improve wording and address grammatical errors.

      Response: We thank the reviewer for this suggestion, and we have now had another round of language editing by a professional service.

      Reviewer #2 (Recommendations For The Authors):

      Reviewer comment 1. The expression of KLB in Fig 3G and 4B seems way too low and may not represent the amount on the cell surface. Did the authors validate the expression on the cell surface?

      Response: In both figures we have displayed the expression level normalized to housekeeping gene ACTB. Housekeeping genes such as ACTB can be among the most abundant transcripts in a cell. The observation that KLB mRNA detection is below ACTB mRNA levels is expected and we would argue not too low. The average real-time PCR cycle threshold (Ct) for KLB in Huh7 and primary hepatocytes was 18 and 24 respectively. To avoid any confusion, we have now displayed the expression data normalized to HEK293 and intestinal organoids as a fold difference in a new Figure 3G and 4B.
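      The two normalizations discussed in this response can be sketched with the common 2^-ΔΔCt calculation, which assumes roughly 100% amplification efficiency. The response does not state the exact calculation used, so this is an illustration only. The KLB Ct of 18 for Huh7 comes from the response above, while the ACTB Ct values and the HEK293 KLB Ct are hypothetical placeholders.

```python
def relative_to_reference(ct_target, ct_reference):
    """Target expression relative to a housekeeping gene: 2^-(dCt)."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_difference(ct_target_sample, ct_ref_sample, ct_target_calib, ct_ref_calib):
    """Fold difference of a sample versus a calibrator sample: 2^-(ddCt)."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calib - ct_ref_calib
    return 2.0 ** -(delta_sample - delta_calibrator)


# KLB Ct in Huh7 is from the response; all other Ct values are hypothetical.
huh7_klb_ct, huh7_actb_ct = 18.0, 15.0
hek293_klb_ct, hek293_actb_ct = 30.0, 15.0

print(relative_to_reference(huh7_klb_ct, huh7_actb_ct))  # prints 0.125
print(fold_difference(huh7_klb_ct, huh7_actb_ct, hek293_klb_ct, hek293_actb_ct))  # prints 4096.0
```

      The example shows why the same measurement can look small when expressed relative to an abundant transcript such as ACTB, yet large when expressed as a fold difference over a low-expressing calibrator.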

      Reviewer comment 2. Fig 3G needs statistical significance.

      Response: We thank the reviewer for highlighting this; we have now included the statistical analysis in an updated Figure 3G.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript there is not much comparison between the crystal and cryoEM structures provided, and on inspection they appear to be very similar. The crystal structures also reveal parts of the CC domains in Las1, which is not present in the cryoEM structures. It is interesting the CC domains in Sc and Cj are quite different as illustrated in Figure 4B. They also seem to be somewhat disconnected from the rest of the complex (more so for Cj), even though that's not apparent in Figures 2-4. Despite this, it would be very useful to show the cryoEM densities when describing the catalytic site and C-terminal domain interactions, for example, as this can be very useful to increase confidence in the model and proposed mechanisms.

      We thank the reviewer for this suggestion. We have added a figure (Figure 5- Figure supplement 3) to show cryo-EM and crystal densities of key amino acids, when describing the catalytic site and C-terminal domain interactions. In analyzing the interaction between Las1 and Grc3, we have also provided additional comparisons of the crystal structure and the cryo-EM structure (Figure 5, Figure 5-figure supplement 1, 2 and 3, Figure 6, Figure 6-figure supplement 1).

      The description of the complex as a butterfly is engaging, and from a certain angle it can be made to look as such; this was also described previously in (Pillon et al., 2019, NSMB) for the same complex from a different organism (Ct). However, it is a bit misleading, because the complex is actually C2 symmetric. Under this symmetry, the 'body' would consist of two 'heads' one pointing up, one down facing towards the back, and one wing would have its back toward the viewer, the other the front. The structures presented here in Sc and Cj seem quite similar to the previous structure of the same complex in Ct, though the latter was only solved with cryoEM, and was also lacking the structure of the CC domain in Las1.

      We thank the reviewer for pointing out this issue. We have rewritten these sentences and changed the butterfly description of the Las1-Grc3 complex in the revised manuscript.

      For the model suggested in Figure 8, perhaps in the 'weak activity' state, the LCT in Las1 could still be connected to Grc3, via the LCT, rather than disconnected as shown. This could facilitate faster assembly of the 'high activity' state. The complex is described as 'compact and stable', but from the structure and this image, it appears more dynamic, which would serve its purpose and the illustrated model better. The two copies of HEPN appear to have more connective area, meaning they are indeed more likely to remain assembled in the 'weak activity' state. On the other hand, HEPN in one protein appears to have less binding surface with PNK in Grc3, and even less so with the CTD (both PNK and CTD being from the other associated protein), meaning these bindings could release easily to form the 'weak activity' state.

      There is also the potential to speculate that the GCT is bound to HEPN near the catalytic area in the 'weak activity' state. The reduced activity when the GCT residues are replaced by Alanine could then be explained by the complex not being able to assemble as quickly upon binding of the substrate, as it could if the GCT remained bound, rather than by a conformational change that it induces upon binding. The conformational change is also likely to be influenced by the combined binding of PNK and CTD in the assembled state, which also contact HEPN, rather than by GCT alone.

      We thank the reviewer for this suggestion. We have revised our model in the new Figure 8 of our revised manuscript. We apologize for the unclear description of the 'weak activity' state in our model. In fact, we believe that Las1 is in a "weak activity" state before binding to Grc3 and is in a "high activity" state when it forms a complex with Grc3. We strongly agree that the Las1-Grc3 complex is more dynamic than compact and stable, so it can easily change its active state. We have changed our description and revised our model in the revised manuscript.

      When comparing the structure of the HEPN domain in the lone Las1 protein to the structure of Las1-HEPN in the Las1-Grc3 complex, it is mentioned that 'large conformational changes are observed'. These could be described a bit better. The conformational change is ~3-4Å C-alpha RMSD across all ~150 residues in the domain (~90 residues forming a stable core that only changes by ~1Å). There is also a shift in the associated HEPN domain in Las1B domain compared to the bound HEPN in the Las1-Grc3 complex, as shown in Figure 7D: ~1Å shift and ~12 degrees rotation. This does point to the conformation of HEPN changing upon complex formation, as do the relative positions of the HEPN domains in Las1A and Las1B. The conformational change and relative shift could indeed be key for the catalysis of the substrate as mentioned.
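      For readers less familiar with the C-alpha RMSD values quoted above, the following minimal sketch shows how such a number is computed from two equally long lists of C-alpha coordinates. It assumes the two structures have already been superposed and that residues are paired one to one. The coordinates are toy values, not taken from the deposited structures.

```python
from math import sqrt


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired C-alpha coordinates (in Å).

    Assumes both structures are already superposed and residue-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared_sum / len(coords_a))


# Toy coordinates: every atom in the second set is shifted by 1 Å along x.
free_state = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound_state = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
print(ca_rmsd(free_state, bound_state))  # prints 1.0
```

      Reporting the RMSD separately for the whole domain and for the stable core, as the reviewer does, helps distinguish a rigid-body shift from a change in the fold itself.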

      We thank the reviewer for this great suggestion. We have replaced the sentence describing the conformational changes in our revised manuscript.

      Overall, the structures presented should be very useful in further study of this system, even though the exact dynamics and how the substrate is bound are aspects that are perhaps not fully clear yet. The addition of the structures of the CC domain in two different organisms and the Las1 HEPN domain (not in complex with Grc3) as new structural information should allow for increasing our understanding of the overall complex and its mechanism.

      We thank this reviewer for these encouraging comments, which helped us greatly improve our manuscript.

      Reviewer #2 (Public Review):

      In this manuscript, Chen et al. determined the structural basis for pre-RNA processing by Las1-Grc3 endoribonuclease and polynucleotide kinase complexes from S. cerevisiae (Sc) and C. jadinii (Cj). Using a robust set of biochemical assays, the authors identify that the Sc- and CjLas1-Grc3 complexes can cleave the ITS2 sequence in two specific locations, including a novel C2' location. The authors then determined X-ray crystallography and cryo-EM structures of the ScLas1-Grc3 and CjLas1-Grc3 complexes, providing structural insight that is complementary to previously reported Las1-Grc3 structures from C. thermophilum (Pillon et al., 2019, NSMB). The authors further explore the importance of multiple Las1 and Grc3 domains and interaction interfaces for RNA binding, RNA cleavage activity, and Las1-Grc3 complex formation. Finally, evidence is presented that suggests Las1 undergoes a conformational change upon Grc3 binding that stabilizes the Las1 HEPN active site, providing a possible rationale for the stimulation of Las1 cleavage by Grc3.

      Several of the conclusions in this manuscript are supported by the data provided, particularly the identification and validation of the second cleavage site in the ITS2. However, several aspects of the structural analysis and complementary biochemical assays would need to be addressed to fully support the conclusions drawn by the authors.

      We thank the reviewer for the positive comments.

      • There is a lack of clarity regarding the number of replicates performed for the biochemical experiments throughout the manuscript. This information is critical for establishing the rigor of these biochemical experiments.

      We apologize for not providing the detailed information on the number of replicates of biochemical experiments. All the biochemical experiments were repeated three times. We have provided this information in the figure legends.

      • The authors conclude that Rat1-Rai1 can degrade the phosphorylated P1 and P2 products of ITS2 (lines 160-162, Figure 1H). However, the data in Fig. 1H shows complete degradation of 5'Phos-P2 and 5'Phos-P4 of ITS2, while the P1 and 5'Phos-P3 fragments remain intact. Additional clarification for this discrepancy should be provided.

      We thank the reviewer for pointing out this issue. “phosphorylated P1 and P2 products” should be “phosphorylated P2 and P4 products”. We have corrected this clerical error. In addition, we have provided an explanation for why the phosphorylated P3 product shows only partial degradation. We suspect that the P3 product may be too short to be degraded completely.

      • The authors determined X-ray crystal structures of the ScLas1-Grc3 (PDB:7Y18) and CjLas1-Grc3 (PDB:7Y17) complexes, which represents the bulk of the manuscript. However, there are major concerns with the structural models for ScLas1-Grc3 (PDB:7Y18) and CjLas1-Grc3 (PDB:7Y17). These structures have extremely high clashscores (>100) as well as a significant number of RSRZ outliers, sidechain rotamer outliers, bond angle outliers, and bond length outliers. Moreover, both structures have extensive regions that have been modeled without corresponding electron density, and other regions where the model clearly does not fit the experimental density. These concerns make it difficult to determine whether the structural data fully support several of the conclusions in the manuscript. A more careful and thorough reevaluation of the models is important for providing confidence in these structural conclusions.

      We thank the reviewer for pointing out this issue. We have used the cryo-EM datasets to further validate the conclusions of the manuscript. We analyzed the active site of the Las1-Grc3 complex and the interactions between Las1 and Grc3 using the cryo-EM structures and presented new figures (Figure 5-figure supplement 1, Figure 5-figure supplement 2, Figure 5-figure supplement 3, Figure 6-figure supplement 1) in our revised manuscript. Both the refinement and validation statistics of the cryo-EM datasets are within a reasonable range (Table 2), which provides confidence in our structural conclusions. The X-ray crystal structures of the ScLas1-Grc3 (PDB:7Y18) and CjLas1-Grc3 (PDB:7Y17) complexes have high clashscores and many outliers, which is mainly due to the great flexibility of the Las1-Grc3 complex, especially the CC domain of Las1. We have improved our crystal structure models, which now have better refinement and validation statistics. The clashscores of the ScLas1-Grc3 and CjLas1-Grc3 complexes are 25 and 45, respectively. There are no rotamer outliers or C-beta outliers to report for either complex.

      • The presentation of the cryo-EM datasets is underdeveloped in the results section, and the contribution of these structures towards supporting the main conclusions of the manuscript is unclear. An in-depth comparison of the structures generated from X-ray crystallography and cryo-EM would have greatly strengthened the structural conclusions made for the ScLas1-Grc3 and CjLas1-Grc3 complexes.

      We thank the reviewer for this suggestion. We have performed structural comparisons between the X-ray crystal structures and the cryo-EM structures when analyzing the active site of the Las1-Grc3 complex and the interactions between Las1 and Grc3 (Figure 5-figure supplement 1, Figure 5-figure supplement 2, Figure 6-figure supplement 1). We have also added a figure (Figure 5-figure supplement 3) to show the cryo-EM and crystal densities of the Las1 active site as well as the key amino acids for Las1 and Grc3 interactions. These comparisons and densities have greatly strengthened our structural conclusions.

      • The authors conclude that truncation of the CC-domain contributes to Las1 ITS2 binding and cleavage (lines 220-222, Fig. 4C). However, these assays show that internal deletion of the CC-domain alone has minimal effect on cleavage (Fig 4C, sample 3). The loss in ITS2 cleavage activity is only seen when truncating the LCT and LCT+CC-domain (Fig 4C, sample 2 and 4, respectively). Consistently, the authors later show that Las1 is unable to interact with Grc3 when the LCT domain is deleted (Fig. 6 and Fig. 6-figure supplement 2). These data indicate the LCT plays a critical role in Las1-Grc3 complex formation and subsequent Las1 cleavage activity. However, it is unclear how this data supports the stated conclusion that the CC-domain is important for Las1 cleavage.

      Our EMSA data shows that the CC domain contributes to the binding of ITS2 RNA (Figure 4D), suggesting that the CC domain may help stabilize ITS2 RNA during the Las1 cleavage reaction. The in vitro RNA cleavage assays (Figure 4C) indicate that the LCT is important for Las1 cleavage because it plays a critical role in the formation of the Las1-Grc3 complex. Compared with the LCT, the CC domain is not particularly important for Las1 cleavage of ITS2, but it still has some influence (Fig 4C, compare samples 1 and 3, and samples 2 and 4). Therefore, we conclude that the CC domain may mainly play a role in the stabilization of ITS2 RNA, thereby enhancing ITS2 RNA cleavage.

      • The authors conclude that the HEPN domains undergo a conformational change upon Grc3 binding, which is important for stabilization of the Las1 active site and Grc3-mediated activation of Las1. This conclusion is based on structural comparison of the HEPN domains from the CjLas1-Grc3 complex (PDB:7Y17) and the structure of the isolated HEPN domain dimer (PDB:7Y16). However, it is also possible that the conformational changes observed in the HEPN domain are due to truncation of the Las1 CC and LCT domains. A rationale for excluding this possibility would have strengthened this section of the manuscript.

      We thank the reviewer for pointing out this issue. We agree that complete structural information for Las1 would help illuminate the conformational activation of Las1 by Grc3. We screened about 1200 crystallization conditions with full-length Las1 proteins, but ultimately did not obtain any crystals, probably due to flexibility. The CC domain exhibits a certain degree of flexibility and has not been observed in the structure obtained by electron microscopy. The LCT is involved in binding to the CTD domain of Grc3. Coordination of the active center of the HEPN domains by the LCT and CC domains is unlikely, given the limited nuclease activity observed in full-length Las1. The conformational changes of the active center are essential for HEPN nuclease activation. Our structure shows that the GCTs of Grc3 interact with the active residues of the Las1 HEPN domains, which probably induces conformational changes in the active center of the HEPN domain to activate Las1. Of course, we cannot exclude the possibility that truncation of the Las1 CC and LCT domains causes a small conformational change in the HEPN domains. We have explained this possibility in our revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      1) It would be very useful to show the cryoEM densities when describing the catalytic site and C-terminal domain interactions.

      The new Figure 5-figure supplement 2 shows the cryo-EM densities of the catalytic site of ScLas1 and the C-terminal domain of ScGrc3.

      2) "ScLas1 cleaves the 33-nt ITS2 at C2 site to theoretically generate a 10-nt 5′-terminal product and a 23-nt 3′-terminal product (Figure 1A). Our merger data shows that the final 5′-terminal and 3′-terminal product bands are at nearly the same horizontal position on the gel (Figure 1B), indicating that they are similar in size." These two sentences seem to contradict, i.e. 10-nt and 23-nt are similar in size even though they are different lengths?

      We apologize for the contradiction between the two sentences mentioned above. We have rewritten these two sentences in the revised manuscript.

      3) "We observed four cleavage bands of approximately 23-nt (P2), 14-nt (P3), 10-nt (P1), and 9-nt (P4) in length (Figure 1C)."

      Figure 1C. The bands show 23 nt, 22 nt, 21 nt, 14 nt, 13 nt, and 11 nt, so this text does not seem to describe the figure.

      We have rewritten this sentence in the revised manuscript.

      4) "We obtained similar cleavage results with a longer 81-nt ITS2 RNA substrate 6 (Figure 1D, E). " Figure 1D,E. The lengths in Figure 1E do not correspond to all bands in Figure 1E, e.g. the 13 nt band, though the others do, e.g. 14 nt, 30nt, 37nt, etc.

      To better evaluate the size of the cleavage products, we used an RNA marker for comparison. The RNA marker has more bands than the cleavage products. To further confirm the C2′ cleavage site, we also mapped the cleavage sites of the 81-nt ITS2 using reverse transcription coupled with sequencing (Figure 1F).

      5) In Figure 3, domains are colored differently but it's hard to know which are different proteins.

      We have added a diagram in Figure 3 to show the Las1-Grc3 complex structure, and it is now clear how Las1 and Grc3 are assembled into a tetramer.

      6) Line 267. "we screened a lot of crystallization conditions with full-length Las1 proteins" How many? Rough numbers ok, but 'a lot' is not very informative

      We have provided the approximate numbers of crystallization conditions in our revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      1) The authors missed an excellent opportunity to compare and contrast the ScLas1-Grc3 and CjLas1-Grc3 complex structures presented here with that of the previously determined CtLas1-Grc3 structure (Pillon et al., 2019, NSMB). For example, His130 in the ScLas1-Grc3 complex active site adopts a similar conformation to His142 in the CtLas1-Grc3 complex active site (Pillon et al., 2019, NSMB). Interestingly, the analogous His134 active site residue in the CjLas1-Grc3 adopts an alternative (maybe inactive) conformation. This observation could provide a structural rationale for the activation of ScLas1 and CtLas1 by Grc3, while also providing a rationale for the fairly weak activation of CjLas1 by CjGrc3.

      We thank the reviewer for this suggestion. We have performed structural comparisons between the ScLas1-Grc3, CjLas1-Grc3 and CtLas1-Grc3 complexes, especially of the Las1 nuclease active center. We added two figures (Figure 7-figure supplement 3A and 3B) in the revised manuscript to contrast and highlight the conformational differences of the active-site residues between ScLas1-Grc3, CtLas1-Grc3 and CjLas1-Grc3. These structural comparisons provide stronger evidence that further reinforces the conclusions of our manuscript.

      2) Can the authors speculate as to whether the structural data can provide any insight into how the Las1-Grc3 may cleave both C2 and C2' positions in the ITS2 RNA? This commentary would further strengthen the discussion section of the manuscript.

      We thank the reviewer for this suggestion. We have provided a speculation in the discussion section of the revised manuscript.

      We think that the structural data may provide some insight into how Las1-Grc3 complex cleaves ITS2 RNA at both C2 and C2' positions. The Las1-Grc3 tetramer complex has one nuclease active center and two kinase active centers. The nuclease active center consists of two Las1 molecules in a symmetric manner, while the kinase active center consists of only one Grc3 molecule. The ITS2 RNA is predicted to form a stem-loop structure. The symmetrical nuclease active center recognizes the stem region of ITS2 RNA and makes it easy to perform C2 and C2' cleavages on both sides of the stem. C2 and C2' cleavage products are further phosphorylated by two Grc3 kinase active centers, respectively.

      3) The method used for the plasmid generation, expression, and purification of the Las1 truncations and the Las1 and Grc3 point mutants should be provided in the methods section.

      The method used for the plasmid generation, expression, and purification of the Las1 truncations and the Las1 and Grc3 point mutants has been provided in the methods section.

      4) The exact amino acid cutoffs for the truncated forms of Las1 used for the biochemical assays in Fig. 4 should be provided.

      We have provided the exact amino acid cutoffs for the truncated forms of Las1 in the figure legend of Figure 4C.

      5) The models associated with the cryo-EM datasets should be deposited in the PDB.

      The models associated with the cryo-EM datasets have been deposited in the PDB with the following accession codes: 8J5Y (ScLas1-Grc3 complex) and 8J60 (CjLas1-Grc3 complex).

      6) Lines 232-234: Arg129 should be changed to His134.

      We have corrected it.

      7) Figure 5B: the bottom half of the HEPN active site has been labeled incorrectly. The labels should be Arg129, His130, and His134 (from left to right).

      We have corrected it.

      8) Line 252: "multitudinous" should be changed to "multiple."

      We have corrected it.

    1. Author Response

      We are grateful to the reviewers for their thorough and thoughtful critiques, including their agreement on the significant value of this dataset. We intend to respond to their comments in full with a revision in the near future. However, we would like to make an initial comment at this stage. A key concern raised by the reviewers was that the analyses described do not adequately support the claim that "movie-watching data can identify retinotopic regions" (quoted from R2, similar sentiment expressed by R1). To be clear, we agree with this assessment. Our primary aim was not to identify visual areas with movie-watching data. Rather, our focus was on how movies can reveal fine-grained organization in infant visual cortex, which would support their potential utility for understanding the development of dynamic visual processing. To demonstrate this potential, we tested and found that maps of visual activity generated from movies are significantly similar to those generated by a retinotopy task. Nevertheless, we did not intend to argue that movie-based maps are sufficiently accurate to replace task-based retinotopic maps when defining visual areas, nor did we test this possibility. We accept that this point was unclear in the original manuscript and will make edits to avoid this miscommunication. We also plan to incorporate the reviewers’ many other helpful recommendations, including addressing concerns about the clarity of the presentation and double dipping, as well as adding several new analyses we hope will provide greater confidence in the findings and interpretation.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for their positive and useful comments. Below please find our answers to the issues raised.

      Reviewer #1 (Public Review):

      Overall, the experiments are well-designed and the results of the study are exciting. We have one major concern, as well as a few minor comments that are detailed in the following.

      Major:

      1) The authors suggest that "Visuomotor experience induces functional and structural plasticity of chandelier cells". One puzzling thing here, however, is that mice constantly experience visuomotor coupling throughout life which is not different from experience in the virtual tunnel. Why do the authors think that the coupled experience in the VR induces stronger experience-dependent changes than the coupled experience in the home cage? Could this be a time-dependent effect (e.g. arousal levels could systematically decrease with the number of head-fixed VR sessions)? The control experiment here would be to have a group of mice that experience similar visual flow without coupling between movement and visual flow feedback.

      Either change would be experience-dependent of course, but having the "visuomotor experience dependent" in the title might be a bit strong given the lack of control for that. We would suggest changing the pitch of the manuscript to one of the conclusions the authors can make cleanly (e.g. Figure 4).

      Although the plasticity is induced by the visuomotor experience in the tunnel, we agree that we do not know what aspect of the repeated exposure to the virtual tunnel caused the plasticity. We cannot rule out that it was the exposure to the visual stimuli alone that caused it. Therefore, we rephrased sentences that suggested that it was the coupling between visual stimuli and motor behavior that was responsible for the plasticity. We also changed the title to “Experience Shapes Chandelier Cell Function and Structure in the Visual Cortex”.

      We do believe that training the mice in the virtual tunnel does significantly increase experience with coupled visuomotor activity, though. In their home cage, mice are mostly active in the dark and there is little space to run.

      Minor:

      2) "ChCs shape the communication hierarchy of cortical networks providing visual and contextual information." We are not sure what this means.

      We thank the reviewer for helping to improve clarity. We rephrased this sentence to: “…ChCs may establish a hierarchical relationship among cortical networks.”

      3) "respond to locomotion and visuomotor mismatch, indicating arousal-related activity" This is not clear. We think we understand what the authors mean but would suggest rephrasing.

      Agreed. We rephrased this sentence to: "...respond to events that are known to increase arousal levels, such as locomotion and visuomotor mismatch.”

      4) "based on morphological properties revealed that 87% (287/329) of labeled neurons were ChCs" Please specify the morphological properties used for the classification somewhere in the methods.

      We added that the neurons were positioned at the border of L1 and L2 and had a dendrite reaching into layer 1.

      5) We may have missed this - in the patch clamp experiment (Fig.1 H-K), please add information about how many mice/slices these experiments were performed in.

      We have added the information to the legend of Fig. 1.

      6) "These findings suggest that the rabies-labeled L1-4 neurons providing monosynaptic input to ChCs are predominantly inhibitory neurons". We are not sure this conclusion is warranted given the sparse set of neurons labelled and the low number of cells recorded in the paired patch experiment. We would suggest properly testing (e.g. stain for GABA on the rabies data) or rephrasing.

      We weakened the statement to: “These findings suggest that the rabies-labeled L1-4 neurons providing monosynaptic input to ChCs may include many inhibitory neurons.”

      7) Figure 2E. A direct comparison of dF/F across different cell types can be subject to a problematic interpretation. The transfer function from spikes to calcium can be different from cell type to cell type. Additionally, the two cell populations have been marked with different constructs (despite the fact that it's the same GECI) further reducing the reliability of dF/F comparisons. We would recommend using a different representation here that does not rely on a direct comparison of dF/F responses (e.g. like the "response strength" used in Figure 3B). Assuming calcium dynamics are different in ChCs and PyCs - this similarity in calcium response is likely a coincidence.

      We have removed the quantification in this figure.

      8) If ChCs are more strongly driven by locomotion and arousal, then it's a bit counterintuitive that at the beginning of the visual corridor when locomotion speed consistently increases, the activity of ChCs consistently decreases. This does not appear to be driven by suppression by visual stimuli as it is present also in the first and last 20cm of the tunnel where there are no visual stimuli. How do the authors explain this?

      We do believe that this is suppression driven by visual stimuli. Although on average the strongest visual suppression happens between 20-80 cm, neurons that have their receptive fields toward the center of the visual field will already respond to the stimuli before the mouse reaches the 20 cm location of the tunnel. In addition, although the visual stimuli are the strongest sensory inputs, the background of the visual part of the tunnel has a black and white noise pattern, which might already mildly suppress ChC activity. Both arguments are supported by the observation that the visual PyCs (V-PyCs, blue line) in Fig. 4D are already activated at the beginning of the tunnel and that the activity of V-PyCs matches well with the suppression of ChC activity.

      9) The authors mention that "ChC responses underwent sensory-evoked plasticity during the repeated visual exposure, even though the visual stimuli were different from those encountered during training in the virtual tunnel". How would this work? And would this mean all visual responses are reduced? What is special about the visual experience in the virtual tunnel? It does not inherently differ from visual experience in the home cage, given that the test stimuli (full field gratings) are different from both.

      As mentioned in our answer to point 1, the exposure to visual stimuli is strongly increased since, firstly, they are presented during the dark phase when the mice are most active and, secondly, they do not get these types of visual inputs in their home cage.

      10) Just as a point to consider for future experiments: For the open-loop control experiments, the visual flow is constant (20cm/s) - ideally, this would be a replay of the running speed the mouse previously generated to match statistics.

      We agree with this point and will implement replay of earlier sessions in future experiments.

      11) We would recommend specifying the parameters used for neuropil correction in the methods section.

      This is described on page 24, under “preprocessing”. We also refer to the analysis package (Spectral Segmentation - SpecSeg) in which the neuropil correction as used by us here is explained in more detail.

      12) If we understand correctly, the F0 used for the dF/F calculation is different from that used for division. Why is this?

      We apologize for this mistake, which was based on an older version of the software. We have now corrected this in the revised manuscript.
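
      As background to this exchange, the conventional dF/F normalization uses a single baseline F0 for both the subtraction and the division. The sketch below is illustrative only and is not the authors' pipeline or the SpecSeg implementation: the percentile-based baseline, its default value, and the function name `delta_f_over_f` are assumptions made for the example.

```python
import numpy as np

def delta_f_over_f(trace, baseline_percentile=10.0):
    """Return dF/F = (F - F0) / F0 for a 1-D fluorescence trace.

    Assumption: F0 is estimated as a low percentile of the whole trace.
    The same F0 is used for both subtraction and division.
    """
    f = np.asarray(trace, dtype=float)
    f0 = np.percentile(f, baseline_percentile)
    if f0 == 0:
        raise ValueError("baseline F0 is zero; dF/F is undefined")
    return (f - f0) / f0
```

      Using different baselines for the numerator and the denominator would rescale responses in a way that depends on the mismatch between the two estimates, which is why the point raised by the reviewer matters.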

      13) Authors compare neuronal responses using "baseline-corrected average". Please specify the parameters of the baseline correction (i.e. what is used as baseline here).

      In the revised version we have now better explained this in the methods, page 24, under “Passive Sessions”.

      Reviewer #2 (Public Review):

      Summary:

      Seignette et al. investigated the potential roles of axo-axonic (chandelier) cells (ChCs) in a sensory system, namely visual processing. As introduced by the authors, the axo-axonic cell type has remained (and still is) somehow mysterious in its function. Seignette and colleagues leveraged the development of a transgenic mouse line selective for ChC, and applied a very wide range of techniques: transsynaptic rabies tracing, optogenetic input activation, in vitro electrophysiology, 2-photon recording in vivo, behavior and chemogenetic manipulations, to precisely determine the contribution of ChCs to the primary visual cortex network.

      The main findings are 1) the identification of synaptic inputs to ChC, with a majority of local, deep layer principal neurons (PN), 2) the demonstration that ChC is strongly and synchronously activated by visual stimuli with low specificity in naive animals, 3) the recruitment of ChC by arousal/visuomotor mismatch, 4) the induction of functional and structural plasticity at the ChC-PN module, and, 5) the weak disinhibition of PNs induced by ChCs silencing. All these findings are strongly supported by experimental data and thoroughly compared to available evidence.

      Strengths:

      This article reports an impressive range of very demanding experiments, which were well executed and analyzed, and are presented in a very clear and balanced manner. Moreover, the manuscript is well-written throughout, making it appealing to future readers. It has also been a pleasure to review this article.

      In sum, this is an impressive study and an excellent manuscript, that presents no major flaws.

      Notably, this study is one of the first studies to report on the activities and potential roles of axo-axonic cells in an active, integrated brain process, beyond locomotion as reported and published in V1. This type of research was much awaited in the fields of interneuron and vision research.

      Weaknesses:

      There are no fundamental weaknesses; the latter mainly concern the presentation of the main results. The main weakness may be that the different sections appear somehow disconnected conceptually.

      Additionally, some parts deserve a more in-depth clarification/simplification of concepts and analytic methods for scientists outside the subfield of V1 research. Indeed, this paper will be of key interest to researchers of various backgrounds.

      Reviewer #3 (Public Review):

      Summary:

      The authors set out to characterize the anatomical connectivity profile and the functional responses of chandelier cells (ChCs) in the mouse primary visual cortex. Using retrograde rabies tracing, optogenetics, and in vitro electrophysiology, they found that the primary source of input to ChCs are local layer 5 pyramidal cells, as well as long-range thalamic and cortical connections. ChCs provided input to local layer 2/3 pyramidal neurons, but did not receive reciprocal connections.

      With two-photon calcium imaging recordings during passive viewing of drifting gratings, the authors showed that ChCs exhibit weakly selective visual responses, high correlations within their own population, and strong responses during periods of arousal (assessed by locomotion and pupil size). These results were replicated and extended in experiments with natural images and prediction of receptive field structure using a convolutional neural network.

      Furthermore, the authors employed a learned visuomotor task in a virtual corridor to show that ChCs exhibit strong responses to mismatches between visual flow and locomotion, locomotion-related activation (similar to what was shown above), and visually-evoked suppression. They also showed the existence of two clusters of pyramidal neurons with functionally different responses - a cluster with "classically visual" responses and a cluster with locomotion- and mismatch-driven responses (the latter more correlated with ChCs). Comparing naive and trained mice, the authors found that visual responses of ChCs are suppressed following task learning, accompanied by a shortening of the axon initial segment (AIS) of pyramidal cells and an increase in the proportion of AIS contacted by ChCs. However, additional controls would be required to identify which component(s) of the experimental paradigm led to the functional and anatomical changes observed.

      Finally, using a chemogenetic inactivation of ChCs, the authors propose weak connectivity to pyramidal cells (due to small effects in pyramidal cell activity). However, these results are not unequivocally supported, as the baseline activity of ChCs before inactivation is considerably lower, suggesting a potentially confounding homeostatic plasticity mechanism might already be operating.

      Strengths:

      The authors bring a comprehensive, state-of-the-art methodology to bear, including rabies tracing, in vivo two-photon calcium imaging, in vitro electrophysiology, optogenetics and chemogenetics, and deep neural networks. Their analyses and statistical tests are sound and for the most part, support their claims. Their results are in line with previous findings and extend them to the primary visual cortex.

      Weaknesses:

      • Some of the results (e.g. arousal-related responses) are not entirely surprising given that similar results exist in other cortical areas.

      We agree that previous studies have shown arousal-related responses of ChC cells and our study confirms those findings. However, this is not the main message of the article and we present many findings that are novel.

      • Control analyses regarding locomotion patterns before and after learning the task (Figure 5), and additional control experiments to identify whether functional and anatomical changes following task learning were due to learning, repeated visual exposure, exposure to reward, or visuomotor experience would strengthen the claims made.

      In Figure 5 we excluded running trials, so locomotion patterns are unlikely to play a major role. We agree that identifying the factors that contribute to the observed plasticity is important and should be investigated in future experiments.

      • The strength of the results of the chemogenetics experiment is impacted by the lower baseline activity of ChCs that express the KORD receptor. At present, it is not possible to exclude the presence of homeostatic plasticity in the network before the inactivation takes place.

      Although we do not know why there is a difference in the baseline dF/F (e.g. expression levels), we consider it unlikely that expression of the KORD receptor itself, without exposure to the ligand, causes a reduction of ChC activity. Moreover, we are not sure how homeostatic plasticity in the network would occur selectively in KORD-expressing ChCs. Finally, we do not find evidence for a relationship between lower ChC calcium signals and the effects of ChC silencing on PyC activity. We performed an additional analysis in which we correlated baseline ChC activity (before salvinorin B injection) with the effect of ChC silencing on PyC activity (post – pre) across mice, and found that this correlation was not significant (R = 0.41, p = 0.18).
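
      The kind of across-animal correlation described above can be sketched as follows. This is an illustration, not the authors' analysis: the response does not state which test produced the p-value, so the sketch swaps in a two-sided permutation test, and the function name `pearson_permutation_test` and its parameters are hypothetical. No data from the study are used; the inputs are one value per mouse supplied by the caller.

```python
import numpy as np

def pearson_permutation_test(x, y, n_permutations=10000, seed=0):
    """Pearson r between two per-animal measures, with a two-sided
    permutation p-value obtained by shuffling y relative to x.

    x: e.g. baseline ChC activity per mouse (hypothetical input)
    y: e.g. post-minus-pre change in PyC activity per mouse (hypothetical input)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 3:
        raise ValueError("x and y must be 1-D, equal length, with >= 3 values")
    r_observed = np.corrcoef(x, y)[0, 1]
    rng = np.random.default_rng(seed)
    n_extreme = 0
    for _ in range(n_permutations):
        r_shuffled = np.corrcoef(x, rng.permutation(y))[0, 1]
        # Small tolerance guards against floating-point ties.
        if abs(r_shuffled) >= abs(r_observed) - 1e-12:
            n_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (n_extreme + 1) / (n_permutations + 1)
    return float(r_observed), float(p_value)
```

      With only a handful of animals, a moderate correlation coefficient can easily fail to reach significance, so a non-significant result of this kind limits, rather than excludes, a possible relationship.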

      Reviewer #1 (Recommendations For The Authors):

      In the spirit of openness of the scientific discussion, all our feedback and recommendations to the authors are included in the public reviews.

      Reviewer #2 (Recommendations For The Authors):

      Most of my comments and suggestions concern the presentation of the data, to (hopefully) help and convey as clearly as possible the messages of this important article.

      Main

      The main weakness of the paper may be that the different sections appear somehow disconnected conceptually. This is particularly true for:

      -structural plasticity: how can we link this finding with the rest of the study? Are there ways to correlate this finding with physiological recordings in individual animals, or to directly test whether particular functional types of PNs (visual, non-visual) undergo plasticity at their AIS?

      This is a very interesting question that may be addressed in future experiments.

      -the indirect finding suggesting that ChC weakly inhibits PNs using chemogenetic silencing of PNs. Do chemogenetic manipulations of ChCs affect PN responses in visual paradigm and/or modify the induction of structural plasticity at the ChC-AIS connection?

      This is also a very interesting question for future work.

      Additionally, some parts would deserve a more in-depth clarification/simplification of concepts and analytic methods (OSI, DSI, MEI...) for scientists outside the subfield of V1 research. Indeed, this paper will be of key interest to researchers of various backgrounds.

      In the revised manuscript we briefly explain what an MEI is when first introduced, and introduce the abbreviations OSI and DSI at the correct location. We believe orientation and direction selectivity are well-known concepts for the audience reading this article.

      Minor

      These are discussed by order of appearance in the text.

      Abstract

      The alternative interpretation of error/mismatch negativity to explain ChC activation deserves to appear in the abstract. Arousal consistency in prediction should be in the introduction. "In mice running in a virtual tunnel, ChCs respond strongly to locomotion and halting visual flow, suggesting arousal-related activity."

      This comment holds for the end of the introduction and the beginning of the discussion, as well.

      "These findings suggest that ChCs provide an arousal-related signal to layer 2/3 pyramidal cells that may modulate their activity". This statement appears to be in contradiction with the weak effect mentioned just before. This comment holds for the end of the introduction.

      The full sentence was: “These findings suggest that ChCs provide an arousal-related signal to layer 2/3 pyramidal cells that may modulate their activity and/or gate plasticity of L2/3 PyCs in V1.” Our results show that activity of layer 2/3 pyramidal cells is modulated (albeit weakly) and it is quite possible that ChCs regulate plasticity at the AIS. Therefore, we do not believe that this statement contradicts the weak direct effect of ChCs on layer 2/3 pyramidal cell activity.

      We changed the last sentence of the introduction to “Our findings suggest that ChCs predominantly respond to arousal related to locomotion or unexpected events/stimuli, and act to weakly modulate activity and/or gate plasticity of L2/3 PyCs in V1.”

      Introduction

      First paragraph

      Coming from a field outside of vision research, it is not obvious to me what has been learned from interneuron classes in the past. An example would be welcome in the introduction.

      The literature on the role of different interneuron types in visual processing and plasticity is too large to pick one or two examples. For the sake of conciseness, we have therefore provided some important references and reviews for the interested readers (references 1 to 10).

      Interneuron "subtypes" seem to refer to main classes (e.g. PV+): please rephrase accordingly (ChC being a type and PV+ ChC a subtype).

      We changed interneuron “subtypes” to “types” and left L2/3 pyramidal cell “subtypes” unchanged.

      Second paragraph

      Beyond the reversal potential of GABA-ARs at the axon initial segment, GABA may inhibit action potential generation in various conditions (Lipkin et al. 2023, DOI: 10.1523/JNEUROSCI.0605-23.2023 : should be cited).

      We added this citation.

      Fourth paragraph

      "ChCs alter the number of synapses at the AIS based on the activity of their postsynaptic targets": the concept of alteration is too vague to let the reader grasp the concept: could the authors rephrase?

      We have rephrased the sentence to:

      “…ChCs increase the number of synapses at the AIS if their postsynaptic targets are chemogenetically activated…”

      Results

      1) ChCs receive input from long-range sources and L5 PyCs in V1

      It is not clear how morphological identification of ChC was performed. Did dendrites and/or axons of starter cells occasionally overlap as can be expected, complicating the cell-by-cell morphological classification?

      "Most labeled neurons were located on the border between L1 and L2/3 and displayed typical ChC morphology": maybe clarify that this concerns neurons expressing eYFP-TVA?

      We assessed the location (at the border of L1 and L2) and spatial distribution of the labeled cells and whether they had a dendrite extending upwards into L1. We have now indicated this in the results section and clarified that these neurons express eYFP-TVA.

      -Likewise the following would benefit from clarification " This is further supported by the distributed localization of the labeled neurons": it would also help here to remind the reader of the labelling (presumably retrogradely-labeled mCherrry+ neurons).

      We have now clarified in the text that these are mCherry+ neurons labeled by the rabies virus.

      2) Chandelier cells are modulated by arousal and show high correlations

      -The authors indicate that the results "(suggest) that ChCs distribute a synchronized signal during high arousal." : it would be stronger to defend this claim by showing a higher ChC-ChC correlation during "arousal" vs. baseline (i.e. analyze high arousal epochs outside of movement). It may be difficult to perform this analysis due to low fluorescence changes outside running episodes, but this should be discussed accordingly. In this respect, the title of the section is more in line with the data presented.

      We believe our statement is correct. The activity of ChCs is highly synchronized and their firing rates increase during arousal. We do not state that synchronization increases with arousal.

      -A brief explanation of DSI and OSI meaning would be nice for the audience that will definitely extend beyond vision research given the importance of this study.

      See above

      3) ChCs are weakly selective to visual information

      -I may very well miss the point, but the equivalence in response strength among cell classes (Fig3B) seems inconsistent with the wider distribution of high response strength in ChCs (Fig3C). Perhaps a graphical representation taking into account the distribution of single data points in Fig3B would help resolve this discrepancy.

      This is because in panel C the response strengths are normalized. We now also state this in the legend to avoid confusion.

      -"clearly oriented edge-like patterns with sharp ON and OFF regions": it would help if a representative example was highlighted in Figure 3F.

      The majority of L2/3 pyramidal MEIs presented in this panel show this pattern.

      -It is interesting and surprising that properties of ChCs appear more distinct from those of L5 PNs than from those of L2-3 PNs (Fig 3G-J), given the fact that V1 ChCs were found by the authors to derive their inputs from V1 L5 PNs (please see comments of the discussion for this specific point).

      How ChCs respond based on L5 input depends strongly on how the connections between L5 and ChCs are organized. Similarity between responses of L5 and ChC neurons is not required.

      4) Locomotion and visuomotor mismatch drive chandelier cell activity in a virtual tunnel

      This is the least convincing part in terms of presentation.

      -It is unclear where/when visuomotor mismatch has been induced in the tunnel: please clarify in the text and in Fig 4B.

      We realized that the title of the paragraphs was indeed confusing. In fig. 4A-D and the first paragraph about the virtual tunnel, we do not discuss the visuomotor mismatch. This comes later, when we describe the results in Fig. 4E. The titles have been changed.

      -No result on visuomotor mismatch is reported in the text of this section, while this is presented in the subsequent section: this needs to be corrected (merge this section with the next?).

      We agree, apologies for the confusion. See above.

      -It would be interesting to further analyze responses to CS and US. Regarding the US: is water rewarding in non-water-restricted mice? This should be mentioned.

      We realized that we did not mention that the mice were water restricted during behavioral training and during the imaging sessions when mice performed the virtual tunnel task. We have now added this to the methods section. Sorry for the omission.

      -Along this line: was water sometimes omitted? This would provide a complementary way to test the prediction error theory for ChC activation with an alternative modality.

      We never omitted the water reward. It would be interesting to test this in a future experiment.

      5) ChCs have similar response properties as non-visual PyCs

      • It would help to explicitly mention that in Ai65 mice, only Cre and Flp+ cells express tdTomato (here Vipr2 and PV+).

      We added the following sentence: “In these mice, tdTomato was only expressed in cells expressing both Vipr2 and PV.”

      6) Visuomotor experience in the virtual tunnel induces plasticity of ChC-AIS connectivity

      • In relation to the previous section, Jung et al. (doi.org/10.1038/s41593-023-01380-x) recently reported that motor learning reduced ChC-ChC synchrony in M2. Did the author observe a similar change in ChC-ChC synchrony with visual experience/habituation to the task? If available, these data should be reported to help build a clearer picture of ChC functions in the neocortex.

      We tested this and also found reduced correlations between ChCs in trained mice vs naïve mice. We added this as text on p14 in the results section.

      • The low number of ChC boutons' appositions per AIS may be misleading: "While the average number of ChC boutons per AIS remained constant (~2-3 ChC boutons/AIS)"). It would be helpful to make it clear that these are "virally" labelled boutons, as opposed to absolute numbers, if compared with the detailed quantification of Schneider-Mizell et al, 2021 (7.4 boutons per AIS in average; doi: 10.7554/eLife.73783.).

      We added "virally labeled"

      • It may be difficult to clearly isolate boutons in light microscopic images of ChC boutons. could the authors comment on this and explain how they solved this issue (in the methods section for instance)?

      We elaborated on our definition of a bouton under confocal microscopy conditions. We also added that the analysis was performed under blinded conditions for the experimenter (i.e. the experimenter did not know whether the images came from trained or untrained mice).

      • Is there any suggestion for heterogeneity/selectivity for a subset of PNs (the distribution does not seem to show this, though)? It would be interesting to discuss this and try to link this finding to the rest of the study a bit more directly. Future work could also investigate if genetically defined PN types undergo different pre-synaptic plasticity at their AISs (e.g. work cited by the authors by O'Toole et al, 2023 doi: 10.1016/j.neuron.2023.08.015 -this reference can be updated as well, since the work has been published in the meantime).

      In our data, we did not find evidence for heterogeneity or selectivity of targeting, also not in the physiology using KORD (see below). We do agree that it is an interesting question and deserves attention in future experiments. We also updated the reference.

      7) ChCs weakly inhibit PyC activity independent of locomotion speed

      The authors state that "recent work in adult mice has reported hyperpolarizing and shunting effects in prelimbic cortex, S1 and hippocampus (18, 26, 27)": however, to my knowledge studies presented in refs 26 & 27 found reduced activity/firing of PNs upon optogenetic activation of ChCs in vivo, but did not perform intracellular recordings to assess GABA-A reversal potential at the AIS. I would like to kindly ask the authors to correct this sentence.

      If the polarity of responses is discussed, they may rather refer to the corresponding literature including Rinetti Vargas et al (doi: 10.1016/j.celrep.2017.06.030), Lipkin et al (doi: 10.1523/JNEUROSCI.0605-23.2023), and Khirug et al (doi: 10.1523/JNEUROSCI.0908-08.2008.).

      We added the reference to Lipkin et al and changed the sentence so that it matches the references.

      • In an attempt to link findings from several parts of the article, did the authors investigate whether chemogenetic effects were different in visual vs non-visual PNs? As ChCs are functionally related to visual PNs, one might indeed speculate that these cells are synaptically connected.

      We did not find evidence for selectivity in the chemogenetic effect. We compared the chemogenetic effect to locomotion modulation (see text accompanying Fig 7.) – based on our observation that non-visual PyCs were more strongly modulated by locomotion (see Fig. 4) – but did not find any significant correlation.

      • " We first looked at the average activity of neurons in both essions.": sessions

      Thank you for noticing. We corrected this.

      Discussion

      Summary of findings

      -It would be worthwhile to include in the summary the finding of mismatch-related activity, that appears to explain more convincingly ChC activation than arousal per se (with the data available).

      We updated the summary of the discussion accordingly.

      -Moreover, the last part of the article (weak inhibition of PNs by ChCs), despite being very important, is not mentioned.

      We now mention this in the summary of the discussion (“Finally, ChCs only weakly inhibit PyCs.”)

      Discussion of findings

      -" Optogenetic activation of cortical feedback": it is not clear what the authors mean by cortical feedback. As RS was retrogradely labeled, this region may rather provide feedforward inhibition to V1 via ChCs.

      Retrosplenial cortex is a higher order cortical area and only provides feedback to V1.

      -"This means that each ChC receives input from many L5 PyCs, which could explain the low selectivity of ChC responses we observed to natural images compared to those of L2/3 and L5 PyCs". : perhaps state explicitly that the convergence of many PN inputs each carrying different RF/visual properties "averages out" in ChC (as you do a few lines below for MEI).

      At this point, we do not know how the connections from L5 to ChCs are organized. Whether this convergence results in “averaging out” is therefore not so certain. We have made an attempt to clarify the situation. (“This convergence of L5 PyC inputs, if not strongly organized, could explain the low selectivity of ChC responses we observed to natural images compared to those of L2/3 and L5 PyCs.”)

      -"However, we did not identify neuromodulatory inputs to ChCs in our rabies tracing experiment. Possibly, these inputs act predominantly through extrasynaptic receptors and were therefore not labeled by the transsynaptic rabies approach.": here, the authors should cite the work by Lu et al (doi: 10.1038/nn.4624) which found basal forebrain (diagonal band of Broca) cholinergic inputs to ChC of the PFC in the Nkx2.1CreER mouse model. Moreover, the authors should discuss potential technical differences (?) responsible for this discrepancy. Beyond the extrasynaptic release of neuromodulators, rabies strains may display different tropism profiles for neuron classes.

      We have now added a sentence discussing this and added the reference in the revised manuscript.

      -The section dedicated to prediction error is particularly interesting and relevant. In my opinion, this interpretation should be further emphasized in the abstract and summary of findings paragraph in the discussion (as already indicated).

      Yes, we agree and have added some emphasis.

      -" These findings are thus in contrast with the general notion that ChCs exert powerful control over PyC output (28, 78), but consistent with computational simulations predicting a relatively small inhibitory effect of GABAergic innervation of the AIS, possibly involving shunting inhibition (79, 80)." These findings are also consistent with results from PFC and dCA1 studies showing, with electrophysiological recordings combined with optogenetic stimulation of ChCs, that a small proportion of putative PNs was inhibited upon ChC stimulation (doi: 10.1038/nn.4624 doi: 10.1016/j.neuron.2021.09.033).

      Perhaps the effect of ChCs is limited in all these experiments by a suboptimal efficiency of ChC targeting. Moreover, inhibition might be restricted to a subset of PNs carrying a specific function. This could be discussed.

      We added an explanation for the weak effects of silencing to the discussion and stated that our results are in line with findings in PFC and CA1. (“One explanation for the weak effects we observed is the high variability in the number of GABAergic boutons that PyCs receive at their AISs. Possibly, only a smaller fraction of PyCs with high numbers of AIS synapses are inhibited when ChCs are active. Indeed, we find that only a small fraction of PyCs increased their activity upon chemogenetic silencing of ChCs, in line with findings by others showing that manipulating ChC activity in vivo has relatively weak effects on small populations of PyCs (27, 28).”)

      Although we cannot rule out that ChC targeting is suboptimal in our and other experiments, the expression of the KORD receptor as visualized by mCyRFP1 fluorescence appeared very strong. In addition, the common notion in the ChC field is that ChCs exert powerful control over PyC firing. Even suboptimal labeling should in that case show clear inhibitory effects. Similar experiments with PV+ interneurons would show very convincing inhibition, even if labeling is suboptimal. To keep the discussion concise, we prefer to leave this particular point out.

      -" ChC activation could prevent homeostatic AIS shortening of L2/3 PyCs if their activity occurs during behaviorally relevant, arousal inducing events": this postulate seems to be very interesting but is not very clear and lacks some mechanistic speculation.

      We considered elaborating more on this hypothesis. However – given that it is merely a speculation at this point – we do not wish to lengthen the discussion further on this point.

      • A reference to previous studies demonstrating high levels of synchronous ChC activities is missing: the authors may cite Dudok et al., Schneider-Mizell et al., and Jung et al. (and discuss a change in synchrony with learning or habituation in the case of this study; see above).

      We have now also referred to these papers in the context of high correlations between ChCs.

      Methods

      Beyond references to reagents (eg antibodies, viruses), lot numbers should be provided whenever this is possible. Indeed, there might be strong lot-to-lot variations in specificity and efficiency.

      Reviewer #3 (Recommendations For The Authors):

      Major:

      • (Figure 5) Control analysis missing. Mice before and after training in VR will almost definitely exhibit different running patterns when viewing drifting gratings. Since ChCs are strongly modulated by locomotion, assess whether results depend on changes in running.

      Although we did not compare locomotion patterns before and after training, we removed all trials in which the mice were running (see methods). Therefore, we can exclude that these results are caused by changes in running behavior.

      • (Figure 5 & 6) What would happen with simple passive visual experience, not in a visuomotor task? What if there was no reward? What if there was an open-loop experiment with random reward? To which specific aspect of the experiment are the results attributable?

      These are indeed very interesting questions that may be tested in future experiments.

      (Figure 7 B, H) The pre-injection ChC activity in the KORD group is less than 50% of that in control mice! Discuss the effect of such a shift in baseline. Plasticity of PyCs even before ChC inactivation?

      See answer to the above question in the public section of reviewer 3.

      • (Figure 3 H) Contrast tuning results, as far as I understand, come only from the CNN. However, if I understood correctly, during the passive viewing of gratings there were already different contrasts. Why not show contrast tuning there? Do the results disagree?

      We did indeed show stimuli at different contrasts during the passive viewing of gratings. Although the results from those recordings were not optimal for defining contrast sensitivity, they also showed that ChC responses were less modulated by contrast than PyCs.

      Minor:

      • (Figure 3) Explain the potential impact of different indicators 8m vs 6f due to different baselines and dynamics.

      We believe there is no impact of different indicators, because for the CNN analyses we estimated spikes using CASCADE. This toolbox is specifically designed to generalize across different calcium indicators. Although GCaMP8m was not included in their training set, the wide variety of indicators used provides a solid basis for generalizable spike estimation. Importantly, comparisons between L2/3 PyCs and ChCs also would not be affected by this concern.

      • (Figure 4) NV-PyCs. Would you call all of these mismatch-responsive neurons? Discuss the difference in the percentage of neurons (more than 50% of total PyCs here, compared to significantly less - up to 40% in previous studies, as far as I'm aware)

      Not all NV-PyCs appeared to be mismatch-responsive neurons.

      • (Figure 6 D) No error bars?

      This is a representation of the fraction of all contacted AISs, which has no error bars indeed.

      • (Figure 6 E-F and H-I) These pairs of panels contain essentially the same information. The first panel of each pair seems redundant.

      We prefer to keep both plots in place, as in this case the skewness of the histogram can be helpful, which is less clear in the boxplot (which in itself displays the quantiles better).

      • The equation for direction tuning still has ang_ori, instead of ang_dir which I'm assuming should be there.

      Thank you for noticing, we corrected it.

      • The response for drifting gratings is calculated from a different interval (0.2-1.2s) compared to natural images (0-0.5s). Why?

      Because we used spike probability in the case of the natural images to shorten the signal, and the visual stimuli were presented for 0.5 s (instead of 1 s as with the gratings).

      Very minor:

      • It would be helpful for equations to have numbers.

      Done

      • Sparsity equation. Better to have it as a general equation, with N instead of 40. Then below it can be explained that N is the number of images = 40.

      Done

      • "The similarity of these MEIs with those we found for ChCs is in line with the idea that ChCs are driven by input from a large number of L5 PyCs (but do not exclude alternative explanations)." - in parenthesis it should be does not exclude.

      Corrected.

      • "In contrast, the response strength of PyCs was only mildly and non-significantly reduced after training"

      • statistically non-significant.

      Corrected.

      "We first looked at the average activity of neurons in both essions." - sessions

      Corrected.

      • (Figure 7 C) Explain what points and error bars represent

      Done.

    1. Author Response

      Reviewer #2 (Public Review):

      The study from Gumaste et al investigates whether mice can use changes of intermittency, a temporal odor feature, to locate an odor source. First, the study tries to demonstrate that mice can discriminate between low and high intermittency and that their performance is not affected by the odor used or the frequency of odor whiffs. Then, they show that there is a correlation between glomerular responses (OSNs and mitral cells) and intermittency. Finally, they conclude that sniffing frequency impacts the behavioral discrimination of intermittency as well as its neural representation. Overall, the authors seek to demonstrate that intermittency is an odor-plume property that can inform olfactory navigation.

      The paper explored an interesting question, the use of intermittency of an odor plume as a behavioral cue, which is a new and intriguing hypothesis. However, it falls short in demonstrating that the animal is actually sensitive to intermittency but not other flow parameters, and is missing some important details.

      Major concerns

      1) One of the cornerstones of this paper consists in showing that mice are behaviorally able to distinguish among different intermittency values (high or low), across a variety of different stimuli and without confounds such as the number of whiffs or concentration. However, I could not find in the paper a convincing explanation of how these confounds were tested. It is clear that the authors repeat their measurements in different conditions (low or high concentration, and different whiff numbers) but it is not specified how: do the authors mix all stimuli in the same session, and so the animals simply generalize across all the stimuli and only consider intermittency for the behavioral choices? Or do authors repeat different sessions for different parameters? For example: do they perform two separate sessions with low concentration and high concentration? If this last one is the case, I would argue that this is not enough proof that animals generalize across concentrations, as the animals might simply use concentration as a cue and change the decision criteria at each session. Please clarify.

      We appreciate the reviewer pointing out our oversight in not including this information in the manuscript. Trials of the two gain values (which modulate the maximum concentration) are presented interleaved within a session. These trials are separated only for post-session analysis to test the effect of gain on animal performance. To make this point clearer we have included the following text on line 952 of the manuscript:

      “Additionally, trials of a gain of 0.5 and a gain of 1 are interwoven randomly during the session with each unique stimulus being presented at both a gain of 0.5 and 0.1. Thus, after the initial engagement trials, animals are presented with a total of 28 trials at a gain of 0.5 and 28 trials at a gain of 0.1.”

      Additionally, to address one of the reviewer’s overarching points, that the manuscript “falls short in demonstrating that the animal is actually sensitive to intermittency but not other flow parameters,” we would like to highlight that through our olfactometer design (described in the Olfactometer Design subsection of the Methods section and illustrated in Figure 1C) the flow rate is held constant throughout the experiment. To further ensure that the animal is not using flow rate or other experimental conditions to perform the task, we tested all animals on a “no odor” condition in which the vial of odor is replaced with a vial of mineral oil. In this condition, their hit rate was significantly lower, as shown in Figure 2C and described in Lines 240-245:

      “Animals’ hit rate also significantly decreased when tested on the Go/No-Go task with the odor vial replaced with mineral oil (n=12 mice, two-sample t-test Naturalistic: odor hit rate = 0.87 ±0.01, no odor hit rate= 0.23 ±0.05, p<0.0001; two-sample t-test Binary Naturalistic: odor hit rate= 0.89±0.01, no odor hit rate= 0.18±0.07, p<0.0001; two-sample t-test Synthetic: odor hit rate= 0.86±0.007, no odor hit rate= 0.23±0.07, p<0.0001), confirming that mice are using odor to perform the task.”

      2) It looks to me that the measure of intermittency strongly depends on the set. What is the logic of setting a specific threshold? Do the results hold when this threshold changes within a reasonable range? The same questions (maybe even more important) go for the measure of glomerular intermittence. Unfortunately, a sensitivity analysis for both measures is missing, which makes it hard to interpret the results.

      We assume the reviewer suggests that we could have tested discrimination at various intermittency thresholds. This is indeed what we did, though not by varying the threshold parametrically (due to the above-mentioned time constraints), but rather qualitatively/categorically. We tested our mice on 3 stimulus "types" (Figure 1F): actual continuous plume concentration traces (naturalistic), thresholded traces (binarized by threshold 0.1) and square wave (odor agnostic periodic binary). Further, each was tested at 2 gain levels. Figure 2B demonstrates that mice discriminate similarly across these 3 widely differing stimuli, while traces spanned most of the range of possible intermittencies. Reducing the threshold by 1 or 2 orders of magnitude would skew the range of trials toward many more CS+ trials. We hence conclude that the mice are robustly discriminating and that the paradigm chosen and its associated constraints provide a reasonable test of "intermittency space".

      We agree nonetheless that future work should address your suggestion directly by implementing an alternate paradigm. For example, in such a paradigm, mice may be trained to discriminate high vs low intermittencies at varying absolute levels (e.g. 1 vs 0.9 and 0.1 vs 0), etc., however that was well outside the scope of what we aimed to test.

      See Figure 1- Supplement 1A. We varied the threshold half a log unit around the 0.1 threshold used in the neuro-behavioral research. As expected, the higher the odor threshold, the more left-shifted the curve. You can see that the monotonic relationship is qualitatively the same across thresholds.
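      The threshold dependence described above can be illustrated with a minimal sketch. It assumes intermittency is computed as the fraction of samples in a concentration trace that exceed the odor threshold, and it reads "half a log unit around the 0.1 threshold" as 0.1 multiplied by 10^-0.5 and 10^+0.5; both readings, the function names and the toy trace are assumptions for illustration, not the analysis code behind the supplementary figure.

```python
def intermittency(trace, threshold):
    """Fraction of samples in `trace` whose concentration exceeds `threshold`."""
    if not trace:
        raise ValueError("trace must contain at least one sample")
    above = sum(1 for c in trace if c > threshold)
    return above / len(trace)


def threshold_sweep(trace, center=0.1, half_width_log10=0.5):
    """Intermittency at three thresholds spaced symmetrically (in log10) around `center`."""
    thresholds = [
        center * 10 ** (-half_width_log10),
        center,
        center * 10 ** half_width_log10,
    ]
    return [(t, intermittency(trace, t)) for t in thresholds]


# Toy normalized concentration trace (hypothetical values between 0 and 1)
toy_trace = [0.0, 0.02, 0.05, 0.2, 0.6, 0.9, 0.4, 0.08, 0.0, 0.0]

sweep = threshold_sweep(toy_trace)
for t, value in sweep:
    print(f"threshold = {t:.3f}, intermittency = {value:.2f}")
```

      In this toy example a higher threshold can only lower (or leave unchanged) the intermittency of a given trace, which is the direction of the shift described for the curves across thresholds.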

      3) The logic of choosing the decision boundary for the discrimination task is not clear: low intermittency is considered to be below 0.15 and high intermittency is considered to be between 0.2 and 0.8. Do these values correspond to natural intermittency distribution? How were these values chosen?

      Intermittency drops as a function of distance from the source (downwind). It also has a close to normal (with kurtosis) distribution across wind, peaking at the center (see e.g. Crimaldi 2002, Connor 2018). So, animals may encounter any and all intermittencies (0-1). Given our Go/No-Go paradigm we had to set a CS-/CS+ boundary. Typically, to generate an adequate psychometric curve using this paradigm, either the CS- or CS+ stimuli need to represent a wide range of values, which the animals are required to compare against a narrow range (or single value). Again, bounded by effective behavioral paradigm design, the numbers of CS+ and CS- trials need to be even in order to appropriately motivate animals to engage in the task. Thus, considering the entire range of intermittency values animals can encounter while navigating through a plume in conjunction with effective behavioral design, we arrived at our chosen values for low and high intermittency.

      As you can see in Figure 1- Supplement 1A (and also reviewer #1, comment 2), I=0.15 is roughly at the knee where the monotonic decrease begins to asymptote. This is roughly true for all 3 concentration thresholds. Consequently, I=0.2-0.8 effectively samples the region where intermittency clearly relates to distance to the source, which is where we hypothesize animals.

      4) Only 2 odors were used in the whole study and some results were in disagreement between the two odors. By looking at only two odors it is very difficult to make a general conclusion about intermittency encoding in the OB.

      We agree that testing only 2 odors is limiting, but we were constrained in terms of the number of tests that we could run on our cohort of animals. Nonetheless, the intermittency of both odors is clearly discriminable. As explained in our response to comment 3 by Reviewer 1:

      “We indeed considered several odorants and associated properties. Given time constraints we were limited to 2 stimuli of which we had to vary many parameters (type, I, gain, sniffing) in assessing both discrimination and neural processing.”

      “Additionally, these two odorants recruit glomeruli in different regions of the dorsal olfactory bulb, have different functional groups and elicit different spatiotemporal response properties in the olfactory bulb (Figure 6- figure supplement 1A, stated on line 507). Both odorants are fruit-associated odors with neutral preference indices (Saraiva et al., 2016, Fletcher, 2012). Thus, while we do not explore a panel of odorants, we do explore the generalizability of intermittency processing with two distinct odorants.”

      We decided to test 2 monomolecular odorants (2-heptanone and methyl valerate) as these have been widely used in rodent olfactory bulb imaging, providing distinct and clear glomerular response patterns. They are both fruity smelling odors, implying a relationship to edible food (at least, for humans). Methyl valerate is a methyl ester of pentanoic acid with a fruity (apple) smell and 2-Heptanone is a ketone with a fruity (green banana) smell.

      5) Assuming that all the above issues are resolved, one can conclude that intermittency can be perceived by an animal. The study puts a strong accent on the fact that this feature could be used for navigation. I understand that it is extremely hard to demonstrate that this feature is actually used for navigation, however, the analysis of relevance of this measure is missing. Even if it is used in navigation, most probably this would be in combination with other features, thus its relative importance needs to be discussed, or even better, established.

      We fully appreciate the reviewer's reasoning. Our approach indeed intended to establish a conditio sine qua non: if mice could not discriminate these stimuli they would likely not be able to use intermittency in general for navigation (at least for the odorants tested, for the intermittency ranges tested). We show however that they can, and hence they could use it. To demonstrate their use of intermittency alone or combined with other modalities or properties is well beyond the scope of this manuscript and we agree is a very interesting endeavor.

      We discussed other temporal properties on lines 58-71 and 657-664 and other general properties on lines 46-56. The relative roles were briefly addressed on lines 664-676 and we hesitate to speculate beyond this.

    1. Author Response

      Reviewer #1 (Public Review):

      With MERGEseq, the authors sought to develop a scalable and accessible method for getting both projectome and transcriptome information at the single-cell level from multiple projection targets within a single animal. MERGEseq uses a retro rAAV2 to deliver a 15-nucleotide barcode driven by a CAG promoter with co-expression of eGFP to enrich barcoded cells using FACS. Injection of this rAAV2 in distinct regions (with each injection region distinguished by a unique barcode that is specific to the virus used) allows retrograde trafficking and expression of the barcodes in cells that project to the injected region. In this manuscript, rAAVs harboring 5 unique barcodes were stereotactically delivered to 5 targets of the mouse: dorsomedial striatum (DMS), mediodorsal thalamic nucleus (MD), basal amygdala (BLA), lateral hypothalamus (LH), and agranular insular cortex (AI). After a 6-week period to allow for viral transduction and expression, the ventromedial prefrontal cortex (vmPFC) was harvested for scRNAseq. vmPFC scRNAseq data were validated against previously published PFC datasets, demonstrating that MERGEseq does not disrupt transcript expression and identifies the same principal cell types as annotated in previous studies. Importantly, MERGEseq enabled the identification of cell types in the vmPFC that project to distinct areas, with separation occurring largely based on cell type and cortical layer. The application of stringent criteria for barcode index determination is rigorous and improves confidence that barcoded cells are correctly identified. The observation that all barcoded cells were excitatory is consistent with prior work, although it is not clear if viral tropism contributes to this in some way. In a parallel experiment, FAC-sorted cells (vmPFC cells expressing EGFP) were isolated as a comparison. Notably, EGFP+ cells were exclusively excitatory neurons, consistent with literature showing PFC projection neurons are excitatory. 
      Next, barcode analysis was combined with transcriptional identification of neuronal subtypes to define general projection patterns and single-cell projection patterns, which were validated by the DMS and MD in situ using retrograde tracing in combination with RNA FISH. MERGEseq data were also used to identify transcriptional differences between neurons with dedicated and bifurcated projections. DMS+LH and DMS+MD projecting neurons had distinct transcriptional profiles, unlike cells with other targets. RNA FISH for marker gene Pou3f1 and retrograde tracing from DMS+LH projecting cells demonstrate enrichment of this gene in this projection population. Finally, machine-learning was used to predict projection targets based on transcriptional profiles. In this dataset, 50 highly variable genes (HVGs) were optimal for predicting projection patterns, though this might vary in different circuits. Overall, the results of this manuscript are well presented and include rigorous validation for select vmPFC targets with in situ techniques. The application of unique barcodes for retro-AAV delivery is an accessible tool that other labs can implement to study other brain circuits.

      Ultimately, MERGEseq is a subtle conceptual advancement over VECTORseq (retro-AAV delivered transgenes rather than barcodes, in combination with scRNAseq) that offers higher confidence in the described projectome diversity in comparison. The use of a retrograde AAV inherently limits the number of projection areas that can be assessed, a weakness compared to anterograde approaches such as MAPseq/BARseq. However, BARseq demands more time and resources; further, the use of the highly toxic Sindbis virus limits the application of this technique. This manuscript builds upon previous work by utilizing machine learning to predict projection targets. BARseq2 could be used to rigorously validate predicted projectomes and gain single-cell information regarding target neurons. Overall, MERGEseq is an accessible technique that can be used across many animal models and serve as an important starting point to define circuits at the single-cell level.

      We thank the reviewer for the comprehensive review. We are grateful for the reviewer’s recognition of the conceptual advancement of MERGE-seq and the rigorous criteria we applied for projection barcode determination. We have revised the Introduction to highlight advancements in our method. We also discussed the balance of transcriptomic comprehensiveness against spatial resolution in the revised Discussion. The reviewer’s comments have been invaluable in enhancing the clarity and depth of our manuscript.

      Reviewer #2 (Public Review):

      Investigating the relationship between transcriptomic profiles, their axonal projection and collateralization patterns will help define neuronal cell types in the mammalian central nervous system. The study by Xu et al. combined multiple retrograde viruses with barcodes and single-cell RNA-sequencing (MERGE-seq) to determine the projection and collateralization patterns of transcriptomically defined ventral medial prefrontal cortex (vmPFC) projection neurons. They found a complex relationship: the same transcriptomically defined cell types project to multiple target regions, and the same target region receives input from multiple transcriptomic types of vmPFC neurons. Further, collateralization patterns of vmPFC to the five target regions they investigated are highly non-random.

      While many of the biological conclusions are not surprising given recent studies on the collateralization patterns of vmPFC neurons using single neuron tracing and other methods that integrate transcriptomics and projections, MERGE-seq provides validation, at the single-cell level, of the collateralization patterns of individual vmPFC neurons, and thus offers new and valuable information over what has been published. The method can also be used to study collateralization patterns of other neuron types.

      Some of the conclusions the authors draw depend on the efficiency of retrograde labeling, which was not determined. Without quantitative information on retrograde labeling efficiency, and unless such efficiency is close to 100%, these conclusions are likely misleading.

      We thank the reviewer for recognizing the contributions of our MERGE-seq technique in advancing the understanding of projection patterns of vmPFC neurons. We concur that while our conclusions align with previous findings, our single-cell level analysis provides additional depth to the existing knowledge of the field. We acknowledge the challenge of quantifying retrograde labeling efficiency in order to draw quantitative conclusions from our findings. As an alternative, we have used fMOST-based single-neuron tracing data and analysis to validate our projection patterns and ensure the robustness of our conclusions in the revised manuscript. We also clarified more explicitly the limitations of the quantitative conclusions drawn from MERGE-seq in the revised Discussion. The reviewer's insights are greatly appreciated and will inform the improvement of our research methodology.

      Reviewer #3 (Public Review):

      This manuscript describes a multiplexed approach for the identification of transcriptional features of neurons projecting to specific target areas at the single-cell level. This approach, called MERGE-seq, begins with multiplexed retrograde tracing by injecting distinctly barcoded rAAV-retro viruses into different target areas. The transcriptomes and barcoding of neurons in the source area are then characterized by single-cell RNA sequencing (scRNAseq) on the 10xGenomics platform. The projection targets of barcoded neurons in the source area can be inferred by matching the detected barcodes to the barcode sequences of the rAAV-retro viruses injected into the target areas.
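
      The matching step described above can be sketched as a simple lookup. This is an illustration under stated assumptions, not the authors' pipeline: the 15-nt barcode sequences, the function name `infer_targets`, and the `min_umis` cutoff are all hypothetical, since the text gives neither the sequences nor the exact barcode-calling criteria.

```python
# Hypothetical 15-nt barcodes, one per injected target region.
# The real sequences are not given in the text.
BARCODE_TO_TARGET = {
    "ACGTACGTACGTACG": "DMS",
    "TTGCAATGCCGTAAC": "MD",
    "GGATCCATTGCAGTC": "BLA",
    "CATGGTACCTAGGAT": "LH",
    "TGCATGCAAGCTTGA": "AI",
}


def infer_targets(barcode_counts, min_umis=2):
    """Return the sorted targets whose barcode has >= min_umis UMIs in a cell.

    `min_umis` is an illustrative cutoff, not the authors' criterion.
    Sequences that match no injected virus are ignored.
    """
    targets = set()
    for seq, n_umis in barcode_counts.items():
        target = BARCODE_TO_TARGET.get(seq)
        if target is not None and n_umis >= min_umis:
            targets.add(target)
    return sorted(targets)


# One hypothetical cell: two well-supported barcodes, one below the
# cutoff, and one sequence that matches no injected virus.
cell = {
    "ACGTACGTACGTACG": 5,
    "CATGGTACCTAGGAT": 3,
    "TTGCAATGCCGTAAC": 1,
    "A" * 15: 9,
}
projection = infer_targets(cell)  # ['DMS', 'LH']
```

      A cell assigned more than one target, as here, would correspond to a neuron with collateral projections under these assumptions.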

      The authors validated their approach by injecting five rAAV-retro GFP viruses, each encoding a different barcode, into five known targets of the ventromedial prefrontal cortex (vmPFC). The transcriptomes and barcoding of vmPFC neurons were then analyzed by scRNA-seq with or without enrichment of retrogradely labeled neurons based on GFP fluorescence. The authors confirmed the previously described heterogeneity of vmPFC neurons. In addition, they showed that most transcriptionally defined cell types project to multiple targets and that the five targets received projections from multiple transcriptomic types. The authors further characterized the transcriptomic features of barcoded vmPFC neurons with different projection patterns and defined Pou3f1 as a marker gene of neurons extending collateral branches to the dorsomedial striatum and lateral hypothalamus.

      Overall, the results of the manuscript are convincing: the transcriptomic vmPFC cell types defined by scRNAseq in this study appear to correlate well with previous studies, the bifurcated projection patterns inferred by barcoding are validated using dual-color retro-AAV tracing, and marker genes for projection-specific cell subclasses are validated in retrogradely labeled vmPFC using RNA FISH for marker detection.

      The concept of combining retrograde tracing and scRNAseq is not new. Previous studies have applied recombinase-expressing viruses capable of retrograde labeling, such as CAV, rabies virus, and AAV2-Retro, to retrogradely label and induce the expression of fluorescence markers in projection neurons, therefore facilitating enrichment and analysis of neurons projecting to a specific target. Multiplexed analysis can be achieved with the combination of different reporter viruses or viruses expressing different recombinases and appropriate reporter mouse lines. The advantages of MERGE-seq include that no transgenic lines are required and that it could be applied at even higher levels of multiplexity.

      We thank the reviewer for the insightful review of our manuscript and the recognition of the advantages of MERGE-seq. We appreciate that the reviewer acknowledged the robust validation of the method through dual-color retro-AAV tracing and RNA FISH, and the confirmation of previous findings on vmPFC neuronal heterogeneity and collateral projection patterns. We provided additional joint analysis with fMOST-based single-neuron projectome data (Gao et al., 2022, Nature Neuroscience) to further validate the projection patterns (>= 3 targets) that cannot be easily validated with dual-color retro-AAV tracing.

      However, previously existing datasets that have already profiled this region with scRNAseq have not been utilized to their full extent. Therefore, for the proper context with prior literature, bioinformatic integration of these scRNAseq and prior scRNAseq data is needed.

      Moreover, robust detection of barcodes in neurons labeled by barcoded AAV-retro viruses remains a challenge. The authors should clearly discuss the difficulties with barcode detection in this approach, as well as discuss potential solutions, which are important for others interested in this approach.

      While this study is limited to the five known targets of vmPFC, the results suggest that MERGE-seq is a valuable tool that could be used in the future to characterize projection targets and transcriptomes of neurons in a multiplexed manner. As MERGE-seq uses AAVs to deliver barcodes, this method has the potential for application in model organisms for which transgenic lines are not available. Further improvements in experimental design and data analysis should be considered when applying MERGE-seq to poorly characterized source areas or with increased multiplexity of target areas.

      In summary, this is a valuable approach, but the authors should clearly provide the context for their study within the existing literature, transparently discuss the limitations of MERGE-seq, as well as suggest improvements for the future.

      We appreciate your positive assessment of MERGE-seq as a valuable approach with future potential. As recommended, we have performed integration analysis with existing vmPFC scRNA-seq studies, including Bhattacherjee et al., 2019, Lui et al., 2021, Yao et al., 2021, and specifically the recently published MERFISH data of PFC (Bhattacherjee et al., 2023).

      In the revised Discussion, we have transparently addressed the current limitations of MERGE-seq, including imperfect retrograde labeling efficiency, variable barcode recovery rates and cell loss during dissociation. We also addressed the challenges in detecting and recovering projection barcodes and suggested potential solutions such as using FAC-sorted EGFP-negative cells for control and applying single-molecule FISH techniques. We sincerely appreciate the reviewer’s rigorous and insightful feedback, which has substantially strengthened our manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      In this paper, the authors develop new models of sequential effects in a simple Bernoulli learning task. In particular, the authors show evidence for both a "precision-cost" model (precise posteriors are costly) and an "unpredictability-cost" model (expectations of unpredictable outcomes are costly). Detailed analyses of experimental data partially support the model predictions.

      Strengths:

      • Well-written and clear.

      • Addresses a long-standing empirical puzzle.

      • Rigorous modeling.

      Weaknesses:

      • No model adequately explains all of the data.

      • New empirical dataset is somewhat incremental.

      • Aspects of the modeling appear weakly motivated (particularly the unpredictability model).

      • Missing discussion of some relevant literature.

      We thank Reviewer #1 for her/his positive comments on our work and her/his comments and suggestions.

      Reviewer #2 (Public Review):

      This paper argues for an explanation of sequential effects in prediction based on the computational cost of representing probability distributions. This argument is made by contrasting two cost-based models with several other models in accounting for first- and second-order dependencies in people's choices. The empirical and modeling work is well done, and the results are compelling.

      We thank Reviewer #2 for her/his positive comments on our work.

      The main weaknesses of the paper are as follows:

      1) The main argument is against accounts of dependency based on sensitivity to statistics (i.e., modeling the timeseries as having dependencies it doesn't have). However, such models are not included in the model comparison, which makes it difficult to compare these hypotheses.

      Many models in the sequential-effects literature (Refs. [7-12] in the manuscript) are ‘leaky-integration’ models that interpret sequential effects as resulting from an attempt to learn the statistics of a sequence of stimuli, through exponentially decaying counts of the simple patterns in the sequence (e.g., single stimuli, repetitions, and alternations). In some studies, the ‘forgetting’ of remote observations that results from the exponential decay is justified by the fact that people live in environments that are usually changing: it is thus natural that they should expect that the statistics underlying the task’s stimuli undergo changes (although in most experiments, they do not), and if they expect changes, then they should discard old observations that are no longer relevant. This theoretical justification raises the question as to why subjects do not seem to learn that the generative parameters in these tasks are in fact not changing, all the more as other studies suggest that subjects are able to learn the statistics of changes (and consistently they are able to adapt their inference) when the environment does undergo changes (Refs. [42,57]).

      Our models are derived from a different approach: we derive behavior from the resolution of a problem of constrained optimization of the inference process. It is not a phenomenological model. When the constraint that weighs on the inference process is a cost on the precision of the posterior, as measured by its entropy, we find that the resulting posterior is one in which remote observations are ‘forgotten’ through exponential discounting, i.e., we recover the predictions of the leaky-integration models, which past studies have empirically found to be reasonably good accounts of sequential effects. (Thus these models are already in our model comparison.) In our framework, the sequential effects do not stem from the subjects’ irrevocable belief that the statistics of the stimuli change from time to time, but rather from the difficulty that they have in representing precise beliefs, which is a rather different theoretical justification.

      Furthermore, we show that a large fraction of subjects are not best-fitted by precision-cost models (i.e., they are not best-fitted by leaky integration), but instead they are best fitted by unpredictability-cost models. These models suggest a different explanation of sequential effects: that they result from the subjects favoring predictable environments, in their inference. In the revised version of the manuscript, we have made clearer that the derivation of the optimal posterior under a precision cost results in the exponential forgetting of remote observations, as in the leaky-integration models. We mention it in the abstract, in the Introduction (l. 76-78), in the Results when presenting the precision-cost models (l. 264-278), and in the Discussion (l. 706-716).

      2) The task is not incentivized in any way. Since incentives are known to affect probability-matching behaviors, this seems important. In particular, we might expect incentives would trade off against computational costs - people should increase the precision of their representations if it generates more reward.

      We thank Reviewer #2 for her/his attention to our paper and for her/his comments. As for the point on the models, see answer above (point 1).

      As for the point on incentivization: we agree that it would be very interesting to measure whether and to what extent the performance of subjects increases with the level of incentivization. Here, however, we wanted, first, to establish that subjects’ behavior could be understood as resulting from inference under a cost, and second, to examine the sensitivity of their predictions to the underlying generative probability, rather than to manipulate a tradeoff involving this cost (e.g. with financial reward). We note that we do find that subjects are sensitive to the generative probability, which implies that they exhibit some degree of motivation to put some effort in the task (which is the goal of incentivization), in spite of the lack of economic incentives. But it would indeed be interesting to know how the potential sensitivity to reward interacts with the sensitivity to the generative probability. Furthermore, as Reviewer #2 mentions, some studies show that incentives affect probability-matching behavior: it is then unclear whether the introduction of incentives in our task would change the inference of subjects (through a modification of the optimal trade-off that we model); or whether it would change their probability-matching behavior, as modeled by our generalized probability-matching response-selection strategy; or both. Note that we disentangled both aspects in our modeling and that our conclusions are about the inference, not the response-selection strategy. We deem the incentivization effects very much worth investigating; but they fall outside of the scope of our paper.

      We now mention this point in the Discussion of the revised manuscript (l. 828-840).

      3) The sample size is relatively small (20 participants). Even though a relatively large amount of data is collected from each participant, this does make it more difficult to evaluate the second-order dependencies in particular (Figure 6), where there are large error bars and the current analysis uses a threshold of p < .05 across a large number of tests hence creating a high false-discovery risk.

      Indeed we agree with Reviewer #2 that as the number of tests increases, so does the probability that at least one null hypothesis is rejected at a given level, even if the null hypothesis is correct. But in the panels a, b and c of Figure 6, about half of the tests are rejected, which is very unlikely under the null hypothesis that there is no effect of the stimulus history on the prediction, all the more as the signs of the non-significant results are in most cases consistent with the direction of the significant results. (In panel e, which reports a finer analysis in which the number of subjects is essentially divided by 2, about a fourth of the tests are rejected, and here also the non-significant results are almost all in the same direction as the significant ones.)

      However, we agree that there remains a risk of false discovery, thus we applied a Bonferroni-Holm-Šidák correction to the p-values in order to mitigate this risk. With these more conservative p-values, a lower number of tests are rejected, but in most cases in Fig. 6abc the effects remain significant. In particular, we are confident that there is a repulsive effect of the third-to-last stimulus in the case of Fig. 6c, while there is an attractive effect in the other cases.

      In the revised manuscript, Figure 6 now reports whether the tests are rejected when the p-values are corrected with the Bonferroni-Holm-Šidák correction.

      (We also applied this correction to the p-values of the tests in Fig. 2, which has more data: the corrected p-values are all below 1e-13, which we now indicate in the caption of this figure.)
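
      For readers unfamiliar with this correction, a minimal sketch of a step-down Holm-Šidák adjustment follows. It is an illustration under the standard textbook definition and is not the authors' analysis code; the function name `holm_sidak` and the example p-values are hypothetical.

```python
def holm_sidak(pvalues):
    """Step-down Holm-Sidak adjusted p-values, returned in input order.

    The k-th smallest of m p-values (k = 1..m) is adjusted to
    1 - (1 - p)^(m - k + 1), and a running maximum keeps the adjusted
    values monotone in the sorted order.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):  # rank = k - 1
        adj = 1.0 - (1.0 - pvalues[idx]) ** (m - rank)
        running_max = max(running_max, adj)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical raw p-values from three tests.
raw = [0.01, 0.04, 0.03]
corrected = holm_sidak(raw)
rejected = [p < 0.05 for p in corrected]
```

      With these hypothetical values, all three tests pass an uncorrected 0.05 threshold but only the first survives the correction, which illustrates why fewer tests are rejected with the more conservative p-values.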

      4) In the key analyses in Figure 4, we see model predictions averaged across participants. This can be misleading, as the average of many models can produce behavior outside the class of functions the models themselves can generate. It would be helpful to see the distribution of raw model predictions (ideally compared against individual data from humans). Minimally, showing predictions from representative models in each class would provide insight into where specific models are getting things right and wrong, which is not apparent from the model comparison.

      In the main text of the original manuscript, we showed the behavior of the pooled responses of the best-fitting models, and we agree with Reviewer #2 that it did not make clear to the reader that the apparent ability of the models to reproduce the subjects’ behavioral patterns was not a misleading byproduct of the averaging of different models. In the original version of the manuscript, we had put a figure showing the behavior of each individual model (each cost type with each Markov order) in the Methods section of the paper; but this could easily be overlooked, and indeed it would be beneficial for the reader to be shown the typical behaviors of the models, in the main text. We have reorganized the presentation of the models’ behaviors: the first panels in Fig. 4 (in the main text) are now dedicated to showing the individual sequential effects of the precision-cost and of the unpredictability-cost models with Markov order 0 and 1. Figure 4 is reproduced in the response to Reviewer #1, above, along with comments on the sequential effects produced by these models (and also on the impact of the generalized probability-matching response-selection strategy, in comparison with the traditional probability matching). We believe that this figure makes clearer how the individual models are able to reproduce the patterns in subjects’ predictions; in particular it shows that this ability of the models is not just an artifact of the averaging of many models, as was the legitimate concern of Reviewer #2. We have left the illustration of the first-order sequential effects of the other models (with Markov order 2 and 3) in the Methods section (Fig. 7), so as not to overload Fig. 4, and because they do not bring new critical conceptual points.

      As for the higher-order sequential effects, the updated Figure 5, also reproduced above in the responses to Reviewer #1, now includes the sequential effects obtained with the precision-cost model of a Bernoulli observer (m=0), in addition to the precision-cost model of a Markov observer (m=1) and to the unpredictability-cost model of a Markov observer (m=3), in order to better illustrate the behaviors of the different models. The higher-order sequential effects of the other models can be found in Fig. 8 in Methods.

      Reviewer #3 (Public Review):

      This manuscript offers a novel account of history biases in perceptual decisions in terms of bounded rationality, more specifically in terms of finite resources strategy. Bridging two works of literature on the suboptimalities of human decision-making (cognitive biases and bounded rationality) is very valuable per se; the theoretical framework is well derived, building upon the authors' previous work; and the choice of experiment and analysis to test their hypothesis is adequate. However, I do have important concerns regarding the work that do not enable me to fully grasp the impact of the work. Most importantly, I am not sure whether the hypothesis whereby inference is biased towards avoiding high precision posterior is equivalent or not to the standard hypothesis that inference "leaks" across time due to the belief that the environment is not stationary. This and other important issues are detailed below. I also think that the clarity and architecture of the manuscript could be greatly improved.

      We thank Reviewer #3 for her/his positive comments on our work and her/his comments and suggestions.

      1) At this point it remains unclear what is the relationship between the finite resources hypothesis (the only bounded rationality hypothesis supported by the data) and more standard accounts of historical effects in terms of adaptation to a (believed to be) changing environment. The Discussion suggests that the two approaches are similar (if not identical) at the algorithmic level: in one case, the posterior belief is stretched (compared to the Bayesian observer for stationary environments) due to precision cost, in other because of possible changes in the environment. Are the two formalisms equivalent? Or could the two accounts provide dissociable predictions for a different task? In other words, if the finite resources hypothesis is not meant to be taken as brain circuits explicitly minimizing the cost (as stated by the authors), and if it produces the same type of behavior as more classical accounts: is the hypothesis testable experimentally?

      We agree with Reviewer #3 that the relation between our approach and other approaches in the literature should be made clearer to the reader.

      Since the 1990s, in the psychology and neuroscience literature, many models of perception and decision-making have featured an exponential decay of past observations, resulting in an emphasis, in decisions, of the more recent evidence (‘leaky integration’, Refs. [7-12, 76-86]). In the context of sequential effects, this mechanism has found a theoretical justification in the idea that people believe that statistics typically change, and thus that remote observations should indeed be discarded [8,12]. In inference tasks with binary signals, in which the optimal Bayesian posterior is in many cases a Beta distribution whose two parameters are the counts of the two signals, one way to conveniently incorporate a forgetting mechanism is to replace these counts with exponentially-filtered counts, in which more recent observations have more weight (e.g., Ref. [12]).
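
      The exponentially-filtered counts mentioned above can be sketched as follows. This is a minimal illustration, not the models fitted in the manuscript: the function names, the decay value, and the choice of a fixed (undecayed) uniform Beta(1, 1) prior are assumptions made for the example.

```python
def leaky_counts(observations, decay=0.9):
    """Exponentially-filtered counts of 1s and 0s in a binary sequence.

    Each new observation adds 1 to its own count after both counts are
    multiplied by `decay`; decay=1.0 recovers the exact counts.
    """
    n1 = n0 = 0.0
    for x in observations:
        n1 = decay * n1 + (1.0 if x == 1 else 0.0)
        n0 = decay * n0 + (1.0 if x == 0 else 0.0)
    return n1, n0


def predictive_prob(observations, decay=0.9, prior=1.0):
    """Mean of the Beta(n1 + prior, n0 + prior) posterior on P(next = 1).

    The prior pseudo-counts are kept fixed (an assumption of this sketch).
    """
    n1, n0 = leaky_counts(observations, decay)
    return (n1 + prior) / (n1 + n0 + 2.0 * prior)


# Two sequences with the same totals but a different order of events.
recent_ones = [0, 0, 0, 1, 1, 1]
recent_zeros = [1, 1, 1, 0, 0, 0]

exact = predictive_prob(recent_ones, decay=1.0)        # order is ignored
leaky_up = predictive_prob(recent_ones, decay=0.5)     # recent 1s dominate
leaky_down = predictive_prob(recent_zeros, decay=0.5)  # recent 0s dominate
```

      Without decay, both sequences yield the same prediction because only the totals matter; with decay, the prediction is pulled toward the most recent observations, which is the signature of a sequential effect in this kind of model.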

      Our approach to sequential effects is not grounded in the history of leakyintegration models: we assume, first, that subjects attempt at learning the statistics of the signals presented to them (this is also the assumption in many studies [712]), and second, that their inference is subject to a cost, which prevents them from reaching the optimal, Bayesian posterior; but under the constraint of this cost, they choose the optimal posterior. We formalize this as a problem of constrained optimization.

      The two formalisms are thus not equivalent. Beyond the fact that we clearly state the problem which we assume the brain is solving, we do not propose that the origin of sequential effects resides in an adaptation to putatively changing environments: instead, we assume that they originate in a cognitive cost internal to the decision-maker. If this cost is proportional to the entropy of the posterior, as in our precision cost, then the optimal approximate posterior is one in which remote observations are ‘forgotten’ through an exponential filter, as in the leakyintegration models. In other words, in the context of this task and with this kind of cost, the models are, as Reviewer #3 writes, identical at the algorithmic level. As for the unpredictability cost, it does not result in a solution that resembles leaky integration; about half the subjects, however, are best fitted by unpredictabilitycost models. We thus provide a different rationale for sequential effects — that the brain favors predictive environment, in its inference — and this alternative account is successful in capturing the behavior of a large fraction of the subjects.

      In the revised manuscript, we now clarify that the precision cost results in leaky integration, in the abstract, in the Introduction (l. 76-78), in our presentation of the precision-cost models (Results section, l. 264-275), and in the Discussion (l. 706-716). (We also refer Reviewer #3 to our response to the first comment of Reviewer #2, above.)

      Finally, Reviewer #3 asks the interesting question as to whether the “two accounts provide dissociable predictions for a different task”. Given that the leaky-integration approach is justified by an adaptation to potential changes, and our approach relies on the hypothesis that precision in beliefs is costly, one way to disentangle the two would be to eliminate the sequential nature of the task and instead present the observations simultaneously. This would eliminate the very notion of change across time. In this case, the leaky account would predict that subjects’ inference becomes optimal (because the leak should disappear in the absence of change), while in the second approach the precision cost would still weigh on the inference, and result in approximate posteriors that are “wider” (less precise) than the optimal one. The resulting divergence in the predictions of these models is very interesting, but out of the scope of this study on sequential effects.

      2) The current analysis of history effects may be confounded by effects of the motor responses (independently from the correct response), e.g. a tendency to repeat motor responses instead of (or on top of) tracking the distribution of stimuli.

      We thank Reviewer #3 for pointing out the possibility that subjects may have a tendency to repeat motor responses that is not related to their inference.

      We note that in Urai et al., 2017, as in many other sensory 2AFC tasks, successive trials are independent: the stimulus at a given trial is a random event independent of the stimulus at the preceding trial; the response at a given trial should in principle be independent of the stimulus at the preceding trial; and the response at the preceding trial conveys no information about the response that should be given at the current trial (although subjects might exhibit a serial dependency in their responses). By contrast, in our task an event is more likely than not to be followed by the same event (because observing this event suggests that its probability is greater than .5); and a prediction at a given trial should be correlated with the stimuli at the preceding trials, and with the predictions at the preceding trials. In a logit model (or any other GLM), this would mean that the predictors exhibit multicollinearity, i.e., they are strongly correlated. Multicollinearity does not reduce the predictive power of a model, but it makes the identification of parameters extremely unreliable: in other words, we wouldn’t be able to confidently attribute to each predictor (e.g., the past observations and the past responses) a reliable weight in the subjects’ decisions. Furthermore, our study shows that past stimuli can yield both attractive and repulsive effects, depending on the exact sequence of past observations. To capture this in a (generalized) linear model, we would have to introduce interaction terms for each possible past sequence, resulting in a very high number of parameters to be identified.

      However, this does not preclude the possibility that subjects may have a motor propensity to repeat responses. To take this hypothesis into account, we examined the behavior, and the ability to capture subjects’ data, of models in which the response-selection strategy allows for the possibility of repeating, or alternating, the preceding response. Specifically, we consider models that are identical to those in our study, except for the response-selection strategy, which is an extension of the generalized probability-matching strategy, in which a parameter η, greater than -1 and lower than 1, determines the probability that the model subject repeats its preceding response, or conversely alternates and chooses the other response. With probability 1-|η|, the model subject follows the generalized probability-matching response-selection strategy (parameterized by κ). With probability |η|, the model subject repeats the preceding response, if η > 0, or chooses the other response, if η < 0. We included the possibility of an alternation bias (negative η), but we find that no subject is best-fitted by a negative η, thus we focus on the repetition bias (positive η). We fit the models by maximizing their likelihoods, and we compared, using the Bayesian Information Criterion (BIC), the quality of their fit to that of the original models that do not include a repetition propensity.
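
      The extended response-selection rule described above can be sketched as follows. The mixture over η follows the description in the text. The specific form used for generalized probability matching, p^κ / (p^κ + (1-p)^κ), is an assumption made for this illustration, since the formula is not restated here; the function names are hypothetical as well.

```python
def prob_predict_a(p_a, kappa):
    """Assumed generalized probability-matching rule.

    p_a is the model subject's inferred probability of event A.
    kappa = 1 gives plain probability matching; larger kappa pushes the
    choice probability toward the more likely event.
    """
    return p_a ** kappa / (p_a ** kappa + (1.0 - p_a) ** kappa)


def response_prob_a(p_a, kappa, eta, previous_response):
    """Probability of predicting 'A' given the preceding response ('A' or 'B').

    With probability 1 - |eta| the generalized probability-matching rule is
    followed; with probability |eta| the previous response is repeated
    (eta > 0) or the other response is chosen (eta < 0).
    """
    if not -1.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between -1 and 1")
    base = prob_predict_a(p_a, kappa)
    if eta >= 0:
        forced = 1.0 if previous_response == "A" else 0.0  # repeat
    else:
        forced = 0.0 if previous_response == "A" else 1.0  # alternate
    return (1.0 - abs(eta)) * base + abs(eta) * forced
```

      For instance, with p_a = 0.7, κ = 1 and η = 0.2, the probability of predicting A is about 0.76 after a response A and about 0.56 after a response B, compared with 0.7 when η = 0.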

      Taking into account the repetition bias of subjects leaves the assignment of subjects into two families of inference cost mostly unchanged. We find that for 26% of subjects the introduction of the repetition propensity does not improve the fit (as measured by the BIC) and can therefore be discarded. For 47% of subjects, the fit is better with the repetition propensity (lower BIC), and the best-fitting inference model (i.e., the type of cost, precision or unpredictability, and the Markov order) is the same with or without repetition propensity. Thus for 73% (=26+47) of subjects, allowing for a repetition propensity does not change the inference model. We also find that the best-fitting parameters λ and κ, for these subjects, are very stable, when allowing or not for the repetition propensity. For 11% of subjects, the fit is better with the repetition propensity, and the cost type of the inference model is the same (as without the repetition propensity), but the Markov order changes. For the remaining 16%, both the cost type and the Markov order change.
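
      The model comparison used here can be sketched with the standard definition of the BIC, k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of trials and L the maximized likelihood. The log-likelihood values and trial count below are hypothetical numbers chosen only to illustrate the comparison; they are not results from the study.

```python
import math


def bic(log_likelihood, n_params, n_trials):
    """Bayesian Information Criterion; lower values indicate a better fit."""
    return n_params * math.log(n_trials) - 2.0 * log_likelihood


def repetition_improves_fit(loglik_without, loglik_with, n_params_without, n_trials):
    """True if adding one parameter (eta) lowers the BIC."""
    bic_without = bic(loglik_without, n_params_without, n_trials)
    bic_with = bic(loglik_with, n_params_without + 1, n_trials)
    return bic_with < bic_without


# Hypothetical example: 1000 trials, 2 parameters without eta.
small_gain = repetition_improves_fit(-600.0, -598.0, 2, 1000)
large_gain = repetition_improves_fit(-600.0, -595.0, 2, 1000)
```

      Under this definition, the extra parameter is retained only when the gain in maximized log-likelihood exceeds half of ln(n), which is about 3.45 for 1000 trials.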

      Thus for a majority of subjects, the BIC is improved when a repetition propensity is included, suggesting that there is indeed a tendency to repeat responses, independent of the subjects’ inference process and generative stimulus probability. In Figure 7, in Methods, we show the behavior of the models without repetition propensity, and with repetition propensity, with a parameter η = 0.2 close to the average best-fitting value of η across subjects. We show, in Methods, that (i) the unconditional probability of a prediction A, p(A), is the same with and without repetition propensity, and that (ii) the conditional probabilities p(A|A) and p(A|B) when η≠0 are weighted means of the unconditional probability p(A) and of the conditional probabilities when η=0 (see p. 47-49 of the revised manuscript).

      In summary, our results suggest that a majority of subjects do exhibit a propensity to repeat their responses. Most subjects, however, are best-fitted by the same inference model, with or without repetition propensity, and the parameters λ and κ are stable, across these two cases; this speaks to the robustness of our model fitting. We conclude that the models of inference under a cost capture essential aspects of the behavioral data, which does not exclude, and is not confounded by, the existence of a tendency, in subjects, to repeat motor responses.

      In the revised manuscript, we present this analysis in Methods (p.47-49), and we refer to it in the main text (l. 353-356 and 400-406).

      3) The authors assume that subjects should reach their asymptotic behavior after passively viewing the first 200 trials but this should be assessed in the data rather than hypothesized. Especially since the subjects are passively looking during the first part of the block, they may well pay very little attention to the statistics.

      The assumption that subjects reach their asymptotic behavior after being presented with 200 observations in the passive trials should indeed be tested. To that end, we compared the behavior of the subjects in the first 100 active trials with their behavior in the remaining 100 active trials. The results of this analysis are shown in Figure 9.

      For most values of the stimulus generative probability, the unconditional proportions of predictions A, in the first and the second half (panel a, solid and dashed gray lines), are not significantly different (panel a, white dots), except for two values (p-value < 0.05; panel a, filled dots). Although in most cases the difference between the two is not significant, in the second half the proportions of prediction A seem slightly closer to the extremes (0 and 1), i.e., closer to the optimal proportions. As for the sequential effects, they appear very similar in the two halves of trials. We conclude that for the purpose of our analysis we can reasonably consider that the behavior of the subjects is stationary throughout the task.

      4) The experiment methods are described quite poorly: when is the feedback provided? What is the horizontal bar at the bottom of the display? What happens in the analysis with timeout trials and what percentage of trials do they represent? Most importantly, what were the subjects told about the structure of the task? Are they told that probabilities change over blocks but are maintained constant within each block?

      We thank Reviewer #3 for her/his close attention to the details of our experiment. Here are the answers to the reviewer’s questions:

      • The feedback (i.e., a lightning strike on the left or the right rod, with the rod and the battery turning yellow if the strike is on the side predicted by the subject) is immediate, i.e., it is provided right after the subject makes a prediction, with no delay. We now indicate this in the caption of Figure 1.

      • The task is presented to the subjects as a game in which predicting the correct location of the lightning strike results in electric power being collected in the battery. The horizontal bar at the bottom of the display is a gauge that indicates the amount of power collected in the current block of trials. It has no operational value in the task. We now mention it in the Methods section (l. 872-874).

      • The timeout trials were not included in the analysis. The timeout trials represented 1.27% of the trials, on average (across subjects); and for 95% of the subjects the timeout trials represented less than 2.5% of the trials. This information was added in Methods (l. 887-889).

      • Each new block of trials was presented to the subject as the lightning strikes occurring in a different town. The 200 passive trials at the beginning of each block, in which subjects were asked to observe a sequence of 200 strikes, were presented as the ‘track record’ for that town, and the instructions indicated that it was ‘useful’ to know this track record. No information was given on the mechanism governing the locations of the strikes. In the main text of the revised manuscript, we now include these details when describing the task (p. 6).

    1. Author Response

      Reviewer #1 (Public Review):

      Sun et al. investigated the circuit mechanism of a novel type of synaptic plasticity in the projection from the visual cortex to the auditory cortex (VC-AC), which is thought to play an important role in visuo-auditory associative learning. The key question behind this paper is what is the role of CCK positive projection from the entorhinal cortex in the plasticity of VC-AC projections? They discover that the strength of VC-AC projections does not change when pairing the stimulation of this pathway with the acoustic stimulation of the auditory cortex (AC) unless CCK is applied to the AC or CCK positive projection from the entorhinal cortex to auditory cortex (EC-AC) is optogenetically stimulated. In contrast, optogenetically stimulating VC-AC projections, which express a lower level of CCK than the EC-AC projection, do not induce such synaptic plasticity. Interestingly, the data also indicates that even if the EC-AC pathway is stimulated 500ms ahead of the pairing of stimulating VC-AC pathway and the AC, the VC-AC synaptic strength can still be potentiated, consistent with the long-lasting nature of CCK as a neuropeptide. By performing a fear conditioning assay, the authors demonstrate that the CCK signaling is indeed required for the association of visual and auditory cues.

      The proposed mechanism is interesting because it not only helps explain the heterosynaptic plasticity of the visual-auditory projection but also will provide insight into how the entorhinal cortex as an association area contributes to the association of visual and auditory cues. Nevertheless, this study suffers from the lack of a few key experiments, which prevents drawing a conclusion on the contribution of CCK release from the EC-AC projection to the plasticity of the VC→AC projection.

      We are grateful for the constructive comments provided by the reviewers and appreciate the significant effort they have dedicated to reviewing our manuscript. To enhance our study and strengthen our conclusions, we have made the following revisions in response to their feedback.

      1) One main conclusion from figures 1-3 is that CCK released from the EC-AC projection is required for the plasticity of VC-AC projection in addition to pairing VALS with noise/electrical stimulation. But the data in those figures cannot exclude alternative explanations that CCK alone or the pairing CCK with either VALS or noise are sufficient to make the VC-AC synaptic connection more potent. It concerns the mechanism underlying the effect of CCK: CCK may function simply as a neuromodulator to regulate the excitatory synaptic transmission, but not to promote long term synaptic plasticity.

      Thank you for the valuable comment and for pointing out this weakness. In response, we have conducted additional control experiments to reinforce our conclusions. For Figure 1G, we introduced three control groups: CCK alone (Figure 1-figure supplement 1F-G), CCK + presynaptic activation of VC-to-AC inputs (Figure 1-figure supplement 1H-I), and CCK + postsynaptic firing induced by noise (Figure 1-figure supplement 1J-K). Our findings from these control experiments indicate that in all three scenarios, there was no potentiation of the VC-to-AC inputs. Further details can be found in Figure 1-figure supplement 1F-K.

      For Figure 2E, we introduced three control groups: HFS laser EC-to-AC alone (Figure 2-figure supplement 1H-I), HFS laser EC-to-AC + presynaptic activation of VC-to-AC inputs (Figure 2-figure supplement 1L-M), and HFS laser + postsynaptic firing induced by noise (Figure 2-figure supplement 1P-Q). We found that in all three scenarios, the VC-to-AC inputs were not significantly potentiated. Please see details in Figure 2-figure supplement 1.

      Given that our in vivo results already demonstrated that neither HFS laser EC-to-AC alone, nor its combination with presynaptic or postsynaptic activation, potentiated the VC-to-AC inputs, we did not replicate these control groups in our ex vivo setup. These additional experiments enhance the robustness of our findings and address the initial concerns raised.

      2) Similar issue exists in Fig. 2H and 3J. Without proper controls, it is impossible to tell whether all three conditions (HFLSEA, VALA, noise/electrical stimulation) are necessary for potentiated AC responses to acoustic/electrical stimulation.

      As above, we have conducted additional control experiments to reinforce our conclusions. These include:

      For Figure 2H, we also tested the noise response in the above three control groups: HFS laser EC-to-AC alone (Figure 2-figure supplement 1J-K), HFS laser EC-to-AC + presynaptic activation of VC-to-AC inputs (Figure 2-figure supplement 1N-O), and HFS laser + postsynaptic firing induced by noise (Figure 2-figure supplement 1R-S). We found that fEPSPs evoked by noise stimuli were significantly potentiated after HFS laser EC-to-AC + Post (Figure 2-figure supplement 1R-S). However, there was no potentiation observed following HFS laser EC-to-AC alone (Figure 2-figure supplement 1J-K) or HFS laser EC-to-AC + Pre (Figure 2-figure supplement 1N-O).

      These results suggest that both HFS laser targeting the EC-to-AC projection and noise-induced AC firing are required to potentiate the AC's response to acoustic stimuli. In contrast, activation of the VC-to-AC projection is not necessary. This finding aligns with our previous research (Li et al., 2014).

      Given the similarity in experimental design, we opted not to replicate these specific control groups in our ex vivo setup.

      These additional control experiments have been crucial in reinforcing the conclusions of our study.

      3) Fig. 2E and 3G show that the stimulation of CCK-positive EC-AC projection is required for the plasticity of VC-AC projection. Considering most EC-AC projection neurons co-release glutamate and CCK, however, we cannot tell if CCK or glutamate or both matter to this type of plasticity. Even though the long delay in Fig 5B is consistent with the neuropeptide nature of CCK, direct experimental evidence is needed, since it is where the novelty of the paper is.

      Thank you for your constructive feedback. In response to the suggestions, for Figure 2E, we have incorporated two additional experiments: one with a CCKB receptor (CCKBR) antagonist and another with ACSF infused into the AC prior to HFS laser EC-to-AC + Pre/Post Pairing (Figures 2N-P). Our findings demonstrate that the CCKBR antagonist effectively inhibited the potentiation of the VC-to-AC inputs following the HFS laser EC-to-AC + Pre/Post Pairing. Conversely, ACSF did not exhibit this inhibitory effect. For further information, please refer to Figures 2N-P. Given the similarity in experimental design, we opted not to replicate these groups in our ex vivo setup.

      4) In Fig. 6, the authors examined the necessity of CCK for the generation of the visuo-auditory association. The experimental approach of injection CCK receptor blocker or CCK-4 is not specific to the EC-AC pathway. There is neither a link between VC-AC plasticity nor this behavioral result. Thus, the explanatory power of this experiment is limited in the context set up by the first 5 figures.

      Thank you for highlighting this area for improvement. To enhance the explanatory power of our behavioral experiments, we conducted the following additional studies:

      1) Assessing the Necessity of CCK+ EC-to-AC Projection in Establishing Visuo-Auditory Association:

      We bilaterally injected AAV9-syn-DIO-hM4Di-eYFP or AAV9-syn-DIO-eYFP into the EC and implanted cannulae in the AC of Cck Ires-Cre mice. During the encoding phase, we inactivated the CCK+ EC-to-AC pathway via CNO infusion into the AC. Our results show that this inactivation prevents the behavioral establishment of an association between the visual stimulus (VS) and auditory stimulus (AS), without affecting the fear conditioning memory to the AS (Figure 6B, beige).

      2) Determining the Role of VC-to-AC Projection in Establishing Visuo-Auditory Association: We bilaterally injected AAV9-syn-hM4Di-eYFP or AAV9-syn-eYFP into the visual cortex (VC) and also implanted cannulae in the AC of Cck Ires-Cre mice. Inactivating the VC-to-AC pathway during the encoding phase with CNO infusion in the AC, we observed that this inactivation hinders the establishment of a behavioral association between VS and AS, but does not interfere with the fear conditioning memory to the AS (Figure 6B, red).

      3) Investigating the Importance of CCK+ EC-to-AC Projection in Recalling Recent Visuo-Auditory Association:

      Again, AAV9-syn-DIO-hM4Di-eYFP or AAV9-syn-DIO-eYFP was injected bilaterally into the EC, and cannulae were implanted in the AC of Cck Ires-Cre mice. By inactivating the CCK+ EC-AC pathway during the retrieval phase with CNO infusion into the AC, we found that such inactivation disrupted the recall of the recent association between VS and AS behaviorally, yet did not affect the fear conditioning memory to the AS (Figure 6D, beige).

      4) Assessing the Necessity of VC-to-AC Projection in Recalling Recent Association Memory: For this experiment, AAV9-syn-hM4Di-eYFP or AAV9-syn-DIO-eYFP was injected bilaterally into the VC, and cannulae were placed in the AC of Cck Ires-Cre mice. Inactivating the VC-AC pathway during the retrieval phase with CNO infusion in the AC led to the discovery that this inactivation disrupted the behavioral recall of the recent association between VS and AS but did not disrupt the fear conditioning memory to the AS (Figure 6D, red).

      These additional experiments significantly contribute to our understanding of the roles played by the CCK+ EC-AC and VC-AC projections in both the establishment and recall of visuo-auditory associative memories.

      5) In page 16, line 322-326, the authors concluded that to induce the plasticity of VC→AC projection, Delay 1 should be longer than 10 ms and Delay 2 should be longer than 0 ms. This conclusion was not fully supported by the data from Figure 5B-D, because there is no data point between -65 ms and 10 ms for Delay 1 (for example 0 ms), and no negative values for Delay 2.

      We rewrote this paragraph and hope it is more accurate now.

      “Taken together, our study indicates that significant potentiation of the VC-to-AC inputs can be observed (Figure 5D, black cube) across five pairing trials with a 10-second inter-trial interval, under certain tested conditions: (i) the frequency of repetitive laser stimulation of the CCK+ entorhinal cortex (EC) to AC projection was maintained at 10 Hz or higher (as we did not test frequencies between 1 to 10 Hz), (ii) Delay 1 was set within the tested range of 10 to 535 ms (noting the absence of data between -65 to 10 ms), and (iii) Delay 2 was within the range of 0 to 200 ms (acknowledging that negative values for Delay 2 were not explored).”

      Reviewer #2 (Public Review):

      The manuscript by Sun et al., investigates the synaptic plasticity underlying visuo-auditory association. Through a series of in vivo and ex vivo electrophysiology recordings, the authors show that high-frequency stimulation (HFLS) of the cholecystokinin (CCK) positive neurons in the entorhino-auditory projection paired with an auditory stimulus can evoke long-term potentiation (LTP) of the visuo-auditory projection. However, LTP of the visuo-auditory projection could not be elicited by HFLS of the visuo-auditory projection itself or by an unpaired stimulus. They further demonstrate that auditory stimulus pairing with CCK is required to elicit LTP of the visuo-auditory projection as well as visuo-auditory association in a fear conditioning behavioral experiment. As they found elevated expression of CCK in entorhinal neurons which project to the auditory cortex, they conclude that HFLS of the entorhino-auditory projection causes CCK release.

      Strengths:

      The authors use an elegant approach with Chrimson and Chronos to stimulate different auditory inputs in the same mouse in vivo and also in slice and demonstrate that potentiation of the visuo-auditory projection is dependent on HFLS of the entorhino-auditory projection paired with auditory stimulus. Furthermore, they test several parameters in a systematic fashion, generating a comprehensive analysis of the plasticity changes that regulate visuo-auditory association.

      Weaknesses:

      In their previous publications (Chen et al., 2019; Li et al., 2014; Zhang et al., 2020), it has been established that HFLS of the entorhino-auditory projection and CCK release are important for visuo-auditory association via electrophysiology and behavioral experiments. The Chrimson and Chronos approach was applied by Zhang et al., 2020, where they already found that the visuo-auditory projection was potentiated through HFLS of entorhino-neocortical fibers. This manuscript extends those findings by testing different parameters of pairing, which may not represent a major conceptual advance. Unlike the electrophysiological recordings, drug infusion is used in behavioral manipulations to show that HFLS of the entorhino-auditory projection is important for visuo-auditory association. While the use of drugs to inhibit CCK receptors is important, it does not directly demonstrate that CCK release from the entorhino-auditory projection is necessary.

      We deeply appreciate the reviewer's constructive and insightful feedback. Building on our previous work (Zhang et al., 2020), which highlighted the potentiation of the VC-to-AC projection through high-frequency laser stimulation (HFS laser) of entorhino-neocortical fibers, our current study probes further into the intricacies of this process. We have thoroughly explored the specific conditions necessary for the potentiation of the VC-to-AC projection, assessing a wide range of parameters.

      A significant advancement in our current research is the elucidation of why HFS of the VC-to-AC pathway alone fails to induce potentiation, whereas HFS of the EC-to-AC pathway, coupled with Pre/Post Pairing, is effective. This critical distinction is linked to the heightened expression of CCK in EC neurons projecting to the AC, in contrast to those from the VC. In this revised version of our study, we have also demonstrated that HFS laser stimulation of the EC-to-AC CCK+ projection induces the release of endogenous CCK in the AC using a combination of a CCK sensor and fiber photometry.

      Behaviorally, our revised research emphasizes the vital role of the CCK+ EC-AC projection in both establishing and retrieving visuo-auditory memories, thereby highlighting its fundamental importance in memory processing. Moreover, our study confirms that the CCK+ EC-AC projection is not only crucial for memory formation and retrieval but also indicates that the VC-to-AC projection is the anatomical basis for establishing visuo-auditory associations and serves as the principal storage site for visuo-auditory associative memory. These findings represent significant strides in our understanding of synaptic plasticity and memory mechanisms.

      For the behavioral part, to build the link that HFS laser of the EC-to-AC CCK+ projection is important for visuo-auditory association in the behavioral context, we conducted the following additional behavioral studies (for details please see the response to comment 4 of reviewer 1):

      1) Assessing the Necessity of CCK+ EC-to-AC Projection in Establishing Visuo-Auditory Associative memories, by inactivating the pathway with inhibitory DREADD during the encoding phase.

      2) Investigating the Importance of CCK+ EC-to-AC Projection in Recalling Visuo-Auditory Association, by inactivating the pathway with inhibitory DREADD during the retrieving phase.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper combines an array of techniques to study the role of cholecystokinin (CCK) in motor learning. Motor learning in a pellet reaching task is shown to depend on CCK, as both global and locally targeted CCK manipulations eliminate learning. This learning deficit is linked to reduced plasticity in the motor cortex, evidenced by both slice recordings and two-photon calcium imaging. Furthermore, CCK receptor agonists are shown to rescue motor cortex plasticity and learning in knockout mice. While the behavioral results are clear, the specific effects on learning are not directly tested, nor is the specificity pathway between rhinal CCK neurons and the motor cortex. In general, the results present interesting clues about the role of CCK in motor learning, though the specificity of the claims is not fully supported.

      Since all CCK manipulations were performed throughout learning, rather than after learning, it is not clear whether it is learning that is affected or if there is a more general motor deficit. Related to this point, Figure 1D appears to show a general reduction in reach distance in CCK-/- mice. A general motor deficit may be expected to produce decreased success on training day 1, which does not appear to be the case in Figure 1C and Figure 2B, but may be present to some degree in Figure 5B. Or, since the task is so difficult on day 1, a general motor deficit may not be observable. It is therefore inconclusive whether the behavioral effect is learning-specific.

      Thanks for your comments and suggestions.

      We tested the basic movement ability of CCK-/- and WT mice and found no significant difference between CCK-/- and WT in terms of stride length, stride time, step cycle ratio and grasp force (Figure S1C, S1D, S1E, S1F). In addition, we tested the performance of mice injected with a CCKBR antagonist, or injected with hM4Di together with clozapine, after they had learned the task (Figure S2D, S8D). The performance of mice before and after antagonist injection or chemogenetic manipulation was comparable. These results suggest that none of the CCK manipulations caused general deficits in the movement ability of the mice.

      The paper implicates motor cortex-projecting CCK neurons in the rhinal cortex as being a key component in motor learning. However, the relative importance of this pathway in motor learning is not pinned down. The necessity of CCK in the motor cortex is tested by injecting CCK receptor antagonists into the contralateral motor cortex (Figure 2), though a control brain region is not tested (e.g. the ipsilateral motor cortex), so the specificity of the motor cortex is not demonstrated.

      Thanks for your comments and suggestions.

      In this study, we focus on the role played by CCK from the rhinal cortex to the motor cortex, and how CCK affects motor learning. The single pellet reaching task was selected to study the role of CCK from the rhinal cortex to the motor cortex in motor skill learning, and the motor cortex is considered the main area that generates motor memory during training in this task (Komiyama et al., 2010; Peters et al., 2014; Richard et al., 2019). In emphasizing the importance of the motor cortex in motor learning, we do not mean that other brain areas that also receive CCK-positive neural projections from the rhinal cortex, for example the hippocampus (spatial memory), are unimportant for the performance of this task. In fact, specifically inhibiting the projection from the rhinal cortex to the contralateral motor cortex is not enough to suppress the motor learning ability of mice, but inhibiting the projections to both sides (contra- and ipsilateral) does suppress it, suggesting that the whole motor cortex is critical for motor skill learning (Figure 6, S8). In this paper, we studied the relationship between the rhinal cortex and the motor cortex and the role played by CCK in this circuit. The specificity of the motor cortex is task-dependent and is not the main purpose of this study.

      The learning-related source of CCK in the motor cortex is also unclear, since even though it is demonstrated that CCK neurons in the rhinal cortex project to the motor cortex in Figure 4D, Figure 4C shows that there is also a high concentration of CCK neurons locally within the motor cortex. Likewise, the importance of the projection from the rhinal cortex to the motor cortex is not specifically tested, as rhinal CCK neurons targeted for inactivation in Figure 5 include all CCK cells rather than motor cortex-projecting cells specifically.

      Thanks for your comments and suggestions.

      The specificity of the CCK projection from the rhinal cortex to the motor cortex for motor skill learning was studied using chemogenetic methods in the revised version of the manuscript. We first determined that over 98% of neurons in the rhinal cortex that project to the motor cortex are CCK positive (Figure 6A, S6A, S6B). Next, we injected the retro-Cre virus in the motor cortex and the Cre-dependent hM4Di in the rhinal cortex in C57BL/6 mice to specifically inhibit the CCK neurons projecting from the rhinal cortex to the motor cortex. Compared to two control groups, the learning ability of the experimental group was significantly suppressed, suggesting that CCK projections from the rhinal cortex to the motor cortex are critical for motor skill learning (Figure 6). A detailed description was added to the Results section of the manuscript.

      CCK is suggested to play a role in producing reliable activity in the motor cortex through learning through two-photon imaging experiments. This is useful in demonstrating what looks like normal motor cortex activity in the presence of CCK receptor antagonist, indicating that the manipulations in Figure 2 are not merely shutting off the motor cortex. It is also notable that, as the paper points out, the activity appears less variable in the CCK manipulations (Figure 3G). However, this could be due to CCK manipulation mice having less-variable movements throughout training. The Hausdorff distance is used for quantification against this point in Figure 1E, though the use of the single largest distance between trajectories seems unlikely to give a robust measure of trajectory similarity, which is reinforced by the CCK-/- traces looking much less variable than WT traces in Figure 1D. The activity effects may therefore be expected from a general motor deficit if that deficit prevented the mice from normal exploratory movements and restricted the movement (and activity) to a consistently unsuccessful pattern.

      Thanks for your comments and suggestions.

      To fully suppress CCK receptors in the motor cortex, some diffusion of the antagonist into adjacent brain areas is unavoidable, because the motor cortex is not regularly circular. However, the area inhibited most should be the motor cortex. We applied the chemogenetic method to further determine the specificity of the motor cortex in motor skill learning. The specific projection from the RC to the MC was inhibited bilaterally, which suppressed motor learning ability.

      In a wild-type mouse, neurons are activated when it tries to get the food pellet. The neuronal pattern corresponding to each trial is remembered, and the patterns corresponding to successful movements tend to be repeated. Manipulations of CCK prevented neurons from remembering the patterns they had tried, so the mice repeated earlier patterns regardless of whether they were successful. This corresponds to the neuronal activation patterns shown in Figures 3D, 3E and 3G: the population activities (neuronal activities) are comparable, while the trial-to-trial population correlation is slightly higher for the CCK-manipulation groups on Day 1. In terms of behavior, manipulations of CCK reduced the likelihood of exploring the best path to the food pellet, and the mice simply repeated a reach as if it were the first time. In addition, several tests, including the movement ability of CCK-/- mice and the post-learning performance of the antagonist injection group and the chemogenetic manipulation group, indicated that CCK manipulation did not affect basic movement ability.

      The Hausdorff distance is the greatest of all the distances from a point in one set to the closest point in the other set. It is therefore not simply the largest distance between two trajectories, but takes all points in each trajectory into consideration. The Hausdorff distance is widely used to assess the variation between two trajectories. We did not analyze the similarity of trajectory shapes because it is not very effective for assessing the performance of a mouse. Because the locations of the initial site and the food site are fixed, all trajectories are single lines in the same direction, so the shapes of the trajectories are very similar across trials. Two trajectories with similar shapes that lie far from each other (large Hausdorff distance) should be treated as highly variable because, in terms of the final result, they are quite different (success vs. miss). Therefore, the Hausdorff distance is a more reliable measure for assessing the performance of mice.
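
      As an illustration of the definition above, the following minimal Python sketch computes a symmetric Hausdorff distance between two point sets. The trajectories `traj_a`, `traj_b` and `traj_c` are made-up coordinates, not data from the study, and the function names are hypothetical; it is a sketch of the general definition rather than the analysis code used for Figure 1E.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical reach trajectories sampled as (x, y) points.
traj_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
# Same shape as traj_a, shifted 3 units away (e.g. a reach that misses).
traj_b = [(x, y + 3.0) for x, y in traj_a]
# Same shape as traj_a, shifted only 0.5 units (a nearby reach).
traj_c = [(x, y + 0.5) for x, y in traj_a]

far_distance = hausdorff(traj_a, traj_b)
near_distance = hausdorff(traj_a, traj_c)
print(far_distance, near_distance)
```

      Here `traj_b` has exactly the same shape as `traj_a` but lies farther away, so its Hausdorff distance to `traj_a` is larger than that of the nearby `traj_c`. This is the sense in which the measure treats similarly shaped but distant trajectories as different.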

      Finally, slice experiments are used to demonstrate the lack of LTP in the motor cortex following CCK knockout, which is rescued by CCK receptor agonists. This is a nice experiment with a clear result, though it is unclear why there are such striking short-term depression effects from high-frequency stimulation observed in Figure 6A that are not observed in Figure 1H. Also, relating to the specificity of the proposed rhinal-motor pathway, these experiments do not demonstrate the source of CCK in the motor cortex, which may for example originate locally.

      Thanks for your comments.

      1. Because CCK4 is a small molecule that degrades very quickly, with a half-life of less than 1 min in rat serum and 13 min in human serum, we injected the drug into the recording dishes while the ACSF flow was stopped, leading to a relatively low-oxygen condition. As shown in Figure 6A, it took about 15 min for the brain slices to recover. The depression in the vehicle group was stronger than in the CCK4 group, which could be because CCK4-induced LTP after HFS compensated for the depression.

      2. In the motor cortex, many CCK-positive neurons are γ-aminobutyric acid-ergic (GABAergic) neurons, in which the role played by CCK is not very clear (Whissell et al., 2015). However, evidence shows that GABA may inhibit the release of CCK in the neocortex (Yaksh et al., 1987). Many glutamatergic neurons in the neocortex also express CCK (Watakabe et al., 2012). In this study, the stimulation electrode was placed in layer 1, which receives most CCK projections from the rhinal cortex, to release CCK from rhinal cortex terminals, but we cannot rule out the possibility that some CCK was released from local CCK neurons (Figure 4B). We focused on the importance of CCK for neural plasticity in the motor cortex; this experiment did not aim to determine the role played by cortical CCK-positive neurons, including inhibitory and excitatory neurons, in neuronal plasticity and motor skill learning.

      Therefore, the specificity of the projections from the rhinal cortex to the motor cortex was further studied by chemogenetic manipulation. Inhibiting the activity of the projections suppressed learning ability compared with two types of control manipulations, indicating that the CCK projections from the RC to the MC are critical for motor skill learning.

      Reviewer #2 (Public Review):

      This study aims to test whether and if so, how cholecystokinin (CCK) from the mice rhinal cortex influences neural activity in the motor cortex and motor learning behavior. While CCK has been previously shown to be involved in neural plasticity in other brain regions/behavioral contexts, this work is the first to demonstrate its relationship with motor cortical plasticity in the context of motor learning. The anatomical projection from the rhinal cortex to the motor cortex is also a novel and important finding and opens up new opportunities for studying the interactions between the limbic and motor systems. I think the results are convincing to support the claim that CCK and in particular CCK-expressing neurons in the rhinal cortex are critical for learning certain dexterous movements such as single pellet reaching. However, more work needs to be done, or at least the following concerns should be addressed, to support the hypothesis that it is specifically the projection from the rhinal cortex to the motor cortex that controls motor learning ability in mice.

      1)Because CCK is expressed in multiple brain regions, as the authors recognized, results from the CCK knock-out mice could be due to a global loss of neural plasticity. In comparison, the antagonist experiment is in my opinion the most convincing result to support the specific effect of CCK in the motor cortex. However, it is unclear to me whether the CCK knock-out mice exhibited an impaired ability to learn in general, i.e., not confined to motor skills. For instance, it would be very valuable to show whether these mice also had severe memory deficits; this would help the field to understand different or similar behavioral effects of CCK in the case of global vs. local loss of function. If the CCK knock-out mice only exhibited motor learning deficits, that would be surprising but also very interesting given previous studies on its effect in other brain areas.

      Thanks for your comments. In previous studies from our lab, we found that CCK is critical for neural plasticity in the auditory cortex, hippocampus and amygdala, and that CCK-/- mice performed much worse than wild-type mice in associative, spatial and fear memory (Li et al., 2014; Chen et al., 2019; Su et al., 2019; Feng et al., 2021).

      2) Related to my last point, I believe that normal neural plasticity should be essential to motor skill learning throughout development not just during the current task. Thus, it would be important to show whether these CCK knock-out mice present any motor deficits that could have resulted from a lack of CCK-mediated neural plasticity during development. If not, the authors should explain how this normal motor learning during development is consistent with their major hypothesis in this study (e.g., is CCK not critical for motor learning during early development).

      Thanks for your comments and suggestions.

      Development is mainly gene-guided and prepares the physical structure for learning, while learning depends on neural plasticity and a period of experience (such as the motor training in this research). Moreover, development is regarded as "experience-expectant", using common environmental information, while learning is "experience-dependent", sensitive to specific individual experiences (Greenough et al., 1987; Galván, 2010). Development also generally takes longer to form a species-specific ability. The role CCK plays in development is not clear. Duchemin et al. (1987) studied CCK gene expression levels in the rat brain pre- and postnatally. They found that CCK mRNA was detectable on embryonic day 14 (E14) and gradually increased to its maximum level on postnatal day 14 (P14), indicating that CCK might participate in the development of rats. Paolo et al. (2007) mapped the expression of CCK in the mouse brain. Plentiful CCK expression was observed at E12.5 in the thalamus and spinal cord, and by E17.5 CCK expression extended to the cortex, hippocampus and hypothalamus, suggesting that CCK might also regulate the development of mice. Paolo et al. (2004) found that CCK suppressed the migration of GnRH-1 through the CCK-A receptor in the brain. In addition, early postnatal learning may participate in development. Administration of a CCK-B receptor antagonist (6 hours postnatally) prevented infant sheep from acquiring a mother preference, indicating that CCK might be important for the development of mother preference in sheep. However, the role CCK plays in the development of the motor system is not known.

      In this study, the performance of CCK-/- and WT mice was at the same level on Day 1, with no significant difference in the percentages of "miss", "no-grasp", "drop" and "success". In addition, movement abilities, including stride length, stride time, step cycle ratio and grasp force, were comparable between CCK-/- and WT mice (Figure S1C, S1D, S1E, S1F), suggesting that knockout of the cck gene did not affect basic movement ability. This could be because the development of basic movement ability is not learning-guided but determined by physical structure. However, all these tests were at the physical level, and how CCK affects the motor system at the molecular and cellular levels is not known. Therefore, we further applied a CCK-BR antagonist and a chemogenetic method to study the role of CCK in motor learning.

      3)Lines 198-200 and Fig. 2C: The authors found that the vehicle group showed significantly increased "no grasp" behavior, and reasoned that the implantation of a cannula may have caused injuries to the motor cortex. In order to support their reasoning and make the control results more convincing, I think it would be helpful to show histology from both the antagonist and control groups and demonstrate motor cortical injury in some mice of the vehicle group but not the antagonist group. Otherwise, I'm a bit concerned that the methods used here could be a significant confounding factor contributing to motor deficits.

      Thanks for your comments and suggestions.

      Injury to the motor cortex cannot be avoided, because the cannula was inserted below the surface of the cortex (Figure S2C). The significantly increased "no-grasp" rate in the Vehicle group arises because its miss rate improved: former misses turned into "no-grasp" outcomes but failed to improve further to "drop" or "success". In the Antagonist group, there was no significant improvement from "miss" to "no-grasp", leaving the "no-grasp" rate unchanged.

      4) The authors showed that chemogenetic inhibition of CCK neurons in the rhinal cortex impaired motor skill learning in the pellet-reaching task. However, we know that the rhinal cortex projects to multiple brain regions besides the motor cortex (e.g., other cortical areas and the hippocampus). Thus, the conclusion/claim that the observed behavioral deficits resulted from inhibited rhinal-motor cortical projections is not strongly supported without more targeted loss-of-function or rescue experiments.

      It would also be very informative to the field to compare the specific behavioral deficits, if any, of inhibiting specific downstream targets of the rhinal CCK neurons. As a concrete example, the hippocampus may be involved in learning more sophisticated motor skills (as the authors pointed out in the Discussion) besides the motor cortex. It would be a critical result if the authors could either show or exclude the possibility that the motor learning deficits observed in CCK-/- mice were at least partially due to the inhibition of hippocampal plasticity. This echoes my earlier point (point 1) that it is unclear whether the effect of lacking CCK in knock-out mice is specific in the motor cortex or engages multiple brain regions.

      Lastly, because Fig. 4 only showed histology in the rhinal and motor cortices, I am not sure whether the motor cortex solely receives CCK input from the rhinal cortex. A more comprehensive viral tracing result could be important to both supporting the circuit-specificity of the observed behavior in this study and providing a clearer picture of where the motor cortex receives CCK inputs.

      Thanks for your comments.

      The specificity of the CCK projection from the rhinal cortex to the motor cortex for motor skill learning was studied using chemogenetic methods in the revised version of the paper. We first determined that over 98% of the neurons in the rhinal cortex that project to the motor cortex are CCK positive (Figure 6A, S6A, S6B). Next, we injected the retro-Cre virus into the motor cortex and the Cre-dependent hM4Di into the rhinal cortex of C57BL/6 mice to specifically inhibit the CCK neurons projecting from the rhinal cortex to the motor cortex. Compared with the two control groups, the learning ability of the experimental group was significantly suppressed, suggesting that CCK projections from the rhinal cortex to the motor cortex are critical for motor skill learning (Figure 6). A detailed description was added to the "Results" section of the manuscript.

      In this study, we focus on the role played by CCK from the rhinal cortex and on how CCK affects motor learning. The single pellet reaching task was selected to study the role of CCK from the rhinal cortex in motor skill learning, and the motor cortex is considered the main area that generates motor memory during training in this task (Komiyama et al., 2010; Peters et al., 2014; Richard et al., 2019). Our emphasis on the importance of the contralateral motor cortex in motor learning does not mean that other brain areas that also receive CCK-positive neural projections from the rhinal cortex, for example the hippocampus (spatial memory), are unimportant for performance of this task. In fact, specifically inhibiting the projection from the rhinal cortex to the contralateral motor cortex is not enough to suppress motor learning ability, whereas inhibiting the projections on both sides (contra- and ipsilateral) does suppress the learning ability of mice, suggesting that the whole motor cortex is critical for motor skill learning (Figure 6, S8). In our lab, we found that the CCK projection from the entorhinal cortex to the hippocampus is critical for spatial memory formation (Su et al., 2019). An impaired hippocampus affected, to some extent, performance in the single pellet reaching task (Shwuhuey et al., 2007). Therefore, manipulation of CCK projections from the rhinal cortex to the hippocampus may also affect performance in the single pellet reaching task. In this paper, we aim to study the relationship between the rhinal cortex and the motor cortex and the role played by CCK in this circuit. Other brain areas involved in the single pellet reaching task are not the core concern of this study.

      The motor cortex also receives CCK projections from other cortices, such as the contralateral motor cortex and the deep layers of the visual and auditory cortices, and from the thalamus (Figure S4).

      5) I am glad to see the CCK4 rescue experiment to demonstrate the sufficiency of CCK in promoting motor learning. However, the rescue experiment lacked specificity: IP injection did not allow specific "gain of function" in the motor cortex but instead, the improved learning ability in CCK knock-out mice could be a result of a global effect of CCK4 across multiple brain regions. CCK4 injection specifically targeted at the motor cortex would be necessary to support the sufficiency of CCK-regulated neuroplasticity in the motor cortex to promote motor learning.

      Thanks for your comments.

      First, the specificity of the circuit was studied by injecting a Cre virus into the MC and a Cre-dependent hM4Di virus into the RC. After injection of clozapine, motor learning ability was significantly suppressed compared with the saline control and with the control virus combined with clozapine.

      Besides, our emphasis on the importance of the motor cortex in motor learning does not mean that other brain areas that also receive CCK-positive neuronal projections from the rhinal cortex, for example the hippocampus (spatial memory), are unimportant for performance of this task. Infusing the drug specifically into the motor cortex is unlikely to rescue the motor learning ability of CCK-/- mice, because the motor cortex is very large, extending from AP: -1.3 to 2.46 mm and ML: ±0.5 to ±2.75 mm, and because other areas receiving CCK projections from the rhinal cortex could also be important for motor learning. In fact, we tried injecting CCK into the motor cortex through a drug cannula, but the results showed that this could hardly compensate for the knockout of the cck gene in the whole brain or rescue motor learning ability (Figure S11D, S11E). Moreover, cannula implantation causes unavoidable injury to the motor cortex, because the cannula must be inserted into the brain for the drug to be infused. This injury may affect performance in the task, as the motor cortex is critical for motor learning. Therefore, it is not the best method for rescuing motor skill learning.

      Furthermore, CCK4 molecules can reach the whole brain after i.p. injection, as CCK4 is able to cross the blood-brain barrier; this compensates for the knockout of the cck gene in the whole brain and rescues motor learning ability. In addition, i.p. injection is widely accepted in drug discovery because it is convenient, simple to perform and does not cause any direct injury to the brain. Thus, we applied i.p. injection not only for whole-brain CCK compensation, but also with a view to further study of its application in drug discovery.

      Reviewer #3 (Public Review):

      The authors elucidated the roles of cholecystokinin (CCK)-expressing excitatory neurons, which project from the rhinal cortex to the motor cortex, in motor skill learning. The authors found CCK knock-out mice exhibited learning defects in the pellet reaching task while the baseline success rate of the knock-out mice was similar to that of the wild-type mice. Application of a CCK B receptor (CCKBR) antagonist into the motor cortex lowered the success rate in the motor task. The authors found the population activity which was observed in the in vivo calcium imaging during motor learning was elevated after motor learning, but this increase disappeared in CCK knock-out mice and animals with CCKBR antagonist administration. Anterograde and retrograde viral tracing revealed that CCK-expressing excitatory neurons in the rhinal cortex projected to the motor cortex. Chemogenetic inhibition of the CCK-expressing neurons in the rhinal cortex lowered the ability for motor learning. The application of a CCKBR agonist increased the motor learning ability of CCK knock-out animals as well as long-term potentiation (LTP) observed in the slice of the motor cortex.

      However, the manuscript contains several shortcomings:

      First, the "Discussion" has several statements that are only supported weakly by the results, for example, ll. 429-431, ll. 432-433, and ll. 447-448. In addition, most of the sentences in this section are not divided into subsections. The paragraphs should be composed in multiple subsections with appropriate subheadings, even though the initial section summarizing the results can lack a subheading.

      Thanks for your suggestions. The statements were revised and the discussion was divided into subsections.

      Second, it would be important that the authors showed which area(s) of the brain is affected by the CCKBR antagonist in the experiments described in ll. 166-206 and Fig. 2. The authors injected the drug into the motor cortex, but the chemical can spread to neighboring cortical areas (e.g. somatosensory cortex) or wider brain regions. If so, the blockade of the CCKBR in the brain areas other than the motor cortex could cause the defects of the motor task learning observed in these experiments. I think it is desirable that such a possibility should be excluded. Conversely, it is possible that the antagonist had an effect on a limited subarea of the motor cortex (e.g. only the primary motor cortex (M1)). In this case, the information about the field altered by the CCKBR blocker would be useful to interpret the results of the learning defects.

      Thanks for your comments and suggestions.

      The drug cannula was implanted in the motor cortex (coordinates: AP, 1.4 mm; ML, -/+1.6 mm; DV, 0.25 - 0.3 mm) contralateral to the dominant hand of the mice (Figure S2C). To fully inhibit CCKBR in the motor cortex, we injected an excess dose of antagonist into the motor cortex. Thus, we cannot completely exclude the possibility that some antagonist spread to the neighboring cortices. However, the motor cortex is very large, extending from AP: -1.3 to 2.46 mm and ML: ±0.5 to ±2.75 mm, so the antagonist is unlikely to spread beyond the motor cortex at high concentration.

      Third, the authors need to show bilateral data about their anterograde and retrograde tracking of CCK-expressing neurons in the rhinal cortex. In ll. 290-292, they described as follows: "Both anterograde and retrograde tracking results indicated that CCK-expressing neurons in the rhinal cortex projecting to the motor cortex were asymmetric, showing a preference for the ipsilateral hemisphere." However, they provided only unilateral data for the anterograde (Fig. 4B) and the retrograde (Fig. 4D) experiments.

      Thanks for your comments. Anterograde and retrograde tracking data from both hemispheres were added to the supplementary file (Figure S4).

      Fourth, unilateral (contralateral to the dominant forelimb) experiments are needed in the chemogenetic inhibition of the CCK neurons. In ll. 301-338 and Fig. 5, the authors inhibited the CCK -expressing neurons in both hemispheres by injecting the virus into both sides. However, the CCKBR antagonist injection into the motor cortex contralateral to the dominant forelimb caused defects in motor learning ability, as described in ll. 166-206. The authors also observed that the population neuronal activity in the motor cortex contralateral to the dominant forelimb changed in accordance with the improvement of the motor skill in ll. 208-269. Therefore, it may be the case that inhibition of CCK neurons only in the side contralateral to the dominant forelimb - not bilaterally, as the authors did - could cause the lowered ability of motor learning. Such unilateral inhibition can be carried out by unilateral injection of the virus. In relation to the point above, in the chemogenetic inhibition experiments, it would be important to show which neurons in which cortical area is inhibited. This could be done by examining the distributions of the mCherry-labeled somata in the rhinal cortex using histochemistry.

      Thanks for your comments and suggestions.

      The specificity of the CCK projection from the rhinal cortex to the motor cortex for motor skill learning was studied using chemogenetic methods in the revised version of the paper. We first determined, by retrograde virus injection and immunostaining, that over 98% of the neurons in the rhinal cortex that project to the motor cortex are CCK positive (Figure 6A, S6A, S6B). Next, we injected the retro-Cre virus into the motor cortex and the Cre-dependent hM4Di into the rhinal cortex of C57BL/6 mice to specifically inhibit the CCK neurons projecting from the rhinal cortex to the motor cortex. Compared with the two control groups, the learning ability of the experimental group was significantly suppressed, suggesting that CCK projections from the rhinal cortex to the motor cortex are critical for motor skill learning (Figure 6). Furthermore, we also injected the retro-Cre virus unilaterally into the motor cortex contralateral to the dominant forelimb, together with the Cre-dependent hM4Di virus in the rhinal cortex. After injection of clozapine, motor learning ability was not significantly suppressed, suggesting that the motor cortex on both sides is important for motor skill learning. This is consistent with the previous finding that increased GluA1 expression was observed bilaterally in the motor cortex after training in the single pellet reaching task. A detailed description was added to the "Results" section of the manuscript.

      Fifth, it would be valuable to further examine differences in task performance across sessions and groups. The paragraph in ll. 138-153 needs a comparison of the "miss" rates of CCK-/- animals between Day 1 vs. Day 6 (related to ll. 429- 431). This paragraph also needs comparisons of the "no-grasp" and "drop" rates of CCK-/- animals between Day 1 vs. Day 6 (related to ll. 432- 433). The paragraph in ll. 175-190 needs comparisons of success rates between Day 1 and Day 5/6 within the antagonist group (related to ll. 447-448).

      Thanks for your comments. The comparisons were made in the revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      Strengths:

      The study addresses an intriguing research question that fills a gap in existing literature, and was carefully designed and well-executed, with a series of experiments and control experiments.

      We thank the reviewer for the positive statement about the conception and execution of the study as well as the potential interest to the community within a broader field.

      Weaknesses:

      1) My main concern is the null effect of precision estimation pattern between cued and un-cued trials. It is well established that relative to the un-cued stimuli, the cued stimuli obtain more attentional resource and this study claimed serial attentional resource allocation during parallel feature value tracking. However, all Experiments 3a-c did not find any difference in precision estimates between these two types of trials.

      We would like to note that the terminology “cued versus uncued trials”, in the usual sense of distinguishing between attended and unattended stimuli, is admittedly somewhat misleading in the current work. In the cued and uncued trials of Experiments 3a-c, the allocation of attention is equal. The difference is that the color stream that is attended first is defined (knowable) in the cued but not in the uncued trials. In all cases subjects had to track both color streams and report whichever stream was probed as accurately as possible. In other words, the overall allocation of attention in cued and uncued trials is the same. Also, the “cue” did not provide any information regarding the following probe (it did not indicate the likelihood of a probe in that stream, as it would in an attention experiment). It was entirely irrelevant and was therefore expected not to alter subjects’ overall performance, as confirmed by the null result mentioned. The test performed shows that the reported bias of ~2:1 does not depend on whether one stream is cued in a given set of trials. The sole purpose of the “cue” was to subconsciously redirect attention briefly towards that particular stream at the start of each trial in order to ‘phase-reset’ any process that switches or oscillates feature-based resources over time. The performance imbalance across streams is not altered by this phase-reset but remains constant, since the precision ratio is estimated across a large number of trials and durations. To clarify this issue, we rephrased the relevant descriptions in the methods section.

      2) Results of Exp.1 in the main text were different from those in Figure.

      Thank you for spotting that error. We have corrected the figure accordingly.

      3) It would be helpful to add more details for the assignation of response 1 and response 2 to target 1 and target 2, respectively, in all experiments.

      For Experiments 2 and 3, only one response per trial was required of the subjects. This design was chosen to avoid potentially ambiguous response-target assignments.

      However, in the first experiment, as the reviewer points out, subjects gave two color estimates (one for each of the tracked color streams) within each trial. Given that we intend to split subjects’ target-response differences (precisions) into two distributions (based on the idea that each stream is maintained by an independent attentional resource), there are two possible ways of assigning responses:

      (1) We split responses into a best and worst independent of which response was given first.

      (2) Alternatively, we assign target-response pairs based on the order of response. The assumption is that the first response is the one given with the highest confidence and is paired with the closest target. This pairing occurs independently of the second response, which is consequently paired with the remaining target. This leaves open the possibility that the second target-response difference is better than the first one due to resource fluctuations. In general, this strategy is less ‘rigid’ in dividing the two precision responses into ‘good’ and ‘bad’ responses and was consequently chosen.

      To avoid problems arising from the ambiguity of target-response assignments, in all following experiments (2/3), subjects were required to give one response per trial only. We will go into further detail on this issue with reviewer 3 as well, including a numerical example. The logic behind the target-response assignments in experiment 1 has been described in more detail in the methods.
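
      To make the order-based assignment of option (2) concrete, here is a minimal Python sketch. It assumes that colors are coded as hue angles in degrees on a 360° color wheel and that the error is the absolute circular difference; the function names and the example values are hypothetical and are not taken from the experiment.

```python
def circular_error(target, response):
    """Absolute angular difference between two hues, in degrees (0 to 180)."""
    diff = abs(target - response) % 360.0
    return min(diff, 360.0 - diff)


def assign_by_response_order(targets, responses):
    """Pair the first response with its closest target and the second
    response with the remaining target.

    Returns (error_of_first_response, error_of_second_response).
    """
    first, second = responses
    t0, t1 = targets
    if circular_error(t0, first) <= circular_error(t1, first):
        first_target, second_target = t0, t1
    else:
        first_target, second_target = t1, t0
    return (circular_error(first_target, first),
            circular_error(second_target, second))


# Hypothetical trial: two target hues and the two reported hues, in response order.
targets = (10.0, 200.0)
responses = (40.0, 205.0)
errors = assign_by_response_order(targets, responses)
print(errors)
```

      In this made-up trial the second response ends up closer to its target than the first, which illustrates why the order-based pairing does not force the first response to be the more precise one.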

      Reviewer #2 (Public Review):

      The authors asked the question about whether and how changing feature values within the same feature dimensions are tracked. Using a series of behavioral studies combined with modeling approaches, the authors report interesting results regarding a robust, uneven distribution of attentional resources between two changing feature values (in a 2:1 ratio), alternating at 1 Hz. Although the results are clear, it is important to rule out the possible biases due to computational processes. The results advanced our understanding of how parallel tracking of multiple feature values within the same dimension is achieved.

      We thank the reviewer for the summary, including the potential impact on the field, and we look forward to clarifying methodological imprecisions.

      Reviewer #3 (Public Review):

      The study is interesting and the results are informative in how well people can report colors of two superimposed dot clouds. It reveals that there are trade-offs between reporting two colors. However, I have a few basic but major concerns with the present study and its conclusions about people's abilities to continuously track color values and the rate at which attention may be allocated across the two streams which I am outlining below.

      We thank the reviewer for the positive description of our findings and look forward to addressing any remaining issues.

      1) The first concern regards the task that was used to measure continuous tracking of feature values, which in my view is ambiguous in whether it truly assesses active tracking of features or rather short-term memory of the last-seen colors. Specifically, participants were viewing two colored dot clouds that then turned gray, and were asked to report each of the colors they saw using continuous report. The test usually occurred after 6-8s (in Exp. 1 &2), so while not completely predictable, participants could easily perform the task without tracking both feature streams continuously and simply perform the color report based on the very last colors they saw. In other words, it does not seem necessary to know which color belonged to which stream, or what color it was before, to perform the task successfully. Thus, it is unclear to what extent this task is actually measuring active tracking, the same way tracking of spatial locations in multiple-object tracking tasks has been studied, which is the literature that the authors are trying to draw parallels to. In multiple-object tracking tasks, targets and nontarget objects look identical and so to keep track of which of the moving objects are targets, participants need to attend to them actively and selectively. (Similarly, the original feature-tracking study by Blaser et al., at least in their main experiment, people were asked to track an object superimposed on a second object which required continuous and selective tracking of that object).

      The reviewer addresses a very fundamental point regarding ‘tracking’ in general: Does tracking rely on attentional processes or on mere perception?

      The reviewer posits that subjects may simply ‘report based on the very last color they saw’ without the need to track both feature streams continuously. Our argument, supported by a broad literature on change blindness, inattentional blindness and related phenomena (cf. Rensink, 2000), is that one cannot consciously report a changing feature value without continuously attending to it, in particular when it moves around randomly in feature space. The report of a feature value at a random, unpredictable time t by ‘identifying it’ includes its attentive processing immediately before t. Since the time of the probing identification is random, this attentive processing must continue throughout the trial. We also rule out any strategy in which subjects only start tracking after some time (the probe appears between 6 and 8 s after trial onset), since such a strategy would involve processes of temporal attention as well and would increase difficulty.

      Lastly, the reviewer refers to Blaser et al. as an example in which attentive tracking would be required, since ‘an object [is] superimposed on a second object’. We do absolutely agree. However, the same design principle applies in the current experiment: Two objects with separate values in feature space, that continuously change, are superimposed, that is, spatially inseparable. We do believe that the continuous movement of the feature values through color space separates this work from previous feature-tracking studies like Re et al., in which the presented features remained static. The latter work gives rise to alternate explanations in terms of working memory (mentioned in the next point of the reviewer). Once feature values keep changing and are relevant, a process of updating their internal representations in order to grant access is required (i.e. attention).

      2) The main claim that tracking two colors relies on a shared and strictly limited resource is primarily based on the relation between the two responses people give, such that the first response about one color tends to be higher accuracy than for the second response of the other color across participants. In my view, this is a relatively weak version of looking at trade-offs in resources, and it would have been more compelling to show such trade-offs at a single-trial level, or assess them with well-established methods that have been developed to look at attentional bottlenecks such as attention-operating characteristics that allow quantifying the cost of adding an additional task in a precise and much more direct manner.

      The reviewer suggests showing trade-offs at a single-trial level within subject, which is in essence what we have done in experiment 1. Testing both streams simultaneously, however, has the drawback of introducing interference effects during the report (reporting the first stream may degrade the precision of reporting the second stream) as well as the mentioned ambiguity between targets and responses. The second and third experiments circumvent this by probing only one color stream, so as to analyze the data with a minimal set of assumptions. As the dependent measure of ‘precision’ fluctuates highly across trials, we have to estimate an overall tracking resource by creating a ‘precision’ distribution across many trials.

      3) Finally, the data of the last experiment is taken as evidence that feature-based selection oscillates at 1Hz between the two streams. This is based on response errors changing across time points with respect to an exogenous cue that is thought to "reset" attentional allocation to one stream. Only one of three data sets (which uses relatively sparse temporal sampling) shows a significant interaction between cue and time, and given that there was no a priori prediction of when such interaction should occur, this result begs for a replication to ensure that this is not a false positive result. Furthermore, based on the analyses done in the paper, it may very well be the case that the presumed "switching rate" is entirely non-oscillatory based on a recent very important paper by Geoffrey Brookshire (2022, Nature Human Behavior) that demonstrates that frequency analysis are not just sensitive to periodic but also aperiodic temporal structures. The paper also has a series of suggested analyses that could be used here to further test the current conclusions.

      The reviewer is absolutely correct in doubting the oscillatory nature of the results in Exp3. Importantly, in our discussion we do not claim that a regular periodicity of the attentional process maintains both color streams. In contrast, we stress the point of ‘one-feature at a time’, indicating a constraint that entails alternation between two representations. We do not presume any sort of regularity of this process but, instead, consider the switching to be determined by the recurrent processing of tuning towards one of the two relevant values. Our interpretation is therefore largely in line with Brookshire’s criticism of previous attentional oscillation studies. In fact, we entirely share the doubts about interpretations of attentional oscillations that transfer mathematical modelling onto functional processes. In our study we use the tool of Fourier transformation in a merely methodological manner, in order to quantify alternations between our color streams but not to imply an underlying oscillatory process. We cannot draw conclusions about underlying attentional oscillations, especially since we quantify the alternation/switch only across one full and one half period, in exp3a and exp3b respectively.

      We now make the distinction between oscillations as a methodological tool and as a functional cognitive process clearer in the paper.
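
      To illustrate what using a Fourier transform purely as a descriptive tool can look like, here is a minimal sketch. It is not the authors' pipeline: the time course, the 8 Hz sampling rate and the function name are hypothetical, and the input is a synthetic difference in response error between the two streams that alternates once over a 1 s window.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate_hz):
    """Return (frequency_hz, amplitude) pairs for DFT bins 1..N/2.
    The mean is removed first, so the zero-frequency bin is skipped.
    The 2*|X_k|/N scaling gives the sinusoid amplitude for bins
    strictly between 0 Hz and the Nyquist frequency."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append((k * sample_rate_hz / n, 2 * abs(coeff) / n))
    return spectrum


# Hypothetical data: error difference between streams, sampled at 8 Hz
# for 1 s, containing exactly one full alternation cycle.
rate_hz = 8.0
difference = [math.cos(2 * math.pi * 1.0 * t / rate_hz) for t in range(8)]
spectrum = amplitude_spectrum(difference, rate_hz)
peak_freq, peak_amp = max(spectrum, key=lambda pair: pair[1])
```

      A spectral peak of this kind only summarizes how quickly the measured difference alternates within the sampled window; with a single cycle of data it cannot distinguish a periodic process from an aperiodic one, which is the caveat raised above.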

    1. Author Response

      eLife Assessment:

      The fluorescently tagged SYT-1 mouse line will be useful for the field. Importantly, the authors used a comprehensive set of immunohistochemical and physiological experiments to demonstrate that the fluorescence tagging did not alter the function of SYT-1. These are important control experiments that will make the strain useful for physiological experiments in the future. However, the advance of this manuscript is less clear.

      We thank the editor for raising this point. In the revised manuscript, we performed additional experiments, including testing the expression level of Syt1-TDT and testing the co-labeling of Syt1-TDT with a synaptic marker in situ. We also discussed the advantages of our model compared with the existing ones in lines 285 to 300 of the Discussion section. Briefly, we summarize the advances of our model as follows. First, Syt1-TDT can label synapses in situ, especially in the glomerular layer of the olfactory bulb (compared with B6SJL-Tg(Thy1-Syt1/ECFP)1Sud/J (Han et al. 2005)). Second, we provide a potential use of our model in electrophysiological recording and imaging in vivo, as the electrophysiological properties of neurons from Syt1-TDT mice are normal (this was not analyzed in B6.Cg-Tg(Thy1-YFP/Syp)10Jrs/J and B6;CBA-Tg(Thy1-spH)21Vnmu/J (Umemori et al. 2004; Li et al. 2005)), which might result from the relatively low expression of Syt1-TDT compared with native Syt1. Third, neurons from the transgenic mice can be used in ASF screening, skipping the immunostaining procedure, which saves time, reagents and work.

      Reviewer #1 (Public Review):

      In this manuscript, Zhang and colleagues created a transgenic mouse strain that expresses SYT-1-tdt in all neurons. They showed that the labelled SYT-1 colocalizes with multiple synaptic markers and label synapses in different regions. More importantly, they showed that the transgenic expression does not alter synaptic function using ephys assays. This is a straightforward paper that generated a useful reagent that will be used broadly.

      We are grateful for the reviewer’s positive comments.

      Reviewer #2 (Public Review):

      Yang et al. produced a transgenic mouse line (Syt1-TDT) that could be used for labeling both excitatory and inhibitory synaptic sites in cultured neurons and in vivo neurons. The strength of the current study is to provide a series of thorough analyses to claim the applicability of this mouse line in the relevant neuroscience research field(s). The weakness is the potential impact/usefulness of this mouse line. To strengthen the merit of this mouse line, the authors should present evidence showing its advantage over other similar genetic approaches.

      We thank the reviewer for raising this point. To strengthen the merit of this mouse line, we tested the application of Syt1-TDT in labeling synapses in situ. We found that Syt1-TDT overlaps extensively with synapsin in brain slices, especially in the hippocampus, cerebellum and olfactory bulb, which suggests a potential use of our model in imaging synapses in vivo. We also compared our transgenic model with the existing ones in lines 285 to 300 of the Discussion section of the revised manuscript:

      “Several fluorescently tagged synaptic protein transgenic mice model, such as YFP tagged synaptophysin and pHluorin tagged synaptobrevin have been developed to label synapses [49, 50]. While these models can label synapse well, it lacks the functional analysis of neurotransmitter release in the overexpressed neurons as synaptophysin and synaptobrevin were reported to play a role in regulating neurotransmitter release. Considering the overexpression of synaptobrevin or synaptophysin were reported to promote neurite elongation or enhance neurotransmitter secretion, the synaptic organization and synaptic transmission might be changed in these models. Weiping Han et al. in their previous work [47] have generated transgenic mice expressing a Syt1-ECFP fusion protein. The Syt1-ECFP mice expressed the fluorescent protein ECFP in the cortex, midbrain, and cerebellum. However, the expression pattern in their model showed some difference with ours: In the olfactory bulb, the Syt1-TDT signals were highly enriched in glomerular layer in our model, which was not observed in the previously reported Syt1-ECFP transgenic mice [47]. It suggested a potential application of our model in labeling synapse in glomerular layer of olfactory bulb compared with Syt1-ECFP transgenic mice.”

      Reviewer #3 (Public Review):

      Yang and colleagues provide a thorough characterization of a transgenic mouse model expressing fluorescently tagged synaptotagmin. In particular, they present key controls validating this mouse model as a tool, including co-localization of the tagged synaptotagmin with other synaptic markers as well as normalcy of synaptic transmission mediated by synaptic terminals expressing the tagged synaptotagmin. Importantly, the authors present data on the potential use of neuronal cultures obtained from these mice in synaptic co-culture assays. In these assays, synaptic cell adhesion molecules expressed on non-neuronal cell lines such as HEK-293 cells or COS cells are used to test the sufficiency of these molecules to trigger synapse assembly. This mouse model will be a useful addition to existing models expressing fluorescently-tagged synaptic vesicle proteins such as synaptophysin, synaptotagmin as well as synaptobrevin.

      We are grateful for the reviewer’s positive comments.

    1. Author Response

      Reviewer #1 (Public Review):

      Bakoyiannis et al. investigated the distinct contribution of ventral hippocampal outputs to the nucleus accumbens and medial prefrontal cortex on memory in mice exposed to a high-fat diet (HFD) beginning in adolescence. The authors first characterize the hippocampal to accumbens or mPFC circuits using intersectional viral approaches. They then replicate their previous finding that adolescent HFD contributes to the overactivation of the ventral hippocampus during contextual learning via quantification of c-fos+ cells. In this manuscript, the authors further explore the distinct contribution of these two outputs from the ventral hippocampus using chemogenetics to specifically inhibit one circuit or the other. Interestingly, the authors find that inhibition of either circuit returns c-fos+ cell number to control levels, but the effects on memory are dissociable. They demonstrate that inhibition of output to the NAc rescues HFD-induced deficits on object recognition, while inhibition of mPFC outputs rescues HFD-induced deficits on object location recall. The authors further confirmed that chemogenetic manipulations resulted in alterations in c-fos+ cells that were specific to CA1, and not CA3 or DG. Behaviorally, they excluded any contribution of anxiety on recall, finding no effect on the elevated plus maze.

      The strengths of this manuscript include robust behavioral findings that can be attributed to specific circuits. The conclusions of this paper are largely well supported by the data, although some of the methods could provide more detail and the statistical approaches used for analysis need improvement.

      We thank the Reviewer for thoroughly summarizing the main results of the study and for providing the comments that we address below.

      Reliance on only one measure of anxiety to exclude this as a confound on recall performance is a weakness of the manuscript. To be more convincing that anxiety is not a confound, more than one behavioral assay should be performed.

      Reviewer #2 (Public Review):

      Bakoyiannis et al. aim to analyze the impact of high-fat diet (HFD) intake during the preadolescent period on memory performances by optogenetically manipulating the circuits responsible for related memory performances. In previous work, they showed the possibility to rescue object-based memory impairments in HFD-exposed animals by silencing the ventral hippocampus (vHPC). Here they investigated further the projections to the nucleus accumbens (NAc) and medial prefrontal cortex (mPFC), 2 of the main monosynaptic targets of the vHPC.

      They used a precise strategy to target and manipulate only vHPC cells that project to either NAc or mPFC. They found that preadolescent HFD can induce different types of memory deficits related to different vHPC pathways. In particular, they found that silencing vHPC-NAc, but not vHPC-mPFC, pathway restored HFD-induced object recognition memory deficit. On the other side, silencing vHPC to mPFC, but not vHPC-NAc, pathway rescued HFD-induced object location memory deficits. Moreover, these pathways do not control anxiety-like behaviours since their inactivation has no effect on anxiety levels.

      We thank the Reviewer for summarizing the findings of the study and for their positive comments on our manuscript.

      The conclusions of the manuscript are mostly supported by the results, but there are some points and controls that need to be addressed and clarified:

      • While identifying the relevance of hippocampal cells projecting to NAc and mPFC, a missing control is to verify the activity of vHPC not projecting to these 2 regions in normal conditions or when the investigated pathways are manipulated. This control is essential to refine and bring novel results related to their previous discovery that vHPC overall is involved in the process.

      • A downstream effect of their optogenetic manipulation on NAc and mPFC cellular populations should be shown if they want to claim that their chemogenetic inhibition decrease the activation of the pathway and not only of vHPC projecting neurons.

      New c-Fos experiments were performed. Please see our response to points 4-5-6 in the “Essential Revision” section.

      Reviewer #3 (Public Review):

      "Obesogenic diet induces circuit-specific memory deficits in mice" by Bakoyiannis et al., investigates the role of specific ventral hippocampal circuits (specifically to nucleus accumbens and mPFC) in high-fat diet-induced memory deficits. The authors had previously shown that increases in activity in the ventral hippocampus accompany high-fat diet-induced memory deficits, and that inhibition of activity thereby normalizes those memory deficits. In this manuscript, the authors extend these findings to specific projections, showing that they normalize different types of memories by inhibiting the two different pathways.

      The strengths of the paper include the pathway-specific manipulations that reveal a difference between the two types of memory. The results are a modest step forward for the field of feeding and learning and memory and would be of interest to that subgroup of neuroscientists. However, the paper also has a number of weaknesses which I detail below.

      We thank the Reviewer for summarizing the findings of our study and for the positive feedback.

      1) First, the authors show an effect of cfos from both pathways in Figure 2 on object learning. However, the inactivation studies show a pathway-specific effect on object recognition and object location, with no experiments to delineate how this divergence occurs. The authors do not specify whether they compared cfos in the control group between NAc and mPFC projections (presumably they did some controls with each injection), which might reveal differences.

      We have added new groups and presented/analyzed the results for each pathway (either vHPC-NAc pathway or vHPC-mPFC pathway) separately for c-Fos (new Figure 2 and Figure 2-Figure Supplement 1) or behaviours (new Figure 3 and Figure 3-Figure Supplement 1). Please see our responses to points 2, 4-5-6 and 9 in the “Essential Revision” section.

      2) Related to this, it is unclear how the pathways end up diverging for memory if they do not show any differences in cfos during training. Perhaps there are pathway-specific differences in cfos following the ORM and OLM tests? It is difficult to support the claim that there are pathway differences in memory following inactivation if we do not see any pathway-specific change in activity.

      We thank the Reviewer for this comment. Please see our answer to point 7 in the “Essential Revision” section above.

      3) Figure 2 and Figure 3 are also hard to interpret because of the usage of a 1-way ANOVA which is not the appropriate statistical test when there are two independent variables (HFD and DREADD manipulation). Indeed, noticing the statistical test also reveals that a critical control missing: HFD -, hM4di+CNO +. It is possible that inactivation simply brings down cfos levels regardless of diet. While this might benefit memory in the case of HFD, it is critical to know whether the manipulation is specific to the overactivation caused by HFD or just provides a general decrease in activity.

      Based on this comment, we added new HFD-, hM4di+CNO+ groups and modified the statistical analyses accordingly. Indeed, inactivation of each pathway (vHPC-NAc or vHPC-mPFC) decreases c-Fos in both HFD+ and HFD- (CD+) groups (new Figure 2), whereas it has opposite effects on behavior, improving memory performance in HFD+ groups but impairing it or having no effect in HFD- (CD+) groups (new Figure 3). We have corrected this in the manuscript (please see our responses to points 2 and 9 of the “Essential Revision” section).
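
      The logic of the full two-factor design (diet × pathway inactivation) can be sketched with a simple interaction contrast. This is a descriptive illustration only: the numbers are invented memory-index values, the group labels and function name are hypothetical, and it does not replace the two-way ANOVA that tests the interaction.

```python
from statistics import mean


def interaction_contrast(cells):
    """cells maps (diet, treatment) -> list of scores.
    Returns the effect of inactivation within each diet and the
    difference between those effects (the interaction contrast)."""
    effect_hfd = mean(cells[("HFD", "CNO")]) - mean(cells[("HFD", "control")])
    effect_cd = mean(cells[("CD", "CNO")]) - mean(cells[("CD", "control")])
    return effect_hfd, effect_cd, effect_hfd - effect_cd


# Made-up memory-index values, for illustration only.
scores = {
    ("CD", "control"): [0.65, 0.67, 0.63],
    ("CD", "CNO"): [0.55, 0.57, 0.53],
    ("HFD", "control"): [0.50, 0.52, 0.48],
    ("HFD", "CNO"): [0.64, 0.66, 0.62],
}
effect_hfd, effect_cd, interaction = interaction_contrast(scores)
```

      Without the CD + inactivation cell, only the HFD effect could be computed, and a diet-specific rescue could not be told apart from a general change caused by the manipulation.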

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper reports the fundamental discovery of adrenergic modulation of spontaneous firing through the inhibition of the Na+ leak channel NALCN in cartwheel cells in the dorsal cochlear nucleus. This study provides unequivocal evidence that the activation of alpha-2 adrenergic or GABA-B receptors inhibit NALCN currents to reduce neuronal excitability. The evidence supporting the conclusions is compelling, the electrophysiological data is high quality and the experimental design is rigorous.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study uses electrophysiological techniques in vitro to address the role of the Na+ leak channel NALCN in various physiological functions in cartwheel interneurons of the dorsal cochlear nucleus. Comparing wild type and glycinergic neuron-specific knockout mice for NALCN, the authors show that these channels 1) are required for spontaneous firing, 2) are modulated by noradrenaline (NA, via alpha2 receptors) and GABA (through GABAB receptors), 3) how the modulation by NA enhances IPSCs in these neurons.

      This work builds on previous results from the Trussell lab in terms of the physiology of cartwheel cells, and from other labs in terms of the role of NALCN channels, that have been characterized in more and more brain areas somewhat recently; for this reason, this study could be of interest for researchers that work in other preparations as well. The general conclusions are strongly supported by results that are clearly and elegantly presented.

      I have a few comments that, in my opinion, might help clarify some aspects of the manuscript.

      1. It is mentioned throughout the manuscript, including the abstract, that the results suggest a close apposition of NALCN channels and alpha2 and GABAB receptors. From what I understand, this conclusion comes from the fact that GABAB receptors activate GIRK channels through a membrane-delimited mechanism. Is it possible that these receptors converge on other effectors, for example adenylate cyclase (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6374141/)?

      We have now tested the role of adenylyl cyclase modulation in the control of NALCN, by saturating the cells with a cAMP analogue 8-Br-cAMP and found no effect on the NA response. These data are included in the paper. While further experiments are necessary, these results argue in favor of a direct gating by G-proteins.

      2. In Figure 2G, the neurons from NALCN KO mice appear to reach a significantly higher frequency than those from WT (figure 2E, 110 vs. 70 spikes/s). Was this higher frequency a feature of all experiments? The results mention a rundown of peak firing rate due to whole-cell dialysis, but, from what I understand, the control conditions should be similar for all experiments.

      The peak firing rates in control solutions for WT and KO CWC are not statistically different.

      3. Also in Figure 2, the firing patterns for neurons from WT and NALCN KO mice appear to be quite different, with spikes appearing to be generated during the hyperpolarization of the bursts in the second half of the current step for WT neurons but always during the depolarization in KO neurons. Was this always the case? If so, could NALCN channels be involved in this type of firing? Along these lines, it would be interesting to show an example of a firing pattern of neurons from WT mice in the presence of NA, which inhibits NALCN channels.

      The specific pattern of spikes in CWC is quite variable from trial to trial and from cell to cell, as it depends on multiple CaV and calcium-dependent K channel subtypes, and is not dependent on the genotypes used here. The primary effects observed in the KO are in background firing and sensitivity to NA, both reflecting alterations in rheobase. The firing pattern example requested was shown in the raster plot of Fig 2B2.

      4. It might be interesting to discuss how the hyperpolarization induced by the activation of GIRK channels and inhibition of NALCN channels could have different consequences due to their opposite effect on the input resistance.

      We considered this as a point of discussion, but decided that making sense of it would depend on assumptions about the location of the channels (dendritic vs somatic, distance to AIS) that we do not have data for. For example, a dendritic increase in resistance through NALCN block, leading to a hyperpolarization of the soma, might have actions similar to a somatic hyperpolarizing conductance increase by GIRK, as far as the voltage at the AIS is concerned.
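
      The reviewer's point about input resistance can be illustrated with a deliberately minimal single-compartment sketch. It assumes an isopotential cell with only a lumped K+ conductance g_K (including GIRK) and a lumped Na+ leak conductance g_Na (including NALCN); these symbols and this reduction are illustrative assumptions, not the authors' model, and the sketch ignores exactly the channel-location questions raised in the response:

```latex
V_{\mathrm{rest}} = \frac{g_{K} E_{K} + g_{Na} E_{Na}}{g_{K} + g_{Na}},
\qquad
R_{\mathrm{in}} = \frac{1}{g_{K} + g_{Na}}
```

      Under these assumptions, opening GIRK (larger g_K) and inhibiting NALCN (smaller g_Na) both move V_rest toward E_K, but the former lowers R_in whereas the latter raises it.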

      Reviewer #2 (Public Review):

      This is a very interesting paper with several important findings related to the working mechanism of the cartwheel cells (CWC) in the dorsal cochlear nucleus (DCN). These cells generate spontaneous firing that is inhibited by the activation of α2-adrenergic receptors, which also enhances the synaptic strength in the cells, but the mechanisms underlying the spontaneous firing and the dual regulation by α2-adrenergic receptor activation have remained elusive. By recording these cells with the NALCN sodium-leak channel conditionally knocked out, the authors discovered that both the spontaneous firing and the regulation by noradrenaline (NA) require NALCN. Mechanistically, the authors found that activation of the adrenergic receptor or GABAB receptor inhibits NALCN. Interestingly, these receptor activations also suppress the low [Ca2+] "activation" of NALCN currents, suggesting crosstalk between the pathways. The finding of such dominant contribution of the NALCN conductance to the regulation of firing by NA is somewhat surprising considering that NA is known to regulate K+ conductances in many other neurons.

      The studies reveal the molecular mechanisms underlying well known regulations of the neuronal processes in the auditory pathway. The results will be important to the understanding of auditory information processing in particular, and, more generally, to the understanding of the regulation of inhibitory neurons and ion channels. The results are convincing and are clearly presented.

      Reviewer #3 (Public Review):

      The study by Ngodup and colleagues describes the contribution of sodium leak NALCN conductance on the effects of noradrenaline on cartwheel interneurons of the DCN. The manuscript is very well-written and the experiments are well-controlled. The scope of the study is of high biological relevance and recapitulates a primary finding of the Khaliq lab (Philippart et al., eLife, 2018) in ventral midbrain dopamine neurons, that Gi/o-coupled receptors inhibit NALCN current to reduce neuronal excitability. Together these studies provide unequivocal evidence for NALCN as a downstream target of these receptors. There are no major concerns. I have only minor suggestions:

      Minor

      1. As introduced in the introduction, NALCN is inhibited by extracellular calcium which has led to some discourse of the relevance of NALCN when recorded in 0.1 mM calcium. A strength of this study is the effect of NA on NALCN is recorded in physiological levels of calcium (1.2 mM). I suggest including the concentration of extracellular calcium in the aCSF in the Results section instead of relying on the reader to look to the Methods.

      Done.

      2. It would be interesting to include the basal membrane properties of the KO compared to wildtype, including membrane resistance and resting membrane potential. From the example recording in Figure 2, one might think that the KOs have lower membrane resistance, so it is interesting that the 2 mV hyperpolarization produced similar effects on rheobase. In addition, from the example in Figure 2G, it appears that NA has an effect on firing frequency with large current injection in the KO. Is this true in grouped data and if so, is there any speculation into how this occurs?

      We have included in the text a comparison of the input resistance in WT and KO. These were not different. This should not be too surprising given the wide range of values between animals, and the necessity to compare populations. Measurements of resting potential are complicated by the fact that CWC are normally spontaneously active. As was discussed in the text, peak firing frequency declined with time during recording in both control and KO, necessitating normalization as shown in Fig 2E-H.

      3. Please expand on the rationale for why GABAB and alpha2 must be physically close to NALCN. To my knowledge, the mechanism by which these receptors inhibit NALCN is not known. Must it be membrane-delimited?

      Given the known membrane-delimited modulation of GIRK by GABAB, and that alpha2 and GABAB receptors appear to share the same population of NALCN channels, and that alpha2 receptors do not appear to target GIRK channels, we felt the simplest explanation would be coupling through G-proteins, with spatial segregation of different receptor/channel pools providing the means for separating GIRK and NALCN effects. Given that the alpha2 receptor is a Gi/o GPCR, we have now included in the revision new experiments using 8-Br-cAMP, as discussed above. These showed no effect on the NA response, consistent with a direct, membrane-delimited effect of G-proteins. We acknowledge, however, that further experiments are warranted.

      Reviewer #1 (Recommendations For The Authors):

      1. I suggest labeling the voltage traces in Figure 2 with WT and KO for easier comprehension; in addition, I suggest adding the average data to the plots in Figure 2, as in Figure 2-supplementary Figure 1 panel F.

      We have added the figure labels as requested. We chose not to add the average data as we noticed that averaging the full FI plots led to a smearing of the curves and a distortion in the apparent rheobase. Thus, we instead measured the rheobase for individual cells and report their average.

      2. For readers that are not familiar with the field, more details should be given about the electrical stimulation to evoke IPSCs in cartwheel cells, and what they represent.

      Done.

      3. The methods should mention if and how the concentrations of divalents were adjusted in the experiments with 0.1 mM extracellular Ca2+.

      Done.

      Reviewer #2 (Recommendations For The Authors):

      I only have several minor comments.

      1. The total lack of spontaneous firing in CWCs in the NALCN KO (Fig. 1) is interesting and provides an opportunity to probe the in vivo function of such spontaneous firing. Besides being a little smaller, do the mutant mice have any sign of abnormality in sound signal processing?

      Figure 1 – Figure supplement 1 showed that there are no effects on auditory brainstem responses in the KO.

      1. Figs. 3&4 (and several other figures with voltage-clamp recordings), a line indicating zero current level would be useful.

      Done

      1. page 7, "Outward current generated by suppression of NALCN": it might be better to state as "Outward response generated by suppression of NALCN", as the authors correctly pointed out that the NA-induced apparently outward current response is largely a result of an inhibition of NALCN-mediated inward Na+ current. One way to clarify this might be to record at the Nernst potential of K+ to isolate the contribution of Na+ currents (unclear if K+- or Cs+-based pipette was used in the experiment in Fig 3).

      Text has been modified.

      1. Figs. 5,6&7: do the dashed lines indicate initial current level or zero current level?

      Initial current. See legends.

      1. The labeling of some of the bar graphs can be made more clear. For example, in Fig. 2K, the right two columns should be labeled as WT as well. Fig. 3C & Fig. 4C, the left two columns should be labeled as WT and the right two as KO.

      Added labels to Fig 2 as requested.

      1. Figs. 5-7: The suppression of low extracellular [Ca2+]-induced NALCN-dependent current by NA and baclofen is very interesting. As the tonic inhibition of NALCN by extracellular Ca2+ is likely through a Ca2+-sensing GPCR (CaSR) and G-proteins (lowering [Ca2+] releases the inhibition and generates inward current) (Lu et al. 2010), the action of NA and baclofen may all converge onto the same G-protein dependent pathway of the Ca2+-sensing receptor. I'd include this in the discussion to provide a potential mechanistic explanation of the interesting observation.

      This is indeed an interesting idea. We prefer not to discuss it here, as 1) the source of Ca2+ sensitivity of the channel seems to be controversial (Chua et al 2020), and 2) the effect of Ca2+ reduction is enormously slower than the effect of the modulators (Fig 5-7), implying distinct mechanisms.

      Reviewer #3 (Recommendations For The Authors):

      Typos/general comments

      1. Figure 2 would be easier to comprehend with WT and KO labels as in the other figures. Done

      2. Page 11, size of the IPSCs in NA is missing the minus sign.

      Corrected.

      1. Is the y-axis correct on Figure 8B? This looks like it is doubling the size of the IPSC.

      Thank you for catching this mistake. The formula used to calculate % change was in error. We have corrected all the data analysis in the figure, which fortunately did not change the conclusion. Regarding the axis, note that the measurement was % change, not ratio of drug vs control.
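      The distinction drawn above between % change and a drug/control ratio can be illustrated with a short sketch. The formula below is the conventional definition of percent change, assumed here because the response does not state the corrected formula; the amplitudes are hypothetical values, not data from the study.

```python
def percent_change(control: float, drug: float) -> float:
    """Conventional percent change of the drug condition relative to control."""
    if control == 0:
        raise ValueError("control amplitude must be non-zero")
    return 100.0 * (drug - control) / control


def ratio(control: float, drug: float) -> float:
    """Drug amplitude expressed as a multiple of the control amplitude."""
    if control == 0:
        raise ValueError("control amplitude must be non-zero")
    return drug / control


# Hypothetical IPSC amplitudes in pA (inward currents are negative).
control_pa = -200.0
drug_pa = -400.0

print(percent_change(control_pa, drug_pa))  # 100.0
print(ratio(control_pa, drug_pa))           # 2.0
```

      Under these definitions, a doubling of the IPSC reads as +100% on a percent-change axis but as 2.0 on a ratio axis, which is why the axis label matters when reading such a plot.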

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers and editors for their constructive comments on the manuscript. We have extensively revised the manuscript based on these concerns and comments. The following are the specific answers.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript "Long‐read single‐cell sequencing reveals expressions of hypermutation clusters of isoforms in human liver cancer cells", S. Liu et al present a protocol combining 10x Genomics single-cell assay with Element LoopSeq synthetic long-read sequencing to study single nucleotide variants (SNVs) and gene fusions in Hepatocellular carcinoma (HCC) at single‐cell level. The authors were the first to combine LoopSeq synthetic long‐read sequencing technology and 10x Genomics barcoding for single cell sequencing. For each cell and each somatic mutation, they obtain fractions of mutated transcripts per gene and per each transcript isoform. The manuscript states that these values (as well as gene fusion information) provide better features for tumor-normal classification than gene expression levels. The authors identified many SNVs in genes of the human major histocompatibility complex (HLA) with up to 25 SNVs in the same molecule of HLA‐DQB1 transcript. The analysis shows that most mutations occur in HLA genes and suggests evolution pathways that led to these hypermutation clusters. Yet, very little is said about novel isoforms and alternative splicing in HCC cells, differences in isoform ratio between cells carrying different mutations, or diversity of alternative isoforms across cells. While the manuscript by Liu et al. presents a promising combination of technologies, it lacks significant insights, a comprehensive introduction, and has significant problems with data description and presentation.

      Answer: Thanks for the valuable suggestion. Our long-read single-cell sequencing discovered an average of 442 novel isoform transcripts per benign liver cell and 450 novel isoform transcripts per HCC cell, per SQANTI v1.2 analysis. These are stated in the revised manuscript. The alternative splicing was detected by differential isoform expression, as demonstrated in supplemental figures 6 and 7 and supplemental tables 8-11. Examples of differences in isoform ratio between cells carrying different mutations are now shown for DOCK8 and STEAP4 (figure 5 in the revised manuscript). A new section was added in the results to discuss the mutation expression of these two genes. The diversity of isoforms of the selected genes is shown in Supplemental Figure 10.

      This study showed how mutations in the same allele evolved in liver cancer. In particular, HLA hypermutations were found to develop from some specific sites of the molecules into large clusters of mutations in the same molecules. A new paragraph of introduction was added about the role of mutations in human cancer development. We also revised the figures to present the information better. All the HLA genes expressed only one known isoform, as shown in Figure 4 and Supplemental Figure 3, regardless of mutations.

      Major comments:

      1. The introduction section is scarce. It lacks description of important previous works focused on clustered mutations in cancers (for example, PMID35140399), on deriving the process of cancer development through somatic evolution (PMID32025013, from single cell data PMID32807900). Moreover, some key concepts e.g. mutational gene expression and mutational isoform expression are not defined. The introduction and the abstract contain slang expressions e.g. "protein mutation', a combination of terms I teach my students not to use.

      Answer: We thank the reviewer for suggesting a more solid background introduction and clearer term definitions. We added a new paragraph in the introduction section to introduce the role of mutations and hypermutations in human cancers. Some important work has been cited. We added a new section in the "Methods" to define "mutation gene expression share" and "mutation isoform expression share". "Protein mutation" has been replaced by "genetic mutation".

      1. In the results section, to select the mutations of interest, the authors apply UMAP dimensionality reduction to the mutation isoforms expression and cluster samples in UMAP space, then select the mutations that are present only in one cluster, then apply UMAP to the selected mutations only and cluster the samples again. The motivation for such a procedure seems unclear, could it be replaced with a more straightforward feature selection?

      Answer: Thanks for raising this important question. The goal of the analysis is an unbiased classification of the cell populations in the samples. We found that, after removing mutated isoform expressions that were at similar levels across all cells, the UMAP clustering generated clear segregation of three cell populations. When the unique mutated isoform expressions from each group were applied, the analysis generated 8 highly distinct groups of cells, each with a distinct mutation isoform expression pattern. Forcing prior knowledge into the analysis may introduce unwanted bias. Specifically, the first UMAP was performed in an unbiased way to cluster cells, while the second step is a supervised approach that selects the unique mutations in each cluster to identify the classifiers. The second UMAP matches the Benign/HCC labeling well.

      1. As I understand, the first "mutated isoform"-based UMAP clustering was built from expression levels of 205 "mutational isoforms". What was the purpose and outcome of the second "mutated isoform"-based UMAP clustering (Figure 2E)? In the manuscript the authors just describe the clusters and do not draw any conclusions or use the results of the clustering anywhere further.

      Answer: Thanks for pointing this out. Figure 2E was generated from unique mutation isoform expressions in groups A, B, and C from Figure 2D. The purpose of Figure 2E is to investigate whether these unique mutation isoforms can further classify the cell populations free of prior biological knowledge. We added a sentence in the revision to clarify the purpose of the clustering. The conclusion from this analysis, including Figure 2F and Figure 3 (which is an extension of Figure 2E), is that HLA mutation isoform expressions dominated the classifications of cell populations.

      1. The authors just cluster the data three times based on expression levels of different sets of "mutational isoforms" and describe the clusters. What do we need to gather from these clustering attempts besides the set of 113 mutations used for further analysis? What was the point of the reclusterings? Did the authors observe improvement of the classification at each step?

      Answer: Thanks for asking this important question. The improvement of re-clustering to classify cell populations is the obvious segregation of 8 different groups of cells without any manual classification through prior knowledge. The distances among groups were far apart in comparison to the first clustering (figure 2B). Detailed subclassifications were achieved on cell populations that otherwise could not be segregated based on the first clustering.

      1. The alignment of short reads generated from hypermutated transcriptomes is non-trivial. The proposed approach could address the issue without the need for whole genome sequencing and offer insights about the cancer development through somatic evolution. Why didn't the authors use modern phylogenetic approaches in the "Evolution of mutations in HLA molecules" section or at least utilize the already performed clustering to infer cell lineages?

      Answer: We appreciate the great question. For the evolution of mutations within a single molecule, single-gene clustering may not produce a desirable and robust result. A simple evolution snowball chart, as in Figure 4B, may be easier to understand.

      1. I am not sure I understood the definition of "mutated gene expression levels" and "mutated isoform expression levels" in the "Mutational gene expression and fusion transcript enhanced transcriptome clustering of benign hepatocytes and HCC" section. The authors mention that gene lists included all the isoforms within the same range of standard deviation. If I understand it correctly, they are equal if there is only one expressed transcript isoform. In that case, this overlap is not surprising at all.

      Answer: We thank the reviewer for the great question. Mutation gene expression level, mutation isoform expression level, and fusion gene expression level are now defined in the "Methods" section. In all HLA mutation transcripts, there were multiple transcripts with or without mutations for a single dominant isoform.
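      As a rough illustration of what a per-gene or per-isoform "mutation expression share" could look like, the sketch below follows Reviewer #1's summary of the quantity as the fraction of mutated transcripts per gene or per isoform in a cell. The authoritative definition is the one in the revised Methods, which is not reproduced here; the function name and the input layout are assumptions made for illustration.

```python
from collections import defaultdict


def mutation_expression_share(reads):
    """Fraction of mutated long reads per key within one cell.

    `reads` is an iterable of (key, is_mutated) pairs, where `key` is a gene
    symbol (gene-level share) or an isoform ID (isoform-level share).
    This layout is a hypothetical simplification.
    """
    total = defaultdict(int)
    mutated = defaultdict(int)
    for key, is_mutated in reads:
        total[key] += 1
        if is_mutated:
            mutated[key] += 1
    return {key: mutated[key] / total[key] for key in total}


# Hypothetical reads from a single cell.
cell_reads = [
    ("HLA-DQB1", True),
    ("HLA-DQB1", True),
    ("HLA-DQB1", False),
    ("HLA-B", False),
]
shares = mutation_expression_share(cell_reads)
print(shares["HLA-B"])  # 0.0
```

      Passing isoform IDs instead of gene symbols as keys gives the isoform-level share; when a gene expresses a single isoform, the two values coincide, which is the situation Reviewer #1 describes.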

      1. "To investigate the roles of gene expression alterations that were not accompanied with isoform expression changes, UMAP analyses were performed based on the non‐overlapped genes." Venn diagrams (Sup Figure 8) show that there are much less "non-overlapped genes" than "genes that showed both gene and isoform level changes" for each SD threshold (for example, for SD>=0.8 59 vs 275). Could that be the reason why clustering based on the former group is worse i.e the cancer and normal cells are separated less clearly?

      Answer: The number of attributes (genes) could be a contributing factor in the segregation of cell populations. However, the number of attributes is not the underlying reason for the worse performance of the gene-only classifier, because a much smaller set of overlapping isoforms/genes (22) at SD>=1 outperformed a larger set of genes (59) at SD>=0.8. This suggests that the 59-gene expression classifier is less efficient in segregating the cell populations. To address this concern, we took SD>=0.8 as an example: when we subsampled the 275 overlapped genes/isoforms to 59 (matching the number of non-overlapped genes), we still obtained better separation than with the 59 DEGs alone. We repeated this subsampling process three times and found similar results. The new data were inserted into supplemental Figure 8.
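      The subsampling control described above can be sketched as follows. This is a minimal sketch of the feature-matching step only, assuming uniform random sampling without replacement; the feature IDs and the seed are hypothetical, and the downstream UMAP and clustering step applied to each draw is not shown.

```python
import random


def subsample_features(features, k, n_repeats, seed=0):
    """Draw `n_repeats` random subsets of size `k`, without replacement within a draw."""
    rng = random.Random(seed)
    return [sorted(rng.sample(features, k)) for _ in range(n_repeats)]


# Hypothetical IDs standing in for the 275 overlapped genes/isoforms at SD>=0.8.
overlapped = [f"feature_{i:03d}" for i in range(275)]

# Match the size of the 59 non-overlapped genes, repeated three times.
draws = subsample_features(overlapped, k=59, n_repeats=3)
for draw in draws:
    print(len(draw), draw[:3])
```

      Each draw would then be passed through the same clustering procedure as the 59-gene set, so that any remaining difference in separation cannot be attributed to the number of features.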

      Reviewer #2 (Public Review):

      In the present study, Liu et al present an analysis of benign and HCC liver samples which were subjected to a new technology (LOOP-Seq) and paired WES. By integrating these data, the authors find isoforms, fusions and mutations which uniquely cluster within HCC samples, such as in the HLA locus, which serve as candidate leads for further investigation. The main appeal of the study is in the potential of LOOPSeq as a method to present isoform-resolved data without actually performing long-read sequencing. While this presents an exciting new method, the current study lacks systematic comparisons with other technologies/data to test the robustness, reproducibility and utility of LOOPSeq. Further, this study could be further improved by giving more physiologic context and examples from the analyses, thus providing a new resource to the HCC community. A few suggestions based on these are below:

      Answer: We thank the reviewer for raising these important questions and for the helpful suggestions. The LOOPseq technology was compared with Oxford nanopore and PacBio long-read sequencing in our previous study. We have cited that analysis in the introduction section of the paper. HLA mutation clusters in single molecules are our finding with major physiological significance, since these mutations may help liver cancer cells evade immune surveillance. We have extensively discussed the potential impact of these mutations on cancer development in the discussion. In addition, we added a new section on DOCK8 and STEAP4 mutation expressions in the results (page 11, new Figure 5), which are highly relevant to the pathogenesis of HCC.

      1. A primary consideration is that this seems to be the first implementation of LOOP-Seq, where the technology, while intriguing, has not been evaluated systematically. It seems like a standard 10x workflow is performed, where exons are selectively pulled down and amplified. Subsequent ultra-deep sequencing is assumed to give isoform-resolution of the sc-seq data. To demonstrate the utility of the approach it would benefit the study to compare the isoform-resolved results with studies where long-read sequencing was actually performed (ex: https://journals.lww.com/hep/Fulltext/2019/09000/Long_Read_RNA_Sequencing_Identifies_Alternative.19.aspx, https://www.jhep-reports.eu/article/S2589-5559(22)00021-0/fulltext, https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1010342). Presumably, a fair amount of overlap should occur to justify the usage.

      Answer: We have discussed the utility of the methodology in comparison with the previous studies by these three groups in the revision (results, page 12).

      1. Related to this point, the sc-seq cell types and benign vs HCC genes should be compared with the wealth of data available for HCC sc-seq (https://www.nature.com/articles/s41467-022-32283-3, https://www.nature.com/articles/s41598-021-84693-w). These seem to be important to benchmark the technology in order to demonstrate that the probe-based selection and subsequent amplification does not bias cell type definition and clustering. In particular, https://www.nature.com/articles/s41586-021-03974-6 seems quite relevant to compare mutational landscapes from the data.

      Answer: This is a great point. The consistency of probe-based analysis was demonstrated in our previous analyses and in the analyses mentioned in the comments. We further discussed it in the results section of the paper (page 12).

      1. From the initial UMAP clustering, it will be important to know what the identities are of the cells themselves. Presumably, there is quite a bit of immune cells and hepatocytes, but without giving identities, downstream mechanistic interpretation is difficult.

      Answer: When mutation analyses were combined with cell marker analysis, i.e., immune marker positive but negative in HLA mutation, we found only one bona fide immune cell in the HCC sample. Thus, immune cells may not be significant in the current analysis.

      1. In general, there are a fair amount of broad analyses, such as comparisons of hierarchical clustering of cell types, but very little physiologic interpretations of what these results mean. For example, among the cell clusters from Fig 6, knowing the pathways and cell annotations would help to contextualize these results. Without more biologically-meaningful aspects to highlight, most of the current appeal for the manuscript is dependent on the robustness of LOOP-seq and its implementation.

      Answer: To address this comment, a new pathway analysis was performed on the cluster results of Figure 6. A new supplemental table was generated. The results are now discussed on page 13.

      1. Many of the specific analyses are difficult and the methods are brief. Especially given that this technology is new and the dataset potentially useful, I would strongly recommend the authors set up a git repository, galaxy notebook or similar to maximize utility and reproducibility

      Answer: The script file has been uploaded to GIT to facilitate the reproducibility of the analysis. We also added a new pipeline description script in the methods (pages 19-20).

      1. The authors claim that clustering between benign and HCC samples was improved by including isoform & gene (Suppl fig 8). This seems like an important conclusion if true, especially to justify the use of long-read implementation. Given that the combination of isoform + gene presents ~double the number of variables on which to cluster, it would be important to show that the improved separation on UMAP distance is actually due to the isoforms themselves and not just sampling more variables from either gene or isoform

      Answer: The number of attributes (genes) could be a contributing factor in the segregation of cell populations. However, the number of attributes is not the underlying reason for the worse performance of the gene-only classifier, because a much smaller set of overlapping isoforms/genes (22) at SD>=1 outperformed a larger set of genes (58) at SD>=0.8. This suggests that the 58-gene expression classifier is less efficient in segregating the cell populations. To address this comment, we performed random subsampling to reduce the number of isoform/gene overlap iterates; similar results were obtained. A new supplemental figure was generated to reflect the new analyses.

      1. SQANTI implementation to identify fusions relevant for the HCC/benign comparison. How do the fusions compare with those already identified for HCC? These analyses can be quite messy when performed on WES alone so it seems that having such deep RNA-seq would improve the capacity to see which fused genes are strongly expressed/suppressed. This doesn't seem as evident from current analysis. There are quite a bit of WES datasets which could be compared: https://www.nature.com/articles/ng.3252, https://www.nature.com/articles/s41467-018-03276-y

      Answer: Exome sequencing is not an ideal tool to identify fusion genes. Very few fusion genes have been discovered based on RNA sequencing so far. The fusion genes discovered in the study appeared mostly novel. No exome sequencing was involved in the identification of fusion genes.

      1. Figure 4 is fairly unclear. The matrix graphs showing gene position mutations are tough to interpret and make out. Usually, gene track views with bars or lollipop graphs can make these results more readily interpretable. Also, how Figure 4 B infers causal directions from mutations is unclear.

      Answer: We thank the reviewer for pointing this out. We have revised the diagram in Figure 4A to reflect the proper distance between the mutations in HLA-DQB1 NM_002123. Since these are positions in the same allele (protein), a gene track view or lollipop graph may not show that properly. Two principles apply: the mutation clusters started from an isolated mutation, and a mutation did not revert to the wild-type sequence after occurring. Based on these two principles, we showed several mutation accumulation pathways leading to hypermutation clusters.
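      The two principles stated above (a cluster starts from an isolated mutation, and mutations do not revert) imply that a candidate accumulation pathway is a chain of observed mutation sets in which each set strictly contains the previous one. The sketch below enumerates such chains. It is an illustrative reading of those two principles, not the procedure used to build Figure 4B, and the mutation positions are hypothetical.

```python
def accumulation_pathways(observed):
    """Enumerate chains of observed mutation sets ordered by strict inclusion.

    Each chain starts from a single-mutation set (principle 1) and only ever
    gains mutations (principle 2). A step goes to a direct successor, i.e. a
    superset with no other observed set strictly in between.
    """
    states = sorted({frozenset(s) for s in observed},
                    key=lambda s: (len(s), sorted(s)))

    def successors(state):
        supersets = [t for t in states if state < t]
        return [t for t in supersets
                if not any(state < u < t for u in supersets)]

    paths = []

    def extend(path):
        nxt = successors(path[-1])
        if not nxt:
            paths.append(path)
            return
        for t in nxt:
            extend(path + [t])

    for s in states:
        if len(s) == 1:
            extend([s])
    return paths


# Hypothetical mutation positions observed together on single molecules.
molecules = [{101}, {101, 150}, {101, 150, 212}, {150}, {150, 300}]
pathways = [[sorted(s) for s in p] for p in accumulation_pathways(molecules)]
for p in pathways:
    print(" -> ".join(str(step) for step in p))
```

      Inclusion alone cannot decide between chains that are equally consistent with the data (here, whether position 101 or 150 came first), so such chains are candidate pathways rather than inferred causal orders.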

      Reviewer #3 (Public Review):

      The Liu, et al. manuscript focuses on the interesting topic of evaluating, on an almost genome-wide scale, how many transcriptional isoforms and fusion genes are present in single cells across the annotated protein coding genome. They also seek to determine the occurrences of single nucleotide variations/mutations (SNV) in the same isoform molecule emanating from the same gene expressed in normal and hepatocellular carcinoma (HCC) cells. This study has been accomplished using modified LoopSeq long‐read technology (developed by several of the authors) and single cell isolation (10X) technologies. While this effort addresses a timely and important biological question, the reader encounters several issues in their report that are problematic:

      1. Much of the analysis of the evolution of mutations results and the biological effects of the fusion genes is conjecture and is not supported by empirical data. While their conclusions leave the reader with a sense that the results obtained from the LoopSeq has substantive biological implications. However, they are extended interpretations of the data. For example: The fusion protein likely functions as a decoy interference protein that negatively impacts the microtubule organization activity of EML4.(pg 9)... and other statements presented in a similar fashion.

      Answer: We thank the reviewer for the helpful comment. The mutation results were experimentally validated by exome sequencing on the same samples. Furthermore, these mutations were filtered by requiring their presence in three different transcriptomes. The biological significance of these mutations is probably the subject of investigation in the next phase. Since a large number of HLA mutations did not occur overnight, the analysis of the accumulation pathways for these mutations was warranted, given the extensive evidence of such a process. The impact of mutations on HLA molecules appeared obvious and should be discussed. For ACTR2-EML4 fusion, we revised it as "The loss of microtubule binding domain may negatively impact the microtubule organization activity of EML4 domain of the fusion protein." We only discussed the obvious impact due to the loss of a large protein domain.

      2. LoopSeq has the advantage of using short read sequencing analyses to characterize the exome capture results and thus benefits from a low error rate compared to standard long-read sequencing techniques. However, there is no evidence from standard long read sequencing that the isoforms observed with LoopSeq are also obtained with parallel technologies. It is not made clear how much discordance there is between the LoopSeq results and either PacBio or ONT long read technologies.

      Answer: The comparative analyses among LOOPSeq, Oxford nanopore, and PacBio sequencing were performed in our previous study. We have cited the study in our introduction.

      1. There is no proteome evidence (empirically derived or present in proteome databases) from the HCC and normal samples that confirms the presence or importance of the identified novel isoforms, nor is there support that indicate that changes in levels HLA genes translate to effects observed at the protein level. Since the stability and transport differences of isoforms from the same gene are often regulated at the post-transcriptional level, the biological importance of the isoform variations is unclear.

      Answer: Given the transcriptome sequencing data, we can only focus on the isoform variation analysis but not directly link to the protein level variation because of the post-transcriptional level regulation. We discussed this in the revised manuscript (page 14).

      4. It is unclear why certain thresholds were chosen for standard deviation (SD) <0.4 (page 5), SD >1.0 (pg 11).

      Answer: The threshold is flexible and arbitrary. We showed different thresholds, and the same conclusion holds. We simply chose the thresholds that gave better separation and a reasonable number of genes/isoforms for the downstream analysis (Supplemental Figures 6-7 with different thresholds, and supplemental tables 4-12).
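      Selecting classifiers by standard deviation across cells, as described above, can be sketched as a simple filter. The sketch assumes the population standard deviation and a "greater than or equal to" cut-off; the response does not say which SD estimator was used, and the feature names and values are hypothetical.

```python
from statistics import pstdev


def select_by_sd(expression, threshold):
    """Keep features whose standard deviation across cells is >= threshold.

    `expression` maps a feature name to its values across all cells.
    """
    return sorted(name for name, values in expression.items()
                  if pstdev(values) >= threshold)


# Hypothetical expression values across four cells.
expression = {
    "gene_a": [0.0, 2.0, 0.0, 2.0],   # SD = 1.0
    "gene_b": [1.0, 1.1, 0.9, 1.0],   # SD well below 0.4
    "gene_c": [0.0, 1.0, 0.0, 1.0],   # SD = 0.5
}

for threshold in (0.4, 0.8):
    print(threshold, select_by_sd(expression, threshold))
```

      Repeating the downstream clustering on the feature sets returned for several thresholds is one way to check that a conclusion does not hinge on a particular cut-off.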

      1. HLA is known to accumulate considerable somatic variation. Of the many non-immunological genes determined to have multiple isoforms what are the isoform specific mutation rates in the same isoform molecule? Are the HLA genes unique in the number of mutations occurring in the same isoform?

      Answer: We thank the reviewer for this important suggestion. We now show mutation expression patterns in isoforms of DOCK8 and STEAP4 in Figure 5. A new section is added to discuss the mutation expression of these two genes. As shown in supplemental figure 10, HLA-DQB1, HLA-DRB1, HLA-B, and HLA-C have only one known isoform detected.

      Editorial comments:

      The present study pairs single-cell seq with LoopSeq synthetic long-read sequencing on samples of HCC and benign liver to identify mutations and fusion transcripts specific to cancer cells. The authors present a potentially important resource; however the overall support remains incomplete.

      While the approach of evaluating isoform-specific changes at the cellular level to cancer seeks to address a timely and important topic, there is currently incomplete evidence in support of the major claims in the manuscript. In particular, major recommendations to provide stronger support for the combination of technologies and interpretation regarding cancer-associated genomic changes include: 1) systematic evaluation of UMAP-based clustering methods, to what subsets of data they are applied and subsequent interpretations, 2) direct comparisons of results with additional methods to quantify long-read sequencing data and those evaluating mutational consequences of HCC progression and 3) detailed expansion of the description of methods and rationale for selecting specific parameters and cell types for further analyses. Including these changes would significantly strengthen the support for utility of combining 10x single-cell with Loop-seq and provide compelling evidence for usage of this resource in dissecting HCC-associated molecular changes.

      Answer: We appreciate the frank and constructive comments. The goal of UMAP is to obtain biological knowledge through unbiased data selection. Systematically, we select classifiers without any prior knowledge (blind to the samples). In our case, classifiers with high standard deviation across all the cells were chosen. We stressed this in the results section. The comparison among LOOPSeq, PacBio, and Oxford nanopore was made in our previous study. We cited that analysis in this paper. Analysis details and pipelines were added in the revised manuscript to improve the reproducibility. The mutation expression analysis was quite clear-cut. The clustering classified the HCC and benign liver cells by itself and identified a few cancer cells in the benign liver sample. All these were accomplished without applying any prior knowledge.

      Reviewer #1 (Recommendations For The Authors):

      Overall, there are numerous problems with data presentation and insufficient description, which authors could fix.

      1. Figure 4. A. It would be more clear if the figure showed the distribution of mutations in the molecule. Otherwise, it's hard to see if we see clusters of mutations or just 25 mutations spread uniformly across the transcript. B. It's unclear what the reader needs to take away from these columns of numbers.

      Answer: The mutation positions are now presented in proportion to their location in the molecule. Column B is the distribution of mutation molecules from the left panel in each cluster of cells (from Figure 3A) and their sample origin (HCC or benign liver). We clarified this further in the legend of Figure 4A.

      1. As a reader, I did not understand how "mutated gene expression levels" and "mutated isoform expression levels" were calculated in terms of sequenced long reads

      Answer: We defined the term and calculations in the methods section of the revised manuscript.

      1. Page 6 "genes involving antigen presentation"

      Answer: The full sentence of the subtitle is "Mutations of genes involving antigen presentation dominated the mutation expression landscape."

      1. Page 6 "These unique mutational isoforms" - how are these isoforms unique?

      Answer: We removed most uses of the adjective "unique" when describing the non-redundant mutations.

      1. Page 6. Unclear "All but one clusters contained cells co‐migrated with cells of their sources."

      "Among 113 mutation isoforms, the major histocompatibility complex (HLA) was the most prominent with 68 iterations (60.2%) (Supplemental Table 3, Figure 3B)" There is nothing about HLA in Figure 3B.

      Answer: We revised the sentence as "Cells in all but one clusters co-migrated with cells of their sources". The mutation isoform expressions were listed in supplemental Table 3. They are too small and become unreadable when put in the figure.

      1. Page 10 "genes or isoforms that across all samples had with expression standard deviations less than" - probably "with" should not be there.

      Answer: We corrected the error and thank the reviewer for the comment.

      1. Page 11 "UMAP analysis was performed using genes with standard deviations {greater than or equal to} 1.0 (182 wild‐type genes) and standard deviations >0.4 (282 mutated genes)". What do "wild-type" and "mutated" mean here?

      Answer: We edited as "UMAP analysis was performed using gene expressions with standard deviations ≥ 1.0 (182 non-mutated genes) and gene mutation expression with standard deviations 0.4 (282 mutated genes)."

      1. I could not find the description of Supplementary Tables.

      Answer: The supplemental table legends are added in the revised manuscript.

      1. In the Discussion section, the authors mention that mutations were mainly expressed in a specific isoform of a gene for a given cell. I suggest to emphasize this point in the Results section and illustrate it with a comparison of abundance of mutated and non-mutated isoforms

      Answer: For HLA molecules, their expression appeared to be restricted to one known isoform, regardless of mutation status. This sentence is removed in the revision. A new section on DOCK8 and STEAP4 mutation expression is added to the Results.

      1. It is also mentioned that mutations may have an impact on the RNA splicing process. The authors should compare the observed isoform ratio to a prediction of the effect of variants on splicing by SpliceAI or similar tools

      Answer: This sentence was removed from the discussion.

      1. Figure 3c: triangles corresponding to HLA-positive cells are hard to distinguish

      Answer: We provide a larger representation of the triangle and circle in figure 3c in the revision.

      Reviewer #2 (Recommendations For The Authors):

      Many of my comments could be addressed by spending time to provide the code/data and a walkthrough of analyses so that other users would be able to answer these questions on their own.

      Answer: We have included a script section in the revision to ensure the reproducibility of the analysis. The raw data have been uploaded to GEO (see Methods).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      1. The result that TF binding produces microdomains at medium and long linker DNA lengths but not at short ones is very interesting. Although the differences can be observed in the figure, a quantitative comparison is still lacking. The exact definition of the microdomains observed in the simulations is not clear, nor is the number of microdomains that can be identified under different conditions. A quantitative comparison of the different conditions should also be provided.

      We thank the reviewer for this suggestion. Our intent was to show qualitatively how TF binding locations that we design can direct fiber folding and create microdomains, which we define in the paper as high frequency contact regions in the contact maps, similar to the TADs observed in HiC maps. Together with the fiber configurations, contact maps allow us to identify formation of such microdomains, and to observe how these microdomains change depending on the conditions we build into the model, such as TF binding region or linker DNA length.

      To address your point, we have added a clustering analysis of the contact matrices with nucleosome resolution and assign each contact along the genome position (nucleosome index) to a cluster. In Supporting Figure S6, we show how DBSCAN clustering provides a clustering distribution that quantitatively describes the microdomains observed in the matrices and estimates the number of microdomains. For example, in the 44 and 62 bp systems, the contacts along the genomic distance separate into 5, 2, and 1 nucleosome groups for topologies 1 to 3, and into 2 and 1 group for topology 4, respectively. In the 26 bp and Life-Like systems, where microdomains are more diffuse due to fiber rigidity or polymorphism, we see that the clustering results are not as TF-topology-dependent as in the 44 and 62 bp systems. We also decomposed the contact matrices into one dimensional plots that depict the magnitude of 𝑖, 𝑖 ± 𝑘 internucleosome interactions. We see that internucleosome patterns change with the TF binding topology, and that the 26 bp and Life-Like systems show the least changes.
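      The one-dimensional decomposition described above can be illustrated with a minimal sketch. It assumes a symmetric, nucleosome-resolution contact matrix stored as a list of lists and simply sums the contacts along each off-diagonal k. The function name `interaction_profile` and the toy matrix are hypothetical and are not taken from the authors' analysis code.

```python
def interaction_profile(contacts, max_k):
    """Sum contact frequencies of nucleosome pairs (i, i + k) for k = 1..max_k.

    Assumes `contacts` is a square, symmetric matrix, so each pair is
    counted once by reading only the upper off-diagonals.
    """
    n = len(contacts)
    profile = {}
    for k in range(1, max_k + 1):
        profile[k] = sum(contacts[i][i + k] for i in range(n - k))
    return profile


# Hypothetical 5-nucleosome contact matrix with arbitrary counts.
toy_contacts = [
    [0, 4, 1, 0, 2],
    [4, 0, 3, 1, 0],
    [1, 3, 0, 5, 1],
    [0, 1, 5, 0, 2],
    [2, 0, 1, 2, 0],
]

profile = interaction_profile(toy_contacts, max_k=4)
print(profile)
```

      Comparing such profiles between TF binding topologies is one way to see which separations k gain or lose contacts. Normalizing each sum by the number of pairs (n - k) would give an average rather than a total.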

      1. When increasing TF concentration, from 0 to 100%, it seems that both packing ratio and sedimentation coefficients are not sensitive to the TF concentrations after 25%. Is it due to the saturation of TF binding? How many TF binding sites are considered at each concentration?

      Yes, in most cases, at TF concentrations higher than 25%, the fiber compaction does not change due to saturation of TF binding. Although higher TF concentrations, such as 50%, 70%, or 100%, can be set, they do not further influence the fiber architecture. Higher-order folding and compaction cannot be reached because excluded volume interactions impede overlapping of beads in the model. We have clarified this in the manuscript.

      As stated in the Methods section, the TF concentration refers to the number of linker DNA beads that can engage in a constraint compared to the total number of linker DNA beads. Thus, at 25% TF, 25% of linker DNA beads are engaged in TF constraints. We have added a comment on this in the Results section.

      1. It is shown that the contact maps that reveal microdomains are ensemble-based maps and single trajectories do not show clear formation of microdomains. Does the formation of microdomains increase with the number of combined trajectories?

      The formation of microdomains occurs in each single trajectory. However, the microdomains formed in each trajectory can be different. That is why ensemble-based maps show clearer trends of microdomains that might not be as visible in single-trajectory maps. If we increase the number of trajectories, the microdomains will be more visible and there will be more microdomains in the contact map, but the formation of microdomains will not increase in each single trajectory.

      1. "As we see from Figure 4A, when the linker DNA is short, such as 26 and 35 bp, TF binding does not increase the packing ratio of the fiber." The results of 35bp cannot be found in Figure 4A. In addition, the color of 44 and 62 bp should be changed since they are very similar in the figure.

      Thank you for catching this. The results corresponding to the 35 bp system are presented in the Supporting Figure 7. We have changed the text to read “As we see from Figure 4A and Figure S7..”.

      We have changed the color of the 62 bp trace to blue in the plots of Figure 4. Consistently, we have also changed the color of the 62 bp fiber in Figure 2 and Figure 5.

      1. For modelling of TF binding at increasing concentrations, it is mentioned that in these three conditions, TFs are allowed to bind to any region. Do you mean TF can also bind to nucleosomal DNA? Nucleosome structure prevents the binding of many TFs.

      In our model, only linker DNA beads can engage in the constraints (bind TF). We have changed the text to read “TFs are allowed to bind to any linker DNA region”.

      1. The details of the Mnase-seq dataset and how NFRs are identified should be provided, such as the coverage of the data and what read fragments are selected for NFR mapping.

      MNase data in bedgraph format were downloaded from the Genome Expression Omnibus (GSM2083107) repository and loaded without further processing into the Genome Browser. NFRs were visually inspected and detected as genomic regions without peaks. As detailed in the GEO repository, the sequenced paired-end reads were mapped to the mm9 genome. Only uniquely mapped reads with no more than two mismatches were retained and reads with insert sizes less than 50 or larger than 500 bp were discarded.

      We have clarified this in the manuscript.
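      The read-retention criteria quoted above from the GEO record can be written as a small filter. This is only a sketch: the `ReadPair` record and its field names are hypothetical stand-ins, and real alignments would normally be read from BAM/SAM files with a dedicated library.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    name: str
    n_alignments: int  # number of genomic locations the pair maps to
    mismatches: int    # mismatches against the reference genome
    insert_size: int   # fragment length in bp


def keep_read(read, max_mismatches=2, min_insert=50, max_insert=500):
    """Keep uniquely mapped pairs with <= 2 mismatches and 50-500 bp inserts."""
    uniquely_mapped = read.n_alignments == 1
    few_mismatches = read.mismatches <= max_mismatches
    size_ok = min_insert <= read.insert_size <= max_insert
    return uniquely_mapped and few_mismatches and size_ok


# Hypothetical reads covering each rejection reason.
reads = [
    ReadPair("r1", 1, 0, 147),
    ReadPair("r2", 1, 3, 160),  # too many mismatches
    ReadPair("r3", 2, 0, 150),  # not uniquely mapped
    ReadPair("r4", 1, 1, 40),   # insert shorter than 50 bp
    ReadPair("r5", 1, 2, 500),  # boundary values are retained
]

kept = [r.name for r in reads if keep_read(r)]
print(kept)
```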

      1. The calculations of volume and area of the Eed promoter region should be further elucidated.

      Thank you. We now elaborate upon these calculations. In particular, the Eed promoter region is defined between cores 123 and 129. The x,y or x,y,z coordinates of those cores are used to create the bounding area or volume by defining the shape’s vertices.

      1. In Figure 3, it is not clear how the different topologies are identified.

      In Figure 3 the topology, or TF binding regions, is the same for each of the 10 contact maps as these emerge from trajectory replicas of the same system which we named Topology 1. Different microdomains are formed in each individual trajectory as the high-frequency regions appear in different locations on each contact map. However, when these 10 maps are summed, the ensemble contact map clearly shows consensus microdomains in each region where TF binds.

      Reviewer #2:

      To further improve the manuscript, I have the following suggestions/comments.

      1. While most of the conclusions in this paper follow from the evidence provided by the simulations, the result in section 3.3, titled "Gene locus repression is mediated by TF binding," may not follow from the results. In my opinion, repression is a more complex process, and many more factors (such as nucleosome positioning, nucleosome sliding, histone methylation, and other proteins such as PRC or HP1, etc) may be involved in repression. While compaction is often associated with repressed chromatin (heterochromatin), recent studies have shown that heterochromatin fibers are highly diverse, and compaction alone may not be the criteria for repression (eg. see Spracklin et al. Nat. Struct. Mol. Biol. 30, 38-51 (2023).). In this light, I would recommend slightly modifying the title to say, "TF binding-mediated compaction can help in gene locus repression" or something similar.

      Yes! We completely agree that gene repression is a very complex phenomenon that involves many factors that we are approaching by modeling starting from the simplest strategy. Thus, we have changed the subtitle to read “TF binding-mediated compaction as possible mechanism of gene locus repression”.

      1. Authors could also present the contact probability versus genomic distance. This may provide some generic features at nucleosome resolution, given the variability in linker length and LH density.

      We thank the reviewer for this suggestion. We have now calculated the contact probability for the EED gene with and without TF binding (Supporting Figure 8). We see that the contact probability corresponding to short range interactions (i ± 2, 3, 4, 5, and 6) is slightly lower for the EED gene upon TF binding. However, a striking increase in the contact probability upon TF binding is seen in the genomic region between 3 and 5 kb, which corresponds to local loop interactions. Thus, TF binding slightly decreases local interactions but increases chromatin loops. Such changes are not observed for the EED system with LH density 0.8 (Supporting Figure 9), further supporting the idea that an increase in LH density hampers the effect of TF binding for the EED gene architecture. We have now added these results to the manuscript.

      1. Write a short paragraph about the limitations of the model/study. For example, one of the limitations could be that, as of now, it has only the effect of a few proteins, but to predict repression, one may need to incorporate the effect of several proteins.

      We agree with the reviewer that our model is a simple, first-step approach. Nonetheless, even the simplest mathematical model can be enlightening in helping dissect essential factors. Here, our model clearly shows how TF binding location modulates fiber architecture and the interplay between TF binding and other chromatin elements, like linker DNA length, LH density, and histone acetylation. We have now stated in the Discussion section that although limited due to being implicit and not considering other protein partners, our model can provide insights on the regulation of chromatin architecture by protein binding. Future modeling with explicit protein binding or combination of several proteins will further help us understand genome folding regulation.

      1. The radius of gyration of 26 kb chromatin is around ~60nm in this paper. Is there any experimental measurement to compare (approximate order of magnitude)? While I do not know any measurement for Eed gene locus, I am aware of the results in the Boettiger et al. paper from Xiaowei Zhuang lab (Nature 2016). There, they find that the Rg of a 26 kb region is above 100nm. But that is for a different organism, a different set of genes. Also, see Sangram Kadam et al. Nature Communications 14 (1), 4108, 2023.

      Thank you for this suggestion. To the best of our knowledge, there are no radius of gyration measurements for the EED gene. Regarding the two papers you cite, Boettiger et al. (1) determine by microscopy experiments that Rg ∝ L^c, where L is the genomic length and c is 0.37 ± 0.02 for active chromatin (Figure 1d of the paper). In that case, the Rg for a 26 kb region would be 43 ± 9 nm. Considering that these are Drosophila cells, our value of 62 nm is in good agreement with that estimate. Regarding the Kadam et al. paper (2), by coarse-grained modeling they find an Rg of around 100 nm for different genes. Considering that the radius of gyration depends on cell type and fiber configuration (see for example (3) for the dependency of Rg on loop number and persistence length), we believe that our measurements, which are in the same ballpark as experimental results and other theoretical modeling studies, are good indicators of our model’s reasonableness.

      We have added this comparison to the manuscript.
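      For readers who want to reproduce this kind of comparison, the sketch below computes a radius of gyration from bead coordinates and the relative change in Rg expected from a power-law scaling with genomic length. It assumes equally weighted beads; the function names and the toy coordinates are hypothetical and are not the authors' code. The prefactor of the power law is deliberately left out, so only ratios are computed.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of equally weighted beads from their centroid."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def scaling_ratio(length_a, length_b, exponent):
    """Ratio Rg(length_a) / Rg(length_b) if Rg scales as length ** exponent."""
    return (length_a / length_b) ** exponent


# Hypothetical bead positions (arbitrary units) placed on a unit circle.
toy_coords = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
rg = radius_of_gyration(toy_coords)

# Relative growth in Rg when the genomic length doubles, using c = 0.37.
ratio = scaling_ratio(2.0, 1.0, 0.37)
print(rg, ratio)
```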

      1. The reason why it is useful to compare some distance measurements (physical dimension) with experiments is the following: The contact map in Hi-C only gives relative contact probabilities. It does not give absolute contact probabilities. To convert a Hi-C map into a physical distance, one requires comparison with some experimentally measured 3D distance. The radius of gyration is an ideal quantity to compare. From my experience, the contact probability is often much smaller than 1, suggesting that the chromatin is more expanded. But this could be due to the effect of many other proteins in vivo and the crowding, etc. I do not expect this work to incorporate all those effects. However, it may be useful to make a comment about it in the manuscript.

      Thank you. We have added to the discussion a comment on our first-generation model of TF binding to chromatin and the neglect of many associated protein and RNA cofactors that certainly influence chromosome folding and domain formation on higher scales. Some distance measures are also added to the Results as mentioned above.

      References

      1. Boettiger,A.N., Bintu,B., Moffitt,J.R., Wang,S., Beliveau,B.J., Fudenberg,G., Imakaev,M., Mirny,L.A., Wu,C. and Zhuang,X. (2016) Super-resolution imaging reveals distinct chromatin folding for different epigenetic states. Nature, 529, 418–422.

      2. Kadam,S., Kumari,K., Manivannan,V., Dutta,S., Mitra,M.K. and Padinhateeri,R. (2023) Predicting scale-dependent chromatin polymer properties from systematic coarsegraining. Nat. Commun., 14, 4108.

      3. Wachsmuth,M., Knoch,T.A. and Rippe,K. (2016) Dynamic properties of independent chromatin domains measured by correlation spectroscopy in living cells. Epigenetics Chromatin, 9, 57.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      I have only a few very minor suggestions for improvement.

      • the text repeatedly uses the terms "central nervous system" and "enteric nervous system", which are not in standard use in the field. These terms are not defined until the bottom of p. 12 even though they are used earlier. It would be useful for the authors to explicitly describe their definitions of these terms earlier in the paper.

      Fixed.

      • the inclusion of four pre-trained models is a powerful and useful aspect of WormPsyQi. Would it be possible to develop a simple tool that, when given the user's images, could recommend which of the four models would be most appropriate?

      We appreciate the reviewer for bringing this up. To address this, we have now added an additional function in the pipeline to test all pre-trained models on representative input images. Before processing an entire dataset, users can view all segmentation results for images in Fiji to assess which model performed best, judged by the user. The GUI, running guide document, and manuscript have been modified accordingly.

      In addition, we would like to emphasize that the pre-trained models were developed by iterative analyses of many reporters, often with multiple rounds of parameter tuning; the results were validated post hoc to choose the optimal model for each reporter, and we have listed this information in Supplemental Table 1 to inform the choice of the pre-trained model for commonly used reporter types.

      • On p. 11 (and elsewhere), the differences in the performance of WormPsyQi and human experimenters are called "statistically insignificant". This statement is not particularly informative (absence of evidence is not evidence of absence). Can the authors provide a more rigorous analysis here - or provide an estimate of the typical effect size of the machine-vs-human difference?

      To address this, we have included additional analysis in Figure 2 – figure supplement 3. For two reporters - I5 GFP::CLA-1 and M4 GFP::RAB-3 - we compare WormPsyQi vs. labelers and inter-labeler puncta quantification. A high Pearson correlation coefficient (r2) reflects greater correspondence between two independent scoring methods. We chose these two test cases to demonstrate that the machine-vs-human effect size is reporter-dependent. For I5, where the CLA-1 signal is very discrete and S/N ratio is high, the discrepancy between WormPsyQi, labeler 1, and labeler 2 is minimal (r2=0.735); moreover, scoring correspondence depends on the labeler (r2=0.642 and 0.942, respectively). In other words, WormPsyQi mimics some labelers better than others, which is to be expected. For M4, where the RAB-3 signal is diffuse and synapse density is high in the ROI, the inter-labeler discrepancy is high (r2=0.083) and WormPsyQi vs labeler (1 or 2) discrepancy is slightly reduced (r2=0.322 and 0.116, respectively). The problematic regions for the M4 RAB-3 reporter are emphasized in Figure 6 - figure supplement 1A. Overall, the additional analysis suggests that the effect size is contingent on the reporter type and image quality, and importantly for scoring difficult strains WormPsyQi may average out inter-labeler scoring variability.
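      As an illustration of how two scoring methods can be compared, the sketch below computes a Pearson correlation coefficient r and its square from paired puncta counts. The counts are invented for the example and do not come from the study; the function name `pearson_r` is likewise hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical puncta counts for the same five worms from two scorers.
scorer_a = [10, 12, 9, 15, 11]
scorer_b = [11, 13, 9, 14, 12]

r = pearson_r(scorer_a, scorer_b)
r_squared = r ** 2
print(r, r_squared)
```

      A value of r squared near 1 indicates that the two scorers rank and scale the worms similarly. It does not by itself reveal a constant offset between them, so a paired difference or Bland-Altman style comparison may be a useful complement.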

      • p. 12: "Again, relying on alternative reporters where possible..." This is an incomplete sentence - are some words missing?

      Edited.

      Reviewer #2 (Recommendations For The Authors):

      1. The authors effectively validated the sexually dimorphic synaptic connectivity by comparing the synapse puncta numbers of PHB>AVA, PHA>AVG, PHB>AVG, and ADL>AVA. However, these differences appear to be quite robust. It would be beneficial for the authors to test whether WormPsyQi can detect more subtle changes at the synapses, such as 10-20% changes in puncta number and fluorescence intensity.

      While the dimorphic strains were used to first validate WormPsyQi based on the ground truth of very well-characterized reporters, the reviewer reasonably asks whether our pipeline can pick up on more subtle differences. To address this, we have now included an additional figure (Figure 9 – figure supplement 2), where we performed pairwise comparisons between L4 and adult timepoints for the reporter M3 GFP::RAB-3. As reflected in panels A and C, although the difference between puncta number and mean intensity between L4 and adult is marginal (22% increase in puncta number and 13% increase in mean intensity from L4 to adult), WormPsyQi can pick it up as statistically significant.

      1. On page 10, the authors mentioned that "cell-specific RAB-3 reporters have a more diffuse synaptic signal compared to the punctate signal in CLA-1 reporters for the same neuron, as shown for the neuron pair ASK (Figure 4 -figure supplement 1B, C)". It is important to note that in this case, the reporter gene expressing RAB-3 is part of an extrachromosomal array, whereas the reporter gene expressing CLA-1 is integrated into the chromosome. It's possible that the observed difference in pattern may arise from variations in the transgenic strategies employed.

      To emphasize the difference in puncta features inherent to the reporter type, we have now added WormPsyQi segmentation results for ASK CLA-1 extrachromosomal reporter (otEx7455) next to the ASK CLA-1 integrant (otIs789) and ASK RAB-3 reporter (otEx7231) in Figure 4 – figure supplement 1C. Importantly, otEx7455 was integrated to generate otIs789, so they belong to the same transgenic line. Literature shows that RAB-3 and CLA-1 have different localization patterns and corresponding functions at presynaptic specializations, and this is qualitatively and quantitatively shown by the significant difference in puncta area size between RAB-3 and both CLA-1 reporters, i.e., both CLA-1 reporters have smaller, discrete puncta compared to RAB-3 (Figure 4 – figure supplement 1C). Quantitatively, in the case of ASK - where the synapse density is sparse enough that even diffuse RAB-3 puncta can be segmented without confounding adjacent puncta – overall puncta number between otEx7231 and otIs789 are similar. However, RAB-3 signal is diffuse and this poses quantification problems in cases where the synapse density is higher (e.g. AIB, SAA in Figure 4 – figure supplement 1D) and WormPsyQi fails to score puncta in these reporters since the signal is not punctate. As far as integrated vs. extrachromosomal reporters go, the reviewer is right in pointing out that some differences may be stemming from reporter type as our additional analysis between otIs789 and otEx7455 indeed shows fewer puncta in the latter owing to variable expressivity.

      1. The authors mentioned that having a cytoplasmic reporter in the background of the synaptic reporter enhanced performance. It would be more informative to provide comparative results with and without cytoplasmic reporters, particularly for scenarios involving dim signals or densely distributed signals.

      The presence of a cytoplasmic marker is critical in two specific scenarios: 1) images where the S/N ratio is poor, and 2) when the image S/N ratio is good, but the ROI is large, which would make the image processing computationally expensive.

      To demonstrate the first scenario, we have included an additional panel in Figure 4 – figure supplement 1(B) to show how WormPsyQi performs on the PHB>AVA GRASP reporter with and without the channel having cytoplasmic marker. The original image was processed as-is in the former case with both the synaptic marker in green and cytoplasmic marker in red; for comparison, only the green channel having synaptic marker was used to simulate a situation where the strain does not have a cytoplasmic marker. As shown in the figure, in the presence of background autofluorescence signal from the gut (which can be easily confounded with GRASP puncta depending on the worm’s orientation), WormPsyQi quantified GRASP puncta much more robustly with the cytoplasmic label; without the cytoplasmic marker, gut puncta are incorrectly segmented as synapses (highlighted with red arrows) while some dim synaptic puncta are not picked up (highlighted with yellow arrows).

      To demonstrate the second scenario, we now highlight the case of ASK CLA-1 in Figure 2 - figure supplement 4E. Additionally, we have emphasized in the manuscript that in cases where the S/N ratio is good and the image is restricted to a small ROI, WormPsyQi will perform well even in the absence of a cytoplasmic marker. This is equally important to note as having a specific cytoplasmic marker in the background may not always be feasible and, in fact, if the cytoplasmic marker is discontinuous or dim relative to puncta signal, using a suboptimal neurite mask for synapse segmentation would result in undercounting synapses.

      1. On page 12, the author stated "We also note that in several cases, GRASP quantification differed from EM scoring". However, the EM scoring is primarily based on a single sample, making it challenging to conduct a statistical analysis for the purpose of comparison.

      This is correct and is indeed a limitation of EM for this type of analysis. We have now reworded this sentence (page 14) to emphasize the reviewer’s point, and it is also elaborated further in the limitations section.

      1. In Figure 6F, a discrepancy between WormPsyQi and human quantification is observed in the analysis of RAB-3. The authors stated that "the RAB-3 signal was too diffuse to resolve all puncta". To better illustrate this discrepancy, it would be beneficial to include images highlighting the puncta that WormPsyQi cannot score, providing direct evidence that diffuse signals cannot be detected automatically.

      To highlight puncta that were not segmented by WormPsyQi but were successfully scored manually, we have included arrows in Figure 6. In addition, for reporter M4p::GFP::RAB-3, we have included magnified insets in Figure 6 - figure supplement 1A to highlight the region where human annotator scores more puncta than WormPsyQi owing to the high synapse density. In future implementations, additional functionality can be built for separating these merged puncta into instances based on geometrical features such as shape and intensity contour.

      1. In Figure 9 S1D, the results from WormPsyQi and the manual are totally different. To address this notable discrepancy, the authors should highlight and illustrate the areas of discrepancy in the images. This visual representation can assist future users in identifying signal types that may not be well-suited for WormPsyQi analysis and inspire the development of new strategies to tackle such challenges.

      This is now addressed in additional figure panels in Figure 4 – figure supplement 1B and Figure 6 - figure supplement 1A.

      Reviewer #3 (Recommendations For The Authors):

      I found the comparison between manual quantification and WormPsyQi-based quantification to be very informative. In my opinion, quantifying the number of puncta is not the most tedious/difficult quantification even when done manually. Would the authors be able to include manual-WormPsyQi comparison for more time-consuming and potentially more prone to human error/bias quantifications such as puncta size or distribution patterns using a few markers with some inter/intra animal variabilities?

      To address this point, we have now included an additional figure supplement to Figure 2 (Figure 2 – figure supplement 4). We focused on the ASK GFP::CLA-1 reporter and had two human annotators manually label the masks of puncta for each worm by scanning Z-stacks and drawing all pixels belonging to each puncta in Fiji, which were then processed by WormPsyQi’s quantification pipeline to score puncta number, volume, and distribution. We also included a comparison of overall image processing time for each annotator and WormPsyQi. For features analyzed, the difference between WormPsyQi and human annotators for ASK CLA-1 is not statistically significant for multiple puncta features. Importantly, WormPsyQi reduces overall processing time by at least an order of magnitude, and while this is already advantageous for counting puncta, it is especially useful for other important puncta features since a) they may not be easily discernible, and b) it is extremely laborious to quantify them manually in large datasets when pixel-wise labels are required.

      The authors listed minimum human errors and biases as one of the benefits of WormPsyQi. For the markers with discrepancies in quantifications between human and WormPsyQi, have the authors encountered or considered human errors/biases as potential reasons for such discrepancies?

      This is the same point brought up by reviewer 1. We added Figure 2- figure supplement 3 to compare WormPsyQi to different human labelers, and show that because human labels can introduce systematic bias, WormPsyQi reduces such bias by scoring images using the same metric.

      The authors noted that WormPsyQi would be useful for comparing different genotypes/environments. Some mutants have known changes in synapse patterning/number. It would be helpful if the authors could validate WormPsyQi using some of the mutants with known synapse defects. For instance, zig-10 mutant increases the cholinergic synapse density just by a bit (Cherra and Jin, Neuron 2016), and nlr-1 mutant disrupts punctated localization of UNC-9 gap junction in the nerve ring (Meng and Yan, Neuron 2020), which could only be detectable by experts' eyes. It would be interesting to see if WormPsyQi picks up such subtle phenotypes.

      We agree that our pipeline would need to be tested in multiple paradigms to test its performance on detecting additional subtle phenotypes. In the context of this paper, we note that the developmental analysis of puncta in Figure 8 was performed to validate the ground truth from previous EM-based analyses (Witvliet et al., 2021), albeit the latter was limited by sample size. We extended this developmental analysis to the pharyngeal reporters, and in some cases the difference across timepoints was marginal (as emphasized by additional Figure 9 - figure supplement 2), but still detected by WormPsyQi. Lastly, our synapse localization analysis in Figure 10 assigns the probability of finding a synapse at a particular location along a neurite, which is not easily discernible by manual scoring.

      One of the benefits of the automated data analysis program is to be able to notice the differences you do not expect. For example, there are situations where you feel that in certain genotypes there is something different from wild type with their synapses but you can't tell what's different from wild type. In such cases, you may not know what to quantify. I think it would be beneficial if there were more parameters to be included in the default qualifications such as puncta number/size/intensity/distributions in the pipeline, so that the users may find unexpected phenotypes from one of the default quantifications.

      We apologize if this was not clear in the manuscript where we first describe the pipeline in detail. To clarify, the output of WormPsyQi is a CSV file which includes several quantitative features, such as mean/max/min fluorescence intensity, puncta volume, and position. While most of our analyses are focused on puncta count, the user can perform downstream statistical analyses on all additional features scored to infer which features vary most significantly across conditions. To make this clearer, we have elaborated the text where we first describe our pipeline, and along with the new Figure 2 - figure supplement 4, we hope that this point is clearer now.
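      A downstream analysis of such a CSV file could start from a per-condition summary like the sketch below. The column names (`condition`, `puncta_id`, `mean_intensity`, `volume`) and the values are assumptions made for illustration; the actual WormPsyQi output layout should be checked before adapting this.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical CSV content; real column names and values may differ.
csv_text = """condition,puncta_id,mean_intensity,volume
L4,1,100.0,2.0
L4,2,120.0,3.0
adult,1,130.0,2.5
adult,2,150.0,3.5
adult,3,140.0,3.0
"""


def summarize(csv_file):
    """Group puncta rows by condition and average each numeric feature."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["condition"]].append(row)
    summary = {}
    for condition, rows in groups.items():
        summary[condition] = {
            "puncta_count": len(rows),
            "mean_intensity": mean(float(r["mean_intensity"]) for r in rows),
            "mean_volume": mean(float(r["volume"]) for r in rows),
        }
    return summary


summary = summarize(io.StringIO(csv_text))
print(summary)
```

      In practice each worm would contribute its own rows, so statistics should be computed per animal before comparing conditions.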

      In addition, most proof-of-principle analysis we performed was focused on an ROI where we expect the synapses to localize. In practice, the user can input images and perform quantification across the entire image without biasing toward an ROI (this can be done in the GUI synapse corrector window) to also evaluate synaptic changes in regions outside the usual ROI.

      The authors stated that WormPsyQi could mitigate the problems stemming from scoring images with low signal-to-noise ratio or in regions with high background autofluorescence, laboriousness of scoring large datasets, and inter-dataset variability. Other than the 'laboriousness of scoring large datasets' it appeared to me that WormPsyQi does not do better than manual quantifications, especially inter-dataset variability, as the authors noted variability among the transgenes as one of the limitations of the toolkits. If two datasets are taken with completely different setups such as two independent arrays taken with two distinct confocal microscopes, would WormPsyQi make these two datasets comparable?

      We have included additional figure supplements to address the reviewer’s point. A significant advantage WormPsyQi offers over manual scoring is that it provides a standardized method of quantifying synapse features. As shown in Figure 2 – figure supplement 3, human labelers can introduce systematic bias (e.g. some over count puncta, while some undercount). In addition, while puncta number may be relatively easy to quantify, especially in a high-quality dataset, more subtle puncta features such as size, intensity, and distribution are much more laborious to quantify and require a priori knowledge of signal localization (Figure 2 – figure supplement 4, Figure 10). Altogether, our pipeline facilitates multiple measurements while also enabling robust quantification in hard-to-score cases such as the example shown for PHB>AVA reporter (Figure 4 - figure supplement 1B).

      Minor comments:

      The limitations discussed are not specific to this work; they are general limitations of concatemeric transgenes and fluorescently labeled synaptic proteins. I'd appreciate a discussion of limitations specific to WormPsyQi that relate to image acquisition. For instance, for neurons with 3D structures, would WormPsyQi be able to handle z-stacks closer to the coverslip and deeper stacks in a similar manner? Would the users need to be aware of such limitations when comparing different genotypes?

      To address the reviewer’s comment, we have elaborated the last paragraph in the limitations section to explicitly discuss where the user should exercise caution. The reviewer reasonably points out that the fluorescent signal away from the cover slip is typically dimmer, and neurite masking in this case is indeed compromised if dim to start with. In such cases, we recommend that the user either performs some preprocessing such as deconvolution, denoising, or contrast enhancement to boost the neurite signal, or segment synapses without the neurite mask if the puncta signal is brighter than that of the cytoplasmic marker. We hope that our additional figure supplements will clarify that WormPsyQi’s performance is contingent on reporter type and image quality, thus making it easier for the user to discern where automated quantification falls short and alternative reporters should be explored. In general, if puncta are not discernible to the user due to very poor S/N ratio, for instance, we do not recommend using WormPsyQi to process such datasets; this will be manifest in the results of the new “test all models” feature we added in the revised version.

      Some RAB-3 fusion proteins are described as RAB-3::GFP(BFP). Do these represent the C-terminal fusion of the fluorescent proteins? RAB-3 is a small GTPase with a lipid modification site at its C-terminus essential for its localization and function. Is it possible that the diffuse signal of some RAB-3 markers is caused by C-terminal fusion of the fluorescent protein?

      While we do have reporters with N- and C-terminal RAB-3 fusions for different neurons, we do not have both for the same neuron to perform a fair comparison. However, as noted in response to a previous comment by reviewer 2, RAB-3 and CLA-1 have distinct localization patterns at the synapse and this aligns with their distinct functions: while RAB-3 localizes at synaptic vesicles, CLA-1 is an active zone protein required for synaptic vesicle clustering. Accordingly, we have observed diffuse RAB-3 signal in reporters irrespective of where the protein is tagged, and while this is not problematic for ROIs with a low synapse density, it confounds quantification in synapse-dense regions. In contrast, CLA-1 puncta are typically easier to quantify more discretely, which is particularly relevant for features such as synapse distribution, size, and intensity.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this very strong and interesting paper the authors present a convincing series of experiments that reveal a molecular mechanism of neuronal cell type diversification in the nervous system of Drosophila. The authors show that a homeodomain transcription factor, Bsh, fulfills several critical functions - repressing an alternative fate and inducing downstream homeodomain transcription factors with whom Bsh may collaborate to induce L4 and L5 fates (the authors' accompanying paper reveals how Bsh can induce two distinct fates). The authors make elegant use of powerful genetic tools and an arsenal of satisfying cell identity markers.

      Thanks!

      I believe that this is an important study because it provides some fundamental insights into the conservation of neuronal diversification programs. It is very satisfying to see that similar organizational principles apply in different organisms to generate cell type diversity. The authors should also be commended for contextualizing their work very well, giving a broad, scholarly background to the problem of neuronal cell type diversification.

      Thanks!

      My one suggestion for the authors is to perhaps address in the Discussion (or experimentally address if they wish) how they reconcile that Bsh is on the one hand: (a) continuously expressed in L4/L4, (b) binding directly to a cohort of terminal effectors that are also continuously expressed but then, on the other hand, is not required for maintaining L4 fate? A few questions: Is Bsh only NOT required for maintaining Ap expression or is it also NOT required for maintaining other terminal markers of L4? The former could be easily explained - Bsh simply kicks off Ap, Ap then autoregulates, but Bsh and Ap then continuously activate terminal effector genes. The second scenario would require a slightly more complex mechanism: Bsh binding of targets (with Notch) may open chromatin, but then once that's done, Bsh is no longer needed and Ap alone can continue to express genes. I feel that the authors should at least discuss this. The postmitotic Bsh removal experiment in which they only checked Ap and derepression of other markers is a little unsatisfying without further discussion (or experiments, such as testing terminal L4 markers). I hasten to add that this comment does not take away from my overall appreciation for the depth and quality of the data and the importance of their conclusions.

      Great suggestions, we will discuss these two hypotheses as requested.

      Bsh initiates Ap expression in L4 neurons which then maintain Ap expression independently of Bsh expression, likely through Ap autoregulation. During the synaptogenesis window, Ap expression becomes independent from Bsh expression, but Bsh and Ap are both still required to activate the synapse recognition molecule DIP-beta. Additionally, Bsh also shows putative binding to other L4 identity genes, e.g., those required for neurotransmitter choice, and electrophysiological properties, suggesting Bsh may initiate L4 identity genes as a suite of genes. The mechanism of maintaining identity features (e.g., morphology, synaptic connectivity, and functional properties) in the adult remains poorly understood. It is a great question whether primary HDTF Bsh maintains the expression of L4 identity genes in the adult. To test this, in our next project, we will specifically knock out Bsh in L4 neurons of the adult fly and examine the effect on L4 morphology, connectivity, and function properties.

      Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors explore the role of the Homeodomain Transcription Factor Bsh in the specification of Lamina neuronal types in the optic lobe of Drosophila. Using the framework of terminal selector genes and compelling data, they investigate whether the same factor that establishes early cell identity is responsible for the acquisition of terminal features of the neuron (i.e., cell connectivity and synaptogenesis).

      Thanks for the positive words!

      The authors convincingly describe the sequential expression and activity of Bsh, termed here as 'primary HDTF', and of Ap in L4 or Pdm3 in L5 as 'secondary HDTFs' during the specification of these two neurons. The study demonstrates the requirement of Bsh to activate either Ap or Pdm3, and therefore to generate the L4 and L5 fates. Moreover, the authors show that in the absence of Bsh, L4 and L5 fates are transformed into L1- or L3-like fates.

      Thanks!

      Finally, the authors used DamID and Bsh:DamID to profile the open chromatin signature and the Bsh binding sites in L4 neurons at the synaptogenesis stage. This allows the identification of putative Bsh target genes in L4, many of which were also found to be upregulated in L4 in a previous single-cell transcriptomic analysis. Among these genes, the paper focuses on Dip-β, a known regulator of L4 connectivity. They demonstrate that both Bsh and Ap are required for Dip-β, forming a feed-forward loop. Indeed, the loss of Bsh causes abnormal L4 synaptogenesis and therefore defects in several visual behaviors. The authors also propose the intriguing hypothesis that the expression of Bsh expanded the diversity of Lamina neurons from a 3 cell-type state to the current 5 cell-type state in the optic lobe.

      Thanks for the excellent summary of our findings!

      Strengths:

      Overall, this work presents a beautiful practical example of the framework of terminal selectors: Bsh acts hierarchically with Ap or Pdm3 to establish the L4 or L5 cell fates and, at least in L4, participates in the expression of terminal features of the neuron (i.e., synaptogenesis through Dip-β regulation).

      Thanks!

      The hierarchical interactions among Bsh and the activation of Ap and Pdm3 expression in L4 and L5, respectively, are well established experimentally. Using different genetic drivers, the authors show a window of competence during L4 neuron specification during which Bsh activates Ap expression. Later, as the neuron matures, Ap becomes independent of Bsh. This allows the authors to propose a coherent and well-supported model in which Bsh acts as a 'primary' selector that activates the expression of L4-specific (Ap) and L5-specific (Pdm3) 'secondary' selector genes, that together establish neuronal fate.

      Thanks again!

      Importantly, the authors describe a striking cell fate change when Bsh is knocked down from L4/L5 progenitor cells. In such cases, L1 and L3 neurons are generated at the expense of L4 and L5. The paper demonstrates that Bsh in L4/L5 represses Zfh1, which in turn acts as the primary selector for L1/L3 fates. These results point to a model where the acquisition of Bsh during evolution might have provided the grounds for the generation of new cell types, L4 and L5, expanding lamina neuronal diversity for more refined visual behaviors in flies. This is an intriguing and novel hypothesis that should be tested from an evo-devo standpoint, for instance by identifying a species where L4 and L5 do not exist and/or Bsh is not expressed in L neurons.

      Thanks for the appreciation of our findings!

      To gain insight into how Bsh regulates neuronal fate and terminal features, the authors have profiled the open chromatin landscape and Bsh binding sites in L4 neurons at mid-pupation using the DamID technique. The paper describes a number of genes that have Bsh binding peaks in their regulatory regions and that are differentially expressed in L4 neurons, based on available scRNAseq data. Although the manuscript does not explore this candidate list in depth, many of these genes belong to classes that might explain terminal features of L4 neurons, such as neurotransmitter identity, neuropeptides or cytoskeletal regulators. Interestingly, one of these upregulated genes with a Bsh peak is Dip-β, an immunoglobulin superfamily protein that has been described by previous work from the authors' lab to be relevant to establish L4 proper connectivity. This work proves that Bsh and Ap work in a feed-forward loop to regulate Dip-β expression, and therefore to establish normal L4 synapses. Furthermore, Bsh loss of function in L4 impairs visual behaviors.

      Thanks for the excellent summary of our findings.

      Weaknesses:

      ● The last paragraph of the introduction is written using rhetorical questions and does not read well. I suggest rewriting it in a more conventional direct style to improve readability.

      We agree and have updated the text as suggested.

      ● A significant concern is the way in which information is conveyed in the Figures. Throughout the paper, understanding of the experimental results is hindered by the lack of information in the Figure headers. Specifically, the genetic driver used for each panel should be adequately noted, together with the age of the brain and the experimental condition. For example, R27G05-Gal4 drives early expression in LPCs and L4/L5, while the 31C06-AD, 34G07-DBD Split-Gal4 combination drives expression in older L4 neurons, and the use of one or the other to drive Bsh-KD has dramatic differences in Ap expression. The indication of the driver used in each panel will facilitate the reader's grasp of the experimental results.

      We agree and have updated the figure annotation.

      ● Bsh role in L4/L5 cell fate:

      ● It is not clear whether Tll+/Bsh+ LPCs are the precursors of L4/L5. Morphologically, these cells sit very close to L5, but are much more distant from L4.

      Our current data show L4 and L5 neurons are generated by different LPCs. However, currently, we don’t have tools to demonstrate which subset of LPCs generate which lamina neuron type. We are currently working on a follow-up manuscript on LPC heterogeneity, but those experiments have just barely been started.

      ● Somatic CRISPR knockout of Bsh seems to have a weaker phenotype than the knockdown using RNAi. However, in several experiments down the line, the authors use CRISPR-KO rather than RNAi to knock down Bsh activity: it should be explained why the authors made this decision. Alternatively, a null mutant could be used to consolidate the loss of function phenotype, although this is not strictly necessary given that the RNAi is highly efficient and almost completely abolishes Bsh protein.

      The reason we chose CRISPR-KO (L4-specific Gal4, uas-Cas9, and uas-Bsh-sgRNAs) is that it effectively removed Bsh expression from the majority of L4 neurons. In contrast, Bsh-RNAi driven by L4-split Gal4 failed to knock down Bsh in L4 neurons because L4-split Gal4 expression depends on Bsh. We have updated this explanation in the text.

      ● Line 102: Rephrase "R27G05-Gal4 is expressed in all LPCs and turned off in lamina neurons" to "is turned off as lamina neurons mature", as it is kept on for a significant amount of time after the neurons have already been specified.

      Thanks; we have made that change.

      ● Line 121: "(a) that all known lamina neuron markers become independent of Bsh regulation in neurons" is not an accurate statement, as the markers tested were not shown to be dependent on Bsh in the first place.

      Good point. We have rephrased it as “that all known lamina neuron markers are independent of Bsh regulation in neurons”.

      ● Lines 129-134: Make explicit that the LPC-Gal4 was used in this experiment. This is especially important here, as these results are opposite to the Bsh Loss of Function in L4 neurons described in the previous section. This will help clarify the window of competence in which Bsh establishes L4/L5 neuronal identities through ap/pdm3 expression.

      Thanks! We have updated Gal4 information in the text for every manipulation.

      ● DamID and Bsh binding profile:

      ● Figure 5 - figure supplement 1C-E: The genotype of the Control in (C) has to be described within the panel. As it is, it can be confused with a wild type brain, when it is in fact a Bsh-KO mutant.

      Great point! Thank you for catching this and we have updated it.

      ● It is not clear how L4-specific Differentially Expressed Genes were found. Are these genes DEG between lamina neuron types, or are they upregulated genes with respect to all neuronal clusters? If the latter is the case, it could explain the discrepancy between scRNAseq DEGs and Bsh peaks in L4 neurons.

      We did not use “L4-specific Differentially Expressed Genes”. Instead, we used all genes that are significantly transcribed in L4 neurons (line 209-213).

      ● Dip-β regulation:

      ● Line 234: It is not clear why CRISPR KO is used in this case, when Bsh-RNAi presents a stronger phenotype.

      As we explained above, the reason we chose CRISPR-KO (L4-specific Gal4, uas-Cas9, and uas-Bsh-sgRNAs) is that it effectively removed Bsh expression from the majority of L4 neurons. In contrast, Bsh-RNAi driven by L4-split Gal4 failed to knock down Bsh in L4 neurons because L4-split Gal4 expression depends on Bsh. We have updated this explanation in the text.

      ● Figure 6N-R shows results using LPC-Gal4. It is not clear why this driver was used, as it makes a less accurate comparison with the other panels in the figure, which use L4-Split-Gal4. This discrepancy should be acknowledged and explained, or the experiment repeated with L4-Split-Gal4>Ap-RNAi.

      I think you mean 6J-M shows results using LPC-Gal4. We first tried L4-Split-Gal4>Ap-RNAi but it failed to knock down Ap because L4-Split-Gal4 expression depends on Ap. We have added this to the text.

      ● Line 271: It is also possible that L4 activity is dispensable for motion detection and only L5 is required.

      Thanks! Work from Tuthill et al, 2013 showed that L5 is not required for any motion detection. We have included this citation in the text.

      ● Discussion: It is necessary to de-emphasize the relevance of HDTFs, or at least acknowledge that other, non-homeodomain TFs, can act as selector genes to determine neuronal identity. By restricting the discussion to HDTFs, it is not mentioned that other classes of TFs could follow the same Primary-Secondary selector activation logic.

      That is a great point, thank you! We have included this in the discussion.

    1. Author Response:

      We thank all reviewers for their comments and effort to improve our paper. We appreciate that the writing can be clarified overall, and some sections need more elaboration. We will provide these in the next revision within the coming months. Particularly, we will focus on some common themes identified by all reviewers:

      1. We will clarify that the coarse-grained brain surfaces are an output of our algorithm alone and not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our analysis purely focuses on the likeliness in terms of whole-brain morphometrics between actual brains and coarse-grained brains. Specifically on the point of “thickening” of the brain: this is anatomically well-founded, as less folded brains have a “thicker” cortex than more folded brains, when they are all normalised to the same size. This is fundamentally why the universal scaling law also applies to these coarse-grained brains. We will provide more detail to highlight this.

      2. We will clarify the motivation behind our coarse-graining procedure better: mathematically, this is directly inspired by box-counting algorithms in fractal geometry; but this algorithm also has elegant parallels with other algorithms which we will highlight.

      3. The age effects are demonstrated here in a small sample as a proof-of-principle, but we will update the manuscript with our latest results using ~100 subjects from the CamCAN data demonstrating the same effect. We have additionally described and verified these age effects in more detail in a separate preprint (https://arxiv.org/abs/2311.13501) with ~1500 subjects, and additionally showed that scale-dependent metrics substantially improve understanding and applications such as brain age prediction.

      4. We have independently also received the feedback that we need to clarify how our method interacts with different resolution of the original MRI. We will add this as a new set of results, demonstrating that the MRI acquisition resolution (within a reasonable range) has a very small effect, as our method takes the reconstructed surfaces as a starting point.

      5. We agree that it may be confusing to emphasise a constant K in the first set of results across species, and then later highlight a changing K in the human ageing results. We will clarify that in the first set of results, we find a “constant” K relative to a changing S: The range in K across melted primate brains is approx 0.1, whereas in S it is over 1.2. In other words, S changes are an order of magnitude higher than K changes. Hence, we described K as “constant” relative to S. Nevertheless, K shows subtle changes within individuals, which is what we are describing in the human ageing results. These changes are within the range of K values described in the across species results.

      6. Finally, we will also make sure to summarise our specific contributions beyond existing work:

        (i) Showing for the first time that representative primate species follow the exact same fractal scaling – as opposed to previous work showing that they have a similar fractal dimension, i.e. slope, but not necessarily the same offset, as previous methods had no consistent way of comparing offsets.

        (ii) Previous work could also not show direct agreement in morphometrics between the coarse-grained brains of primate species and other non-primate mammalian species.

        (iii) Demonstrating in proof-of-principle that multiscale morphometrics, in practice, can have much larger effect sizes for classification applications. This moves beyond our previous work where we only showed the scaling law across and within species, but all on one (native) scale with comparable effect sizes for classification applications.

    1. Author Response

      Reviewer #2 (Public Review):

      Weaknesses:

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We will include the suggested data on the Cryo-EM analyses in a revised version of the preprint. We did not collect data on the sample used for the seeds in the cross seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. In a revised version we will include the analyses of several more datasets at the F1 and F2 conditions to support this statement.

      Reviewer #3 (Public Review):

      Weaknesses:

      1. The authors reveal that both Type 1 monofilament fibril polymorph (reminiscent of JOS-like polymorph) and Type 5 polymorph (akin to tissue-amplified-like polymorph) can both form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibril across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorph 3B and 3C, indicating a degree of instability in fibrillar formation. The variability would potentially pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations:

      (1) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7 where the largest variety of polymorphs has been observed. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (2) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. The pH 7.4 conditions were only mentioned twice, once as an unseeded and once as a cross-seeded fibrilization. We solved a second Type 1 structure from a second dataset from the same protein batch fibrillized under similar conditions at pH 7.4 but with the addition of inositol trisphosphate in the hopes that we could replicate one of the in vivo polymorphs. However, only the Type 1 polymorphs were observed and so we will add this data point to the revised manuscript. We are currently screening more fibrils produced at pH 7.0 and will include any replicates of Type 5 or the Type 1M polymorphs or of new structures that are obtained at these conditions… however, as noted in the original manuscript, reproducibility at this pH might be difficult because there appears to be a wider range of accessible polymorphs. As will be mentioned in the revised version, the Type 5 structure was solved from a manually picked set of fibers that represented 10-20% of the observed fibrils. The remaining fibers in the sample comprised polymorphs that could not be analyzed due to their inhomogeneity or lack of twist.

      (3) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.

      The correlation of toxicity with structure would in principle be interesting. However the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease relevant structures. Still, it is rare that a single polymorph appears at 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      2. The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodologically robust approach to address this should be conducted through a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, are all very important for our understanding of the aggregation pathways of alpha-synuclein and are currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds.

      We thank the reviewer for reminding us to include a reference to these studies as a clear example of polymorph selection by cross-seeding which we will do in the revised version. Unfortunately, it is not 100% clear from the G51D cross seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding since different conditions were used for the seedless wild-type and mutant aggregations… however it appears that the wild-type without seeds was Tris pH 7.5 (although at 37C the pH could have dropped to 7-ish) and the cross-seeded wild-type was in Phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118). In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed specific manner.

      3. In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We will add more discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      4. In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As suggested by reviewer #2, we will add more comprehensive information on the 3D reconstruction and refinement process to a revised version.

      5. The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviations. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and will be corrected in a revised version.

      Reviewing Editor:

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work.

      We agree as stated above and will continue to work on this important point.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths

      This paper is well situated theoretically within the habit learning/OCD literature.

      Daily training in a motor-learning task, delivered via smartphone, was innovative, ecologically valid and more likely to assay habitual behaviors specifically. Daily training is also more similar to studies with non-humans, making a better link with that literature. The use of a sequential-learning task (cf. tasks that require a single response) is also more ecologically valid.

      The in-laboratory tests (after the 1 month of training) allowed the researchers to test if the OCD group preferred familiar, but more difficult, sequences over newer, simpler sequences.

      The authors achieved their aims in that two groups of participants (patients with OCD and controls) engaged with the task over the course of 30 days. The repeated nature of the task meant that 'overtraining' was almost certainly established, and automaticity was demonstrated. This allowed the authors to test their hypotheses about habit learning. The results are supportive of the authors' conclusions.

      Response: We truly appreciate the positive assessment of referee 1, particularly the consideration that our study is theoretically strong and that ‘the results are supportive of the authors' conclusions’. This is an important external endorsement of our conclusions, contrasting somewhat with the views of referee 2.

      Weaknesses

      The sample size was relatively small. Some potentially interesting individual differences within the OCD group could have been examined more thoroughly with a bigger sample (e.g., preference for familiar sequences). A larger sample may have allowed the statistical testing of any effects due to medication status. The authors were not able to test one criterion of habits, namely resistance to devaluation, due to the nature of the task.

      Response: We agree with the reviewer that the proof of principle established in our study opens new avenues for research into the psychological and behavioral determinants of the heterogeneity of this clinical population. However, considering the study timeline and the pandemic constraints, a bigger sample was not possible. Our sample can indeed be considered small if one compares it with current online studies, which do not require in-person/laboratory testing, thus being much easier to recruit and conduct. However, given the nature of our protocol (with 2 demanding test phases, 1-month engagement per participant and the inclusion of OCD patients without comorbidities only) and the fact that this study also involved laboratory testing, we consider our sample size reasonable and comparable to other laboratory studies (typically comprising on average between 30-50 participants in each group).

      This article is likely to be impactful -- the delivery of a task across 30 days to a patient group is innovative and represents a new approach for the study of habit learning that is superior to an in-laboratory approach.

      An interesting aspect of this manuscript is that it prompts a comparison with previous studies of goal-directed/habitual responding in OCD that used devaluation protocols, and which may have had their effects due to deficits in goal-directed behavior and not enhanced habit learning per se.

      Response: Thank you for acknowledging the impact of our study, in particular the unique ability of our task to interrogate the habit system.

      Reviewer #2 (Public Review):

      In this study, the researchers employed a recently developed smartphone application to provide 30 days of training on action sequences to both OCD patients and healthy volunteers. The study tested learning and automaticity-related measures and investigated the effects of several factors on these measures. Upon training completion, the researchers conducted two preference tests comparing learned and unlearned action sequences under different conditions. While the study provides some interesting findings, I have a few substantial concerns:

      1. Throughout the entire paper, the authors' interpretations and claims revolve around the domain of habits and goal-directed behavior, despite the methods and evidence clearly focusing on motor sequence learning/procedural learning/skill learning. There is no evidence to support this framing and interpretation and thus I find them overreaching and hyperbolic, and I think they should be avoided. Although skills and habits share many characteristics, they are meaningfully distinguishable and should not be conflated or mixed up. Furthermore, if anything, the evidence in this study suggests that participants attained procedural learning, but these actions did not become habitual, as they remained deliberate actions that were not chosen to be performed when they were not in line with participants' current goals.

      Response: We acknowledge that the research on habit learning is a topic of current controversy, especially when it comes to how to induce and measure habits in humans. Therefore, within this context referee 2’s criticism could be expected. Across distinct fields of research, different methodologies have been used to measure habits, which represent relatively stereotyped and autonomous behavioral sequences enacted in response to a specific stimulus without consideration, at the time of initiation of the sequence, of the value of the outcome or any representation of the relationship that exists between the response and the outcome. Hence these are stimulus-bound responses which may or may not require the implementation of a skill during subsequent performance. Behavioral neuroscientists define habits similarly, as stimulus-response associations which are independent of reward or outcome, and use devaluation or contingency degradation strategies to probe habits (Dickinson and Weiskrantz, 1985; Tricomi et al., 2009). Others conceptualize habits as a form of procedural memory, along with skills, and use motor sequence learning paradigms to investigate and dissect different components of habit learning such as action selection, execution and consolidation (Abrahamse et al., 2013; Doyon et al., 2003; Squire et al., 1993). It is also generally agreed that the autonomous nature of habits and the fluid proficiency of skills are both usually achieved with many hours of training or practice, respectively (Haith and Krakauer, 2018).

      We consider that Balleine and Dezfouli (2019) made an excellent attempt to bring all these different criteria within a single framework, which we have followed. We also consider that our discussion in fact followed a rather cautious approach to interpretation solely in terms of goal-directed versus habitual control.

      Referee 2 does not actually specify criteria by which they define habits and skills, except for asserting that skilled behavior is goal-directed, without mentioning what the actual goal of the implementation of such skill is in the present study: the fulfillment of a habit? We assume that their definition of habit hinges on the effects of devaluation, as a single criterion of habit, but which according to Balleine and Dezfouli (2019) is only 1 of their 4 listed criteria. We carefully addressed this specific criterion in our manuscript: “We were not, however, able to test the fourth criterion, of resistance to devaluation. Therefore, we are unable to firmly conclude that the action sequences are habits rather than, for example, goal-directed skills. Regardless of whether the trained action sequences can be defined as habits or goal-directed motor skills, it has to be considered…”. Therefore, we took due care in our conclusions concerning habits and thus found the referee’s comment misleading and unfair.

      We note that our trained motor sequences did in fact fulfil the other 3 criteria listed by Balleine and Dezfouli (2019), unlike many studies employing only devaluation (e.g. Tricomi et al 2009; Gillan et al 2011). Moreover, we cited a recent study using very similar methodology where the devaluation test was applied and shown to support the habit hypothesis (Gera et al., 2022).

      Whether the initiation of the trained motor sequences in experiment 3 (arbitration) is underpinned by an action-outcome association (or not) has no bearing on whether those sequences were under stimulus-response control after training (experiment 1). Transitions between habitual and goal-directed control over behavior are quite well established in the experimental literature, especially when choice opportunities become available (Bouton, 2021; Frölich et al., 2023) or when a new goal-directed schema is recruited to fulfill a habit (Fouyssac et al., 2022). This switching between habits and goal-directed responding may reflect the coordination of these systems in producing effective behavior in the real world.

      • Fouyssac M, Peña-Oliver Y, Puaud M, Lim NTY, Giuliano C, Everitt BJ, Belin D. (2021). Negative Urgency Exacerbates Relapse to Cocaine Seeking After Abstinence. Biological Psychiatry. doi: 10.1016/j.biopsych.2021.10.009

      • Frölich S, Esmeyer M, Endrass T, Smolka MN and Kiebel SJ (2023) Interaction between habits as action sequences and goal-directed behavior under time pressure. Front. Neurosci. 16:996957. doi: 10.3389/fnins.2022.996957

      • Bouton ME. 2021. Context, attention, and the switch between habit and goal-direction in behavior. Learn Behav 49:349–362. doi:10.3758/s13420-021-00488-z

      1. Some methodological aspects need more detail and clarification.

      2. There are concerns regarding some of the analyses, which require addressing.

      Response: We thank referee 2 for their detailed review of the methods and analyses of our study and for the helpful feedback, which clearly helps improve our manuscript. We will clarify the methodological aspects in detail and conduct the suggested analysis. Please see below our answers to the specific points raised.

      Introduction:

      1. It is stated that "extensive training of sequential actions would more rapidly engage the 'habit system' as compared to single-action instrumental learning". In an attempt to describe the rationale for this statement the authors describe the concept of action chunking, its benefits and relevance to habits but there is no explanation for why sequential actions would engage the habit system more rapidly than a single-action. Clarifying this would be helpful.

      Response: We agree that there is no evidence that action sequences become habitual more readily than single actions, although action sequences clearly allow ‘chunking’ and thus likely engage neural networks including the putamen which are implicated in habit learning as well as skill. In our revised manuscript we will instead state: “we have recently postulated that extensive training of sequential actions could be a means for rapidly engaging the ‘habit system’ (Robbins et al., 2019)”.

      DONE in page 2

      1. In the Hypothesis section the authors state: “we expected that OCD patients... show enhanced habit attainment through a greater preference for performing familiar app sequences when given the choice to select any other, easier sequence”. I find it particularly difficult to interpret preference for familiar sequences as enhanced habit attainment.

      Response: We agree that choice of the familiar response sequence should not be a necessary criterion for habitual control although choice for a familiar sequence is, in fact, not inconsistent with this hypothesis. In a recent study, Zmigrod et al (2022) found that 'aversion to novelty' was a relevant factor in the subjective measurement of habitual tendencies. It should also be noted that this preference was present in patients with OCD. If one assumes instead, like the referee, that the familiar sequence is goal-directed, then it contravenes the well-known 'egodystonia' of OCD which suggests that such tendencies are not goal-directed.

      To clarify our hypothesis, we will amend the sentence to the following: “Finally, we expected that OCD patients would generally report greater habits, as well as attribute higher intrinsic value to the familiar app sequences manifested by a greater preference for performing them when given the choice to select any other, easier sequence”.

      DONE in page 5. We have now rephrased it: “Additionally, we hypothesized that OCD patients would generally display stronger habits and assign greater intrinsic value to the familiar app sequences, evidenced by a marked preference for executing them even when presented with a simpler alternative sequence.”

      A few notes on the task description and other task components:

      1. It would be useful to give more details on the task. This includes more details on the time/condition of the gradual removal of visual and auditory stimuli and also on the within practice dynamic structure (i.e., different levels appear in the video).

      Response: These details will be included in the revised manuscript. Thank you for pointing out the need for further clarification of the task design.

      Done in page 7

      1. Some more information on engagement-related exclusion criteria would be useful (what happened if participants did not use the app for more than one day, how many times were allowed to skip a day etc.).

      Response: This additional information will be added to the revised manuscript. If participants omitted to train for more than 2 days, the researcher would send a reminder to the participant to request to catch up. If the participant would not react accordingly and a third day would be skipped, then the researcher would call to understand the reasons for the lack of engagement and gauge motivation. The participant would be excluded if more than 5 sequential days of training were missed. Only 2 participants were excluded given their lack of engagement.

      Done in page 8

      1. According to the (very useful) video demonstrating the task and the paper describing the task in detail (Banca et al., 2020), the task seems to include other relevant components that were not mentioned in this paper. I refer to the daily speed test, the daily random switch test, and daily ratings of each sequence's enjoyment and confidence of knowledge.

      If these components were not included in this procedure, then the deviations from the procedure described in the video and Banca et al. (2020) should be explicitly mentioned. If these components were included, at least some of them may be relevant, at least in part, to automaticity, habitual action control, formulation of participants' enjoyment from the app etc. I think these components should be mentioned and analyzed (or at least provide an explanation for why it has been decided not to analyze them).

      This is also true for the reward removal (extinction) from the 21st day onwards which is potentially of particular relevance for the research questions.

      Response: The task procedure was indeed the same as detailed in Banca et al., 2020. We did not include these extra components in this current manuscript for reasons of succinctness and because the manuscript was already rather longer than a common research article, given that we present three different, though highly inter-dependent, experiments in order to answer key interrelated questions in an optimal manner. However, since referee 2 considers this additional analysis to be important, we will be happy to include it in the supplementary material of the revised manuscript.

      These additional components of the task as well as the respective analysis are now described in the Supplementary Materials.

      Training engagement analysis:

      1. I find referring to the number of trials including successful and unsuccessful trials as representing participants "commitment to training" (e.g. in Figure legend 2b) potentially inadequate. Given that participants need at least 20 successful trials to complete each practice, more errors would lead to more trials. Therefore, I think this measure may mostly represent weaker performance (of the OCD patients as shown in Figure 2b). Therefore, I find the number of performed practice runs, as used in Figure 2a (which should be perfectly aligned with the number of successful trials), a "clean" and proper measure of engagement/commitment to training.

      Response: We acknowledge the referee’s concern on this matter and agree to replace the y-axis variable of Figure 2b with the number of performed practices (thus aligning with Figure 2a). This amendment will remove any potential effect of weaker performance on the engagement measurement and will provide clearer results.

      We have now decided to remove this figure as it does not add much to Figure 2a. Instead, we replaced Figures 2b and 2c with new plots, following a new analysis linked to the next reviewer request (point 10).

      1. Also, to provide stronger support for the claim about different diurnal training patterns (as presented in Figure 2c and the text) between patients and healthy individuals, it would be beneficial to conduct a statistical test comparing the two distributions. If the results of this test are not significant, I suggest emphasizing that this is a descriptive finding.

      Response: Done, see revised Figure 2b and 2c. We have assessed the diurnal training patterns within each group using circular statistics, followed by independent-sample statistical testing of those circular distributions with Watson’s U² test (Landler et al., 2021). While OCD participants have a group effect of practice with a significant peak at ~18:00, and HV participants have an earlier significant peak at ~15:00, Watson’s U² test did not find statistically significant between-group differences.

      • Landler L, Ruxton GD, Malkemper EP. Advice on comparing two independent samples of circular data in biology. Scientific reports. 2021 Oct 13;11(1):20337.
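
      As an illustration of why circular statistics are needed for time-of-day data, the following minimal sketch computes a circular mean and mean resultant length for practice times on a 24-hour clock. It is not the authors' code and does not implement Watson’s U² test; the function name and the example hours are assumptions made for illustration only.

```python
import math

def circular_mean_hour(hours):
    """Circular mean (in hours) and mean resultant length for clock times.

    Hypothetical helper: `hours` are times of day on a 24 h cycle.
    """
    # Map each clock time onto an angle on the unit circle.
    angles = [2 * math.pi * h / 24.0 for h in hours]
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    # Direction of the mean vector, wrapped into [0, 2*pi).
    mean_angle = math.atan2(s, c) % (2 * math.pi)
    # Length of the mean vector: near 1 = concentrated, near 0 = dispersed.
    resultant_length = math.hypot(s, c)
    return mean_angle * 24.0 / (2 * math.pi), resultant_length

# Illustrative values only, not study data.
peak, concentration = circular_mean_hour([17, 18, 19])
```

      Unlike an arithmetic mean, this estimate treats 23:00 and 01:00 as two hours apart, so their mean falls at midnight rather than at noon.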

      Learning results:

      1. When describing the Learning results (p10) I think it would be useful to provide the descriptive stats for the MT0 parameter (as done above for the other two parameters).

      Response: Thank you for pointing this out. The descriptive stats for MT0 will be added to the revised version of the manuscript.

      Done page 11

      1. Sensitivity of sequence duration and IKI consistency (C) to reward:

      I think it is important to add details on how incorrect trials were handled when calculating ∆MT (or C) and ∆R, specifically in cases where the trial preceding a successful trial was unsuccessful. If incorrect trials were simply ignored, this may not adequately represent trial-by-trial changes, particularly when testing the effect of a trial's outcome on performance change in the next trial.

      Response: This is an important question. Our analysis protocol was designed to ensure that incorrect trials do not contaminate or confound the results. To estimate the trial-to-trial difference in ∆MT (or C) and ∆R, we exclusively included pairs of contiguous trials where participants achieved correct performance and received feedback scores for both trials. For example, if a participant made a performance error on trial 23, we did not include ∆R or ∆MT estimates for the pairs of trials 23-22 and 24-23. Instead of excluding incorrect trials from our analyses, we retained them in our time series but assigned them a NaN (not a number) value in Matlab. As a result, ∆R and ∆MT were not defined for those two pairs of trials. The same applies to C. This approach ensured that our analyses are not confounded by incremental or decremental feedback scores between noncontiguous trials. In the past, when assessing the timing of correct actions during skilled sequence performance, we also considered events that were preceded and followed by correct actions. This excluded effects such as post-error slowing from contaminating our results (Herrojo Ruiz et al., 2009, 2019). Therefore, we do not believe that any further reanalysis is required.

      • Ruiz MH, Jabusch HC, Altenmüller E. Detecting wrong notes in advance: neuronal correlates of error monitoring in pianists. Cerebral cortex. 2009 Nov 1;19(11):2625-39.

      • Bury G, García-Huéscar M, Bhattacharya J, Ruiz MH. Cardiac afferent activity modulates early neural signature of error detection during skilled performance. NeuroImage. 2019 Oct 1;199:704-17.
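
      The NaN-masking approach described in the response above can be sketched as follows. This is a minimal Python illustration rather than the authors' Matlab code; the function name, argument names and example values are assumptions.

```python
import math

def paired_differences(values, correct):
    """Trial-to-trial differences, defined only for pairs of correct trials.

    Hypothetical helper: `values` holds one measure per trial (e.g. MT or
    reward score) and `correct` flags whether that trial was performed
    correctly.
    """
    nan = float("nan")
    # Keep incorrect trials in the series, but mark them as NaN.
    masked = [v if ok else nan for v, ok in zip(values, correct)]
    # Any difference involving a NaN is itself NaN, so both pairs that
    # touch an incorrect trial are left undefined.
    return [masked[i + 1] - masked[i] for i in range(len(masked) - 1)]

# Illustrative values only: the second trial is an error trial.
diffs = paired_differences([10.0, 9.0, 8.0, 7.0], [True, False, True, True])
```

      Because incorrect trials stay in the series as NaN, a difference is never computed across a gap between noncontiguous correct trials.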

      1. I have a serious concern with respect to how the sensitivity of sequence duration to reward is framed and analyzed. Since reward is proportional to performance, a reduction in reward essentially indicates a trial with poor performance, and thus even regression to the mean (along with a floor effect in performance [asymptote]) could explain the observed effects. It is possible that even occasional poor performance could lead to a participant demonstrating this effect, potentially regardless of the reward. Accordingly, the reduced improvement in performance following a reward decrease as a function of training length described in Figure 5b legend may reflect training-induced increased performance that leaves less room for improvement after poor trials, which are no longer as poor as before. To address this concern, controlling for performance (e.g., by taking into consideration the baseline MT for the previous trial) may be helpful. If the authors can conduct such an analysis and still show the observed effect, it would establish the validity of their findings.

      Response: Thank you for raising this point. This has been done, see updated Figures 5 and 6. After normalizing the ∆MT(n+1) := MT(n+1) – MT(n) difference values by dividing them by the baseline MT(n) at trial n, we obtain the same results. Similar results are also obtained for IKI consistency (C).
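
      Written out, the normalization described above takes the following form, where the label ∆MT_norm is an assumed notation introduced here for illustration and does not appear in the manuscript:

```latex
\Delta MT_{\mathrm{norm}}(n+1) = \frac{MT(n+1) - MT(n)}{MT(n)}
```

      Expressed this way, each change is a fraction of the preceding trial's movement time, so a given absolute speed-up weighs more when baseline performance is already fast.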

      See below our initial response from June 2023.

      Thank you for raising this point. Figure 5b illustrates two distinct effects of reward changes on behavioral adaptation, which are expected based on previous research.

      I. Practice effects: Firstly, we observe that as participants progress across bins of practice, the degree of improvement in behavior (reflected by faster movement time, MT) following a decrease in reward (∆R−) diminishes, consistent with our expectations based on previous work. Conversely, we found that ∆MT does not change across bins of practices following an increase in reward (∆R+).

      We appreciate the reviewer’s suggestion regarding controlling for the reference movement time (MT) in the previous trial when examining the practice effect in the p(∆T|∆R−) and p(∆T|∆R+) distributions. In the revised manuscript, we will conduct the proposed control analysis to better understand whether the sensitivity of MT to score decrements changes across practice when normalising MT to the reference level on each trial. But see below for a preliminary control analysis.

      II. Asymmetry of the effect of ∆R− and ∆R+ on performance: Figure 5b also depicts the distinct impact of score increments and decrements on behavioural changes. When aggregating data across practice bins, we consistently observed that the centre of the p(∆T|∆R−) distribution was smaller (more negative) than that of p(∆T|∆R+). This suggests that participants exhibited a greater acceleration following a drop in scores compared to a relative score increase, and this effect persisted throughout the practice sessions. Importantly, this enhanced sensitivity to losses or negative feedback (or relative drops in scores) aligns with previous research findings (Galea et al., 2015; Pekny et al., 2014; van Mastrigt et al., 2020).

      We have conducted a preliminary control analysis to exclude the potential impact that reference movement time (MT) values could have on our analysis. We have assessed the asymmetry between behavioural responses to ∆R− and ∆R+ using the following analysis: We estimated the proportion of trials in which participants exhibited speed-up (∆T < 0) or slow-down (∆T > 0) behaviour following ∆R− and ∆R+ across different practice bins (bins 1 to 4). By discretising the series of behavioural changes (∆T) into binary values (+1 for slowing down, -1 for speeding up), we can assess the type of changes (speed-up, slow-down) without the absolute ∆T or T values contributing to our results. We obtained several key findings:

      • Consistent with expectations (sanity check), participants exhibited more instances of speeding up than slowing down across all reward conditions.

      • Participants demonstrated a higher frequency of speeding up following ∆R− compared to ∆R+, and this asymmetry persisted throughout the practice sessions (greater proportion of -1 events than +1 events). 53% of events were speed-up events in the p(∆T|∆R+) distribution for the first bin of practices, and 55% for the last bin. Regarding p(∆T|∆R−), there were 63% speed-up events throughout each bin of practices, with this proportion exhibiting no change over time.

      • Accordingly, the asymmetry of reward changes on behavioural adaptations, as revealed by this analysis, remained consistent across the practice bins.
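
      The sign-based control analysis summarized in the points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the handling of zero changes and the example values are hypothetical.

```python
import math

def speedup_proportions(delta_t, delta_r):
    """Proportion of speed-up events (delta_t < 0) after reward drops and rises.

    Hypothetical helper: `delta_t` and `delta_r` are aligned series of
    behavioural changes and the reward changes preceding them.
    """
    counts = {"drop": [0, 0], "rise": [0, 0]}  # [speed-ups, total events]
    for dt, dr in zip(delta_t, delta_r):
        # Skip undefined pairs (NaN) and zero changes, which carry no sign.
        if math.isnan(dt) or math.isnan(dr) or dt == 0 or dr == 0:
            continue
        key = "drop" if dr < 0 else "rise"
        counts[key][1] += 1
        if dt < 0:
            counts[key][0] += 1
    # Only the sign of each change enters the proportion, not its magnitude.
    return {k: (s / n if n else float("nan")) for k, (s, n) in counts.items()}

# Illustrative values only, not study data.
props = speedup_proportions(
    [-1.0, -2.0, 1.0, -1.0, 2.0, float("nan")],
    [-5.0, -3.0, -1.0, 4.0, 2.0, 1.0],
)
```

      Applying the function separately to each practice bin would give the per-bin proportions for ∆R− and ∆R+ that the response compares.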

      Thus, these preliminary findings provide an initial response to referee 2 and offer valuable insights into the asymmetrical effects of positive/negative reward changes on behavioural adaptations. We plan to include these results in the revised manuscript, as well as the full control analysis suggested by the referee. We will further expand upon their interpretation and implications.

      1. Another way to support the claim of reward change directionality effects on performance (rather than performance on performance), at least to some extent, would be to analyze the data from the last 10 days of the training, during which no rewards were given (pretending for analysis purposes that the reward was calculated and presented to participants). If the effect persists, it is less likely that the effect in question can be attributed to the reward dynamics.

      Response: The reviewer’s concern is addressed in the previous question. Also, this analysis would not be possible because our Gaussian fit analyses use the time series of continuous reward scores, in which ∆R− or ∆R+ are embedded. These events cannot be analyzed once reward feedback is removed because we do not have behavioral events following ∆R− or ∆R+ anymore.

      Done

      1. This concern is also relevant and should be considered with respect to the sensitivity of IKI consistency (C) to reward. While the relationship between previous reward/performance and future performance in terms of C is of a different structure, the similar potential confounding effects could still be present.

      Response: We will conduct this analysis for the revised manuscript, similarly to the control analysis suggested by referee 2 on MT. Our preliminary control analysis, as explained above, suggests that the fundamental asymmetry in the effect of ∆R− and ∆R+ on behavioral changes persists when excluding the impact of reference performance values in our Gaussian fit analysis.

      Done. See updated Figure 6. The results are very similar once we normalize the IKI consistency index C with the IKI of the baseline performance at trial n.

      1. Another related question (which is also of general interest) is whether the preferred app sequence (as indicated by the participants for Phase B) was consistently the one that yielded more reward? Was the continuous sequence the preferred one? This might tell something about the effectiveness of the reward in the task.

      Response: We have now conducted this analysis. There is in fact no evidence to conclude that the continuously rewarded sequence was the preferred one. The result shows that 54.5% of HV and 29% of the OCD sample considered the continuous sequence to be their preferred one, a difference that was not statistically significant. Note that this preference may not necessarily be linked simply to programmed reward. The overall preference may be influenced by many other factors, such as, for example, the aesthetic appeal of particular combinations of finger movements.

      Regarding both experiments 2 and 3:

      1. The change in context in experiments 2 and 3 is substantial and includes many different components. These changes should be mentioned in more detail in the Results section before describing the results of experiments 2 and 3.

      Response: Following the referee’s advice, we will move these details (currently written in the Methods section) to the Results section, when we introduce Phase B and before describing the results of experiments 2 and 3.

      Done in page 21

      Experiment 2:

      1. In Experiment 2, the authors sometimes refer to the "explicit preference task" as testing for habitual and goal-seeking sequences. However, I do not think there is any justification for interpreting it as such. The other framings used by the authors - testing whether trained action sequences gain intrinsic/rewarding properties or value, and preference for familiar versus novel action sequences - are more suitable and justified. In support of the point I raised here, assigning intrinsic rewarding properties to the learned sequences and thereby preferring these sequences can be conceptually aligned with goal-directed behavior just as much as it could be with habit.

      Response: We clearly defined the theoretical framing of experiment 2 as a test of whether trained action sequences gain intrinsic value and we are pleased to hear that the referee agrees with this framing. If the referee is referring to the paragraph below (in the Discussion), we actually do acknowledge within this paragraph that a preference for the trained sequences can either be conceptually aligned with a habit OR a goal-directed behavior.

      “On the other hand, we are describing here two potential sources of evidence in favor of enhanced habit formation in OCD. First, OCD patients show a bias towards the previously trained, apparently disadvantageous, action sequences. In terms of the discussion above, this could possibly be reinterpreted as a narrowing of goals in OCD (Robbins et al., 2019) underlying compulsive behavior, in favor of its intrinsic outcomes”

      This narrowing of goals model of OCD refers to a hypothetically transitional stage of compulsion development driven by behavior having an abnormally strong, goal-directed nature, typically linked to specific values and concerns.

      If the referee is referring to the penultimate sentence of the hypothesis section, this has been amended in response to Q5. We cannot find any other possible instances in this manuscript stating that experiment 2 is a test of habitual or goal-directed behavior.

      Experiment 3:

      1. Similar to Experiment 2, I find the framing of arbitration between goal-directed/habitual behavior in Experiment 3 inadequate and unjustified. The results of the experiment suggest that participants were primarily goal-directed and there is no evidence to support the idea that this reevaluation led participants to switch from habitual to goal-directed behavior.

      Also, given the explicit choice of the sequence to perform participants had to make prior to performing it, it is reasonable to assume that this experiment mainly tested bias towards familiar sequence/stimulus and/or towards intrinsic reward associated with the sequence in value-based decision making.

      Response: This comment is aligned with (and follows) the referee’s criticism of experiment 1 not achieving automatic and habitual actions. We have addressed this matter above, in response 1 to Referee 2.

      Mobile-app performance effect on symptomatology: exploratory analyses:

      1. Maybe it would be worth testing if the patients with improved symptomatology (that contribute some of their symptom improvement to the app) also chose to play more during the training stage.

      Response: We have conducted an analysis to address this relevant question. There is no correlation between the YBOCS score change and the total number of practices, meaning that the patients whose symptomatology improved post-training did not necessarily choose to use the app more during the training stage (rs = 0.25, p = 0.15). Additionally, we have statistically compared the improvers (patients with reduced YBOCS scores post-training) and the non-improvers (patients with unchanged or increased YBOCS scores post-training) in their number of completed app practices during the training phase and no differences were observed (U = 169, p = 0.19).

      The result from the correlational analysis has been added to the revised manuscript (page 28).
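
      For readers who wish to reproduce this kind of rank correlation, the following is a minimal sketch of Spearman's rs computed as the Pearson correlation of average ranks. It is an illustration only, not the analysis code used for the manuscript; the function names and the example values are hypothetical, and the p-value calculation is omitted.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rs is undefined when one input is constant")
    return cov / (sx * sy)


# Hypothetical values, for illustration only (not study data).
ybocs_change = [-4, 0, -2, 1, -6, 3]
total_practices = [50, 42, 61, 38, 47, 55]
rs = spearman_rs(ybocs_change, total_practices)
```

      Ties are handled with average ranks, which is the usual convention for this statistic.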

      Discussion:

      1. Based on my earlier comments highlighting the inadequacy and mis-framing of the work in terms of habit and goal-directed behavior, I suggest that the discussion section be substantially revised to reflect these concerns.

      Response: We do not agree that the work is either "inadequate or mis-framed" and will not therefore be substantially revising the Discussion. We will however clarify further the interpretation we have made and make explicit the alternative viewpoint of the referee. For example, we will retitle experiment 3 as “Re-evaluation of the learned action sequence: possible test of goal/habit arbitration” to acknowledge the referee’s viewpoint as well as our own interpretation.

      Done

      1. In the sentence "Nevertheless, OCD patients disadvantageously preferred the previously trained/familiar action sequence under certain conditions" the term "disadvantageously" is not necessarily accurate. While there was potentially more effort required, considering the possible presence of intrinsic reward and chunking, this preference may not necessarily be disadvantageous. Therefore, a more cautious and accurate phrasing that better reflects the associated results would be useful.

      Response: We recognize that the term "disadvantageously" may be semantically ambiguous for some readers and therefore we will remove it.

      Done

      Materials and Methods:

      1. The authors mention: "The novel sequence (in condition 3) was a 6-move sequence of similar complexity and difficulty as the app sequences, but only learned on the day, before starting this task (therefore, not overtrained)." - for the sake of completeness, more details on the pre-training done on that day would be useful.

      Response: Details of the learning procedure of the novel sequence (in condition 3, experiment 3) will be provided in the methods of the revised version of the manuscript.

      Done in page 40

      Minor comments:

      1. In the section discussing the sensitivity of sequence duration to reward, the authors state that they only analyzed continuous reward trials because "a larger number of trials in each subsample were available to fit the Gaussian distributions, due to feedback being provided on all trials." However, feedback was also provided on all trials in the variable reward condition, even though the reward was not necessarily aligned with participants' performance. Therefore, it may be beneficial to rephrase this statement for clarity.

      Response: We will follow this referee’s advice and will rephrase the sentence for clarity.

      Done. See page 16.

      1. With regard to experiment 2 (Preference for familiar versus novel action sequences) in the following statement "A positive correlation between COHS and the app sequence choice (Pearson r = 0.36, p = 0.005) further showed that those participants with greater habitual tendencies had a greater propensity to prefer the trained app sequence under this condition." I find the use of the word "further" here potentially misleading.

      Response: The word "further" will be removed.

      Done

      Reviewer #1 (Recommendations For The Authors):

      This is a very interesting manuscript, which was a pleasure to review. I have some minor comments you may wish to consider.

      1. I believe that it is possible to include videos as elements in eLife articles - please consider if you can do this to demonstrate the action sequence on the smartphone. I followed the YouTube video, and it was very helpful to see exactly what participants did, but it would be better to attach the video directly, if possible.

      Response: This is a great idea and we will definitely attach our video demonstrating the task to the revised manuscript (Version of Record) if the eLife editors allow.

      We ask the editor for permission to add the video.

      1. The abstract states that the study uses a "novel smartphone app" but is the same one as described in Banca et al. Suggest writing simply "smartphone app".

      Response: We will remove the word novel.

      Done

      1. Some of the hypotheses described in the second half of the Hypothesis section could be stated more explicitly. For example: "We also hypothesized that the acquisition of learning and automaticity would differ between the two action sequences based on their associated rewarded schedule (continuous versus variable) and reward valence (positive or negative)." The subsequent sentence explains the prediction for the schedule but what is the hypothesized direction for reward valence? More detail is subsequently given on p. 14, Results, but it would be better to bring these details up to the Introduction. "We additionally examined differential effects of positive and negative feedback changes on performance to build on previous work demonstrating enhanced sensitivity to negative feedback in patients with OCD (Apergis-Schoute et al 2023, Becker et al., 2014; Kanen et al., 2019)." In general, the second part of the Hypothesis section is a bit dense, sometimes with two predictions per sentence. It could be useful for the reader if hypotheses were enumerated and/or if a distinction was made among the hypotheses with respect to their importance.

      We fully revised the hypothesis section, on page 5, following this reviewer’s suggestion. We think this section is much clearer now, in our revised manuscript.

      Response: Thank you for pointing out the need for clarity in our hypothesis section. This is a very important point and we will carefully rewrite our hypothesis in the revised manuscript to make them as clear as possible.

      1. Did medication status correlate with symptom severity in the OCD group (e.g., higher symptoms for the 6 participants on SSRI+antipsychotics?). Could this, or SSRI-only status, have impacted results in any way? I appreciate that there is no way to test medication status statistically but readers may be interested in your thoughts on this aspect.

      Response: We have now conducted an exploratory analysis to assess the potential effect of medication on the following output measures: app engagement (as measured by completed practices), explicit preference and YBOCS change post-training. The patients who were on combined therapy (SSRIs + antipsychotic) did not perform significantly differently on these measures compared to the remaining patients, and no other effects of interest were observed. Their symptomatology was indeed slightly more severe, but the difference was not statistically significant [Y-BOCS combined = 26.2 (6.5); Y-BOCS SSRI only = 23.8 (6.1); Y-BOCS No Med = 23.8 (2.2), mean(std)]. Only one patient showed symptom improvement after the app training, another became worse, and the remaining patients on combined therapy remained stable during the month.

      Palminteri et al (2011) found that unmedicated OCD patients exhibited instrumental learning deficits, which were fully alleviated with SSRI treatment. Therefore, it is possible that the SSRI medication (present in our sample) may have reduced habit formation and facilitated behavioral arbitration. However, since the effect goes against the habit hypothesis, it is unlikely that it has confounded our measure of automaticity. If anything, medication rendered experiments 2 and 3 more goal-oriented. We agree that further studies are warranted to address the effect of SSRIs on these measures.

      1. You could explain earlier why devaluation could not be tested here (it is only explained in the Limitations section near the end)

      Response: The revised manuscript will be amended to account for this note.

      Done in page 25.

      1. Capitalize 'makey-makey', I didn't realize there was a product called Makey Makey until I Googled it.

      Response: Sure. We will capitalize 'Makey-Makey'. Thank you for pointing this out!

      Done

      Reviewer #2 (Recommendations For The Authors):

      Recommendations for the authors (ordered by the paper sections):

      In the introduction

      1. regarding this part "We used a period of 1-month's training to enable effective consolidation, required for habitual action control or skill retention to occur. This acknowledged previous studies showing that practice alone is insufficient for habit development as it also requires off-line consolidation computations, through longer periods of time (de Wit et al., 2018) and sleep (Nusbaum et al., 2018; Walker et al., 2003)." I advise the authors to re-check whether what is attributed here to de Wit et al. (2018) is indeed justified (if I remember correctly they have not mentioned anything about off-line consolidation computations).

      Response: When we revise the manuscript, we will remove the de Wit et al. (2018) citation from this sentence.

      Done

      in the Outline paragraph

      1. it stated: "We continuously collected data online, in real time, thus enabling measurements of procedural learning as well as automaticity development." I think this wording implies that the fact that the data was collected online in real time was advantageous in that it enabled to assess measurements of procedural learning and automaticity development, which in my understanding is not the case.

      Response: To make this sentence clearer, we will change it to the following: ‘We continuously collected data online, to monitor engagement and performance in real time and to enable acquisition of sufficient data to analyze, à posteriori, procedural learning and automaticity development’.

      Done in page 4: ‘We collected data online continuously to monitor engagement and performance in real-time. This approach ensured we acquired sufficient data for subsequent analysis of procedural learning and automaticity development’.

      1. In the final sentence of this paragraph "or and" should be changed to "or/end".

      Response: This was a typo. The word ‘and’ will be removed.

      Done

      1. In Figure 1c - Note that in the figure legend it says "Each sequence comprises 3 single press moves, 2 two-finger moves..." whereas in the example shown in the figure it's the other way around (2 single press moves and 3 two-finger moves).

      Response: Thank you so much for spotting this! The example shown in the figure is incorrect. We apologize for the mistake. It should depict 3 single press moves, 2 two-finger moves and 1 three-finger move. The figure will be amended.

      Done

      In the results section:

      1. Regarding the "were followed by a positive ring tone and the unsuccessful ones by a negative ring tone", I suggest mentioning that there was also a positive visual (rewarding) effect.

      Response: Thank you. A mention of the visual effect will be added for both the positive (successful) and negative (unsuccessful) trials. Done in page 7

      1. p 10. - Note a typo in the following sentence where the word "which" appears twice consecutively:

      "Furthermore, both groups exhibited similar motor durations at asymptote which, which combined with the previous conclusion, indicates that OCD patients improved their motor learning more than controls, but to the same asymptote."

      Response: Thank you for spotting this typo. The second word will be removed. Done

      1. I have a few suggestions with respect to Figure 3:

      2. keeping the y-axes scale similar in all subplots would be more visually informative.

      Here we kept the y-axes scale similar in all subplots, except one of them, which was important to keep to capture all the data.

      1. For the subplots in 3b I would recommend for the transparent regions, instead of the IQR, to use the median +/- 1.57 * IQR/sqrt(n) which is equivalent to how the notches are calculated in a box-plot figure (It is referred to as an approximate 95% confidence interval for the median). This should make the transparent area narrower and thus better communicate the results.

      Done
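
      The interval suggested by the reviewer can be sketched as follows. This is a minimal illustration of the formula median +/- 1.57 * IQR/sqrt(n), not the plotting code used for Figure 3b; the function name and the example values are hypothetical, and the quartile convention (linear interpolation, "inclusive" method) is an assumption, since other conventions give slightly different IQRs.

```python
import math
import statistics


def median_notch_interval(values):
    """Return (lower, median, upper) for median +/- 1.57 * IQR / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    # Quartiles by linear interpolation; the middle cut point is the median.
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / math.sqrt(n)
    return med - half_width, med, med + half_width


# Hypothetical movement times in seconds, for illustration only.
example = [2.1, 2.4, 2.2, 3.0, 2.8, 2.5, 2.6, 2.3, 2.9]
low, med, high = median_notch_interval(example)
```

      Because the half-width shrinks with sqrt(n), the shaded band is narrower than the raw IQR whenever n exceeds about 2.5 (that is, 1.57 squared), which is the effect the reviewer describes.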

      1. I think the significant levels mentioned in figure legend 3b (which are referring to the group effect measured for each reward schedule type separately) is not mentioned in the text. While not crucial, maybe consider adding it in the text.

      We don’t think this is necessary and may actually lead to confusion because in the text we report a Kruskal–Wallis H test (which is the most appropriate statistical test), including its H and p values for the group and reward effects. Since in the figure we separated the analysis and plots for variable and continuous reward schedules (for visual purposes), we reported a separate U test for each reward schedule. Therefore, we consider that the correct statistics are reported in the appropriate places of the manuscript.

      Response: Thank you for this very helpful suggestion. We will amend figure 3 accordingly.

      1. In the Automaticity results (pp. 12 and 13) when describing the Descriptive stats the wrong parameter indicators are used (DL instead of CL and nD instead of nC).

      Response: Thank you for noticing it. We will amend.

      Done

      1. In Sensitivity of IKI consistency (C) to reward results:

      In Figure 6a legend: with respect to "... and for reward increments (∆R+, purple) and decrements (∆R-, green)" - note that there are also additional colors indicating these ∆Rs.

      Response: Done. We had used a 2 x 2 color scheme: green hues for ∆R-, and purple hues for ∆R+. Then, OCD is denoted by dark colors, and HV by light colors. This represents all four colors used in the figure. For instance, OCD and ∆R- is dark green, whereas OCD and ∆R+ is denoted by dark purple.

      1. p.21 - the YBOCS abbreviation appears before the full form is spelled out in the text.

      Response: In the revised version, we will make sure the YBOCS abbreviation will be spelled out the first time it is mentioned.

      Done in page 24

      Experiments 2 and 3:

      1. If there is a reason behind presenting the conditions sequentially rather than using intermixed trials in experiments 2 and 3, it would be useful to mention it in the text.

      Response: Experiment 2 could have used intermixed trials. However, we were concerned that the use of intermixed trials in experiment 3 would increase excessively the memory load of the task, which could then be a confound.

      Done in page 41

      1. I wonder whether the presentation order of the conditions in experiments 2 and 3 affected participants' results? Maybe it is worth adding this factor to the analysis.

      Response: As we mentioned both in the methods and results sections, we counterbalanced all the conditions across participants, in both experiments 2 and 3. This procedure ensures no order effects.

      Experiment 2:

      1. Regarding this sentence (pp. 21-22): "However, some participants still preferred the app sequence, specifically those with greater habitual tendencies, including patients who considered the app training beneficial." I think the part that mentions that there are "patients who considered the app training beneficial" appears below and it may confuse the reader. I suggest either providing a brief explanation or indicating that further details will be provided later in the text ("see below in...").

      Response: We will clarify this section.

      We added “see below exploratory analyses of “Mobile-app performance effect on symptomatology”” in the end of the sentence so that the reader knows this is further explained below. Page 25

      1. Finally, in addition to subgrouping maybe it is worth testing whether there is a correlation between the YBOCS score change and the app-sequences preference (as to learn if the more they change their YBOCS the more they prefer the learned sequences and vice versa?)

      Response: Thank you for suggesting this relevant correlational analysis, which we have now conducted. Indeed, there is a correlation between the YBOCS score change and the preference for the app-sequences, meaning that the higher the symptom improvement after the month training, the greater the preference for the familiar/learned sequence. This is particularly the case for the experimental condition 2, when subjects are required to choose between the trained app sequence and any 3-move sequence (rs = 0.35, p=0.04). A trend was observed for the correlation between the YBOCS score change and the preference for the app-sequences in experimental condition 1 (app preferred sequence versus any 6-move sequence): rs = 0.30, p=0.09.

      This finding represents an additional corroboration of our conclusion that the app seems to be more beneficial to patients more prone to routine habits, who are somewhat more averse to novelty.

      This analysis was added in page 24, 25 and page 35.

      Experiment 3:

      1. You mention "The task was conducted in a new context, which has been shown to promote reengagement of the goal system (Bouton, 2021)." In my understanding this observation is true also for experiment 2. In that case it should be stated earlier (probably under: "Phase B: Tests of action-sequence preference and goal/habit arbitration").

      Response: As answered above (Q17), we will follow referee 2’s suggestion and describe the contextual details of experiments 2 and 3 in the Results section, when we introduce Phase B.

      Done in page 21.

      1. w.r.t this sentence - "...that sequence (Figure 8b, no group effects (p = 0.210 and BF = 0.742, anecdotal evidence)" I would add what the anecdotal evidence refers (as done in other parts of the paper), to prevent potential confusion.

      Response: OK, this will be added.

      Added on page 27

      Discussion:

      1. w.r.t. "Here we have trained a clinical population with moderately high baseline levels of stress and anxiety, with training sessions of a higher order of magnitude than in previous studies (de Wit et al., 2018, 2018; Gera et al., 2022) (30 days instead of 3 days)." The Gera et al. 2022 (was more than 3 days), you probably meant Gera et al. 2023 ("Characterizing habit learning in the human brain at the individual and group levels: a multi-modal MRI study", for which 3 days is true).

      Response: Thank you for pointing this out. We will keep the citation to Gera et al 2022 given its relevance to the sentence but we will remove the information inside the parenthesis. This amendment will solve the issue raised here.

      Done in page 32

      1. w.r.t "to a simple 2-element sequence with less training (Gera et al., 2022)" - it's a 3-element sequence in practice.

      Response: Thank you for this correction. We will amend this sentence accordingly.

      Done in page 32

      1. (p.30) w.r.t "and enhanced error-related negativity amplitudes in OCD" - a bit more context of what the negative amplitudes refer to would be useful (So the reader understands it refers to electrophysiology).

      Response: We will add a sentence in our revised manuscript addressing this matter. This sentence has been removed in the revised manuscript

      Supplementary materials:

      1. under "Sample size for the reward sensitivity analysis":

      It is stated "One practice corresponded to 20 correctly performed sequences. We therefore split the total number of correct sequences into four bins." I was not able to follow this reasoning here (20 correct trials in practice => splitting the data the 4 bins). More clarity here would be useful.

      Response: We will clarify this procedure of our analysis in the revised version of the manuscript. Thanks.

      Done. See Supplementary materials.

      1. Also, maybe I am missing something, but I couldn't understand why the number of sequences available per bin is different for the calculation of ∆MT and C. Aren't any two consecutive sequences that are good for the calculation of one of these measures also good for the calculation of the other?

      Response: Thank you for pointing this out. Indeed, the number of trials was the same for both analyses, ∆MT and C. We had saved an incorrect variable as number of trials. We will amend the text.

      We have re-analyzed the trial number data. The average number of trials per bin both for the ∆MT and C analyses was 109 (9) in the HV and 127 (12) in OCD groups. Although the number was on average larger in the patient group, we did not find significant differences between groups (p = 0.47).

      When assessing the p(∆T|∆R+) and p(∆T|∆R-) separately, more trials were available for p(∆T|∆R+), 107 (10), than for p(∆T|∆R-), 98 (8). These trial numbers differed significantly (p = 0.0046), but were identical for ∆MT and C analyses.

      Done. Included in Supplementary materials.

      Minor comments:

      1. Not crucial, but maybe for the sake of consistency consider merging the "Self-reported habit tendencies" section and the "Other self-reported symptoms" section, preferably where the latter is currently placed.

      Response: We fully understand the referee’s rationale underlying this suggestion. We indeed considered initially presenting the self-reported questionnaires all together, in a last, single section of the results, as suggested by the referee. However, we decided to report the higher habitual tendencies of OCD as an initial set of results, not only because it is a novel and important finding (which justifies it to be highlighted) but also because it is essential to the understanding of some of the remaining results presented.

      1. In some figure legends the percentage of the interval of the mentioned confidence intervals (probably 95%) is missing. I suggest adding it.

      Response: OK, this will be added.

      Done

      1. The NHS abbreviation appears without spelling out the full form.

      Response: This will be amended accordingly.

      I removed NHS as it is not relevant.

      1. In p.38 the citation (Rouder et al., 2012) is duplicated (appears twice consecutively).

      Response: Thank you for pointing this out. We will amend accordingly.

      Done

      In the results section:

      1. The authors mention: "To promote motivation, the total points achieved on each daily training sessions were also shown, so participants could see how well they improved across days". Yet, if the score is based on the number of practices, it may not represent participants improvement in case in some days more practices are performed. I suggest to clarify this point.

      Response: The goal of providing the scoring feedback was, as explained in the sentence, to gauge motivation and inform the subject about their performance. Having this goal in mind, it does not really matter if one day their scoring would be higher simply because they would have done more practice on that day. Participants could easily understand that the scoring reflected their performance on each practice so they would realize that the more practice, the greater their improvement and that the scoring would increase across days of practice. We will amend the sentence to the following: "To promote motivation, the total points achieved on each training session (i.e. practice) were also shown, so participants could see how well they improved across practice and across days".

      Done in page 7 and 8.

    1. Author Response

      Reviewer #3 (Public Review):

      [...] Weaknesses:

      The study produces a large amount of data that is in general cohesive and supports the main conclusions, but more thorough consideration of some of the findings may be helpful, as exemplified by the following:

      1) The effect of microglial ablation on chloral hydrate-induced RORR in Fig. 1B appears to differ from that of other anesthetics. What does this mean?

      2) Macrophage ablation impedes anesthesia emergence from pentobarbital (Fig. 3C). How may this occur?

      3) Examination of the potential effect of microglial depletion on dendritic spine density is interesting, but the experimental design does not seem to align well with the PPR and eEPSC data, which indicate a reduction in presynaptic release (Fig. 10E) and an increase of postsynaptic function (Fig. 10H), respectively. The PPR data seem to suggest a presynaptic effect of microglial ablation.

      This reviewer may have confused the brain regions used for our spine quantification (Figure 11) with those used for patch-clamp recording (Figure 10). In our spine quantification, all evaluations were conducted in the mPFC. However, the patch-clamp recordings were performed in the SON (Figure 10 B-F) and LC (Figure 10 G-K), which are different brain regions from those used for spine quantification. One of our conclusions is that microglia differentially modulate the activity of neuronal networks in a brain region-specific manner, so neurons in different brain regions may exhibit different electrophysiological alterations upon microglial depletion. Therefore, this comment may rest on a factual error.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This is an interesting, timely and informative article. The authors used publicly available data (made available by a funding agency) to examine some of the academic characteristics of the individual recipients of the National Institutes of Health (NIH) K99/R00 award program during the entire history of this funding mechanism (17 years, total ~4 billion US dollars (annual investment of ~230 million USD)). The analysis focuses on the pedigree and the NIH funding portfolio of the institutions hosting the K99 awardees as postdoctoral researchers and the institutions hiring these individuals. The authors also analyze the data by gender, by whether the R00 portion of the awards eventually gets activated and based on whether the awardees stayed/were hired as faculty at their K99 (postdoctoral) host institution or moved elsewhere. The authors further sought to examine the rates of funding for those in systematically marginalized groups by analyzing the patterns of receiving K99 awards and hiring K99 awardees at historically black colleges and universities.

      The goals and analysis are reasonable and the limitations of the data are described adequately. It is worth noting that some of the observed funding and hiring traits are in line with the Matthew effect in science (https://www.science.org/doi/10.1126/science.159.3810.56) and in science funding (https://www.pnas.org/doi/10.1073/pnas.1719557115). Overall, the article is a valuable addition to the research culture literature examining the academic funding and hiring traits in the United States. The findings can provide further insights for the leadership at funding and hiring institutions and science policy makers for individual and large-scale improvements that can benefit the scientific community.

      Thank you for these comments. We have incorporated the articles referenced on the Matthew effect into the first paragraph of the Discussion of our revised preprint.

      Reviewer #2 (Public Review):

      Early career funding success has an immense impact on later funding success and faculty persistence, as evidenced by well-documented "rich-get-richer" or "Matthew effect" phenomena in science (e.g., Bol et al. 2018, PNAS). Woitowich et al. examined publicly available data on the distribution of the National Institutes of Health's K99/R00 awards - an early career postdoc-to-faculty transition funding mechanism - and showed that although 85% of K99 awardees successfully transitioned into faculty, disparities in subsequent R01 grant obtainment emerged along three characteristics: researcher mobility, gender, and institution. Men who moved to a top-25 NIH funded institution in their postdoc-to-faculty transition experienced the shortest median time to receiving a R01 award, 4.6 years, in contrast to the median 7.4 years for women working at less well-funded schools who remained at their postdoc institutions. This result is consistent with prior evidence of funding disparities by gender and institution type. The finding that researcher mobility has the largest effect on subsequent funding success is key and novel, and enhances previous work showing the relationship between mobility and ones' access to resources, collaborators, or research objects (e.g., Sugimoto and Larivière, 2023, Equity for Women in Science (Harvard University Press)).

      These results empirically demonstrate that even after receiving a prestigious early career grant, researchers with less mobility belonging to disadvantaged groups at less-resourced institutions continue to experience barriers that delay them from receiving their next major grant. This result has important policy implications aimed at reducing funding disparities - mainly that interventions that focus solely on early career or early stage investigator funding alone will not achieve the desired outcome of improving faculty diversity.

      The authors also highlight two incredible facts: No postdoc at a historically Black college or university (HBCU) has been awarded a K99 since the program's launch. And out of all 2,847 R00 awards given thus far, only two have been made to faculty at HBCUs. Given the track record of HBCUs for improving diversity in STEM contexts, this distribution of awards is a massive oversight that demands attention.

      At no fault of the authors, the analysis is limited to only examining K99 awardees and not those who applied but did not receive the award. This limitation is solely due to the lack of data made publicly available by the NIH. If this data were available, this study would have been able to compare the trajectory of winners versus losers and therefore could potentially quantify the impact of the award itself on later funding success, much like the landmark Bol et al. (2018) paper that followed the careers of winners of an early career grant scheme in the Netherlands. Such an analysis would also provide new insights that would inform policy.

      Although data on applications versus awards for the K99/R00 mechanism are limited, there exists data for applicant race and ethnicity for the 2007-2017 period, which were made available by a Freedom of Information Act request through the now defunct Rescuing Biomedical Research Initiative: https://web.archive.org/web/20180723171128/http://rescuingbiomedicalresearch.org/blog/examining-distribution-k99r00-awards-race/. These results are not presently discussed in the paper, but are highly relevant given the discussion of K99 award impacts on the sociodemographic composition of U.S. biomedical faculty. From 2007 to 2017, the K99 award rate for white applicants was 31.0% compared to 26.7% for Asian applicants and 16.2% for Black applicants. In terms of award totals, these funding rates amount to 1,384 awards to white applicants, 610 to Asian applicants, and 25 to Black applicants for the entire 2007-2017 period. And in terms of R00 awards, or successful faculty transitions: whereas 77.0% of white K99 awardees received an R00 award, the conversion rate for Asian and Black K99 awardees was lower, at 76.1% and 60.0%, respectively. Regarding this K99-to-R00 transition rate, Woitowich et al. found no difference by gender (Table 2). These results are consistent with a growing body of literature that shows that while there have been improvements to equity in funding outcomes by gender, similar improvements for achieving racial equity are lagging.

      The conclusions are well-supported by the data, and limitations of the data and the name-gender matching algorithm are described satisfactorily.

      One aspect that the authors should expand or comment on is the change in the rate of K99 to R00 conversions. Since 2016, while the absolute number of K99 and R00 awards has been increasing, the percentage of R00 conversions appears to be decreasing, especially in 2020 and 2021. This observation is not clearly stated or shown in Figure 1 but is an important point: if the effectiveness of the K99/R00 mechanism for postdoc-to-faculty transitions has been decreasing lately, then something is undermining the purpose of this mechanism. This result bears emphasis and, potentially, a discussion of possible reasons why this is happening.

      Thank you for these insightful comments. We now calculate a rolling conversion rate for K99 to R00 awards which shows there is not as much of a decline in conversion from K99 to R00 (Fig 1B). We still see a slight decline in 2021 and 2022. 468 K99 awards are from 2020 or later so they may still convert to the R00 phase. Thus it is difficult to draw conclusions about 2021/2022 yet. As more time passes, we may better be able to determine whether or not significant alteration from normal occurred in these years, presumably due to pressures from the Covid-19 pandemic. We also thank you for providing the details of the FOIA request. We have included a discussion of these data in the discussion.

      Reviewer #3 (Public Review):

      The researchers aim to add to the literature on faculty career pathways with particular attention to how gender disparities persist in the career and funding opportunities of researchers. The researchers also examine aspects of institutional prestige that can further amplify funding and career disparities. While some factors about individuals' pathways to faculty lines are known, including the prospects of certain K award recipients, the current study provides the only known examination of the K99/R00 awardees and their pathways.

      Strengths:

      The authors establish a clear overview of the institutional locations of K99 and R00 awardees and the pathways for K99-to-R00 researchers and the gendered and institutional patterns of such pathways. For example, there's a clear institutional hierarchy of hiring for K99/R00 researchers that echoes previous research on the rigid faculty hiring networks across fields, and a pivotal difference in the time between awards that can impact faculty careers. Moreover, there are regional clusters of hiring in certain parts of the US where multiple research universities are located. In addition, documenting the pathways of HBCU faculty is an important extension of the Wapman et al. study (among others from that research group), and provides a more nuanced look at the pathways of faculty beyond the oft-discussed high status institutions. (However, there is a need for more refinement in this segment of the analyses as discussed further below.) Also, the authors provide important caveats throughout the manuscript about the study's findings that show careful attention to the complexity of these patterns and an attempt to limit misinterpretations by readers.

      Weaknesses:

      The authors reference institutional prestige in relation to some of the findings, but there's no specific measure of institutional prestige included in the analyses. If being identified as a top 25 NIH-funded institution is the proximate measure for prestige in the study, then more justification of how that relates to previous studies' measures of institutional prestige and status is needed to further clarify the interpretations offered in the manuscript.

      The identification of institutional funding disparities impacting HBCUs is an important finding and highlights another aspect of how faculty at these institutions are under resourced and arguably undervalued in their research contributions. However, a lingering question exists: why compare HBCUs with Harvard? What are the theoretical and/or methodological justifications for such comparisons? This comparison lends itself to reifying the status hierarchy of institutions that perpetuate funding and career inequalities at the heart of the current manuscript. If aggregating all HBCU faculty together, then a comparable grouping for comparison is needed, not just one institution. Perhaps looking at the top 25 NIH funded institutions could be one way of providing a clearer comparison. Related to this point is the confusing inclusion of Gallaudet in Figure 6 as it is not an officially identified HBCU. Was this institution also included in the HBCU-related calculations?

      Thank you for this comment. We agree this comparison perpetuates the perception of the prestige hierarchy and is problematic. We now compare all institutions in the top 25 NIH funding category to all HBCUs. Thank you also for identifying our error in mis-coding Gallaudet as an HBCU. We have corrected this in the current version.

      There is a clear connection that is missed in the current iteration of the manuscript, derived from the work of Robert Merton and others about cumulative advantages in science and the "Matthew effect." While aspects of this connection are noted in the manuscript, such as well-resourced institutions (those with the most NIH funding in this circumstance) hiring each other's K99/R00 awardees, elaborating on these connections is important for readers to understand the central processes by which a rigid hierarchy of funding and career opportunities exists around these pathways. The work the authors build on from Daniel Larremore, Aaron Clauset, and their colleagues has also incorporated these important theoretical connections from the sociology of knowledge and science, and it would provide a more interdisciplinary lens and further depth to understanding the faculty career inequalities documented in the current study.

      Reviewer #1 (Recommendations For The Authors):

      Comments to authors:

      1. For the benefit of the general reader, it would be informative to mention the amount of annual NIH investment in the K99 funding mechanism in the text (230 awards representing a ~ 230 million US dollars investment).

      Thank you for this suggestion. We have added that this is ~$25 million investment annually.

      1. It is worth noting that some of the observed funding and hiring traits resemble the Matthew effect, discussed in: The Matthew effect in science: https://www.science.org/doi/10.1126/science.159.3810.56

      The Matthew effect in science funding: https://www.pnas.org/doi/10.1073/pnas.1719557115

      It would be of value to cite these for further context for the readers.

      Thank you for this suggestion. We have included these references and briefly discussed the Matthew effect in the first paragraph of the Discussion.

      1. Figs 3, 6 and Fig S1 are hard to read without zooming in due to their format and don't work great within a letter size page but can work if they are also linked to a zoomable web version. It would make sense to have an online navigable/searchable/selectable version. But when the reader zooms out, there are patterns that reflect what points the authors are making (though those could be illustrated differently). These figures are really made for online webapp visualization (such as Shiny in R).

      We agree with this comment and have used the “googleVis” package in R to put together interactive Sankey diagrams. These can be found at: https://dantyrr.github.io/K99-R00-analysis/ and they are referenced in the manuscript.

      1. The abstract states 85% of awardees get R00 awards. That appears to come from 198/234 (page 6) though it's not explicitly stated, and other ratios give different answers (e.g., 1-304/3475 = 91%) but the 85% seems to be the right one. That first paragraph of the results could be clearer. Also, in the middle of page three the number given is 90% so something is inconsistent. For Figure 1A, given the methodology it should be possible to calculate a rolling conversion rate as "R00(t) / K99(t-1)" (and a similarly-calculated cumulative rate).

      Thank you for catching these errors. These were introduced because there are R00 awardees that did not have extramural K99 awards. These are intramural NIH K99 awardees but there is no public data on these awardees. The correct number is 78% of K99 awardees that transitioned to the R00 phase. We have also calculated the rolling conversion rate which is 89% if you exclude the first 2 years of the program (when the first awardees were within the 2-yr K99 period) and final 2 years (when most recent K99 awardees were still within their first 2 years of the K99 period).
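
      The reviewer's suggested "R00(t) / K99(t-1)" calculation, and the similarly calculated cumulative rate, can be sketched as follows. This is a minimal illustration in Python, not the authors' analysis code: the function names are hypothetical, the yearly counts are invented placeholders rather than NIH data, and the one-year lag is simply taken from the reviewer's formula (a two-year lag may match the K99 phase more closely).

```python
def rolling_conversion_rate(k99_by_year, r00_by_year, lag=1):
    """Per-year rate: R00 awards in year t divided by K99 awards in year t - lag."""
    rates = {}
    for year, r00 in sorted(r00_by_year.items()):
        k99 = k99_by_year.get(year - lag)
        if k99:  # skip years with no (or zero) matching K99 cohort
            rates[year] = r00 / k99
    return rates


def cumulative_conversion_rate(k99_by_year, r00_by_year, lag=1):
    """Cumulative rate: all R00 awards up to year t over all K99 awards up to t - lag."""
    rates = {}
    total_r00 = 0
    for year, r00 in sorted(r00_by_year.items()):
        total_r00 += r00
        total_k99 = sum(n for y, n in k99_by_year.items() if y <= year - lag)
        if total_k99:
            rates[year] = total_r00 / total_k99
    return rates


# Invented placeholder counts, for illustration only.
k99_counts = {2007: 100, 2008: 200}
r00_counts = {2008: 80, 2009: 150}

rolling = rolling_conversion_rate(k99_counts, r00_counts)
cumulative = cumulative_conversion_rate(k99_counts, r00_counts)
```

      With a lagged ratio of this kind, the first and last cohorts are the least reliable, which is consistent with the authors' choice to exclude the first and final two years of the program when reporting the rolling rate.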

      1. Assuming that 85% is the correct number, is there any information/insight into why ~1/6 of awardees do not continue to R00? This seems high given that only two years pass; that is a lot of awardees not getting R00 positions.

      We are unsure of why these don’t convert. In the revised version of the manuscript, we speculate on this in the 4th paragraph of the discussion:

      The factors that prevented the other 302 K99 awardees from 2019 and earlier unable to convert their K99-R00 grants is cause for concern within our greater academic community. Possible explanations include leaving the biomedical workforce, accepting tenure-track positions or other positions abroad, or by simply not successfully securing a tenable tenure-track offer.

      1. It looks like perhaps a non-zero number of K99s are just one year and not two (e.g., see 2006 in Fig 1A, which should not appear if all 2006 awards were 2 years). What is the typical percentage of K99s not activated for a second year, and is this a sizable % of the 15% not converting to R00?

      This is an interesting question. We didn’t originally look into this. The dataset that we originally downloaded from NIH RePORTER included a significant number of duplicates for the grants because year 1 of the K99 was listed on its own line and year 2 was listed on a different line. The first step in curating the data was to delete the duplicate values so we only had one entry per person. Unfortunately, depending on how the data tables were sorted, year 1 sometimes appeared above year 2 and at other times year 2 appeared above year 1. Because none of the data we were interested in are benchmarked to the K99 start date, we removed the duplicate values non-specifically. With the dataset we currently have, we would not be able to tell which individuals dropped out (didn’t convert to R00) during the first or second year of the K99. In order to do this, we would have to download the raw data from NIH RePORTER again and curate it again. We may do this in the future, but for the purpose of publishing the current manuscript we prefer to focus our efforts on other aspects of the revision.
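
      One way the re-curation described above could preserve the information the reviewer asks about is to collapse the per-year rows by awardee while recording the first year and the number of funded years, so the result does not depend on row order. The following is a hedged Python sketch only: the field names "awardee" and "fiscal_year" are hypothetical and are not the actual NIH RePORTER export columns, and this is not the authors' curation procedure.

```python
def collapse_award_years(rows):
    """Collapse one-row-per-budget-year records into one summary per awardee.

    Each row is a dict with hypothetical keys "awardee" and "fiscal_year".
    The summary keeps the earliest year and the number of distinct funded
    years, regardless of the order in which the rows appear.
    """
    years_by_awardee = {}
    for row in rows:
        years_by_awardee.setdefault(row["awardee"], set()).add(row["fiscal_year"])
    return {
        awardee: {"first_year": min(years), "n_years": len(years)}
        for awardee, years in years_by_awardee.items()
    }


# Invented example rows; year 2 deliberately listed before year 1 for awardee A.
example_rows = [
    {"awardee": "A", "fiscal_year": 2008},
    {"awardee": "A", "fiscal_year": 2007},
    {"awardee": "B", "fiscal_year": 2010},
]
summary = collapse_award_years(example_rows)
one_year_only = [name for name, s in summary.items() if s["n_years"] == 1]
```

      In this toy input, awardee B is flagged as having a single K99 year, which is the group the reviewer's question concerns.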

      1. Further down page 3, the authors state that "men typically experience 2-3% greater funding success rates" is ambiguous, as rates are themselves a percentage. So, is it 2-3% greater as in 23% vs 20%, or is it 2-3% greater as in 20.6% vs 20%? Please clarify the language.

      Thank you for asking for this clarification. We have updated the text here to reflect that we mean “23% vs 20%”.
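
      The ambiguity the reviewer points out is the difference between a gap in percentage points and a relative percentage difference. A small Python sketch using the reviewer's own example figures makes the two readings explicit; the function name is hypothetical and the rates are the reviewer's illustrative values, not results from the study.

```python
def funding_rate_gap(rate_a, rate_b):
    """Return (percentage-point gap, relative gap in percent) between two rates.

    Rates are given as fractions, e.g. 0.23 for 23%.
    """
    point_gap = (rate_a - rate_b) * 100
    relative_gap = (rate_a - rate_b) / rate_b * 100
    return point_gap, relative_gap


# Reading 1: 23% vs 20% is a gap of 3 percentage points (15% in relative terms).
points_1, relative_1 = funding_rate_gap(0.23, 0.20)

# Reading 2: 20.6% vs 20% is a gap of 0.6 percentage points (3% in relative terms).
points_2, relative_2 = funding_rate_gap(0.206, 0.20)
```

      The authors' clarification corresponds to the first reading, that is, a difference of 2-3 percentage points.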

      1. Metrics such as time to first R01 are compared internally within the study set, which yields interesting insights, but more could be done to benchmark these metrics to non-K99 scientists.

      We agree with the reviewer that this would be ideal; however, we feel that it is out of the scope of this manuscript. We may examine this in the future.

      1. In the text, several times percentages are being referred to when the figures cited do not show percentages. For example (page 6) 'proportion of awardees that stayed at the same institution declined to about 20% where it has remained consistent (Fig 1B)' - Figure 1B does not show percentages, instead the reader would need to work out from the raw numbers what the pattern of percentages might look like. It's fine (great even) to provide the raw numbers, but would be great to show the percentages as well. This happened for multiple graphs.

      Thank you for this comment. We agree that showing the percentage would be beneficial so we have included the percentages in Figure 1 for the conversion rate. We also added a standalone figure panel for the rolling conversion rate for Figure 1. For Figure 4, we have also included a right Y-axis to better indicate the % women.

      1. Figure 4 - putting the %women on a 0-250 scale makes it difficult to see the changes in that curve. Please replot it as a separate graph with an appropriate scale (30-50%? 30-70%?)

      Thank you for this comment. We have made this edit.

      1. Figure 5 - The table appears inconsistent - the Moved/Stayed HR is 1.411 suggesting that moving is better for reducing time to R01, but then Woman/Man is 1.208, so one of these pairs needs to be written in the opposite order to have the table make sense (intended to be listed as 'better/worse'?)

      Thank you for noticing this. In the revised manuscript we have re-run the Cox proportional hazards model using the R package “survival” and the function “coxph()”. There were minor differences in the hazard ratios using this package instead of GraphPad Prism; however, the R package is much more widely used than Prism for these types of analysis. We present the new data in the table in Figure 5B in the revised manuscript. We now present the “detrimental” Cox hazard ratio for each variable (i.e. 0.7095 for the mobility [moved/stayed]). We also underlined the variable which was detrimental to receiving an R01 award earlier.
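
      The ordering issue raised by the reviewer comes down to which group is the reference: in a Cox model, swapping the reference group inverts the hazard ratio (equivalently, it flips the sign of the log hazard ratio). A minimal Python sketch, using the two hazard ratios quoted in the reviewer's comment as inputs; the function name is hypothetical, and this is arithmetic on reported values rather than a refit of the authors' model.

```python
import math


def flip_reference_group(hazard_ratio):
    """Hazard ratio for B vs A given the hazard ratio for A vs B."""
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return 1.0 / hazard_ratio


# Values quoted by the reviewer from the original Figure 5 table.
hr_mobility = 1.411
hr_gender = 1.208

hr_mobility_flipped = flip_reference_group(hr_mobility)
hr_gender_flipped = flip_reference_group(hr_gender)

# On the log scale the flip is just a sign change.
log_hr_sum = math.log(hr_mobility) + math.log(hr_mobility_flipped)
```

      The reciprocal of 1.411 is about 0.709, close to the 0.7095 the authors report after re-running the model in R; the small difference is in line with their note that the two software packages gave slightly different estimates.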

      1. Figure 5's graph appears strange. All the lines have an appearance of stochasticity but are actually multiples of each other, rising exactly in sync. Are these actually modeled lines? If so, why not instead actually draw the lines based on the real data from the real groups depicted, and give the n for each group?

      Thank you for picking this up. The software we originally used to plot the graphs did plot modeled lines instead of the actual data. We have re-run the Cox proportional hazards model using the R “survival” package v3.5-5 and the coxph() and survfit() functions. The updated data are in Figure 5 of the revised manuscript.
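
      The distinction between modeled lines and curves drawn from the real data can be illustrated with a Kaplan-Meier estimate computed separately for each group, which is the kind of empirical curve the reviewer asks for. The sketch below is a plain-Python illustration with invented durations; it is not the authors' code and does not reproduce R's survfit(), and the function name is hypothetical.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: time to event or censoring for each subject.
    events: True if the event (e.g. first R01) was observed, False if censored.
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Invented example: four subjects, one censored at time 2.
km_curve = kaplan_meier([1, 2, 2, 3], [True, True, False, True])
group_n = 4  # the per-group n the reviewer asks to report alongside each curve
```

      Curves estimated this way for each group step down only where events actually occur in that group, so they are not constrained to rise or fall in sync the way lines generated from a single fitted model are.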

      1. Table 1 should note that each column sums to 100%.

      This is a good suggestion. In the revised manuscript, we have added a row to the table to indicate the column total N and %.

      1. The authors discuss how the K99/R00 grant reviewing process may have to change, but the K99 awards also impact the faculty hiring ecosystem. There are faculty hiring job ads explicitly requesting or indicating a preference for K99 holders, and the results described in this article show that K99 awarding is biased towards particular demographics at select wealthy institutions. Of course, collective/central action is almost always more effective/impactful (especially on a shorter timeline) than individual elective action. In other words, NIH changing granting patterns would likely work better than encouraging faculty searches to change the weight they give to K99s, because there are many searches and just one NIH. But these are not mutually exclusive, and individual action can still help when central action isn't taken (if the NIH does not change the K99/R00 grant review process for more inclusive funding and does not increase the number of annual K99 awards and hence the annual budget for this award mechanism). It would be good to have this discussed in the manuscript.

      Thank you for this comment and thoughtful insights. We have included additional discussion on this in the final paragraph of the discussion.

      Reviewer #2 (Recommendations For The Authors):

      Thank you for conducting this important work. On top of some thoughts I have described in the public review (in particular, Chris Pickett's FOIA data on K99/R00 outcomes by applicant race and ethnicity), I only have a few comments for potential improvements to this paper:

      1. The comparison of K99-R00 transition rates by gender was interesting. However, I missed the analysis on the K99-R00 transition rates by institution (by type or by top-25 NIH funded institution versus not). I think this analysis may be buried somewhere in the more nuanced descriptions about faculty flows from one institution type to another, but I was not able to locate it. I wonder if the authors could consider dedicating a subsection to specifically describing the transition rate by institution type, creating a table equivalent to Table 2. This section would probably fit best somewhere before the authors dive into the nuances of self-hires and faculty flows.

      Said another way: As I was reading, I felt I was missing an answer to a simple question - are there differences in conversion rates by institution type (however you define institution type, as an MSI or non MSI, or top-25 NIH funded versus not)?

      Thank you for this suggestion. We have created the tables (Table 3 and Table 4) in the revised manuscript. We also made a new figure (now Figure 5 in the revised manuscript). This was an interesting way to look at the data and it is very clear that the number of K99 and R00 awards is heavily concentrated within the institutions that have the highest NIH funding. We have added a paragraph in the results in a new section entitled “K99 and R00 awards are concentrated within the highest funded institutions”.

      1. Regarding the comparison of HBCUs and Harvard: this analysis was elucidating, but I am not sure if the framing of this analysis as pertaining to "systematically marginalized groups" - see second sentence in the section, "Faculty doctorates differ between Harvard and HBCUs" is appropriate. While it is true that proportionally more faculty at HBCUs are from marginalized groups, there are also many faculty at HBCUs who are from privileged or advantaged backgrounds (e.g., white, men, educated at elite institutions). It would be more accurate to rephrase the second sentence to say something along the lines of, "We sought to examine the rates of funding for those at historically under-funded institutions." I recommend that the authors comb the paper for any other potential places in the text that conflate systemic marginalization with institution type, and rephrase as needed for accuracy.

      Thank you for pointing this out. This is an extremely important point and we have removed any instances we could find where we conflate systemically marginalized groups with institution type.

      1. I strongly recommend Sugimoto and Larivière (2023)'s new book, Equity for Women in Science, which has an entire section dedicated to previous work investigating how researcher mobility impacts access to resources, collaborations, et cetera (Chapter 5 on Mobility; other chapters on Funding are also relevant but I home in on Mobility since this is such a key result of this work). I think this chapter would provide significant food-for-thought and background that could strengthen the Discussion section of the paper.

      Thank you for this suggestion. We have added some discussion of mobility in the first paragraph of the Discussion.

      1. I appreciated the subsection headings that described key results (e.g., "Institutions with the most NIH funding tend to hire K99/R00 awardees from other institutions with the most funding"; "K99/R00 awardee self-hires are more common at institutions with the top NIH funding.") This paper structure made it easier for me to ensure that I was getting the intended takeaway from a figure or section. But partway through the paper, the subheadings changed to being less declarative and therefore less informative (e.g., "Gender of K99/R00 awardees"; "Factors influencing K99/R00 awardee future funding success"). It would be great to rephrase these boilerplate subsection headers to be more declarative, like earlier subsection headings. For example, maybe say "Men receive the majority of K99 awards" or "No gender difference in the rate of conversion from K99 to R00" or something to that effect, depending on what result the authors wish to emphasize.

      Thank you for this comment. This is a very good point. We have re-worded the more generic headings in the revised version.

      1. Lastly, I would like to share a question that came to my mind that involves an additional analysis, but is work that is (probably) out-of-the-scope of this paper, but could instead be a separate paper or product. Circling back to Chris Pickett's FOIA-ed data on K99/R00 funding outcomes by applicant race and ethnicity (https://web.archive.org/web/20180723171128/http://rescuingbiomedicalresearch.org/blog/examining-distribution-k99r00-awards-race/): Given that Pickett's numbers provide incontrovertible information on the number of awards to various racial and ethnic groups, I wonder if it is possible to use this information as an "answer key" to (1) check the accuracy of an algorithm that assigns race based on name for applications in your analysis but for 2007-2017 period, and, (2) if the results are reasonable, then examine the dataset with race and ethnicity information. Some recent papers performing large-scale bibliometric analyses have applied such algorithms (e.g., see Kozlowski et al. 2022 PNAS Intersectional inequalities in science) and I wonder if they could be useful, or at least tested, here. Again, Pickett's data would serve as the benchmark to see if the algorithm produces numbers that are consistent with the actual funding outcomes; if they're not wildly off, or perhaps accurate for some groups but not others, there might be something here.

      This is a really insightful comment. We have discussed whether we could assign ethnicity based on an algorithm and check based on Chris Pickett’s data. We agree that it is beyond the scope of this article, but has potential for future research.

      Reviewer #3 (Recommendations For The Authors):

      -In the methods section, it would be helpful to provide an overview of the number of universities, departments, and faculty represented in the data analyzed in the study.

      Thank you for this comment. We agree with the reviewer. We have added a section to the results discussing the distribution of different types of institutions. We also added Table 3 and Table 4 and a new Figure 5 describing these. Regarding the faculty, we have discussed the demographics of the K99 and R00 awardees as best as we could. We do not have data on which faculty laboratories the K99 awardees were in when they received their awards. This information is not available through NIH RePORTER.

      -I would consider incorporating, or at least citing, Jeff Lockhart and colleagues' recent Nature Human Behaviour article "Name-based demographic inference and the unequal distribution of misrecognition" to provide readers with an additional resource and more information about the likelihood of misattribution, as well as general cautionary notes about using gender and race/ethnicity ascription/imputation approaches and tools for research.

      Thank you for bringing this reference to our attention. We have incorporated this into the methods section describing our name-based gender determination.

      -In the next to last sentence under the final paragraph of the methods section, there looks to be a typo as it should read "K99 or R00," not "K00" as currently written.

      Thank you for catching this. We have now corrected it.

      -Clarifying some of the data and measures used is necessary to limit confusion and misinterpretations of the study's findings.

      Thank you. We have significantly updated the revised manuscript and hope that it is clearer.

      -Elaborating more on the gender inequality notable in the Cox proportional hazard model would strengthen the authors' point about persistent gender inequalities within the K99/R00 funding mechanism and pathways. In its current iteration, the findings are somewhat buried by the discussion of institutional differences, but when we look at the findings and the plot associated with the model, we notice that men have more advantages than women in funding and institutional location.

      Thank you for highlighting this. This is true and we have elaborated on the gender inequality in the revised version of the manuscript.

      -Also for the Cox proportional hazard model, I would consider exploring the inclusion of data that can further clarify the biomedical research infrastructure of institutions. For example, in the conversation about the differences between Princeton and other universities including other Ivies, it's important to note that Princeton does not have a medical school. Moreover, other institutions do not operate or are not affiliated with a hospital. Adding more data to the model that can better contextualize the research infrastructure around researchers with NIH awards, beyond the size of the NIH portfolio, can shed light on other possibly important institutional differences that undergird these inequalities.

      Thank you for this comment. We have added additional details about the institutional type; however, examining whether institutions are attached to a hospital (or are themselves a hospital, like MGH) or whether institutions include a medical school may be difficult. We would have to manually code these and then determine whether or not the award recipient was affiliated with a department within that entity. We believe that this is a fascinating question but that it is out of the scope of the present manuscript. This is something that we will look into for potential future publications.

      -Throughout the manuscript there's usage of "elite" and "prestigious" that are somewhat ambiguous regarding what exactly they are referring to about institutional characteristics. This is a common issue in the literature, but trying to clarify what these terms specifically mean for the current study and checking for consistent usage with limited interchangeability that can add confusion for readers about what is being referred to would give added strength to the conversation provided by the authors.

      Thank you for this suggestion. Based on these comments and those by the other reviewers, in the revised version of the manuscript, we have limited the use of “elite” and “prestigious” to describe institutions in order not to perpetuate biases toward certain institutions.

      -In relation to the discussion at the end of the manuscript of the longer time to award noted for researchers who stay at the same institutions, another possible explanation for the disparity is that their institutions rely on them for service work (e.g., hiring committees, departmental committees, supporting graduate students through mentoring and/or dissertation committee work, etc.), given their knowledge of and experience within the institution.

      Thank you for this suggestion. We have added 2 sentences to the discussion reflecting this possibility.

      -Engaging with how STEM professional cultures can perpetuate these funding disparities and related hiring and career outcomes could enhance the contributions of the study. In relation to STEM professional cultures, engaging with the work of Mary Blair-Loy and Erin Cech in their recent book, Misconceiving Merit, could help provide additional insights for readers.

      Thank you for these comments. We have incorporated edits to the revised manuscript reflecting the work of Erin Cech and Mary Blair-Loy.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors showed that activation of RelA and Stat3 in hepatocytes of DSS-treated mice induced CYPs and thereby produced primary bile acids, particularly CDCA, which exacerbated intestinal inflammation.

      Strengths:

      This study reveals that the RelA/Stat3-dependent gene program in the liver influences intestinal homeostasis.

      Weaknesses:

      Additional evidence will strengthen the conclusion.

      1) In Fig. 1C, photos show that phosphorylation of RelA and Stat3 was induced in only a few hepatocytes. The authors conclude that activation of both RelA and Stat3 induces inflammatory pathways. Therefore, the authors should show that phosphorylation of RelA and Stat3 is induced in the same hepatocytes during DSS treatment.

      Experiments are in progress and data will be submitted in the revised manuscript: co-staining of pRela and pStat3(727) on treated liver sections.

      2) In Fig. 5, the authors treated mice with CDCA intraperitoneally. In this experiment, the concentration of CDCA in the colon of CDCA-treated mice should be shown.

      Experiments are in progress and data will be submitted in the revised manuscript: supplementation of CDCA to knockout animals and estimation of CDCA in the colon of DSS-treated and untreated animals.

      Reviewer #2 (Public Review):

      Singh and colleagues employ a methodic approach to reveal the function of the transcription factors Rela and Stat3 in the regulation of the inflammatory response in the intestine.

      Strengths of the manuscript include the focus on the function of these transcription factors in hepatocytes and the discovery of their role in the systemic response to experimental colitis. While the systemic response to induced colitis is appreciated, the cellular and molecular mechanisms that drive such a systemic response, especially those involving organs beyond the intestine, are an active area of research. As such, this study contributes to this conceptual advance. Additional strengths are the complementary biochemical and metabolomics approaches to describe the activation of these transcription factors in the liver and their requirement - specifically in hepatocytes - for the production of bile acids in response to colitis.

      Some weaknesses are noted in the presentation of the data, including a lack of comprehensive representation of findings in all conditions and genotypes tested.

      These will be incorporated in the revised version.

      Reviewer #3 (Public Review):

      Summary:

      The authors try to elucidate the molecular mechanisms underlying the inter-organ crosstalk that perpetuates intestinal permeability and inflammation.

      Strengths:

      This study identifies a hepatocyte-specific rela/stat3 network as a potential therapeutic target for intestinal diseases via the gut-liver axis using both murine models and human samples.

      Weaknesses:

      1) The mechanism by which DSS administration induces the activation of the Rela and Stat3 pathways and subsequent modification of the bile acid pathway remains unclear. As the authors state, intestinal bacteria are one candidate, and this needs to be clarified. I recommend the authors investigate whether gut sterilization by administration of antibiotics or a germ-free condition affects (1) the activation of the Rela and Stat3 pathway in the liver of DSS-treated WT mice and (2) the reduction of colitis in DSS-treated relaΔhepstat3Δhep mice.

      Experiments are in progress and data will be submitted in the revised manuscript: mice will receive antibiotic treatment for 2/4 weeks and will subsequently be treated with DSS, and Rela and Stat3 phosphorylation will be tested using western blotting.

      2) It has not been shown whether DSS administration causes an increase in primary bile acids, represented by CDCA, in the colon of WT mice following activation of the Rela and Stat3 pathways, as demonstrated in Figure 6.

      We have demonstrated an enhanced level of CDCA in the colon following DSS treatment in the wild-type animals in Figure 4B.

      3) The implications of these results for IBD treatment, especially in what ways they may lead to therapeutic intervention, need to be discussed.

      These will be incorporated in the revised version.

    1. Author Response

      We decided to address the comments of the reviewers with additional experiments and modification of the text with the aim of submitting a new version of the report.

      We would like to underline that the current study is an extension of the work published in eLife (Atze et al., 2021). For this reason, and in agreement with eLife guidelines, we did not repeat all the background information on the method used to identify PG subunit isotopologues using mass spectrometry.

      Reviewer #1 (Public Review):

      Summary:

      Liang et al. use a previously devised full isotope labeling of peptidoglycan followed by mass spec to study the kinetics of Lpp tethering to PG and the hydrolysis of this bond by YafK.

      Strengths:

      -The labeling and mass spec analysis technique works very well to discern differentially labelled Tri-KR muropeptide containing new and old Lpp and PG.

      Weaknesses:

      -Only one line of experimentation using mass spec based analysis of labeled PG-Lpp is used to make all conclusions in the paper. The evidence is also not enough to fully delineate the role of YafK.

      Our approach based on heavy isotope labelling and mass spectrometry has the power to identify and kinetically characterize the specific products of the reaction leading to the tethering of Lpp to PG and the hydrolysis of the corresponding bond. We therefore advocate that our experimentation is sufficient to obtain meaningful results without combining other lines of experimentation.

      -Only one mutant (YafK) is used to make the conclusion.

      The aim of the study is to determine the effect of the hydrolysis of the PG→Lpp bond on the dynamics of the tethering of Lpp to PG. Since YafK is the only enzyme catalyzing this reaction, it is appropriate to compare the wild-type strain to an isogenic yafK deletion mutant. Nonetheless, we carefully consider this comment and will investigate the dynamics of the tethering of Lpp to PG in mutants deficient in the production of the L,D-transpeptidases responsible for tethering Lpp to PG.

      -The paper makes a lot of 'implications' with minimal proof to support their hypothesis. Other lines of experimentations must be added to fully delineate their claims.

      See our answer to the first comment.

      -Time points to analyse Tri-KR isotopologues in WT (0, 10, 20, 40, 60 min) and yafK mutant (0, 15, 25, 40, 60 min) are not the same.

      The purpose of the experiments is to compare the kinetics of formation and hydrolysis of the PG→Lpp bond in the WT versus ΔyafK strains. Comparison of the kinetics is therefore possible even though the kinetics are not based on the exact same time points. Nonetheless, we will reproduce the kinetics experiment (see also answers to Reviewer 2) and use the same time points in these additional experiments.

      -Experiments to define the physiological role of YafK are also missing.

      We will investigate the effect of the yafK deletion on the formation of outer membrane vesicles.

      Reviewer #2 (Public Review):

      Summary:

      The authors of this study have sought to better understand the timing and location of the attachment of the lpp lipoprotein to the peptidoglycan in E. coli, and to determine whether YafK is the hydrolase that cleaves lpp from the peptidoglycan.

      Strengths:

      The method is relatively straightforward. The authors are able to draw some clear conclusions from their results, that lpp molecules get cleaved from the peptidoglycan and then re-attached, and that YafK is important for that cleavage.

      Weaknesses:

      However, the authors make a few other conclusions from their data whose logic is harder to understand, or to feel confident in based on the existing data. They claim that their 5-time point kinetic data indicates that new lpp is not substantially added to lipid II before it is added to the peptidoglycan, and that instead lpp is attached primarily to old peptidoglycan. I believe that this conclusion comes from the comparison of Figs. 3A and 3C, where it appears that new lpp is added to old peptidoglycan a few minutes before new lpp is added to new peptidoglycan. However, the very small difference in the timing of this result, the minimal number of time points, and the complete lack of any presentation of calculated error in any of the data make this conclusion very tenuous. In addition, the authors conclude that lpp is not significantly attached to septal peptidoglycan. The logic behind this conclusion appears to be based on the same data, but the authors do not provide a quantitative model to support this idea.

      The reviewer is correct in stating that we claim that Lpp is not substantially added to lipid II before incorporation of the disaccharide-pentapeptide subunit into the expanding PG network. This conclusion is based on the paucity of PG-Lpp covalent adducts containing light PG and Lpp moieties at the earliest time points. To substantiate this finding more thoroughly, we will reproduce the kinetic experiments with more early time points. The paucity of the new→new PG-Lpp isotopologues also implies that Lpp might not be extensively tethered to septal peptidoglycan since the latter is assembled from newly synthesized PG (see our previous publication Atze et al. 2021 and references therein). Quantitatively, septal synthesis roughly accounts for one third of the total PG synthesis. It is therefore expected that tethering of Lpp to septal PG would represent one third of the total number of newly synthesized Lpp molecules tethered to PG. We therefore proposed that the paucity of new→new PG-Lpp isotopologues at early time points of the kinetics implies that Lpp is preferentially tethered to the side wall. This is only one of several conclusions that we reach in the present study and we were very careful in the wording of our results.

      -This work will have a moderate impact on the field of research in which the connections between the OM and the peptidoglycan are being studied in E. coli. Since lpp is not widely conserved in Gram-negatives, the impact across species is not clear. The authors do not discuss the impact of their work in depth.

      We respectfully disagree with this reviewer’s comment. The work reported in this article for E. coli opens the way to the analysis and comparison of the mechanisms of the tethering of proteins to PG in various bacteria. In addition, we would like to stress that the Gram-negative bacteria that produce Lpp-related proteins and tether them to the PG include other major pathogens such as Pseudomonas aeruginosa (DOI: 10.1128/spectrum.05217-22).

    1. Author Response

      eLife assessment

      The manuscript presents valuable evidence of temporal correlations during specific oscillatory activity between the prefrontal cortex, thalamic nucleus reuniens, and the hippocampus, in naturally sleeping animals. Such correlations represent solid evidence to support the notion that the thalamic nucleus reuniens participates in the hippocampal and prefrontal cortex dialogue subserving memory processes.

      Thank you for your assessment.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Basha and colleagues aim to test whether the thalamic nucleus reuniens can facilitate the hippocampus/prefrontal cortex coupling during sleep. Considering the importance of sleep in memory consolidation, this study is important to understand the functional interaction between these three major regions involved. This work suggests that the thalamic nucleus reuniens has a functional role in synchronizing the hippocampus and prefrontal cortex.

      Strengths:

      The authors performed recordings in naturally sleeping cats, and analysed the correlation between the main slow wave sleep oscillatory hallmarks: slow waves, spindles, and hippocampal ripples, and with reuniens' neurons firing. They also associated intracellular recordings to assess the reuniens-prefrontal connectivity, and computational models of large networks in which they determined that the coupling of oscillations is modulated by the strength of hippocampal-thalamic connections.

      Thank you for your positive evaluation.

      Weaknesses:

      The authors' main claim is made on slow waves and spindle coupling, which are recorded both in the prefrontal cortex and surprisingly in reuniens. Known to be generated in the cortex by cortico-thalamic mechanisms, the slow waves and spindles recorded in reuniens show no evidence of local generation in the reuniens, which is not anatomically equipped to generate such activities. Until shown differently, these oscillations recorded in reuniens are most likely volume-conducted from nearby cortices. Therefore, such a caveat is a major obstacle to analysing their correlation (in time or frequency domains) with oscillations in other regions.

      1. We fully agree with the reviewer that reuniens likely generates neither slow waves nor spindles. We do not make such a claim, which we clearly stated in the discussion (lines 319-324). We propose that reuniens neurons mediate different forms of activity. In the model, we introduced the MD nucleus only because without MD we were unable to generate spindles. While the slow waves and spindles are generated in other thalamocortical regions, the REU neurons show these rhythms due to long-range projections from these regions to REU, as has been shown in the model.

      2. We certainly cannot exclude some influence of volume conduction on the LFP recordings obtained in the REU nucleus. However, we show modulation of spiking activity within REU by spindles. Spike modulation cannot be explained by volume conduction but can be explained by either synaptic drive (likely the case here) or some intrinsic neuronal processes (like T-current).

      3. For spike identification in REU, we used tetrode recordings. If slow waves and spindles are volume conducted, then slow waves and spindles recorded with tetrodes should have identical shape. Following the reviewer's comment, we took these recordings and subtracted one channel from another. The difference in signal during slow waves is on the order of 0.1 mV. Considering that the distance between electrodes is on the order of 20 µm, such a difference in voltage is major and can only be explained by local extracellular currents, likely due to synaptic activities originating in afferent structures.
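      To make the scale of the argument in point 3 concrete, the following is a minimal sketch of the channel subtraction and the voltage gradient it implies. It assumes signals in mV and electrode spacing in µm; the function names are hypothetical, the example inputs are the round figures quoted above rather than recorded data, and this is not the authors' analysis code.

```python
def channel_difference(ch_a_mv, ch_b_mv):
    """Sample-by-sample difference between two tetrode channels (mV)."""
    return [a - b for a, b in zip(ch_a_mv, ch_b_mv)]


def gradient_mv_per_mm(delta_mv, spacing_um):
    """Voltage gradient implied by a difference delta_mv across spacing_um."""
    spacing_mm = spacing_um / 1000.0  # convert micrometres to millimetres
    return delta_mv / spacing_mm


# Round figures quoted in the response: ~0.1 mV across ~20 µm.
print(gradient_mv_per_mm(0.1, 20))
```

      With these figures the gradient is about 5 mV/mm. A purely volume-conducted signal from a distant source would be expected to look nearly identical on electrodes this close together, which is the point the response makes.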

      Finally, the choice of the animal model (cats) is not the best suited one, as too few data, particularly anatomical ones regarding reuniens connectivity, are available to support functional results.

      1. The thalamus of the majority of mammals (definitely primates and carnivores, including cats) contains local circuit interneurons (about 30 % of all neurons). A vast majority of studies in rodents (except for the LGN nucleus) report either an absence or an extremely low number of thalamic interneurons (e.g. Jager P, Moore G, Calpin P, et al. Dual midbrain and forebrain origins of thalamic inhibitory interneurons. eLife. 2021; 10: e59272). Therefore, studies on species other than rodents are necessary and bring new information that is impossible to obtain in rodents.

      2. The cat brain is much larger than that of mice or rats; therefore, the effects of volume conduction from cortex to REU are much smaller, if not negligible. The distance between REU and the closest cortical structure (ectosylvian gyrus) in cats is about 15 mm.

      3. Indeed, there is much less anatomical data on cats than on rodents. This is why we performed the experiments shown in Figure 1. This figure contains functional anatomy data. Antidromic responses show that the recorded structure projects to the stimulated structure. Orthodromic responses show that the stimulated structure projects to the recorded structure.

      Reviewer #2 (Public Review):

      Summary:

      The interplay between the medial prefrontal cortex and ventral hippocampal system is critical for many cognitive processes, including memory and its consolidation over time. A prominent idea in recent research is that this relationship is mediated at least in part by the midline nucleus reuniens with respect to consolidation in particular. Whereas the bulk of evidence has focused on neuroanatomy and the effects of temporary or permanent lesions of the nucleus reuniens, the current work examined the electrophysiology of these three structures and how they inter-relate, especially during sleep, which is anticipated to be critical for consolidation. They provide evidence from intracellular recordings of the bi-directional functional connectivity among these structures. There is an emphasis on the interactions between these regions during sleep, especially slow-wave sleep. They provide evidence, in cats, that cortical slow waves precede reuniens slow waves and hippocampal sharp-wave ripples, which may reflect prefrontal control of the timing of thalamic and hippocampal events. They also find evidence that hippocampal sharp-wave ripples trigger thalamic firing and precede the onset of reuniens and medial prefrontal cortex spindles. The authors suggest that the effectiveness of bidirectional connections between the reuniens and the (ventral) CA1 is particularly strong during non-rapid eye movement sleep in the cat. This is a very interesting, complex study on a highly topical subject.

      Strengths:

      An excellent array of different electrophysiological techniques and analyses are conducted. The temporal relationships described are novel findings that suggest mechanisms behind the interactions between the key regions of interest. These may be of value for future experimental studies to test more directly their association with memory consolidation.

      We thank this reviewer for the very positive evaluation of our study.

      Weaknesses:

      Given the complexity and number of findings provided, clearer explanation(s) and organisation that directed the specific value and importance of different findings would improve the paper. Most readers may then find it easier to follow the specific relevance of key approaches and findings and their emphasis. For example, the fact that bidirectional connections exist in the model system is not new per se. How and why the specific findings add to existing literature would have more impact if this information was addressed more directly in the written text and in the figure legends.

      Thank you for this comment. In the revised version, we will do our best to simplify presentation and more clearly explain our findings.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Activity has effects on the development of neural circuitry during almost any step of differentiation. In particular during specific time periods of circuit development, so-called critical periods (CP), altered neural activity can induce permanent changes in network excitability. In complex neural networks, it is often difficult to pinpoint the specific network components that are permanently altered by activity, and it often remains unclear how activity is integrated during the CP to set mature network excitability. This study combines electrophysiology with pharmacological and optogenetic manipulation in the Drosophila genetic model system to pinpoint the neural substrate that is influenced by altered activity during a critical period (CP) of larval locomotor circuit development. Moreover, it is then tested whether and how different manipulations of synaptic input are integrated during the CP to tune network excitability.

      Strengths:

      Based on previous work, during the CP, network activity is increased by feeding the GABA-AR antagonist PTX. This results in permanent network activity changes, as highly convincingly assayed by a prolonged recovery period following induced seizure and by altered intersegmental locomotor network coordination. This is then used to provide two important findings: First, compelling electro- and optophysiological experiments track the site of network change down to the level of single neurons and pre- versus postsynaptic specializations. In short, increased activity during the CP increases both the magnitude of excitatory and inhibitory synaptic transmission to the aCC motoneuron, but excitation is affected more strongly. This results in altered excitation inhibition ratios. Fine electrophysiology shows that excitatory synapse strengthening occurs postsynaptically. High-quality anatomy shows that dendrite size and numbers of synaptic contacts remain unaltered. It is a major accomplishment to track the tuning of network excitability during the CP down to the physiology of specific synapses to identified neurons.

      Second, additional experiments with single neuron resolution demonstrate that during the CP different forms of activity manipulation are integrated so that opposing manipulations can rescue altered setpoints. This provides novel insight into how developing neural network excitability is tuned, and it indicates that during the CP, training can rescue the effects of hyperactivity.

      Weaknesses:

      There are no major weaknesses to the findings presented, but the molecular cause that underlies increased motoneuron postsynaptic responsiveness as well as the mechanism that integrates different forms of activity during the CP remain unknown. It is clear that addressing these experimentally is beyond the scope of this study, but some discussion about different candidates would be helpful.

      We discuss likely mechanisms that underpin the increase in postsynaptic responsiveness below (Reviewer #1 (Recommendations For The Authors):, point 2). To address possible mechanisms that integrate different forms of activity we now include a new paragraph in the discussion.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors use the tractable Drosophila embryonic/larval motor circuit to determine how manipulations of activity during a critical period (CP) modify the circuit in ways that persist into later developmental stages. Previously, this group demonstrated that manipulations to the aCC/MN-Ib neuron in embryonic stages enhance (or can rescue) susceptibility to seizures at later larval stages. Here, the authors demonstrate that following enhanced excitatory drive (by PTX feeding), the aCC neuron acquires increased sensitivity to cholinergic excitatory transmission, presumably due to increased postsynaptic receptor abundance and/or sensitivity, although this is not clarified. Although locomotion is not altered at later developmental larval stages, the authors suggest there is reduced "robustness" to induced seizures. The second part of the study then goes on to enhance inhibition during the CP in an attempt to counteract the enhanced excitation, and show that many aspects of the CP plasticity are rescued. The authors conclude that "average" E/I activity is integrated during the CP to determine the excitability of the mature locomotor network.

      Overall, this study provides compelling mechanistic insight into how a final motor output neuron changes in response to enhanced excitatory drive during a CP to change the functionality of the circuit at later mature developmental stages. The first part of this study is strong, clearly showing the changes in the aCC neuron that result from enhanced excitatory input. This includes very nice electrophysiology and imaging data that assess synaptic function and structure onto aCC neurons from pre-motor inputs resulting from PTX exposure during development. However, the later experiments in Figures 6 and 7 designed to counteract the CP plasticity are somewhat difficult to interpret. In particular, the specificity of the manipulations of the ch neuron intended to counteract the CP plasticity is unclear, given the complexities of how these changes impact the excitability of all neurons during development. It is clear that CP plasticity is largely rescued in later stages, but it is hard to know if downstream or secondary adaptations may be masking the PTX-induced plasticity normally observed. Nonetheless, this study provides an important advance in our understanding of what parameters change during CPs to calibrate network dynamics at later developmental stages.

      Reviewer #3 (Public Review):

      Summary:

      In Hunter, Coulson et al, the authors seek to expand our understanding of how neural activity during developmental critical periods might control the function of the nervous system later in life. To achieve increased excitation, the authors build on their previous results and apply picrotoxin 17-19 hours after egg-laying, which is a critical period of nervous system development. This early enhancement of excitation leads to multiple effects in third-instar larvae, including prolonged recovery from electroshock, increased synchronization of motor neuron networks, and increased AP firing frequency. Using optogenetics and whole-cell patch clamp electrophysiology, the authors elegantly show that picrotoxin-induced over-excitation leads to increased strength of excitatory inputs and not loss of inhibitory inputs. To enhance inhibition, the authors chose an approach that involved the stimulation of mechanosensory neurons; this counteracts picrotoxin-induced signs of increased excitation. This approach to enhancing inhibition requires further control experiments and validation.

      Strengths:

      • The authors confirm their previous results and show that 17-19 hours after egg laying is a critical period of nervous system development.

      • Using Ca2+/Sr2+ substitutions, the authors demonstrate that synaptic connections between A18a → aCC show increased mEPSP amplitudes. The authors show that this aCC input is what is driving enhanced excitation.

      • The authors demonstrate that the effects of over-excitation attributed to picrotoxin exposure are generalizable and also occur in bss mutant flies.

      Weaknesses:

      • The authors build on their previous work and argue that the critical period (17-19h after egg-laying) is a uniquely sensitive period of development. Have the authors already demonstrated that exposure to picrotoxin at L1 or L2 (and even early L3 if experimentally possible) does not lead to changes in induced seizure at L3? This would further the authors' hypothesis of the uniqueness of the 17-19h AEL period. If this has already been established in prior publications, then this needs to be further explained. I do note in Giachello and Baines (2015) that Fig 2E shows the identification of the 17-19h window.

      This is a pertinent comment. We now have evidence that activity manipulation (in this instance by increasing temperature, which recapitulates the effect of PTX) is not effective at larval stages (L1 to L3) but remains effective between 17-19hrs AEL. This observation forms part of a separate study where we explore the role of circadian activity on embryonic and larval neuronal development. We include a brief statement to address this comment in the revision (first paragraph of Results).

      • Regarding experiments in Fig 2, authors only report changes in AP firing frequency. Can the authors also report other metrics of excitability, including measures of intrinsic excitability with and without picrotoxin exposure (including RMP, Rm)? Was a different amount of current injection needed to evoke stable 5-10 Hz firing with and without picrotoxin? In the representative figure (Fig. 2A), it appears that the baseline firing frequencies are different prior to optogenetic stimulation.

      No differences in RM, Rin or capacitance were observed due to PTX. This is now included in the revision along with an explanation that different levels of current injection were used to measure effects on excitatory vs inhibitory synaptic drive. We did not specifically monitor the amount of current required to maintain stable firing.

      • The ch-related experiments require further controls and explanation. Regarding experiments in Fig 6, what is the effect of ch neuron stimulation alone on time lag and AP frequency? Can the authors further clarify what is known about connections between aCC and ch neurons? It is difficult for this reviewer to conceptualize how enhancing ch-mediated inhibition would worsen seizures. While the cited study (Carreira-Rosario et al 2021) convincingly shows that inhibition of mechanosensory input leads to excessive spontaneous network activity, has it been shown that the converse - stimulation of ch neurons - indeed enhances network inhibition?

      • The interpretation of ch-related experiments is further complicated by the explanation in the Discussion that ch neuron stimulation depolarizes aCC neurons; this seems to undercut the authors' previous explanation that the increased E:I ratio is corrected by enhanced inhibition from ch neurons. The idea that ch neurons are placing neurons in a depolarized refractory state is not substantiated by data in the paper or citations.

      To respond to these two points combined: The reviewer is correct in stating that additional experiments will be required to fully understand mechanism. We believe that cholinergic (excitatory) chordotonal input to aCC may be an important component for setting the rhythm of the locomotor CPG. Indeed, it may be that CPG rhythm is a key factor during the CP. Our observations suggest optogenetic stimulation of Ch neurons alone is sufficient to induce large, ~400-, currents that resemble endogenous spontaneous rhythmic currents (SRCs) associated with CPG activity. SRCs occur with a characteristic frequency of ~1Hz, and we have some unpublished data that suggests it is possible to change this frequency using ch stimulation. This data therefore unifies prior work (Carreira-Rosario et al., 2021 description of a brake) with our own (observation that ch depolarize aCC). However, we do not include this speculation in the Discussion because the experiments we have conducted were pilots. They may be expanded upon and included in future work.

      • In the Discussion, the authors suggest that enhanced proprioception leading to seizures is reminiscent of neurological conditions. This seems to be an oversimplification. Connecting abnormal proprioception to seizures is quite different from connecting abnormal proprioception to disorders of coordination. This should be revised.

      Because this is peripheral to our main study, we have deleted this from the revision.

      Reviewer #1 (Recommendations For The Authors):

      1. Although the authors have to be commended for the scrutiny with which they pinpoint a site of circuit change, it cannot be excluded that other parts of the circuit also undergo adjustments in response to activity manipulation during the CP, e.g. the membrane properties of the interneurons. This is not a problem but should be discussed.

      We agree with this comment and have added the following text to the discussion……’However, we recognise that other parts of the locomotor network may also undergo change due to CP manipulation. The advantage of this system is that most of these elements are now open to specific manipulation through cell-specific genetic drivers’. (Discussion paragraph 3)

      2. It is surprising that there is no discussion of the potential molecular cause for the observed increases in postsynaptic responses to SV release from cholinergic neurons. Given that there are no differences in postsynaptic structure, puncta number etc., the subunit composition of the nAChR seems an obvious guess. What is known about the nAChRs subunit composition on aCC, and when during development do the receptors/different subunits become expressed? A paragraph in the discussion on this issue would be highly relevant to the manuscript.

      Our own work (unpublished) together with a recent paper from the Littleton lab (https://www.sciencedirect.com/science/article/pii/S0896627323005810?via%3Dihub#mmc2) suggests that aCC expresses the majority, if not all, of the 7 alpha and 3 beta subunits that comprise nAChRs. The situation is further complicated by the fact that these receptors are pentameric and are composed of various subunits – the composition significantly altering channel kinetics. Less is known about expression timelines for each receptor subunit, and certainly not in aCC. We already include the following sentence in the results text……’ A change in the frequency of mini excitatory postsynaptic potentials (mEPSPs, a.k.a. minis) would suggest the adaptation is primarily presynaptic (e.g. increased probability of release), whilst a change in distribution and/or amplitude of minis is more consistent with a mechanism acting postsynaptically (e.g. increased or altered receptor subunits).’ Given that we know next to nothing about the nAChR subunit composition in aCC and how this might change due to CP manipulation, we feel it better not to speculate further. To help the reader, we include the following sentence in the discussion……’The precise mechanism contributing to increased mini amplitude remains to be determined, but a plausible scenario may involve change in cholinergic subunit composition.’ (Discussion paragraph 3)

      3. It would be important to provide the p-values for Figures 1B and C, especially because it seems that the inhibition also becomes stronger upon PTX treatment during the CP. There is no statistical testing mentioned, was no test done or was it not significant? It is agreed that the effect size is clearly stronger for the increased excitation than for the increased inhibition, but looking at the data suggests that the effect on excitation is not much more significant than the effect on inhibition.

      The reviewer is referring to Fig 2B&C. P values have been added to both main text and to the figure legend.

      4. Associated with the point above, in the discussion line 407 and below the authors come back to this point and reason that it is surprising that increased excitation is not compensated for by homeostatic mechanisms. It is concluded that homeostatic compensation brings the system back to a setpoint that is defined during the critical period, but the setpoint is set higher in this case. However, an alternative explanation is that GABA administration during the critical period causes the excitation set point to be too high, but this is then partially counteracted in a homeostatic manner by increasing inhibition. If the p-values in Figures 2B and C are rather similar, this might even be the favorable interpretation.

      We believe the reviewer means ‘PTX administration’ and not GABA. This is an interesting idea and one we had not really considered. We address this comment by adding the following text………. ‘Alternatively, whilst the increased inhibition we observe is not statistically significant (p = 0.15), it is close and has a medium effect size (Cohen’s d = 0.78), and thus may be indicative of an attempt by the locomotor network to rebalance activity back towards a genetically pre-determined level. In this regard, it may just not have sufficient range to be able to counter the increase in excitation due to CP manipulation.’ (Discussion paragraph 5)
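      For readers unfamiliar with the effect-size measure quoted above, the following is a minimal sketch of Cohen's d using the pooled standard deviation. The response does not state which variant of d was computed, so the pooled-SD form is an assumption, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference of means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up numbers for illustration only, not data from the study.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

      By the usual convention, values near 0.5 are described as medium and values near 0.8 as large, so d = 0.78 sits towards the upper end of the medium range.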

      5. To assess the magnitudes of A18a-mediated excitation and A31k-mediated inhibition to aCC, changes in aCC firing frequency were measured. For this aCC was injected with current to fire at all. However, the current injections were chosen to cause firing at 5-10 Hz. During a crawling burst, aCC fires well above 100Hz (Kadas et al., 2017). Are the effects also visible at such firing frequencies, or at least across different firing frequencies? I am not asking for additional experiments, but maybe the data are there and can be referred to?

      Spiking in aCC occurs as burst firing, evoked by cholinergic synaptic drive, that lasts for ~300ms and achieves firing frequencies of between 50-100Hz (Kadas et al., 2017 and our own unpublished data). We did not test for effects on excitation or inhibition at these higher frequencies. We now make this explicit in the discussion by adding the following sentence……’The firing frequencies that we imposed (1-10Hz) are also lower than seen during fictive locomotion (Kadas et al., 2017), which shows burst firing lasting for ~300 ms and achieving spike frequencies of up to 100Hz.’ (Discussion paragraph 3)

      1. In Figure 3B some minis are demarked by green arrows and others are not. Were the non-marked ones not included in the analysis, and what were the criteria to mark some and others not? This is particularly important because the cumulative distribution of minis is analyzed in Figure 3D, and this depends crucially on what qualifies as mini and what does not.

      All minis are marked by green arrows. The events not marked are not minis. Drosophila neurons are small and have an unfavourable dendritic structure for recording minis. Thus, we carefully analyse traces by eye taking only events that show very rapid rise times and slower, exponential decay (the typical mini shape). There are, however, other events which are most likely single/multiple channel openings, which due to filtering are rounded. We now include this same trace, greatly expanded, as Fig S1D to show how we identified minis from non-minis.

      1. The asynchronous release experiment under Sr2+ seems an elegant way to analyze minis upon optogenetic stimulation of an identified presynaptic cholinergic neuron. I suggest being a little more conservative with the term asynchronous release (or replacing it), which is usually the release of many single vesicles that follow AP-mediated synaptic transmission and has nicely been demonstrated at the Drosophila NMJ (Besse et al., 2007). Also, please show the trace in Figure S2A under Sr2+ at a higher pA magnification, it is really hard to see the minis there.

      We have adopted a previously published technique that, in our view, correctly uses the term ‘asynchronous release’. This is not to say that all asynchronous release occurs via the same mechanism. Indeed, the papers that report the technique we use predate Besse 2007. We also expand the trace in Fig S1A (not S2A as wrongly indicated).

      Reviewer #2 (Recommendations For The Authors):

      1. Can the authors explain what they think is the parameter of "activity" being measured in the locomotor circuit (mainly aCC) during the CP? Is the aCC neuron simply summing (perhaps through a proxy like Ca2+) total excitation/inhibition over time during the CP?

      Reviewer #1 also requests that we discuss how activity is ‘measured’ and thus we now include a dedicated paragraph in the discussion to address this concern. Whether aCC sums ‘average’ activity or perhaps is influenced by activity extremes remains uncertain. Our data is consistent with the former but further work is required to validate our conclusion. This work will be published in due course.

      Related to understanding this concept, could the authors silence activity (using Kir2.1, TNT, or BoNT) from each of the monosynaptic premotor inputs in otherwise wildtype and following PTX exposure to determine how the circuit responds when each of the monosynaptic inputs are silenced? This might inform the role they play in instructing how activity is measured over time during the CP.

      This is an excellent suggestion and, indeed, we have planned such experiments. Silencing specific neurons, whilst manipulating the CP, may well result in more significant network instability due to the setting of multiple (and physiologically inappropriate) homeostatic set points. Such studies go beyond the scope of the present study and thus we prefer not to speculate at this early stage, but to wait for experimental data.

      On a related note, the authors focus on just 2 premotor inputs, presumably due to the availability of specific drivers. But do the authors know how many other inputs (other ACh, Gaba, and glutamate) onto aCC there are, and to what extent do the authors think these are changed in similar or distinct ways? Is it implied that all neurons are similarly altered by the manipulations?

      The connectome details the number and types of neurons that directly contact the aCC motoneuron (Zarin et al., 2019). In terms of cholinergic excitors, the results presented in Figure 3 suggest that most (all?) inputs are strengthened following embryonic PTX exposure. However, to conclude this would be highly speculative and thus we refrain from doing so in the manuscript. As other single-neuron driver lines become available, such experiments will hopefully be possible.

      1. If PTX treatment does indeed increase CPG synchronicity, shouldn't there be a readout of this effect on larval locomotion? While the speed of locomotion wasn't significantly impacted, perhaps another parameter was altered.

      It is quite possible that other aspects of locomotion are being altered (turning, rearing, etc), but we have not analysed for these more subtle behaviours. Indeed, although not statistically significant, there is a modest reduction in average velocity in larvae derived from PTX-exposed embryos. We see similar reductions in characterised seizure mutants which also show increased synchronicity (Streit et al., 2016).

      1. In Figure 2 and elsewhere, what is the baseline level of AP firing rate in each aCC neuron, before optogenetic stimulation? Is this informative about how PTX exposure alters excitability to begin with, perhaps by changing intrinsic excitability.

      We now include this data in the relevant results section. Interestingly, following exposure to PTX, basal firing was significantly increased in A18a (excitatory premotor) but not in A31k (inhibitory premotor). This reflects our experiment in which we conclude that excitatory drive to aCC is increased relative to inhibitory synaptic drive. Thus, this measure seemingly validates our conclusion that E:I balance has been altered following activity-manipulation during the CP.

      1. Figure 3: The apparent increase in mini amplitude is very small (4.1 vs 4.5 pA); is this physiologically meaningful? Although the authors say the decrease in mini freq is not significant in Fig. 3B after PTX, it does appear rather large, a 40% reduction (5 vs 3 Hz).

      We must be guided by statistics in drawing conclusions, but the reader can interpret our data as they wish. Minis measure quantal release and thus to appreciate how small change can, when combined over the many receptors present, influence cell physiology, one needs to compare spiking activity. We show in Fig 2 that such change is sufficient to increase the excitatory synaptic drive provided by the A18a neuron. The seemingly larger reduction in mini frequency is intriguing and may reflect additional change, but without further experiments we cannot draw firm conclusions.

      1. The clever vibration assay is a good one to induce the activation of mechanosensory neurons, but the specificity of the changes induced by this is difficult to ascertain. One possibility would be to silence the output of the ch neurons (by expression of tetanus or botulinum toxin) and still put the larvae through the same vibration during the CP to see if the rescue is lost.

      We agree that further experiments are required to fully understand underlying mechanism(s). However, we will not be able to complete such follow-on expts in a timely manner and thus, these must wait and form the basis of future studies.

      Minor points 1. Typos - there are numerous areas where it seems a comma is used inappropriately (e.g. lines 28, 69, 77, 104, 348, 365, etc). Suggest line editing the final "version of record".

      Checked and corrected.

      1. It would be of benefit to show the genotypes of the larvae in the various experimental manipulations in the relevant figure legends. This reviewer could not follow exactly how each experiment was done as it was not always clear which driver was being used to express which transgene in what genetic background.

      Done

      Reviewer #3 (Recommendations For The Authors):

      • Please provide sample videos of electroshock-induced seizures (e.g. Fig 1B). Is it clear that the period of immobility after electroshock is a seizure (perhaps defined as hyperactivity originating from the brain)? I acknowledge the Baines group is quite skilled in this technique and perhaps there is a straightforward answer or citation to include.

      We refer the reader to Marley and Baines 2011 which contains videos of seizure activity (first paragraph of Results).

      • Seizures are generated in the brain and travel to the periphery. Do the authors think it is possible that the peripheral manipulations in this manuscript might be controlling the behavioral readout of seizures without affecting hypersynchronous activity in the brain?

      We include the following statement (in methods) to provide our best understanding for how peripheral electroshock induces seizure………. ‘Strong peripheral stimulation likely causes excessive and synchronous synaptic excitation within the CNS resulting in seizure. However, the precise mechanism of this effect remains to be determined.’ Moreover, we feel it unlikely that manipulation of Ch neurons, by vibration, would suppress the effects we observe via peripheral mechanisms. Indeed, the Ch manipulation is limited to the embryonic CP, whilst our seizure assays are recorded many days later at L3.

      • How might enhancement of inhibition lead to worsened seizures? Is the enhancement of ch-related inhibition selectively affecting inhibitory circuits, thereby leading to a net increase in excitation?

      This is a difficult point to respond to at present. Enhanced inhibition per se might similarly disturb the encoding of an appropriate homeostatic setpoint(s) thus leaving a network open to being destabilized by a strong stimulus. Indeed, we have previously shown that increased inhibition during the CP results in the same effect (seizure) as increasing excitation (Giachello and Baines, 2015). Thus, presuming activation of Ch neurons during the CP translates to increased inhibition, then worsened seizure behaviour is a predictable effect. How this is achieved remains unknown and we prefer not to speculate here.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      We are pleased that Reviewers 1 and 3 have recommended that the revised paper be published.

      Reviewer #2

      For point A: Their preliminary simulation in 3D looks also nice, although it’s referenced in the discussion but not actually included in manuscript - I would advise adding it even under the mention of preliminary.

      We appreciate the reviewer for liking our 3D results and suggesting to include them in the manuscript. However, these are preliminary results of our ongoing work. We are yet to establish the corresponding viscosity results quantitatively in the 3D simulations. Because the relationship between viscosity and relaxation time is not (always) linear in glass forming systems, we hesitate to report our results for publication. We hope to report the new results as part of a separate work.

      For point B/C: I see some of the points of the authors - although not all of it made it in the main text. I still have some points that puzzle me. For instance, the authors mention that a single value of viscosity (from Green-Kubo) is “valid for all time scales and amplitude”. This sounds very surprising to me for a complex fluid even at equilibrium: doesn’t it for instance assume linear response (hence small amplitudes)? Fast vs slow probing of a complex medium should also matter (see refs previously mentioned). Related to this, it’s not clear how can self-propulsion not matter if one would shear the system at a finite time scale, given past work on motility-driven unjamming and the mechanism of the authors from facilitation (wouldn’t shearing at time scales larger vs smaller than the typical time for given cells to spontaneously rearrange from self-propulsion change drastically the effective complex modulus of the system?)

      There might be a slight misunderstanding between the reviewer and us when we say ‘single value of viscosity is valid for all time-scales and amplitude’. Let us explain this point more carefully. In our problem, we are studying the dynamics of a many body system which is undergoing Brownian dynamics where the fluctuation-dissipation theorem need not be valid (as the friction and the self-propulsion noise strength are not related via the Fluctuation-Dissipation Theorem). Now, for us to use the concepts of linear-response (which in the present study are the Green-Kubo relations for the transport coefficients in terms of time-correlation functions), we need to show that, within the simulation time, the system has reached a state that could be described using an “equilibrium” probability measure. This is the precise reason we calculated the ergodicity measure, which is a way to show that all of the phase-space has been sampled uniformly under the given Brownian dynamics. This suggests (does not prove) that the system has attained a stationary probability measure (i.e, near equilibrium) for the value of self-propulsion used. Now for this value of self-propulsion, the Green-Kubo relations hold for ‘any time-scale of the simulations’ so that we can perform a time average over the trajectories of the particles (which is an alias of the stationary probability measure under the values of self-propulsion used). If we change the amplitude of the self-propulsion, we need to again compute the ergodicity measure and show the stationarity of the probability measure. If the system is ergodic with respect to the new self-propulsion, we can again use Green-Kubo for the simulations. Note that we will definitely get a different value of viscosity under the new self-propulsion as the shear-stresses generated will be different but the Green-Kubo holds. If the system is not ergodic for the self-propulsion with the new amplitude, we cannot use Green-Kubo relations. Also, a priori, one cannot say what is a large/small amplitude of self-propulsion because it has to be compared with the intrinsic energy scale, which is encoded in the energy function, and which is difficult to determine without explicit calculations.

      This is what we meant when we said, ‘single value of viscosity is valid for all time-scales and amplitude’. It is valid for time-scales of the simulations for a given amplitude of self-propulsion only if the system is ergodic. Note that if the system is not ergodic, then the results of Ref. [14] (in the main text) could be questioned on theoretical grounds, because they were analyzed using the equilibrium rigidity percolation theory. Nevertheless, the authors of Ref. [14] showed that equilibrium phase transition theory works in tissues. For these reasons, we have been, just like the Reviewer, puzzled that equilibrium ideas appear to be valid in the cell system. Additional theoretical work has to be done to clarify these links in tissues. Although this is not the last word, we hope this clarifies our viewpoint.
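
      The Green-Kubo route described above amounts to integrating the shear-stress autocorrelation function over time. The following is a minimal sketch of that calculation and not the authors' code: the function names, the stress series, the time step, and the `prefactor` argument (which would hold the area and temperature factors appropriate to the model) are all assumptions made for illustration, and the stress is assumed to fluctuate about zero mean in a stationary state.

```python
def autocorrelation(series, max_lag):
    """Time-averaged autocorrelation <x(t) x(t + lag)> for lag = 0..max_lag."""
    n = len(series)
    return [
        sum(series[i] * series[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(max_lag + 1)
    ]


def green_kubo_viscosity(stress, dt, max_lag, prefactor):
    """Trapezoidal integral of the stress autocorrelation, scaled by prefactor.

    prefactor is a placeholder for the model-specific constants
    (for example an area divided by a thermal energy); it is an assumption here.
    """
    acf = autocorrelation(stress, max_lag)
    integral = sum(0.5 * (acf[k] + acf[k + 1]) * dt for k in range(max_lag))
    return prefactor * integral
```

      The time average in `autocorrelation` stands in for an ensemble average, which is only justified when the trajectory samples a stationary distribution; this is the role played by the ergodicity check in the response above.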

      For point D: I agree with the simplicity argument, although the added sentence from the discussion “Furthermore, the physics of the dynamics in glass forming materials does not change in systems with and without attractive forces” seems a bit strong given works like Lois et al., PRL, 2008 or Koeze et al, PRL, 2018 finding fundamentally different physics of jamming with or without adhesion.

      In the two cited papers the authors only consider equilibrium transitions in systems with attraction using computer simulations. Apparently, jamming properties depend on the strength of attraction. There are no attempts to characterize the dynamics, the focus of our work.

      What we meant is that any universal relations, such as the Vogel-Fulcher-Tammann relation, would still be valid. Of course, non-universal quantities such as glass transition temperature Tg or fragility will change. In our case, changing the adhesion strength would change ϕS, and the parameters in the VFT. However, our contention is that the overall finding that increase in viscosity followed by saturation is unlikely to change. We have added some clarifying statements in the manuscript to make this clear.
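
      For reference, the Vogel-Fulcher-Tammann relation in its standard temperature form is given below. The manuscript expresses the relation in terms of ϕS rather than temperature; that version is not reproduced here, and the symbols $\eta_0$, $D$ and $T_0$ are the generic textbook parameters rather than the authors' fitted values.

```latex
\eta(T) = \eta_0 \exp\!\left( \frac{D\, T_0}{T - T_0} \right), \qquad T > T_0
```

      Here $\eta_0$ is a high-temperature prefactor, $T_0$ is the temperature at which the fitted viscosity diverges, and $D$ controls the fragility. In this picture the functional form is the universal part, while $T_0$ and $D$ are the non-universal quantities that would shift with adhesion strength.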

    1. Author Response

      We would like to thank the reviewers for their encouraging comments and useful feedback, which will enable us to improve the manuscript. We would like to briefly comment on some of the points they raised.

      1. We agree this is a fairly specialized pipeline that has some requirements in terms of photographic setup. We are working hard to make these requirements as minimal as possible. However, given the huge variability in camera angles, backgrounds, arrangement of brain slices, etc., making the pipeline fully automated for unconstrained photos is extremely challenging.

      2. In principle, it should be possible to extend our method to sagittal slices of the cerebellum or axial slices of the brainstem, but this would require collecting and labeling additional training data and thus remains as future work.

      3. Producing accurate surfaces with sparse photographs is a very challenging problem and also remains as future work. We have a conference article producing surfaces on MRI scans with sparse slices (https://doi.org/10.1007/978-3-031-43993-3_4) but we haven’t gotten it to work well on photographs yet.

      4. Another challenging issue that remains as future work is getting the pipeline to work well with nonlinear deformations, e.g., slices of fresh tissue. While incorporating nonlinear deformation into the model is trivial from the coding perspective, we have not been able to make it work at the level of robustness that we achieve with affine transformations. This is because the nonlinear model introduces huge ambiguity in the space of solutions: for example, if one adds identical small nonlinear deformations to every slice, the objective function barely changes.

      5. As we acknowledge in the manuscript, the validation of the reconstruction error (in mm) with synthetic data is indeed optimistic, but informative in the sense that it reflects the trends of the error as a function of slice thickness and its variability (“jitter”).

      6. Since we use a single central coronal slice in the direct evaluation, SAMSEG yields very high Dice scores for large structures with strong contrast (e.g., the lateral ventricles). However, Photo-SynthSeg provides better average results across the board, particularly when considering 3D analysis out of the coronal plane (see qualitative results in Figure 2 and results on volume correlations).

    1. Author Response:

      We would like to thank the editor and the three reviewers for their time and effort taken in reviewing our manuscript and providing constructive feedback. Unfortunately, the first author of this manuscript is no longer involved in academia, and does not wish to further revise this manuscript. However, we agree with the entirety of the feedback and critiques provided by the referees, and feel these points should be taken into account when interpreting our results and conclusions.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This work challenges previously published results regarding the presence and abundance of 6mA in the Drosophila genome, as well as the claim that the TET or DMAD enzyme serves as the "eraser" of this DNA methylation mark and its roles in development. This information is needed to clarify these questions in the field. I am less familiar with the biochemical approaches in this work, so my comments are mainly on the genetic analyses. Generally speaking, the methods for fly husbandry and treatment seem to be in accordance with those established in the field.

      Response : We thank the reviewer for his/her work and positive assessment of our manuscript.

      Reviewer #2 (Public Review):

      DNA adenine methylation (6mA) is a rediscovered modification that has been described in a wide range of eukaryotes. However, 6mA presence in eukaryotes remains controversial due to the low abundance of this modification in eukaryotic genomes. In this manuscript, Boulet et al. re-investigate 6mA presence in drosophila using axenic or conventional flies to avoid contaminants from feeding bacteria. By using these flies, they find that 6mA is rare but present in the drosophila genome by performing LC/MS/MS. They also find that the loss of TET (also known as DMAD) does not impact 6mA levels in drosophila, contrary to previous studies. In addition, the authors find that TET is required for fly development in an enzymatic activity-independent manner.

      The strength of this study is, that compared to previous studies of 6mA in drosophila, the authors employed axenic or conventional fly for 6mA analysis. These fly strains make it possible to analyze 6mA presence in drosophila without bacterial contaminant. Therefore, showing data of 6mA abundance in drosophila by performing LC-MS/MS in this manuscript is more convincing as compared with previous studies. Intriguingly, the authors find that the conserved iron-binding motif required for the catalytic activity of TET is dispensable for its function. This finding could be important to reveal TET function in organisms whose genomic 5mC levels are very low.

      The manuscript in this paper is well written but some aspects of data analysis and discussion need to be clarified and extended.

      1. It is convincing that an increase in 6mA levels is not observed in TETnull presented in Fig1. But it seems 6mA levels are altered in Ax.TET1/2 compared with Ax.TETwt and Ax.TETnull presented in Fig1f (and also WT vs TET1/2 presented in Fig1g). Is it certain that no statistically significant difference was observed between Ax.TET1/2 and Ax.TETwt?

      2. The representative data of the in vitro demethylation assay presented in Fig.3 are convincing, but it is not well discussed and analyzed why these results are contrary to previous reports (Yao et al., 2018 and Zhang et al., 2015).

      We thank the reviewer for his/her work and positive assessment of our manuscript.

      (1) We repeated our statistical analyses and confirmed that there is no significant difference between wildtype and tet1/2 mutant embryos in axenic conditions (Welch two sample t-test : p=0.075).
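
      For readers unfamiliar with the test named above, the Welch statistic and its approximate degrees of freedom can be sketched as follows. This is a minimal illustration using only the Python standard library: the function name and the sample values are hypothetical and are not the study's measurements. Converting the statistic to a p-value additionally requires the Student t distribution, which a statistics package such as SciPy provides.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of each sample mean
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t_stat, dof


# Hypothetical measurements, invented purely for illustration
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_a, group_b)
```

      Unlike the classical two-sample t-test, this form does not assume that the two groups share a variance, which is why the degrees of freedom are generally non-integer.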

      (2) We added some elements in the revised manuscript to discuss the possible reasons for the discrepancies with previous reports. Notably both studies performed the in vitro demethylation assays over a much longer time course and with different sources of recombinant proteins. Zhang et al. purified TET catalytic domain from human cells (HEK293T) and observed around 2.5% of 6mA demethylation at 30 min and less than 25% after 10 hours of incubation as measured by HPLC-MS/MS analyses. Yao et al. incubated recombinant TET catalytic domain with 6mA DNA for 3h and observed a 25% decrease in 6mA levels as measured by dot blot. These results suggest that drosophila TET may oxidize 6mA, but with a much lower affinity than 5mC since we observed a near complete oxidation of 5mC after 1 minute and no decrease in 6mA levels after 30 minutes of reaction (for identical concentrations of substrate and enzyme). It is possible too that the preparation of TET catalytic domain in different systems changes its enzymatic activity, potentially in relation with distinct post-translational modifications. Still, as already mentioned in our manuscript, extensive biochemical analyses of the distant TET homolog from the fungus Coprinopsis cinerea (Mu et al., Nature Chem Biol 2022) strongly argue that TET enzymes do not harbor the residues required to serve as 6mA demethylase.

      Reviewer #1 (Recommendations For The Authors):

      Here is one comment (#1) and a couple of questions (#2-3) that could be addressed in the future, in order to understand the roles of 6mA and TET. Even though #2 and #3 are likely beyond the scope of this paper, #1 should be addressed within the scope of this work and compared with previous reports.

      1. The phenotypic analyses in Fig. 4 should use tet_null/Deficiency and tet_CD/Deficiency for their potential phenotypes. This needs to be addressed since both the tet_null and the tet_CD were generated using the same starting fly line (GFP knock-in). Using a deficiency chromosome and testing these alleles in hemizygotes would be helpful to eliminate any secondary effects due to genetic background issues.

      Thanks for this comment. Actually, tet_null and tet_CD were not generated using the same starting lines. Whereas tet_CD was generated (by CRISPR) using the tet-GFP knock-in line, tet_null was generated by FRT site recombination between two PBac insertions (Delatte et al. 2016). As for tet1 and tet2 (used in allelic combination in Fig 4 J-L), they correspond to two distinct mutant alleles generated by CRISPR (Zhang et al. 2015). We have clarified this in the M&M (page 9).

      1. Regarding the estimated "200 to 400 methylated adenines per haplogenome", is there any insight into where are they located in the genome?

      It is an interesting question and we initially used SMRT-seq sequencing to obtain this kind of information. As it turned out that this technique gives a high level of false positives, we should consider with caution the interpretation of these data and we decided not to include them in the manuscript. Still, we characterized the genomic features of the 6mA detected using stringent criteria (mQV>100, cov>25x in the fusion dataset and triplicated across samples of the same genotype). Both in wild type and tet_null, 6mA were dispersed along each chromosome although few of them were found on chromosome X. In both cases there appeared to be a higher accumulation of 6mAs on the histone locus and the transposon-rich tip of chromosome X, but 6mA density remained below 1.3/kb in other genomic regions. Comparisons with annotated genomic regions indicated that 6mA were enriched in long interspersed nuclear elements (LINEs) and satellite repeats, and depleted in 3’UTR and exons, but there was no significant difference in their repartition between the two genetic contexts. Besides, motif analyses showed similar enrichments in both conditions, with GAG triplet accounting for more than one quarter of all the sites. Whether this reflects the specificity of a putative adenine methylase or a technical bias associated with the SMRT-seq technology remains to be established.

      1. The TET-GFP and TET-CD-GFP knock-in lines give proper nuclear localization and could be used to identify genomic regions bound with full-length TET and TET-CD using anti-GFP for ChIP-seq or CUT&RUN (or CUT&TAG).

      Indeed, this is a line of research that we are following up and will be part of another study. Actually, our ChIP-seq experiments indicate that they bind on the same genomic regions.

      Reviewer #2 (Recommendations For The Authors):

      • I think the major findings of this paper are showing 6mA present in drosophila by using axenic or conventional breeding conditions and finding that TET function independently of its catalytic activity is essential for fly development. The authors could have been more precise in title and abstract to emphasize these findings.

      We have now modified the abstract to try to emphasize these findings.

      • The authors claim that any increase of 6mA levels was not observed in both TETnull and TET1/2, but it is not sufficiently convincing. Because it seems 6mA levels were increased in Ax. tet1/2 embryo as compared with in Ax.wt embryo (Fig.1). In this scenario, 6mA abundance in both TETnull and TET1/2 mutant are supposed to be the same. It would be better to re-analyze data carefully and discuss if 6mA levels were significantly increased in TET1/2, and why 6mA levels are different between TETnull and TET1/2. Additionally, the authors describe that the TET null mutant is pupal lethal, while the TET1/2 survivor is available. The text suggests that TET1/2 could have partial functionality on fly development (Fig.4). It would be better to check whether the N-terminus of TET is expressed in the TET1/2 mutant.

      Indeed, the increase in 6mA levels in Ax. tet1/2 embryo seems substantial (although it is not statistically significant) and no increase was observed in Ax tet_null embryos. Thus, the putative effect on 6mA levels in tet1/2 embryos may not be directly due to the absence of TET function. We now mention in the revised manuscript (page 6) that “the apparent increase in 6mA levels in tet1/2 axenic embryos was not reproduced in tet_null embryos, suggesting that it does not simply reflect the tet loss of function, and that it was not statistically significant”. Besides, we do not have an antibody to check whether the N-terminus of TET is expressed in the tet1/2 mutants, but the western blot published by Zhang et al 2015 shows that tet2 mutation leads to the expression of TET N-terminal domain. This N-terminal domain could have partial TET functionality and/or interfere with the function of other factors (notably those implicated in 6mA metabolism).

      • The authors show that SMRT-seq data did not reveal an increase in 6mA levels in loss of TET (Fig.2). It is convincing that total 6mA abundance was not altered by loss of TET. But were 6mA-accumulated locus/regions observed in WT not altered by loss of TET?

      Please refer to our answer to reviewer 1 on that point.

      • It remains unclear why the TET proteins the authors prepared do not exhibit 6mA demethylase activity in vitro, contrary to what was reported in previous papers (Fig.3). I think the preparation of recombinant proteins may make different results between this and previous papers. Yao et al., 2018 and Zhang et al., 2015 used recombinant proteins purified from Human cells or insect cells, while the author purified them from E. coli. Additionally, it's mentioned that VK Rao et al., 2020 demonstrated cdk5-mediated phosphorylation of Tet3 increases its catalytic activity in vitro. These previous reports suggest modification of TET could change demethylase activity. More analysis and discussion are needed to support the conclusion.

      Thanks for your insights. This is an important point and we added the following elements in the revised manuscript to discuss possible reasons for the discrepancies with previous reports (pages 7-8): “Our results contrast with previous reports showing that recombinant drosophila TET demethylates 6mA on dsDNA in vitro (Yao et al. 2018; Zhang et al., 2015a). However, both studies ran much longer reactions (up to 10 hours) and used different sources of recombinant protein (drosophila TET catalytic domain purified from human HEK293T cells). Notably, Zhang et al. (2015a) only found around 2.5% of 6mA demethylation at 30 min and less than 25% after 10 hours of incubation as measured by HPLC-MS/MS analyses. These results suggest that drosophila TET may oxidize 6mA, but with a much lower affinity than 5mC since we observed a near complete oxidation of 5mC after 1 min. and no significant decrease in 6mA levels after 30 min. of reaction (for identical concentrations of substrate and enzyme). It is possible too that the preparation of TET catalytic domain in different systems changes its enzymatic activity, potentially in relation to distinct post-translational modifications.”

    1. Author Response

      1. Reviewer 1 raised the concern that the images shown in the figures seem inconsistent with the quantitative data.

      Our provisional response: The quantitative data are based on many samples and the photographs are just supposed to show illustrations of example data. Because of the volume containing P1a cells, it is impossible to present a single confocal image that covers all P1a neurons and would therefore correspond more closely to the quantitative data. We chose to illustrate the quantitative data using single confocal images which contain both Hr38+/GFP+ and Hr38-/GFP+ neurons, to demonstrate that we can distinguish clearly which P1a neurons are positive or negative for Hr38 expression. This can be clarified in the figure legends. If it is imperative to show image(s) to reflect the statistics, we can do that but will need to present multiple confocal images for each condition, which could be messy and confusing.

      1. Reviewer 2 states: "the major weakness is the calibration of the temporal resolution of HI-CatFISH in Figure 4 and Figure Supplement 4. According to Figure Supplement 4C, close to 100% of the Hr38-positive cells are already labeled with the exonic probe 30min post-stimulation, which is not reflected in Figure 4B (there, the expression level of the exonic probe peaks 60min post-induction)”.

      The confusion may arise because we drew the illustration diagram (Fig. 4B) based on the quantitative data in Fig. S4B, which plots the intensity of Hr38 exonic ISH signals, while the reviewer may be comparing the illustration to the time course based on Fig. S4C, which shows the % positive cells, a binary measure. In the illustration (Fig. 4B), we wrote ‘Hr38 expression level’, not ‘% Hr38 positive cells’. We can clarify this in the figure legend. If the reviewers prefer, we can add a threshold line in the diagram corresponding to the % positive cells at maximum.

    1. Author Response

      eLife assessment

      This study presents valuable insights into the epigenetic landscape in adult kidney podocytes. A series of solid experiments demonstrate that genes that are regulated by a key kidney transcription factor, Mafb, are essential for H3K4me3 methylation and recruitment of Wt1 to Nphs1 and Nphs2. This new information provides insights into the potential relationship and coordination of transcription factors in regulating target genes in podocytes in glomerular diseases, although the conclusion that MafB is generally required for Wt1 to bind to podocyte-specific promoters is incomplete and should be extended beyond two or three genes.

      We thank the reviewers and editors for critically reading our manuscript and their insightful comments. We will strive to revise

      Reviewer #1 (Public Review):

      Summary:

      In their manuscript, Massa and colleagues provide a map of the epigenetic landscape in podocytes and analyze the role of the transcription factor MafB in podocyte gene expression. They initially map the histone profile in adult podocytes of the mouse by assaying three different histone methylation marks, namely H3K4me3, H3K4me1, and H3K27me3 for active, primed, and repressed states. They then perform Wt1- and MafB-ChIP-Seq analysis to identify respective direct targets of those transcription factors. Subsequently, they employ an inducible MafB knockout model and show that homozygous knockout mice show proteinuria and FSGS, suggesting an important role for MafB in podocyte homeostasis. RNA-Seq analysis in mice two days after tamoxifen application identified direct and indirect MafB target genes. Finally, the authors turn to a constitutive MafB knockout model, carry out anti-H3K4me3 and anti-Wt1 ChIP experiments, and examine selected promoters. One main conclusion from this work is that MafB opens chromatin and thus facilitates the binding of other transcription factors like Wt1 to podocyte-specific genes.

      Strengths and weaknesses:

      The authors have performed an impressive number of experiments and generated very valuable data. They use state-of-the-art technology and the data are presented well and are sound. This being said, the manuscript contains significant novel data, but also experiments that are already available in some form. The histone profile in adult mouse podocytes is novel and provides an interesting map of epigenetic marks in this particular cell type. It is maybe not too surprising that podocyte-differentiation genes have different chromatin accessibility than genes associated with general development. The Wt1-ChIP has been done before by several labs but is certainly an important control in this work. The MafB-ChIP is new. The inducible MafB knockout model including the identification of Tcf21 as a target gene has been published by others in 2020 (and is acknowledged by the authors). The experiments addressing the potential role of MafB in chromatin opening are new. I find that the data are certainly compatible with the model put forward by the authors, but they are not compelling.

      We agree that additional data on changes in chromatin accessibility in the absence of Mafb would help to support our model and we will be working towards this data for a revised version of the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The authors investigate the role of MafB in regulating podocyte genes. Mafb is required for podocyte differentiation and maintenance. Mutations of this gene cause FSGS in mice and humans. They profiled MafB binding genome-wide in isolated glomeruli and defined overlap with Wt1. They provide evidence that Mafb is required for Wt1 binding and H3K4me3 methylation at the promoters of two essential podocyte genes, Nphs1 and Nphs2. Understanding how the action of different transcription factors is coordinated to control gene expression - the main goal of this paper - is an important line of investigation.

      While the main conclusion of the paper is supported by their data, the scope is limited. Additional ChIP-seq experiments and data analysis are needed to solidify and extend their conclusions.

      Strengths:

      1) Performing ChIP-seq for histone modifications on isolated podocytes provides valuable cell-type-specific information. Similarly, profiling Mafb and Wt1 in isolated glomeruli provides podocyte-specific binding patterns because these transcription factors (TFs) are not expressed in other cell types in glomeruli. The significant overlap of their Wt1 binding genome-wide with that of prior published work is reassuring. RNA-seq on isolated podocytes provides the appropriate cell-type specific gene expression data to integrate with ChIP-seq data. Together, the RNA-seq and ChIP-seq data are valuable resources for other investigators examining gene regulation in mouse podocytes.

      2) The phenotype analysis of their FSGS model is convincing and well done.

      3) Testing how Wt1 binding is affected by loss of Mafb provides insight into how these key podocyte TFs may cooperate to regulate genes.

      Weaknesses:

      1) The conclusion that Mafb is required for Wt1 binding and H3K4me3 methylation is based solely on ChIP-PCR at two gene promoters (Nphs1, Nphs2). This result should be validated and extended by ChIP-seq. Mafb and Wt1 binding overlap at more than 200 sites. If their model is correct, it is likely that Wt1 binding would be affected at other genomic sites. This result would add strong support to their model of how Wt1 and Mafb cooperate to regulate genes in podocytes. Moreover, ChIP-seq would define whether the dependence of Wt1 on Mafb is also evident at distal regulatory regions (defined by H3K4me1, which is typically found at predicted enhancers).

      We agree that a genome-wide analysis of chromatin accessibility would help corroborate our model and will work towards this data for a revised version.

      2) The FSGS model generated by the authors involved conditional deletion of Mafb in podocytes at 8 weeks of age. They found that this resulted in reduced expression of Nphs1 and Nphs2 within 48 hours post-deletion. However, they investigated Wt1 binding and H3K4me3 genomic binding in Mafb homozygous null embryos. While this result provides information about podocyte differentiation, it does not address the maintenance of expression of these essential podocyte genes in the adult kidney. Because post-natal deletion of Mafb led to FSGS and reduced expression of Nphs1/2, ChIP-seq should be performed on the adult conditional mutants in order to provide mechanistic information about the disease.

      The fact that the phenotype in Mafb conditional mutant animals is progressive means that epigenetic changes are also likely to be quantitative. Indeed, Nphs1/Nphs2 are still expressed 6 weeks after Mafb deletion, albeit at lower levels. Since ChIP-seq experiments are not necessarily quantitative, we believe it may be difficult to detect statistically significant changes in this model. We will discuss this limitation of our study in a revised version of our manuscript.

      3) H3K4me1 binds enhancer regions. The authors performed ChIP-seq to profile H3K4me1 in isolated podocytes. However, there was no analysis reported of these results. It would be valuable to determine if Wt1 and Mafb co-localize at predicted enhancers in podocytes and if Wt1 binding is lost at these regions in Mafb mutant glomeruli.

      We will reanalyse the data taking the reviewer’s comments into account.

    1. Author Response

      The following is the authors’ response to the current reviews.

      For the final Version of Record the following changes will be included: 1. Figure 4: Example traces replaced with a more representative simulation run that is more similar to the mean. 2. Methods: Description of the alignment procedure expanded to explain the algorithm steps better.


      The following is the authors’ response to the previous reviews.

      We are grateful for the positive and insightful feedback from the editors and reviewers. These constructive comments have contributed to the enhancement of our work. We have revised the manuscript, addressing each of the comments raised. In addition, based on the commentary provided, we have introduced two new figures that offer a deeper understanding of our research findings:

      In new Figure 7, we present the analysis of the difference in onset times between motion and flash responses. This figure also includes a simple illustration elucidating the origins of these differences, highlighting the varying engagement of receptive fields by these stimuli. The data presented in this figure were initially featured in the main text of the original manuscript. Figure 11 offers a detailed comparison of the temporal and spatial characteristics of the synthetic presynaptic signals driving optimal DS in SACs. We compare these characteristics with the properties extracted from recorded glutamate release. Our analysis suggests that the sluggish dynamics observed in biological signals impede effective directional integration. Below are the detailed point-by-point responses to the reviewers' comments.

      Reviewer #1 (Public Review):

      Summary:

      Direction selectivity (DS) in the visual system is first observed in the radiating dendrites of starburst amacrine cells (SACs). Studies over the last two decades have aimed to understand the mechanisms that underlie these unique properties. Most recently, a 'space-time' model has garnered special attention. This model is based on two fundamental features of the circuit. First, distinct anatomical types of bipolar cells (BCs) are connected to proximal/distal regions of each of the SAC dendritic sectors (Kim et al., 2014). Second, that input across the length of the starburst is kinetically diverse, a hypothesis that has been only recently demonstrated experimentally using iGluSnFR imaging (Srivastava et al., 2022). However, the stark kinetic distinctions, i.e., the sustained/transient nature of BC input to SACs dendrites appear to be present mainly in responses to stationary stimuli. When BC receptive field properties are probed using white noise stimuli, the kinetic differences between BCs are relatively subtle or nonexistent (Gaynes et al., 2022; Strauss et al., 2022, Srivastava et al., 2022). Thus, if and how BCs contribute to direction selectivity driven by moving spots that are commonly used to probe the circuit remains to be clarified. To address this issue, Gaynes et al., combine evolutionary computational modeling (Ankri et al., 2020) with two-photon iGluSnFR imaging to address to what degree BCs contribute to the generation of direction selectivity in the starburst dendrites in response to stimuli that are commonly used experimentally.

      Strengths:

      Combining theoretical models and iGluSnFR imaging is a powerful approach as it first provides a basic intuition on what is required for the generation of robust DS, and then tests the extent to which the experimentally measured BC output meets these requirements.

      The conclusion of this study builds on the previous literature and comprehensively considers the diverse BC receptive field properties that may contribute to DS (e.g. size, lag, rise time, decay time).

      By 'evolving' bipolar inputs to produce robust DS in a model network, these authors provide a sound framework for understanding which kinetic properties could potentially be important for driving downstream DS. They suggest that response delay/decay kinetics, rather than the center/surround dynamics are likely to be most relevant (albeit the latter could generate asymmetric responses to radiating/looming stimuli).

      Weaknesses:

      Finally, these authors report that the experimentally measured BC responses are far from optimal for generating DS. Thus, the BC-based DS mechanism does not appear to explain the robust DS observed experimentally (even with mutual inhibition blocked). Nevertheless, I feel the comprehensive description of BC kinetics and the solid assessment of the extent to which they may shape DS in SAC dendrites, is a significant advancement in the field.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors sought to understand how the receptive fields of bipolar cells contribute to direction selectivity in starburst amacrine cell (SAC) dendrites, their postsynaptic partners. In previous literature, this contribution is primarily conceptualized as the 'space-time wiring model', whereby bipolar cells with slow-release kinetics synapse onto proximal dendrites while bipolar cells with faster kinetics synapse more distally, leading to maximal summation of the slow proximal and fast distal depolarizations in response to motion away from the soma. The space-time wiring contribution to SAC direction selectivity has been extensively tested in previous literature using connectomic, functional, and modeling approaches. However, the authors argue that previous functional studies of bipolar cell kinetics have focused on static stimuli, which may not accurately represent the spatiotemporal properties of the bipolar cell receptive field in response to movement. Moreover, this group and others have recently shown that bipolar cell signal processing can change directionally when visual stimuli start within the receptive field rather than passing through it, complicating the interpretation of moving stimuli that start within a bipolar cell of interest's receptive field (e.g. stimulating only one branch of a SAC or expanding/contracting rings). Thus, the authors choose to focus on modeling and functionally mapping bipolar cell kinetics in response to moving stimuli across the entire SAC dendritic field.
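
      The summation argument behind the space-time wiring model can be illustrated with a toy calculation. The sketch below is not the authors' NEURON model: it assumes purely linear summation of two inputs with alpha-function kinetics, and every name and parameter value (`alpha`, `peak_sum`, the time constants, distance, and speed) is hypothetical and chosen only so that the slow proximal and fast distal peaks coincide for outward motion.

```python
import math

def alpha(t, tau):
    """Alpha-function response: zero before onset, peaks at 1.0 when t == tau."""
    if t <= 0.0:
        return 0.0
    return (t / tau) * math.exp(1.0 - t / tau)

def peak_sum(tau_prox, tau_dist, distance_um, speed_um_s, outward,
             dt=0.001, t_max=2.0):
    """Peak of the linearly summed proximal + distal inputs for one motion direction.

    Outward motion reaches the proximal input first; inward motion reaches
    the distal input first. The second input is delayed by distance / speed.
    """
    delay = distance_um / speed_um_s
    onset_prox, onset_dist = (0.0, delay) if outward else (delay, 0.0)
    peak = 0.0
    for i in range(int(t_max / dt) + 1):
        t = i * dt
        total = alpha(t - onset_prox, tau_prox) + alpha(t - onset_dist, tau_dist)
        peak = max(peak, total)
    return peak

def dsi(pref, null):
    """Direction selectivity index in one common form: (pref - null) / (pref + null)."""
    return (pref - null) / (pref + null)

# Hypothetical parameters: 100 um separation, 500 um/s stimulus (0.2 s delay).
# Slow proximal (0.25 s to peak) plus fast distal (0.05 s to peak) align outward.
out_peak = peak_sum(0.25, 0.05, 100.0, 500.0, outward=True)
in_peak = peak_sum(0.25, 0.05, 100.0, 500.0, outward=False)
dsi_space_time = dsi(out_peak, in_peak)

# Control: identical kinetics at both sites give no direction preference here.
same_out = peak_sum(0.05, 0.05, 100.0, 500.0, outward=True)
same_in = peak_sum(0.05, 0.05, 100.0, 500.0, outward=False)
dsi_identical = dsi(same_out, same_in)
```

      In this linear toy, identical kinetics yield no direction preference at all, whereas a morphologically detailed model can still show some cell-intrinsic DS, so the sketch isolates only the timing component of the mechanism.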

      General Comments

      There have been several studies that have addressed the contribution of space-time wiring to SAC process direction selectivity. The impact of this project is to show that this contribution is limited. First, the optimal solution obtained by the evolutionary algorithm to generate DS processes is slow proximal and fast distal inputs - exactly what is predicted by space-time wiring, which is exactly what is required of the HRC model. Hence, this result seems expected and it's not clear what the alternative hypothesis is. Second, the experimental results based on glutamate imaging to assess the kinetics of glutamate release under conditions of visual stimulation across a large region of retina confirm previous observations but were important to test. Third, by combining their model with these experimental data, they conclude that even the optimal space-time wiring is not sufficient to explain the SAC process DS. The results of this approach might be more impactful if the authors came to some conclusion as to what factors do determine the direction selectivity of the SAC process since they have argued that all the current models are not sufficient.

      Reviewer #3 (Public Review):

      Gaynes et al. investigated the presynaptic and postsynaptic mechanisms of starburst amacrine cell (SAC) direction selectivity in the mouse retina by computational modeling and glutamate sensitivity (iGluSnFR) imaging methods. Using the SAC computational simulation, the authors initially tested bipolar cell contributions (space-time wiring model, presynaptic effect) and SAC axial resistance contributions (postsynaptic effect) to the SAC DS. Then, the authors conducted two-photon iGluSnFR imaging from SACs to examine the presynaptic glutamate release, and found seven clusters of ON-responding and six clusters of OFF-responding bipolar cells. They were categorized based on their response kinetics: delay, onset phase, decay time, and others. Finally, the authors generated a model consisting of multiple clusters of bipolar cells on proximal and distal SAC dendrites. When the SAC DS was measured using this model, they found that the space-time wiring model accounted for only a fraction of SAC DS.

      The article has many interesting findings, and the data presentation is superb. Strengths and weaknesses are summarized below.

      Major Strengths:

      • The authors utilized solid technology to conduct computational modeling with Neuron software and a machine-learning approach based on evolutionary algorithms. Results are effectively and thoroughly presented.

      • The space-time wiring model was evaluated by changing bipolar cell response properties in the proximal and distal SAC dendrites. Many response parameters in bipolar cells are compared, and DSI was compared in Figure 3.

      • Two-photon microscopy was used to measure the bipolar cell glutamate outputs onto SACs by conducting iGluSnFR imaging. All the data sets, including images and transients, are elegantly presented. The authors analyzed the response based on various parameters, which generated more than several response clusters. The clustering is convincing.

      Major Weaknesses:

      • In Figure 9, the authors generated the bipolar cell cluster alignment based on the space-time wiring model. The space-time wiring model has been proposed based on the EM study that distinct types of bipolar cells synapse on distinct parts of SAC dendrites (Green et al 2016, Kim et al 2014). While this is one of the representative Reichardt models, it is not fully agreed upon in the field (see Stincic et al 2016). While the authors' approach of testing the space-time wiring model and conclusions is interesting and appreciated, the authors could address more issues: mainly two clusters were used to generate the model, but more clusters should be applied. Although the location of each cluster on the SAC dendrites is unknown, the authors should know the populations of clusters by iGluSnFR experiments. Furthermore, the authors could provide more suggestive mechanisms after declining postsynaptic factors and the space-time wiring model.

      The reviewer is correct that the proximal and more distal SAC dendrites sample from different IPL depths. It should be theoretically possible to match the functional clusters we measured with anatomical bipolar cell identities. However, the stratifications of these cells have significant overlaps (Figure 6-S2), and previous attempts to match iGluSnFR signals to anatomy proved to be challenging (Franke et al., 2017; Gaynes et al., 2022; Matsumoto et al., 2019; Srivastava et al., 2022; Strauss et al., 2022). In the revised version of the manuscript, we reorder the functional clusters based on their transiency, which has a higher correlation to stratification depth (Franke et al., 2017).

      We have examined a scenario in which the presynaptic population comprises more than two clusters. We constructed synthetic models whose input structure was as in Figure 10 (old Figure 9). The optimal configuration for the most proximal and distal inputs closely resembled the proximal-distal model reported in Figure 2. However, we observed a nearly linear variation in the shape of the optimal mid-range inputs, transitioning from proximal-like to distal-like responses as the distance increased. We consider this outcome to be expected based on the structure of the space-time wiring model (Kim et al., 2014). Interestingly, this was not the case with models incorporating physiologically recorded signals. As we show in Figure 10, the most common optimal directional tuning was seen when the bipolar drive consisted of two main populations, both in the ON and OFF SACs.

      Finally, we believe that uncovering additional mechanisms that underlie directional selectivity in SACs represents a crucial challenge for the field to tackle. It is highly probable that achieving directional selectivity involves a complex interplay of multiple factors. This includes the organization of the presynaptic circuit, which we have partially addressed in this study, as well as the influence of postsynaptic active conductances and feedback loops involving other SACs and presynaptic cells. We have expanded the discussion section to describe the possible mechanisms.

      • The computational modeling demonstrates intriguing results: SAC dendritic morphology produces dendritic isolation, and a massive input overcomes the dendritic isolation (Figure 1). This modeling seems to be generated by basic dendritic cable properties. However, it has been reported that SAC dendrites express Kv3 and voltage-gated Ca channels. It seems to be that these channels are not incorporated in this model.

      The reviewer's observation is accurate; the model depicted in Figure 1 did not include voltage-gated channels. Our goal was to study electrotonic isolation, which is often measured in passive models. However, while we did not incorporate voltage-gated potassium channels explicitly in the models, our simulations are rooted in previous models that were fine-tuned using empirical data. As potassium channels are expected to influence the experimentally recorded input resistance, we have indirectly accounted for their impact on the interdendritic signal propagation.

      In subsequent model iterations, we have integrated voltage-gated calcium channels into our simulations to assess the signal responsible for driving synaptic release. We show that nonlinear voltage dependence of the calcium currents enhances compartmentalization of the local calcium levels (Figure 2), but did not significantly influence local voltages. Therefore, calcium channels do not appear to have a major impact on electrotonic distances.

      • In Figure 5B, representative traces are shown responding to moving bars in horizontal directions. These did not show different responses to two directional stimuli. It is unclear whether directional preference was not detected, which was shown by Yonehara's group recently (Matsumoto et al 2021). Or that was not investigated as described in the Discussion.

      Indeed, we observed no discernible directional differences in bipolar responses. This phenomenon can be primarily attributed to the fact that the signals originating from the limited number of directionally-tuned release sites are overshadowed by the release from non-directionally-tuned units (Matsumoto et al., 2021). In the revised discussion, we have acknowledged this limitation in our recorded data.

      • The authors found seven ON clusters and six OFF clusters, which are supposed to be bipolar cell terminals. However, bipolar cells reported to provide synaptic inputs are T-7, T-6, and multiple T-5s for ON SACs and T-1, T-2, and T-3s for OFF SACs. The number of types is less than the number of clusters. Potentially, clusters might belong to glutamatergic amacrine cells. These points are not fully discussed.

      We have expanded the discussion section to address these points.

      Reviewer #1 (Recommendations For The Authors):

      Major comments

      1. One of the main conclusions of this study is that diverse BC kinetics contribute to DS (Fig. 9). The authors nicely demonstrate using modeling that the experimentally measured BC kinetics are far from ideal. However, this conclusion is based on a model that almost exclusively relies on just two of the 7 putative BC types (e.g., C1 & C6 for On SACs) placed optimally along the dendrites, which raises two important caveats.

      First, given that other BC types are likely to contribute, the effects of two distinct types are likely to be diluted. Thus, the contribution of BCs to DS is likely to be significantly overestimated. Second, given that the dendrites of 10-30 SACs cross each point in the honeycomb, for the given model to work, each BC would need to connect extremely selectively to SACs. i.e., at a given point, a sustained input must only connect to the more proximal dendritic segments, while avoiding entirely the distal segments of overlapping SAC dendrites. Thus, their model requires extremely selective wiring for which there is no evidence. In fact, there is evidence to the contrary provided by Ding et al. 2016, which showed that the type 7 (proximally biased) and type 5 (distally biased) populations had a substantial overlap (assuming these BC types correspond to kinetically diverse clusters).

      We wholeheartedly concur with the reviewer's perspective that our findings have led to an overestimation of the space-time wiring mechanism's role in SAC directional selectivity (DS). We have adjusted our discussion to emphasize this point. In light of this, our assertion that, even with the most favorable distribution of synaptic inputs, the space-time wiring model still does not fully account for the experimentally-determined directional tuning in SAC, remains valid.

      With regard to the model, it would also be worth comparing results to previous starburst models (e.g., Tukker et al,. 2004), which demonstrated a robust DS in SAC dendrites in the absence of kinetically diverse BC input. Why is the cell-intrinsic DS so weak in the present model?

      We have directly explored this question in the synthetic model (Figures 2, 3). Despite variances in the anatomy of SACs and the distribution of bipolar inputs between our model and the study by (Tukker et al., 2004), we observed remarkably similar levels of directional selectivity index computed from the voltage response (approximately 10%, as shown in Figure 3, 'Identical BCs').

      The primary distinction emerged in the degree of DS amplification mediated by calcium currents. Tukker et al., 2004 reported considerably higher DS compared to our findings, despite employing similar formulations for voltage-gated calcium channel models. The key factor driving this difference lies in the fact that Tukker et al., 2004 measured amplification in proximity to the threshold of calcium channel activation. Even minor variations in membrane potentials near this threshold can lead to substantial differences in calcium influx, especially when outward stimulation results in a calcium spike. In fact, recently, Robert Smith’s group revisited the threshold-based mechanism and concluded that it often fails to produce robust DS due to the heterogeneity of membrane potentials among different terminal dendrites (Wu et al., 2023).

      Our models were trained on five different stimulus velocities whose synaptic integration produced substantially different peak amplitudes. Consequently, the spike threshold alone couldn't reliably distinguish between inward and outward directions across all five conditions, resulting in reduced directional performance in our simulations. In the revised Figure 2-S2 we directly explore the performance of the model with identical BC formulations, trained on a single velocity. We find a dramatic enhancement of calcium DS (DSI=66%) in this condition compared to an identical model trained on 5 velocities (DSI=17%). Thus, evolutionary search is capable of finding the threshold-based solution, but only when the training is performed on a single stimulus velocity (Figure 2-S2). This solution did not generalize to multiple stimulus speeds because, as mentioned above, they lead to different postsynaptic depolarization levels (Figure 2, 2-S1). Instead, the algorithm converged on a set of postsynaptic parameters leading to less nonlinear calcium channel activation over a broader voltage range, ensuring effective DS performance over multiple velocities and heterogeneous local potentials (Wu et al., 2023).
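
      The trade-off described above, between a steep threshold that amplifies DS at one depolarization level and a shallower activation that works across several, can be sketched with a toy calculation. This is an illustration under stated assumptions, not the channel model used in the study: the sigmoidal activation, the resting calcium floor, the half-activation voltage, the slopes, and the five depolarization levels standing in for five stimulus velocities are all hypothetical. The DSI is computed in one common form, (pref - null) / (pref + null).

```python
import math

def dsi(pref, null):
    """Direction selectivity index: (pref - null) / (pref + null)."""
    return (pref - null) / (pref + null)

def calcium(v, v_half, slope, rest=0.05):
    """Hypothetical calcium signal: resting floor plus sigmoidal activation (a.u.)."""
    return rest + 1.0 / (1.0 + math.exp(-(v - v_half) / slope))

def calcium_dsi(base_mv, v_half, slope):
    """Calcium DSI when outward/inward peaks are +/-10% around base_mv.

    The +/-10% split gives a voltage DSI of exactly 0.1 by construction.
    """
    v_out, v_in = 1.1 * base_mv, 0.9 * base_mv
    return dsi(calcium(v_out, v_half, slope), calcium(v_in, v_half, slope))

# Hypothetical mean depolarizations (mV above rest) for five stimulus velocities.
levels = [10.0, 15.0, 20.0, 25.0, 30.0]

# Steep activation centred on one level: strong DS there, little elsewhere.
steep = [calcium_dsi(b, v_half=20.0, slope=0.5) for b in levels]

# Shallow activation: weaker amplification, but preserved across all levels.
shallow = [calcium_dsi(b, v_half=20.0, slope=5.0) for b in levels]

for b, s, h in zip(levels, steep, shallow):
    print(f"{b:5.1f} mV  steep DSI={s:.3f}  shallow DSI={h:.3f}")
```

      In this toy, the steep setting gives a large DSI only at the level its threshold is tuned to, while the shallow setting keeps a modest DSI at every level, which mirrors the qualitative argument for why training on multiple velocities favors less nonlinear activation.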

      1. Functionally distinct responses across different regions of interest (ROIs) were used to classify BC input. ROIs were obtained from multiple scan fields and retinas and combined into a single dataset for functional clustering. However, the consistency of the cluster distribution across these replicates has not been addressed. As BCs can exhibit different functional properties dependent on the state/health of the retina, it is important to know whether certain functional clusters may originate disproportionately from a particular experiment, as it implies that each cluster does not represent a different stable functional/anatomical population.

      We acknowledge that the state of the preparation can significantly impact signal dynamics. In response to this important consideration, we have incorporated details about the distribution of functional clusters in various experiments in the revised version of the manuscript (Figure 6-S1, and discussion).
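
      One simple way to examine the concern raised here is to tally, for each functional cluster, the largest share of its ROIs contributed by any single experiment. The sketch below is only an illustration of that bookkeeping and is not taken from the manuscript; the function name `max_source_fraction` and the ROI labels are placeholders invented for the example, not recorded data.

```python
from collections import Counter, defaultdict

def max_source_fraction(roi_labels):
    """Largest fraction of each cluster's ROIs that comes from one experiment.

    roi_labels: iterable of (experiment_id, cluster_id) pairs, one per ROI.
    Values near 1.0 flag clusters dominated by a single preparation.
    """
    per_cluster = defaultdict(Counter)
    for experiment_id, cluster_id in roi_labels:
        per_cluster[cluster_id][experiment_id] += 1
    return {
        cluster: max(counts.values()) / sum(counts.values())
        for cluster, counts in per_cluster.items()
    }

# Placeholder labels for illustration only (not real data).
toy_labels = [
    ("exp1", "C1"), ("exp2", "C1"), ("exp3", "C1"),
    ("exp1", "C2"), ("exp1", "C2"), ("exp1", "C2"), ("exp2", "C2"),
]
fractions = max_source_fraction(toy_labels)
```

      With these placeholder labels, C1 is evenly spread over three experiments while C2 draws most of its ROIs from one, which is the pattern that would warrant a closer look.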

      Other comments:

      1. Interpreting iGluSnFR signals: Since the sensor is expressed uniformly across the SAC dendrite, it is important to clarify why the measured F signals are considered synaptic responses. Could spillover contribute to the generation of slower responses?

      We do not believe spillover can explain slower responses because the sluggish clusters often responded significantly (up to 500ms) sooner to moving bars (Figures 6, 6-S3). We acknowledge and discuss this possibility of spillover in the revised discussion.

      1. One striking finding is the diversity of BCs RF sizes (Fig. 7C). Some BCs have RF that are far larger than their dendritic fields. It will be useful to discuss the potential mechanisms that may underlie large BC RFs.

      We changed the discussion to address this question.

      1. SAC DS is independent of dendritic isolation: The authors claim that dendritic isolation does not significantly impact DS. However, while this might be true for a linear motion through the receptive field, dendritic isolation probably matters for more dynamic stimuli. For example, DSGCs can encode rapid changes in object direction, as DS is computed over fine spatiotemporal scales relying on SACs (Murphy-Baum et al., 2022). This could not occur if SAC dendrites were not well electrically isolated from each other.

      We believe that this is an accurate interpretation of our findings. Our research suggests that dendritic isolation is likely not a critical factor in the space-time wiring mechanism. However, as we demonstrate that this particular mechanism cannot fully account for the observed levels of DS in SACs, other mechanisms must be important. As previous studies revealed that dendritic isolation enhances SAC DS (for example, Koren et al., 2017), dendritic independence likely contributes to directional performance within SACs by these additional mechanisms.

      1. Figure 4: From what I understand, the BC inputs for the electrotonic connectivity variations evolved much like they were for the original model without axial resistance constraints. This makes sense, since stronger/weaker inputs with different temporal kernels may be appropriate for each condition, hence why the axial resistance wasn't changed post-evolution, which would have likely caused the DS to drop. If that is the case, however, I wonder how the best DS attainable by the final model which is constrained to the radial arrangement of realistic BC inputs (without being able to fit much more optimal sustained-transient BCs to their circumstance) would be impacted. Is dendritic isolation similarly unimportant when the pre-synaptic story isn't ideal?

      We have explored this question directly by allowing the evolutionary algorithm to modify the passive and active characteristics of the postsynaptic SAC. Our findings are summarized in Figure 9-S1. We observed a correlation between DSI levels and membrane/axial resistance values in SACs in the evolved models. Better DS was seen with leaky membranes (higher isolation) and lower axial resistance (lower isolation). While it is clear that postsynaptic parameters can influence synaptic integration, they cannot fully compensate for inadequate presynaptic dynamics.

      1. BCs are shown to contribute to DS across velocities (Fig. 9), which contrasts with results from Srivastava et al. (2022) that showed BCs contribute to DS at lower velocities. However, this discrepancy can easily be explained by the choice of moving spots. In this study, the sweeping bars had dynamic width (targeting pixel dwell time of 2s), which means for higher velocities the bar is significantly wider. In the previous study, by contrast, the width of the stimulus was kept constant, and thus for higher velocities, the sustained/transient kinetic differences of BCs are less clear (Srivastava et al., 2021). The authors should discuss this explicitly, to avoid discrepancies between these two studies the reader might otherwise perceive.

      We value the reviewer’s feedback, and in response, we have included an additional paragraph in the manuscript addressing the distinctions in directional tuning that arise from the space-time model presented in this work, in comparison to earlier studies.
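      The relation between bar width, speed, and pixel dwell time raised in this comment can be made concrete with a little arithmetic. The sketch below assumes that dwell time is simply bar width divided by bar speed; the function names and the 2 mm/s example speed are illustrative and are not taken from the manuscript.

```python
def bar_width_mm(speed_mm_per_s, dwell_time_s=2.0):
    """Bar width needed so that each pixel stays covered for dwell_time_s.

    Assumption (not from the manuscript): dwell time = width / speed.
    """
    return speed_mm_per_s * dwell_time_s


def dwell_time_s(width_mm, speed_mm_per_s):
    """Time a single pixel stays covered by a bar of fixed width."""
    return width_mm / speed_mm_per_s


# Dynamic-width design: the bar widens with speed, so dwell time stays at 2 s.
slow_width = bar_width_mm(0.5)   # 1.0 mm at 0.5 mm/s
fast_width = bar_width_mm(2.0)   # 4.0 mm at a hypothetical 2 mm/s

# Constant-width design: the same 1 mm bar dwells for less time at higher speed.
fast_dwell = dwell_time_s(1.0, 2.0)  # 0.5 s
```

      Under this assumption, a constant-width bar covers each pixel for a shorter time as speed increases, which is consistent with the reviewer's point that sustained/transient differences would be harder to resolve at high velocities with such a stimulus.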

      1. Methods: It will be good to discuss how ROIs sizes and positions were selected (pixel correlations?)

      We have included a more detailed explanation of the clustering procedure.

      • Lines 614 describe whole-cell patch clamp techniques, which are not used in this study.

      We used patch-clamp to record the waveforms shown in Figure 2-S2.

      1. Figure 6: Diversity of Glut responses to motion in ON and OFF SACs, caption typos?

      2. "Left:" without "Right:" to describe the population (I presume) viewed as an image

      3. If there should still be A,C and B,D to group the ON and OFF halves, maybe it should be mentioned in the caption

      Thank you for bringing this to our attention; the legends have been fixed.

      References:

      Kim, J. S., Greene, M. J., Zlateski, A., Lee, K., Richardson, M., Turaga, S. C., Purcaro, M., Balkam, M., Robinson, A., Behabadi, B. F., Campos, M., Denk, W., Seung, H. S., & EyeWirers (2014). Space-time wiring specificity supports direction selectivity in the retina. Nature, 509(7500), 331-336. https://doi.org/10.1038/nature13240

      Gaynes, J. A., Budoff, S. A., Grybko, M. J., Hunt, J. B., & Poleg-Polsky, A. (2022). Classical center-surround receptive fields facilitate novel object detection in retinal bipolar cells. Nature communications, 13(1), 5575. https://doi.org/10.1038/s41467-022-32761-8

      Murphy-Baum, B., & Awatramani, G. B. (2022). Parallel processing in active dendrites during periods of intense spiking activity. Cell Reports, 38(8).

      Srivastava, P., de Rosenroll, G., Matsumoto, A., Michaels, T., Turple, Z., Jain, V., Sethuramanujam, S., Murphy-Baum, B., Yonehara, K., & Awatramani, G. B. (2022). Spatiotemporal properties of glutamate input support direction selectivity in the dendrites of retinal starburst amacrine cells. eLife, 11, e81533.

      Strauss, S., Korympidou, M. M., Ran, Y., Franke, K., Schubert, T., Baden, T., Berens, P., Euler, T., & Vlasits, A. L. (2022). Center-surround interactions underlie bipolar cell motion sensitivity in the mouse retina. Nature communications, 13(1), 5574. https://doi.org/10.1038/s41467-022-32762-7

      Tukker, J. J., Taylor, W. R., & Smith, R. G. (2004). Direction selectivity in a model of the starburst amacrine cell. Visual neuroscience, 21(4), 611-625. https://doi.org/10.1017/S0952523804214109

      Reviewer #2 (Recommendations For The Authors):

      Specific comments

      1. Line 223. The statement a model trained on only optimal DSI would produce "negligible absolute differences in calcium levels." is unclear. This needs to be better explained.

      We have modified and expanded this paragraph to make it clearer.

      1. Figure 4. The authors use this model to test the hypothesis that space time wiring contribution to SAC process DS requires dendritic isolation. They do this by increasing axial resistance around the soma of their model neuron to isolate each dendrite. They found comparable DS was achieved in both conditions, indicating that the space-time wiring model works in two cases of high and low dendritic isolation. However, to test the claim that "specific details of postsynaptic integration appear to play a lesser role" (line 274) the authors may consider allowing the axial resistance to change as a part of the model rather than testing two extreme states.

      Membrane and axial resistances (and active parameters) were allowed to change as part of model evolution in most simulations presented in this manuscript. We have added the information on the final resistance values reached in the evolved models in Figure 9-S1.

      1. Figure 6: To study glutamatergic input onto SACs, the authors expressed iGluSnFR in ChAT-Cre mice and grouped similarly responding pixels into ROIs and separated these responses into functional groups based on cluster analysis (Figure 5). The alignment of the responses in Figure 6A was confusing. It appears that average responses for each cluster are aligned based on the peak observed during the stimulus in each direction, but it is unclear how they are aligned relative to each other or what this timing is relative to the location of the stimulus (i.e. what is time 0 in 6A?).

      The displayed traces represent the average responses to horizontally moving bars (speed = 0.5mm/s), either moving to the left or right. To achieve this alignment, we employed a procedure consistent with our recent publication (Gaynes et al., 2022), which we have now detailed more comprehensively. Here's the step-by-step process we followed:

      1. Determination of half-maximum rise times: Initially, we calculated the half-maximum rise times for glutamate signals recorded in response to left and right-moving stimuli.

      2. Calculation of mean rise time: We then computed the mean of these rise times, which served as a reference point for alignment.

      3. Alignment procedure: To illustrate the alignment process, consider an example. Suppose the 50% rise time for responses to left-moving stimuli occurs at 3 seconds, while responses to right-moving stimuli occur 4 seconds after stimulation onset. This discrepancy suggests that the RF of the cell is shifted to the right from the center of the display (assuming a stimulation speed of 0.5mm/s on the retina, the RF's position would be approximately 250μm from the midline). To align these responses, we shifted both waveforms by 500ms so that their 50% rise times coincided at 3.5 seconds. Importantly, 3.5 seconds would represent the 50% rise time of the ROI if it were precisely centered on the display. This alignment effectively removed any spatial position dependence from the ROIs.

      4. Comparative analysis and clustering: With the responses now aligned, we were able to compare their shapes and subsequently cluster the ROIs into distinct functional clusters. For clarity, we opted to highlight the time of response peak for cluster 1. Although this peak closely aligned with the calculated time of stimulus motion over the center of the 'shifted RF' in the adjusted time frame, it provided a more straightforward comparison between response dynamics.
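      The alignment steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the baseline is taken as the first sample of the trace, and the half-maximum crossing is found by linear interpolation between samples.

```python
def half_max_rise_time(times, values):
    """Time at which the trace first rises through 50% of its peak.

    Assumptions (illustrative only): the baseline is the first sample and
    the crossing is located by linear interpolation between samples.
    """
    baseline = values[0]
    peak = max(values)
    half = baseline + 0.5 * (peak - baseline)
    for i in range(1, len(values)):
        if values[i - 1] < half <= values[i]:
            frac = (half - values[i - 1]) / (values[i] - values[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("trace never crosses its half maximum")


def align_rise_times(t_left_s, t_right_s, speed_mm_per_s):
    """Return the common rise time, per-direction shifts, and RF offset.

    The common time is the mean of the two rise times; each waveform is
    shifted by (mean - its own rise time). The RF offset from the display
    midline is half the timing difference multiplied by the bar speed.
    """
    mean_rise_s = (t_left_s + t_right_s) / 2.0
    shift_left_s = mean_rise_s - t_left_s
    shift_right_s = mean_rise_s - t_right_s
    rf_offset_um = abs(t_right_s - t_left_s) / 2.0 * speed_mm_per_s * 1000.0
    return mean_rise_s, shift_left_s, shift_right_s, rf_offset_um


# Worked example from the text: rise times of 3 s (left) and 4 s (right)
# at a stimulus speed of 0.5 mm/s.
mean_s, shift_l, shift_r, offset_um = align_rise_times(3.0, 4.0, 0.5)
# mean_s == 3.5, shift_l == 0.5, shift_r == -0.5, offset_um == 250.0
```

      With the numbers from step 3, both waveforms move by 500 ms in opposite directions to meet at 3.5 s, and the implied RF offset is 250 μm, matching the values given in the response.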

      1. The authors need to do a better job explaining how their results differ from Ezra-Tsur et al 2021, which uses the same sort of model to address the same question. The discussion about this study (lines 425-435) is based on how a more constrained version of these models works better, but it does not directly address the difference in conclusion with regards to mechanisms that contribute to SAC process direction selectivity.

      We have expanded the discussion related to mechanisms that contribute to DS in SACs and discuss the differences between our studies.

      Minor point: The authors use the word "probe" to refer to visual stimulus. This is confusing because "probe" is also used to refer to sensors.

      In the revised manuscript, we minimized the usage of ‘probe’ to reference visual stimuli.

      Reviewer #3 (Recommendations For The Authors):

      Writing and figure presentations are excellent.

      Thank you!

      References:

      Franke, K., Berens, P., Schubert, T., Bethge, M., Euler, T., & Baden, T. (2017). Inhibition decorrelates visual feature representations in the inner retina. Nature, 542(7642), 439-444. https://doi.org/10.1038/nature21394

      Gaynes, J. A., Budoff, S. A., Grybko, M. J., Hunt, J. B., & Poleg-Polsky, A. (2022). Classical Center-Surround Receptive Fields Facilitate Novel Object Detection in Retinal Bipolar Cells. Nat Commun, 13(1), 5575. https://doi.org/10.1038/s41467-022-32761-8

      Kim, J. S., Greene, M. J., Zlateski, A., Lee, K., Richardson, M., Turaga, S. C., Purcaro, M., Balkam, M., Robinson, A., Behabadi, B. F., Campos, M., Denk, W., Seung, H. S., & EyeWirers. (2014). Space-time wiring specificity supports direction selectivity in the retina. Nature, 509(7500), 331-336. https://doi.org/10.1038/nature13240

      Matsumoto, A., Agbariah, W., Nolte, S. S., Andrawos, R., Levi, H., Sabbah, S., & Yonehara, K. (2021). Direction selectivity in retinal bipolar cell axon terminals. Neuron. https://doi.org/10.1016/j.neuron.2021.07.008

      Matsumoto, A., Briggman, K. L., & Yonehara, K. (2019). Spatiotemporally Asymmetric Excitation Supports Mammalian Retinal Motion Sensitivity. Curr Biol. https://doi.org/10.1016/j.cub.2019.08.048

      Srivastava, P., de Rosenroll, G., Matsumoto, A., Michaels, T., Turple, Z., Jain, V., Sethuramanujam, S., Murphy-Baum, B. L., Yonehara, K., & Awatramani, G. B. (2022). Spatiotemporal properties of glutamate input support direction selectivity in the dendrites of retinal starburst amacrine cells. Elife, 11. https://doi.org/10.7554/eLife.81533

      Strauss, S., Korympidou, M. M., Ran, Y., Franke, K., Schubert, T., Baden, T., Berens, P., Euler, T., & Vlasits, A. L. (2022). Center-surround interactions underlie bipolar cell motion sensing in the mouse retina. Nat Commun, 13(1), 5574. https://doi.org/10.1038/s41467-022-32762-7

      Tukker, J. J., Taylor, W. R., & Smith, R. G. (2004). Direction selectivity in a model of the starburst amacrine cell. Vis Neurosci, 21(4), 611-625. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15579224

      Wu, J., Kim, Y. J., Dacey, D. M., Troy, J. B., & Smith, R. G. (2023). Two mechanisms for direction selectivity in a model of the primate starburst amacrine cell. Vis Neurosci, 40, E003. https://doi.org/10.1017/S0952523823000019

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors sought to understand the neurocomputational mechanisms of how acute stress impacts human effortful prosocial behavior. Functional neuroimaging during an effort-based decision task and computational modeling were employed. Two major results are reported: 1) Compared to controls, participants who experienced acute stress were less willing to exert effort for others, with a more prominent effect for those who were more selfish; 2) More stressed participants exhibited an increase in activation in the dorsal anterior cingulate cortex and anterior insula that are critical for self-benefiting behaviour. The authors conclude that their findings have important insights into how acute stress affects prosociality and its associated neural mechanisms.

      Overall, there are several strengths in this well-written manuscript. The experimental design along with acute stress induction procedures were well controlled, the data analyses were reasonable and informative, and the results from the computational modeling provide important insights (e.g., subjective values). Despite these strengths, there were some weaknesses regarding potential confounding factors in both the experimental design and methodological approach, including selective reporting of only some aspects of this complex dataset, and the interpretation of the observations. These detract from the overall impact of the manuscript. In particular, the stress manipulation and pro-social task are both effortful, raising the possibility that stressed participants were more fatigued. Other concerns include the opportunity for social dynamics or cues during task administration, the baseline social value orientation (SVO) in each group, and the possibility of a different SVO in individuals with selfish tendencies. Finally, Figure 4 should specify whether the depicted prosocial choices include all five levels of effort.

      We thank the reviewer for their comments and suggestions. In our response to the recommendations for the authors below, we have dealt with the reviewer’s concerns:

      - We added additional analyses on the role of fatigue and block effects to the supplementary materials.
      - We provided further information about the role of social cues and dynamics during task administration.
      - We showed there were no baseline group differences in SVO angle.
      - We clarified that Figure 4 refers to the proportion of prosocial choices across all effort levels.

      Reviewer #2 (Public Review):

      This manuscript describes an interesting study assessing the impact of acute stress on neural activity and helping behavior in young, healthy men. Strengths of the study include a combination of neuroimaging and psychoneuroendocrine measures, as well as computational modeling of prosocial behavior. Weaknesses include complex, difficult to understand 3-way interactions that the sample size may not be large enough to reliably test. Nonetheless, the study and results provide useful information for researchers seeking to better understand the influence of stress on the neural bases of complex behavior.

      The stressor was effective at eliciting physiological and psychological stress responses as shown in Figure 2.

      Higher perceived stress in more selfish participants (lower social value orientation (SVO) angle) was associated with lower prosocial responding (Figure 4). How can we reconcile this finding with the finding (presented on page 15) that those with a more prosocial SVO showed a significant decline in dACC activation to subjective value at increasing levels of perceived stress? This seems contrary to the behavioral response.

      A larger issue with the study is that the power analysis presented on page 23 is based on a 2 (between: stress v. control) by 2 (within: self v. other) design. Most of the reported findings come from analyses of 3-way interactions. How can the readers have confidence in the reliability of results from 3-way interaction analyses, which were not powered to detect such effects?

      We thank the reviewer for their comments and suggestions. When considering the influence of dACC activation on the behavioural response (i.e., proportion of prosocial choices), it is important to consider the difference in activation to SVself relative to SVother:

      - The difference in activation to SVself relative to SVother negatively predicted the proportion of prosocial choices, so more activation to SVself relative to SVother predicted a lower proportion of prosocial choices.
      - Similarly, SVO angle negatively predicted the difference in activation to SVself relative to SVother, so more activation to SVself relative to SVother was related to a lower (more individualistic) SVO angle (this is shown by the interaction between Recipient and SVO angle in Figure 4; right panel).

      In both cases, differences in prosociality (i.e. SVO angle or the proportion of prosocial choices) were related to differences in dACC activation to SVself relative to SVother.

      Thus, we agree the finding that those participants with a more prosocial SVO showed a significant decline in dACC activation to SV overall (across SVself and SVother) at increasing levels of perceived stress is difficult to interpret. We expected a three-way interaction between Recipient, SVO angle and Perceived Stress to mirror the behavioural results, rather than a two-way interaction between SVO angle and Perceived Stress. We have now acknowledged this in the Discussion, whilst also highlighting the work of Schulreich et al. (2022) who report a related finding.

      We have now added the following section to the results:

      “When linking activation difference in dACC and AI to behaviour, we found that – independent of the stress manipulation – the difference in activation between SVself and SVother in the dACC predicted the proportion of prosocial choices. Thus, greater activation to SVself relative to SVother predicted a lower proportion of prosocial choices (B=-0.704, SE=0.339, P=0.041). This relationship was not present in the AI (B=-0.423, SE=0.332, P=0.205).”

      And we have added the following to the discussion:

      “Additionally, participants with a more prosocial SVO showed reduced responses in the dACC to SV (across both self and other trials) at greater levels of perceived stress (Figure 4; middle panel). This suggests that more prosocial individuals may become less sensitive to SV overall following stress, whilst the responses of more individualistic participants to SV do not change under stress. Trying to link these activation differences to changes in effortful prosocial behaviour is difficult given the absence of the three-way interaction between SVO angle, Perceived Stress and Recipient, which would have mirrored the behavioural results. Overall, differences in activation between SVself and SVother in the dACC predicted the proportion of prosocial choices, so greater activation to SVself relative to SVother predicted a lower proportion of prosocial choices. Thus, it remains unclear how activation differences to SV across both self trials and other trials relate to changes in prosocial behaviour under stress. Schulreich et al. (2022) found that a decline in charitable donations following increases in cortisol in high mentalisers was related to a reduced representation of value for donations in the right dlPFC. Whilst there are important differences between the present study and Schulreich et al. (2022), such as the way in which prosocial behaviour was measured, both studies suggest that existing differences in social preferences and abilities (i.e., mentalising, SVO) can have a detrimental effect on the neural representations of value following acute stress. Establishing how these changes in neural representations of value impact behaviour following acute stress is a challenge for future work.”

      Concerning the power calculation, we have acknowledged this as a limitation in the discussion.

      “Our power calculation was based on a 2 x 2 design (Group x Recipient); however, several of our key findings involved three-way interactions (e.g. between Group, Recipient and Effort). Thus, future studies should aim to replicate our effects with larger sample sizes to ensure the robustness of these effects.”

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      1. The authors employed an integrative approach on inducing acute stress by combining the strengths of MIST and TSST, as shown by a robust stress response in cortisol. However, some concerns regarding the stress manipulation and the effort-based task need to be addressed. The authors justified the order of deployment as necessary to maintain stress responses throughout the scanning period. It is unclear whether and how potential order effects were controlled, and whether the effort-task performance in the front and back of the line might have different effects in a 90-minute experiment.

      Moreover, the stress manipulation itself involved a complex mental arithmetic task, which might have influenced participants' willingness to exert effort for others in the prosocial task. As shown in Figure 3, the proportion of participants working decreases as the effort levels increase for both self and other conditions in the stress and control groups. It is thus possible that participants could consider the prosocial task as an opportunity to take a break from the demanding arithmetic task. It would be helpful to present results from the different runs, particularly for the pre and post three runs.

      We thank the reviewer for highlighting this potential issue. We have added several analyses to the supplementary analysis to explore potential block effects and fatigue effects. Here we provide a summary of the key findings.

      Firstly, we investigated participants’ ratings of the effort levels, which they provided immediately before and after the study, to investigate potential fatigue effects. We found that, after the experiment compared to before, participants in the stress group rated squeezing to the required effort levels as more physically demanding than the control group did (p=.037). There were no group differences in how much more effort they reported exerting (p=.824) or how uncomfortable it was (p=.351) compared to before the experiment. Thus, overall the stress group found it more physically demanding to squeeze to the effort levels following the experiment. Crucially, however, increases in how physically demanding participants found it to squeeze to the required effort levels were not correlated with the number of effortful choices in the Self and Other condition in either group (all Ps >0.4). This suggests that whilst stressed participants rated squeezing to the required effort level as more physically demanding following the task relative to before, this was not related to how often participants exerted effort for self or other rewards.

      Secondly, we investigated potential block effects. We repeated the mixed effects logistic regression reported in the manuscript but included the interaction between the factors Group, Recipient and Block (1:6) in the model. Although both groups showed a decline in the number of effortful choices during the experiment, neither the two-way interaction between Group and Block (p=.188) nor the three-way interaction between Group, Recipient and Block (p=.138) was significant. This shows that whilst there was a decline in the number of effortful choices throughout the experiment, this was not more pronounced in the stress group, nor was it more pronounced in the stress group for self relative to other effortful choices compared to the control group. Additionally, the key three-way interaction between Group, Recipient and Effort was unaffected when controlling for potential block effects. We now also plot the data by block in the supplementary materials (Figure S3).

      Please see the relevant section in the Supplementary Materials; a summary of these analyses also appears in the Results section of the manuscript:

      “We conducted additional analyses to rule out the influence of potential fatigue and block effects (see Fatigue and block effects in the Supplementary Materials). In short, the stress group rated squeezing to the required effort level as more physically demanding immediately after the experiment compared to before, which was not seen in the control group (Figure S2). However, this was not related to the number of effortful choices for self or other rewards (Table S2). Moreover, when we conducted the same mixed effects logistic regression on participants’ choices but also included the interaction between Group, Recipient and Block, there was no significant three-way interaction between these factors, nor a significant two-way interaction between Group and Block (Figure S3). Additionally, the three-way interaction between Group, Recipient and Effort was unaffected when controlling for potential block effects (Type III Wald test χ2[4]=22.06, P<0.001). Thus, whilst the stress group rated squeezing to the required effort level as more physically demanding following the experiment, this was not related to the number of effortful choices (for self or other) and the effects of Block on effortful choices (for self or other) did not differ between the groups. Thus, changes in how physically demanding participants rated squeezing to the effort levels did not influence decisions to exert effort.”

      1. It would be useful to know whether the authors controlled for factors such as familiarity or gender among participants that might influence their choices on the task. If participants were able to interact or observe each other, it is possible that social dynamics played a role in their behavior, which could confound the interpretation of their results. It would be beneficial if the authors could provide further information on how the task was administered and whether any social cues were present.

      For the experimental design, although salivary samples and subjective pressure were measured, did the authors measure participants' subjective ratings of other negative emotions?

      Participants did not have the chance to see or interact with the participants in the “other” condition. Participants were told at the start of the experiment that they would be earning money for the next participant in the study, called Thomas. Thus, as all participants were men, the name of the next participant was gender matched. Moreover, as they did not see or interact with the next participant, familiarity was controlled across participants.

      We have now added a section on p. 8 to clarify this:

      “As all participants were men, the name of the next participant was gender matched (all participants were told he was called Thomas; see Methods). Moreover, as participants did not see or interact with the next participant, familiarity was controlled across participants.”

      We have now added a plot to the supplementary materials (Figure S4) showing the changes in the ratings of the emotions. Apart from the emotions anxious and disgusted, all other emotions (calm, happy, bad, sad, surprised, angry) showed a significant sample timepoint (1:8) by group (stress, control) interaction, thus mirroring the results for the perceived stress ratings. We now refer to this figure in the manuscript on p. 8:

      “for changes in other emotions during the experiment please see Figure S4”

      1. Regarding the data analysis section, the authors' analysis is careful overall and the results about SVO are interesting. It would be interesting to know if baseline SVO was similar across both stress and control groups, and if there were any differences in SVO among participants with more individualistic or selfish tendencies. Regarding Figure 4, it would be helpful if the authors clarified whether the vertical coordinate "prosocial choices" is a combination of the five levels of effort or if it is specific to one level. Additionally, it would be useful to explore whether there is a correlation between SVO and prosocial choices and whether effort level could be used as a covariate to control for potential confounding effects. These suggestions could improve the clarity and strength of their contributions.

      There were no differences in SVO angle between the control group and stress group (p=.956). There was also a significant correlation between SVO angle and the proportion of prosocial choices across the whole sample. This has now been reported in the manuscript on p. 13:

      “There were no existing differences in SVO angle between the groups (control group mean = 19.33, SD = 8.67; stress group mean = 19.23, SD=8.14; p=0.956). We found that across the whole sample – independent of the stress manipulation – there was a significant correlation between SVO angle and the proportion of prosocial choices (r=0.225, P=0.032). So, as expected, those with a more prosocial SVO angle showed a higher proportion of prosocial choices in the task.”

      To clarify, the variable “% prosocial choices” is a combination of all the five effort levels. In other words, we took the total number of prosocial choices (‘work’ for other) across all effort levels relative to the total number of effortful choices. We have now clarified this in the manuscript on p. 13. As this was a combination of all effort levels (and reward levels), it was not possible to include effort level as a covariate.

      “This measure combined all reward and effort levels.”
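      The measure described above can be sketched as follows. This is an illustration under one reading of the description, namely that the denominator is the number of 'work' choices made for self and other combined; the function name and the trial representation are hypothetical and are not taken from the manuscript.

```python
def percent_prosocial(trials):
    """Percentage of effortful ('work') choices that were made for the other.

    trials: iterable of (recipient, choice) pairs, with recipient in
    {"self", "other"} and choice in {"work", "rest"}. All effort and
    reward levels are pooled, as in the description above.
    Assumption (not confirmed by the source): the denominator counts
    'work' choices for self and other together.
    """
    trials = list(trials)
    work_self = sum(1 for r, c in trials if r == "self" and c == "work")
    work_other = sum(1 for r, c in trials if r == "other" and c == "work")
    total_work = work_self + work_other
    if total_work == 0:
        raise ValueError("no effortful choices were made")
    return 100.0 * work_other / total_work


# Hypothetical participant: 3 'work' choices for self, 1 for other, 2 rests.
example = (
    [("self", "work")] * 3
    + [("other", "work")] * 1
    + [("other", "rest")] * 2
)
score = percent_prosocial(example)  # 25.0
```

      Because trials from every effort level enter the same count, the resulting score has no effort dimension left, which is why effort level could not be added as a covariate for this measure.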

      1. It is noteworthy that in the dACC, an effect was observed with regard to the interaction between perceived stress and SVO angle. Considering this observation, another suggestion would be for the authors to include visualization in Figure 4 to present the results of this interaction. This could help readers better comprehend the findings and provide a clearer representation of the results.

      We have now updated Figure 4 so that it has three panels showing the behavioural and neural results concerning SVO angle as well as the relationship between SVO angle and activation to SVself and SVother in the dACC.

      1. It would be helpful for readers if the authors could label all statistical plots with appropriate statistical values, effect sizes, and their respective significance levels. By doing so, readers would be able to quickly identify major findings of this study and gauge the degree of significance associated with each plot. The authors should consider including such information in their statistical plots to enhance the comprehensibility of the study results.

      We have added statistical values (e.g., beta estimates), including indicators of significance to the plots.

      1. The authors selected ROIs based on previous work on stress-related and effort-based decision making (i.e., AI and dACC). While other brain regions may also play a role in decision making and social cognition, the authors could choose to focus on these specific ROIs due to their relevance to the experimental question and hypotheses of this study such as prosocial, mentalizing and subjective values.

      We agree that several other ROIs may have also been of interest. However, we decided to restrict our analysis to the dACC and the AI as these two ROIs were the focus of a previous study using the same prosocial effort paradigm (Lockwood et al. 2022) and multiple studies suggest these regions are sensitive to stress effects.

      1. The authors chose to use one sample t-test with AUC as a covariate to examine brain activations across all participants regardless of their stress or control condition. This approach could identify brain regions that are associated with perceived stress. However, the authors didn't conduct a simple two sample t-test between stress and control groups since their research question and hypotheses focused on the neurocomputational mechanisms underlying prosocial decision-making during stress. Regarding the different stages of decision-making, such as offer, force, and outcome, the authors did not conduct specific analyses for each stage. Instead, they used the computational model to estimate the subjective value of each option at each stage, which allowed them to examine the neural correlates of different value-related parameters across the entire decision-making process. However, it would be interesting to examine the role of different stages as well.

      Our design matrix modelled three events during each trial: the offer, force, and outcome phase (as per Lockwood et al. 2022). However, our hypotheses and research question for the effects of acute stress concerned the offer phase, i.e. when participants were deciding whether to exert effort or not (work vs. rest). Therefore, we decided to limit our reporting to this event. We have clarified this on p. 32 in the Methods:

      “Our hypotheses and research questions concerning the effects of acute stress concerned the offer phase, i.e., when participants were deciding whether to exert effort or not (work vs. rest). Therefore, we limited our reporting to this event.”

      1. The authors' findings pertaining to individual differences are intriguing, particularly the finding that individuals with selfish tendencies exhibit lower pro-social tendencies under stress. Additionally, group variations in effortful behavior related to benefitting others, relative to oneself, are more evident at lower effort levels than at higher ones. The authors could dedicate more space in the discussion section to the potential mechanisms involved and address the absence of pertinent theoretical support.

      We have now extended the discussion to further outline potential mechanisms. Broadly, we interpret our findings in terms of compromised executive functioning under acute stress: “downregulation of the brain’s ‘executive control network’ (Hermans et al., 2014)”. In the original submission, we focused on changes in inhibition and shifts to habitual/automatic processing. We have now expanded this to include a section on cognitive flexibility (see below). Note that changes in executive functioning have been widely reported following stress (see Shields et al., 2016 for a meta-analysis). However, which specific executive functions influenced our observed changes in prosocial behaviour is an exciting avenue for future work.

      We have added this section on p. 20-21 concerning cognitive flexibility:

      “The dlPFC has also been implicated in cognitive flexibility under acute stress. For example, Kalia et al. (2018) used functional near infrared spectroscopy to show that reduced cognitive flexibility under stress was related to changes in activation in the dlPFC in men. In our study, participants in the control group were more likely to exert effort for self rewards compared to other rewards at higher, but not at lower, levels of effort, whilst participants in the stress group favoured exerting effort for self rewards at every effort level (Figure 3). This consistent preference for self rewards compared to other rewards at all effort levels suggests that stressed participants did not adapt their social behaviour in response to changing contextual information. This supports multiple studies showing reduced cognitive flexibility under stress (Goldfarb et al., 2017; Kalia et al., 2018; Raio et al., 2017; Shields et al., 2016). An exciting avenue for future work is to test whether individual differences in executive functions, such as inhibition and cognitive flexibility, predict changes in social behaviour following acute stress. This would be analogous to the finding in non-social domains, where greater working memory capacity protects against stress-induced changes in learning (Otto et al., 2013).”

      Reviewer #2 (Recommendations For The Authors):

      The manuscript suggests that the stress group made more selfish responses than the control group at lower, but not higher, levels of effort (as shown in Figure 3). I recommend that Figure 3, showing these data, be modified for clarity. Currently, data for the between-subjects comparison (Control and Stress groups) are linked by a dashed line. This linkage (at least in my mind) connotes that these data points are from the same people at different times. In fact, the within-subjects data are not linked by a line, but are noted by different colored symbols. Please reconsider how these data are presented.

      We have redrawn Figure 3. For each effort level, the self vs. other manipulation is shown on the x axis and the two groups (Control vs. Stress) are shown by black and grey lines. For each group, the lines are connected to show that the Self vs. Other manipulation is a within-subject manipulation.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Thank you for the response and reviews of our manuscript eLife-RP-RA-2023-86638 “Energetics of the Microsporidian Polar Tube Invasion Machinery”. We are grateful for the comments and constructive criticism from all three reviewers, which have helped us to improve our manuscript.

      As a summary to the editor, we here provide a list of the major revisions we have implemented to address all the comments provided by the referees.

      1. We added Supplementary Section A.9 and Figure S4 to explain the details of calculation and have magnified sketches of flow fields.

      2. We clarified the term "required pressure" to "required pressure differences", and explained that the same pressure differences can be achieved by either positive or negative pressure. We invoke the fact that the spore wall buckled inward to deduce that germination is a negative pressure process.

      3. We only rank the hypotheses based on calculation of total energy requirement. The peak pressure and peak power requirement calculations are now just for quantitative reference. The ranking of hypotheses does not change.

      4. We clarified the definition of topological connections in Section "Systematic evaluation of possible topological configurations of a spore," making it explicit that the topological questions listed only involved the "original PT content" (not PT space at all time).

      Thank you again for the opportunity to revise our work. We attach a point-by-point response to the referees below.

      Public Reviews:

      Reviewer #1 (Public Review):

      1. The authors used mathematical models to explore the mechanism(s) underlying the process of polar tube extrusion and the transport of the sporoplasm and nucleus through this structure. They combined this with experimental observations of the structure of the tube during extrusion using serial block face EM providing 3 dimensional data on this process. They also examined the effect of hyperosmolar media on this process to evaluate which model fit the predicted observed behavior of the polar tube in these various media solutions.

      We thank the reviewer for their accurate summary of our work. One subtle point, however, is that we examine the effect of hyperviscous media on the polar tube extrusion process, rather than hyperosmolar media. In Supplementary Section A.6 of our updated manuscript, we have shown that the change in osmolarity due to methylcellulose is negligible.

      1. Overall, this work resulted in the authors arriving at a model of this process that fit the data (model 5, E-OE-PTPV-ExP). This model is consistent with other data in the literature and provides support for the concept that the polar tube functions by eversion (unfolding like a finger of a glove) and that the expanding polar vacuole is part of this process. Finally, the authors provide important new insights into the buckling of the spore wall (and possible cavitation) as providing force for the nucleus to be transported via the polar tube. This is an important observation that has not been in previous models of this process.

      We thank the reviewer for acknowledging the novelty and importance of our study.

      Reviewer #2 (Public Review):

      1. Microsporidia have a special invasion mechanism, in which the polar tube (PT) is ejected from mature spores at ultra-fast speeds to penetrate the host and transfer the cargo to the host. This work generated models for the physical basis of polar tube firing and cargo transport through the polar tube. The authors also use a combination of experiments and theory to elucidate possible biophysical mechanisms of microsporidia. Moreover, their approach also points to potential applications of such biophysical approaches to other cellular architectures.

      We thank the reviewer for their accurate summary and acknowledging the potential applications on other organisms.

      1. The conclusions of this paper are mostly well supported by data, but some analyses need to be clarified. According to the model 5 (E-OE-PTPV-ExP) in P42 Fig. 6, is the posterior vacuole connected with the polar tube? If yes, how does the nucleus unconnected with the posterior vacuole enter the polar tube?

      As we mentioned in our glossary and detailed in Section "Systematic evaluation of possible topological configurations of a spore", Model 5 requires the "original PT content" (any material inside the PT prior to cargo entering the tube) to permit fluid flow to posterior vacuole and external environment post anchoring disc rupture, but cannot permit fluid flow to the sporoplasm that is transported through the tube. As the germination process progresses, our model does not require the connection between PT and posterior vacuole to be maintained afterwards, and that creates space allowing sporoplasm (including nucleus) to enter PT space through fluid entrainment. We have clarified the definitions in Section "Systematic evaluation of possible topological configurations of a spore" and have additional clarification in the caption of Fig. 6 in the updated manuscript.

      1. In Fig. 6, would the posterior vacuole become two parts after spore germination? One part is transported via the polar tube, and the other is still in the spore. I recommend this process requires more experiments to prove.

      According to our Model 5, the membrane connection between PT and posterior vacuole must be broken for the infectious cargo to extrude. However, our current data do not allow us to prove or disprove the membrane fission event. In theory, the membrane content in PT can potentially be severed into multiple parts by Plateau-Rayleigh instability, an interfacial-tension-driven fluid thread breakup mechanism. Note that membrane fission is possible at the time scale of the germination process: when the time scale of shearing is shorter than the viscoelastic time of lipid membranes (roughly 10 msec), membrane fission can happen (Morlot & Roux 2013). For time scales longer than the viscoelastic time of lipid membranes, protein complexes like dynamin would be required for membrane fission. Future cryo-EM study of the vacuole-PT connection at the anterior tip (and in the spore as a whole) is needed to clarify the physical process. We added this discussion in Section "Predictions and proposed future experiments".

      Reviewer #3 (Public Review):

      Abstract:

      The paper follows a recent study by the same team (Jaroenlak et al Plos Pathogens 2020), which documented the dramatic ejection dynamics of the polar tube (PT) in microsporidia using live-imaging and scanning electron microscopy. Although several key observations were reported in this paper (the 3D architecture of the PT within the spore, the speed and extent of the ejection process, the translocation dynamics of the nucleus during germination), the precise geometry of the PT during ejection remains inaccessible to imaging, making it difficult to physically understand the phenomenon.

      This paper aims to fill this gap with an indirect "data-driven" approach. By modeling the hydrodynamic dissipation for different unfolding mechanisms identified in the literature and by comparing the predictions with experiments of ejection in media of various viscosities, the authors show that the data are compatible with an eversion (caterpillar-like) mechanism but not compatible with a "jack-in-the-box" scenario. In addition, the authors observe that most germinated spores exhibit an inward bulge, which they attribute to buckling due to internal negative pressure and which they suggest may be a means of pushing the nucleus out of the PT during the final stage of ejection.

      We thank the reviewer for their accurate summary of our work.

      Major strengths:

      Probably the most impressive aspect of the study is the experimental analysis of the ejection dynamics (velocity, ejection length) in media of various viscosities spanning 3 orders of magnitude, which, combined with a modeling of the viscous drag of the PT tube, provides very convincing evidence that the unfolding mechanism is not a global displacement of the tube but rather an apical extension mechanism, where the motion is localized at the end of the tube. The systematic classification of the different unfolding scenarios, consistent with the previous literature, and their confrontation with data in terms of energy, pressure and velocity also constitute an original approach in microbiology where in-situ and real time geometry is often difficult to access.

      We thank the reviewer for acknowledging the novelty and importance of our study.

      Major weaknesses:

      1a. While the experimental part of the paper is clear, I had (and still have) a hard time understanding the modeling part. Overall, the different unfolding mechanisms should be much better explained, with much more informative sketches to justify the dissipation and pressure terms, magnifying the different areas where dissipation occurs, showing the velocity field and pressure field, etc.

      We thank the reviewer for their comments and suggestions. In the Figure S4 and SI Section A.9 of the updated manuscript, we have magnified the sketches with flow field, and added a detailed explanation of the derivations of dissipation terms.

      1b. In particular, a key parameter of eversion models is the geometry of the lubrication layers inside and outside the spore (h_sheath, h_slip). Where do the values of h_sheath and h_slip come from? What is the physical process that selects these parameters?

      As we described in SI Section A.9, h_sheath was set to be 25 nm based on the observed translucent space around PT in activated spores (Lom 1972), and h_slip was set to be 6 nm based on the observed gap thickness between PT and cargo (Takvorian et al. 2020). Although we don't expect these numbers to be the same for each spore, the uncertainty in these two parameters is much smaller than the uncertainty in cytoplasmic viscosity (which varies over several orders of magnitude) and boundary slip length. Our sensitivity testing on cytoplasmic viscosity and boundary slip length thus already covers any uncertainty in h_sheath or h_slip.

      1c. For clarity, the figures showing the unfolding mechanics in the different scenarios should be in the main text, not in the supplemental materials.

      We have added Figure S4 and SI Section A.9 to explain the details of our sketches. We believe, however, putting all the details of the mechanics and how each term is derived in the main text may detract from the flow of the manuscript, and result in it being less accessible to readers who are not as familiar with the physics. We therefore decided to keep this information in supplemental materials.

      2a. The authors compute and discuss in several places "the pressure" required for ejection, but no pressure is indicated in the various sketches and no general "ejection mechanism" involving this pressure is mentioned in the paper.

      In the updated manuscript, we have changed the term “pressure” to “pressure difference” or “required pressure difference”. We did not calculate the detailed pressure field around each structure, but only estimated the required pressure difference to overcome the drag force and drive fluid flow in various spaces. We also clarified this point in Section "Developing a mathematical model for PT energetics".

      Also, as we mentioned in Section “Posterior vacuole expansion and the role of osmotic pressure”, we made no assumptions on how the pressure difference is generated in this paper. The unfolding mechanism of polar tube, how eversion is sustained, and the driving mechanism are ongoing research projects, and we decided not to make premature comments on that without strong support from experiments or simulation results.

      2b. What is this "required pressure" and to what element does it apply?

      The “required pressure” in the manuscript indicates the required pressure difference between the spore and the tip of the polar tube for it to push the tip forward and sustain the fluid flow within the polar tube. In the updated manuscript, we thus changed the term “required pressure” to “required pressure difference”. We also added this clarification to Section "Developing a mathematical model for PT energetics".

      2c. I understand that the article focuses on the dissipation required for the deployment of the PT but I find it difficult to discuss the unfolding mechanism without having any idea of the driving mechanism of the movement. How could eversion be initiated and sustained?

      As we mentioned in Section “Posterior vacuole expansion and the role of osmotic pressure”, we made no assumptions on how the energy, pressure or power is generated in this paper. We agree that the unfolding mechanism of the polar tube, how eversion is sustained, and the driving mechanism are important questions, and these are ongoing research projects. As no assumptions about this are required for our models, we decided not to comment on these aspects without strong support from experiments or simulation results. We have clarified this in Section “Posterior vacuole expansion and the role of osmotic pressure” of the updated manuscript.

      1. Finally, the authors do not explain how pressure, which appears to be a positive, driving quantity at the beginning of the process, can become negative to induce buckling at the end of ejection. Although the hypothesis of rapid translocation induced by buckling is interesting, a much better mechanistic description of the process is needed to support it.

      As discussed in Point 2b above, the “required pressure” actually means “required pressure difference”. The same pressure difference can be achieved by either positive pressure (the spore has a higher pressure than the ambient, pushing the fluid into PT) or negative pressure (the PT tip has a lower pressure than the ambient, sucking the fluid from the spore). Hydrodynamic dissipation analysis alone cannot distinguish between positive and negative pressure, as it only gives the required pressure difference between the spore and the polar tube tip. The sign has to be inferred from the implied mechanisms or other evidence. We added this discussion in the 4th paragraph of Section "Developing a mathematical model for PT energetics" in the updated manuscript.

      That being said, our observations of buckled spore walls are sufficient to deduce that the polar tube ejection process is a negative-pressure-driven process. For the spore wall to buckle inwards, the ambient pressure has to be higher than the pressure within the spore, which would contradict the positive pressure hypothesis as elaborated above. We added these clarifications in the 2nd paragraph of Section "Models for the driving force behind cargo expulsion".

      References:

      Lom, J. (1972). On the structure of the extruded microsporidian polar filament. Zeitschrift Für Parasitenkunde, 38(3), 200–213.

      Takvorian, P. M., Han, B., Cali, A., Rice, W. J., Gunther, L., Macaluso, F., & Weiss, L. M. (2020). An Ultrastructural Study of the Extruded Polar Tube of Anncaliia algerae (Microsporidia). The Journal of Eukaryotic Microbiology, 67(1), 28–44.

      Morlot, S., & Roux, A. (2013). Mechanics of dynamin-mediated membrane fission. Annual Review of Biophysics, 42, 629–649.

      Reviewer #1 (Recommendations For The Authors):

      The work is solid and supported by the experimental data presented, the literature and the biophysical modeling.

      1. The model (Model 5) indicates that the polar tube is connected to the posterior vacuole and that the contents of this vacuole may be transported by the polar tube before the sporoplasm. This needs experimental validation in the future, which will require the identification of posterior vacuole markers (i.e. proteins specific to this structure). I find the topology of this idea difficult to understand. If the polar tube is outside of the sporoplasm membrane then how does it connect to the posterior vacuole? If the expanded posterior vacuole is still in the spore at the end of germination then how does the sporoplasm get out?

      Model 5 requires the "original PT content" (any material inside the PT prior to cargo entering the tube) to permit fluid flow to posterior vacuole and external environment post anchoring disc rupture, but cannot permit fluid flow to sporoplasm. As the germination process progresses, our model does not require the connection between PT and posterior vacuole to be maintained afterwards, and that creates space allowing sporoplasm (including nucleus) to enter PT space through fluid entrainment.

      We agree with the reviewer that the specific predictions from Model 5 need to be experimentally validated in the future, and identification of posterior vacuole markers is a good direction. We have mentioned this in Section "Predictions and proposed future experiments".

      1. I have always thought that the polaroplast was the initial cargo in the polar tube and that this formed the limiting membrane of the sporoplasm and nucleus after passage through the polar tube (i.e., the limiting membrane of the sporont).

      In this manuscript, we only analyze the possible topology of the organelles that are relevant for energy dissipation calculations. Our final hypothesis (E-OE-PTPV-ExP) indicates that there is a limiting membrane of the infectious cargo as it passes through the PT, but the energy calculation cannot tell where this membrane comes from. That being said, our final hypothesis is consistent with the common belief that the polaroplast provides the limiting membrane of the sporoplasm, even though our analysis neither proved nor disproved it.

      1. I understand that the model indicates that during eversion the end of the PT moves away from the posterior vacuole allowing the sporoplasm access to the PT lumen, however, I am not clear how this process occurs (although I understand the reason that this model was the best fit for the available data). Does the model distinguish between connected (as in the PV is in the polar tube lumen) to the idea of it being in proximity (i.e. the PT is at the PV at the start of eversion)?

      As we mentioned in our reply to Point 1 of the same reviewer above, "connectivity" simply means whether fluid flow is permitted across the end connections among organelles and sub-spaces within the spores. For Model 5, the content of posterior vacuole can pass to the original PT content and to the external environment post anchoring disc disruption through fluid flow, but not to sporoplasm. However, as the germination progresses, the PT does not have to maintain its spatial proximity or membrane connection to posterior vacuole, as the topological connectivity questions are pertaining to the "original PT content". We clarified this point in Section "Systematic evaluation of possible topological configurations of a spore" in the updated manuscript.

      Reviewer #2 (Recommendations For The Authors):

      1. The connection between the polar tube and the posterior vacuole needs to be analyzed by cryo-EM.

      We thank the reviewer for their comments. This work is underway.

      Reviewer #3 (Recommendations For The Authors):

      1a. As stated in the public review, the explanation and description of the unfolding mechanism should be much better described and associated with clear sketches, magnifying all the areas where the flow shear rate is concentrated (surrounding zone, lubrication inside and outside the spore, etc) and drawing the velocity field, the boundary solid motion and pressure distribution in order to clearly understand, for each model, the dissipation and pressure terms given in figs. S2 and S3.

      In the updated manuscript, we added Figure S4 to enlarge all the regions where fluid shear is considered, with sketches of velocity fields.

      1b. This is particularly important for explaining the eversion models (see comment in the Public Review) but even the "jack-in-the-box" model sketched in Fig. S2 is confusing: Why does the blue tube disappear outside the spore? What happens to the tube in this case?

      The blue tube in the sketch of Model 1 in Fig. S2 is the fluid between the two outermost layers of PT, not the PT itself. We have clarified that in the newly added Fig. S4.

      1. Many ejection mechanisms based on the deployment of invaginated appendages have been described in the literature (e.g. Zuckerkandl Biol. Bull. 1950, Karabulut et al Nat. Com. 2022) and also mimicked for robotic applications (e.g. Hawkes et al Science Robotics 2017). Although this is not the main topic of the paper, it would be very useful if the authors could discuss in the introduction the most acceptable theory for motion generation (eversion driven by an overpressure in the spore?). In the current version, this comes too late in the discussion.

      As we discussed in Section “Lack of biophysical models explaining the microsporidian infection process”, PT eversion is the most widely accepted hypothesis because of experimental evidence (e.g. microscopic observations of PT extrusions, and pulse-labeling of half-ejected tubes). However, whether or not it is driven by an overpressure in the spore remains controversial. In fact, our observations of inwardly buckled spores indicate that the ejection process likely involves negative pressure.

      In our work, we thus take a data-driven approach to generate models for the physical basis of the PT extrusion process, without immediately assuming that eversion is the correct hypothesis. It would therefore not make sense to include an elaborate discussion of other eversion mechanisms in the Introduction.

      1. About the physical constraints, I understand that the stored energy must be the same when the viscosity is changed (by conservation of energy), but what physical basis do you have for requiring that the power and pressure also be the same (lines 295-298)? For e.g. when a spring is stretched and released in a very viscous fluid without inertia, the total energy dissipated is the same whatever the viscosity but the power is not the same. The formulation of the chosen physical constraints should be better justified.

      We thank the reviewer for their feedback. In our updated manuscript, we only use total energy requirement for the ranking, and the peak pressure difference requirement and peak power requirements are calculated just for quantitative reference. The ranking of the 5 hypotheses does not change.
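
      The reviewer's spring example can be checked numerically. The sketch below is only an illustration of that point, not part of the manuscript's model: it assumes an inertia-free (overdamped) linear spring with stiffness k, drag coefficient c and initial stretch x0, all hypothetical values in arbitrary units. For such a spring, x(t) = x0 exp(-k t / c), so the total dissipated energy is k x0^2 / 2 regardless of c, while the peak dissipation power k^2 x0^2 / c scales inversely with c.

```python
import math


def overdamped_release(k, c, x0, n_steps=100_000, t_end_in_tau=20.0):
    """Dissipation of an inertia-free spring released in a viscous fluid.

    Force balance (assumed): c * dx/dt = -k * x, so x(t) = x0 * exp(-t / tau)
    with relaxation time tau = c / k.
    Returns (total energy dissipated, peak dissipation power).
    """
    tau = c / k
    dt = t_end_in_tau * tau / n_steps
    energy = 0.0
    peak_power = 0.0
    prev_power = None
    for i in range(n_steps + 1):
        t = i * dt
        v = -(x0 / tau) * math.exp(-t / tau)  # velocity from the analytic solution
        power = c * v * v                     # instantaneous viscous dissipation
        peak_power = max(peak_power, power)
        if prev_power is not None:
            # trapezoidal rule for the time integral of the power
            energy += 0.5 * (prev_power + power) * dt
        prev_power = power
    return energy, peak_power


if __name__ == "__main__":
    # Same spring, drag coefficient changed by three orders of magnitude.
    for c in (1.0, 1000.0):
        energy, peak = overdamped_release(k=1.0, c=c, x0=1.0)
        print(f"c={c:g}: energy={energy:.6f}, peak power={peak:.6g}")
```

      Under these assumptions the energy is unchanged by the drag coefficient while the peak power is not, which is why a ranking based on total energy alone is the more robust constraint.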

      1. About the mechanism for cargo translocation, authors should explain the physical origin of the hypothetical negative pressure. How could the initial positive pressure become negative?

      As we mentioned in our reply to Point 3 of the same reviewer in the public review, the “required pressure” actually means “required pressure difference”. The same pressure difference can be achieved by either positive pressure (the spore has a higher pressure than the ambient, pushing the fluid into PT) or negative pressure (the PT tip has a lower pressure than the ambient, sucking the fluid from the spore). Hydrodynamic dissipation analysis alone cannot distinguish between positive and negative pressure, as it only gives the required pressure difference between the spore and the polar tube tip. The sign has to be inferred from the implied mechanisms or other evidence. We added this discussion in the 4th paragraph of Section "Developing a mathematical model for PT energetics" in the updated manuscript.

      That being said, our observations of buckled spore walls are sufficient to deduce that the polar tube ejection process is a negative-pressure-driven process. For the spore wall to buckle inwards, the ambient pressure has to be higher than the pressure within the spore, which would contradict the positive pressure hypothesis as elaborated above. We added these clarifications in the 2nd paragraph of Section "Models for the driving force behind cargo expulsion".

      More minor comments:

      1. The videos are amazing but it is not clear if the PT is ejected through a bulk fluid or if the spores (and ejected PT) are in contact with a solid.

      As described in Supplementary Section A.6, purified spores were spotted on a coverslip and the water was allowed to evaporate. Then 2.0 μL of germination buffer (10 mM Glycine-NaOH buffer pH 9.0 and 100 mM KCl) with different concentrations (0%, 0.5%, 1%, 2%, 3%, 4%) of methylcellulose was added to the slide, and the coverslip was placed on top. The spores are therefore attached to the coverslip, and the PT is ejected through a bulk liquid of germination buffer.

      1. S2 caption: please be precise that H is the Heaviside step function.

      We have updated the captions for both Figure S2 and S3 to make it explicit.

      1. Line 233 a pi is missing, no?

      We thank the reviewer for their careful read. We have corrected that.

      1. The notations are quite unfortunate and confusing. In fluid mechanics capital D usually refers to the dissipation, capital C to the drag coefficient. It would be much clearer to call D the dissipation power (in Watt) and P the pressure requirement (in Pa), whatever the mechanism and put the different contribution (drag, lubrication, cytoplasm flow) in subscript.

      We thank the reviewer for their feedback. Choosing the notation for this paper was challenging, as there are many symbols and we tried to keep everything relatively intuitive to readers with either a biology or a physics background. We will keep this feedback in mind in our future work.

      1. Fig S2: what is D (in the formula of the total dissipation power)? Why not use R instead?

      D is the PT diameter, as we mentioned in the caption. We keep that as it is used in the definition of the shape factor.

      1. Fig S3 why the pressure requirement for the "jack-in-the-box" hypothesis is 2\mu (vLf(epsilon)/R^2)?

      We have now elaborated the calculation in SI Section A.9.

      1. Lines 486-497: Although shear thinning fluids have a viscosity that decreases with the shear rate, in most cases the resistance (stress) still increases with speed in these fluids. Is mucin a "velocity-weakening" fluid, i.e. a fluid in which stress decreases when shear rate increases?

      We agree that stress still increases with speed for most shear thinning fluids. The mechanical properties of mucin solution strongly depend on its compositions and buffers. In our discussion, we thus simply mention this possibility without claiming whether mucin (or other biopolymer environment that microsporidia species actually experience in vivo) is a velocity-weakening fluid or not.
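
      The distinction between shear thinning and velocity weakening can be made concrete with a small sketch. It assumes the textbook power-law (Ostwald-de Waele) relation, stress = K * rate^n, which is used here purely for illustration and is not a claim about mucin or about the model in the manuscript; the consistency K and exponent n are hypothetical parameters. In this model the apparent viscosity K * rate^(n - 1) falls with shear rate whenever n < 1, but the stress itself only falls with shear rate when n < 0.

```python
def power_law_stress(shear_rate, consistency, n):
    """Shear stress of an assumed power-law fluid: K * rate**n."""
    return consistency * shear_rate ** n


def apparent_viscosity(shear_rate, consistency, n):
    """Apparent viscosity (stress / rate) of the same fluid: K * rate**(n - 1)."""
    return consistency * shear_rate ** (n - 1.0)


def classify(n):
    """Label the regime implied by the power-law exponent n."""
    if n >= 1.0:
        return "not shear-thinning"
    if n > 0.0:
        return "shear-thinning, stress increases with rate"
    if n == 0.0:
        return "shear-thinning, stress independent of rate"
    return "velocity-weakening, stress decreases with rate"


if __name__ == "__main__":
    for n in (1.0, 0.5, -0.5):
        low = power_law_stress(1.0, 1.0, n)
        high = power_law_stress(10.0, 1.0, n)
        print(f"n={n}: stress {low:.3f} -> {high:.3f} ({classify(n)})")
```

      With these illustrative exponents, n = 0.5 gives a falling viscosity but a rising stress, which is the common shear-thinning case the reviewer describes; only a negative exponent would correspond to a velocity-weakening fluid.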

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this study, the authors investigated the role of MAM and the Notch signaling pathway in the onset of the atrophic phenotype in both in vivo and in vitro models. The rationale used to obtain the data is one of the main strengths of the study. Already from the reading, the reasoning scheme used by the authors in setting up the study and evaluating the data obtained is clear. Using both cellular and mouse models in vivo consolidates the data obtained. The authors also methodologically described all the choices made in the supplementary section. A weakness, on the other hand, is the failure to include averages and statistical data in the results that would give a quantifiable idea of the data obtained. To complete the picture, the authors could also investigate the possible involvement of the intrinsic apoptosis pathway as well as describe probable metabolic shifts to muscle cells in atrophic conditions. The rationale used by the authors to obtain the result is linear. The data obtained are useful for understanding the onset and characterization of the atrophic phenotype under disuse and microgravity conditions. The methods used are in line with those used in the field and can be a starting point for other studies. The cellular models are well described in the Materials and methods section. The selected mouse models followed a logical rationale and were in line with the intended aim.

      We thank this reviewer for comments that have led us to clarify several points.

      Reviewer #1 (Recommendations For The Authors):

      • In order to reinforce and justify the results obtained, I would suggest that the authors include numerical and statistical data in the results obtained.

      Answer) As the reviewer suggested, we have incorporated actual numerical and statistical data into each graph in all figures.

      • With the aim of better framing the picture of an atrophic muscle phenotype caused by microgravity or disuse, I would advise the authors to also focus on the possible involvement of the intrinsic apoptosis pathway. To this end, it would be interesting to assess a possible relationship between MAM and apoptosis. It would be useful to integrate this part into the discussion.

      Answer) It has been shown that suppression of Mfn2 expression attenuates calcium influx into mitochondria during apoptosis-inducing stimuli, thereby inhibiting apoptosis (Martins de Brito & Scorrano, Nature 2008). In our study, however, we found that apoptotic pathways, including Caspase3 and p-AKT, were not significantly altered in differentiated human myocytes by microgravity for 7 days in culture, suggesting that microgravity-induced apoptosis is not an initial pathway to MAM. We have added these data in the new supplementary file 3 and mentioned them in the results.

      • In addition to TA, did the authors investigate what was seen in other muscles impacted by microgravity? If so, I would recommend supplementing what is available or, on the contrary, justifying the exclusivity of the choice of TA.

      Answer) It has been reported that the soleus, a slow-type muscle, is more susceptible than the fast-type tibialis anterior muscle during gravity changes, so it would make more sense for the content of this study to analyze the soleus muscle. However, we chose the tibialis anterior muscle as our target because it provides the most stable results as a site for stem cell transplantation to observe muscle regeneration.

      • The authors affirm that there is an altered distribution and morphology of mitochondria under microgravity conditions. To corroborate this assertion, I would recommend including a morphological image that confirms it.

      Answer) The morphology of mitochondria in cultured myotubes, as observed by mitotracker staining in Figure 4G, varied widely, from finely divided to fused even within a single fiber compared to MFN2-mutated human iPS cells, making it difficult to conclude whether these changes were brought about by microgravity. Therefore, in this study, we have shown that they are reduced in microgravity by the difference in fluorescence intensity of mitotracker, which is directly proportional to mitochondrial activity.

      • It would be interesting if the authors would show whether there are changes in myosin expression or metabolic changes in cells subjected to microgravity and in the cell model with Mfn2 deletion. It would also be interesting to evaluate this in the presence of DAPT.

      Answer) Following the reviewer’s suggestion, we have checked MYH1, MYH3, and MYH7 transcripts in differentiated myotubes under microgravity, with or without DAPT, in the new supplementary file 12. We have added to the Results the data showing that the MYH7 transcript, but not MYH1, was partially recovered.

      A detailed description of the metabolic analyses with myogenic cells cultured in microgravity conditions will be published elsewhere (Sugiura et al., “Mitochondria aconitase is a main target for unloading-mediated mitochondria dysfunction toward muscle atrophy”, in preparation). We have described it in the Materials and methods of the manuscript.  

      Reviewer #2 (Public Review):

      In this study, the authors examined how the maintenance of mitochondrial-associated endoplasmic reticulum membranes (MAM) is critical for the prevention of muscle atrophy under microgravity conditions. They observed a reduction in MAM in myotubes placed in a microgravity condition; in addition, MFN2-deficient human iPS cells showed a decrease in the number of MAM, similar to that in myotubes differentiated under microgravity conditions, together with activation of the Notch signaling pathway. The authors, moreover, observed that treatment with the gamma-secretase inhibitor DAPT preserved the atrophic phenotype of differentiated myotubes in microgravity and improved the regenerative capacity of Mfn2-deficient muscle stem cells in dystrophic mice. The entire study was well conducted, bringing an interesting analysis in vitro and in vivo of aging conditions. In my opinion, it is necessary to improve the analysis of both genes and proteins to better support the conclusions.

      The study can contribute to a better understanding of one of the major problems of aging, such as muscle atrophy and inhibition of muscle regeneration, emphasizing the importance of the NOTCH pathway in these pathological situations. The work will be of interest to all scientists working on aging.

      We thank this reviewer for the positive comments and remarks that we have attempted to address.

      Reviewer #2 (Recommendations For The Authors):

      Results:

      In Figure 1b, the authors observed an increase in the transcripts of MuRF1 and FBXO32 after 7 days of microgravity condition. I suggest investigating the protein expression of these genes to further validate these data.

      Answer) Following the reviewer’s suggestion, we have performed western blotting for atrophic markers in microgravity samples. These data have been added to Figure 1D.

      Moreover, I suggest investigating not only Myogenin as an earlier gene of myotubes formation but also MRF4.

      Methods:

      I suggest when doing real-time PCR not to use a single gene as housekeeping but the average of three genes, to avoid the influence of a single housekeeping gene affecting the results.

      Answer) Following the reviewer’s suggestion, we have investigated MRF4 expression by qPCR experiments with 3 different housekeeping genes (RPL13a, GAPDH, and ACTB). Our experiments showed no significant differences among these three housekeeping genes. We have added these data to Figure 1C and Methods in the manuscript.
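
      The reviewer's suggestion of averaging several housekeeping genes can be illustrated with a minimal sketch of the common 2^-ΔΔCt calculation. This is not necessarily the authors' analysis: the function name, the Ct values, and the assumption of 100% amplification efficiency are all hypothetical and chosen only for illustration.

```python
from statistics import mean

def relative_expression(ct_target, ct_refs, ct_target_ctrl, ct_refs_ctrl):
    """Fold change by 2^-ddCt, normalizing to the mean Ct of several reference genes.

    Averaging Ct values corresponds to the geometric mean of the reference
    quantities, assuming (hypothetically) 100% amplification efficiency.
    """
    d_ct_sample = ct_target - mean(ct_refs)             # dCt in the treated sample
    d_ct_control = ct_target_ctrl - mean(ct_refs_ctrl)  # dCt in the control sample
    return 2.0 ** -(d_ct_sample - d_ct_control)

# Hypothetical Ct values for the three reference genes named in the response.
refs_control = {"RPL13a": 18.0, "GAPDH": 17.0, "ACTB": 16.0}
refs_microgravity = {"RPL13a": 18.0, "GAPDH": 17.0, "ACTB": 16.0}

# Hypothetical target gene Ct: 25.0 in control, 24.0 under microgravity.
fold = relative_expression(
    24.0, list(refs_microgravity.values()),
    25.0, list(refs_control.values()),
)
print(f"fold change: {fold:.2f}")
```

      Using the mean of three reference genes means that a shift in any single housekeeping gene moves the normalization factor by only a third of that shift.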

    1. Author Response

      We thank the reviewers and editor for their careful evaluation of our manuscript, and we appreciate their favorable assessment of our work. Below, we clarify a few points concerning the relationship between our study and previous studies evaluating ligand docking to protein models.

      As reviewer 2 correctly notes, several previous assessments of AF2 models have simply excluded templates above a sequence identity cutoff when using AF2 to predict structures. Such AF2 predictions are still informed by all structures in the PDB before April 30, 2018, because these structures were used to train AF2—that is, to determine the tens of millions of parameters (“weights”) in the AF2 neural network. Machine learning methods nearly always perform better when evaluated on the data used to train them than when evaluated on other data. For this reason, we consider AF2 models only for proteins whose structures were not used to train AF2—that is, for proteins whose structures were not available in the PDB before April 30, 2018.

      Previous papers (including Beuming and Sherman, 2012, https://doi.org/10.1021/ci300411b) have shown a clear correlation between the binding pocket RMSD of a protein model and pose prediction accuracy based on that model. Our main findings are unexpected in light of these previous reports: we find that AF2 models yield pose prediction accuracy similar to that of traditional homology models despite having much better binding pocket RMSDs, and that AF2 models yield substantially worse pose prediction accuracy than experimentally determined structures with different ligands bound despite having similar binding pocket RMSDs.

      Reviewer 2 also correctly notes that previous papers have described AF2 models as “apo models,” because these models do not include coordinates for bound ligands. As noted by the AF2 developers (e.g., https://alphafold.ebi.ac.uk/faq), however, AF2 is designed to predict coordinates of protein atoms as they might appear in the PDB, and AF2 models are thus frequently consistent with structures in the presence of ligands even though those ligands are not included in the models. GPCR structures in the PDB, including those used to train AF2, nearly always contain a ligand in the orthosteric binding pocket. An AF2 model of a GPCR should thus not be viewed as an attempt to predict the GPCR’s structure in the unliganded (apo) state.

      Finally, we did not apply flexible docking in this study because previous work has found that standard flexible docking protocols typically improve pose prediction performance only when given prior information on which amino acid residues to treat as flexible. For example, previous studies that performed successful flexible docking to AF2 models generally used prior knowledge of the ligand’s experimentally determined binding pose to identify the residues to treat as flexible.

    1. Author Response

      Reviewer #3 (Public Review):

      Summary:

      The manuscript from Tariq and Maurici et al. presents important biochemical and biophysical data linking protein phosphorylation to phase separation behavior in the repressive arm of the Neurospora circadian clock. This is an important topic that contributes to what is likely a conceptual shift in the field. While I find the connection to the in vivo physiology of the clock to be still unclear, this can be a topic handled in future studies.

      Strengths: The ability to prepare purified versions of unphosphorylated FRQ and P-FRQ phosphorylated by CK-1 is a major advance that allowed the authors to characterize the role of phosphorylation in structural changes in FRQ and its impact on phase separation in vitro.

      Weaknesses: The major question that remains unanswered from my perspective is whether phase separation plays a key role in the feedback loop that sustains oscillation (for example by creating a nonlinear dependence on overall FRQ phosphorylation) or whether it has a distinct physiological role that is not required for sustained oscillation.

      The reviewer raises the key question regarding data suggesting LLPS and phase-separated regions in circadian systems. To date, condensates have been seen in cyanobacteria (Cohen et al, 2014, Pattanayak et al, 2020) where there are foci containing KaiA/C during the night, in Drosophila (Xiao et al, 2021) where PER and dCLK colocalize in nuclear foci near the periphery during the repressive phase, and in Neurospora (Bartholomai et al, 2022) where the RNA binding protein PRD-2 sequesters frq and ck1a transcripts in perinuclear phase-separated regions. Because the proteins responsible for the phase separation in cyanobacteria and Drosophila are not known, it is not possible to seamlessly disrupt the separation to test its biological significance (Yuan et al, 2022), so only in Neurospora has it been possible to associate loss of phase separation with clock effects. There, loss of PRD-2, or mutation of its RNA-binding domains, results in a ~3 hr period lengthening as well as loss of perinuclear localization of frq transcripts. A very recent manuscript (Xie et al., 2024) calls into question both the importance and very existence of LLPS of clock proteins, at least as regards mammalian cells, noting that it may be an artefact of overexpression in some places where it is seen, and that at normal levels of expression there is no evidence for elevated levels at the nuclear periphery. Artefacts resulting from overexpression plainly cannot be a problem for our study nor for Xiao et al. 2021, as in both cases the relevant clock protein, FRQ or PER, was labeled at the endogenous locus and expressed under its native promoter. Also, it may be worth noting that although we called attention to enrichment of FRQ[NeonGreen] at the nuclear periphery, there remained abundant FRQ within the core of the nucleus in our live-cell imaging.

      Cohen SE, et al.: Dynamic localization of the cyanobacterial circadian clock proteins. Curr Biol 2014, 24:1836–1844, https://doi.org/10.1016/j.cub.2014.07.036.

      Pattanayak GK, et al.: Daily cycles of reversible protein condensation in cyanobacteria. Cell Rep 2020, 32:108032, https://doi.org/10.1016/j.celrep.2020.108032.

      Xiao Y, Yuan Y, Jimenez M, Soni N, Yadlapalli S: Clock proteins regulate spatiotemporal organization of clock genes to control circadian rhythms. Proc Natl Acad Sci U S A 2021, 118, https://doi.org/10.1073/pnas.2019756118.

      Bartholomai BM, Gladfelter AS, Loros JJ, Dunlap JC. 2022 PRD-2 mediates clock-regulated perinuclear localization of clock gene RNAs within the circadian cycle of Neurospora. Proc Natl Acad Sci U S A. 119(31):e2203078119. doi: 10.1073/pnas.2203078119.

      Yuan et al., Curr Opin Cell Biol 78: 102129, 2022. https://doi.org/10.1016/j.ceb.2022.102129

      Pancheng Xie, Xiaowen Xie, Congrong Ye, Kevin M. Dean, Isara Laothamatas, S K Tahajjul Taufique, Joseph Takahashi, Shin Yamazaki, Ying Xu, and Yi Liu (2024). Mammalian circadian clock proteins form dynamic interacting microbodies distinct from phase separation. Proc. Nat. Acad. Sci. USA. In press.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers and editors for their time and careful consideration of this study. Nearly every comment proved to be highly constructive and thoughtful, and as a result, the manuscript has undergone major revisions including the title, all figures, associated conclusions and web app. We feel that the revised resource provides a more systematic and comprehensive approach to correlating inter-individual transcript patterns across tissues for analysis of organ cross-talk. Moreover, the manuscript has been restructured to highlight utility of the web tool for queries of genes and pathways, as opposed to focused discrete examples of cherry-picked mechanisms. A few key revisions include:

      • Manuscript: All figures have been revised to explore broad pathway representation. These analyses have replaced the previous circadian and muscle-hippocampal figures to emphasize the ability to recapitulate known physiology and to remove the discovery portion, which has not been validated experimentally.

      • Manuscript: The term “genetic correlation” or “genetically-derived” has been replaced throughout with “transcriptional”, “inter-individual”, or mostly just “correlations”.

      • Manuscript: A new figure (revised fig 2) has been added to evaluate the innate correlation structure of data used for common metabolic pathways, in addition to an exploration of which tissues generally show more co-correlation and centrality among correlations.

      • Manuscript: A new figure (revised fig 4) has been added to highlight the utility of exploring gene ~ trait correlations in mouse populations, where controlled diets can be compared directly. These analyses highlight sex hormone receptor correlations with the large number of available clinical traits, which differ entirely depending on the tissue of expression and/or diet in mouse populations.

      • Web tool: Addition of a mouse section to query expression correlations among diverse inbred strains and associated traits from chow or HFHS diet within the hybrid mouse diversity panel.

      • Web tool: Overrepresentation analysis for pathway enrichments has been replaced with score-based gene set enrichment analyses, now including network topology views for GSEA outputs.

      • Web tool: The associated GitHub repository containing scripts for the apps now includes a detailed walk-through of the interface and definitions for each query and term.

      Public Reviews:

      Reviewer #1 (Public Review):

      Zhou et al. have set up a study to examine how metabolism is regulated across the organism by taking a combined approach looking at gene expression in multiple tissues, as well as analysis of the blood. Specifically, they have created a tool for easily analyzing data from GTEx across 18 tissues in 310 people. In principle, this approach should be expandable to any dataset where multiple tissues of data were collected from the same individuals. While not necessary, it would also raise my interest to see the "Mouse(coming soon)" selection functional, given that the authors have good access to multi-tissue transcriptomics done in similarly large mouse cohorts.

      Summary

      The authors have assembled a web tool that helps analyze multiple tissues' datasets together, with the aim of identifying how metabolic pathways and gene regulation are connected across tissues. This makes sense conceptually and the web tool is easy to use and runs reasonably quickly, considering the size of the data. I like the tool and I think the approach is necessary and surprisingly under-served; there is a lot of focus on multi-omics recently, but much less on doing a good job of integrating multi-tissue datasets even within a single omics layer.

      What I am less convinced about is the "Research Article" aspect of this paper. Studying circadian rhythm in GTEx data seems risky to me, given the huge range in circadian clock in the sample collection. I also wonder (although this is not even remotely in my expertise) whether the circadian rhythm also gets rather desynchronized in people dying of natural causes - although I suppose this could be said for any gene expression pathway. Similarly, for the secreted proteins in Figure 4, looking at muscle-hippocampus transcript levels for ADAMTS17 doesn't make sense to me - of all tissue pairs to make a vignette about to demonstrate the method, this is not an intuitive choice to me. The "within muscle" results look fine but panels C-E-G look like noise to me...especially panels C and G are almost certainly noise, since those are pathways with gene counts of 2 and 1 respectively.

      I think this is an important effort and a good basis but a significant revision is necessary. This can devote more time and space to explaining the methodology and for ensuring that the results shown are actually significant. This could be done by checking a mix of negative controls (e.g. by shuffling gene labels and data) and a more comprehensive look at "positive" genes, so that it can be clearly shown that the genes shown in Fig 1 and 2 are not cherry-picked. For Figure 3, I suspect you would get almost an identical figure if instead of showing pan-tissue circadian clock correlations, you instead selected the electron transport chain, or the ribosome, or any other pathway that has genes that are expressed across all tissues. You show that colon and heart have relatively high connectivity to other tissues, but this may be common to other pathways as well.

      Response: We are thankful to the reviewer for their detailed assessment of the manuscript. The comments raised in both the public and suggested reviews clearly improved the revised study and helped to identify limitations. In general, we have removed data suggesting “discovery” using these generalized analyses, such as removing figures evaluating circadian rhythm genes and muscle-hippocampus correlations. These have been replaced with more thorough investigations of tissue correlation structure and potentially identified regions of data sparsity which are important for users to consider. Also, we have added a similar full detailed pipeline of mouse (HMDP) data and highlighted it in the manuscript by showing transcript ~ trait correlations of sex hormone receptor genes which differ between organs and diets. Further responses to individual points are also provided below.

      Reviewer #2 (Public Review):

      Summary:

      Zhou et al. use publicly available GTEx data of 18 metabolic tissues from 310 individuals to explore gene expression correlation patterns within-tissue and across-tissues. They detect signatures of known metabolic signaling biology, such as ADIPOQ's role in fatty acid metabolism in adipose tissue. They also emphasize that their approach can help generate new hypotheses, such as the colon playing an important role in circadian clock maintenance. To aid researchers in querying their own genes of interest in metabolic tissues, they have developed an easy-to-use webtool (GD-CAT).

      This study makes reasonable conclusions from its data, and the webtool would be useful to researchers focused on metabolic signaling. However, some misconceptions need to be corrected, as well as greater clarification of the methodology used.

      Strengths:

      GTEx is a very powerful resource for many areas of biomedicine, and this study represents a valid use of gene co-expression network methodology. The authors do a good job of providing examples confirming known signaling biology as well as the potential to discover promising signatures of novel biology for follow-up and future studies. The webtool, GD-CAT, is easy to use and allows researchers with genes and tissues of interest to perform the same analyses in the same GTEx data.

      Weaknesses:

      A key weakness of the paper is that this study does not involve genetic correlations, which is used in the title and throughout the manuscript, but rather gene co-expression networks. The authors do mention the classic limitation that correlation does not imply causation, but this caveat is even more important given that these are not genetic correlations. Given that the goal of their study aligns closely with multi-tissue WGCNA, which is not a new idea (e.g., Talukdar et al. 2016; https://doi.org/10.1016/j.cels.2016.02.002), it is surprising that the authors only use WGCNA for its robust correlation estimation (bicor), but not its latent factor/module estimation, which could potentially capture cross-tissue signaling patterns. It is possible that the biological signals of interest would be drowned out by all the other variation in the data but given that this is a conventional step in WGCNA, it is a weakness that the authors do not use it or discuss it.

      Response: Thank you for the helpful and detailed suggestions regarding the study. The review raised some important points regarding methodological interpretations (ex. bicor-exclusive application as opposed to module-based approaches), as well as clarification of “genetic” inferences throughout the study. The comparison to module-based approaches has also now been discussed directly, pointing out considerations and advantages of each. We hope that the reviewer is satisfied with our corrections to the misconceptions noted, many of which we feel were due to our insufficient description of methodological details and underlying interpretations. The revised manuscript, web portal and associated GitHub repository provide much more detail, and many more responses to specific points are provided below.
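
      For readers unfamiliar with bicor (biweight midcorrelation), the robust correlation measure discussed above, the following is a minimal pure-Python sketch of its standard definition. It is not the WGCNA implementation used by the authors: the function names and the example vectors are hypothetical, and the zero-MAD case simply raises an error here rather than handling it as a production library would.

```python
import math
from statistics import median

def _biweight_normalized(values):
    """Return median-centered, biweight-weighted values scaled to unit norm."""
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; bicor is undefined in this sketch")
    terms = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        # Points with |u| >= 1 (far from the median) get weight zero.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * w)
    norm = math.sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]

def bicor(x, y):
    """Biweight midcorrelation of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a * b for a, b in zip(_biweight_normalized(x), _biweight_normalized(y)))

# Hypothetical expression values; the last entry of y is an extreme outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 2, 3, 4, 5, 6, 7, 1000]
print(bicor(x, x), bicor(x, y))
```

      Because the outlier receives zero weight, it cannot dominate the estimate the way it would in a Pearson correlation, which is the usual motivation for preferring bicor on expression data.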

      Reviewer #3 (Public Review):

      Summary: A useful and potentially powerful analysis of gene expression correlations across major organ and tissue systems that exploits a subset of 310 humans from the GTEx collection (subjects for whom there are uniformly processed postmortem RNA-seq data for 18 tissues or organs). The analysis is complemented by a Shiny R application web service.

      The need for more multisystems analysis of transcript correlation is very well motivated by the authors. Their work should be contrasted with more simple comparisons of correlation structure within different organs and tissues, rather than actual correlations across organs and tissues.

      Strengths and Weaknesses: The strengths and limitations of this work trace back to the nature of the GTEx data set itself. The authors refer to the correlations of transcripts as "gene" and "genetic" correlations throughout. In fact, they name their web service "Genetically-Derived Correlations Across Tissues". But all GTEx subjects had strong exposure to unique environments and all correlations will be driven by developmental and environmental factors, age, sex differences, and shared and unshared pre- and postmortem technical artifacts. In fact we know that the heritability of transcript levels is generally low, often well under 25%, even in studies of animals with tight environmental control.

      This criticism does not materially detract from the importance and utility of the correlations-whether genetic, GXE, or purely environmental-but it does mean that the authors should ideally restructure and reword text so as to NOT claim so much for "genetics". It may be possible to incorporate estimates of chip heritability of transcripts into this work if the genetic component of correlations is regarded as critical (all GTEx cases have genotypes).

      Appraisal of Work on the Field: There are two parts to this paper: 1. "case studies" of cross-tissue/organ correlations and 2. the creation of an R/Shiny application to make this type of analysis much more practical for any biologist. Both parts of the work are of high potential value, but neither is fully developed. My own opinion is that the R/Shiny component is the more important immediate contribution and that the "case studies" could be placed in the context of a more complete primer. Alternatively, the case studies could be their own independent contributions with more validation.

      Response: We thank the reviewer for their supportive and helpful comments. The term “genetic” has been removed entirely from the manuscript, as this point was made by all reviewers. Further, we have revised the previous study to focus on more detailed investigations of why transcript isoforms seemed correlated between tissues and of areas where datasets are insufficient to provide adequate information (ex. Kidney in GTEx). As the reviewer points out, the previous “case studies” were unvalidated and incomplete and, as a result, have been replaced. Additional points below have been revised to present a more comprehensive analysis of transcript correlations across tissues and an improved web tool.

      (Recommendations For The Authors):

      As this manuscript is focused on the analytical process rather than the biological findings, the reviewer concerns are not a fundamental issue to subsequent acceptance of the paper, but some of the examples will need to be replaced or double-checked to ensure their biological and statistical relevance. To raise the scope and interest of the method developed, it would be seen very positively to include additional datasets, as the authors seem to have intended to have done, with a non-functional (and highlighted as such) selection for mouse data. Establishing that the authors can easily - and will easily - add additional datasets into their tool would greatly raise the reviewers' confidence in the methodology/resource aspect of this paper. This may also help address the significant concerns that all three reviewers raised with the biological examples, e.g. that GTEx data is so uncontrolled that studying environmentally-influenced traits such as circadian rhythm may be challenging or even impossible to do properly. Adding in a more highly controlled set of cross-tissue mouse data may be able to address both these concerns at once, i.e. the resource concern (can the website easily be updated with new data) and the biological concern (are the results from these vignettes actually statistically significant).

      Reviewer #1 (Recommendations For The Authors):

      Comments, in approximately reverse order of importance

      1. Some figure panels are not referenced in the text, e.g. Fig 1B and Figure 2E.

      Response: Thank you for pointing this out. We have revised every figure in the manuscript and additionally gone through to make sure every panel is referenced in the text.

      2. The authors mention "genetic data" several times but I don't see anything about DNA. By "genetic data" do you mean "transcriptome expression data," or something else?

      Response: This is an important point, also raised by all 3 reviewers. We have clarified in the abstract, results and discussion that correlations are between transcripts. As a result, all mentions of “genetics” or “genetic data” have been removed, with the exception of introducing mouse genetic reference panels.

      3. For Figure 3, the authors look at circadian clock data, but the GTEx data is from all sorts of different times of day from across the patient cohort depending on when the donor died, and I don't see this metadata actually mentioned anywhere. I see Arntl Clock and all the other circadian genes are highly coexpressed in each tissue (except not so strong in liver) but correlation across tissue seems more random. Also hypothalamus seems to be very strongly negatively correlated with spleen, but this large green block doesn't have significance? That is surprising to me, since the sample sizes are all equivalent I would expect any correlation remotely close to -1.0 to be highly significant.

      Response: The reviewer raises several important points with regard to the source of data and underlying interpretations. We have added a revised Fig 2, suggesting that representation of gene expression between tissues can be strongly biased by the nature of the samples (ex. differences in the data that are available for each tissue), and have also discussed considerations of sample origin in the limitations section. We have also used some of these points when introducing the rationale for using mouse population data. As a result of comments from this reviewer and others, we have removed the circadian rhythm analysis and muscle-hippocampal figures from the revised study; however, we specifically mention these cohort differences in the discussion section (lines 294-298). Circadian rhythm terms are also evaluated in Fig 2 and, consistent with the reviewer’s concerns, fewer overall correlations are observed between transcripts across tissues when compared to other common GO terms assessed.
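
      The dependence of significance on the number of paired samples, which underlies this exchange, can be sketched with the textbook Fisher z approximation for a correlation coefficient. This is a generic illustration, not the bicor p-value computation used in the study; the function name and the example values of r and n are hypothetical.

```python
import math

def fisher_z_pvalue(r, n):
    """Approximate two-sided p-value for a correlation r from n paired samples.

    Uses the Fisher z-transform: atanh(r) * sqrt(n - 3) is treated as a
    standard normal deviate under the null hypothesis of zero correlation.
    """
    if n <= 3:
        raise ValueError("need more than 3 paired samples")
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))

# Same hypothetical correlation, very different numbers of paired samples.
p_large = fisher_z_pvalue(-0.5, 310)  # e.g. a tissue pair sampled in every individual
p_small = fisher_z_pvalue(-0.5, 10)   # e.g. a tissue pair with few shared donors
print(p_large, p_small)
```

      Under these assumptions, the same correlation is overwhelmingly significant with hundreds of paired samples but not with ten, which is one way uneven data availability between tissues can produce strong but non-significant blocks.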

      4. Figure 4, this is all transcript-level data, so it is confusing to see protein nomenclature used, e.g. "expression of muscle ADAMTS17" should be "expression of muscle ADAMTS17" (ADAMTS17 the transcript should be in italics, in case the formatting is removed by the eLife portal). Same for FNDC5. In the figures you do have those in italics, so it is just an issue in the manuscript text. In general please look through the text and make sure whether you are referring really to a "gene," "transcript," or "protein." For instance, Figure 1 legend I think should be "A, All transcripts across the ... with local subcutaneous and muscle transcript expression." I know people still sometimes use "gene expression" to refer to transcripts, but now that proteomics is pretty mainstream, I would push for more careful vocabulary here.

      Response: Thank you for pointing these out. While we have replaced Fig 4 entirely so as to limit the unvalidated discovery or research aspects of the paper, we have gone through the text and figures to check that the correct formatting is used for references to human genes (capitalized italics) or the newly-included mouse genes (lower-case italics).

      5. "Briefly, these data were filtered to retain genes which were detected across individuals where individuals were required to show counts > 0 in 1.2e6 gene-tissue combinations across all data." I don't quite understand the filtering metric here - what is 1.2 million gene-tissue combinations referring to? 20k genes times 18 tissues times 310 people is ~100 million measurements, but for a given gene across 310 people * 18 tissues that is only ~6000 quantifications per gene.

      Response: We apologize for this oversight, as the numbers were derived from the whole GTEx dataset in total and not the tissues used for the current study. We have clarified this point in the revised manuscript (methods section in Datasets used) and also removed confusing references to specific numbers of transcripts and tissues unless made clear.
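
      The reviewer's back-of-the-envelope numbers can be checked with a few lines of arithmetic. The gene count of 20,000 is the reviewer's round figure, and the detection filter shown afterwards is purely hypothetical (the function name and threshold are illustrative assumptions, not the authors' actual criterion).

```python
# Figures quoted in the review: ~20k genes, 18 tissues, 310 individuals.
n_genes = 20_000
n_tissues = 18
n_individuals = 310

total_measurements = n_genes * n_tissues * n_individuals
per_gene_measurements = n_tissues * n_individuals
print(total_measurements)     # all gene x tissue x individual values
print(per_gene_measurements)  # values available for any single gene

def keep_gene(counts, min_fraction_detected):
    """Hypothetical filter: keep a gene if enough of its measurements are nonzero."""
    detected = sum(1 for c in counts if c > 0)
    return detected / len(counts) >= min_fraction_detected

# Toy example: a gene detected in 3 of 4 measurements, with a 50% threshold.
print(keep_gene([0, 12, 7, 3], min_fraction_detected=0.5))
```

      The totals come to roughly 112 million measurements overall and 5,580 per gene, so a threshold of 1.2e6 cannot apply to a single gene within this 18-tissue subset, consistent with the authors' explanation that the number was derived from the full GTEx dataset.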

      1. Generally I think your approach makes sense conceptually but... for the specific example used in e.g. figure 4, this only makes sense to me if applied to proteins and not to transcripts. Looking at the transcript levels per tissue for genes which are secreted could be interesting but this specific example is confusing, as is the tissue selected. I would not really expect much crosstalk between the hippocampus and the muscle, especially not in terms of secreted proteins.

      Response: This is a valid point, also raised by other reviewers. While we wanted to highlight the one potentially-new (ADAMTS7) and two established proteins (FNDC5 and ERFE) and their correlations, the fact that this direct circuit remains to be validated led us to replace the figure entirely. The point raised about inference of protein secretion compared to action, however, has been expanded upon in the results and discussion. We now show that complexities arise when using this approach to infer mechanisms of proteins which are primarily regulated post-transcriptionally. We provide a revised Supplemental Fig 4 showing that this general framework, when applied to expression of INS (insulin), almost exclusively captured pathways leading to its secretion and not action.

      1. It's not clear to me how correction for multiple testing is working in the analyses used in this manuscript. You mention q-values so I am sure it was done, I just don't see the precise method mentioned in the Methods section.

      Response: We apologize for this oversight and have included a specific mention of qvalue adjustment using BH methods, where our reasoning was the efficiency in run-time (compared to other qvalue methods). In addition, we provide a revised Fig 2 which suggests that innate correlation structure exists between tissues for a variety of pathways which should be considered. We also compare several empirical bicor pvalues and qvalue adjustments directly between these large pathways where much of the innate tissue correlation structure does appear present when BH qvalue adjustments are applied (revised Fig 2A).
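For readers unfamiliar with the Benjamini-Hochberg (BH) procedure named in this response, the following is a minimal pure-Python sketch of the adjustment. It is an illustration under stated assumptions, not the authors' code: the function name `bh_adjust` is hypothetical, and the input is assumed to be a plain list of p-values from one family of tests.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted values, in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each is the minimum of p * n / rank over all larger ranks.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four gene-tissue correlations.
example = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

The procedure needs one sort and one pass over the p-values, which is in line with the run-time argument given in the response.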

      1. The piecharts in Figure 1 are interesting - I would actually be curious which tissues generally have closer coexpression. This would be an absolutely massive number of pairwise correlations to test, but maybe there is a smarter way to do it? For instance, for ADIPOQ, skeletal muscle has the best typical correlation, but would that be generally true just that many adipose genes have closer relationship between the two tissues?

      Response: This comment inspired us to perform a more systematic query of global gene-gene correlation structures, which is now shown as the revised Fig 2A. With respect to ADIPOQ, the reviewer is correct in that there does appear to be a general pattern of muscle genes showing stronger correlation with adipose genes. We emphasize and discuss this in the revised manuscript to point out that global trends of tissue correlation structure should be taken into account when looking at specific genes. Much of this innate co-correlation structure could be normalized by the BH qvalue adjustment (above); however, strongly correlated pathways like mitochondria showed selective patterns throughout thresholds (revised Fig 2A). Further, we analyze KEGG terms and general correlation structures (revised Fig 2B) to point out the converse, that some tissues are just poorly represented. Interpretation of correlated genes from these organ and pathway combinations should be especially considered in the framework that their poor representation in the dataset clearly impacted the global correlation structures. We have added these points to both results and discussion. In sum, we feel that this was a critical point to explore and attempted to provide a framework to identify and consider it in the revised manuscript.

      1. The pathway enrichments in Figure 1 are more difficult for me to interpret, e.g. for ADIPOQ, the scWAT pathways make sense, but the enriched skeletal muscle pathways are less clearly relevant (rRNA processing?? Not impossible but no clear relevance either). What are the significances for these pathway enrichments? Is it even possible to select a gene that has no peripheral pathway enrichment, e.g. if you take some random Gm#### or olfactory receptor gene and run the analysis, are you also going to see significant pathways selected, as pathway enrichment often has a trend to overfit? The "within organ" does seem to make sense, but I am also just looking at 4 anecdotes here and it is unclear whether they are cherry picked because they did make sense. That is, it's unclear why you selected ADIPOQ and not APOE or HMGCR or etc. I also don't figure out how I can make these pathway enrichment plots using your website. I do get the pie chart but when I try the enrichment analysis block (NB: typo on your website, it says "Enrich-E-ment Analysis" with an extra E) I always get that "the selected tissue do not contain enough genes to generate positive the enrichment." (Also two typos in that phrase; authors should check and review extensively for improvements to the use of English.) After trying several genes I eventually got it to work. I think there is some significant overfitting here, as I am pretty sure that XIST expression in the white adipose tissue has nothing to do with olfactory signalling pathways, which are the top positive network (but with an n = 4 genes).

      Response: Several good points within this comment. 1) The pathway enrichments have been revised completely. The reviewer provided a helpful suggestion of a rank-based approach to query pathways, as opposed to the previous over-representation tests. After evaluating several different pathway enrichment tools based on correlated tissue expression transcripts, a rank- and weight-based test (GSEA) captured the most physiologic pathways observed from known actions of select secreted proteins. Therefore, revised pathway enrichments and web-tool queries utilize a GSEA approach which accounts for the rank and weight determined by correlation coefficient. In implementing these new pathway approaches, we feel that pathway terms perform significantly better at capturing mechanisms. 2) With respect to the selection of genes, we wanted to provide a framework for investigating genes which encode secreted proteins that signal as a result of the abundance of the protein alone. This is a group bias, however, and not necessarily reflective of trying to tackle the most important physiologic mechanisms underlying human disease. We agree with the reviewer that evaluating genes such as APOE and cholesterol synthesis enzymes presents an exciting opportunity, but our expertise in interpretation and mechanistic confirmation is limited. 3) We have gone through the revised manuscript and attempted to correct all grammatical and/or spelling mistakes.

      1. The network figures I get on your website look actually more interesting than the ones you have in Figure 2, which only stay within a tissue. Making networks within a tissue is pretty easy I think for any biologist today, but the cross-tissue analysis is still fairly hard due to the size of the datasets and correlation matrices.

      Response: We greatly appreciate the reviewer’s enthusiasm for the network model generation aspect. We have tried to improve the figure generation and expanded the gene size selection for network generation in the web tool, both within and across tissues. We are working toward allowing users to select specific pathway terms and/or tissue genes to include in these networks as well, but will need more time to implement.

      1. I get a bug with making networks for certain genes, e.g. XIST - Liver does not work for plotting network graphs. Maybe XIST is a suppressed gene because it has zero expression in males? It is an interesting gene to look at as a "positive control" for many analyses, since it shows that sample sexing is done correctly for all samples.

      Response: The reviewer recognized a key consideration in underlying data structure for GTEx. In the revised manuscript, we evaluated tissue representation (or lack thereof) being a crucial factor in driving where significant relationships cannot be observed in tissues such as kidney, liver and spleen (Fig 2). Moreover, the representation of females (self-reported) in GTEx is less than half that of males (100 compared to 210 individuals). We have emphasized this point in the discussion where we specifically pointed out the lack of XIST Liver correlation being a product of data structure/availability and not reflecting real biologic mechanisms. We expanded on this point by highlighting the clear sex-bias in terms of representation.

      1. On the network diagram on your website, there doesn't seem to be any way to zoom in on the website itself? You can make a PDF which is nice but the text is often very small and hard to read.

      Response: We have revised the web interface plot parameters to create a more uniform graph.

      1. On a related note, is it possible to output the raw data and gene lists for the network graph? I would want to know what are those genes and their correlation coefficient.

      Response: We have enabled export as .pdf or .svg graphics for the network and all plots. In addition, following pie chart generation at the top of the web app, users now have the ability to download a .csv file containing the bicor coefficients, regression pvalues and adjusted qvalues for all other gene-tissue combinations.

      1. Some functionality issues, e.g. on the "Scatter plot" block, I input a gene name again here. Shouldn't this use the same gene selected already at the top of the page? It seems confusing to again select the gene and tissue here, but maybe there is a reason for that.

      Response: It would be more intuitive to only display genes from a given selected tissue for scatterplots; however, we chose to keep all possible combinations with the [perhaps unnecessary] option of reselecting a tissue to allow users to query any specific gene without having to wait to run the pathways for all that correspond to a given tissue.

      1. Figure 4H should also probably be Figure 1A.

      Response: Good point; the revised Fig 1A is now a summary of the web tool.

      I realize I have written a fairly critical review that will require most of the figures to be redone, but I think the underlying method is sound and the implementation by an end-user is quite simple, so I think your group should have no trouble addressing these points.

      Response: Your comments were really helpful and we feel that the tool has significantly improved as a result. We are thankful for the time and effort put toward helping here.

      Reviewer #2 (Recommendations For The Authors)

      Comments on the use of "genetic correlation"

      • The use of "genetic correlation" in title and throughout the manuscript is misleading. Should broadly be replaced with "gene expression correlation". Within genetics, "genetic correlation" generally refers to the correlation between traits due to genetic variation, as would be expected under pleiotropy (genetic variation that affects multiple traits). Here, I think the authors are somewhat conflating "genetic" (normally referring to genetic variation) with "gene" (because the data are gene expression phenotypes). I don't think they perform any genetic analysis in the manuscript. I hope I don't sound too harsh. I think the paper still has merit and value, but it is important to correct the terminology.

      Response: This was an important clarification raised by all reviewers. We apologize for the oversight. As a result, all mentions of “genetics” or “genetic data” have been removed, with the exception of introducing mouse genetic reference panels. These have generally been replaced with “transcript correlations”, “correlations” or “correlations across individuals” to avoid confusion.

      • The authors note an important limitation in the Discussion that correlations don't imply a specific causal model between two genes, and furthermore note that statistical procedures (mediation and Mendelian randomization) are dependent on assumptions and really only a well-designed experiment can completely determine the relationship. This is a very important point that I greatly appreciate. I think they could even further expand this discussion. The potential relationships between gene A and gene B are more complex than causal and reactive. For example, a genetic variant or environmental exposure could regulate a gene that then has a cascade of effects on other genes, including A and B. They belong to a shared causal pathway (and are potentially biologically interesting), but it's good to emphasize that correlations can reflect many underlying causal relationships, some more or less interesting biologically.

      Response: We thank the reviewer for pointing this out. We have expanded both the results and discussion sections to mention specifically how correlation between two genes can be due to a variety of factors, often extending beyond their direct relationship. We also mention the importance of considering genetic and environmental variables in these relationships, which we feel will be an important “take-home message” for the reader. These points were also explored in the revised Fig 2 in terms of investigating broad pathway gene-gene correlation structures. As noted by the reviewer, contexts such as circadian rhythm or other variables in the data which are not fixed show much less overall significance in terms of broad relationships across organs.

      • It would be good for the authors to provide more context for the methods they use, even when they are fully published. For example, stating that biweight midcorrelation (bicor) is an approach for comparing two variables that is more robust to outliers than traditional correlations and is commonly used with gene co-expression correlation.

      Response: Thank you for pointing this out. A lack of method description was also an important reason for lack of clarity on other aspects, so we have done our best to detail what exact approaches are being implemented and why. In the revised manuscript, we mention the usage of bicor values to limit influence of outlier individuals in driving regressions, but also point out that it is still a generalized linear model to assess relationships. We hope that the revised methods and expanded git repositories which detail each analysis provide much more transparency on what is being implemented.
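To make the robustness argument concrete, here is a small pure-Python sketch of biweight midcorrelation alongside Pearson correlation. It follows the standard bicor definition (deviations from the median, scaled by 9 times the raw median absolute deviation, with zero weight beyond that range). It is not the WGCNA implementation the authors use: the function names are hypothetical, and this sketch raises an error when the MAD is zero rather than applying any fallback.

```python
from math import sqrt
from statistics import mean, median


def _biweight_terms(values):
    """Median-centred, biweight-weighted, unit-norm version of a vector."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # raw (unscaled) MAD
    if mad == 0:
        raise ValueError("MAD is zero; this sketch has no fallback")
    terms = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        # Observations further than 9 MADs from the median get weight 0.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * w)
    norm = sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]


def bicor(x, y):
    """Biweight midcorrelation of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a * b for a, b in zip(_biweight_terms(x), _biweight_terms(y)))


def pearson(x, y):
    """Ordinary Pearson correlation, for comparison."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


# Hypothetical expression values for ten individuals; the last value of y
# is a single extreme outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, -100.0]
print("pearson:", pearson(x, y))
print("bicor:  ", bicor(x, y))
```

In this toy example the single outlier makes the Pearson coefficient negative, while bicor assigns that individual zero weight and stays strongly positive.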

      • Performing a similar analysis based on genetic correlation is an interesting idea, as it would potentially simplify the underlying causal models (removing variation that doesn't stem from genetic variants). I don't expect the authors to do this for this paper because it would be a significant amount of work (fitting and testing genetic correlations are not as straightforward). But still, an interesting idea to think about, and individuals in GTEx are genotyped I believe. Could be mentioned in the Discussion.

      Response: Absolutely. While we did not implement any models of genetic correlation (despite misusing the term) in this analysis, we have added to the discussion how, when genetic data are available, these approaches offer another way to tease out potentially causal interactions among the large amount of correlated data occurring for a variety of reasons.

      Comments on use of the term "local" and "regression"

      • "Local" is largely used to mean within-tissue, so how correlated gene X in tissue Y is with other genes in tissue Y. I think this needs to be defined explicitly early in the manuscript or possibly replaced with something like "within-tissue".

      Response: We have replaced all “local” mentions with “within-tissue” or simply name the tissue in which the gene is expressed, to avoid confusion with other meanings of local (e.g., a transcript in proximity to where it is encoded on the genome).

      • "Regression" is also used frequently throughout, often when I think "correlation" would be more accurate. It's true that the regression coefficient is a function of the correlation between X and Y, but I don't think actual regression (the procedure) applies here. The coefficients being used are bicor, which I don't think relates as cleanly to linear regression.

      Response: Thank you for pointing this out. A lack of method description was also an important reason for lack of clarity on other aspects, so we have done our best to detail what exact approaches are being implemented and why. In the revised manuscript, we mention the usage of bicor values to limit influence of outlier individuals in driving correlations, but also point out that it is still a generalized linear model to assess relationships. Further, we have removed usage of “regression” when referencing bicor values. We hope that the revised methods and expanded git repositories which detail each analysis provide much more transparency on what is being implemented.

      • "Further, pan-tissue correlations tend to be dominated by local regressions where a given gene is expressed. This is due to the fact that within-tissue correlations could capture both the regulatory and putative consequences of gene regulation, and distinguishing between the two presents a significant challenge" (lines 219-223). This sentence includes both "local" and "regressions" (and would be improved by my suggested changes I think), but I also don't fully understand the argument of "regulatory and putative consequences". I think the authors should elaborate further. In the examples, the within-tissue correlations do look stronger, suggesting within-tissue regulation that is quite strong and potentially secondary inter-tissue regulation. If that's the idea, I think it can be stated more clearly.

      Response: Thank you for pointing this out. We have revised the sentence to state the following:

      Further, many correlations tend to be dominated by genes expressed within the same organ. This could be due to the fact that, within-tissue correlations could capture both the pathways regulating expression of a gene, as well as potential consequences of changes in expression/function, and distinguishing between the two presents a significant challenge. For example, a GD-CAT query of insulin (INS) expression in pancreas shows exclusive enrichments in pancreas and corresponding pathway terms reflect regulatory mechanisms such as secretion and ion transport (Supplemental Fig 4).

      We feel that this point might not be intuitive, so have included a new figure (Supplemental Fig 4) which contains the tissue correlations and pathways for INS expression in pancreas. These analyses show an example where co-correlation structure seems almost entirely dominated by genes within the same organ (pancreas) and GSEA enrichments highlight many known pathways which are involved in regulating the expression/secretion of the gene/protein. We hope that this makes the point more clearly to the reader.

      Additional comments on Results:

      • I would break the titled Results sections into multiple paragraphs. For example, the first section (lines 84-129) has a few natural breakpoints that I noticed that would potentially make it feel less over-whelming to the reader.

      Response: We have broken up the results section into separate paragraphs in the revised manuscript. In addition, we have gone through to try and make sure that the amount of information per block/sentence focuses on key points.

      • "Expression of a gene and its corresponding protein can show substantial discordances depending on the dataset used" (line 224 of Results). This is a good point, and the authors could include citations here of studies that show discordance between transcripts and proteins, of which there are a good number. They could also add some biological context, such as saying differences could reflect post-translational regulation, etc.

      Response: Thank you for the supportive comment. We have referenced several comprehensive reviews of the topic, each of which contain tables summarizing details of mRNA-protein correlation. The revised discussion sentence is as follows:

      Expression of a gene and its corresponding protein can show substantial discordances depending on the dataset used. These have been discussed in detail (refs 39–41), but ranges of co-correlation can vary widely depending on the datasets used and approaches taken. We note that for genes encoding proteins where actions from acute secretion grossly outweigh patterns of gene expression, such as insulin, caution should be taken when interpreting results. As the depth and availability of tissue-specific proteomic levels across diverse individuals continues to increase, an exciting opportunity is presented to explore the applicability of these analyses and identify areas when gene expression is not a sufficient measure.

      1. Liu, Y., Beyer, A. & Aebersold, R. On the Dependency of Cellular Protein Levels on mRNA Abundance. Cell 165, 535–550 (2016).

      2. Maier, T., Güell, M. & Serrano, L. Correlation of mRNA and protein in complex biological samples. FEBS Letters 583, 3966–3973 (2009).

      3. Buccitelli, C. & Selbach, M. mRNAs, proteins and the emerging principles of gene expression control. Nat Rev Genet 21, 630–644 (2020).

      • In many ways, this work has similar goals to many studies that have performed multi-tissue WGCNA (e.g., Talukdar et al. 2016; https://doi.org/10.1016/j.cels.2016.02.002). In this manuscript, WGCNA's conventional approach to estimating robust correlations (bicor) is used, but they do not use WGCNA's data reduction/clustering functionality to estimate modules. Perhaps the modules would miss the signaling relationships of interest, being sort of lost in the presence of stronger signals that aren't relevant to the biological questions here. But I think it would be good for the authors to explain why they didn't use the full WGCNA approach.

      Response: This is an important point and we also feel that the previous lack of methodological details and discussion did a poor job at distinguishing why module-based approaches were not used. We wanted to be careful not to emphasize one approach being superior/inferior to another, but rather to point out the different considerations and when a direct correlation might inform a given question. As the reviewer points out, our general feeling is that adopting a simple gene-focused correlation approach allows users to view mechanisms through the lens of a single gene; however, this is limited in that these could be influenced by cumulative patterns of correlation structure (for example mitochondria in revised Fig 2A) which would be much more apparent in a module-based approach. This comment, in combination with the others listed above, was our motivation in exploring cumulative patterns of gene-gene correlations in the revised Fig 2. In the revised manuscript, we expanded on the results and discussion section to highlight utility of these types of approaches compared to module-based methods:

      The queries provided in GD-CAT use fairly simple linear models to infer organ-organ signaling; however, more sophisticated methods can also be applied in an informative fashion. For example, Koplev et al generated co-expression modules from 9 tissues in the STARNET dataset, where construction of a massive Bayesian network uncovered interactions between correlated modules (ref 6). These approaches expanded on analysis of STAGE data to construct network models using WGCNA across tissues and relating these resulting eigenvectors to outcomes (ref 42). The generalized approach of constructing cross-tissue gene regulatory modules presents appeal in that genes are able to be viewed in the context of a network with respect to all other gene-tissue combinations. In searching through these types of expanded networks, individuals can identify where the most compelling global relationships occur. One challenge with this type of approach, however, is that coregulated pathways and module members are highly sensitive to the parameters used to construct GRNs (for example, reassignment threshold in WGCNA), and it can be difficult to arrive at a “ground truth” for parameter selection. We note that the WGCNA package is also implemented in these analyses, but solely to perform gene-focused correlations using biweight midcorrelation to limit outlier inflation. While the biweight midcorrelation approach to calculate correlations could also be replaced with more sophisticated models, one consideration would be a concern of overfitting models and thus biasing outcomes.

      Additional comments on Discussion:

      • In the second paragraph of the Discussion (lines 231-244), the authors mention that GD-CAT uses linear models to compare data between organs and point to other methods that use more complex or elaborate models. It's good to cite these methods, but I think they could more directly state that there are limitations to high complexity models, such as over-fitting.

      Response: Thank you for this suggestion. We have added a line (above) mentioning the overfitting concern.

      Comments on Methods:

      • The described gene filtration in the Methods of including genes with non-zero expression for 1.2e6 gene-tissue combinations is confusing. If there are 310 individuals and 18 tissues, for a given gene, aren't there only 5,580 possible data points? Might be helpful to contextualize the cut-off in terms of like the average number of individuals with non-zero expression within a tissue.

      Response: We apologize for this error. This number was pasted from a previous dataset used and not appropriate for this manuscript. In general, we have removed specific mentions of total number of gene-tissue correlation combinations, as these numbers reflect large but almost meaningless quantifications. Instead, we expanded the methods in terms of how individuals and genes were filtered.

      • More details should be given about the gene ontology/pathway enrichment analysis. I suspect that a set-based approach (e.g., hypergeometric test) was used, rather than a score-based approach. The authors don't state what universe of genes were used, i.e., the overall set of genes that the reduced set of interest is compared to. Seems like this could or should vary with the tissues that are being compared. A score-based approach could be interesting to consider (https://www.biorxiv.org/content/10.1101/060012v3), using the genetic correlations as the score, as this would remove the unappealing feature of sets being dependent on correlation thresholds. This isn't something that I would demand of the published paper, but it could be an appealing approach for the authors to consider and confirm similar results to the set-based analysis.

      Response: This is an important point. Following this suggestion, we evaluated several different rank- and weight-based pathway enrichment tools, including FGSEA and others. Ultimately, we concluded that GSEA performed significantly better at 1) recapitulating known biology of select secreted protein genes and 2) leveraging the large numbers of genes occurring at qvalue cutoffs without having to further refine (e.g., in the previous overrepresentation tests). For this reason, all pathway enrichments in the web tools and manuscripts now contain GSEA outputs and corresponding pathway enrichments or network graph visualizations. Thank you for this suggestion.
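The difference between a threshold-based over-representation test and a rank- and weight-based test can be illustrated with the running-sum enrichment score used by GSEA-style methods. The sketch below is a simplified illustration, not the GSEA software the authors used: it omits the permutation-based significance and normalization steps, and the function name, the `weight` parameter name, and the toy gene labels are all hypothetical.

```python
def enrichment_score(correlations, gene_set, weight=1.0):
    """Weighted running-sum enrichment score for one gene set.

    correlations: dict mapping gene name -> correlation coefficient (the score).
    gene_set: set of gene names belonging to the pathway.
    """
    # Rank every gene by its score, highest first; no threshold is applied.
    ranked = sorted(correlations.items(), key=lambda kv: kv[1], reverse=True)
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    hit_total = sum(abs(r) ** weight for (_, r), h in zip(ranked, hits) if h)
    if n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for (_, r), is_hit in zip(ranked, hits):
        # Hits step up in proportion to |score|; misses step down uniformly.
        running += abs(r) ** weight / hit_total if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy example with hypothetical gene labels and correlation coefficients.
toy = {"A": 0.9, "B": 0.8, "C": 0.1, "D": -0.2, "E": -0.7}
top_heavy = enrichment_score(toy, {"A", "B"})     # set sits at the top of the ranking
bottom_heavy = enrichment_score(toy, {"D", "E"})  # set sits at the bottom
```

Because every gene contributes according to its rank and coefficient, the result does not depend on choosing a correlation cutoff, which is the property the reviewer highlighted.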

      Comments on figures:

      • I think there is a bit of a missed opportunity to use the figures to introduce and build up the story for readers. For example, in Figure 1, plotting ADIPOQ expression against a correlated gene in adipose (local) as well as peripheral tissues. This doesn't need to be done for every example, but I think it would help readers understand what the data are, and what's being detected before jumping into higher level summaries.

      Response: Thank you, this point also builds on others which recommended restructuring the manuscript and figures. In the revised manuscript, we first introduce the web tool (which previously came last), and immediately highlight comparisons of within- and across-organ correlations, such as ADIPOQ. We feel that the revised manuscript presents a superior structure in terms of demonstrating the key points and utility of looking at gene-gene correlations across tissues.

      • Figures 1 and 4 are missing the color scale legend for the bar plots, so it's impossible to tell how significant the enrichments are.

      Response: We apologize for the oversight. The pathways in the revised Fig 1 detail pathway network graphs among the top pathways which should make interpretation more intuitive. We have also gone through and made sure that GSEA enrichment pvalues are now present for all figures including pathways (revised Fig 1, Fig 3 and supplemental Fig 4).

      • The Figure 2 caption says that edges are colored based on correlation sign? Are there any negative correlations (red)? They all look blue to me. The caption could also state that edge weight reflects correlation magnitude (I assume). It would be ideal to include a legend that links a range of the depicted edge weights to their genetic correlation, though I don't know how feasible that may be depending on the package being used to plot the networks.

      Response: Good catch. We included in the revised manuscript the network edge parameters: Network edges represent positive (blue) and negative (red) correlations and the thicknesses are determined by coefficients. They are set for a range of bicor=0.6 (minimum to include) to bicor=0.99.
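The edge parameters described in this response can be sketched as a simple mapping from coefficient to edge style. This is an illustration only: the response gives the colors and the bicor range of 0.6 to 0.99, while the linear width scaling, the use of the absolute coefficient for the inclusion cutoff, the function name, and the width values are assumptions made here.

```python
def style_edges(edges, min_abs=0.6, max_abs=0.99, min_width=0.5, max_width=5.0):
    """Map (gene_a, gene_b, bicor) tuples to drawable edge attributes."""
    styled = []
    for gene_a, gene_b, r in edges:
        strength = abs(r)
        if strength < min_abs:
            continue  # below the minimum coefficient: edge is not drawn
        strength = min(strength, max_abs)
        # Assumed linear scaling of thickness across the stated range.
        frac = (strength - min_abs) / (max_abs - min_abs)
        styled.append({
            "source": gene_a,
            "target": gene_b,
            "color": "blue" if r > 0 else "red",  # sign of the correlation
            "width": min_width + frac * (max_width - min_width),
        })
    return styled


# Hypothetical edges: one at the inclusion threshold, one strong negative,
# and one too weak to be drawn.
edges = [("A", "B", 0.6), ("A", "C", -0.99), ("B", "C", 0.3)]
for edge in style_edges(edges):
    print(edge)
```

A legend built from the same mapping would address the reviewer's request to link depicted edge weights to coefficient values.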

      Related to seeing a dominant pattern of positive correlations, we agree that this observation is fascinating, and gene-gene correlations being dominated by positive coefficients will be the topic of a closely-following manuscript from the lab.

      • Figure 4A would be more informative as boxplots, which could still include Ssec score. This would allow the reader to get a sense of the variation in correlation p-value across all hippocampus transcripts.

      Response: Related to comments from this reviewer and others, we have removed the previous Fig 4 entirely from the manuscript to emphasize the ability of these gene-gene correlations to capture known biology and limit the extent of unvalidated “suggested” new mechanisms.

      Comments on GD-CAT

      • The online webtool worked nicely for me. It was easy to use and produce figures like in the manuscript. One suggestion is show data points in the scatter plot rather than just the regression line (if that's possible currently, I didn't figure it out). A regression line isn't that interesting to look at, but seeing how noisy the data look around it is something humans can usually interpret intuitively.

      Response: Thank you so much. We are excited that the web tool works sufficiently. We have also revised the individual gene-gene correlation tab to show individual data points instead of simple regression lines.

      Minor comments:

      Response: Thank you for these detailed improvements.

      • This sentence is awkwardly constructed: "Here, we surveyed gene-gene genetic correlation structure for ~6.1x10^12 gene pairs across 18 metabolic tissues in 310 individuals where variation of genes such as FGF21, ADIPOQ, GCG and IL6 showed enrichments which recapitulate experimental observations" (lines 68-70). It's an important sentence because it's where in the Abstract/Introduction the authors succinctly state what they did, thus I would re-work it to something like: "Here, we surveyed gene expression correlation structure..., identifying genes, such as FGF21, ADIPOQ, GCG and IL6, that possess correlation networks that recapitulate known biological pathways."

      Response: The numbers of pairs examined and dataset size have been removed for clarity and we have revised this statement and results as a whole

      • Prefer swapping "signal" for "signaling" in line 53 of Abstract/Introduction.

      Response: Done

      • Remove extra period in line 208 of Results.

      Response: Removed

      • Change "well-establish" to "well-established" in line 247 of Discussion.

      Response: Replaced

      • Missing commas in line 302 of Methods.

      Response: added

      • Missing comma in line 485 of Figure 3 caption.

      Response: The previous Fig 3 has been removed

      • Typo in title of Figure 3E (change "Perihperal" to "Peripheral")

      Response: Thank you, changed

      • Add y-axis labels (relative cell proportions) to Supplemental Figures 1-3.

      Response: These labels have been added

      Reviewer #3 (Recommendations For The Authors):

      Minor technical comment: The authors refer to correlations between genes when they actually mean correlations between GTEX transcript isoform models. It is exceedingly important to keep this distinction clear in the reader's mind, a fact that is emphasized by the authors themselves when they comment on the potential value of similar proteomic assays to evaluate multiorgan system communication. GTEx has tried to do proteomics but I do not know of any open data yet.

      Response: Thank you for this point. We have gone through the manuscript and replaced “gene correlations” with “transcript” or other similar mentions. Related to the comment on GTEx proteomics, this is an important point as well. As the reviewer mentions, proteomics has been performed on GTEx data; however, given that this dataset contains only 6 sparsely-represented individuals, analyses such as the ones highlighted in our study remain highly limited. We have added the following to the discussion: As the depth and availability of tissue-specific proteomic levels across diverse individuals continues to increase, an exciting opportunity is presented to explore the applicability of these analyses and identify areas where gene expression is not a sufficient measure. For example, mass-spec proteomics was recently performed on GTEx42; however, given that these data represent 6 individuals, analyses utilizing well-powered inter-individual correlations such as ours, which contain 310 individuals, remain limited in applications.

      The R/Shiny companion application: The community utility of this application would be greatly improved by a link to a primer and more basic functionality. The Github site is a "work in progress" and does not include a readme file or explanation (that I could find) on the license.

      Response: Thank you, we are excited that the apps operate sufficiently. We have revised the github repository entirely to contain a full walk-through of app details and parameter selections. These are meant to walk users through each step of the pipeline and discuss what is being done at each step. We agree that this updated github repository allows users to understand the details of the R/Shiny app in much more detail. We also made all the app scripts, datasets, markdown/walkthrough files and docker image fully available to enhance accessibility.

    1. Author Response

      We appreciate the reviewers’ and editors’ advice on further improving this manuscript. We have provided point by point responses to the reviewers’ comments mentioned below. A revised version of this manuscript will be uploaded within a few weeks.

      Authors’ response to Reviewer 1 comments:

      • We appreciate the reviewer’s time in highlighting the strengths and weaknesses of this manuscript.

      • Per the reviewer’s advice, we will provide further description of the methods in a revised version of this manuscript.

      • The interpretation about the biological threat in response to elevated glycosuria in renal Glut2 KO mice is based on our observation that these mice exhibit changes in acute phase proteins measured using plasma proteomics. We will further discuss this in a revised version of this manuscript.

      • We acknowledge that this manuscript provides a resource for future mechanistic studies. Because multiple secreted proteins are changed between the control and experimental groups, some of them could be causal and others correlative in the context of enhancing compensatory glucose production in response to elevated glycosuria. Through future studies we will determine the causal proteins that trigger the increase in glucose production and identify the tissues that secrete these proteins.

      • We have shown previously (Cordeiro et al., Diabetologia 2022) that renal Glut2 deficiency doesn’t change insulin sensitivity (i.e. renal Glut2 KO mice don’t exhibit insulin resistance despite the activation of the HPA axis). It is likely that the massive glycosuria in renal Glut2 KO mice may overcome or mask the phenotype of insulin resistance potentially induced by an increase in the stress hormones.

      • In this manuscript, our major goal was to determine how elevated glycosuria leads to an increase in compensatory glucose production. We are not suggesting renal Glut2 as a therapeutic in this manuscript (that was already demonstrated in our previously published manuscript, Cordeiro et al., Diabetologia 2022).

      Authors’ response to Reviewer 2 comments:

      1) Renal Glut2 KO mice didn’t exhibit sex differences for the variables reported in our previous manuscript (Cordeiro et al., Diabetologia 2022). Therefore, in the present manuscript we decided to use male or female mice depending on their availability for each reported experiment. Per the reviewer’s advice, we will describe these details including age and sexes in each figure legend.

      2) For the method description, we have cited previous publications and mentioned ‘as described previously’. Based on the reviewer’s suggestion we will further describe the methods in detail to clarify the reviewer’s concerns. In addition, we will include age and sexes in the legends of each figure.

      3) For littermate controls, we had used Glut2loxp/loxp mice (which are like WT controls as described in Cordeiro et al., Diabetologia 2022) that were injected with tamoxifen exactly in the same way as the experimental mice. Het mice for Cre were not used as controls because they would have confounded the results as pointed out by the reviewer.

      4) Because elevated HPA activity is known to increase blood glucose levels, we suggested ‘the HPA axis may…..’. Given the nature of this manuscript, we agree the secreted proteins identified using plasma proteomics could contribute to enhanced glucose production directly or through secondary mechanisms. Afferent renal denervation using capsaicin reduced blood glucose levels concomitant with the suppression of the HPA axis in renal Glut2 KO mice. Based on these findings we speculated that the HPA axis may be partly responsible for increasing glucose production in renal Glut2 KO mice.

      We had considered using a CRF antagonist and glucocorticoid receptor antagonists to determine the causal role of the HPA axis in contributing to the increase in glucose production in renal Glut2 KO mice. However, these drugs activate compensatory mechanisms including changes in insulin sensitivity. Therefore, use of these drugs would further confound the results instead of providing clarity on the causal role of the HPA axis in enhancing glucose production in renal Glut2 KO mice.

      5) We understand the reviewer’s concerns whether the results reported here are translatable to humans. Please note that expression of SGLT2 is not kidney-specific; therefore, pleiotropic effects of SGLT2 inhibition in tissues other than the kidney cannot be excluded in animal models and humans. In contrast, the mouse model reported in this manuscript is kidney-specific Glut2 KO mice. Therefore, phenotype produced in renal Glut2 KO mice cannot be directly compared with that produced after SGLT2 inhibition. It may be too early to speculate whether the results reported in this manuscript are translatable to humans.

      In the research papers referred to by the reviewer, the authors have used either models of different types of diabetes or included individuals with diabetes in their study. Notably, diabetes itself affects the HPA axis independently of SGLT2 or GLUT2 inhibition. Therefore, it may not be appropriate to compare results obtained from animals or individuals with diabetes with those reported in this manuscript from renal Glut2 KO mice.

      6) Yes, we are currently performing mechanistic studies including assessment of mitochondrial function in renal Glut2 KO mice to determine whether and how the kidneys sense loss of glucose in urine.

      7) We apologize for the lack of methods description. We will provide additional method details in a revised version of this manuscript. All the assays were performed as per manufacturer’s instructions. Aliquots of the same samples were used for analyses of the hormones and for consistency across different assays.

    1. Author Response

      We highly appreciate the constructive feedback provided by the reviewers, which we believe will greatly improve the quality of our work. We were encouraged to see that our manuscript was considered to be “important”, of “great interest” as well as to “yield valuable results”.

      We also highly appreciate the overall positive eLife assessment. However, we were surprised to read that our “results range from solid from inadequate”. This especially applies given the positive and engaging nature of the reviews which seem to mainly concern the results interpretation being “inadequate” rather than the results themselves. Hence, we kindly request a reconsideration of this aspect of the assessment.

      Moreover, there is one Reviewer comment we would like to address directly. Reviewer #3 pointed out that “this study did not conduct a direct association analysis between MetS and cognitive levels without considering subgroup comparisons.” and that “After a thorough review of the methods and results sections” she/he “found no direct or strong evidence supporting the authors' claim that the identified latent variables were related to more severe MetS to worse cognitive performance. While a sub-group comparison was conducted, it did not adequately account for confounding factors such as educational level.”.

      We appreciate the observations of Reviewer #3 regarding the absence of a direct association analysis between Metabolic Syndrome (MetS) and cognitive levels without subgroup comparisons, and the lack of evidence linking latent variables to MetS severity and cognitive performance. Our apologies for any confusion caused by unclear presentation. Our study incorporated association analyses between MetS, brain structure, and cognition using MetS components, regional cortical thickness, and cognitive performance data in a PLS. These analyses were separately performed on the UK Biobank and HCHS datasets, due to their distinct cognitive assessments. We adjusted for age, sex, and education in the subgroup analyses by removing their effects from the input variables. The primary latent variables demonstrated significant associations with MetS components, cortical thickness, and cognitive scores, indicating that higher obesity, blood pressure, lipidemia, and glycemia levels correlate with lower cognitive performance. These relationships are detailed in supplementary figures S15b and S16b, with negligible loadings for age, sex, and education, confirming effective deconfounding. We acknowledge the reviewer's constructive feedback and will enhance the clarity of the Methods and Results sections, including conducting a mediation analysis.
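      The deconfounding-then-PLS procedure described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the confound columns are only stand-ins for age, sex, and education, and the PLS variant (an SVD of the cross-covariance matrix, often called PLS correlation) is an assumption, since the response does not state which implementation was used.

```python
import numpy as np

def residualize(data, confounds):
    """Remove the linear effect of the confounds (plus an intercept) from each column."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta

def zscore(data):
    """Standardize each column to zero mean and unit variance."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

def pls_svd(x, y):
    """PLS correlation (assumed variant): SVD of the cross-covariance of two blocks."""
    x, y = zscore(x), zscore(y)
    cross_cov = x.T @ y / (len(x) - 1)
    u, s, vt = np.linalg.svd(cross_cov, full_matrices=False)
    return u, s, vt.T  # x weights, singular values, y weights

# Synthetic data only: hypothetical stand-ins, not UK Biobank or HCHS variables.
rng = np.random.default_rng(0)
n = 200
confounds = rng.normal(size=(n, 3))   # stand-ins for age, sex, education
latent = rng.normal(size=n)           # shared signal linking the two blocks
x = (np.outer(latent, rng.normal(size=5))
     + confounds @ rng.normal(size=(3, 5))
     + rng.normal(size=(n, 5)))       # e.g. MetS components
y = (np.outer(latent, rng.normal(size=8))
     + confounds @ rng.normal(size=(3, 8))
     + rng.normal(size=(n, 8)))       # e.g. thickness and cognitive scores

# Step 1: remove confound effects from both input blocks.
x_res = residualize(x, confounds)
y_res = residualize(y, confounds)

# Step 2: extract latent variables from the deconfounded blocks.
x_weights, singular_values, y_weights = pls_svd(x_res, y_res)
explained = singular_values**2 / np.sum(singular_values**2)
print("Share of cross-block covariance per latent variable:", np.round(explained, 3))
```

      Because the residuals of a least-squares fit are orthogonal to the design matrix, the deconfounded blocks carry no linear trace of the confounds, which is the property the negligible confound loadings are meant to demonstrate.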

      Furthermore, we strive to incorporate the Reviewers’ other suggestions into the analysis. The revision will include major changes to the manuscript.

      In response to Reviewer #1:

      • We will revise considering non-fasting plasma glucose as a surrogate marker of insulin resistance.

      • We will report Field IDs of the used UK Biobank variables.

      • We aim to moderate causal interpretations and reword the indicated passages for clarity.

      In response to Reviewer #2:

      • We will reconsider claims of binarizing vascular dementia and Alzheimer’s dementia pathophysiology.

      • We will further explore the cell type associations of the other latent variables.

      • We will expand the discussion regarding conclusions from our results and the future outlook.

      In response to Reviewer #3:

      • We will add an additional flowchart to detail the virtual histology analysis.

      • We will add a discussion of the second latent variable.

      • We will conduct a mediation analysis to statistically assess the mediation effect of brain structure on the relationship between MetS and cognitive performance.
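      As a rough illustration of what such a mediation analysis involves, the sketch below uses the product-of-coefficients approach with a percentile bootstrap. All variables are synthetic and the names are hypothetical stand-ins; the authors do not specify which mediation model they will use, so this is an assumption rather than their method.

```python
import numpy as np

def ols(design, target):
    """Least-squares coefficients for target ~ intercept + design columns."""
    full = np.column_stack([np.ones(len(target)), design])
    coef, *_ = np.linalg.lstsq(full, target, rcond=None)
    return coef

def mediation(exposure, mediator, outcome):
    """Return (total, direct, indirect) effects from two linear regressions."""
    a = ols(exposure, mediator)[1]                        # exposure -> mediator
    coef = ols(np.column_stack([exposure, mediator]), outcome)
    direct, b = coef[1], coef[2]                          # c' and mediator -> outcome
    total = ols(exposure, outcome)[1]                     # c
    return total, direct, a * b

# Synthetic data only: hypothetical stand-ins for the study variables.
rng = np.random.default_rng(1)
n = 500
exposure = rng.normal(size=n)                             # e.g. MetS severity score
mediator = -0.4 * exposure + rng.normal(size=n)           # e.g. cortical thickness
outcome = 0.5 * mediator - 0.1 * exposure + rng.normal(size=n)  # e.g. cognitive score

total, direct, indirect = mediation(exposure, mediator, outcome)

# Percentile bootstrap for the indirect effect (resampling individuals).
boot = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)
    boot.append(mediation(exposure[idx], mediator[idx], outcome[idx])[2])
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"total={total:.3f} direct={direct:.3f} indirect={indirect:.3f}")
print(f"95% bootstrap CI for indirect effect: [{ci_low:.3f}, {ci_high:.3f}]")
```

      In ordinary least squares the total effect equals the direct effect plus the indirect effect, so the decomposition can be checked numerically; a confidence interval for the indirect effect that excludes zero is the usual evidence for mediation.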

      We are convinced that with these revisions, our manuscript will align even more closely with the high standards of eLife and make a strong contribution to its distinguished portfolio. We thank you for your consideration.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We are grateful to the reviewers for their remarks, which significantly improved the paper. We repeated the biochemical assay concerning SIRT6 activity on H3-K27Ac and quantified the results as requested. Please find our detailed answers below each recommendation of the reviewers.

      Major recommendations:

      1. Grammatical errors are still common; the authors may need to consider an external editing service if they intend to fix the problems as they indicate that they believe the errors have been removed. The Results section is relatively clean, but parts of the Abstract, Introduction, and Discussion are more difficult to understand, and errors are especially common in the Methods section and those parts of the manuscript that are new in this revision.

      We corrected the grammatical errors.

      1. The introduction doesn't mention the other structures published; this is considered to be a serious deficiency as it prevents the reader from understanding the context for the contributions described here. Withholding the comparison with (or mention of) the previously published work to the last sentence of the Discussion seems misleading and does not give the reader adequate ability to judge the novelty of the results presented in this manuscript.

      A paragraph comparing our paper to the other published structures appears at the end of the discussion. We feel this is still the right place for such a paragraph.

      1. The addition of the assay for deacetylation is a significant improvement over the initial submission. This is important both for validating the importance of the acidic patch contacts and for helping to resolve the conflicting reports regarding activity on H3-K27Ac. Given the importance of this assay for the impact of the manuscript, it is not clear why the authors chose to 1) put the data in the supplement instead of in the main manuscript, and 2) provide only single samples without quantitation. These both seem to be significant limitations.

      We repeated the experiment and provided quantification of the results. We placed the figure in the main manuscript.

      1. The authors should add text or a table to the Methods section explaining which maps were used for each figure. By our count, there are 8 maps and 5 models (plus MD models) based on two datasets, but the relationships among them are not clearly stated, and the names of the maps (such as "Zn-finger focused" and "Rossman-Fold-Focused") might be changed to be more helpful to the reader (for example, the latter includes more than the Rossman fold and might be renamed "Sirt6-focused"). The authors should also explain how the maps were validated, which data were deposited in public repositories, and why some data were not deposited. For example, no statistics or methods regarding how particles were separated into integrated vs. non-integrated motion are provided for the CryoDRGN models. Further, the "two principle movements" described are depicted in 4 maps from two CryoDRGN runs using two separate sets of particles, but the relationships among them are not defined clearly. Finally, the connectivity of densities in Fig 8 are not obvious in the submitted maps. Until these points are addressed, the work is considered incomplete.

      AND

      1. The PDB model provided for review and submitted to the PDB database shows loosely bound DNA at the nucleosomal entry/exit points near the binding site of SIRT6, but the maps provided for review and submitted to the EMDB show stronger density for the canonical location of the DNA expected at these sites. The CryoDRGN maps support a more extended conformation, but these maps were not deposited or provided for review so their validity cannot be assessed.

      We added a section to the methods listing the different maps used for the figures. We deposited the map we used to trace the H2A N-terminal tail (EMD-18497). Unfortunately, we couldn’t deposit the cryoDRGN maps, as the deposition system accepts either composite maps, where the consensus should be deposited too, or experimental maps, where the deposition of half maps is mandatory. Nevertheless, the cryoDRGN maps are available upon request. We also added a supplementary figure (Supplementary Fig 6) to show how the cryoDRGN analyses were performed.

      1. The orientation, angle and threshold used in Fig 1 make it difficult to see the multiple DNA orientations that are visible in the deposited consensus map. Examination of the map suggests that the DNA model submitted to PDB corresponds to a weaker DNA conformation than is present in the map where both DNA conformations are visible. The authors should consider modeling both conformations in their deposited model to provide a more complete, accurate representation of the data. It is concerning that a key conclusion of the manuscript is that the DNA conformation changes upon SIRT6 binding, but density for the canonical position is observable in Fig 8a.

      Figure 1 shows the overall representation of the SIRT6-bound nucleosome structure. We show the DNA linker orientations in the subsequent figure. Figure 8 (now Figure 9) shows the rearrangement of the SIRT6 Rossmann fold domain, not the DNA linker.

      1. Figure 4 needs a more complete legend, indicating that it is a hybrid of the consensus structure (one color) and the MD simulations (another color). In general, the colors used in the figure should be changed to make the main points more accessible.

      As there is a color code for the histones, changing colors might be confusing. The figure legend mentions that panels c, d and e are from MD simulations.

      Minor recommendations:

      1. Figures 2c, e, and f are not referenced in the text.

      We now referenced all figure panels in the text.

      1. Consider moving Supp. 5C to Fig. 2 as the models in that figure come from the CryoDRGN maps and not the consensus map.

      Supplemental Figure 5c shows the DNA linker deviation upon SIRT6 binding from another angle. We prefer to keep it there.

      1.) Supp Fig 3 is labeled "ZnF-nucleosome" refinement, but this appears to come from Data Set #2 processing. The map might be labeled ZnF-nucleosome but then a mask should be shown that excludes the Rossman Fold. It is not clear if this is a focused refinement or just a 2.9 A map that was merged with the "Rossman-fold" map.

      We changed both supplemental figures accordingly.

      1. The orientation of Fig 2 b and e do not show the differences in these models as well as panels c and f. Panels b and e could be replaced with the 4 CryoDRGN maps.

      The models reflect the cryoDRGN maps and panels c and f were added to clarify the movement.

      1. The MD description should emphasize that the H3 tails are moving with respect to the active site, as it currently suggests the active site is moving.

      In the results and in the discussion section we mention that we observe new conformations of the H3 tail, not of the active site.

      1. The authors refer to the "flexibility of the Rossmann fold domain," but the Rossman Fold domain isn't flexible, the linkage to the ZnF is flexible. Perhaps "observed conformational space" or "dynamic Rossman-fold domain position" are meant.

      The text was changed accordingly.

      1. The H2A C-terminal tail present in Fig 1 (bottom right) and Figure 3e is not present in the model in Fig 4a,b.

      The H2A tail's conformation was not resolved in the cryoDRGN maps, so we didn’t model it.

      1. The crosslinking agent used is not specified.

      The crosslinking agent used is specified more clearly in the methods.

      1. Supp Table 1 and EM methods do not agree on the magnification for Dataset #1. Verify nominal versus binned magnification and reported pixel size.

      The magnification in the methods was changed.

      2. Fig 3F showing the difference between affinity for H2A and H2A.Z-containing nucleosomes would be more convincing with a titration rather than the current comparison of a single concentration.

      We agree with this remark; however, we find the single-concentration comparison convincing enough for the purposes of this paper, as it is not a central finding.

      1. Fig S1 legend; both the Zn-finger and helix bundle are stated to be shown in green.

      Figure S1 legend was changed.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Response to Reviewers:

      Thank you for taking the time to review our manuscript and provide us with helpful comments. Your comments enabled us to improve the clarity of the manuscript, in particular:

      1. We improved the organization of the figures by associating each supplemental figure with a main-text figure using the eLife “figure supplements” format.

      2. We reduced the length of figure captions where possible.

      3. We improved organizational clarity by adding a brief organizational summary statement at the beginning of the results section which outlines the contents of the results subsections in the context of the introduction. We took particular care to use the same language, so the parallelism is clearer.

      4. In addition, we made various modifications to the main text to improve clarity for the reader. For this we asked specific help of our biologist co-authors to indicate which aspects would benefit from further clarification to enable the broad biology readership of eLife to comprehend our research better.

      Reviewer #1 (Public Review):

      The authors sought to resolve the coordinated functions of the two muscles that primarily power flight in birds (supracoracoideus and pectoralis), with particular focus on the pectoralis. Technology has limited the ability to resolve some details of pectoralis function, so the authors developed a model that can make accurate predictions about this muscle's function during flight. The authors first measured aerodynamic forces, wing shape changes, and pectoralis muscle activity in flying doves. They used cutting-edge techniques for the aerodynamic and wing shape measurements and they used well-established methods to measure activity and length of the pectoralis muscle. The authors then developed two mathematical models to estimate the instantaneous force vector produced by the pectoralis throughout the wing stroke. Finally, the authors applied their mathematical models to other-sized birds in order to compare muscle physiology across species.

      The strength of the methods is that they smoothly incorporate techniques from many complementary fields to generate a comprehensive model of pectoralis muscle function during flight. The high-speed structured-light technique for quantifying surface area during flight is novel and cutting-edge, as is the aerodynamic force platform used. These methods push the boundaries of what has historically been used to quantify their respective aspects of bird flight and their use here is exciting. The methods used for measuring muscle activation and length are standard in the field. Together, these provide both a strong conceptual foundation for the model and highlight its novelty. This model allows for estimations of muscle function that are not feasible to measure in live birds during flight at present. The weakness of this approach is that it relies heavily on a series of assumptions. While the research presented in this paper makes use of powerful methods from multiple fields, those methods each have assumptions inherent to them that simplify the biological system of study. This reduction in the complexity of phenomena allows the specific measurements to be made. In joining the techniques of multiple fields to study the greater complexity of the phenomenon of interest, the assumptions are all incorporated also. Furthermore, assumptions are inherent to mathematical modeling of biological phenomena. That being said, the authors acknowledge and justify their assumptions at each step and their model seems to be quite good at predicting muscle function.

      Indeed, the authors achieve their aims. They effectively integrate methods from multiple disciplines to explore the coordination and function of the pectoralis and supracoracoideus muscles during flight. The conclusions that the authors derive from their model address the intended research aim.

      The authors demonstrate the value of such interdisciplinary research, especially in studying complex behaviors that are difficult or infeasible to measure in living animals. Additionally, this work provides predictions for muscle function that can be tested empirically. These methods are certainly valuable for understanding flight but also have implications for biologists studying movement and muscle function more generally.

      Thank you for your thorough and positive review. We appreciate that you read our manuscript carefully and gave detailed feedback.

      Recommendations For The Authors:

      I thought that your manuscript was very interesting and your integration of techniques from multiple fields was effective. You address the weaknesses I highlighted in the public review well throughout the manuscript.

      Thank you for your well-measured feedback on this weakness and how we addressed it.

      I sometimes found that the manuscript was difficult to follow. With the interdisciplinary nature of your work, your manuscript has a lot of complexity. Your introduction is clear and I think that the last paragraph outlines your study very well. In the subsequent sections, the sub-headings are helpful, but I think your manuscript could be improved by indicating where those subsections fit into the phases you outline in your introduction (namely, muscle function, kinematics and aerodynamics, and mathematical modeling).

      Complied: throughout the manuscript we made modifications to improve the clarity. We also added a brief organizational summary statement at the beginning of the results section which outlines the contents of the results section in the context of the language introduced in the introduction. Finally, we reorganized the supplemental figures into eLife’s favored format of “figure supplements”, so that each extra figure is now associated with a figure in the main text. This should help the reader access information in an easier, hierarchical manner.

      Reviewer #2 (Public Review):

      In this work, the authors investigated the pectoralis work loop and the function of the supracoracoideus muscle in the down stroke during slow flight in doves. The aim of this study was to determine how aerodynamic force is generated, using simultaneous high-speed measurements of the wings' kinematics, aerodynamics, and activation and strain of pectoralis muscles during slow flight. The measurements show a reduction in the angle of attack during mid-downstroke, which induces a peak power factor and facilitates the tensioning of the supracoracoideus tendon with pectoralis power, which then can be released in the up-stroke. By combining the data with a muscle mechanics model, the timely tuning of elastic storage in the supracoracoideus tendon was examined and showed an improvement of the pectoralis work loop shape factor. Finally, other bird species were integrated into the model for a comparative investigation.

      The major strength of the methods is the simultaneous application of four high-speed techniques - to quantify kinematics, aerodynamics and muscle activation and strain - as well as the implementation of the time-resolved data into a muscle mechanics model. With a thorough analysis which supports the conclusions convincingly, the authors achieved their goal of reaching an improved understanding of the interplay of the pectoralis and supracoracoideus muscles during slow flight and the resulting energetic benefits.

      Thank you for your helpful and positive review. We appreciate that you summarized our manuscript accurately in a way that can help the reader.

      Recommendations For The Authors:

      The manuscript is very detailed and appears a bit long, including all the supplementary materials. It seems that the manuscript could easily have been separated into several publications, especially the comparative investigation including other extant bird species into the new model could have been a separate publication. This would have reduced the length of the supplements.

      Thank you for your feedback on our manuscript; we made numerous improvements to improve the readability. Hence, we decided to not cut the supplement short or split it into more papers. We chose eLife because we wanted to publish this study in one complete manuscript. This has three benefits: (1) The reader can find all information in one well-edited paper at one publisher that is open-access and high-quality. (2) The first author works in industry and gets no benefits from publishing multiple papers, and hence he opted to publish one with support of the author team. (3) The senior author is not interested in fragmented publishing. Rather, he writes fewer, more comprehensive integrative papers because that is ultimately more informative for the reader: one trusted published source has all that is important to know based on this completed research project. Overall, we weren’t able to find technical information that shouldn't go in the paper using the lens of reproducibility, so the supplement is relatively long. Combining three methods (kinematics, forces, muscles), of which two are only available in the senior author’s lab, and extensive math (two new integrative models plus scaling laws) requires sharing the information needed for replication for all approaches we combine.

      Also, some figure captions are very long and some of the content might have been included in the main text.

      Complied: thank you for helping us streamline the captions. We reviewed all the figure captions and removed material that is repeated in the main text, but not essential to understanding the figures. However, because of the length of the manuscript and our desire to make the manuscript readable and clear, we left all other text in the captions intact so they remain readable independently of the main text. This way, the reader does not have to go searching for information in the main text just to make sense of the figures. This is especially important because readers often read the figures first before deciding if they want to read the main text completely. In addition, we moved two panels from Figure 2 into its associated figure supplement, because it was not a main point in the text, and hence this helped reduce the length of the caption in figure 2.

    1. Author Response

      The authors wish to thank the Reviewers for valuable and constructive comments that will help us improve the paper’s quality.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation.

      Strengths:

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1.

      Weaknesses:

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function).

      Indeed, our two-hybrid, immunoprecipitation, and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. We intend to give more attention to this matter in our revised paper. In addition, we will attempt to test for an in vitro interaction between Sfp1 and Rpb4 using purified Sfp1 and Rpb4 proteins.

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data.

      We note that this assay was developed and published in prior research by Lee, M. S., M. Henry, and P. A. Silver (G&D, 1996) and was reported in a number of subsequent papers. Reassuringly, our conclusion is supported by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates.

      In the revised paper, we will indicate that we do not know whether RP mRNAs are present in the actual foci shown in Fig. 1B.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1.

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here.

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we will elaborate on our hypothesis that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, that in turn lead to the destabilization of mRNAs.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable.

      Half-lives were determined mainly by GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Nevertheless, in our revised manuscript, we plan to supplement the half-life data with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determining half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-life.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below:

      Comments on methodology and results:

      1. A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids.

      Please see our response to comment 1 of Reviewer 1.

      1. Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated.

      We concur with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can result in non-specific effects. In the instance of rpb1-1, these non-specific effects are anticipated because of the transcriptional arrest, which can eventually lead to a reduction in protein content. However, it is worth noting that this process takes some time, whereas the impact on export is more rapid. We note that this assay was developed and published in prior research by Pam Silver (op. cit.) and was reported in a number of subsequent papers. Reassuringly, our conclusion is supported by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally.

      1. Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow to determine whether Rpb4-RFP colocalises with GFP-Sfp1.

      The submitted PDF figure is of low quality. We believe that a high-quality figure will be convincing.

      1. To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We argue that the 264 CRAC+ genes represent a distinct group with many unique features. Moreover, many CRAC+ genes do not fall into the category of highly transcribed genes.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments, all of which are inconsistent with technical flaws. Some examples are:

      1. Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.
      2. Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif.

      3. Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), whereas the vast majority of RiBi CRAC- promoters do not contain a Rap1 binding site (Fig. 3C).

      4. Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi CRAC- mRNAs do not. Fig. 4B shows similar results due to

      5. Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for CRAC- genes. This is most clearly visible in RiBi genes.

      6. Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for CRAC- genes.

      7. Fig. S4B shows that the chromatin-binding profile of Sfp1 is different for CRAC+ and CRAC- genes.

      Moreover, only a portion of the RiBi mRNAs bind Sfp1, despite similar expression of all RiBi genes.

      Most importantly, these genes do not all fall into the category of highly transcribed genes. On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes is not a result of high transcription levels. In our revised paper, we will give increased attention to this matter in the Discussion section.

      1. To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B. Our conclusion, which we still uphold, was drawn from the results presented in Figure 3. These results led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels than "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to depend on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      1. To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it is problematic to directly compare HL values of a specific mRNA when different methods are employed. The superiority of one particular method over others remains unclear. However, they all exhibit a high degree of reliability when comparing different strains under identical conditions using a single method.

      Estimating half-lives through the GRO approach is a non-invasive method, applied on optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, we consider this approach to be among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we plan to supplement the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determining HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008).
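
      To make the comparison of half-lives across strains concrete, the following is a minimal sketch of how an HL could be estimated from a transcription shut-off time course (for example after thiolutin addition or a temperature shift). It assumes simple first-order decay, so that ln(abundance) falls linearly with time and HL = ln(2)/k. The function name, the time points, and the abundance values are hypothetical illustrations and are not taken from the manuscript or its analysis pipeline.

```python
import math

def half_life_from_shutoff(times_min, abundances):
    """Estimate an mRNA half-life (minutes) from a shut-off time course.

    Assumes first-order decay: abundance(t) = A0 * exp(-k * t).
    The decay constant k is the negative slope of a least-squares
    line fitted to ln(abundance) versus time.
    """
    if len(times_min) != len(abundances) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay constant per minute
    if k <= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / k

# Hypothetical time courses (arbitrary units), for illustration only.
times = [0, 10, 20, 40]
strain_a = [100.0, 50.0, 25.0, 6.25]   # halves every 10 minutes
strain_b = [100.0, 25.0, 6.25, 0.390625]  # halves every 5 minutes

hl_a = half_life_from_shutoff(times, strain_a)
hl_b = half_life_from_shutoff(times, strain_b)
print(f"strain A HL: {hl_a:.1f} min, strain B HL: {hl_b:.1f} min")
```

      Because both strains are fitted with the same procedure, any method-specific bias affects both estimates alike, which is why the comparison between strains is more robust than either absolute value.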

      1. The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes than in the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it is important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, it is difficult to see how the heightened sensitivity of RP mRNA degradation in response to stress could account for the more pronounced differences in the configuration of the Pol II elongation complex that are detected in CRAC+ genes under standard culture conditions in WT cells.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The configuration of elongating Pol II (as reflected by Rpb4/Rpb3 ratios) and the backtracking index are both transcriptional outputs. It is difficult to envision how stress-induced destabilization of RP mRNAs could explain the twofold higher correlation between these two parameters observed in CRAC+ genes under non-stressful conditions in WT cells (Figure 7B).

      Furthermore, it's worth noting that in WT cells, CRAC+ genes did not display any apparent unusual destabilization, but rather exhibited higher (not lower) mRNA stability compared to CRAC- genes (Figure 7C).

      Strengths:
      - Diversity of experimental approaches used
      - Validation of large-scale results with appropriate reporters

      Weaknesses:
      - Choice of evaluation method to test mRNA half-life
      - Lack of controls for the CRAC results