  1. Jul 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Overall, the manuscript is very well written, the approaches used are clever, and the data were thoroughly analyzed. The study conveyed important information for understanding the circuit mechanism that shapes grid cell activity. It is important not only for the field of MEC and grid cells, but also for broader fields of continuous attractor networks and neural circuits.

      We appreciate the positive comments.

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. However, it is unclear what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. More detailed information/statistics about the asynchronization of SC activity is necessary for interpreting the results.

      The short answer here is that spiking responses from the pairs of SCs that we sampled appear asynchronous. We now show this in the form of cross-correlograms for all recorded pairs of SCs (Figure 2, Figure Supplement 1). The correlograms lack peaks that would indicate synchronous activation. Thus, while our dataset is not large enough to rule out occasional direct synchronisation of SCs, this appears unlikely to account for synchronised input to PV+INs.
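      As a minimal sketch of how a cross-correlogram for a pair of spike trains can be computed (this is an illustration, not the authors' analysis code): spike times are assumed to be in milliseconds, and the function name, the ±50 ms lag window and the 5 ms bin width are hypothetical choices made for the example. A peak near zero lag would indicate synchronous firing; a flat histogram would not.

```python
import math


def cross_correlogram(spikes_a_ms, spikes_b_ms, max_lag_ms=50.0, bin_ms=5.0):
    """Histogram of spike-time differences (b minus a) within +/- max_lag_ms.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    n_bins = int(round(2 * max_lag_ms / bin_ms))
    counts = [0] * n_bins
    for t_a in spikes_a_ms:
        for t_b in spikes_b_ms:
            lag = t_b - t_a
            if -max_lag_ms <= lag < max_lag_ms:
                # Map the lag onto a bin index; clamp to guard against rounding.
                idx = min(int(math.floor((lag + max_lag_ms) / bin_ms)), n_bins - 1)
                counts[idx] += 1
    # Bin centres, useful for plotting counts against lag.
    centres = [-max_lag_ms + (i + 0.5) * bin_ms for i in range(n_bins)]
    return centres, counts


# Toy data (hypothetical): train b follows train a by 1 ms on every spike,
# so all coincidences fall in the bin just to the right of zero lag.
train_a = [100.0, 200.0, 300.0]
train_b = [101.0, 201.0, 301.0]
centres, counts = cross_correlogram(train_a, train_b)
print(list(zip(centres, counts)))
```

      In practice the counts would be compared against a shuffled or shifted control before interpreting any peak.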

      This conclusion is consistent with consideration of mechanisms that could in principle synchronise SCs:

      First, if responses to ramping light inputs were fully deterministic, then this could lead to fixed relative timing of spikes fired by different SCs. This is unlikely given the influence of stochastic channel gating on SC spiking (Dudman and Nolan 2009) and is inconsistent with trial-to-trial variability in spike timing (Figure 2, Figure Supplement 2).

      Second, as SCs are glutamatergic they could excite one another. However, excitatory connections between stellate cells are rare (Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016) and when detected they have low amplitude (mean < 0.25 mV; (Winterer et al. 2017)). Our finding that spiking by pairs of SCs is not correlated is consistent with this.

      Third, strong interaction between stellate cells mediated by local inhibitory pathways (Pastoll et al. 2013; Couey et al. 2013) could coordinate their activity. The lack of correlation between spiking of pairs of SCs suggests that such coordination is rarely recruited by our ramping protocols. Nevertheless, recruitment of inhibition may happen to some extent as experiments in Figure 4 show that correlated input from SCs to more distant, but not nearby PV+INs, is reduced by blocking inhibitory synapses. Given that we don't find evidence for synchronised spiking of SCs, this additional common input to widely separated PV+INs is instead best explained by recruitment of interneurons that act directly on the target SCs. We have modified Figure 8 to make this clear.

      Thus, for experiments with ramping light stimuli, synchronous activation of SCs is unlikely to explain common input to PV+INs. Input from the same SC best explains correlated responses of nearby PV+IN inhibitory populations, while recruitment of an additional inhibitory pathway may contribute to correlated responses of more distant PV+INs.

      For experiments using focal stimulation, substantial trial-to-trial variation in SC spike timing argues strongly against deterministic coordination. Indirect coordination of presynaptic neurons is also extremely unlikely given that focal activation is sparse and brief, while inputs from many presynaptic SCs are required to drive a postsynaptic interneuron to spike (e.g. Pastoll et al. 2013; Couey et al. 2013). Results from these experiments thus corroborate results from experiments using ramping light stimulation.

      In revising the manuscript we have tried to ensure these arguments are clear (e.g. p 5, para 3; p 6, para 2; p 10, para 1).

      (2) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. However, the evidence supporting this "direct interaction" between these two cell types is missing. Is it possible that pyramidal cells are also involved in this interaction? Some pieces of evidence or discussions are necessary to further support the "direct interaction".

      Indirect connections between stellate cells mediated via fast-spiking inhibitory interneurons are well established by previous studies (e.g. Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016), and so were not addressed here. Previous work also establishes that connections from stellate cells to pyramidal cells are extremely rare (Winterer et al. 2017). Because the Sim1:Cre mouse line is specific to stellate cells and does not drive transgene expression in pyramidal cells (Sürmeli et al. 2015), it's therefore unlikely that pyramidal cells play a role.

      To make these points clearer we have modified the text in the discussion (p 5, para 3; p 10, paras 1 & 2). We have also modified Figure 8 to highlight that the indirect interaction may be best accounted for by inhibitory pathways onto PV+INs rather than via SCs (which our new cross-correlation analyses indicate is unlikely).

      Reviewer #2 (Public Review):

      In this study, Huang et al. employed optogenetic stimulation alongside paired whole-cell recordings in genetically defined neuron populations of the medial entorhinal cortex to examine the spatial distribution of synaptic inputs and the functional-anatomical structure of the MEC. They specifically studied the spatial distribution of synaptic inputs from parvalbumin-expressing interneurons to pairs of excitatory stellate cells. Additionally, they explored the spatial distribution of synaptic inputs to pairs of PV INs. Their results indicate that both pairs of SCs and PV INs generally receive common input when their somata are within 200-300 µm of each other. The research is intriguing, with controlled and systematic methodologies. There are interesting takeaways based on the implications of this work for grid cell network organization in MEC.

      We appreciate the positive comments.

      (1) Results indicate that in brain slices, nearby cells typically share a higher degree of common input. However, some proximate cells lack this shared input. The authors interpret these findings as: "Many cells in close proximity don't seem to share common input, as illustrated in Figures 3, 5, and 7. This implies that these cells might belong to separate networks or exist in distinct regions of the connectivity space within the same network.". Every slice orientation could have potentially shared inputs from an orthogonal direction that are unavoidably eliminated. For instance, in a horizontal section, shared inputs to two SCs might be situated either dorsally or ventrally from the horizontal cut, and thus removed during slicing. Given the synaptic connection distributions observed within each intact orientation, and considering these distributions appear symmetrically in both horizontal and sagittal sections, the authors should be equipped to estimate the potential number of inputs absent due to sectioning in the orthogonal direction. How might this estimate influence the findings, especially those indicating that many close neurons don't have shared inputs?

      Given we find high probabilities of correlated inputs to nearby cells in both planes, our conclusion that nearby cells are likely to receive common inputs appears to be independent of the slice plane. For cells further apart, where the degree of correlated input becomes more variable, it is possible that cell pairs that have low input correlations measured in one slice plane would have high input correlations if measured in a different plane. An argument against this is that as the cell pairs are further apart, it is less likely that an orthogonal axon would intersect dendritic trees of both cells. Nevertheless, we can't rule this out given the data here. We have amended the discussion to highlight this possibility (p 10, para 1). We agree it would be interesting to address this point further with quantitative analyses but this will be difficult without detailed reconstructions of the circuit.

      (2) The study examines correlations during various light-intensity phases of the ramp stimuli. One wonders if the spatial distribution of shared (or correlated) versus independent inputs differs when juxtaposing the initial light stimulation phase, which begins to trigger spiking, against subsequent phases. This differentiation might be particularly pertinent to the PV to SC measurements. Here, the initial phase of stimulation, as depicted in Figure 7, reveals a relatively sparse temporal frequency of IPSCs. This might not represent the physiological conditions under which high-firing INs function. While the authors seem to have addressed parts of this concern in their focal stim experiments by examining correlations during both high and low light intensities, they could potentially extract this metric from data acquired in their ramp conditions. This would be especially valuable for PV to SC measurements, given the absence of corresponding focal stimulation experiments.

      We understand the gist of the question to be: can differences in correlation scores between the initial and later phases of responses to ramping light inputs be used to infer spatial organisation? These differences are likely to reflect heterogeneity in the spiking of the input neurons, for example through differences in spike threshold, spike frequency adaptation and saturation of spiking (e.g. Figure 2, Figure Supplement 1A, and also see (Pastoll et al. 2020)). We don't expect these differences to have any spatial organisation along the mediolateral axis, and while spike threshold follows a dorsoventral organisation there is nevertheless substantial local variation between neurons (Pastoll et al. 2020). It's therefore unlikely we can use differences in early versus late correlations to make the inferences proposed by the reviewer.

      With respect to PV to SC measurements, similar heterogeneity is likely. We note that we were unable to carry out focal stimulation experiments for PV to SC connections as PV neurons did not spike in response to focal optogenetic stimulation.

      With respect to physiological conditions, our aim here is simply to assess connectivity in well controlled conditions, e.g. voltage-clamp, minimal spontaneous activity, known neuronal locations, etc. It's not clear that physiological activation patterns would improve on these tests and quite likely data would be noisier and harder to interpret.

      (3) Re results from Figure 2: Please fully describe the model in the methods section. Generally, I like using a modeling approach to explore the impact of convergent synaptic input to PVs from SCs that could effectively validate the experimental approach and enhance the interpretability of the experimental stim/recording outcomes. However, as currently detailed in the manuscript, the model description is inadequate for assessing the robustness of the simulation outcomes. If the IN model is simply integrate-and-fire with minimal biophysical attributes, then the results shown in Fig 2F might be trivial. Conversely, if the model offers a more biophysically accurate representation (e.g., with conductance-based synaptic inputs, synapses appropriately dispersed across the model IN dendritic tree, and standard PV IN voltage-gated membrane conductances), then the model's results could serve as a meaningful method to both validate and interpret the experiments.

      We appreciate the simulation descriptions were insufficient and have modified the manuscript to include additional details and clarification (p 14, paras 1-3).

      We're not sure we follow the logic here with respect to model types. The experiments were carried out in the voltage-clamp recording configuration with the goal of identifying correlated inputs independently from how they are integrated by the postsynaptic neuron. Given that membrane potential doesn't change (and so the CdVm/dt term of the membrane equation = 0), integrate and fire and point conductance-based models both simplify down to summing of input currents. We achieve this by convolving spike times with experimentally measured synaptic current waveforms. An assumption of our approach is that we achieve a reasonable space clamp. We believe this is justified given that stellate cells and PV interneurons are reasonably electrotonically compact, and that our analysis relies on consistent correlations rather than absolute amplitudes or time constants of the postsynaptic response and so should tolerate moderate space clamp errors.
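      The reduction described above, in which a voltage-clamped cell's response is modelled as a sum of synaptic currents obtained by convolving presynaptic spike times with a current waveform, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' simulation: the biexponential kernel and its time constants stand in for the experimentally measured waveforms, and the input count, firing rate, duration and all function names are hypothetical.

```python
import math
import random


def poisson_train(rng, rate_hz, duration_ms):
    """Spike times (integer ms) from a Bernoulli approximation to a Poisson process."""
    p = rate_hz / 1000.0
    return [t for t in range(duration_ms) if rng.random() < p]


def epsc_kernel(tau_rise_ms=0.5, tau_decay_ms=3.0, length_ms=30):
    """Assumed biexponential current waveform, normalised to a peak of -1 (inward)."""
    k = [math.exp(-t / tau_decay_ms) - math.exp(-t / tau_rise_ms) for t in range(length_ms)]
    peak = max(k)
    return [-v / peak for v in k]


def summed_current(trains, kernel, duration_ms):
    """Under voltage clamp the response reduces to a linear sum of input currents."""
    trace = [0.0] * duration_ms
    for train in trains:
        for t in train:
            for i, k in enumerate(kernel):
                if t + i < duration_ms:
                    trace[t + i] += k
    return trace


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def simulate_pair(n_inputs=8, n_shared=4, rate_hz=20.0, duration_ms=5000, seed=0):
    """Correlation between the currents of two cells sharing n_shared presynaptic inputs."""
    rng = random.Random(seed)
    kernel = epsc_kernel()
    shared = [poisson_train(rng, rate_hz, duration_ms) for _ in range(n_shared)]
    private_a = [poisson_train(rng, rate_hz, duration_ms) for _ in range(n_inputs - n_shared)]
    private_b = [poisson_train(rng, rate_hz, duration_ms) for _ in range(n_inputs - n_shared)]
    i_a = summed_current(shared + private_a, kernel, duration_ms)
    i_b = summed_current(shared + private_b, kernel, duration_ms)
    return pearson(i_a, i_b)


print("half of inputs shared:", simulate_pair(n_shared=4))
print("no inputs shared:", simulate_pair(n_shared=0))
```

      With equal-rate, equal-amplitude inputs, the expected correlation in this toy model is roughly the shared fraction of inputs, which is why consistent correlations can be informative even when absolute amplitudes are not.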

      Reviewer #3 (Public Review):

      This paper presents convincing data from technically demanding dual whole-cell patch recordings of stellate cells in medial entorhinal cortex slice preparations during optogenetic stimulation of PV+ interneurons. The authors show that the patterns of postsynaptic activation are consistent with dual recorded cells close to each other receiving shared inhibitory input and sending excitatory connections back to the same PV neurons, supporting a circuitry in which clusters of stellate cells and PV+IN interact with each other with much weaker interactions between clusters. These data are important to our understanding of the dynamics of functional cell responses in the entorhinal cortex. The experiments and analysis are quite complex and would benefit from some revisions to enhance clarity.

      These are technically demanding experiments, but the authors show quite convincing differences in the correlated response of cell pairs that are close to each other in contrast to an absence of correlation in other cell pairs at a range of relative distances. This supports their main point of demonstrating anatomical clusters of cells receiving shared inhibitory input.

      We appreciate the positive comments.

      The overall technique is complex and the presentation could be more clear about the techniques and analysis. In addition, due to this being a slice preparation they cannot directly relate the inhibitory interactions to the functional properties of grid cells which was possible in the 2-photon in vivo imaging experiment by Heys and Dombeck, 2014.

      We have modified the manuscript to try to improve the presentation (specific changes are detailed below). We agree that an important future challenge is to relate our findings to in vivo observations (p 11, para 2).

      Reviewer #1 (Recommendations For The Authors):

      Major points

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. In Figure 2 and its supplementary figures, the authors also showed examples of asynchronized activity. However, it is unclear to me what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. Related to this concern, it would also be important to simulate what level of activity asynchronization in SCs could still lead to correlated PV+ IN activity above shuffle, and among the recorded SCs, what percentage of cells belong to this synchronized/less asynchronized category.

      We address this point in our response to the public review. In brief, we have added additional cross-correlograms showing that ramp activation of SC pairs does not cause detectable synchronous activation. We also clarify that sensitivity of correlations of some widely separated pairs to GABA-blockers is suggestive of SCs activating common inhibitory inputs to cell pairs.

      (2) The above concern is more relevant to the focal stimulation experiments, in which the authors tried to claim that a pair of PV+ INs with correlated activity could receive inputs from the same SCs neurons. The authors also showed that the stimulation patterns leading to the activation of PV+ INs were more similar if PV+ INs had correlated activity (Figure 5D). However, if nearby SCs were more synchronized than distal SCs within this stimulation scale, even though a pair of PV+ INs showed correlated activity, they could still receive inputs from different but nearby SCs. In this case, it would be helpful to quantify the relationship between the level of activity synchronization of SCs and their distances. In Figure 5 Supplementary Figure 1, the data were only provided for 8 cells. If feasible, collecting data from more cells would be needed for the proposed analysis.

      We explain in our responses to point 1 above and in the public review that direct synchronisation of SCs is unlikely. This is particularly unlikely for focal stimulation experiments as the timing of responses of individual SCs is extremely variable between trials. Thus, even if there were strong synaptic connections between SCs, which the evidence suggests there are not (Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016), this would be unlikely to result in reliably timed coordinated firing.

      (3) It is unclear what the definition of "common inputs" is. Do they refer to inputs from the same group of cells? If different groups of cells provide synchronized inputs, will the inputs be considered "common inputs" or "different inputs"?

      We used "common" in an attempt to be consistent with classic work by Yoshimura et al. and in an attempt to be succinct. Thus, by common input we are referring to cell pairs for which a proportion of their input is from the same presynaptic neuron(s), as opposed to cell pairs for which their input is from different neurons and which therefore have no common input. We have attempted to make sure this is clear in the revised manuscript (e.g. description of simulations on p 4, para 2).

      (4) In the introduction and abstract, it was mentioned that "dense, but specific, direct excitatory-inhibitory synaptic interactions may operate at the scale of grid cell clusters". It is unclear to me how "dense" was demonstrated in the data. Can the authors clarify?

      Thanks for flagging this, we were insufficiently clear. We have revised the text to refer to cell pairs for which a proportion of their input is from the same presynaptic neurons (e.g. p 3, para 1), and separately about indirect coordination, by which we mean inputs to cell pairs that appear correlated because of coordination between upstream neurons.

      (5) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. Is there any evidence supporting this "direct interaction"?

      Direct interactions from SCs to PV+INs and from PV+INs to SCs were previously demonstrated by experiments with recordings from pairs of neurons (e.g. Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016; Winterer et al. 2017). Our results in Figures 3-5, which show that exciting SCs by light activation of ChR2 leads to excitation of PV+INs, and in Figure 7, which show that light activation of PV+INs expressing ChR2 leads to inhibition of SCs, are consistent with these previous conclusions. We have modified the manuscript to make sure this is clear (p 2, para 3).

      Is it possible that pyramidal cells are also involved in this interaction? If this is unlikely, the author may provide some pieces of evidence (e.g., timing of responses after optogenetic stimulation) or some discussions.

      This is unlikely given that previous studies indicate that connections from stellate to pyramidal cells are weak or absent (Winterer et al. 2017). We now clarify this in the Discussion (p 10, para 1).

      Minor points

      (1) Page 4, the last paragraph: the authors claimed that CCpeakmean was reduced and CClagvar increased with cell separation. Although the trends are visible in the figures, the authors may provide appropriate statistics to support this statement, such as a correlation between cell separation and CCpeakmean or CClagvar.

      We have inserted summaries of linear model fits into the legends for Figure 3E-F, Figure 5F-H and Figure 7D.

      (2)  If I understood correctly, in the second last paragraph on page 6, "pairs of SCs" should be changed to "pairs of PV+ INs".

      Thanks. Corrected.

      (3)  Page 9: the 7th line to the end: where is Figure S4?

      Corrected to 'Figure 3, Figure Supplement 2'.

      (4)  Page 27: at the end of figure caption B: two ".

      Corrected.

      (5)  Figures 3A and B: what are the red vertical rectangles?

      These are the regions shown on an expanded time base in C and D. This is now clarified in the legend.

      (6)  Page 28 Figure caption of D and E: (C) and (D) should be (D) and (E).

      Corrected.

      (7)  The first sentence of the third paragraph in INTRODUCTION: 'later' should be 'layer'.

      Corrected.

      Reviewer #2 (Recommendations For The Authors):

      - Some related work has been done by Beed et al. 2013 to map the spatial distribution of inputs to neurons in MEC. Certainly, there are differences in the approaches and the key questions, but the contribution of this study would benefit from a more detailed comparison of the results from Beed vs the current study and should be included in the discussion.

      It's hard to include a detailed comparison of results, at least without losing focus, as the two studies address different questions with different approaches. We already noted that 'Local optical activation of unidentified neurons has also been used to infer connectivity principles but with a focus on responses of single postsynaptic neurons (Beed et al., 2013, 2010)'. In addition, we now note that 'Our focal optogenetic stimulation approach also offers insight into the spatial organization of presynaptic neuronal populations, with the advantage, compared to focal glutamate uncaging previously used to investigate connectivity in the MEC (Beed et al., 2013, 2010), that the identity of the presynaptic cell population is genetically defined'.

      - There are a few places where the language is ambiguous or needs a more detailed description for clarity.

      - 3rd paragraph under "Focal activation of SCs generates common input to nearby PV+INs": the correlation probability description in this paragraph and a similar sentence in the methods are very hard to understand. I had to look up the analysis in Yoshimura et al. 2005 to understand what was done here. It's a nice analysis, but the manuscript could benefit from a more detailed description of this measure in the methods.

      We agree, it is a somewhat complex metric and is challenging to explain. In the interests of keeping the main text succinct, we have left the bare bones explanation as it was in the Results, but have expanded the explanation in the Methods. We hope this is now clear.

      - "Alternatively, if there is no clear spatial organization of SC to PV+INs connections, then the similarity between stimulus locations for pairs of SCs should have a random distribution." This sentence is hard to understand. I think the phrase "similarity of stimulus location" is strange and is driving the confusion in this sentence.

      We have replaced this with 'correspondence between active stimulus locations'.

      - In the discussion under "Spatial extent and functional organization of L2 circuits" there is a grammatical mistake (seems to be 2x phrasing of "leads to common synaptic input").

      Corrected.

      - Citation in the introduction/discussion. Introduction: in addition to Gu et al. 2018, Heys et al 2014 also showed there are non-random correlations among putative grid cells as a function of their somatic distance. In the discussion section, in addition to Gu et al. 2018, Heys et al. 2014 showed there is anatomical clustering of grid cells in MEC. This earlier work investigating functional correlations among neurons in the superficial aspect of MEC in vivo should be cited and is particularly relevant in these two sections of the manuscript.

      Thanks, we apologise for the oversight. We're well aware of this important study and have now cited it.

      - Typo: paragraph 3 of the intro; "later" should be "layer".

      Corrected.

      - Figure 5 (D-E): there is a typo; high correlation probability is D and low correlation is E (the text says C/D).

      Corrected.

      Reviewer #3 (Recommendations For The Authors):

      The paper is missing the bibliography section. This makes the review somewhat difficult as some cited papers are not immediately familiar based on the citation.

      Thanks, and our apologies for making extra work by omitting this. It is now included.

      Page 2 - "cell clusters" - they should also cite the paper by Heys and Dombeck, 2014 that shows a spatial scale of inhibitory interactions computed based on correlations of grid cells recorded using 2-photon calcium imaging.

      Added (see above).

      Page 2 - "later 2 of the MEC" - layer.

      Corrected.

      Page 2 - "synaptic interactions" - again they should mention the work by Heys and Dombeck, 2014 that indirectly measured the spatial scale of inhibition.

      Now cited in this paragraph.

      Page 4 "we simulated responses" and Figure 2E - in each simulation - did they fit the magnitude and time constant of the simulated EPSCs to individual EPSCs in the data? Or did they randomly vary these to find the best fit?

      The parameters for the simulations are given in the Methods and were chosen to correspond to the experimental values. We have rewritten this section to make the simulation methods clearer. Simulations using different time constants within a physiological range support similar conclusions.

      Page 4 - "we identified 35/71" - Are these the cells that appear in yellow as correlated in Figures 3E-F? If so, the text should indicate that these cells are shown in yellow.

      We have added this and have also updated the legends for additional clarification.

      Figure 2, Figure Supplement 1 - B,C - the following phrase is not clear: "when the 4 / 8 of each neurons inputs from SCs also project to the other neuron (B)," Should the "the" be removed? Also, by 4/8 do they mean 50%, or do they mean 4 to 8?

      Thanks, we've reworded to improve the clarity.

      E - "receiving presynaptic inputs consisted of 4 overlapping SCs" - should it say "consisting"?

      Corrected.

      Figure 3, Figure Supplement 1 part E - "the same data as (C )" - should this be the same data as (D)?? I do not see how doing clustering on the shuffled data in (C ) would give two groups, but it makes sense if it is from (D).

      That's right, now corrected.

      Page 5 - "used action potentials" - this is confusing. Is the word "used" supposed to be there?

      Corrected.

      Page 5 - "widefield activation experiments" - they should cite the experiments that they are referring to here.

      Added.

      Page 5 - "effect of blocking" - "Figure 4" - I find it very odd that the agent GABAzine in Figure 4 is not explicitly mentioned in the main text (though it is mentioned in the methods). The main text should indicate that blocking was performed using GABAzine.

      Added.

      Page and page 14 and Figure 5 - "shifted" - do they mean shuffled?

      We do. The classic papers by Yoshimura et al. used shifted so we keep this here so it's clear we've used their approach. We've added additional explanation to try to make sure the meaning is clear.
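      The idea behind a "shifted" control can be sketched as follows: correlations between simultaneously recorded trials of two cells are compared with correlations between trial i of one cell and trial i+1 of the other, which preserves stimulus-locked structure but breaks trial-specific shared input. This is a generic trial-shift sketch on synthetic data, not necessarily the exact procedure of Yoshimura et al. or of the manuscript; the function names, the noise model and every parameter value are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def matched_and_shifted(trials_a, trials_b, shift=1):
    """Mean correlation for same-trial pairs and for pairs offset by `shift` trials."""
    n = len(trials_a)
    matched = sum(pearson(trials_a[i], trials_b[i]) for i in range(n)) / n
    shifted = sum(pearson(trials_a[i], trials_b[(i + shift) % n]) for i in range(n)) / n
    return matched, shifted


def toy_trials(n_trials=20, n_samples=200, seed=1):
    """Synthetic trials (hypothetical): each trial has a component common to both cells
    plus independent noise of equal variance in each cell."""
    rng = random.Random(seed)
    trials_a, trials_b = [], []
    for _ in range(n_trials):
        common = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        trials_a.append([c + rng.gauss(0.0, 1.0) for c in common])
        trials_b.append([c + rng.gauss(0.0, 1.0) for c in common])
    return trials_a, trials_b


trials_a, trials_b = toy_trials()
matched, shifted = matched_and_shifted(trials_a, trials_b)
print("matched trials:", matched)
print("shifted trials:", shifted)
```

      In this toy case the matched correlation reflects the trial-specific common component, while the shifted value stays near zero and serves as the baseline against which the matched value is judged.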

      Figure 5 A, B, D, and E would benefit from a more detailed description. They should state whether the labels "1a" and "1b" and "2a" and "2b" refer to different recorded neurons in each pair. They should indicate that 2a and 2b are a different pair? Are the x, y axes of the images corresponding to anatomical position? Does "B" indicate the location of recordings shown in Figure 5B? The authors probably think this is all obvious, but it is not immediately obvious to the reader.

      We have added additional clarification.

      Page 8 - "Beed et al." - These papers by Beed ought to be cited in the introduction as well as they are highly relevant.

      We now cite Beed et al. 2013 in the Introduction when we discuss local inhibitory input to SCs. While the Beed et al. 2010 paper is an important contribution to understanding about pathways from deep to superficial layers, the introduction focuses on communication between identified pre- and postsynaptic populations within layer 2 and therefore we haven't found a way to cite it without losing focus. We do cite this paper multiple times elsewhere.

      Page 10 - "Excitatory-inhibitory interactions" - this summary of attractor models ought to cite the paper by Burak and Fiete as well.

      The discussion focuses on models with excitatory-inhibitory connectivity and cites an important paper from the Fiete group. The model by Burak and Fiete, while also important, is purely inhibitory and so is not well constrained by the known circuitry, and therefore could not be correctly cited here.

      Page 10 - "be consistent with models…or that focus on pyramidal neurons have also been proposed" - this seems ungrammatical as if two different sentences were merged.

      Corrected.

      References

      Couey, Jonathan J, Aree Witoelar, Sheng-Jia Zhang, Kang Zheng, Jing Ye, Benjamin Dunn, Rafal Czajkowski, et al. 2013. “Recurrent Inhibitory Circuitry as a Mechanism for Grid Formation.” Nat. Neurosci. 16 (3): 318–24. https://doi.org/10.1038/nn.3310.

      Dudman, Joshua T, and Matthew F Nolan. 2009. “Stochastically Gating Ion Channels Enable Patterned Spike Firing through Activity-Dependent Modulation of Spike Probability.” PLoS Comput. Biol. 5 (2): e1000290. https://doi.org/10.1371/journal.pcbi.1000290.

      Fuchs, Elke C, Angela Neitz, Roberta Pinna, Sarah Melzer, Antonio Caputi, and Hannah Monyer. 2016. “Local and Distant Input Controlling Excitation in Layer II of the Medial Entorhinal Cortex.” Neuron 89 (1): 194–208. https://doi.org/10.1016/j.neuron.2015.11.029.

      Pastoll, Hugh, Derek L Garden, Ioannis Papastathopoulos, Gülşen Sürmeli, and Matthew F Nolan. 2020. “Inter- and Intra-Animal Variation in the Integrative Properties of Stellate Cells in the Medial Entorhinal Cortex.” eLife 9 (February). https://doi.org/10.7554/eLife.52258.

      Pastoll, Hugh, Lukas Solanka, Mark C W van Rossum, and Matthew F Nolan. 2013. “Feedback Inhibition Enables Theta-Nested Gamma Oscillations and Grid Firing Fields.” Neuron 77 (1): 141–54. https://doi.org/10.1016/j.neuron.2012.11.032.

      Sürmeli, Gülşen, Daniel Cosmin Marcu, Christina McClure, Derek L F Garden, Hugh Pastoll, and Matthew F Nolan. 2015. “Molecularly Defined Circuitry Reveals Input-Output Segregation in Deep Layers of the Medial Entorhinal Cortex.” Neuron 88 (5): 1040–53. https://doi.org/10.1016/j.neuron.2015.10.041.

      Winterer, Jochen, Nikolaus Maier, Christian Wozny, Prateep Beed, Jörg Breustedt, Roberta Evangelista, Yangfan Peng, Tiziano D’Albis, Richard Kempter, and Dietmar Schmitz. 2017. “Excitatory Microcircuits within Superficial Layers of the Medial Entorhinal Cortex.” Cell Rep. 19 (6): 1110–16. https://doi.org/10.1016/j.celrep.2017.04.041.

    1. eLife assessment

      In this fundamental study, authors present compelling evidence for the diversity in cellular and synaptic properties of one class of spinal interneurons and tie it to their differentiated role in locomotor pattern generation. The findings reported here will be of broad interest to neuroscientists in general and to motor systems scientists in particular.

    2. Reviewer #1 (Public Review):

      Summary:

      In this very interesting study, Agha and colleagues show that two types of Chx10-positive neurons (V2a neurons) have different anatomical and electrophysiological properties and receive distinct patterns of excitatory and inhibitory inputs as a function of speed during fictive swimming in the larval zebrafish. Using single-cell fills they show that one cell type has a descending axon ("descending V2as"), while the other cell type has both a descending axon and an ascending axon ("bifurcating V2as"). In the Chx10:GFP line, descending V2as display strong GFP labeling, while bifurcating V2as display weak GFP labeling. The bifurcating V2as are located more laterally in the spinal cord. These two cell types have different electrophysiological properties as revealed by patch-clamp recordings. Positive current steps indicated that descending V2as comprise tonic spiking or bursting neurons. Bifurcating V2as comprise chattering or bursting neurons. The two types of V2a neurons display different recruitment patterns as a function of speed. Descending tonic and bifurcating chattering neurons are recruited at the beginning of the swimming bout, at fast speeds (swimming frequency above 30 Hz). Descending bursting neurons were preferentially recruited at the end of swimming bouts, at low speeds (swimming frequency below 30 Hz), while bifurcating bursting neurons were recruited for a broader swimming frequency range. The two types of V2a neurons receive distinct patterns of excitatory and inhibitory inputs during fictive locomotion. In descending V2as, when speed increases: i) excitatory conductances increase in fast neurons and decrease in slow neurons; ii) inhibitory conductances increase in fast neurons and increase in slow neurons. In bifurcating V2as, when speed increases: i) excitatory conductances increase in fast neurons but do not change in slow neurons; ii) inhibitory conductances increase in fast neurons and do not change in slow neurons. 
The timing of excitatory and inhibitory inputs was then studied. In descending V2as, fast neurons receive excitatory and inhibitory inputs that are in anti-phase with low contrast in amplitude and are both broadly distributed over the phase. The slow neurons receive two peaks of inhibition, one in anti-phase with the excitatory inputs and another just after the excitation. In bifurcating V2as, fast neurons receive two peaks of inhibition, while the slow ones receive anti-phase inhibition. They also show that silencing Dmrt3-labeled dI6 interneurons disrupted rhythm generation selectively at high speed.

      Strengths:

      This study focuses on the diversity of V2a neurons in zebrafish, an interesting cell population playing important roles in locomotor control and beyond, from fish to mammals. The authors provide compelling evidence that two subtypes of V2as show distinct anatomical, electrophysiological, and speed-dependent spiking activity, and receive distinct synaptic inputs as a function of speed. This opens the door to future investigation of the inputs and outputs of these neurons. Finding ways to activate or inhibit specifically these cells would be very helpful in the years to come. The authors also provide an interesting speed-dependent circuit mechanism for rhythm generation.

      Weaknesses:

      No major weakness detected. The experiments were carefully done, and the data are of high quality.

    3. Reviewer #2 (Public Review):

      Summary:

      Animals exhibit different speeds of locomotion. In vertebrates, this is thought to be implemented by different groups of spinal interneurons and motor neurons. A fundamental assumption in the field has been that neural mechanisms that generate and sustain the rhythm at different locomotor speeds are the same. In this study, the authors challenge this view. Using rigorous in vivo electrophysiology during fictive locomotion combined with genetics, the authors provide a detailed analysis of cellular and synaptic properties of different subtypes of spinal V2a neurons that play a crucial role in rhythm generation. Importantly, they are able to show that speed-related subsets of V2a neurons have distinct cellular and synaptic properties and may be utilizing different mechanisms to implement different locomotor speeds.

      Strengths:

      The authors fully utilize the zebrafish model system and solid electrophysiological analyses to study the active and passive properties of speed-related V2a subsets. Identification of the V2a subtype is based directly on their recruitment at different locomotor speeds and not on indirect markers like soma size, D-V position etc. Throughout the article, the authors have cleverly used standard electrophysiological tests and analysis to tease out different neuronal properties and link them to natural activity. For example, in Figures 2 and 4, the authors compare V2a spiking during current steps and during fictive swims, showing that spike rates measured with current steps are physiologically relevant and observed during natural recruitment. The experiments done are rigorous and well-controlled.

      The major claim of the manuscript is well substantiated by Figures 6 and 7. The authors have done rigorous experiments with statistical analysis to show that reciprocal inhibition is important for rhythmogenesis at fast speeds while recurrent inhibition is key at slow speeds. Furthermore, in Figure 7, a specific loss of reciprocal inhibition is shown to disrupt rhythmogenesis at high speeds but not at lower frequencies. These additions in the revised manuscript make the study extremely compelling.

      The Discussion is well-written and does an excellent job in putting this current study in the context of what is previously known. The addition of a working model in Figure 8 does a great job in summing these exciting and novel findings.

      Weaknesses:

      None noted.

    4. Reviewer #3 (Public Review):

      The manuscript by Agha et al. explores mechanisms of rhythmicity in V2a neurons in larval zebrafish. Two subpopulations of V2a neurons are distinguishable by anatomy, connectivity, level of GFP, and speed-dependent recruitment properties consistent with V2a neurons involved in rhythm generation and pattern formation. The descending neurons proposed to be consistent with rhythm-generating neurons are active during either slow or fast locomotion, and their firing frequencies during current steps are well matched with the swim frequency at which they fire. The bifurcating (patterning) neurons are active during a broader swim frequency range unrelated to their firing during current steps. All of the V2a neurons receive strong inhibitory input but the phasing of this input is based on neuronal type and the swim speed at which the neuron is active, with prominent in-phase inhibition in slow descending V2a neurons and bifurcating V2a neurons active during fast swimming. Antiphase inhibition is observed in all V2a neurons but it is the main source of rhythmic inhibition in fast descending V2a neurons and bifurcating neurons active during slow swimming. The authors suggest that properties supporting rhythmic bursting are not directly related to locomotor speed but rather to functional neuronal subtypes.

      Strengths:

      This is a well-written paper with many strengths including the rigorous approach. Many parameters, including projection pattern, intracellular properties, inhibition received, and activity during slow/fast swimming were obtained from the same neuron. This links up very well with prior data from the lab on cell position, birth order, morphology/projections, and control of MN recruitment to provide a comprehensive overview of the functioning of V2a interneuronal populations in the larval zebrafish. The added dI6 silencing experiments strengthen the claims made regarding the roles of reciprocal inhibition in rhythm and pattern at fast and slow speeds. The overall conclusions are well supported by the data.

      Weaknesses:

      The main weaknesses have been addressed in the revision.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The manuscript by Agha et al. provides a fundamental understanding regarding the participation of V2a interneurons in generating and patterning the locomotor rhythm. The authors provide convincing and solid evidence regarding the heterogeneity of V2a neurons in their intrinsic and synaptic properties and how these shape their outputs. The manuscript could be much improved by the inclusion of statistical analysis of some of the key data currently presented qualitatively. 

      We are extremely grateful for the positive and thorough comments provided by the three reviewers and have now had the opportunity to address all their concerns, as detailed below in our point-by-point response. Specifically, we have provided statistical analysis and major revisions to the text to help with rigor, clarity and interpretation, and we have also included new perturbation experiments that provide a more definitive test of one of our predictions, namely that reciprocal inhibition plays speed-specific roles in rhythm generation and pattern formation. The revisions greatly improve the manuscript and help bolster our conclusions.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary:

      In this very interesting study, Agha and colleagues show that two types of Chx10-positive neurons (V2a neurons) have different anatomical and electrophysiological properties and receive distinct patterns of excitatory and inhibitory inputs as a function of speed during fictive swimming in the larval zebrafish. Using single-cell fills they show that one cell type has a descending axon ("descending V2as"), while the other cell type has both a descending axon and an ascending axon ("bifurcating V2as"). In the Chx10:GFP line, descending V2as display strong GFP labeling, while bifurcating V2as display weak GFP labeling. The bifurcating V2as are located more laterally in the spinal cord. These two cell types have different electrophysiological properties as revealed by patch-clamp recordings. Positive current steps indicated that descending V2as comprise tonic spiking or bursting neurons. Bifurcating V2as comprise chattering or bursting neurons. The two types of V2a neurons display different recruitment patterns as a function of speed. Descending tonic and bifurcating chattering neurons are recruited at the beginning of the swimming bout, at fast speeds (swimming frequency above 30 Hz). Descending bursting neurons were preferentially recruited at the end of swimming bouts, at low speeds (swimming frequency below 30 Hz), while bifurcating bursting neurons were recruited for a broader swimming frequency range. The two types of V2a neurons receive distinct patterns of excitatory and inhibitory inputs during fictive locomotion. In descending V2as, when speed increases: i) excitatory conductances increase in fast neurons and decrease in slow neurons; ii) inhibitory conductances increase in fast neurons and increase in slow neurons. In bifurcating V2as, when speed increases: i) excitatory conductances increase in fast neurons but do not change in slow neurons; ii) inhibitory conductances increase in fast neurons and do not change in slow neurons. 
The timing of excitatory and inhibitory inputs was then studied. In descending V2as, fast neurons receive excitatory and inhibitory inputs that are in anti-phase with low contrast in amplitude and are both broadly distributed over the phase. The slow neurons receive two peaks of inhibition, one in anti-phase with the excitatory inputs and another just after the excitation. In bifurcating V2as, fast neurons receive two peaks of inhibition, while slow ones receive anti-phase inhibition. 

      Strengths: 

      This study focuses on the diversity of V2a neurons in zebrafish, an interesting cell population playing important roles in locomotor control and beyond, from fish to mammals. The authors provide compelling evidence that two subtypes of V2as show distinct anatomical, electrophysiological, and speed-dependent spiking activity, and receive distinct synaptic inputs as a function of speed. This opens the door to future investigation of the inputs and outputs of these neurons. Finding ways to activate or inhibit specifically these cells would be very helpful in the years to come. 

      Weaknesses: 

      No major weakness was detected. The experiments were carefully done, and the data were of high quality. 

      We really appreciate the positive assessment and have addressed minor issues below.

      Reviewer #2 (Public Review): 

      Summary: 

      Animals exhibit different speeds of locomotion. In vertebrates, this is thought to be implemented by different groups of spinal interneurons and motor neurons. A fundamental assumption in the field has been that neural mechanisms that generate and sustain the rhythm at different locomotor speeds are the same. In this study, the authors challenge this view. Using rigorous in vivo electrophysiology during fictive locomotion combined with genetics, the authors provide a detailed analysis of cellular and synaptic properties of different subtypes of spinal V2a neurons that play a crucial role in rhythm generation. Importantly, they are able to show that speed-related subsets of V2a neurons have distinct cellular and synaptic properties and may utilize different mechanisms to implement different locomotor speeds. 

      Strengths: 

      The authors fully utilize the zebrafish model system and solid electrophysiological analyses to study the active and passive properties of speed-related V2a subsets. Identification of the V2a subtype is based directly on their recruitment at different locomotor speeds and not on indirect markers like soma size, D-V position etc. Throughout the article, the authors have cleverly used standard electrophysiological tests and analysis to tease out different neuronal properties and link them to natural activity. For example, in Figures 2 and 4, the authors compare V2a spiking during current steps and during fictive swims, showing that spike rates measured with current steps are physiologically relevant and observed during natural recruitment. The experiments done are rigorous and well-controlled.

      Weaknesses: 

      The authors claim that a primary result of their study is that reciprocal inhibition is important for rhythmogenesis at fast speeds while recurrent inhibition is key at slow speeds. This is shown in Figure 6, however, the authors do not show any statistical tests for this claim. The authors also do not show any conclusive evidence that reciprocal inhibition is required for rhythmogenesis at fast speeds and vice versa for slow speeds. Additional experiments or modeling studies that conclusively show the necessity of these different inhibitory sources to the generation of different rhythms would be needed to strengthen this claim. 

      We have added new loss-of-function experiments as requested to strengthen the claim that reciprocal inhibition is critical for rhythmogenesis at fast speeds, but dispensable at slow. Specifically, we use botulinum toxin selectively expressed in Dmrt3-labeled dI6 interneurons, which play a role in reciprocal inhibition at a variety of speeds (new Figure 7). These experiments demonstrate a selective impact on rhythmic burst generation and alternation during periods of swimming where the highest frequency motor activity occurs. During lower frequency activity, rhythm generation is preserved, however motor output is selectively altered, consistent with the idea that reciprocal inhibition plays an important role in patterning at slow speeds.

      The authors do a great job of teasing out cellular and synaptic properties in the different V2a subsets, however, it is not clear if or how these match the final output. For example, V2aD neurons are tonic or bursting for fast and slow speeds respectively but it is not intuitive how these cellular properties would influence phasic excitation and inhibition these neurons receive. 

      This question gets at the heart of what we are trying to illustrate in Figure 6. Specifically, in the new Figure 6E,F we have aligned the cumulative distribution of spikes recorded in cell-attached mode with phasic excitatory and inhibitory currents to reveal how well cellular properties versus patterns of synaptic drive match the final output (spikes). Our expectation was that if intrinsic cellular properties were ultimately generating phasic spiking patterns, then patterns of excitatory and inhibitory drive need not be phasic. Instead, we see that synaptic drive is phasic, with spiking occurring between peaks in excitation and troughs in inhibition. Since post-synaptic cellular properties should not impact the pre-synaptic excitation they receive, this suggests that phasic spiking in all V2a neurons, regardless of the capacity for cellular rhythmogenesis, is a result of phasic input. In response to this concern, we have elaborated our discussion of what cellular properties may contribute and the impact on output in the Discussion (L502-511). 

      It is not clear from the discussion why having different mechanisms of rhythm generation at different speeds could be an important circuit design. The authors use anguilliform and carangiform modes of swimming to denote fast and slow speeds but there are differences in these movements other than speed, like rostrocaudal coordination. The frequency and pattern of these movements are linked and warrant more discussion. 

      We appreciate the opportunity to elaborate on this point more in the Discussion. In particular, we have added more text to clarify differences in movement related to both pattern-formation and rhythm-generation (L373-398) and to also suggest potential reasons for differences in mechanisms of rhythm generation (L478-488).  

      Reviewer #3 (Public Review):

      The manuscript by Agha et al. explores mechanisms of rhythmicity in V2a neurons in larval zebrafish. Two subpopulations of V2a neurons are distinguishable by anatomy, connectivity, level of GFP, and speed-dependent recruitment properties consistent with V2a neurons involved in rhythm generation and pattern formation. The descending neurons proposed to be consistent with rhythm-generating neurons are active during either slow or fast locomotion, and their firing frequencies during current steps are well matched with the swim frequency at which they fire. The bifurcating (patterning) neurons are active during a broader swim frequency range unrelated to their firing during current steps. All of the V2a neurons receive strong inhibitory input but the phasing of this input is based on neuronal type and swim speed when the neuron is active, with prominent in-phase inhibition in slow descending V2a neurons and bifurcating V2a neurons active during fast swimming. Antiphase inhibition is observed in all V2a neurons but it is the main source of rhythmic inhibition in fast descending V2a neurons and bifurcating neurons active during slow swimming. The authors suggest that properties supporting rhythmic bursting are not directly related to locomotor speed but rather to functional neuronal subtypes. 

      This is a well-written paper with many strengths including the rigorous approach. Many parameters, including projection pattern, intracellular properties, inhibition received, and activity during slow/fast swimming were obtained from the same neuron. This links up very well with prior data from the lab on cell position, birth order, morphology/projections, and control of MN recruitment to provide a comprehensive overview of the functioning of V2a interneuronal populations in the larval zebrafish. The overall conclusions are well supported by the data. Weaknesses are relatively minor and were largely related to terminology for some of the secondary conclusions. 

      (1) The assumption is made that all in-phase inhibition is recurrent and out-of-phase inhibition is reciprocal. The latter is likely true but the definition of recurrent may be a bit loose, as it could include multisegmental feed-forward inhibition as well. 

      This is an excellent point, which was also raised by Reviewer 1. We have now added references that justify this assertion (L281-283). We also add a new figure with schematics (Figure 8) to make it clearer how we are defining sources of recurrent versus reciprocal inhibition, as based on the anatomical constraints of the circuit. We agree that multi-segmental inputs could contribute to inhibition, but they will likely be more broadly distributed based on rostro-caudal location and contribute to tonic sources of drive.  We now clarify this (L285-286).

      (2) In a few places, it is mentioned that the properties of the V2a-D neurons are consistent with pacemakers. This could be true of both the V2a-D and -B neurons that burst in response to depolarizing steps but the properties of the remaining (fast) V2a-D neurons do not seem to be consistent with pacemakers, based on the properties shown. Tonic firing at a frequency related to the locomotor speed the neuron is active during and strong antiphase inhibition may instead suggest a stronger network component driving the rhythmicity. 

      We have been purposefully agnostic regarding the relative contribution of pacemaking to rhythm generation in the paper. Our measurements of bursting overlap with swim frequencies only in the V2a-D subtype. Similarly, the spike rates of V2a-D neurons alone overlap with their swim frequencies (Fig 2D,G,I). Since both respond to tonic input (current injection) by spiking in a pattern that resembles their natural spiking behavior, we have treated these cellular properties both as pacemaking. Although the bursting behavior is more consistent with what is normally considered pacemaking in rhythmic motor circuits, in the basal ganglia field tonic firing of dopaminergic neurons in the substantia nigra is referred to as pacemaking. Since the tonic firing pattern overlaps with swimming frequency in the same way the bursting pattern does, we are less inclined to discount its possible contribution to rhythmogenesis based on the fact they do not burst. We have made modifications to the document to make this point clearer (L409-416).  Regardless, our data argue that pacemaking is unlikely to be a major contributor to phasic firing in V2a neurons, at least at midbody, so we agree with you on this last point.

      Reviewer #1 (Recommendations For The Authors): 

      I only have very minor suggestions. 

      (1) It would be useful to add a table or a figure summarizing the main results (integration of anatomy, electrophysiological properties, synaptic inputs, firing, swimming speed). 

      We agree and have added a figure panel summarizing the main results (new Figure 8).

      (2) Some statistics to possibly add (only suggestions): Do bifurcating V2as display significantly weaker GFP labeling than descending V2as? Do descending V2as have a significantly smaller soma size? Do descending V2as have a significantly lower rheobase and significantly higher resistance? Are tonic descending neurons and chattering bifurcating neurons located significantly more dorsally than the bursting descending and bifurcating neurons? Is there a way to show that bifurcating bursting neurons are recruited statistically on a broader swimming frequency range than other cell types (e.g. SD, coefficient of variation, cumulative distribution function with Kolmogorov-Smirnov test)? 

      For the first question, in all cases when we targeted more dimly labeled neurons they were bifurcating. We now clarify this in the text (L119, L129-132). However, this is difficult to quantify, since absolute levels of fluorescence will vary from preparation to preparation based on the dissection and intensity of epifluorescence illumination. In addition, we did not always take images prior to recording and levels of GFP after recording will vary depending on relative state of dialysis. So, unfortunately, we cannot provide a rigorous statistical analysis beyond the qualitative statement we provide.

      For the remainder of the questions, we now provide statistical analysis for soma size, position, rheobase, and resistance for the data in Figure 2.  Please note, we have reported all our statistical analyses in the figure legends. We also provide analysis of the density distributions of swimming frequencies for slow bursting bifurcating neurons and slow bursting descending neurons as requested, which are significantly different following a K-S test (L162).
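      To illustrate the kind of comparison a two-sample K-S test makes between recruitment-frequency distributions, here is a minimal sketch using only the Python standard library. It is not the authors' analysis code: the function name `ks_two_sample`, the example swim frequencies, and the use of the asymptotic Kolmogorov approximation for the p-value are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, approximate p) for a two-sample Kolmogorov-Smirnov test.

    D is the largest vertical gap between the two empirical CDFs.
    The p-value uses the asymptotic Kolmogorov series, which is only
    a rough guide for small samples.
    """
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # Empirical CDFs change only at observed values, so check each one.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    lam = d * math.sqrt(n * m / (n + m))
    if lam == 0:
        return d, 1.0
    p = 2 * sum((-1) ** (j - 1) * math.exp(-2 * (j * lam) ** 2) for j in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical swim frequencies (Hz) at which two cell groups were recruited.
narrow_group = [20, 22, 23, 24, 25, 26, 27, 28]
broad_group = [18, 24, 30, 36, 42, 48, 54, 60]
d_stat, p_approx = ks_two_sample(narrow_group, broad_group)
print(f"D = {d_stat:.3f}, approximate p = {p_approx:.3f}")
```

      Because the statistic compares whole cumulative distributions, it is sensitive to differences in spread as well as in location, which is why it suits a question about the breadth of a recruitment range.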

      (3) Some details to possibly add (only suggestions): proportion of neurons in which single cell fills were done/checked anatomically? Proportions of bursting/chattering/tonic/bursting neurons? In Figure 1, maybe define visually bifurcating vs descending neurons. In Figure 2I, the recruitment of bifurcating chattering neurons is not plotted. Is that normal? Figures 6D, E, maybe specify more clearly which neurons are the fast and slow ones. In Figure 3C, the X-axis name is missing. 

      For the first question, the proportion is 100%, since the morphology of all neurons was confirmed post recording, which we now clarify in the Methods section (L573). For the second question, the numbers of bursting/chattering/tonic/bursting neurons are now reported in the legend of Figure 2, in addition to the total number of V2a-D and V2a-B types, so it is clear what proportion of the recording population this represents. For the third question, in Figure 1 we cannot define V2a neurons as bifurcating or descending yet; this was only possible to confirm after the recording (Figure 2), and was done for every neuron (as mentioned above). For the fourth question, for Figure 2I the chattering response was too variable to be meaningful in terms of averaging and plotting, which we now mention in the text (L169-171). The standard deviations were very large. For the fifth question, we have modified Figures 6D, E to more clearly label fast and slow V2a neurons. Finally, we have included the X-axis label in Figure 3C, thank you!

      (4) Some text to possibly modulate (only suggestions): 

      A possible role for these V2a subtypes in the rhythm generation and pattern formation layer is an interesting idea but this may not be completely solved by the present experiments. Maybe the authors could suggest future experiments in the discussion that would establish how to tackle this important question (double bursts, deletions, etc...)? 

      We appreciate the opportunity to raise future experiments that could help further tease apart their contribution to rhythm and pattern and have now added potential experiments to the Discussion (L498-501; L527-529), which include more precise molecular identification, spatial perturbation, and computational modeling.

      It would be nice to cite the references in which the rhythm/pattern CPG concept was proposed initially (lines 49-50 and elsewhere, Cf. Perret and Cabelguen 1980 Brain Res; Perret et al. 1989 Stance and Motion, Plenum Press; McCrea et al. 2006 J Physiol). 

      Apologies for our poor scholarship here, we now credit the appropriate primary research articles (L50-51).

      In the abstract, it would be useful to say clearly which cells are descending vs. bifurcating ones. Same thing in the result section, maybe it would be nice to identify the two populations long before line 127. 

      We have modified the abstract and introduction sections accordingly. We also note that the two populations are defined in the first paragraph of the results (L90).

      About the possible mechanism of rhythm generation, it is mentioned in line 54 that a single mechanism was proposed to exist, but the authors also mention in lines 122-123 that several mechanisms were proposed for rhythm generation... Maybe adjust the introduction? 

      As requested, we have clarified our meaning in the introduction (L55-58). Several mechanisms exist, but the likelihood that different mechanisms operate at different speeds has not been considered.  Either cellular properties are tuned to different speeds (i.e., bursting is faster in neurons recruited at faster speeds) or network properties can explain different speeds (i.e., different frequencies and patterns emerge from the connectivity).

      About the convention that in fish in-phase currents originate from the ipsilateral and out-of-phase currents originate from the contralateral side (lines 271-275), is there any reference for this assumption? 

      Yes, we now provide references (L281-283).

      Lines 338-345 stating that reciprocal inhibition is important for rhythm generation as predicted by the half-center model can sound surprising to some authors considering that many studies showed that inhibition is not needed for rhythm generation, including lamprey hemicords stimulated electrically (Cangiano and Grillner 2003 J Neurophysiol; 2005 J Neurosci, Cangiano et al. 2012 Neuroscience), salamander hemicords or hemisegments stimulated chemically (Ryczko et al. 2010, 2015 J Neurophysiol), or rhythm activity evoked on each side of the cord using optogenetic stimulation of glutamatergic neurons (Hägglund et al. 2013 PNAS) etc. To demonstrate the importance of inhibition in rhythmogenesis, one would need to activate and/or deactivate the ipsilateral versus contralateral inhibitory neurons. It would be nice to maybe add citations to such studies if available in the zebrafish literature. Overall I would simply suggest modulating this section to be a bit more balanced conceptually. 

      We have included the above referenced studies for lampreys and added ones for tadpoles (L464-468), to stick with undulatory swimmers. We had focused on experiments with the most selective perturbations in the interests of space, but appreciate the opportunity to present both arguments. We also include new loss-of-function experiments that impact one spinal population linked to reciprocal inhibition (Dmrt3-labeled dI6 interneurons), which demonstrate a speed-specific impact on rhythmogenesis (L323-371; new Figure 7) and compare our findings to a recent study in the zebrafish literature examining the impact of spinal Dmrt3-ablations on axial rhythmogenesis (L426-433).

      Line 676 "episodies". 

      Thanks, corrected.

      Reviewer #2 (Recommendations For The Authors): 

      The authors make a claim that recurrent and reciprocal inhibition play key roles in rhythmogenesis at different speeds. This is not conclusively shown. Rayleigh's z-test can be used to test the significance of the directionality of circular data. Including more data from experiments or computational models to show the necessity of reciprocal or recurrent inhibition for timed spiking of V2a neurons would address this. 

      We have now modified Figure 6 so we can directly compare differences in reciprocal and recurrent inhibition between V2a types. We now report statistical analysis in the figure legends using Watson’s two-sample test of homogeneity to test differences in the circular data. As mentioned above, we have also added new loss-of-function experiments as requested to strengthen the claim that reciprocal inhibition is critical for rhythmogenesis at fast speeds, but dispensable at slow speeds. Specifically, we use botulinum toxin selectively expressed in Dmrt3-labeled dI6 interneurons, which play a role in reciprocal inhibition at a variety of speeds (new Figure 7). These experiments demonstrate a selective impact on rhythmic burst generation and alternation during periods of swimming where the highest frequency motor activity occurs. During lower frequency activity, rhythm generation is preserved; however, motor output is selectively altered, consistent with the idea that reciprocal inhibition plays an important role in patterning at slow speeds.
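
      For readers unfamiliar with circular statistics, the Rayleigh test suggested by the reviewer can be sketched as follows. This is an illustration only, not the analysis reported in the figure legends (which uses a two-sample Watson test). The function name `rayleigh_test` is hypothetical, phases are assumed to be in radians, and the p-value uses a standard closed-form approximation rather than an exact distribution.

```python
import numpy as np

def rayleigh_test(phases_rad):
    """Rayleigh test for non-uniformity of circular data (illustrative sketch).

    Returns (r, z, p): mean resultant length, z = n * r**2, approximate p-value.
    """
    phases = np.asarray(phases_rad, dtype=float)
    n = phases.size
    # Mean resultant length: magnitude of the average unit vector.
    r = float(np.abs(np.mean(np.exp(1j * phases))))
    z = n * r**2
    # Closed-form approximation to the p-value, written in terms of the
    # (unnormalized) resultant length R = n * r.
    R = n * r
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n**2 - R**2)) - (1 + 2 * n))
    return r, z, float(min(p, 1.0))

# Spike phases spread evenly around the cycle: no preferred direction.
uniform = np.linspace(0, 2 * np.pi, 20, endpoint=False)
# Spike phases locked to one point in the cycle: strongly directional.
locked = np.full(20, 0.3)

print(rayleigh_test(uniform))
print(rayleigh_test(locked))
```

      A small p-value indicates that spikes are concentrated at a preferred phase; the test does not compare two groups, which is why a two-sample test is needed to contrast V2a types.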

      In Figure 4D, the authors show that V2a neurons, both subtypes, spike in advance of the center of the motor burst. Recent studies (Jay et al., 2023) have shown differences in the timing of V2aD and V2aB neurons. Are there differences in the methods or selection of cells that would reflect differences in results? 

      This is a great point and we appreciate the opportunity to reconcile our observations here with those in Jay et al., 2023. In the Jay et al. paper, we used drifting visual stimuli to evoke fictive swimming. These experiments allow you to uncouple rhythm generation (forward propulsion) and pattern formation (lateral direction). Notably, fictive swim frequencies during so-called optomotor responses are below 35 Hz, meaning that we are sampling exclusively from V2a neurons recruited during carangiform swim mode. In these experiments, slow V2a-D neurons fire well in advance of slow V2a-B neurons, compared to what we see here, which is relatively synchronous. Critically, however, the phase-advanced firing pattern revealed in the Jay et al. paper for V2a-D neurons aligns with the phase-advanced excitatory input reported here. In addition, the recruitment probabilities of slow V2a-D neurons are higher in the Jay et al. paper than what we report here. Collectively, these observations suggest either more effective excitation during optomotor responses (Jay et al.) or more potent inhibition during escape responses (Agha et al.). Ultimately, differences in the relative synchrony of firing among slow V2a-D and slow V2a-B neurons appear to depend on the nature of the stimulus and range of swim frequencies, where in one case frequency and amplitude modulation are coupled over a broad range of frequencies (somatosensory stimuli delivered here), while in the other case frequency and amplitude modulation are uncoupled over a narrow range of frequencies (visual stimuli in Jay et al.). We now elaborate on this point in the Discussion (L485-498).

      Given the conserved nature of spinal circuits across vertebrates, it is also important to discuss these findings in the context of limbed animals. In tetrapods, changes in locomotor speed also involve pattern/gait changes, however, it is not known if or how these changes in frequency and pattern are linked. This study, by suggesting that different speeds are implemented not only by different neurons but possibly by different neuronal mechanisms, provides important cues for the missing link and would strengthen the discussion. 

      We agree and have made substantial edits to the beginning of the Discussion to provide better context for the impact of our work (L373-398).

      Minor points: 

      Line 122: of needs to be replaced by or. 

      Corrected, thanks!

      Figure 3B Top panel: What is the grey bar? 

      This has been removed for clarity.

      Figure 3B bottom panel is not referenced in the main text at all. 

      Now referenced (L187, L189)

      Line 260: 2nd inhibition needs to be replaced with excitation. 

      Done, thanks!

      Reviewer #3 (Recommendations For The Authors): 

      Minor comments: 

      - Figure 2 panel ordering is visually appealing but tough to follow. 

      We apologize and tried reconfigurations, but they just looked too kludgy.  Hoping for a pass on this one.

      - Lines 164-166 and 319-327 (related to comment 2 above): For the fast/tonic V2a-Ds, it is not clear that this is intrinsic and it is not consistent with pacemaker properties. This could also be (and likely is) synaptically/network-driven rhythmicity, although the firing frequencies match up well with the swim frequencies. 

      Fast/tonic V2a-Ds were tested with somatic current injection as with all other neurons, which we assume primarily reflects intrinsic cellular properties. The spike rates we observe in fast/tonic V2a-Ds overlap with spike rates observed during fictive swimming, so they are positioned as well as bursting neurons to contribute to pacemaking. We also elaborate on this point in response to Major Comment #2.

      - Lines 189-192: The patterning neurons receive excitatory drive before rhythm-generating neurons. The time constant explanation makes sense for why two neurons with a common drive would fire at different times but this does not support the proposed hierarchical arrangement or being consistent with V2a-Bs being downstream as mentioned in lines 49-56 and 218-219. 

      In response to this point, we have modified Figure 6 so we can directly compare the timing of presynaptic excitatory inputs between the types. Here it can be seen clearly that phasic excitatory inputs to both fast and slow V2a-Ds are phase-advanced relative to fast and slow V2a-Bs (Figure 6B,C). As the reviewer mentions, it is likely a combination of time constants and the relative balance of excitation and inhibition that ultimately leads to synchronous spiking despite differences in the timing of inputs.

      - Lines 338-339: It is not shown that the rhythm relies on inhibition during slow. 

      This line has been removed in the revision process.

      - Consistent with the importance of reciprocal (contralateral) inhibition in fast locomotion here, rodent fictive locomotion is slower in hemisect than in the full cord. However, the Rybak and O'Donovan groups suggest that this is due to loss of drive to ipsilateral inhibitory neurons by excitatory contralateral projections, rather than contralateral inhibitory interneurons (see Falgairolle and O'Donovan 2019, 2021, and Shevtsova et al 2022). 

      This is an interesting point that highlights how we are defining reciprocal versus recurrent inhibition. In this example, although ipsilaterally-projecting interneurons are responsible for inhibition, since they are excited by commissurally-projecting excitatory interneurons, we would classify this as feedforward (reciprocal), not feedback (recurrent), inhibition. So reciprocal (feedforward) inhibition is still important to get higher frequency rhythms; it is simply di-synaptic in this case. We have added a new figure (Figure 8) to clarify what we mean by reciprocal (feedforward) and recurrent (feedback) based on the ipsilateral projection patterns of V2a neurons, and point out in the Discussion that the definitions would be flipped for excitatory interneurons (L452-455).

    1. eLife assessment

      The authors provide a valuable analysis of what neural circuit mechanisms enable varying the speed of retrieval of sequences, which is needed in situations such as reproducing motor patterns. Their use of heterogeneous plasticity rules to allow external currents to control speed of sequence recall is a novel alternative to other mechanisms proposed in the literature. They perform a convincing characterization of relevant properties of recall via simulations and theory, though a better mapping to biologically plausible mechanisms is left for future work.

    2. Reviewer #1 (Public Review):

      While there are many models for sequence retrieval, it has been difficult to find models that vary the speed of sequence retrieval dynamically via simple external inputs. While recent works have proposed some mechanisms, the authors here propose a different one based on heterogeneous plasticity rules. Temporally symmetric plasticity kernels (that do not distinguish between the order of pre and post spikes, but only their time difference) are expected to give rise to attractor states, asymmetric ones to sequence transitions. The authors incorporate a rate-based, discrete-time analog of these spike-based plasticity rules to learn the connections between neurons (leading to connections similar to Hopfield networks for attractors and sequences). They use either a parametric combination of symmetric and asymmetric learning rules for connections into each neuron, or separate subpopulations having only symmetric or asymmetric learning rules on incoming connections. They find that the latter is conducive to enabling external inputs to control the speed of sequence retrieval.

      Comments on revised version:

      The authors have addressed most of the points of the reviewers.

      A major substantive point raised by both reviewers was on the biological plausibility of the learning.

      The authors have added a section in the Discussion. This remains an open question, however the discussion suffices for the current paper.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      While there are many models for sequence retrieval, it has been difficult to find models that vary the speed of sequence retrieval dynamically via simple external inputs. While recent works [1,2] have proposed some mechanisms, the authors here propose a different one based on heterogeneous plasticity rules. Temporally symmetric plasticity kernels (that do not distinguish between the order of pre and post spikes, but only their time difference) are expected to give rise to attractor states, asymmetric ones to sequence transitions. The authors incorporate a rate-based, discrete-time analog of these spike-based plasticity rules to learn the connections between neurons (leading to connections similar to Hopfield networks for attractors and sequences). They use either a parametric combination of symmetric and asymmetric learning rules for connections into each neuron, or separate subpopulations having only symmetric or asymmetric learning rules on incoming connections. They find that the latter is conducive to enabling external inputs to control the speed of sequence retrieval.

      Strengths:

      The authors have expertly characterised the system dynamics using both simulations and theory. How the speed and quality of retrieval vary across phase space has been well studied. The authors are also able to vary the external inputs to reproduce a preparatory followed by an execution phase of sequence retrieval as seen experimentally in motor control. They also propose a simple reinforcement learning scheme for learning to map the two external inputs to the desired retrieval speed.

      Weaknesses:

      (1) The authors translate spike-based synaptic plasticity rules to a way to learn/set connections for rate units operating in discrete time, similar to their earlier work in [5]. The bio-plausibility issues of learning in [5] carry over here, e.g., the authors ignore any input due to the recurrent connectivity during learning and effectively fix the pre and post rates to the desired ones. While the learning itself is not fully bio-plausible, it does lend itself to writing the final connectivity matrix in a manner that is easier to analyze theoretically.

      We agree with the reviewer that learning is not ‘fully bio-plausible’. However, we believe that extending the results to a model in which synaptic plasticity depends on recurrent inputs is beyond the scope of this work. We have added a mention of this issue in the Discussion in the revised manuscript.

      (2) While the authors learn to map the set of two external input strengths to speed of retrieval, they still hand-wire one external input to the subpopulation of neurons with temporally symmetric plasticity and the other external input to the other subpopulation with temporally asymmetric plasticity. The authors suggest that these subpopulations might arise due to differences in the parameters of Ca dynamics as in their earlier work [29]. How these two external inputs would connect to neurons differentially based on the plasticity kernel / Ca dynamics parameters of the recurrent connections is still an open question which the authors have not touched upon.

      The issue of how external inputs could self-organize to drive the network to retrieve sequences at appropriate speeds is addressed in the Results section, paragraph ‘Reward-driven learning’. These inputs are not ‘hand-wired’: they are initially random and then acquire the necessary strengths to allow the network to retrieve the sequences at different speeds thanks to a simple reinforcement learning scheme. We have rewritten this section to clarify this issue.

      (3) The authors require that temporally symmetric and asymmetric learning rules be present in the recurrent connections between subpopulations of neurons in the same brain region, i.e. some neurons in the same brain region should have temporally symmetric kernels, while others should have temporally asymmetric ones. The evidence for this seems thin. Though, in the discussion, the authors clarify 'While this heterogeneity has been found so far across structures or across different regions in the same structure, this heterogeneity could also be present within local networks, as current experimental methods for probing plasticity only have access to a single delay between pre and post-synaptic spikes in each recorded neuron, and would therefore miss this heterogeneity'.

      We agree with the reviewer that this is currently an open question. We describe this issue in more detail in the Discussion of the revised manuscript.

      (4) One aspect the authors have not connected to is an earlier work by one of the authors:

      Brunel, N. (2016). Is cortical connectivity optimized for storing information? Nature Neuroscience, 19(5), 749-755. https://doi.org/10.1038/nn.4286 which suggests that the experimentally observed over-representation of symmetric synapses suggests that cortical networks are optimized for attractors rather than sequences.

      We thank the reviewer for this suggestion. We have added a paragraph in the discussion that discusses work on statistics of synaptic connectivity in optimal networks. We expect that in networks that contain two subpopulations of neurons, the degree of symmetry should be intermediate between a network storing fixed point attractors exclusively, and a network storing sequences exclusively.

      Despite the above weaknesses, the work is a solid advance in proposing an alternate model for modulating speed of sequence retrieval and extends the use of well-established theoretical tools. This work is expected to spawn further works like extending to a spiking neural network with Dale's law, more realistic learning taking into account recurrent connections during learning, and experimental follow-ups. Thus, I expect this to be an important contribution to the field.

      We thank the reviewer for the insightful comments.

      Reviewer #2 (Public Review):

      Sequences of neural activity underlie most of our behavior. And as experience suggests we are (in most cases) able to flexibly change the speed for our learned behavior which essentially means that brains are able to change the speed at which the sequence is retrieved from the memory. The authors here propose a mechanism by which networks in the brain can learn a sequence of spike patterns and retrieve them at variable speed. At a conceptual level I think the authors have a very nice idea: use of symmetric and asymmetric learning rules to learn the sequences and then use different inputs to neurons with symmetric or asymmetric plasticity to control the retrieval speed. The authors have demonstrated the feasibility of the idea in a rather idealized network model. I think it is important that the idea is demonstrated in more biologically plausible settings (e.g. spiking neurons, a network with exc. and inh. neurons with ongoing activity).

      Summary

      In this manuscript the authors have addressed the problem of learning and retrieving sequential activity in neuronal networks. In particular, they have focussed on the problem of how sequence retrieval speed can be controlled.

      They have considered a model with excitatory rate-based neurons. Authors show that when sequences are learned with both temporally symmetric and asymmetric Hebbian plasticity, by modulating the external inputs to the network the sequence retrieval speed can be modulated. With the two types of Hebbian plasticity in the network, sequence learning essentially means that the network has both feedforward and recurrent connections related to the sequence. By giving different amounts of input to the feed-forward and recurrent components of the sequence, authors are able to adjust the speed.
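
      The distinction between the two connectivity components can be made concrete with a toy example. The sketch below is a deliberately simplified stand-in, not the authors' model: it assumes binary ±1 patterns, a sign update rule, and illustrative sizes `N` and `P`, none of which come from the manuscript. It only shows that a symmetric Hebbian term holds the current pattern, an asymmetric term advances to the next one, and a network split into two subpopulations does both at once.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 1000, 5                               # illustrative sizes (assumption)
xi = rng.choice([-1.0, 1.0], size=(P, N))    # binary patterns, one per row (assumption)

# Symmetric (Hopfield-like) term: pattern mu supports itself -> attractors.
J_sym = xi.T @ xi / N
# Asymmetric term: pattern mu drives pattern mu + 1 -> sequence transitions.
J_asym = xi[1:].T @ xi[:-1] / N

# Heterogeneous network: the first half of the neurons has symmetric incoming
# weights (rows of J_sym), the second half asymmetric ones (rows of J_asym).
is_sym = np.arange(N) < N // 2
J_het = np.where(is_sym[:, None], J_sym, J_asym)

def overlap(state, pattern):
    """Normalized dot product between a network state and a stored pattern."""
    return float(state @ pattern) / pattern.size

start = xi[0]
stay = np.sign(J_sym @ start)        # symmetric weights hold the current pattern
advance = np.sign(J_asym @ start)    # asymmetric weights push to the next pattern
mixed = np.sign(J_het @ start)       # each subpopulation follows its own rule

print(overlap(stay, xi[0]), overlap(advance, xi[1]))
print(overlap(mixed[is_sym], xi[0][is_sym]), overlap(mixed[~is_sym], xi[1][~is_sym]))
```

      In this toy setting, biasing which subpopulation dominates the network state is what would shift the balance between holding and advancing; how external inputs achieve this in the rate model is the subject of the manuscript itself.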

      Strengths

      - Authors solve the problem of sequence retrieval speed control by learning the sequence in both feedforward and recurrent connectivity within a network. It is a very interesting idea for two main reasons: 1. It does not rely on delays or short-term dynamics in neurons/synapses 2. It does not require that the animal is presented with the same sequences multiple times at different speeds. Different inputs to the feedforward and recurrent populations are sufficient to alter the speed. However, the work leaves several issues unaddressed as explained below.

      Weaknesses

      - The main weakness of the paper is that it is mostly driven by a motivation to find a computational solution to the problem of sequence retrieval speed. In most cases they have not provided any arguments about the biological plausibility of the solution they have proposed e.g.:

      - Is there any experimental evidence that some neurons in the network have symmetric Hebbian plasticity and some temporally asymmetric? In the references authors have cited some references to support this. But usually the switch between temporally symmetric and asymmetric rules is dependent on spike patterns used for pairing (e.g. bursts vs single spikes). In the context of this manuscript, it would mean that in the same pattern, some neurons burst and some don't and this is the same for all the patterns in the sequence. As far as I see here authors have assumed a binary pattern of activity which is the same for all neurons that participate in the pattern.

      There is currently only weak evidence for heterogeneity of synaptic plasticity rules within a single network, though there is plenty of evidence for such a heterogeneity across networks or across locations within a particular structure (see references in our Discussion). The reviewer suggests another interesting possibility, that the temporal asymmetry could depend on the firing pattern of the post-synaptic neuron. An example of such a behavior can be found in a paper by Wittenberg and Wang in 2006, where they show that pairing single spikes of pre and post-synaptic neurons leads to LTD at all time differences in a symmetric fashion, while pairing a pre-synaptic spike with a burst of post-synaptic spikes leads to temporally asymmetric plasticity, with an LTP window at short positive time differences. We now mention this possibility in the Discussion, but we believe fully exploring this scenario is beyond the scope of the paper.

      - How would external inputs know that they are impinging on a symmetric or asymmetric neuron? Authors have proposed a mechanism to learn these inputs. But that makes the sequence learning problem a two stage problem -- first an animal has to learn the sequence and then it has to learn to modulate the speed of retrieval. It should be possible to find experimental evidence to support this?

      Our model does not assume that the two processes necessarily occur one after the other. Importantly, once the correct external inputs that can modulate sequence retrieval are learned, sequence retrieval modulation will automatically generalize to arbitrary new sequences that are learned by the network.

      - Authors have only considered homogeneous DC input for sequence retrieval. This kind of input is highly unnatural. It would be more plausible if the authors considered fluctuating input which is different for each neuron.

      We have modified Figure 1e and Figure 2c to show the effects of fluctuating inputs on pattern correlations and single unit activity. We find that these inputs do not qualitatively affect our results.

      - All the work is demonstrated using a firing rate based model of only excitatory neurons. I think it is important that some of the key results are demonstrated in a network of both excitatory and inhibitory spiking neurons. As the authors very well know it is not always trivial to extend rate-based models to spiking neurons.

      I think at a conceptual level authors have a very nice idea but it needs to be demonstrated in a more biologically plausible setting (and by that I do not mean biophysical neurons etc.).

      We have included a new section in the discussion with an associated figure (Figure 7) demonstrating that flexible speed control can be achieved in an excitatory-inhibitory (E-I) spiking network containing two excitatory populations with distinct plasticity mechanisms.

      Reviewer #1 (Recommendations For The Authors):

      In the introduction, the authors state: 'symmetric kernels, in which coincident activity leads to strengthening regardless of the order of pre and post-synaptic spikes, have also been observed in multiple contexts with high frequency plasticity induction protocols in cortex [21]'. To my understanding, [21]'s final model 3, ignores LTD if the post-spike also participates in LTP, and only considers nearest-neighbour interactions. Thus, the kernel would not be symmetric. Can the authors clarify what they mean and how their conclusion follows, as [21] does not show any kernels either.

      In this statement, we were not referring to the model in [21], but rather the experimentally observed plasticity kernels at different frequencies. In particular, we were referring to the symmetric kernel that appears in the bottom panel of Figure 7c in that paper.

      The authors should also address the weaknesses mentioned above. They don't need to solve the issues but expand (and maybe indicate resolutions) on these issues in the Discussion.

      For ease of reproducibility, the authors should make their code available as well.

      We intend to publish the code required to reproduce all figures on GitHub.

      Reviewer #2 (Recommendations For The Authors):

      -  Show the ground state of the network before and after learning.

      We have decided not to include such a figure, as we have not analyzed the learning process, but instead a network with a fixed connectivity matrix which is assumed to be the end result of a learning process.

      -  Authors have only considered a network of excitatory neurons. This does not make sense. I think they should demonstrate a network of both exc. and inh. neurons (spiking neurons) exhibiting ongoing activity.

      See our comment to Reviewer #2 in the previous section.

      -  Show how the sequence dynamics unfolds when we assume a non-zero ongoing activity.

      We are not sure what the reviewer means by ‘non-zero ongoing activity’. We now show the dynamics of the network in the presence of noisy inputs, which can represent ongoing activity from other structures (see Fig 1e and 2c).

      -  From the correlation (==quality) alone it is difficult to judge how well the sequence has been recovered. Authors should consider showing some examples so that the reader can get a visual estimate of what 0.6 quality may mean. High speed is not really associated with high quality (Fig 2b). So it is important to show how the sequence retrieval quality is for non-linear and heterogeneous learning rules.

      We believe that some insight into the relationship between speed and quality for the case of non-linear and heterogeneous learning rules is provided by the correlation plots for chosen input configurations (see Fig. 3a and 5b). We leave a full characterization for future work.

      -  Authors should show how the retrieval and quality of sequences change when they are recovered with positive input, or positive input to one population and negative to another. In the current version sequence retrieval is shown only with negative inputs. This is a somewhat non-biological setting. The inhibitory gating argument (L367-389) is really weak.

      We would like to clarify that with the parameters chosen in this paper, the transfer function has half its maximal rate at zero input. This is due to the fact we chose the threshold to be zero, using the fact that any threshold can be absorbed in the external inputs. Thus, negative inputs really mean sub-threshold inputs, and they are consistent with sub-threshold external excitatory inputs. We have clarified this issue in the revised manuscript.
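
      The argument can be written out explicitly. The notation below is introduced here for illustration and is assumed rather than taken from the manuscript: $r_i$ is the rate of unit $i$, $J_{ij}$ the recurrent weights, $I_i$ the external input, $\theta$ the threshold, and $\phi$ a sigmoidal transfer function with maximal rate $r_{\max}$.

```latex
\begin{align}
  r_i(t+1) &= \phi\Big(\sum_j J_{ij}\, r_j(t) + I_i - \theta\Big)
            = \phi\Big(\sum_j J_{ij}\, r_j(t) + \tilde{I}_i\Big),
  \qquad \tilde{I}_i \equiv I_i - \theta, \\
  \phi(0) &= \tfrac{1}{2}\, r_{\max}
  \quad\Longrightarrow\quad
  \tilde{I}_i < 0 \iff I_i < \theta .
\end{align}
```

      Under these assumptions, a negative effective input $\tilde{I}_i$ corresponds to a positive but sub-threshold excitatory input $0 < I_i < \theta$, so no net inhibitory drive is implied.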

      -  Authors should demonstrate how the sequence retrieval dynamics is altered when they assume a fluctuating input current for sequence retrieval instead of a homogeneous DC input.

      See our comment to Reviewer #2 in the previous section.

      -  Authors should show what are the differences in synaptic weight distribution for the two types of learning (bi-linear and non-linear). I am curious to know if the difference in the speed in the two cases is related to the weight distribution. In general I think it is a good idea to show the synaptic weight distribution before and after learning.

      As mentioned above, we do not study any learning process, but rather a network with a fixed connectivity matrix, assumed to represent the end result of learning. In this network, the distribution of synaptic weights converges to a Gaussian in the large p and cN limits, independently of the functions f and g, because of the central limit theorem, if there are no sign constraints on weights. In the presence of sign constraints, the distribution is a truncated Gaussian.
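
      The central-limit argument can be checked numerically with a small sketch. The choices below are assumptions made for illustration only: Gaussian patterns, identity functions for f and g, and a 1/sqrt(p) normalization that keeps the weight variance fixed (the manuscript's own scaling may differ). The helper name `offdiag_excess_kurtosis` is hypothetical. Excess kurtosis is used as a simple measure of departure from a Gaussian.

```python
import numpy as np

def offdiag_excess_kurtosis(p, N=400, seed=0):
    """Excess kurtosis of off-diagonal weights J_ij = sum_mu xi_i^mu xi_j^mu / sqrt(p)."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((p, N))      # p Gaussian patterns over N units (assumption)
    J = xi.T @ xi / np.sqrt(p)            # f = g = identity (assumption)
    w = J[~np.eye(N, dtype=bool)]         # drop self-connections
    w = w - w.mean()
    return float(np.mean(w**4) / np.mean(w**2) ** 2 - 3.0)

# With a single pattern each weight is a product of two Gaussians (heavy-tailed);
# with many patterns the sum of such products approaches a Gaussian.
for p in (1, 10, 400):
    print(p, offdiag_excess_kurtosis(p))
```

      The excess kurtosis shrinks towards zero as the number of stored patterns grows, which is the sense in which the weight distribution becomes Gaussian regardless of the specific form of f and g.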

      -  I suggest the use of a monochromatic color scale for figure 2b and 3b.

      Figure 3: The sentence describing panel 2 seems incomplete.

      Also explain why there is a non-monotonic relationship between I_s and speed for some values of I_a in 3b.

      There is a non-monotonic relationship for retrieval quality, not speed. We have clarified this in the manuscript text, but don’t currently have an explanation for why this phenomenon occurs for these specific values of I_a.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Additional Discussion Points

      (1) There is not much exploration of potential mechanisms, i.e., the impact of PV neuron activity on the broader circuit. Additionally, the study exclusively focuses on PV cells and does not explore the role of other prefrontal populations, particularly those known to respond to cue-evoked fear states. The discussion should consider how PV activity might impact the broader circuit and whether the present findings are specific to PV cells or applicable to other interneuron subtypes.

      We have added an extensive discussion of potential mechanisms and the potential contributions of other interneuron subtypes:

      “For example, PV neurons aid in improving visual discrimination through sharpening response selectivity in visual cortex (Lee et al., 2012). In prefrontal cortex, PV neurons are critical for task performance, particularly during performance of tasks that require flexible behavior such as rule shift learning (Cho et al., 2020) and reward extinction (Sparta et al., 2014). Further, PV neurons play an essential role in the generation of cortical gamma rhythms, which contribute to synchronization of selective populations of pyramidal neurons (Sohal et al., 2009; Cardin et al., 2009). Courtin et al (2014) showed that brief suppression of dorsomedial prefrontal (dmPFC) PV neural activity enhanced fear expression, one of the main functions of the dmPFC, by synchronizing the spiking activity of dmPFC pyramidal neurons (Courtin et al., 2014). This result is potentially relevant to our findings, but likely involves different circuit mechanisms because of the difference in timescale, targeted area, and downstream projection targets (Vertes, 2004). These and other studies support the idea that PV neural activity supports the execution of a behavior by shaping rather than suppressing cortical activity, potentially by selecting among conflicting behaviors by the synchronization of different pyramidal populations (Warden et al., 2012; Lee et al., 2014).

      The roles of other inhibitory neural subtypes (such as somatostatin (SOM)-expressing and vasoactive intestinal peptide (VIP)-expressing IL GABA neurons) in avoidance behavior are currently unknown, but are likely important given the role of SOM neurons in gamma-band synchronization (Veit et al., 2017), and the role of VIP neurons in regulating PV and SOM neural activity (Cardin, 2018).” 

      (2) There is some discordance between changes in neural activity and behavior. For example, in Figure 4C, the relationship between PV neuron activity and movement emerges almost immediately during learning, but successful active avoidance emerges much more gradually. Why is this?

      We have added extensive text to the discussion that addresses this issue:

      “Interestingly, the rise in IL PV neural activity during movement does not require avoidance learning. IL PV neurons begin to respond during movement immediately after the animal has received a single shock in an environment, but learning to cross the chamber to avoid the signaled shock takes tens of trials. Why is there a discordance between the emergence of the IL PV signal during movement and avoidance learning?

      The components underlying active avoidance have been debated over the years, but are thought to involve at least two essential behaviors – suppressing freezing, and moving to safety (LeDoux et al., 2017). Freezing is the default response of mice upon hearing a shock-predicting tone, and can be learned in a single trial (LeDoux, 1996; Fanselow, 2010; Zambetti et al., 2022). When a predator is in the distance, freezing can increase the chance of survival by reducing the chances of detection. However, a strategic avoidance behavior may prevent a future encounter with the predator altogether. The importance of IL PV neural activity in defensive behavior may be to suppress reactive defensive behaviors such as freezing in order to permit a flexible goal-directed response to threat.

      The freezing suppression and avoidance movement components of the avoidance response are dissociable, both because freezing precedes avoidance learning, and because animals intermittently move prior to avoidance learning. Our finding that the rise in PV activity during movement emerges immediately after receiving a single shock, tens of trials before animals have learned the avoidance behavior, suggests that the IL PV signal is associated with the suppression of freezing. Further, IL PV neurons do not respond during movement toward cued rewards because in reward-based tasks there is no freezing response in conflict with reward approach behavior.” 

      (3) vmPFC was defined here as including the infralimbic (IL) and dorsal peduncular (DP) regions. While the role of IL has been frequently characterized for motivated behavior, relatively few studies have examined DP. Perhaps the authors are just being cautious, given the challenges involved in the viral targeting of the IL region without leakage to nearby regions such as DP. But since the optical fibers were positioned above the IL region, it is possible that DP did not contribute much to either the fiber photometry signals or the effects of the optogenetic manipulations. Perhaps DP should be completely omitted, which is more consistent with the definitions of vmPFC in the field.

      Yes, we included DP to be cautious as our viral expression sometimes leaks into DP, though the optic fiber targets IL. We have replaced vmPFC with IL throughout the manuscript. 

      (4) In the Discussion, the authors should consider why PV cells exhibit increased activity during both movement initiation and successful chamber crossing during avoidance. While the functional contribution of the PV signal during movement initiation was tested with optogenetic inhibition, some discussion of the possible role of the additional PV signal during chamber crossing would be of interest to readers who are intrigued by the signaling of two events. Is the chamber crossing signal related to successful avoidance or learned safety (e.g., see Sangha, Diehl, Bergstrom, Drew 2020)?

      IL PV neural activity starts to increase at movement initiation, peaks at chamber crossing (when movement speed is highest), and decreases after chamber crossing (Figure 1E). Thus, the increase in PV neural activity at movement initiation and at chamber crossing are different phases of the same event. 

      We think this signal is unlikely to be a safety signal, and have added text to the discussion to clarify this issue:

      “We think the IL PV signal is unlikely to be a safety signal (Sangha et al., 2020). First, the PV signal rises during movement not only in the avoidance context, but during any movement in a “threatening” context (i.e. a context where the animal has been shocked). For example, PV neural activity rises during movement during the intertrial interval in the avoidance task. Further, the emergence of the PV signal during movement happens quickly – after the first shock – and significantly before the animal has learned to move to the safe zone. This suggests a close association with enabling movement in a threatening environment, when animals must suppress a freezing response in order to move. Additionally, the rise in PV activity was specifically associated with movement and not with tone offset, the indicator of safety in this task. Finally, if IL PV neural activity reflects safety signals one would expect the response to be enhanced by learning, but the amplitude of the IL PV response was unaffected by learning after the first shock.”

      (5) The primary conclusion here that PV cells control the fear response should be considered within the context of prior findings by the Herry laboratory. Courtin et al (2014) demonstrated a select role of prefrontal PV cells in the regulation of fear states, accomplished through their control over prefrontal output to the basolateral amygdala. The observations in this paper, which used both ChR2 and Arch-T to address the impact of vmPFC PV activity on reactive behavior, are highly relevant to issues raised both in the Introduction and Discussion.

      Courtin et al (2014)’s finding is very important. We did not discuss this paper originally because Courtin et al. is about dmPFC, which has a different role in fear processing than IL/vmPFC. We have added text about this finding to the discussion:

      “Courtin et al. (2014) showed that brief suppression of dorsomedial prefrontal (dmPFC) PV neural activity enhanced fear expression, one of the main functions of the dmPFC, by synchronizing the spiking activity of dmPFC pyramidal neurons. This result is potentially relevant to our findings, but likely involves different circuit mechanisms because of the difference in timescale, targeted area, and downstream projection targets (Vertes, 2004).”

      Additional analyses

      (1) As avoidance trials progress (particularly on days 2 and 3), do PFC PV responses attenuate? That is, do continued unreinforced tone presentations lead to reduced reliance on PV cell-mediated suppression in order for successful avoidance to occur?

      We added Figure 1—Figure supplement 1M and 1N and a sentence on page 5: “IL PV neural activity during the avoidance movement was not attenuated by learning or repeated reinforcement (Figure 1—Figure supplement 1M and N, N = 8 mice, p = 0.8886, 1-way ANOVA).” We only included data from days 1 and 2, since we started to introduce short and long tone trials on day 3 which might interfere. 

      (2) In Figure 3D, it would be very informative and further support the claim of "no role for movement during reward" if the response of these cells during the "initiation of movement during reward-approach" was shown (similar to Figure 1F for threat avoidance).

      Thank you for the question. We added Figure 3—Figure supplement 1B and C to show IL PV neural activity aligned to initiation of movement during reward-approach. IL PV activity decreased after movement initiation for reward approach (N = 6 mice, p=0.0382, paired t-test). This further solidifies our claim that IL PV neuron activity only increases for threat avoidance.   

      Reviewer 1 (Recommendations For The Authors):

      (1) Fig1G shows the average response of PV cells during chamber crossing on an animal-to-animal basis. It would be informative to also see a similar plot for movement initiation.

      We have added the suggested figure in Figure 1—Figure supplement 1B.  

      (2) In the Results section (Page 5), there is a small issue with the logic. It says: "As vmPFC inactivation impairs avoidance behavior, the activity of inhibitory vmPFC PV neurons might be predicted to be low during successful avoidance trials." As opposed to "low", it should say "high", right? If inhibition impairs avoidance, then high responding by these cells would be presumed to drive the avoidance response, as supported by your findings.

      We have re-worded the text in this section. Based on prior findings that IL inactivation impairs avoidance (Moscarello et al., 2013), we predicted that inhibitory PV neurons would be less active during avoidance, because activating these neurons could suppress IL. However, we found that they were selectively active during avoidance.

      (3) In the caption/legend for Fig1E, it says that the "black ticks" indicate "tone onset". But it should say "movement initiation".

      We thank the reviewer for pointing out this error. The ticks do indicate tone onset, and we have corrected the figure to reflect this. 

      Reviewer 2 (Recommendations For The Authors):

      (4) Perhaps replace the term 'good outcomes' with 'reinforcing outcomes' or simply 'reinforcement'.

      Thank you for the suggestion. We have replaced ‘good outcomes’ with ‘reinforcing outcomes’.

      Reviewer 3 (Recommendations For The Authors):

      (5) It would be useful to provide some (perhaps speculative) explanation for the discordance between the PV activity-movement relationship and success of active avoidance in Fig. 4C

      We have added text to the discussion that addresses this issue:

      “Interestingly, the rise in IL PV neural activity during movement does not require avoidance learning. IL PV neurons begin to respond during movement immediately after the animal has received a single shock in an environment, but learning to cross the chamber to avoid the signaled shock takes tens of trials. Why is there a discordance between the emergence of the IL PV signal during movement and avoidance learning?

      The components underlying active avoidance have been debated over the years, but are thought to involve at least two essential behaviors – suppressing freezing, and moving to safety (LeDoux et al., 2017). Freezing is the default response of mice upon hearing a shock-predicting tone, and can be learned in a single trial (LeDoux, 1996; Fanselow, 2010; Zambetti et al., 2022). When a predator is in the distance, freezing can increase the chance of survival by reducing the chances of detection. However, a strategic avoidance behavior may prevent a future encounter with the predator altogether. The importance of IL PV neural activity in defensive behavior may be to suppress reactive defensive behaviors such as freezing in order to permit a flexible goal-directed response to threat.

      The freezing suppression and avoidance movement components of the avoidance response are dissociable, both because freezing precedes avoidance learning, and because animals intermittently move prior to avoidance learning. Our finding that the rise in PV activity during movement emerges immediately after receiving a single shock, tens of trials before animals have learned the avoidance behavior, suggests that the IL PV signal is associated with the suppression of freezing. Further, IL PV neurons do not respond during movement toward cued rewards because in reward-based tasks there is no freezing response in conflict with reward approach behavior.” 

      (6) I don't really understand what is shown in Figure 4D -- exactly what time points does this represent? Was habituation performed every day?

      Figure 4D shows data from the approach task, not the avoidance task. These data are from well-trained mice, not the first day of training on this task. There was a pre-task recording period every day.

      (7) Why was optogenetic inhibition only delivered from 0.5-2.5 sec after the tone cue?

      We wanted to avoid any possibility that perception of the tone would be disrupted, so we delayed the onset of optogenetic inhibition. We chose 0.5 sec onset because animals typically begin to move ~1 second after tone onset.

      (8) The regression analysis with shuffled time points is not well explained -- some additional methodological details are needed (Fig. 2H).

      We added the following to the methods section to provide a clearer explanation: 

      “DF/F(t) was modeled as the linear combination of all event kernels. Given the occurrence time points of all event types, we can use linear regression to estimate a characteristic kernel for each event type. Kernel coefficients of the model were solved by minimizing the mean squared error between the model and the actual recorded signals. To test whether kernel k_i is an essential component of the raw calcium dynamics, we compared the explanatory power of the full model to that of a reduced model in which the occurrence time points of event i were randomly assigned. Thus, in the reduced model the coefficients of kernel k_i should not reflect the response to the event.”
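
      The comparison described in this passage can be illustrated with a minimal sketch. It is not the authors' code: the function names (`build_design_matrix`, `fit_kernels`, `shuffled_mse`), the kernel length, and the use of NumPy least squares are assumptions made for illustration. Each event type contributes a block of lagged indicator columns, the kernels are the least-squares coefficients, and the reduced model refits after replacing one event type's times with random ones.

```python
import numpy as np

def build_design_matrix(n_samples, event_times, kernel_len):
    """One column per (event type, lag): counts events that occurred `lag` samples earlier."""
    blocks = []
    for times in event_times:
        block = np.zeros((n_samples, kernel_len))
        for t in times:
            for lag in range(kernel_len):
                if t + lag < n_samples:
                    block[t + lag, lag] += 1.0
        blocks.append(block)
    return np.hstack(blocks)

def fit_kernels(signal, event_times, kernel_len):
    """Least-squares kernels (one row per event type) and the model's mean squared error."""
    X = build_design_matrix(len(signal), event_times, kernel_len)
    coefs, *_ = np.linalg.lstsq(X, signal, rcond=None)
    mse = float(np.mean((signal - X @ coefs) ** 2))
    return coefs.reshape(len(event_times), kernel_len), mse

def shuffled_mse(signal, event_times, kernel_len, shuffle_idx, rng):
    """MSE of the reduced model: times of one event type are randomly reassigned."""
    reduced = [np.asarray(t) for t in event_times]
    n_events = len(reduced[shuffle_idx])
    # Hypothetical shuffling scheme: draw the same number of events uniformly in time.
    reduced[shuffle_idx] = rng.choice(len(signal) - kernel_len, size=n_events, replace=False)
    _, mse = fit_kernels(signal, reduced, kernel_len)
    return mse
```

      Under these assumptions, an event type whose kernel matters should give a full-model error clearly below the error of the reduced model; repeating the shuffle many times would yield a null distribution for that difference.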

      Editor's notes:

      -  Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      Thank you for pointing this out. We have included all the test statistics and exact p values as suggested.

      -  Please note the sex of the mice and distribution of sexes in each group for each experiment.

      We have added the sex of mice for all experiments in the methods section.

    2. eLife assessment

      This important study extends our understanding of how the medial prefrontal cortex regulates flexible action during adversity. The data provide compelling evidence of a role for prefrontal PV neuron activity in active avoidance. This builds on the general idea that these neurons play a role in flexible behavior and demonstrates this in the context of freezing/avoidance conflict. The overall findings contribute to our understanding of mechanisms that support aversively motivated instrumental learning and may provide insight into both stress vulnerability and resilience processes. This work will be of interest to those interested in learning, aversive motivation, interneuron and/or prefrontal cortex function, or conditions related to these processes and mechanisms.

    3. Reviewer #2 (Public Review):

      Summary:

      This study examined the role of a prefrontal cortex cell type in active avoidance behavior. The authors conduct a series of behavioral experiments incorporating fiber photometry and optogenetic silencing. The results indicate that prefrontal parvalbumin (PV) neurons play a permissive role in performing signaled active avoidance learning, for which details are sorely lacking. Notably, infralimbic parvalbumin activity resolves incompatible defensive responses to threat by suppressing conditional freezing in order to permit active instrumental controlling responses. The overall findings provide a significant contribution to our understanding of mechanisms that support aversively motivated instrumental learning and may provide insight into both stress vulnerability and resilience processes.

      Strengths:

      The writing and presentation of data is clear. The authors use a number of temporally-relevant methods and analyses that identify a novel prefrontal mechanism in resolving the conflict between competing actions (freezing vs escape avoidance). The authors conduct an extensive number of experiments to demonstrate that the uncovered prefrontal mechanism is selective for the initiation of avoidance under threat circumstances, not reward settings or general features of movement.

      Weaknesses:

      The study exclusively focuses on parvalbumin cells, thus questions remain whether the present findings are specific to parvalbumin or applicable to other prefrontal interneuron subtypes. The exact mechanisms that coordinate infralimbic parvalbumin cell activity and threat avoidance behavior are not explored.

    4. Reviewer #3 (Public Review):

      Summary:

      Here the authors study the role of parvalbumin (PV) expressing neurons in the ventromedial prefrontal cortex (vmPFC) of mice in active avoidance behavior using fiber photometry and optogenetic inhibition.

      Strengths:

      The methods are appropriate, the experiments are well done, and the results are all consistent with the conceptual model in which vmPFC PV neurons inhibit freezing to enable avoidance movements. There are good controls to rule out a role for cue offset in triggering changes in PV neuron activity, or for a nonspecific role of vmPFC PV neurons in movement initiation.

      Weaknesses:

      Although potential mechanisms, i.e., the impact of PV neuron activity on the broader circuit, are discussed, they are not directly examined here. There is some discordance between changes in neural activity and behavior: in Figure 4C, the relationship between PV neuron activity and movement emerges almost immediately during learning, but successful active avoidance emerges much more gradually. Again, this is discussed and plausible explanations for this discrepancy are provided.

    1. eLife assessment

      This valuable manuscript reveals sex differences in bi-conditional Pavlovian learning and conditional behavior. Males learn hierarchical context-cue-outcome associations more quickly, but females show more stable and robust task performance. These sex differences are related to cellular activation in the orbitofrontal cortex. Although the evidence supporting these claims is convincing, some assertions of sex differences in context-dependent discrimination behaviour may be slightly overstated yet have strong potential to guide future research to clarify the nature of these differences. The results will be of interest to many behavioural neuroscientists, particularly those who investigate sex-specific behaviours.

    2. Reviewer #1 (Public Review):

      Summary:

      Peterson et al., present a series of experiments in which the Pavlovian performance (i.e. time spent at a food cup/port) of male and female rats is assessed in various tasks in which context/cue/outcome relationships are altered. The authors find no sex differences in context-irrelevant tasks, and no such differences in tasks in which the context signals that different cues will earn different outcomes. They do find sex differences, however, when a single outcome is given and context cues must be used to ascertain which cue will be rewarded with that outcome (Ctx-dep O1 task). Specifically, they find that males acquired the task faster, but that once acquired, performance of the task was more resilient in female rats against exposures to a stressor. Finally, they show that these sex differences are reflected in differential rates of c-fos expression in all three subregions of rat OFC, medial, lateral and ventral, in the sense that it is higher in females than males, and only in the animals subject to the Ctx-dep O1 task in which sex differences were observed.

      Strengths:

      • Well written
      • Experiments elegantly designed
      • Robust statistics
      • Behaviour is the main feature of this manuscript, rather than any flashy techniques or fashionable lab methodologies, and luckily the behaviour is done really well.
      • For the most part I think the conclusions were well supported, although I do have some slightly different interpretations to the authors in places.

      Weaknesses:

      The authors have done an excellent job of addressing all previous weaknesses. I have no further comments.

    3. Reviewer #2 (Public Review):

      Summary:

      A bidirectional occasion-setting design is used to examine sex differences in the contextual modulation of reward-related behaviour. It is shown that females are slower to acquire contextual control over cue-evoked reward seeking. However, once established, the contextual control over behaviour was more robust in female rats (i.e., less within-session variability and greater resistance to stress) and this was also associated with increased OFC activation.

      Strengths:

      The authors use sophisticated behavioural paradigms to study the hierarchical contextual modulation of behaviour. The behavioural controls are particularly impressive and do, to some extent, support the specificity of the conclusions. The analyses of the behavioural data are also elegant, thoughtful, and rigorous.

      Weaknesses:

      The authors have addressed the major weaknesses that I identified in a previous review.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript reports an experiment that compared groups of rats on the acquisition and performance of a Pavlovian bi-conditional discrimination, in which the presence of one cue, A, signals that the presentation of one CS, X, will be followed by a reinforcer and a second CS, Y, will be nonreinforced. Periods of cue A alternated with periods of cue B, which signaled the opposite relationship: cue X is nonreinforced and cue Y is reinforced. This is a conditional discrimination problem in which the rats learned to approach the food cup in the presence of each CS conditional on the presence of the third background cue. The comparison groups consisted of the same conditional discrimination with the exception that each CS was paired with a different reinforcer. This makes the problem easier to solve as the background is now priming a differential outcome. A third group received simple discrimination training of X reinforced and Y nonreinforced in cues A and B, and the final group were trained with X and Y reinforced on half the trials (no discrimination). The results were clear that the latter two discrimination learning procedures resulted in rapid learning in comparison to the first. Rats required about 3 times as many 4-session blocks to acquire the bi-conditional discrimination as the other two discrimination groups. Within the bi-conditional discrimination group, female and male rats spent the same amount of time in the food cup during the rewarded CS, but females spent more time in the food cup during CS- than males. The authors interpret this as a deficit in discrimination performance in females on this task and use a measure that exaggerates the difference in CS+ and CS- responding (a discrimination ratio) to support their point. When tested after acute restraint stress, the male rats spent less time in the food cup during the reinforced CS in comparison to the female rats, but did not lose discrimination performance entirely.
      There was also some evidence of more fos positive cells in the orbitofrontal cortex in females. Overall, I think the authors were successful in documenting performance on the bi-conditional discrimination task. Showing that it is more difficult to perform than other discriminations is valuable and consistent with the proposal that accurate performance requires encoding of conditional information (which the authors refer to as "context"). There is evidence that female rats spend more time in the food cup during CS-, but I hesitate to agree that this is an important sex difference. There is no cost to spending more time in the food cup during CS- and they spend much less time there than during CS+. Males and females also did not differ in their CS+ responding, suggesting similar levels of learning. A number of factors could contribute to more food cup time in CS-, such as smaller body size and more locomotor activity. The number of food cup entries during CS+ and CS- was not reported here. Nevertheless, I think the manuscript will make a useful contribution to the field and hopefully lead readers to follow up on these types of tasks. One area for development would be to test the associative properties of the cues controlling the conditional discrimination: can they be shown to have the properties of Pavlovian occasion setting stimuli? Such work would strengthen the justification/rationale for using the terms "context" and "occasion setter" to refer to these stimuli in this task in the way the authors do in this paper.

      Strengths:

      Nicely designed and conducted experiment.
      Documents performance difference by sex.

      Weaknesses:

      Overstatement of sex differences.
      Inconsistent, confusing, and possibly misleading use of terms to describe/imply the underlying processes contributing to performance.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 2 (Public Review):

      Stress response in males versus females: The authors argue that the contextual control over behaviour was more robust in female rats as females show less within session variability and greater resistance to stress. What evidence is there that the restraint stress procedure caused a similar stress response in both sexes? That is, was the stress induction equally effective in males and females?

      The restraint protocol used in this study is a well-established stressor in rodents, known to produce robust behavioral and physiological effects (HPA axis activation), in both sexes. Although not measured in this study, the ACTH and cortisol responses are actually greater in females during restraint. To the extent that “stress induction” is interpreted as “HPA axis activation”, this strongly suggests that the stress induction in males and females was at least comparable, if not greater in females.

      We have added a few sentences (in the Results and Methods sections) to highlight this important point. We thank the reviewer for bringing this up.

      Minor corrections:

      (1) Please verify that the in-text references to the figures are correct. I noticed a few mistakes, for example:

      - Line 120 (pdf) refers to Fig. 1 C-D but should refer to D only.

      - Line 312 (pdf) refers to Fig 1D for discrimination ratios but these are shown in Fig 1E

      - No reference in text to 2A

      Thank you for bringing this to our attention. We have fixed the in-text references to the figures.

      (2) In the results it states that the homecage c-Fos+ counts are shown in Figure 5 but I couldn't see these?

      The homecage c-Fos+ counts were initially shown as a pale gray band in the background of the main histograms. Because those counts are very low, it was hard to dissociate this gray band from the black horizontal axis. We have replaced the gray band with a more vivid blue line that is now in the foreground of the histograms. Moreover, we added a note in the figure legend to bring readers’ attention to this homecage count line, close to floor level. 

      (3) Line 306: It is stated that "the use of differential outcomes presumably allows animals to solve the task via simple (nonhierarchical) summation processes". I don't understand the use of "summation" here, isn't it simply that the rats are relying on direct context-outcome and/or cue-outcome associations?

      That’s right. These rats might be relying on direct context-outcome and cue-outcome associations and adding (or summing up) the converging expectations. We have added a few words in the text to clarify what we mean by summation (i.e. the addition of converging cue-evoked + context-evoked predictions).

    1. Author response:

      We thank the reviewers for their kind comments and advice. Like Reviewer 1, we acknowledge that while the exact involvement of Ih in allowing smooth transitions is likely not universal across all systems, our demonstration of the ways in which such currents can affect the dynamics of the response of complex rhythmic motor networks provides valuable insight. To address the concerns of Reviewer 2, we intend to include a sentence in the discussion to highlight the fact that cesium neither increased the pyloric frequency nor caused consistent depolarization in intracellular recordings. We will also highlight that these observations suggest that cesium is not indirectly raising [K+]outside and support the conclusion that the effects of cesium are primarily through blockade of Ih rather than other potassium channels.

      Reviewer 3 raised some important points about modeling. While the lab has models that explore the effects of temperature on artificial triphasic rhythms, these models do not account for all the biophysical nuances of the full biological system. We have limited data about the exact nature of temperature-induced parameter changes and the extent to which these changes are mediated by intrinsic effects of temperature on protein structure versus protein interactions/modification by e.g. phosphorylation. With respect to the A current, we have seen in Tang et al., 2010 that the activation and inactivation rates are differentially temperature sensitive, but we do not have the data to suggest whether or not the time courses of such sensitivities are different as well. We intend to mention these facts in the paper, but plan to leave more comprehensive modeling as the purview of future work.

    2. eLife assessment

      This important study investigates neurobiological mechanisms underlying the maintenance of stable, functionally appropriate rhythmic motor patterns during changing environmental conditions - temperature in this study in the crab Cancer borealis stomatogastric central neural pattern generating circuits producing the rhythmic pyloric motor pattern, which is naturally subjected to temperature perturbations over a substantial range. The authors present compelling evidence that the neuronal hyperpolarization-activated inward current (Ih), known to contribute to rhythm control, plays a key role in the ability of these circuits to appropriately adjust the frequency of rhythmic neural activity in a smooth monotonic fashion while maintaining the relative timing of different phases of the activity pattern that determines proper motor coordination transiently and persistently to temperature perturbations. This study will interest neurobiologists studying rhythmic motor circuits and systems and their physiological adaptations.

    3. Reviewer #1 (Public Review):

      Summary:

      This interesting study investigates the neurobiological mechanisms underlying the stable operation and maintenance of functionally appropriate rhythmic motor patterns during changing environmental conditions - temperature in this study in the crab Cancer borealis stomatogastric neural pattern generating network producing the pyloric motor rhythm, which is naturally subjected to temperature perturbations over a substantial range. This study is relevant to the general problem that some rhythmic motor systems adjust to changing environmental conditions and state changes by increasing the cycle frequency in a smooth monotonic fashion while maintaining the relative timing of different network activity pattern phases that determine proper motor coordination. How this is achieved mechanistically in complex dynamic motor networks is not understood, particularly how the frequency and phase adjustments are achieved as conditions change while avoiding operational instabilities on different time scales. The authors specifically studied the contributions of the hyperpolarization-activated inward current (Ih), which is involved in rhythm control, to the adjustments of frequency and phases in the pyloric rhythmic pattern as the temperature was altered from 11 degrees C to 21 degrees C. They present strong evidence that this current is a critical biophysical feature in the ability of this system to adjust transiently and persistently to temperature perturbations appropriately. After blocking Ih in the pyloric network with cesium, the network was unable to reliably produce its characteristic rapid and smooth increase in the frequency of the triphasic rhythmic motor pattern in response to increasing temperature or its typical steady-state increase in frequency over this Q10 temperature range.
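
      For readers less familiar with the Q10 measure mentioned above, a minimal sketch of the standard temperature-coefficient formula follows. The function name `q10` is hypothetical, and the frequencies in the usage note are invented for illustration, not data from the study.

```python
def q10(rate_low, rate_high, temp_low_c, temp_high_c):
    """Temperature coefficient: fold change in a rate per 10 degrees C of warming."""
    return (rate_high / rate_low) ** (10.0 / (temp_high_c - temp_low_c))
```

      Because the range used here (11 to 21 degrees C) spans exactly 10 degrees, the Q10 reduces to the ratio of the two steady-state frequencies; for example, `q10(1.0, 2.0, 11.0, 21.0)` with these made-up frequencies describes a rhythm that doubles in frequency over the range.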

      Strengths:

      (1) The authors addressed this problem by technically rigorous experiments in the crab Cancer borealis stomatogastric ganglion (STG) in vitro, which readily allows for neuronal activity recording in a behaviorally and architecturally defined rhythmic neural circuit in conjunction with the application of blockers of Ih and synaptic receptors to disrupt circuit interactions. This approach is an effective way to experimentally investigate how complex rhythmic networks, at least in poikilotherms, mechanistically adjust to environmental perturbations such as temperature.

      (2) While previous work demonstrated that Ih increases in pyloric neurons as temperature increases, the authors here establish that this increase is necessary for normal responses of STG neural activity to temperature, which consist of a smooth monotonic increase in the frequency of rhythmic activity with increasing temperature.

      (3) The data shows that blocking Ih with cesium causes the frequency to transiently decrease ("jags") when the temperature increases and then increases after the temperature stabilizes at a steady state, revealing a non-monotonic frequency response to temperature perturbations.

      (4) The authors dissect some of the underlying neuronal and circuit dynamics, presenting evidence that after blocking Ih, the non-monotonic jags in the frequency response are mediated by intrinsic properties of pacemaker neurons, while in the steady state, Ih determined the overall frequency change (i.e., temperature sensitivity) through network interactions.

      (5) The authors' results highlight the existence of more complex dynamic responses to increasing temperature for the first time, suggesting a longer timescale process than previously recognized that may result from interactions between multiple channels and/or ion channel kinetics.

      Weaknesses:

      The involvement of Ih in achieving the frequency and phase adjustments as conditions change and allowing smooth transitions to avoid operational instabilities in other complex rhythmic motor networks, for example, in homeotherms, is not established, so the present results may have limited general extrapolations.

    4. Reviewer #2 (Public Review):

      Summary:

      Using the crustacean stomatogastric nervous system (STNS), the authors present an interesting study wherein the contribution of the Ih current to temperature-induced changes in the frequency of a rhythmically active neural circuit is evaluated. Ih is a hyperpolarization-activated cation current that depolarizes neurons. Under normal conditions, increasing the temperature of the STNS increases the frequency of the spontaneously active pyloric rhythm. Notably, under normal conditions, as temperature systematically increases, the concomitant increase in pyloric frequency is smooth (i.e., monotonic). By contrast, blocking Ih with extracellular cesium produces temperature-induced pyloric frequency changes that follow a characteristic sawtooth response (i.e., non-monotonic). That is, in cesium, increasing temperature initially results in a transient drop in pyloric frequency that then stabilizes at a higher frequency. Thus, the authors conclude that Ih establishes a mechanism that ensures smooth changes in neural network frequency during environmental disturbances, a feature that likely bestows advantages to the animal's function.

      The study describes several surprising and interesting findings. In general, the study's primary observation of the cesium-induced sawtooth response is remarkable. To my knowledge, this type of response has not yet been described in neurobiological systems, and I suspect that the unexpected response will be of interest to many readers.

      At first glance, I had some concerns regarding the use of extracellular cesium to understand network phenomena. Yes, extracellular cesium blocks Ih. But extracellular cesium has also been shown to block astrocytic potassium channels, at least in mammalian systems (i.e., K-IR, PMID: 10601465), and such a blockade can elevate extracellular potassium. I was heartened to see that the authors acknowledge the non-specificity of cesium (lines 320-325) and I agree with the authors' contention that "a first approximation most of the effects seen here can likely be attributed to Cs+ block of Ih". Upon reflecting on the potential confound, I was also reassured to see that extracellular cesium alone does not increase pyloric frequency, an effect that might be expected if cesium indirectly raises [K+]outside. I suggest including that point in the discussion.

      In summary, the authors present a solid investigation of a surprising biological phenomenon. In general, my comments are fairly minor. This is an interesting study.

      Strengths:

      A major strength of the study is the identification of an ionic conductance that mediates stable, monotonic changes in oscillatory frequency that accompany changes in the environment (i.e., temperature).

      Weaknesses:

      A potential experimental concern stems from the use of extracellular cesium to attribute network effects specifically to Ih. Previous work has shown that extracellular cesium also blocks inward-rectifier potassium channels expressed by astrocytes, and that such blockade may also elevate extracellular potassium, an action that generally depolarizes neurons. Notably, the authors address this potential concern in the discussion.

    5. Reviewer #3 (Public Review):

      Summary:

      This paper presents a systematic analysis of the role of the hyperpolarization-activated inward current (the h current) in the response of the pyloric rhythm of the stomatogastric ganglion (STG) of the crab. In a detailed set of experiments, they analyze the effect of blocking h current with bath infusion of the h current blocker cesium (perfused as CsCl). They show interesting and reproducible effects that blockade of h current results in a period of frequency decrease after an upward step in temperature, followed by a slow increase in frequency. This contrasts with the normal temperature response that shows an increase in frequency with an increase in temperature without a downward "jag" in the frequency response. This is an important paper for showing the role of h current in stabilizing network dynamics in response to perturbations such as a temperature change.

      The major effects are shown very clearly and convincingly in a range of experiments with combined intracellular recording from neurons during changes in temperature.

      They also provide additional detailed analyses of the effect of picrotoxin on these changes, showing that most of the effects except for the loss of frequency increase, appear to indicate that these effects are due to the role of h current in the pacemaker neurons PD.

      Weaknesses:

      I know the Marder lab has detailed models of the pyloric rhythm. I am not saying they have to add modeling to this already extensive and detailed paper, but it would be useful to know how much of these temperature effects have been modeled successfully and which ones have never been shown in the models.

      They describe the ionic mechanism for the decrease and increase in frequency as a difference in temperature sensitivity of different components of the A current, but it seems like it is also a function of the time course of the response to change in temperature (i.e. the different components could have the same final effect of temperature but show a different time course of the change). They could mention any known data about the mechanism for how temperature is altering these channel kinetics and whether this indicates a change in time course of response to the same temperature, or a difference in actual steady-state temperature sensitivity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This work successfully identified and validated TRLs in hepatic metastatic uveal melanoma, providing new horizons for enhanced immunotherapy. Uveal melanoma is a highly metastatic cancer that, unlike cutaneous melanoma, has a limited effect on immune checkpoint responses, and thus there is a lack of formal clinical treatment for metastatic UM. In this manuscript, the authors described the immune microenvironmental profile of hepatic metastatic uveal melanoma by sc-RNAseq, TCR-seq, and PDX models. Firstly, they identified and defined the phenotypes of tumor-reactive T lymphocytes (TRLs). Moreover, they validated the activity of TILs by in vivo PDX modelling as well as in vitro coculture of 3D tumorsphere cultures and autologous TILs. Additionally, the authors found that TRLs are mainly derived from depleted and late activated T cells, which recognize melanoma antigens and tumor-specific antigens. Most importantly, they identified TRLs associated phenotypes, which provide new avenues for targeting expanded T cells to improve cellular and immune checkpoint immunotherapy.

      Strengths:

      Jonas A. Nilsson et al. have been working on new therapies for melanoma. The team has also previously performed the most comprehensive genome-wide analysis of uveal melanoma available, presenting the latest insights into metastatic disease. In this work, the authors performed paired sc-RNAseq and TCR-seq on 14 patients with metastatic UM, which is the largest single-cell map of metastatic UM available. This provides huge data support for other studies of metastatic UM.

      We thank the reviewer for these kind words about our work.

      Weaknesses:

      Although the paper does have strengths in principle, the weaknesses of the paper are that these strengths are not directly demonstrated. That is, insufficient analyses are performed to fully support the key claims in the manuscript by the data presented. In particular:

      The author's description of the overall results of the article should be logical, not just a description of the observed phenomena. For example, the presentation related to the results of TRLs lacked logic. In addition, the title of the article emphasizes the three subtypes of hepatic metastatic UM  TRLs, but these three subtypes are not specifically discussed in the results as well as the discussion section. The title of the article is not a very comprehensive generalization and should be carefully considered by the authors.

      We thank the reviewer for the critical reading of our work. We have added more data and more discussion.

      The authors' claim that they are the first to use autologous TILs and sc-RNAseq to study immunotherapy needs to be supported by the corresponding literature to be more convincing. This can help the reader to understand the innovation and importance of the methodology.

      We have gone through the manuscript and found that we only refer to being first in using PDX models and autologous TILs to study immunotherapy responses by single-cell sequencing. While there are data to be deduced from other studies, we still believe this to be an accurate statement.

      In addition, the authors argue that TILs from metastatic UM can kill tumor cells. This is the key and bridging point to the main conclusion of the article. Therefore, the credibility of this conclusion should be considered.  Metastatic UM1 and UM9 remain responsive to autologous tumors under in vitro conditions with their autologous TILs.

      UM1 responds also in vivo in the subcutaneous model in the paper. We have also finished an experiment where we show that this model also responds in a liver metastasis model. These data have been added in this revised version of the paper. We add two main figures and one supplementary figure where we characterize the response in vivo and also by single-cell sequencing of TILs.

      In contrast, UM22, also as a metastatic UM, did not respond to TIL treatment. In particular, the presence of MART1-responsive TILs. The reliability of the results obtained by the authors in the model of only one case of UM22 liver metastasis should be considered. The authors should likewise consider whether such a specific cellular taxon might also exist in other patients with metastatic UM, producing an immune response to tumor cells. The results would be more comprehensive if supported by relevant data.

      The reviewer has interpreted the results correctly: the allogenic and autologous MART1-specific TILs, while reactive in vitro against UM22, cannot kill this tumor in either a subcutaneous or a liver metastasis model. We hypothesize that this has to do with an immune exclusion phenotype and show weak immunohistochemistry that suggests this. We hope the addition of more UM1 data can be viewed as supportive of tumor reactivity in vivo as well.

      In addition, the authors in that study used previously frozen biopsy samples for TCR-seq, which may be associated with low-quality sequencing data, high risk of outcome indicators, and unfriendly access to immune cell information. The existence of these problems and the reliability of the results should be considered. If special processing of TCR-seq data from frozen samples was performed, this should also be accounted for.  

      We agree with the reviewer and acknowledge that we never anticipated the development of single-cell sequencing techniques when we started the biobank in 2013. We performed dead cell removal before the 10x Genomics experiment. We have also done extensive quality controls and believe that the data from the biopsies should be viewed as a whole and that quantitative intra-patient comparisons cannot be done.

      Reviewer #2 (Public Review):  

      Summary:  

      The study's goal is to characterize and validate tumor-reactive T cells in liver metastases of uveal melanoma (UM), which could contribute to enhancing immunotherapy for these patients. The authors used single-cell RNA and TCR sequencing to find potential tumor-reactive T cells and then used patient-derived xenograft (PDX) models and tumor sphere cultures for functional analysis. They discovered that tumor-reactive T cells exist in activated/exhausted T cell subsets and in cytotoxic effector cells. Functional experiments with isolated TILs show that they are capable of killing UM cells in vivo and ex vivo.

      Strengths:  

      The study highlights the potential of using single-cell sequencing and functional analysis to identify T cells that can be useful for cell therapy and marker selection in UM treatment. This is important and novel as conventional immune checkpoint therapies are not highly effective in treating UM. Additionally, the study's strength lies in its validation of findings through functional assays, which underscores the clinical relevance of the research. 

      We thank the reviewer for these kind words about our work.

      Weaknesses:  

      The manuscript may pose challenges for individuals with limited knowledge of single-cell analysis and immunology markers, making it less accessible to a broader audience.

      The first draft of the manuscript (excluding methods) was written by a person (J.A.N) who is not a bioinformatician. It has been corrected to include the correct nomenclature where applicable but overall it is written with the aim to be understandable. We have made an additional effort in this version. 

      Reviewer #1 (Recommendations For The Authors):  

      (1) Firstly, the authors should provide high-resolution pictures to ensure readability for readers. 

      We have converted to pdf ourselves and that improved resolution. We are happy to provide high-resolution to the office if needed for the printing.

      (2) Furthermore, some parts of the article are more colloquial, and the authors should consider the logic and academic nature of the overall writing of the article. For example, authors should double-check whether the relevant expressions in the results are correct. For example, 'TCR' in the fourth part of the results should be 'TRLs'.

      We thank the reviewer for the recommendations and have gone through the manuscript.

      (3) Moreover, UM22 is described several times in the results as a metastatic UM and should be clearly defined in the methodology.

      The UM22 and UM1 samples are described in-depth in Karlsson et al., Nature Communications, 2020, a paper that is cited in the beginning of Results as part of the narrative. The current work can be viewed as an extension of that work.

      (4) Finally, it is recommended that authors describe a part of the results in full before citing the corresponding picture, otherwise, it will lead to confusion among readers.

      We have made an effort in the revised version to describe the new data in more detail.

      Reviewer #2 (Recommendations For The Authors):  

      The manuscript is very interesting and important to understanding key aspects of uveal melanoma immune profile and functionality. However, in my opinion, there are a few aspects that could be addressed.  

      - The manuscript lacks comprehensive details about the samples used, such as their disease progression, response to treatment, or any relevant information that could shed light on potential differences between samples. It would be valuable to know whether these samples were collected before any systemic treatment or if any of the patients underwent immunotherapy post-sample collection, along with the outcomes of such treatments. Providing this information would enrich the manuscript and provide a more holistic view of the research.

      We thank the reviewer for the recommendation and have included a new Supplementary table 7 with information about the samples. We have also added each individual sample's contribution to the UMAP to give a more holistic view.

      - The results presented and discussed in the manuscript seem to indicate that there were no significant differences across the various samples, including comparisons between lymph-node and liver metastases. However, this lack of variation or the reasons for not discussing any observed differences should be clarified. If there are distinctions between the samples, it would be beneficial to discuss these findings in the manuscript.

      We thank the reviewer for the recommendation. Although 14 samples are many for a uveal melanoma study, the study is not really powered to do intra-patient comparisons.

      - The manuscript may pose difficulties for individuals with limited knowledge of single-cell analysis and immunology markers, potentially limiting its accessibility. To make the research more inclusive, the authors might consider presenting the technical aspects of their work in a less descriptive manner and providing explanations for those less familiar with the technology. This would help a broader audience grasp the significance of the study's findings. 

      The manuscript is from a multidisciplinary team where all have read and commented. The draft was written by a tumor biologist and edited by a bioinformatician for accuracy. We honestly think it is more understandable than most studies in this bioinformatics era. But we have tried to describe the new data in an easier way.

    2. eLife assessment

      This study presents valuable findings on tumor-reactive T cells in liver metastases of uveal melanoma (UM). The authors conducted single-cell RNA sequencing to identify potential tumor-reactive T cells and used PDX models for functional analysis. The evidence supporting their claims is solid. The work will be of interest to scientists working in the field of uveal melanoma.

    3. Reviewer #1 (Public Review):

      This work successfully identified and validated TRLs in hepatic metastatic uveal melanoma, providing new horizons for enhanced immunotherapy. Uveal melanoma is a highly metastatic cancer that, unlike cutaneous melanoma, has a limited effect on immune checkpoint responses, and thus there is a lack of formal clinical treatment for metastatic UM. In this manuscript, the authors described the immune microenvironmental profile of hepatic metastatic uveal melanoma by sc-RNAseq, TCR-seq, and PDX models. Firstly, they identified and defined the phenotypes of tumor-reactive T lymphocytes (TRLs). Moreover, they validated the activity of TILs by in vivo PDX modeling as well as in vitro co-culture of 3D tumorsphere cultures and autologous TILs. Additionally, the authors found that TRLs are mainly derived from depleted and late-activated T cells, which recognize melanoma antigens and tumor-specific antigens. Most importantly, they identified TRLs-associated phenotypes, which provide new avenues for targeting expanded T cells to improve cellular and immune checkpoint immunotherapy.

      Comments on revised manuscript

      The revised manuscript has addressed all my concerns.

    4. Reviewer #2 (Public Review):

      Summary:

      The study's goal is to characterize and validate tumor-reactive T cells in liver metastases of uveal melanoma (UM), which could contribute to enhancing immunotherapy for these patients. The authors used single-cell RNA and TCR sequencing to find potential tumor-reactive T cells and then used patient-derived xenograft (PDX) models and tumor sphere cultures for functional analysis. They discovered that tumor-reactive T cells exist in activated/exhausted T cell subsets and in cytotoxic effector cells. Functional experiments with isolated TILs show that they are capable of killing UM cells in vivo and ex vivo.

      Strengths:

      The study highlights the potential of using single-cell sequencing and functional analysis to identify T cells that can be useful for cell therapy and marker selection in UM treatment. This is important and novel as conventional immune checkpoint therapies are not highly effective in treating UM. Additionally, the study's strength lies in its validation of findings through functional assays, which underscores the clinical relevance of the research.

      Weaknesses:

      The manuscript may pose challenges for individuals with limited knowledge of single-cell analysis and immunology markers, making it less accessible to a broader audience.

    1. eLife assessment

      This study presents valuable findings on core genome mutations that might have driven the emergence of the Staphylococcus aureus lineage USA300, a frequent cause of community-acquired infections. The authors present a solid novel approach that combines genome-wide association studies and RNA-expression analyses, both applied to extensive publicly available datasets. This approach generated an intriguing hypothesis that should be validated experimentally. The work will interest microbiologists working in genomic epidemiology and phenotype-genotype association studies.

    2. Reviewer #1 (Public Review):

      Summary:

      This is a large-scale genomics and transcriptomics study of the epidemic community-acquired methicillin-resistant S. aureus clone USA300, designed to identify core genome mutations that drove the emergence of the clone. It used publicly available datasets and a combination of genome-wide association studies (GWAS) and independent component analysis (ICA) of RNA-seq profiles to compare USA300 versus non-USA300 within clonal complex 8. By overlapping the analyses, the authors identified a 38 bp deletion upstream of the iron-scavenging surface-protein gene isdH that was significantly associated both with the USA300 lineage and with decreased transcription of the gene.

      Strengths:

      Several genomic studies have investigated genomic factors driving the emergence of successful S. aureus clones, in particular USA300. These studies have often focussed on acquisition of key accessory genes or have focussed on a small number of strains. This study makes a smart use of publicly available repositories to leverage the sample size of the analysis and identify new genomics markers of USA300 success.

      The approach of combining large-scale genomics and transcriptomics analysis is powerful, as it allows some inferences to be made on the impact of the mutations. This is particularly important for mutations in intergenic regions, whose functional impact is often uncertain.

      The statistical genomics approaches are elegant and state-of-the-art and can be easily applied to other contexts or pathogens.

      Weaknesses:

      The main weakness of this work is that these data don't allow a causal inference on the role of isdH in driving the emergence of USA300. It is of course impossible to prove which mutation or gene drove the success of the clone; however, experimental data would have strengthened the conclusions of the authors in my opinion.

      Another limitation is that the approach taken here doesn't allow any conclusions to be drawn on the adaptive role of the isdH mutation. In other words, it is still possible that the mutation is just a marker of USA300 success, due to other factors such as PVL, ACME or SCCmecIVa. This is because by its nature this analysis is heavily influenced by population structure. Usually, GWAS is applied to find genetic loci that are associated with a phenotype and are independent of the underlying population structure. Here, the authors are using GWAS to find loci that are associated with a lineage. In other words, they are simply running a univariate analysis (likely a logistic regression) between genetic loci and the lineage without any correction for population structure, since population structure is the outcome. Therefore, this approach can't be applied to most phenotype-genotype studies where correction for population structure is critical.
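
      The point about an uncorrected univariate test can be illustrated with a minimal sketch. It is not the authors' pipeline (they used DBGWAS); the function name `lineage_association` and the toy counts are hypothetical. It computes an odds ratio and a Pearson chi-square statistic for a single variant against lineage membership, with no term for population structure:

```python
import math


def lineage_association(in_with, in_without, out_with, out_without):
    """Odds ratio and Pearson chi-square for a 2x2 variant-by-lineage table.

    in_with / in_without:   lineage strains with / without the variant
    out_with / out_without: non-lineage strains with / without the variant
    No correction for population structure is applied (illustrative only).
    """
    a, b, c, d = in_with, in_without, out_with, out_without
    n = a + b + c + d
    # Odds ratio; infinite when an off-diagonal cell is empty.
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    # Pearson chi-square without continuity correction.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return odds_ratio, chi2


# Toy counts (hypothetical): the variant is nearly fixed in the lineage.
odds_ratio, chi2 = lineage_association(90, 10, 5, 95)
print(f"odds ratio = {odds_ratio:.1f}, chi-square = {chi2:.1f}")
```

      Any variant that happens to sit on the branch leading to the lineage would score highly in such a test, which is why the result by itself cannot separate an adaptive mutation from a lineage marker.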

      Finally, the approach used is complex and not easily reproduced on another dataset. Although I like DBGWAS and find the network analysis elegant, I would be interested in seeing how a simpler GWAS tool like Pyseer would perform.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Line 56: replace "pyomastitis" with "pyogenic skin infections".

      Corrected.

      (2) Line 58: replace "basal strains" with "ancestral strains".

      Corrected.

      (3) Line 62: population structure impacts gene acquisition too, however, gene acquisitions can be easier to connect with a phenotype. For example, acquisition of mecA is thought to be adaptive rather than just linked to a successful lineage. This same reasoning applies to resistance-associated mutations such as gyrA mutations in ST22 emergence.

      We completely agree with the reviewer that population structure also impacts gene acquisition. We wanted to convey that connecting the gain or loss of genes to a change in a particular phenotype is much easier than doing the same for a mutation, especially in the presence of strong linkage, and therefore gene-level analysis is the focus of many previous studies. We have rewritten the sentence to better convey this idea:

      “Due to this limitation, studies of emerging strains often focus on gene level analysis such as acquisition of mobile genetic elements or loss of gene function as their effect on phenotype is easier to determine than that of point mutations.”

      (4) Line 112 this might be simply due to the smaller size of the intergenic regions chosen. I suggest to correct for the size of the genome segment considered.

      We thank the reviewer for pointing this out. The size of the intergenic regions was indeed the simple explanation for this observation. We have added the following sentence to the manuscript:

      “This is reflective of the fact that most of S. aureus genome sequence comprises of ORFs e.g. ~84% of TCH1516 genome is part of an ORF.”

      (5) Line 189: please add p values to supp table 2.

      We have added the p and q values from DBGWAS into Supp table 2. It is under the ‘DBGWAS Result’ sheet.

      (6) Line 227: high entropy indicates that this site is polymorphic, not necessarily that there is selective pressure. In the extreme, this might actually point to a neutral position, since any amino acid could be equally present (see for example https://www.nature.com/articles/s41467-022-31643-3#Sec10 ).

      We agree that high entropy by itself may point to a position under neutral selection, leading to some false positives. However, we focused on positions that were mostly biallelic in CC8 and had differential prevalence in USA300 vs non-USA300 (albeit in the presence of strong linkage disequilibrium), in addition to having high entropy in non-CC8 strains. This helps us filter out positions that were mostly monoallelic or carried only rare mutations while preserving other sites of interest. The approach was able to find the cap5E mutation, which has been associated with disruption of capsule production.

      (7) Line 271: show USA500 on the tree.

      Our current study is mostly focused on differences between USA300 and non-USA300 strains and we want to highlight those differences in the tree.

      (8) Line 327: still not possible to infer causality.

      We have changed the language to remove mentions of causality and instead talk about the association of GWAS enriched genes with measured transcriptional changes. The revised sentence now reads:

      “Here, we demonstrated how a model of transcriptional regulation with iModulons can be used to make a headway through the impasse created by the high linkage disequilibrium and identify GWAS-enriched mutations that are also associated with measurable phenotypic changes in the TRN.”

      (9) Line 324: subclades reference.

      We are unsure what this means.

      (10) Line 366: the authors seem to have used a bespoke pan-genome analysis approach. Would they be able to validate it using established tools such as Roary, Pirate or Panaroo? Panaroo in particular appears to have superior accuracy thanks to its pan-genome graph approach (https://github.com/gtonkinhill/panaroo). 

      We have added the results of Roary to our analysis (Figure S1b). The Roary results largely agree with our main takeaway from pangenomics, which is that our collection of genomes has good coverage of the CC8 clade at the gene level.

      (11) Line 397: what was the size of the core genome?

      There were 24881 core sites. We have added the number to the manuscript.

      (12) Line 407: please add citation or website for SCCmecFinder.

      The citation of SCCmecFinder (45) is at the end of the sentence.

      (13) Line 421: I was not able to find the code used for this analysis in the github repository provided.

      The code can be found in “notebook/02_Preprocess_DBGWAS.ipynb” within the repo.

      (14) Line 427: this is a very complex analysis for a simple univariate comparison between USA300-vs-non USA300 strains with no correction for population structure. The authors should compare their results with a more established pipeline like Pyseer or Gemma that can handle kmers and show the added value of their approach.

      We wanted to take advantage of DBGWAS’s ability to collapse kmers into unitigs and further collapse significant unitigs within a genetic neighborhood into components. Unfortunately, we found that in many cases, it became difficult to determine the exact mutation that was being enriched e.g. (T234G) without doing lots of manual work. Our network analysis simply parses the DBGWAS graph to automatically extract these mutations, making the results more interpretable. It does not do any additional hypothesis testing.

      We also attempted to pass kmer data into GEMMA but without the compaction provided by DBGWAS the memory required (>168 GB) exceeded what we had available.

      (15) DBGWAS: please indicate DBGWAS version and the options used for kmer size and number of neighbour nodes retained in the subgraph. Also, I assume that no correction for population structure was applied.

      We have added the version and parameters for DBGWAS. The method section now reads:

      “DBGWAS (v0.5.4) was used to enrich mutations unique to USA300 strains using default kmer size of 31 (-k 31) and neighborhood size of 5 (-nh 5). Alleles with frequency less than 0.1 were filtered (-maf 0.1) and all components enriched with q-values less than 0.05 were documented (-SFF q0.05).”

      (16) Could the authors provide the DBGWAS output for the most significant unitigs in graph format? This would help readers understand the findings.

      The outputs are available in the github repo. The link to this specific data is (https://github.com/sapoudel/USA300GWASPUB/tree/master/data/dbgwas/dbgwas_output/visualisations)

      The text format of the output is part of Supplementary Table 2 under “DBGWAS Result” sheet.

      (17) Line 469: please provide more details on iModulons, it is not enough to simply reference the paper: specific QC criteria, mapping algorithm and parameters, ICA algorithm.

      We have now added a new Supplementary Note 2 section with more details about building iModulons.

      (18) Line 474: what is log-TPM?

      Log-Transcripts per Million. We have added the description in the text.
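
      As an illustration of the term, log-TPM can be computed from raw read counts and gene lengths roughly as follows. This is a minimal sketch: the base-2 logarithm and the pseudocount of 1 are assumptions made here for illustration, and the function name `log_tpm` is hypothetical; the response above does not specify the exact transform used.

```python
import math

def log_tpm(counts, lengths_bp, pseudocount=1.0):
    """Return {gene: log2(TPM + pseudocount)}.

    counts and lengths_bp are dicts keyed by gene name.
    The log base and pseudocount are illustrative assumptions.
    """
    # Reads per kilobase of gene length
    rate = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    # Scale so that the per-sample TPM values sum to one million
    scale = sum(rate.values()) / 1e6
    return {g: math.log2(rate[g] / scale + pseudocount) for g in rate}

values = log_tpm({"geneA": 100, "geneB": 300}, {"geneA": 1000, "geneB": 3000})
print(values)
```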

      (19) Line 479: not sure what "Chapter 3" refers to.

      Thank you for correcting the mistake. The reference has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Line 45. The introduction is not well-structured, and there is a lack of coherence among the topics pertinent to the research objective. I would recommend rewriting this section addressing the following topics: the challenge of distinguishing lineages within the CC8, especially the CA-MRSA USA300 strains; discussing the state-of-the-art GWAS methodologies, elucidating the main confounding factors in the application of GWAS to bacterial studies, and finally, exploring how current methods aim to address these concerns.

      We would like to thank the reviewer for the suggestions. The main innovation of the paper is using iModulons to find phenotype associated mutations from a set of linked mutations. The challenge of distinguishing CC8 subclades has been largely resolved thanks to efforts by Bowers et al. (PMID: 29720527). We have made some revisions to address the GWAS methodologies (bugwas and DBGWAS), the effect of linkage disequilibrium in interpreting the output of these methods and how combining the results of these association tests with modeling of TRN with iModulons can lead to finding candidate mutations of interest that are linked to specific changes in gene regulation.

      Line 56. Replace "pyomastitis" with "pyomyositis".

      Corrected to “pyogenic skin infections.”

      Lines 71. What do the authors mean by "endemic USA300 strain"?

      We have removed references to endemic strains.

      Line 106. Please verify the number of genomes used in the DBGWAS analysis. In the text, the authors mention that 2038 genomes were utilized. However, in Supplementary Table 1, only 2030 genomes are listed.

      Thank you for catching the discrepancy. We started the analysis with 2037 genomes, including four “spiked-in” reference genomes: USA100 D592 (CC5 strain used for rooting the CC8 tree), TCH1516 (same accession number as the one used for ICA), COL and Newman. Before further analysis, we removed 6 genomes for being smaller than 2.5 million base pairs (see preprocessing.ipynb) and the USA100 D592 strain, as it is not part of CC8. This resulted in 2030 genomes being used for DBGWAS. We kept the other 3 spiked CC8 genomes to help annotate the unitigs from DBGWAS. Lastly, we removed these three spiked CC8 genomes for the pangenomic analysis. To clarify this, we have made the following changes to the text:

      (1) Changed line 106: We downloaded 2033 S. aureus genomes for analysis and excluded six of them with genome length of less than 2.5 million base pairs. The remaining 2027 S. aureus CC8 genomes formed a closed pangenome, suggesting that the sampled genomes mostly captured the gene level variations within the clonal complex (Figure 1a).

      (2) DBGWAS section Line 177: We used 2030 genomes for this analysis; the 2027 genomes in pangenomics analysis above were “spiked” with three well known CC8 genomes- TCH1516, COL, and Newman- to help annotate the DBGWAS unitigs.
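
      The genome counts described above can be checked with a few lines of arithmetic. The variable names below are illustrative only; the numbers are taken from the response and the revised text.

```python
# Bookkeeping of the genome counts stated in the response above.
downloaded = 2033      # S. aureus genomes downloaded
spiked_refs = 4        # USA100 D592, TCH1516, COL, Newman
too_small = 6          # genomes shorter than 2.5 million base pairs
non_cc8_refs = 1       # USA100 D592 (CC5, used only for rooting the tree)
cc8_refs = 3           # TCH1516, COL, Newman

start = downloaded + spiked_refs                 # genomes at the start of the analysis
dbgwas_set = start - too_small - non_cc8_refs    # genomes used for DBGWAS
pangenome_set = dbgwas_set - cc8_refs            # genomes used for the pangenome

print(start, dbgwas_set, pangenome_set)
```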

      Line 108. Could the authors provide a table with the genes that constitute the core, accessory genome, and unique genes for each of the strains?

      The gene presence/absence tables are very large files, and we have therefore only added them to our github repo. The results can be found in the following file:

      Pangenomics: data/pangenome/Pangenomics/CC8_strain_by_gene.pickle.gz

      Lines 112 and 315. On what basis did the authors decide on the size of the upstream regulatory region? In the search for mutations, they extracted segments of 300 base pairs, whereas, in the search for the Fur binding motif, only 100 base pairs were considered. The RegPrecise database contains regulons for Staphylococcus aureus N315 (https://regprecise.lbl.gov/genome.jsp?genome_id=26), including the Fur regulon with multiple Transcription Factor Binding Sites (TFBSs) that extend beyond the 100 base-pair sequence. I would recommend reconsidering the search within the standardized upstream region of -400 base pairs. In the case of the Fur binding motif search, it might be beneficial to include the TFBSs available in the RegPrecise database.

      For the Fur motif search, we chose 100 base pairs because the Fur motif in non-USA300 strains was within ~20 base pairs of the isdH translation start site (Figure 4C). In this analysis, we were not asking whether any Fur motif exists; we were simply checking whether the one proximal to the translation start site exists, as our DBGWAS analysis suggested that this specific region was deleted in USA300 strains.

      Line 175. This work aimed to identify potential mutations associated with the success of a specific lineage rather than a phenotype, where correction for population structure effects is necessary. Would the implementation of the bugwas method in DBGWAS for controlling bacterial population structure not potentially impact the results? How was this issue addressed in your analysis? Would it not be pertinent to run a program without population structure correction to enable a comparison of results?

      We initially tried to use Linear Mixed Models to find kmers that were only enriched in USA300 strains. These efforts were hampered by extreme linkage disequilibrium, which led to high collinearity between kmer abundances and made it extremely difficult to get a good estimate of the coefficients. We also tried to run chi-squared tests individually on each kmer, which led to an unmanageable number (>100k) of kmers that were significantly different. DBGWAS, on the other hand, was able to compress unbranched kmers in the De Bruijn graph into unitigs and further reduce the number of tests by testing at the pattern level instead of the unitig level. We found no straightforward way to run DBGWAS (or GEMMA) without population structure correction. Therefore, it is likely that we are underestimating the number of significant unitigs with this approach.

      Line 189. Please italicize the gene name cap5E.

      Corrected.

      Line 277. Please clarify the QC/QA criteria and curation process employed for the selection of RNA-seq experiments, as this constitutes a crucial step in the reconstruction of the network.

      We have now added a new supplementary material section, Supplementary Note 2 titled “Creating iModulons for CC8 Clade Staphylococcus aureus” with details of QC/QA.

      Line 279. In Supplementary Table 3, please label the first column and standardize the use of either the experiment ID or the run ID. Furthermore, verify the experiment identifiers from rows 19 to 26, as I could not locate them in the SRA database.

      We have changed all accessions to experiment IDs, including rows 19 to 26.

      Lines 290, 330, 424, and 437. Please correct "SCCMec" to "SCCmec IVa" (italicize "mec").

      Corrected.

      Line 298. What is the size of the upstream regulatory region considered for this analysis? It is important to standardize this value for all analyses involving the upstream regulatory region. In this regard, I recommend maintaining a consistent size of -400 base pairs.

      For the Fur motif search, we chose 100 base pairs because the Fur motif in non-USA300 strains was within ~20 base pairs of the isdH translation start site (Figure 4C). In this analysis, we were not asking whether any Fur motif exists; we were simply checking whether the one proximal to the translation start site exists, as our DBGWAS analysis suggested that this specific region was deleted in USA300 strains. In our usual analysis, we use -300 base pairs.

      Line 321. The discussion is rather concise and lacks an in-depth comparative perspective with relevant literature on any of the obtained results, whether concerning the proposed methodology or the potential new markers associated with the success of the USA300 lineage. The authors must underscore the method is not applicable to all GWAS analyses, due to the issue of correction for population structure.

      We have now added sections discussing the importance of isdH in S. aureus infection and a section addressing the limitations of the current approach when applied to other GWAS-type studies.

      Line 366. The authors employed the methodology described in the article by Hyun et al. 2022 (https://doi.org/10.1186/s12864-021-08223-8) to construct the pangenome. However, this methodology was designed for comparative analysis of pangenomes across various species, which does not align with the objective of this study, focusing solely on S. aureus genomes. Consequently, it remains unclear to me why the authors made this particular choice and, more importantly, what advantages it offers over well-established tools for individual pangenomes, such as Roary. I would strongly recommend validating the results using at least one established tool.

      With our analysis, we can determine proper thresholds for core/accessory/unique genes based on the observed data (Supplementary Figure 1a). However, we agree that it would be proper to include a more established pangenome package. We have added the results of Roary to our analysis. The Roary results largely agree with our main takeaway from the pangenomic analysis, which is that our collection of genomes has good coverage of the CC8 clade at the gene level.

      Line 370. Please include the version of CD-HIT that was utilized.

      Added. CD-HIT version 4.6 was used for the analysis.

      Line 372. What tool did the authors use to extract these regions?

      The list of CDS, 5’ and 3’ sequences can be extracted easily with a combination of fasta file and gff file. The gff file was used to find the position of each of these sequences and the sequences were extracted from the fasta file with python scripts.
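
      The extraction described above can be sketched in plain Python as follows. This is a minimal illustration and not the authors' script: the function names (`read_fasta`, `extract_regions`), the `flank` parameter and its default of 300 bp are assumptions, and the sketch does not handle features that wrap around the origin of a circular chromosome.

```python
def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {k: "".join(v) for k, v in records.items()}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def extract_regions(gff_text, genome, flank=300):
    """Yield (attributes, cds, upstream, downstream) for each CDS row of a GFF file."""
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        contig = genome[cols[0]]
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        # GFF coordinates are 1-based inclusive; Python slices are 0-based half-open
        cds = contig[start - 1:end]
        left = contig[max(0, start - 1 - flank):start - 1]
        right = contig[end:end + flank]
        if strand == "-":
            # On the minus strand the 5' region lies to the right on the reference
            cds, upstream, downstream = revcomp(cds), revcomp(right), revcomp(left)
        else:
            upstream, downstream = left, right
        yield cols[8], cds, upstream, downstream
```

      Flanks are truncated at contig ends, so genes near the edge of a contig return shorter 5’ or 3’ sequences.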

      Line 395. What were the QC/QA criteria used to select the sequences?

      The QC/QA criteria for the sequences are mentioned at the beginning of the Pangenomic analysis subsection and are as follows:

      “Briefly, “complete” or “WGS” samples from CC8/ST8 were downloaded from the PATRIC database. Sequences with lengths that were not within 3 standard deviations of the mean length or those with more than 100 contigs were filtered out.”
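
      The two filters quoted above can be expressed as a short function. This is a sketch under stated assumptions: the input layout (a list of dicts with `id`, `length` and `contigs` keys) and the function name `qc_filter` are hypothetical, and the population standard deviation is used because the text does not say which estimator was applied.

```python
from statistics import mean, pstdev

def qc_filter(genomes, max_contigs=100, n_sd=3.0):
    """Return the ids of genomes passing both QC filters.

    genomes: list of dicts with keys 'id', 'length' (bp) and 'contigs'.
    """
    lengths = [g["length"] for g in genomes]
    mu, sd = mean(lengths), pstdev(lengths)
    kept = []
    for g in genomes:
        within_length = abs(g["length"] - mu) <= n_sd * sd
        # "more than 100 contigs" is filtered out, so exactly 100 is kept
        if within_length and g["contigs"] <= max_contigs:
            kept.append(g["id"])
    return kept
```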

      Line 407. Please correct the tool name to "SCCmecFinder" (italicize "mec").

      The name has been corrected.

      Line 409. I believe BLASTp was run locally, so please specify the version used and the search parameters.

      As corrected further down, we used BLASTn not BLASTp. The version v2.2.31 has been added to the methods section.

      Line 416. There is conflicting information with line 409, which mentions that PVL was identified through a protein BLAST, but right below, it states it was a BLASTn. Please verify which information is correct and consider the previous comment to specify the version and parameters.

      Thank you for catching the discrepancy. We have corrected the text:

      “PVL was detected using nucleotide BLAST.”

      Line 418. Please provide the column identifiers for the Supplementary Table 5 (PVL worksheet).

      Column names are added.

      Line 418. Please remove the repeated word "and" in Supplementary Table 5 (mecA worksheet) and italicize the gene names in this table.

      Corrected.

      Line 419. You can use the abbreviation "SNPs" since it was introduced in line 65.

      Corrected.

      Line 420. In my view, this analysis could benefit from a more detailed and clearer explanation.

      We have added to the explanation. The section now reads:

      “To find the root of the USA300 strains in the phylogenetic tree, the genomes in the tree were first annotated by their PVL and SCC_mec_ status. Then the tree was traversed from leaf to root starting from known USA300 strains (TCH1516 and FPR3757) while keeping track of the number of descendant genomes from the current root that contained the known markers SCC_mec_ IVa and PVL. The node where the number of genomes with the markers started flatlining was marked as the root of USA300.”
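
      One way to read the traversal described above is sketched below. It is not the authors' implementation: the tree is represented with hypothetical `parent` and `children` dictionaries, and "flatlining" is interpreted strictly as the first ancestor for which moving one step further toward the root adds no marker-positive genomes. Real data may need a tolerance rather than an exact plateau.

```python
def leaves_under(node, children):
    """Return all leaf names below node (the node itself if it is a leaf)."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    found = []
    for kid in kids:
        found.extend(leaves_under(kid, children))
    return found

def marker_count(node, children, has_markers):
    """Number of descendant genomes carrying both markers (SCCmec IVa and PVL)."""
    return sum(1 for leaf in leaves_under(node, children) if has_markers[leaf])

def find_clade_root(start_leaf, parent, children, has_markers):
    """Walk from a known USA300 leaf toward the tree root and return the first
    ancestor whose marker-positive count does not grow at the next step up."""
    node = parent[start_leaf]
    while True:
        up = parent.get(node)
        if up is None:
            return node  # reached the tree root without seeing a plateau
        if marker_count(up, children, has_markers) == marker_count(node, children, has_markers):
            return node
        node = up

# Toy tree: root -> (A, out1); A -> (B, out2); B -> (TCH1516, C); C -> (FPR3757, s1)
children = {"root": ["A", "out1"], "A": ["B", "out2"], "B": ["TCH1516", "C"], "C": ["FPR3757", "s1"]}
parent = {kid: node for node, kids in children.items() for kid in kids}
has_markers = {"TCH1516": True, "FPR3757": True, "s1": True, "out1": False, "out2": False}
print(find_clade_root("FPR3757", parent, children, has_markers))
```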

      Line 428. Specify the version and parameters used in the analysis with DBGWAS.

      Added. The text now reads:

      “DBGWAS (v0.5.4) was used to enrich mutations unique to USA300 strains using default kmer size of 31 (-k 31) and neighborhood size of 5 (-nh 5). Alleles with frequency less than 0.1 were filtered (-maf 0.1) and all components enriched with q-values less than 0.05 were documented (-SFF q0.05).”

      Line 431. What tools were employed to calculate Pearson correlation and distances relative to the reference genome?

      Added. The text now reads:

      “Genome-wide linkage was estimated by Pearson correlation (calculated with built-in Pandas function) of the presence/ absence of enriched kmers and distance was measured based on the kmer alignment to the reference TCH1516 genome as determined by BLASTn.”
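
      The linkage estimate described above can be illustrated with a toy presence/absence matrix. The data, kmer names and reference positions below are invented for illustration; only `DataFrame.corr` is the built-in pandas call referred to in the text, and the distance here ignores the circularity of the chromosome.

```python
import pandas as pd

# Rows are genomes, columns are enriched kmers; 1 = present, 0 = absent (toy data)
presence = pd.DataFrame(
    {"kmer_a": [1, 1, 0, 0, 1], "kmer_b": [1, 1, 0, 0, 1], "kmer_c": [0, 0, 1, 1, 0]},
    index=["g1", "g2", "g3", "g4", "g5"],
)
# Hypothetical start coordinate of each kmer's BLASTn hit on the reference genome
position = pd.Series({"kmer_a": 1000, "kmer_b": 1500, "kmer_c": 900000})

# Pairwise Pearson correlation between kmer presence/absence columns
corr = presence.corr(method="pearson")

pairs = []
cols = list(presence.columns)
for i, a in enumerate(cols):
    for b in cols[i + 1:]:
        distance = abs(int(position[a]) - int(position[b]))
        pairs.append((a, b, distance, float(corr.loc[a, b])))

linkage = pd.DataFrame(pairs, columns=["kmer_1", "kmer_2", "distance_bp", "pearson_r"])
print(linkage)
```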

      Line 450. What type of BLAST was used?

      Added. Nucleotide BLAST was used for all kmer analyses.

      Line 452. I didn't quite understand the reason for making this analysis available in a separate repository. It would be easier for readers looking to reproduce the work if all the codes were in a single repository.

      We kept the repository separate in case we wanted to further develop the network analysis code in the future. We have added the link to the network analysis repository in the README of the publication repo.

      Line 460. Please specify the version and parameters, if run locally, or indicate if a web page was used.

      Corrected to indicate that we used the PATRIC website for this.

      Line 470. Specify the version and provide a detailed account of all parameters used, along with the QC/QA criteria and curation methods applied.

      We have added Supplementary Note 2 with all the details about packages and parameters used to calculate the iModulons.

      Line 479. The phrase "ICA was then run as previously described in chapter 3" does not make sense. Please clarify.

      We have corrected the mistake and added a new supplementary note with details about our ICA run. The line now reads:

      “A detailed version of the methods for RNA-sequencing and ICA analysis is available as Supplementary Note 2. ICA of RNA sequencing data was performed using the pymodulon package.”

      Line 484. Specify the version of CD-HIT.

      Added. The version used was v4.6.

      Line 494. To enable reproducibility, the repository should be better organized, especially the directory containing the code. Numbering each script in the order it was run would assist the reader in comprehending the overall analysis flow and adapting it to their needs. If creating a manual for method usage is not feasible, the code could be more extensively commented on to explain the parameters, choices made, and how these could be modified. The "Data" folder seems to contain some test files, such as those in the "isdh_fimo" folder, so removing test files would aid the understanding of the reader.

      Thank you for the suggestions. We have now numbered the notebooks that generate the figures, we have added more comments to the code, removed testing code and test datasets.

      Throughout the article, please correct "SCCMec" to "SCCmec" (italicize "mec").

      Corrected.

    1. Joint Public Review:

      The present study explored the principles that allow cells to maintain complex subcellular proteinaceous structures despite the limited lifetimes of the individual protein components. This is particularly critical in the case of neurons, where the size and protein composition of synapses define synaptic strength and encode memory.

      PSD95 is an abundant synapse protein that acts as a scaffold in the recruitment of transmitter receptors and other signaling proteins and is required for proper memory formation. The authors used super-resolution microscopy to study PSD95 super-complexes isolated from the brains of mice expressing tagged PSD variants (Halo-Tag, mEos, GFP). Their results show compellingly that a large fraction (~25%) of super-complexes contains two PSD95 copies about 13 nm apart, that there is substantial turnover of PSD95 proteins in super-complexes over a period of seven days, and that ~5-20% of the super-complexes contain new and old PSD95 molecules. This percentage is higher in synaptic fractions as compared to total brain lysates, and highest in isocortex samples (~20%). These important findings support the notion put forward by Crick that sequential subunit replacement gives synaptic super-complexes long lifetimes and thus aids in memory maintenance. Overall, this is very interesting, providing key insights into how synaptic protein complexes are formed and maintained. On the other hand, the actual role of these PSD95 super-complexes in long-term memory storage remains unknown.

      Strengths

      (1) The study employed an appropriate and validated methodology.

      (2) Large numbers of PSD95 super-complexes from three different mouse models were imaged and analyzed, providing adequately powered sample sizes.

      (3) State-of-the-art super-resolution imaging techniques (PALM and MINFLUX) were used, providing a robust, high-quality, cross-validated analysis of PSD95 protein complexes that is useful for the community.

      (4) The result that PSD95 proteins in dimeric complexes are on average 12.7 nm apart is useful and has implications for studies on the nanoscale organization of PSD95 at synapses.

      (5) The finding that postsynaptic protein complexes can continue to exist while individual components are being renewed is important for our understanding of synapse maintenance and stability.

      (6) The data on the turnover rate of PSD95 in super-complexes from different brain regions provide a first indication of potentially meaningful differences in the lifetime of super-complexes between brain regions.

      Weaknesses

      (1) The manuscript emphasizes the hypothesis that stable super-complexes, maintained through sequential replacement of subunits, might underlie the long-term storage of memory. While an interesting idea, this notion requires considerably more research. The presented experimental data are indeed consistent with this notion, but there is no evidence that these complexes are causally related to memory storage.

      (2) Much of the presented work is performed on biochemically isolated protein complexes. The biochemical isolation procedures rely on physical disruption and detergents that are known to alter the composition and structure of complexes in certain cases. Thus, it remains unclear how the protein complexes described in this study relate to PSD95 complexes in intact synapses.

      (3) Because not all GFP molecules mature and fold correctly in vitro and the PSD95-mEos mice used were heterozygous, the interpretation of the corresponding quantifications is not straightforward.

      (4) It was not tested whether different numbers of PSD95 molecules per super-complex might contribute to different retention times of PSD95, e.g. in synaptic vs. total-forebrain super-complexes.

      (5) The conclusion that the population of 'mixed' synapses is higher in the isocortex than in other brain regions is not supported by statistical analysis.

      (6) The validity of conclusions regarding PSD95 degradation based on relative changes in the occurrence of SiR-Halo-positive puncta is limited.

    2. Author response:

      (1) The manuscript emphasizes the hypothesis that stable super-complexes, maintained through sequential replacement of subunits, might underlie the long-term storage of memory. While an interesting idea, this notion requires considerably more research. The presented experimental data are indeed consistent with this notion, but there is no evidence that these complexes are causally related to memory storage. 

      We agree with the reviewer that, while our data support the idea that subunit exchange in supercomplexes could underlie long-term memory storage, more research is necessary to conclusively validate this hypothesis. The experimental data presented are consistent with the idea that stable supercomplexes, maintained through sequential replacement of subunits, play a role in memory retention. However, establishing a causal relationship between these supercomplexes and memory storage will require additional experiments and in-depth analyses.

      (2) Much of the presented work is performed on biochemically isolated protein complexes. The biochemical isolation procedures rely on physical disruption and detergents that are known to alter the composition and structure of complexes in certain cases. Thus, it remains unclear how the protein complexes described in this study relate to PSD95 complexes in intact synapses. 

      Whilst it could be the case that biochemical isolation procedures have the potential to alter the composition and structure of protein complexes, we have previously published the protocol used to isolate PSD95-containing supercomplexes (Nat Commun. 2016; 7: 11264). In that study, we demonstrated that the isolated supercomplexes are approximately 1.5 MDa in size and contain multiple proteins, including other scaffolding proteins (e.g., PSD93) and receptors (e.g., NMDARs). Importantly, these supercomplexes remain stable when exposed to detergents and dilution, strongly indicating that they represent the native complexes present in intact synapses.

      (3) Because not all GFP molecules mature and fold correctly in vitro and the PSD95-mEos mice used were heterozygous, the interpretation of the corresponding quantifications is not straightforward. 

      Although genetic tagging ensures a 1:1 labeling stoichiometry, we acknowledge that the presence of unfolded GFP and the use of heterozygous PSD95-mEos mice can complicate the analysis. We have highlighted this limitation in the manuscript. Nonetheless, our results show a high level of consistency across the different genetic fusions used in this study.

      (4) It was not tested whether different numbers of PSD95 molecules per super-complex might contribute to different retention times of PSD95, e.g. in synaptic vs. total-forebrain super-complexes. 

      The potential impact of varying numbers of PSD95 molecules per super-complex on retention times was considered. However, our analysis showed minimal differences in the distribution of molecule numbers per super-complex between the synaptic and forebrain samples.

      (5) The conclusion that the population of 'mixed' synapses is higher in the isocortex than in other brain regions is not supported by statistical analysis. 

      The conclusion that the population of 'mixed' synapses is higher in the isocortex than in other brain regions is indeed supported by statistical analysis. All relevant statistical data are detailed in Table S2, and the finding is statistically significant. We will emphasize this point in the revised manuscript.

      (6) The validity of conclusions regarding PSD95 degradation based on relative changes in the occurrence of SiR-Halo-positive puncta is limited.

      We recognize that conclusions based solely on the relative changes in SiR-Halo-positive puncta concerning PSD95 degradation have limitations. To address this, we also quantified the “new” PSD95 by analyzing AF488-Halo-positive molecules.

    1. Reviewer #1 (Public Review):

      Summary:

      Bowler et al. present a thoroughly tested system for modularized behavioral control of navigation-based experiments, particularly suited for pairing with 2-photon imaging but applicable to a variety of techniques. This system, which they name behaviorMate, represents a valuable contribution to the field. As the authors note, behavioral control paradigms vary widely across laboratories in terms of hardware and software utilized and often require specialized technical knowledge to make changes to these systems. Having a standardized, easy-to-implement, and flexible system that can be used by many groups is therefore highly desirable. This work will be of interest to systems neuroscientists looking to integrate flexible head-fixed behavioral control with neural data acquisition.

      Strengths:

      The present manuscript provides compelling evidence of the functionality and applicability of behaviorMate. The authors report benchmark tests for real-time update speed between the animal's movement and the behavioral control, on both the treadmill-based and virtual reality (VR) setups. Further, they nicely demonstrate and quantify reliable hippocampal place cell coding in both setups, using synchronized 2-photon imaging. This place cell characterization also provides a concrete comparison between the place cell properties observed in treadmill-based navigation vs. visual VR in a single study, which itself is a helpful contribution to the field.

      Documentation for installing and operating behaviorMate is available via the authors' lab website and linked in the manuscript.

      Weaknesses:

      The following comments are mostly minor suggestions intended to add clarity to the paper and provide context for its significance.

      (1) As VRMate (a component of behaviorMate) is written using Unity, what is the main advantage of using behaviorMate/VRMate compared to using Unity alone paired with Arduinos (e.g. Campbell et al. 2018), or compared to using an existing toolbox to interface with Unity (e.g. Alsbury-Nealy et al. 2022, DOI: 10.3758/s13428-021-01664-9)? For instance, one disadvantage of using Unity alone is that it requires programming in C# to code the task logic. It was not entirely clear whether VRMate circumvents this disadvantage somehow -- does it allow customization of task logic and scenery in the GUI? Does VRMate add other features and/or usability compared to Unity alone? It would be helpful if the authors could expand on this topic briefly.

      (2) The section on "context lists", lines 163-186, seemed to describe an important component of the system, but this section was challenging to follow and readers may find the terminology confusing. Perhaps this section could benefit from an accompanying figure or flow chart, if these terms are important to understand.

      (2a) Relatedly, "context" is used to refer to both when the animal enters a particular state in the task like a reward zone ("reward context", line 447) and also to describe a set of characteristics of an environment (Figure 3G), akin to how "context" is often used in the navigation literature. To avoid confusion, one possibility would be to use "environment" instead of "context" in Figure 3G, and/or consider using a word like "state" instead of "context" when referring to the activation of different stimuli.

      (3) Given the authors' goal of providing a system that is easily synchronizable with neural data acquisition, especially with 2-photon imaging, I wonder if they could expand on the following features:

      (3a) The authors mention that behaviorMate can send a TTL to trigger scanning on the 2P scope (line 202), which is a very useful feature. Can it also easily generate a TTL for each frame of the VR display and/or each sample of the animal's movement? Such TTLs can be critical for synchronizing the imaging with behavior and accounting for variability in the VR frame rate or sampling rate.

      (3b) Is there a limit to the number of I/O ports on the system? This might be worth explicitly mentioning.

      (3c) In the VR version, if each display is run by a separate Android computer, is there any risk of clock drift between displays? Or is this circumvented by centralized control of the rendering onset via the "real-time computer"?

    2. eLife assessment

      This work represents a new toolkit for implementing virtual reality experiments in head-fixed animals. It is a valuable contribution to the field and the evidence for its utility and performance is solid. Some minor improvements in the material presented - including clarifying design decisions and providing more details about design features - would improve the readability and thereby potentially increase its impact.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present behaviorMate, an open-source behavior recording and control system including a central GUI and compatible treadmill and display components. Notably, the system utilizes the "Intranet of things" scheme and the components communicate through a local network, making the system modular, which in turn allows users to easily configure the setup to suit their experimental needs. Overall, behaviorMate is a valuable resource for researchers performing head-fixed imaging studies, as the commercial alternatives are often expensive and inflexible to modify.

      Strengths and Weaknesses:

      The manuscript presents two major utilities of behaviorMate: (1) as an open-source alternative to commercial behavior apparatus for head-fixed imaging studies, and (2) as a set of generic schema and communication protocols that allows the users to incorporate arbitrary recording and stimulation devices during a head-fixed imaging experiment. I found the first point well-supported and demonstrated in the manuscript. Indeed, the documentation, BOM, CAD files, circuit design, source, and compiled software, along with the manuscript, create an invaluable resource for neuroscience researchers looking to set up a budget-friendly VR and head-fixed imaging rig. Some features of behaviorMate, including the computer vision-based calibration of the treadmill, and the decentralized, Android-based display devices, are very innovative approaches and can be quite useful in practical settings. However, regarding the second point, my concern is that there is not adequate documentation and design flexibility to allow the users to incorporate arbitrary hardware into the system. In particular:

      (1) The central controlling logic is coupled with GUI and an event loop, without a documented plugin system. It's not clear whether arbitrary code can be executed together with the GUI, hence it's not clear how much the functionality of the GUI can be easily extended without substantial change to the source code of the GUI. For example, if the user wants to perform custom real-time analysis on the behavior data (potentially for closed-loop stimulation), it's not clear how to easily incorporate the analysis into the main GUI/control program.

      (2) The JSON messaging protocol lacks API documentation. It's not clear what the exact syntax, the supported key/value pairs, and the expected response/behavior of the JSON messages are. Hence, it's not clear how to develop new hardware that can communicate with the behaviorMate system.

      (3) It seems the existing control hardware and the JSON messaging only support GPIO/TTL types of input/output, which limits the applicability of the system to more complicated sensor/controller hardware. The authors mentioned that hardware like Arduino natively supports serial protocols like I2C or SPI, but it's not clear how they are handled and translated to JSON messages.

      Additionally, because it's unclear how easy it is to incorporate arbitrary hardware with behaviorMate, the "Intranet of things" approach seems to lose its appeal. Since the manuscript currently focuses mainly on a specific set of hardware designed for a specific type of experiment, it's not clear what the advantages are of implementing communication over a local network as opposed to the typical USB connections.

      In summary, the manuscript presents a well-developed open-source system for head-fixed imaging experiments with innovative features. The project is a very valuable resource to the neuroscience community. However, some claims in the manuscript regarding the extensibility of the system and protocol may require further development and demonstration.

    4. Reviewer #3 (Public Review):

      In this work, the authors present an open-source system called behaviorMate for acquiring data related to animal behavior. The temporal alignment of recorded parameters across various devices is highlighted as crucial to avoid delays caused by electronics dependencies. This system not only addresses this issue but also offers an adaptable solution for VR setups. Given the significance of well-designed open-source platforms, this paper holds importance.

      Advantages of behaviorMate:

      The cost-effectiveness of the system provided.

      The reliability of PCBs compared to custom-made systems.

      Open-source nature for easy setup.

      Plug & Play feature requiring no coding experience for optimizing experiment performance (only text-based JSON files, 'context List' required for editing).

      Points to clarify:

      While using UDP for data transmission can enhance speed, UDP by itself does not guarantee delivery. Are there error-checking mechanisms in place to ensure reliable communication, given its criticality alongside speed?

      Considering this year's price policy changes in Unity, could this impact the system's operations?

      Also, does the Arduino offer sufficient precision for ephys recording, particularly with a 10ms check?

      Could you clarify the purpose of the Sync Pulse? In line 291, it suggests additional cues (potentially represented by the Sync Pulse) are needed to align the treadmill screens, which appear to be directed towards the Real-Time computer. Given that event alignment occurs in the GPIO, the connection of the Sync Pulse to the Real-Time Controller in Figure 1 seems confusing. Additionally, why is there a separate circuit for the treadmill that connects to the UI computer instead of the GPIO? It might be beneficial to elaborate on the rationale behind this decision in line 260. Moreover, should scenarios involving pupil and body camera recordings connect to the Analog input in the PCB or the real-time computer for optimal data handling and processing?

      Given that all references, as far as I can see, come from the same lab, are there other labs capable of implementing this system at a similar optimal level?
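
      On the UDP reliability question raised above: UDP carries only a 16-bit checksum and gives no delivery or ordering guarantee, so applications usually add their own bookkeeping. The following is a minimal sketch of one common approach, a sequence number plus a CRC32 per message. It is not behaviorMate's actual protocol; the message layout, the `frame`/`parse` helpers, the `GapDetector` class, and the payload keys are all hypothetical.

```python
import json
import zlib


def frame(seq, payload):
    """Serialize a message with a sequence number and prepend a CRC32 of the body."""
    body = json.dumps({"seq": seq, "payload": payload}, sort_keys=True).encode("utf-8")
    return zlib.crc32(body).to_bytes(4, "big") + body


def parse(datagram):
    """Return the decoded message, or raise ValueError if the checksum fails."""
    if len(datagram) < 4:
        raise ValueError("datagram too short")
    crc, body = int.from_bytes(datagram[:4], "big"), datagram[4:]
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    return json.loads(body.decode("utf-8"))


class GapDetector:
    """Track sequence numbers so the receiver can report datagrams that never arrived."""

    def __init__(self):
        self.expected = 0   # next sequence number we have not yet seen
        self.missing = []   # sequence numbers skipped so far

    def observe(self, seq):
        if seq >= self.expected:
            self.missing.extend(range(self.expected, seq))
            self.expected = seq + 1
        elif seq in self.missing:
            self.missing.remove(seq)  # late (reordered) arrival


# Hypothetical payload; the keys are illustrative only.
datagram = frame(0, {"pin": 5, "action": "read"})
message = parse(datagram)
```

      A receiver that finds entries in `missing` could request retransmission or simply log the loss, depending on how critical the message type is.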

    1. eLife assessment

      The authors studied the relationship between structural and functional lateralization in the planum temporale region of the brain, whilst also considering the morphological presentation of a single or duplicated Heschl's gyrus. The analyses are convincing due to a large sample size, inter-rater reliability, and corrections for multiple comparisons. The associations in this valuable work might serve as a reference for future targeted studies on brain lateralization.

    2. Reviewer #1 (Public Review):

      Summary:

      Qin and colleagues analysed data from the Human Connectome Project on four right-handed subgroups with different gyrification patterns in Heschl's gyrus. Based on these groups, the authors highlight the structure-function relationship of planum temporale asymmetry in lateralised language processing at the group level and next at the individual level. In particular, the authors propose that especially microstructural asymmetries are related to functional auditory language asymmetries in the planum temporale.

      Strengths:

      The study is interesting because of an ongoing and long-standing debate about the relationship between structural and functional brain asymmetries, and in particular whether structural brain asymmetries can be seen as markers of functional language brain lateralisation.

      In this debate, the relationship between Heschl's gyrus asymmetry and planum temporale asymmetry is rarely examined, which makes its treatment here valuable. A large sample size and inter-rater reliability support the findings.

      Weaknesses:

      In this case of multiple brain measures, it would be important to provide the reader with some sort of effect size (e.g. Cohen's d) to help interpret the results. In addition, the authors highlight the microstructural results in spite of the macrostructural results. However, the macrostructural surface results are also strong. I would suggest either reducing the emphasis on micro vs macrostructural results or adding information to justify the microstructural importance.
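
      As an illustration of the effect size requested above, the following minimal sketch computes Cohen's d for two independent groups using the pooled standard deviation. The function name and the example values are hypothetical and are not taken from the manuscript.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances (n - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


# Hypothetical asymmetry indices for two morphology groups.
group_single = [2.0, 4.0, 6.0]
group_duplicated = [1.0, 3.0, 5.0]
d = cohens_d(group_single, group_duplicated)
```

      Reporting d (or a comparable standardized measure) alongside each p-value would let readers compare the micro- and macrostructural effects on a common scale.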

    3. Reviewer #2 (Public Review):

      Summary:

      The authors assessed the link between structural and functional lateralization in area PT, one of the brain areas that is known to exhibit strong structural lateralization, and which is known to be implicated in speech processing. Importantly, they included the sulcal configuration of Heschl's gyrus (HG), presenting either as a single or duplicated HG, in their analysis. They found several significant associations between microstructural indices and task-based functional lateralization, some of which depended on the sulcal configuration.

      Strengths:

      A clear strength is the large sample size (n=907), an openly available database, and the fact that HG morphology was manually classified in each individual. This allows for robust statistical testing of the effects across morphological categories, which is not often seen in the literature.

      Weaknesses:

      - Unfortunately, no left-handers were included in the study. Studying the effect of handedness on the observed associations would have been a valuable addition to the literature, as many previous studies on this topic were not adequately powered. The fact that only right-handers were studied should be pointed out clearly in the introduction or even the abstract.

      - The tasks to quantify functional lateralization were not specifically designed to pick up lateralization. In the interest of the sample size, it is understandable that the authors used the available HCP-task-battery results. However, it would have been feasible to access another dataset for validation. A targeted subset of results, concerning for example the relationship between sulcal morphology and task-based functional lateralization, could be re-assessed using other open-access fMRI datasets.

      - The study is mainly descriptive and the general discussion of the findings in the larger context of brain lateralization falls a bit short. For example, are the observed effects in line with what we know from other 'language-relevant' areas? What could be the putative mechanisms that give rise to functional lateralization based on the microstructural markers observed? And which mechanisms might be underlying the formation of a duplicated HG?

    1. eLife assessment

      This study used deep neural networks (DNN) to reconstruct voice information (viz., speaker identity), from fMRI responses in the auditory cortex and temporal voice areas, and assessed the representational content in these areas with decoding. A DNN-derived feature space approximated the neural representation of speaker identity-related information. While some of the neural decoding results are valuable, the overall evidence for general representational and computational principles is incomplete as the results rely on a very specific model architecture.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors trained a variational autoencoder (VAE) to create a high-dimensional "voice latent space" (VLS) using extensive voice samples, and analyzed how this space corresponds to brain activity through fMRI studies focusing on the temporal voice areas (TVAs). Their analyses included encoding and decoding techniques, as well as representational similarity analysis (RSA), which showed that the VLS could effectively map onto and predict brain activity patterns, allowing for the reconstruction of voice stimuli that preserve key aspects of speaker identity.

      Strengths:

      This paper is well-written and easy to follow. Most of the methods and results were clearly described. The authors combined a variety of analytical methods in neuroimaging studies, including encoding, decoding, and RSA. In addition to commonly used DNN encoding analysis, the authors performed DNN decoding and resynthesized the stimuli using VAE decoders. Furthermore, in addition to machine learning classifiers, the authors also included human behavioral tests to evaluate the reconstruction performance.

      Weaknesses:

      This manuscript presents a variational autoencoder (VAE) to evaluate voice identity representations from brain recordings. However, the study's scope is limited by testing only one model, leaving unclear how generalizable or impactful the findings are. The preservation of identity-related information in the voice latent space (VLS) is expected, given the VAE model's design to reconstruct original vocal stimuli. Nonetheless, the study lacks a deeper investigation into what specific aspects of auditory coding these latent dimensions represent. The results in Figure 1c-e merely tested a very limited set of speech features. Moreover, there is no analysis of how these features and the whole VAE model perform in standard speech tasks like speech recognition or phoneme recognition. It is not clear what kind of computations the VAE model presented in this work is capable of. Inclusion of comparisons with state-of-the-art unsupervised or self-supervised speech models known for their alignment with auditory cortical responses, such as Wav2Vec2, HuBERT, and Whisper, would strengthen the validation of the VAE model and provide insights into its relative capabilities and limitations.

      The claim that the VLS outperforms a linear model (LIN) in decoding tasks does not significantly advance our understanding of the underlying brain representations. Given the complexity of auditory processing, it is unsurprising that a nonlinear model would outperform a simpler linear counterpart. The study could be improved by incorporating a comparative analysis with alternative models that differ in architecture, computational strategies, or training methods. Such comparisons could elucidate specific features or capabilities of the VLS, offering a more nuanced understanding of its effectiveness and the computational principles it embodies. This approach would allow the authors to test specific hypotheses about how different aspects of the model contribute to its performance, providing a clearer picture of the shared coding in VLS and the brain.

      The manuscript overlooks some crucial alternative explanations for the discriminant representation of vocal identity. For instance, the discriminant representation of vocal identity can be either a higher-level abstract representation or a lower-level coding of pitch height. Prior studies using fMRI and ECoG have identified both types of representation within the superior temporal gyrus (STG) (e.g., Tang et al., Science 2017; Feng et al., NeuroImage 2021). Additionally, the methodology does not clarify whether the stimuli from different speakers contained identical speech content. If the speech content varied across speakers, the approach of averaging trials to obtain a mean vector for each speaker (the "identity-based analysis") may not adequately control for confounding acoustic-phonetic features. Notably, the principal component 2 (PC2) in Figure 1b appears to correlate with absolute pitch height, suggesting that some aspects of the model's effectiveness might be attributed to simpler acoustic properties rather than complex identity-specific information.

      Methodologically, there are issues that warrant attention. In characterizing the autoencoder latent space, the authors initialized logistic regression classifiers 100 times and calculated the t-statistics using degrees of freedom (df) of 99. Given that logistic regression is a convex optimization problem typically converging to a global optimum, these multiple initializations of the classifier were likely not entirely independent. Consequently, the reported degrees of freedom and the effect size estimates might not accurately reflect the true variability and independence of the classifier outcomes. A more careful evaluation of these aspects is necessary to ensure the statistical robustness of the results.
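
      The convexity point above can be illustrated with a small sketch: L2-regularized logistic regression has a single global optimum, so fits started from different random initializations on the same data converge to the same weights. The toy data, the `fit_logreg` helper, and its hyperparameters are assumptions for illustration; this does not reproduce the authors' classifier or training splits.

```python
import math
import random


def fit_logreg(X, y, seed, l2=0.1, lr=0.5, steps=5000):
    """Gradient descent on the L2-regularized logistic loss from a random start."""
    rng = random.Random(seed)
    w = [rng.uniform(-3, 3) for _ in range(len(X[0]))]  # random initialization
    n = len(X)
    for _ in range(steps):
        grad = [l2 * wj for wj in w]  # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj / n  # gradient of the mean log loss
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy, non-separable data: first column is a bias term, second is one feature.
X = [[1, -2.0], [1, -1.0], [1, -0.5], [1, 0.5], [1, 1.0], [1, 2.0]]
y = [0, 0, 1, 0, 1, 1]

fits = [fit_logreg(X, y, seed) for seed in range(5)]
# Largest disagreement between any fit and the first one, across all weights.
spread = max(abs(a - b) for w in fits for a, b in zip(w, fits[0]))
```

      If the 100 classifiers were trained on identical data, the variability across them would be close to this numerical `spread`, which is why treating them as 100 independent samples (df = 99) is questionable. Resampling the data (for example, cross-validation folds or bootstrap samples) would give a more meaningful estimate of variability.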

    3. Reviewer #2 (Public Review):

      Summary:

      Lamothe et al. collected fMRI responses to many voice stimuli in 3 subjects. The authors trained two different autoencoders on voice audio samples and predicted latent space embeddings from the fMRI responses, allowing the voice spectrograms to be reconstructed. The degree to which reconstructions from different auditory ROIs correctly represented speaker identity, gender, or age was assessed by machine classification and human listener evaluations. Complementing this, the representational content was also assessed using representational similarity analysis. The results broadly concur with the notion that temporal voice areas are sensitive to different types of categorical voice information.

      Strengths:

      The single-subject approach that allows thousands of responses to unique stimuli to be recorded and analyzed is powerful. The idea of using this approach to probe cortical voice representations is strong and the experiment is technically solid.

      Weaknesses:

      The paper could benefit from more discussion of the assumptions behind the reconstruction analyses and the conclusions they allow. The authors write that reconstruction of a stimulus from brain responses represents 'a robust test of the adequacy of models of brain activity' (L138). I concur that stimulus reconstruction is useful for evaluating the nature of representations, but the notion that it can test the adequacy of the specific autoencoder presented here as a model of brain activity should be discussed at more length. Natural sounds are correlated in many feature dimensions and can therefore be summarized in several ways, and similar information can be read out from different model representations. Models trained to reconstruct natural stimuli can exploit many correlated features and it is quite possible that very different models based on different features can be used for similar reconstructions. Reconstructability does not by itself imply that the model is an accurate brain model. Non-linear networks trained on natural stimuli are arguably not tested in the same rigorous manner as models built to explicitly account for computations (they can generate predictions and experiments can be designed to test those predictions). While it is true that there is increasing evidence that neural network embeddings can predict brain data well, it is still a matter of debate whether good predictability by itself qualifies DNNs as 'plausible computational models for investigating brain processes' (L72). This concern is amplified in the context of decoding and naturalistic stimuli where many correlated features can be represented in many ways. It is unclear how much the results hinge on the specifics of the autoencoder architectures used. For instance, it would be useful to know the motivations for why the specific VAE used here should constitute a good model for probing neural voice representations.

      Relatedly, it is not clear how VAEs as generative models are motivated as computational models of voice representations in the brain. The task of voice areas in the brain is not to generate voice stimuli but to discriminate and extract information. The task of reconstructing an input spectrogram is perhaps useful for probing information content, but discriminative models, e.g., trained on the task of discriminating voices, would seem more obvious candidates. Why not include discriminatively trained models for comparison?

      The autoencoder learns a mapping from latent space to well-formed voice spectrograms. Regularized regression then learns a mapping between this latent space and activity space. All reconstructions might sound 'natural', which simply means that the autoencoder works. It would be good to have a stronger test of how close the reconstructions are to the original stimulus. For instance, among all the experimental stimuli, is the original the closest to the reconstruction in latent space coordinates, or where does it rank? How do small changes in beta amplitudes impact the reconstruction? The effective dimensionality of the activity space could be estimated, e.g. by PCA of the voice samples' contrast maps, and it could then be estimated how the main directions in the activity space map to differences in latent space. It would be good to get a better grasp of the granularity of information that can be decoded/reconstructed.
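
      The ranking test suggested above can be sketched as follows, assuming each stimulus has a latent vector and that Euclidean distance is an acceptable similarity measure in that space. The function name and the toy vectors are hypothetical.

```python
import math


def identification_rank(decoded, candidates, true_index):
    """Rank (1 = best) of the true stimulus among candidates by Euclidean distance."""
    dists = [math.dist(decoded, c) for c in candidates]
    # Count candidates strictly closer to the decoded vector than the true one.
    return 1 + sum(d < dists[true_index] for d in dists)


# Toy 2-D latent vectors for three stimuli, and one brain-decoded vector.
stimulus_latents = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
decoded_latent = [0.9, 1.2]
rank = identification_rank(decoded_latent, stimulus_latents, true_index=1)
```

      Averaging this rank over test stimuli and comparing it with the chance level, (N + 1) / 2 for N candidates, would quantify how specific the reconstructions are beyond sounding natural.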

      What can we make of the apparent trend that LIN is higher than VLS for identity classification (at least VLS does not outperform LIN)? A general argument of the paper seems to be that VLS is a better model of voice representations compared to LIN as a 'control' model. Then we would expect VLS to perform better on identity classification. The age and gender of a voice can likely be classified from many acoustic features that may not require dedicated voice processing.

      The RDM results reported are significant only for some subjects and in some ROIs. This presumably means that results are not significant in the other subjects. Yet, the authors assert general conclusions (e.g. the VLS better explains RDM in TVA than LIN). An assumption typically made in single-subject studies (with large amounts of data in individual subjects) is that the effects observed and reported in papers are robust in individual subjects. More than one subject is usually included to hint that this is the case. This is an intriguing approach. However, reports of effects that are statistically significant in some subjects and some ROIs are difficult to interpret. This, in my view, runs contrary to the logic and leverage of the single-subject approach. Reporting results that are only significant in 1 out of 3 subjects and inferring general conclusions from this seems less convincing.

      The first main finding is stated as being that '128 dimensions are sufficient to explain a sizeable portion of the brain activity' (L379). What qualifies this? From my understanding, only models of that dimensionality were tested. They explain a sizeable portion of brain activity, but it is difficult to follow what 'sizable' is without baseline models that estimate a prediction floor and ceiling. For instance, would autoencoders that reconstruct any spectrogram (not just voice) also predict a sizable portion of the measured activity? What happens to reconstruction results as the dimensionality is varied?

      A second main finding is stated as being that the 'VLS outperforms the LIN space' (L381). It seems correct that the VAE yields more natural-sounding reconstructions, but this is a technical feature of the chosen autoencoding approach. That the VLS yields a 'more brain-like representational space' I assume refers to the RDM results where the RDM correlations were mainly significant in one subject. For classification, the performance of features from the reconstructions (age/gender/identity) gives results that seem more mixed, and it seems difficult to draw a general conclusion about the VLS being better. It is not clear that this general claim is well supported.

      It is not clear why the RDM was not formed based on the 'stimulus GLM' betas. The 'identity GLM' is already biased towards identity and it would be stronger to show associations at the stimulus level.

      Multiple comparisons were performed across ROIs, models, subjects, and features in the classification analyses, but it is not clear how correction for these multiple comparisons was implemented in the statistical tests on classification accuracies.
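
      One standard way to handle the multiple comparisons mentioned above is Holm's step-down procedure, which controls the family-wise error rate. The sketch below is illustrative only; the p-values are made up, and the manuscript may call for a different correction (for example, false discovery rate control).

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a reject (True) / retain (False) flag per test using Holm's procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] > alpha / (m - rank):
            break  # this and all larger p-values are retained
        reject[i] = True
    return reject


# Hypothetical p-values from four ROI x model classification tests.
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
```

      Stating the family over which the correction is applied (ROIs, models, subjects, features, or all of them) matters as much as the choice of procedure.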

      Risks of overfitting and bias are a recurrent challenge in stimulus reconstruction with fMRI. It would be good to have more control analyses to ensure that this was not the case. For instance, how were the repeated test stimuli presented? Were they intermingled with the other stimuli used for training or presented in separate runs? If intermingled, then the training and test data would have been preprocessed together, which could compromise the test set. The reconstructions could be performed on responses from independent runs, preprocessed separately, as a control. This should include all preprocessing, for instance, estimating stimulus/identity GLMs on separately processed run pairs rather than across all runs. Also, it would be good to avoid detrending before GLM denoising (or at least to test its effects) as these can interact.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Lamothe et al. sought to identify the neural substrates of voice identity in the human brain by correlating fMRI recordings with the latent space of a variational autoencoder (VAE) trained on voice spectrograms. They used encoding and decoding models, and showed that the "voice" latent space (VLS) of the VAE performs, in general, (slightly) better than a linear autoencoder's latent space. Additionally, they showed dissociations in the encoding of voice identity across the temporal voice areas.

      Strengths:

      - The geometry of the neural representations of voice identity has not been studied so far. Previous studies on the content of speech and faces in vision suggest that such geometry could exist. This study demonstrates this point systematically, leveraging a specifically trained variational autoencoder.

      - The size of the voice dataset and the length of the fMRI recordings ensure that the findings are robust.

      Weaknesses:

      - Overall, the VLS is often only marginally better than the linear model across analyses, raising the question of whether the observed performance improvements are due to the higher number of parameters trained in the VAE, rather than the non-linearity itself. A fair comparison would necessitate that the number of parameters be maintained consistently across both models, at least as an additional verification step.

      - The encoding and RSM results are quite different. This is unexpected, as similar embedding geometries between the VLS and the brain activations should be reflected by higher correlation values of the encoding model.

      - The consistency across participants is not particularly high; for instance, S1 seemed to demonstrate excellent performance, while S2 showed poor performance.

      - An important control analysis would be to compare the decoding results with those obtained by a decoder operating directly on the latent spaces, in order to further highlight the interest of the non-linear transformations of the decoder model. Currently, it is unclear whether the non-linearity of the decoder improves the decoding performance, considering the poor resemblance between the VLS and brain-reconstructed spectrograms.

    1. eLife assessment

      This useful study measured how information about object categories varies with time in EEG responses to object images in human participants and found that real-world size, retinal size, and real-world depth are represented at different time points in the response. The evidence presented is incomplete and can be further strengthened by removing confounds related to other covarying properties such as semantic categories, and by clarifying the partial correlations that are used to support the conclusions.

    2. Reviewer #1 (Public Review):

      Lu & Golomb combined EEG, artificial neural networks, and multivariate pattern analyses to examine how different visual variables are processed in the brain. The conclusions of the paper are mostly well supported, but some aspects of methods and data analysis would benefit from clarification and potential extensions.

      The authors find that not only real-world size is represented in the brain (which was known), but both retinal size and real-world depth are represented, at different time points or latencies, which may reflect different stages of processing. Prior work has not been able to answer the question of real-world depth due to the stimuli used. The authors made this possible by assessing real-world depth and testing it with appropriate methodology, accounting for retinal and real-world size. The methodological approach combining behavior, RSA, and ANNs is creative and well thought out to appropriately assess the research questions, and the findings may be very compelling if backed up with some clarifications and further analyses.

      The work will be of interest to experimental and computational vision scientists, as well as the broader computational cognitive neuroscience community as the methodology is of interest and the code is or will be made available. The work is important as it is currently not clear what the correspondence between many deep neural network models and the brain is, and this work pushes our knowledge forward on this front. Furthermore, the availability of methods and data will be useful for the scientific community.

      Some analyses are incomplete, which would be improved if the authors showed analyses with other layers of the networks and various additional partial correlation analyses.

      Clarity

      (1) Partial correlations methods incomplete - it is not clear what is being partialled out in each analysis. It is possible to guess sometimes, but it is not entirely clear for each analysis. This is important as it is difficult to assess if the partial correlations are sensible/correct in each case. Also, the Figure 1 caption is short and unclear.

      For example, ANN-EEG partial correlations - "Finally, we directly compared the timepoint-by-timepoint EEG neural RDMs and the ANN RDMs (Figure 3F). The early layer representations of both ResNet and CLIP were significantly correlated with early representations in the human brain" What is being partialled out? Figure 3F is labeled as a partial correlation.
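
      To make explicit what "partialling out" means here, the following minimal sketch computes a first-order partial correlation between two vectorized RDMs while controlling for a third. It uses Pearson correlation for simplicity (RSA often uses Spearman), and the variable names and toy values are hypothetical rather than the authors' pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy vectorized RDMs (upper triangles): both depend on a shared third variable.
control_rdm = [1, 2, 3, 4, 5, 6, 7, 8]
eeg_rdm = [c + e for c, e in zip(control_rdm, [1, -1, -1, 1, 1, -1, -1, 1])]
model_rdm = [c + e for c, e in zip(control_rdm, [1, 1, -1, -1, -1, -1, 1, 1])]

raw_r = pearson(eeg_rdm, model_rdm)
partial_r = partial_corr(eeg_rdm, model_rdm, control_rdm)
```

      In this toy case the raw correlation is high while the partial correlation vanishes, which is why each analysis should state exactly which RDMs play the role of `control_rdm`.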

      Issues / open questions

      (2) Semantic representations vs hypothesized (hyp) RDMs (real-world size, etc) - are the representations explained by variables in hyp RDMs or are there semantic representations over and above these? E.g., For ANN correlation with the brain, you could partial out hyp RDMs - and assess whether there is still semantic information left over, or is the variance explained by the hyp RDMs?

      (3) Why only early and late layers? I can see how it's clearer to present the EEG results. However, the many layers in these networks are an opportunity - we can see how simple or complex, and how linear or non-linear, the transformation is over layers in these models. It would be very interesting and informative to see if the correlations do in fact linearly increase from early to later layers, or if the story is a bit more complex. If not in the main text, then at least in the supplement.

      (4) Peak latency analysis - Estimating peaks per participant is presumably noisy, so it seems important to show how reliable this is. One option is to find the bootstrapped mean latencies per subject.

      (5) "Due to our calculations being at the object level, if there were more than one of the same objects in an image, we cropped the most complete one to get a more accurate retinal size. " Did the EEG experimenters make sure that everyone sat at the same distance from the screen, and remained at that distance throughout? This would also affect real-world depth measures.

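      Regarding point (4), one way to gauge the reliability of a peak latency is a percentile bootstrap. The sketch below resamples a set of time courses with replacement (trials within a participant, or participants within the group), averages them, and records the peak time. This is one reading of the suggestion, not the authors' method; the function names and toy data are hypothetical.

```python
import random


def peak_latency(timecourse, times):
    """Time of the maximum value in a single time course."""
    return times[max(range(len(timecourse)), key=timecourse.__getitem__)]


def bootstrap_peak_latency(timecourses, times, n_boot=1000, seed=0):
    """95% percentile interval of the peak latency of the resampled mean time course."""
    rng = random.Random(seed)
    n = len(timecourses)
    peaks = []
    for _ in range(n_boot):
        sample = [timecourses[rng.randrange(n)] for _ in range(n)]  # with replacement
        mean_tc = [sum(vals) / n for vals in zip(*sample)]
        peaks.append(peak_latency(mean_tc, times))
    peaks.sort()
    return peaks[int(0.025 * n_boot)], peaks[int(0.975 * n_boot) - 1]


# Toy similarity time courses (arbitrary units) sampled at four time points in ms.
times_ms = [0, 10, 20, 30]
courses = [[0, 1, 3, 1], [0, 2, 4, 1], [1, 1, 5, 2]]
interval = bootstrap_peak_latency(courses, times_ms)
```

      A wide interval would indicate that latency differences between feature spaces should be interpreted with caution.
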
    3. Reviewer #2 (Public Review):

      Summary:

      This paper aims to test if neural representations of images of objects in the human brain contain a 'pure' dimension of real-world size that is independent of retinal size or perceived depth. To this end, the authors apply representational similarity analysis on EEG responses in 10 human subjects to a set of 200 images from a publicly available database (THINGS-EEG2), correlating pairwise distinctions in evoked activity between images with pairwise differences in human ratings of real-world size (from THINGS+). By partialling out correlations with metrics of retinal size and perceived depth from the resulting EEG correlation time courses, the paper claims to identify an independent representation of real-world size starting at 170 ms in the EEG signal. Further comparisons with artificial neural networks and language embeddings lead the authors to claim this correlation reflects a relatively 'high-level' and 'stable' neural representation.

      Strengths:

      - The paper features insightful illustrations and clear figures.

      - The limitations of prior work motivating the current study are clearly explained and seem reasonable (although the rationale for why using 'ecological' stimuli with backgrounds matters when studying real-world size could be made clearer; one could also argue the opposite, that to get a 'pure' representation of the real-world size of an 'object concept', one should actually show objects in isolation).

      - The partial correlation analysis convincingly demonstrates how correlations between feature spaces can affect their correlations with EEG responses (and how taking into account these correlations can disentangle them better).

      - The RSA analysis and associated statistical methods appear solid.

      Weaknesses:

      - The claim of methodological novelty is overblown. Comparing image metrics, behavioral measurements, and ANN activations against EEG using RSA is a commonly used approach to study neural object representations. The dataset size (200 test images from THINGS) is not particularly large, and neither comparing pre-trained DNNs and language models nor using partial correlations is particularly novel.

      - The claims also seem too broad given the fairly small set of RDMs that are used here (3 size metrics, 4 ANN layers, 1 Word2Vec RDM): there are many aspects of object processing not studied here, so it's not correct to say this study provides a 'detailed and clear characterization of the object processing process'.

      - The paper lacks an analysis demonstrating the validity of the real-world depth measure, which is here computed from the other two metrics by simply dividing them. The rationale and logic of this metric are not clearly explained. Is it intended to reflect the hypothesized egocentric distance to the object in the image if the person had in fact been 'inside' the image? How do we know this is valid? It would be helpful if the authors provided a validation of this metric.

      - Given that there is only 1 image/concept here, the factor of real-world size may be confounded with other things, such as semantic category (e.g. buildings vs. tools). While the comparison of the real-world size metric appears to be effectively disentangled from retinal size and (the authors' metric of) depth here, there are still many other object properties that are likely correlated with real-world size and therefore will confound identifying a 'pure' representation of real-world size in EEG. This could be addressed by adding more hypothesis RDMs reflecting different aspects of the images that may correlate with real-world size.

      - The choice of ANNs lacks a clear motivation. Why these two particular networks? Why pick only 2 somewhat arbitrary layers? If the goal is to identify more semantic representations using CLIP, the comparison between CLIP and vision-only ResNet should be done with models trained on the same training datasets (to exclude the effect of training dataset size & quality; cf Wang et al., 2023). This is necessary to substantiate the claims on page 19 which attributed the differences between models in terms of their EEG correlations to one of them being a 'visual model' vs. 'visual-semantic model'.

      - The first part of the claim on page 22 based on Figure 4 'The above results reveal that real-world size emerges with later peak neural latencies and in the later layers of ANNs, regardless of image background information' is not valid since no EEG results for images without backgrounds are shown (only ANNs).

      Appraisal of claims:

      While the method shows useful and interesting patterns of results can be obtained by combining contrasting behavioral/image metrics, the lack of additional control models makes the evidence for the claimed unconfounded representation of real-world size in EEG responses incomplete.

      Discussion of likely impact:

      The paper is likely to impact the field by showcasing how using partial correlations in RSA is useful, rather than providing conclusive evidence regarding neural representations of objects and their sizes.

      Additional context important to consider when interpreting this work:

      - Page 20, the authors point out similarities of peak correlations between models ('Interestingly, the peaks of significant time windows for the EEG × HYP RSA also correspond with the peaks of the EEG × ANN RSA timecourse (Figure 3D,F)'). Although not explicitly stated, this seems to imply that they infer that the ANN-EEG correlation might be driven by the ANNs' representation of the hypothesized feature spaces. However, this does not follow: in EEG-image metric model comparisons it is very typical to see multiple peaks for any type of model. These peaks simply reflect specific time points in the EEG at which visual inputs (images) yield distinctive EEG amplitudes (perhaps due to stereotypical waves of neural processing?), but one cannot infer that the information being processed is the same. To investigate this, one could for example conduct variance partitioning or commonality analysis to see if there is variance at these specific time-points that is shared by a specific combination of the hypothesis and ANN feature spaces.

      - Page 22 mentions 'The significant time-window (90-300ms) of similarity between Word2Vec RDM and EEG RDMs (Figure 5B) contained the significant time-window of EEG x real-world size representational similarity (Figure 3B)'. This is not particularly meaningful given that the Word2Vec correlation is significant for the entire EEG epoch (from the time-point of the signal 'arriving' in visual cortex around ~90 ms) and is thus much less temporally specific than the real-world size EEG correlation. Again a stronger test of whether Word2Vec indeed captures neural representations of real-world size could be to identify EEG time-points at which there are unique Word2Vec correlations that are not explained by either ResNet or CLIP, and see if those time-points share variance with the real-world size hypothesized RDM.
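
      The commonality analysis suggested in the two points above can be sketched for the simplest case of two predictor RDMs at a single EEG time point. This is a minimal illustration under stated assumptions: Pearson correlations on vectorized RDMs, predictors that are not perfectly collinear, and hypothetical toy values and names (`eeg`, `hyp`, `ann`); it is not the analysis performed in the manuscript.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def commonality(y, a, b):
    """Split the variance in y explained by predictors a and b into
    unique and shared parts (assumes a and b are not perfectly collinear)."""
    r_ya, r_yb, r_ab = pearson(y, a), pearson(y, b), pearson(a, b)
    r2_a, r2_b = r_ya ** 2, r_yb ** 2
    # R^2 of the two-predictor linear model, written in terms of correlations.
    r2_ab = (r2_a + r2_b - 2 * r_ya * r_yb * r_ab) / (1 - r_ab ** 2)
    return {
        "unique_a": r2_ab - r2_b,
        "unique_b": r2_ab - r2_a,
        "shared": r2_a + r2_b - r2_ab,
        "total": r2_ab,
    }


# Hypothetical vectorized RDMs at one time point (toy values only).
eeg = [2.0, 1.0, 4.0, 6.0, 5.0]
hyp = [1.0, 2.0, 3.0, 4.0, 5.0]  # e.g. a real-world size RDM
ann = [1.0, 3.0, 2.0, 5.0, 4.0]  # e.g. a late ANN layer RDM

parts = commonality(eeg, hyp, ann)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

      A large shared component at a peak would support the inference that the two feature spaces explain the same EEG variance; a small one would argue against it. Note that the shared term can be negative when suppression effects are present.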

    4. Reviewer #3 (Public Review):

      The authors used an open EEG dataset of observers viewing real-world objects. Each object had a real-world size value (from human rankings), a retinal size value (measured from each image), and a scene depth value (inferred from the above). The authors combined the EEG and object measurements with extant, pre-trained models (a deep convolutional neural network, a multimodal ANN, and Word2vec) to assess the time course of processing object size (retinal and real-world) and depth. They found that depth was processed first, followed by retinal size, and then real-world size. The depth time course roughly corresponded to the visual ANNs, while the real-world size time course roughly corresponded to the more semantic models.

      The time course result for the three object attributes is very clear and a novel contribution to the literature. However, the motivations for the ANNs could be better developed, the manuscript could better link to existing theories and literature, and the ANN analysis could be modernized. I have some suggestions for improving specific methods.

      (1) Manuscript motivations
      The authors motivate the paper in several places by asking "whether biological and artificial systems represent object real-world size". This seems odd for a couple of reasons. First, the brain must represent real-world size somehow, given that we can reason about this question. Second, given the large behavioral and fMRI literature on the topic, combined with the growing ANN literature, this seems like a foregone conclusion and undermines the novelty of this contribution.

      While the introduction further promises to "also investigate possible mechanisms of object real-world size representations.", I was left wishing for more in this department. The authors report correlations between neural activity and object attributes, as well as between neural activity and ANNs. It would be nice to link the results to theories of object processing (e.g., a feedforward sweep, such as DiCarlo and colleagues have suggested, versus a reverse hierarchy, such as suggested by Hochstein, among others). What is semantic about real-world size, and where might this information come from? (Although you may have to expand beyond the posterior electrodes to do this analysis).

      Finally, several places in the manuscript tout the "novel computational approach". This seems odd because the computational framework and pipeline have been the most common approach in cognitive computational neuroscience in the past 5-10 years.

      (2) Suggestion: modernize the approach
      I was surprised that the computational models used in this manuscript were all 8-10 years old. Specifically, because there are now deep nets that more explicitly model the human brain (e.g., CORnet) as well as more sophisticated models of semantics (e.g., LLMs), I was left hoping that the authors had used more state-of-the-art models in the work. Moreover, the use of a single dCNN, a single multi-modal model, and a single word embedding model makes it difficult to generalize about visual, multimodal, and semantic features in general.

      (3) Methodological considerations
      a) Validity of the real-world size measurement
      I was concerned about a few aspects of the real-world size rankings. First, I am trying to understand why the scale goes from 100-519. This seems very arbitrary; please clarify. Second, are we to assume that this scale is linear? Is this appropriate when real-world object size is best expressed on a log scale? Third, the authors provide "sand" as an example of the smallest real-world object. This is tricky because sand is more "stuff" than "thing", so I imagine it leaves observers wondering whether the experimenter intends a grain of sand or a sandy scene region. What is the variability in real-world size ratings? Might the variability also provide additional insights in this experiment?
      b) This work has no noise ceiling to establish how strong the model fits are, relative to the intrinsic noise of the data. I strongly suggest that these are included.
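
      For readers unfamiliar with the noise ceiling requested in point b), one common leave-one-subject-out formulation can be sketched as follows. This is a simplified illustration, assuming Pearson correlation on vectorized per-subject RDMs with no rank transform or z-scoring; the function name `noise_ceiling` and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_vector(vectors):
    """Element-wise mean across a list of equal-length vectors."""
    return [sum(vals) / len(vals) for vals in zip(*vectors)]


def noise_ceiling(subject_rdms):
    """Return (lower, upper) noise-ceiling estimates.

    upper: mean correlation of each subject with the mean of all subjects.
    lower: mean correlation of each subject with the mean of the others.
    """
    grand_mean = mean_vector(subject_rdms)
    lower, upper = [], []
    for i, rdm in enumerate(subject_rdms):
        others = subject_rdms[:i] + subject_rdms[i + 1:]
        upper.append(pearson(rdm, grand_mean))
        lower.append(pearson(rdm, mean_vector(others)))
    n = len(subject_rdms)
    return sum(lower) / n, sum(upper) / n


# Hypothetical vectorized RDMs for three subjects (toy values only).
rdms = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 3.0, 2.0, 4.0],
    [2.0, 1.0, 3.0, 4.0],
]
low, high = noise_ceiling(rdms)
print(f"noise ceiling: lower = {low:.3f}, upper = {high:.3f}")
```

      A model correlation that falls within this band explains about as much variance as the between-subject reliability of the data allows.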

    1. eLife assessment

      Some delayed rectifier currents in neurons are formed by the combination of Kv2 and silent subunits, KvS. However, we lack the tools to identify these heteromeric channels in vivo. In this valuable study by the Sack group, the authors identify a pharmacological tool that can reveal the presence of KvS subunits as components of the delayed rectifier potassium currents in selected neurons. The experimental evidence presented in the manuscript is compelling and represents an advance that should be of interest to a wide community of neuroscientists and channel physiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      Kv2 subfamily potassium channels contribute to delayed rectifier currents in virtually all mammalian neurons and are encoded by two distinct types of subunits: Kv2 alpha subunits that have the capacity to form homomeric channels (Kv2.1 and Kv2.2), and KvS or silent subunits (Kv5, 6, 8, 9) that can assemble with Kv2.1 or Kv2.2 to form heteromeric channels with novel biophysical properties. Many neurons express both types of subunits and therefore have the capacity to make both homomeric Kv2 channels and heteromeric Kv2/KvS channels. Determining the contributions of each of these channel types to native potassium currents has been very difficult because the differences in biophysical properties are modest and there are no Kv2/KvS-specific pharmacological tools. The authors set out to design a strategy to separate Kv2 and Kv2/KvS currents in native neurons based on their observation that Kv2/KvS channels have little sensitivity to the Kv2 pore blocker RY785 but are blocked by the Kv2 VSD blocker GxTx. They clearly demonstrate that Kv2/KvS currents can be differentiated from Kv2 currents in native neurons using a two-step strategy to first selectively block Kv2 with RY785, and then block both with GxTx. The manuscript is beautifully written; it takes a very complex problem and strategy and breaks it down so both channel experts and the broad neuroscience community can understand it.
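
      The arithmetic behind the two-step strategy can be made explicit with a short sketch. It rests on idealized assumptions that follow the summary above but are not verified here: RY785 fully blocks homomeric Kv2 and spares Kv2/KvS, and GxTx then blocks the remaining Kv2/KvS current. The function name and the current amplitudes are hypothetical.

```python
def dissect_kv2_currents(i_control, i_after_ry785, i_after_gxtx):
    """Split a delayed rectifier current into pharmacological components.

    All inputs are current amplitudes at the same voltage, in the same
    units, recorded sequentially from one cell (idealized, complete block).
    """
    kv2_homomeric = i_control - i_after_ry785    # RY785-sensitive
    kv2_kvs_like = i_after_ry785 - i_after_gxtx  # RY785-resistant, GxTx-sensitive
    residual = i_after_gxtx                      # neither Kv2 nor Kv2/KvS
    total_kv2 = kv2_homomeric + kv2_kvs_like
    fraction = kv2_kvs_like / total_kv2 if total_kv2 > 0 else float("nan")
    return {
        "kv2_homomeric": kv2_homomeric,
        "kv2_kvs_like": kv2_kvs_like,
        "residual": residual,
        "kvs_fraction": fraction,
    }


# Hypothetical amplitudes in nA, for illustration only.
result = dissect_kv2_currents(i_control=10.0, i_after_ry785=4.0, i_after_gxtx=1.0)
print(result)
```

      With incomplete block by either compound, these differences become bounds rather than exact component amplitudes.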

      Strengths:

      The compounds the authors use are highly selective and unlikely to have significant confounding cross-reactivity to other channel types. The authors provide strong evidence that all Kv2/KvS channels are resistant to RY785. This is a strength of the strategy - it can likely identify Kv2/KvS channels containing any of the 10 mammalian KvS subunits and thus be used as a general reagent on all types of neurons. The limitation then of course is that it can't differentiate the subtypes, but at this stage, the field really just needs to know how much Kv2/KvS channels contribute to native currents and this strategy provides a sound way to do so.

      Weaknesses:

      The authors are very clear about the limitations of their strategy, the most important of which is that they can't differentiate different subunit combinations of Kv2/KvS heteromers. This study is meant to be a start to understanding the roles of Kv2/KvS channels in vivo. As such, this is a minor weakness, far outweighed by the potential of the strategy to move the field through a roadblock that has existed since its inception.

      The study accomplishes exactly what it set out to do: provide a means to determine the relative contributions of homomeric Kv2 and heteromeric Kv2/KvS channels to native delayed rectifier K+ currents in neurons. It also does a fabulous job laying out the case for why this is important to do.

    3. Reviewer #2 (Public Review):

      Summary:

      Silent Kv subunits and the channels containing these Kv subunits (Kv2/KvS heteromers) are in the process of discovery. It is believed that these channels fine-tune the voltage-activated K+ currents that repolarize the membrane potential during action potentials, with a direct effect on cell excitability, mostly by determining action potential firing frequency.

      Strengths:

      What makes silent Kv subunits even more important is that, by being expressed in specific tissues and cell types, different silent Kv subunits may have the ability to fine-tune the delayed rectifying voltage-activated K+ currents that are one of the currents that crucially determine cell excitability in these cells. The present manuscript introduces a pharmacological method to dissect the voltage-activated K+ currents mediated by Kv2/KvS heteromers as a means of starting to unveil their importance, together with Kv2-only channels, to the cells where they are expressed.

      Weaknesses:

      While the method is effective in quantifying these currents in any isolated cell under an electric voltage clamp, it is ineffective as a modulating maneuver to perhaps address these currents in an in vivo experimental setting. This is an important point but is not a claim made by the authors. There are other caveats with the methods and data:

      (i) The need for a 'cocktail' of blockers to supposedly isolate the currents of Kv2 homomers and Kv2/KvS heteromers from others may introduce errors in the quantification of Kv2/KvS heteromer-mediated K+ currents, owing to possible off-target effects of the blockers.

      (ii) During the electrophysiology experiments, the authors use a holding potential that is not as negative as is needed to record the full population of Kv2/KvS channels. Depolarized holding potentials lead to a certain level of channel inactivation, which varies according to the KvS present in that specific population of channels. As a reminder, some KvS promote inactivation and others prevent inactivation. Therefore, the data must be interpreted with this in mind.

      (iii) The analysis of conductance activation by using tail currents is only accurate when dealing with non-inactivating conductances. Also, when dealing with a heterogeneous population of Kv2/KvS heteromers, heterogeneous K+ conductance deactivation kinetics are to be expected. Indeed, different KvS may be associated with significantly different deactivation kinetics as well.

      (iv) Silent Kv subunits may be retained in the ER in heterologous systems like CHO cells. This may lead to an underestimate of their expression in these systems. Nevertheless, the authors show similar data in CHO cells and in primary neurons.

      (v) The hallmark of silent Kv subunits is their effect on the time course of inactivation of K+ currents. As such, data should preferably be shown from this perspective throughout, but this was only done in Figure 4G.

      (vi) Functional characterization of currents alone, which the authors suggest as a bona fide signature of Kv2 and Kv2/KvS currents, should not be solely relied upon to classify the currents and their channel mediators.

    1. eLife assessment

      In their manuscript, Cummings et al. use in vitro reconstitution to examine the differential activities of tubulin polyglycylases, providing valuable insights into the enzymatic regulation of microtubule glycylation and its mechanistic role in maintaining cilia function and microtubule dynamics. The convincing evidence, supported by well-designed experiments and appropriate controls, significantly advances our understanding of the tubulin code and its biochemical mechanisms.

    2. Reviewer #1 (Public Review):

      Summary:

      In their current study, Cummings et al have approached this fundamental biochemical problem using a combination of purified enzyme-substrate reactions, MS/MS, and microscopy in vitro to provide key insights into the hierarchy of generating polyglycylation in cilia and flagella. They first establish that TTLL8 is a monoglycylase, with the potential to add multiple mono glycine residues on both α- and β-tubulin. They then go on to establish that monoglycylation is essential for TTLL10 binding and catalytic activity, which progressively reduces as the level of polyglycylation increases. This provides an interesting mechanism of how the level of polyglycylation is regulated in the absence of a deglycylase. Finally, the authors also establish that for efficient TTLL10 activity, it is not just monoglycylation, but also polyglutamylation that is necessary, giving a key insight into how both these modifications interact with each other to ensure there is a balanced level of PTMs on the axonemes for efficient cilia function.

      Strengths:

      The manuscript is well-written, and the experiments are succinctly planned and outlined. The experiments support the authors' hypotheses and provide novel possible mechanistic insights into the regulation of tubulin glycylation in motile cilia.

      Weaknesses:

      The initial part of the manuscript, where the authors discuss the requirement for monoglycylation by TTLL8, is not new. This was established back in 2009 when Rogowski et al (2009) showed that polyglycylation of tubulin by TTLL10 occurs only when co-expressed in cells with TTLL3 or TTLL8. So, this part of the study adds very little new information to what was known.

      The study also fails to discuss the involvement of the other monoglycylase, TTLL3, which is a weakness because in cells in vivo both monoglycylases act in concert and so may play a role in regulating the activity of TTLL10.

    3. Reviewer #2 (Public Review):

      In their manuscript, Cummings et al. focus on the enzymatic activities of TTLL3, TTLL8, and TTLL10, which catalyze the glycylation of tubulin, a crucial posttranslational modification for cilia maintenance and motility. The experiments are beautifully performed, with meticulous attention to detail and the inclusion of appropriate controls, ensuring the reliability of the findings. The authors utilized in vitro reconstitution to demonstrate that TTLL8 functions exclusively as a glycyl initiase, adding monoglycines at multiple positions on both α- and β-tubulin tails. In contrast, TTLL10 acts solely as a tubulin glycyl elongase, extending existing glycine chains. A notable finding is the differential substrate recognition between TTLL glycylases and TTLL glutamylases, highlighting a broader substrate promiscuity in glycylases compared to the more selective glutamylases. This observation aligns with the greater diversification observed among glutamylases. The study reveals a hierarchical mechanism of enzyme recruitment to microtubules, where TTLL10 binding necessitates prior monoglycylation by TTLL8. This binding is progressively inhibited by increasing polyglycine chain length, suggesting a self-regulatory mechanism for polyglycine chain length control. Furthermore, TTLL10 recruitment is enhanced by TTLL6-mediated polyglutamylation, illustrating a complex interplay between different tubulin modifications. In addition, they uncover that polyglutamylation stimulates TTLL10 recruitment without necessarily increasing glycylation on the same tubulin dimer, due to the potential for TTLLs to interact with neighboring tubulin dimers. This mechanism could lead to an enrichment of glycylation on the same microtubule, contributing to the complexity of the tubulin code. The article also addresses a significant challenge in the field: the difficulty of generating microtubules with controlled posttranslational modifications for in vitro studies. 
      By identifying the specific modification sites and the interplay between TTLL activities, the authors provide a valuable tool for creating differentially glycylated microtubules. This advancement will facilitate further studies on the effects of glycylation on microtubule-associated proteins and the broader implications of the tubulin code. In summary, this study substantially contributes to our knowledge of posttranslational enzymes and their regulation, offering new insights into the biochemical mechanisms underlying microtubule modifications. The rigorous experimental approach and the novel findings presented make this a pivotal addition to the field of cellular and molecular biology.

    1. Author Response:

      Thank you for the reviews and the eLife assessment. We want to take this opportunity to acknowledge the weaknesses pointed out by the reviewers and we will make small changes to the manuscript to account for these as part of the Version of Record.

      The tools are command-based and store outcomes locally

      We consider this to be an advantage of our ecosystem, which is intended for the case of individuals or small groups of authors. These features facilitate easy installation and integration with other tools. Further, our tool labelbuddy provides a graphical user interface. Our tools may also be integrated into web-based systems as backends. Pubget is already being used in this way in the NeuroSynth Compose platform for semi-automated neuroimaging meta-analyses.

      pubget only gathers open-access papers from PubMed Central

      We recognize this as a limitation, and we acknowledge it in the original manuscript (in the discussion section, starting with "A limitation of Pubget is that it is restricted to the Open-Access subset of PMC"). We chose to limit the scope of our tools in order to ensure maintainability. Further, we are currently expanding pubget so it will also be able to access the abstracts and meta-data from closed-access papers indexed on PubMed. Future research could build other tools to work alongside pubget, to access other databases.

      Logic flow is difficult to follow

      We thank the reviewer for this feedback. Our paper describes an ecosystem of literature mining tools, which does not lend itself to narrative flow, nor does it readily fit into the standard "Intro, Results, Discussion, Methods" structure that is typical in the scientific literature. We have done our best to conform to this expected format, but we have also provided detailed section and subsection headings to enable the reader to digest the paper nonlinearly. Each of the tools we describe also has detailed documentation on GitHub that we update continuously.

      Results were not validated

      For the example where we automatically extracted participant demographics from papers, we validated the results on a held-out dataset of 100 manually-annotated papers. For the example with automatic large-scale meta-analyses (neuroquery and neurosynth), these methods are described together with their validation in the original papers. If this ecosystem of tools is integrated into other workflows, it should be validated in those contexts. We recognize that validating meta-analyses is a difficult problem because we do not have ground truth maps of the brain.

      Efficiency was not quantified

      Creators of tools do not always do experiments to quantify their efficiency and other qualities. We have chosen not to do this here, first because it is outside the scope of this paper, as it would require specifying very precise tasks and how efficiency is measured, and second because, at least for the data collection part, the benefit of using an automated tool over manually downloading papers one by one is clear even without quantifying it. Compared to the approach of re-using existing datasets, our ecosystem is not necessarily more or less efficient. But it has other advantages, such as providing datasets that contain the latest literature, whereas the existing datasets are static and quickly out-of-date.

      We do not highlight the strength of AI functions

      We provide an example of using our tools to gather data and manually annotate a validation set for use with large language models (in our case, GPT). We are further exploring this domain in other projects; for example, for performing semi-automated meta-analyses using the NeuroSynth Compose platform. However, we did not deem it necessary to include more AI examples in the current paper; we only wanted to provide enough examples to demonstrate the scope of possible use cases of our ecosystem.

      We thank the reviewers for their time and valuable feedback, which we will keep in mind in our future research.

    1. eLife assessment

      This important work substantially advances our understanding of RNA structure analysis by introducing an innovative method that extends DMS probing to include guanosine residues, thereby enhancing our ability to detect complex tertiary interactions. The evidence supporting the conclusions is compelling, with detailed analyses demonstrating the method's capacity to differentiate structural contexts and improve RNA structure predictions. This work will be of broad interest to RNA structural biology, biochemistry, and biophysics researchers.

    2. Reviewer #1 (Public Review):

      Summary:

      DMS-MaP is a sequencing-based method for assessing RNA folding by detecting methyl adducts on unpaired A and C residues created by treatment with dimethylsulfate (DMS). DMS also creates methyl adducts on the N7 position of G, which could be sensitive to tertiary interactions with that atom, but N7-methyl adducts cannot be detected directly by sequencing. In this work, the authors adopt a previously developed method for converting N7-methyl-G to an abasic site to make it detectable by sequencing and then show that the ability of DMS to form an N7-methyl-G adduct is sensitive to RNA structural context. In particular, they look at the G-quadruplex structure motif, which is dense with N7-G interactions, is biologically important, and lacks conclusive methods for in-cell structural analysis.

      Strengths:

      - The authors clearly show that established methods for detecting N7-methyl-G adducts can be used to detect those adducts from DMS and that the formation of those adducts is sensitive to structural context, particularly G-quadruplexes.

      - The authors assess the N7-methyl-G signal through a wide range of useful probing analyses, including standard folding, adduct correlations, mutate-and-map, and single-read clustering.

      - The authors show encouraging preliminary results toward the detection of G-quadruplexes in cells using their method. Reliable detection of RNA G-quadruplexes in cells is a major limitation for the field and this result could lead to a significant advance.

      - Overall, the work shows convincingly that N7-methyl-G adducts from DMS provide valuable structural information and that established data analyses can be adapted to incorporate the information.

      Weaknesses:

      - Most of the validation work is done on the spinach aptamer and it is the only RNA tested that has a known 3D structure. Although it is a useful model for validating this method, it does not provide a comprehensive view of what results to expect across varied RNA structures.

      - It's not clear from this work what the predictive power of BASH-MaP would be when trying to identify G-quadruplexes in RNA sequences of unknown structure. Although clusters of G's with low reactivity and correlated mutations seem to be a strong signal for G-quadruplexes, no effort was made to test a range of G-rich sequences that are known to form G-quadruplexes or not. Having this information would be critical for assessing the ability of BASH-MaP to identify G-quadruplexes in cells.

      - Although the authors present interesting results from various types of analysis, they do not appear to have developed a mature analysis pipeline for the community to use. I would be inclined to develop my own pipeline if I were to use this method.

      - There are various aspects of the DAGGER analysis that don't make sense to me:
      (1) Folding of the RNA based on individual reads does not represent single-molecule folding since each read contains only a small fraction of the possible adducts that could have formed on that molecule. As a result, each fold will largely be driven by the naive folding algorithm. I recommend a method like DREEM that clusters reads into profiles representing different conformations.
      (2) How reliable is it to force open clusters of low-reactivity G's across RNAs that don't already have known G-quadruplexes?
      (3) By forcing a G-quadruplex open it will be treated as a loop by the folding algorithm, so the energetics won't be accurate.
      (4) It's not clear how signals on "normal" G's are treated. In Figure 5C some are wiped to 0 but others are kept as 1.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript introduces BASH MaP and DAGGER, innovative tools for analyzing RNA tertiary structures, specifically focusing on G-quadruplexes. Traditional methods have struggled to detect and analyze these structures due to their reliance on interactions on the Hoogsteen face of guanine, which are not readily observable through conventional probing that targets Watson-Crick interactions. BASH MaP employs dimethyl sulfate and potassium borohydride to enhance the detection of N7-methylguanosine by converting it into an abasic site, thereby enabling its identification through misincorporation during reverse transcription. This method provides higher precision in identifying G-quadruplexes and offers deeper insights into RNA's structural dynamics and alternative conformations in both in vitro and cellular contexts. Overall, the study is well-executed, demonstrating robust signal detection of N7-Gs with some compelling positive controls, thorough analysis, and beautifully presented figures.

      Strengths:

      The manuscript introduces a new method to detect G-quadruplexes (G-qs) that simplifies and potentially enhances the robustness and quantification compared to previous methods relying on reverse transcription truncations. The authors provide a strong positive control, demonstrating a 70% misincorporation at endogenous N7-G within the 18S rRNA, which illustrates BASH MaP's high signal-to-noise ratio. The data concerning the detection of positive control G-qs is particularly compelling.

      Weaknesses:

      Figure 3E shows considerable variability in the correlations among guanosines, suggesting that the method may struggle with specificity in determining guanosine participation within and between different quadruplexes. There is no estimate of the method's false-positive discovery rate.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors aim to develop an experimental/computational pipeline to assess the modification status of an RNA following treatment with dimethylsulfate (DMS). Building upon the more common DMS MaP method, which predominantly assesses the modification status of the Watson-Crick-Franklin face of A's and C's, the authors insert a chemical processing step in the workflow prior to deep sequencing that enables detection of methylation at the N7 position of guanosine residues. This approach, termed BASH MaP, provides a more complete assessment of the true modification status of an RNA following DMS treatment and this new information provides a powerful set of constraints for assessing the secondary structure and conformational state of an RNA. In developing this work, the authors use Spinach as a model RNA. Spinach is a fluorogenic RNA that binds and activates the fluorescence of a small molecule ligand. Crystal structures of this RNA with ligand bound show that it contains a G-quadruplex motif. In applying BASH MaP to Spinach, the authors also perform the more standard DMS MaP for comparison. They show that the BASH MaP workflow appears to retain the information yielded by DMS MaP while providing new information about guanosine modifications. In Spinach, the G-quadruplex G's have the least reactive N7 positions, consistent with the engagement of N7 in hydrogen bonding interactions at G's involved in quadruplex formation. Moreover, because the inclusion of data corresponding to G increases the number of misincorporations per transcript, BASH MaP is more amenable to analysis of co-occurring misincorporations through statistical analysis, especially in combination with site-specific mutations. These co-occurring misincorporations provide information regarding what nucleotides are structurally coupled within an RNA conformation. By deploying a likelihood-ratio statistical test on BASH MaP data, the authors can identify Gs in G-quadruplexes, deconvolute G-G correlation networks, base-triple interactions and even stacking interactions. Further, the authors develop a pipeline to use the BASH MaP-derived G-modification data to assist in the prediction of RNA secondary structure and identify alternative conformations adopted by a particular RNA. This seems to help with the prediction of secondary structure for Spinach RNA.
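
      As background for the likelihood-ratio test mentioned in this summary, a generic G-test of independence on a 2x2 table of co-occurring misincorporations can be sketched as below. This is a textbook formulation offered as an assumption about the general kind of test involved; the authors' exact statistic, corrections, and thresholds may differ, and the read counts are hypothetical.

```python
import math


def g_test_2x2(n11, n10, n01, n00):
    """G statistic (likelihood-ratio test of independence) for two positions.

    n11: reads mutated at both positions
    n10: reads mutated at position i only
    n01: reads mutated at position j only
    n00: reads mutated at neither position
    """
    observed = [[n11, n10], [n01, n00]]
    total = n11 + n10 + n01 + n00
    row_sums = [n11 + n10, n01 + n00]
    col_sums = [n11 + n01, n10 + n00]
    g = 0.0
    for i in range(2):
        for j in range(2):
            obs = observed[i][j]
            if obs > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * obs * math.log(obs / expected)
    return g


# Hypothetical read counts for two pairs of guanosines.
g_independent = g_test_2x2(10, 30, 20, 60)  # counts match independence
g_coupled = g_test_2x2(30, 10, 10, 70)      # excess of co-mutated reads

# 3.84 is approximately the chi-square critical value for df = 1, alpha = 0.05.
print(f"independent pair: G = {g_independent:.2f}")
print(f"coupled pair exceeds 3.84: {g_coupled > 3.84}")
```

      In practice many position pairs are tested at once, so a multiple-comparison correction would be needed before calling any pair structurally coupled.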

      Strengths:

      The BASH MaP procedure and downstream data analysis pipeline more fully identify the complement of methylations produced by DMS treatment of RNA, thereby enriching the information content. This in turn allows for more robust computational/statistical analysis, which likely will lead to more accurate structure predictions. This seems to be the case for the Spinach RNA.

      Weaknesses:

      The authors demonstrate that their method can detect G-quadruplexes in Spinach and some other RNAs both in vitro and in cells. However, the performance of BASH MaP and associated computational analysis in the context of other RNAs remains to be determined.

    1. eLife assessment

      This valuable study combines evolution experiments with molecular and genetic techniques to study how a genetic lesion in MreB that causes rod-shape cells to become spherical, with concomitant deleterious fitness effects, can be rescued by natural selection. The results are convincing, although the statistical analyses and figure presentation could be improved, and the concrete contribution of the paper and how it relates to previous literature clarified.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors performed experimental evolution of MreB mutants that have a slow-growing round phenotype and studied the subsequent evolutionary trajectory using analysis tools from molecular biology. It was remarkable and interesting that they found that the original phenotype was not restored (most common in these studies) but that the round phenotype was maintained.

      Strengths:

      The finding that the round phenotype was maintained during evolution rather than that the original phenotype, rod-shaped cells, was recovered is interesting. The paper extensively investigates what happens during adaptation with various different techniques. Also, the extensive discussion of the findings at the end of the paper is well thought through and insightful.

      Weaknesses:

      I find there are three general weaknesses:

      (1) Although the paper states in the abstract that it emphasizes "new knowledge to be gained", it remains unclear what this concretely is. On page 4 they state three research questions; these could be more extensively discussed in the abstract. Also, these questions read more like genetics questions while the paper is a lot about cell biological findings.

      (2) It is not clear to me from the text what we already know about the restoration of MreB loss from suppressor studies (in the literature). Are there suppressor screens in the literature, and which parts of the findings are consistent with suppressor screens and which parts are new knowledge?

      (3) The clarity of the figures, captions, and data quantification needs to be improved.

    3. Reviewer #2 (Public Review):

      Yulo et al. show that deletion of MreB causes reduced fitness in P. fluorescens SBW25 and that this reduction in fitness may be primarily caused by alterations in cell volume. To understand the effect of cell volume on proliferation, they performed an evolution experiment through which they predominantly obtained mutations in pbp1A that decreased cell volume and increased viability. Furthermore, they provide evidence to propose that the pbp1A mutants may have decreased PG cross-linking which might have helped in restoring the fitness by rectifying the disorganised PG synthesis caused by the absence of MreB. Overall this is an interesting study.

      Queries:

      Do the small cells of the mreB null background indeed have no DNA? It is not apparent from the DAPI images presented in Supplementary Figure 17. A more detailed analysis will help to support this claim.

      What happens to viability and cell morphology when pbp1A is removed in the mreB null background? If it is actually a decrease in pbp1A activity that leads to the rescue, then pbp1A- mreB- cells should have better viability, reduced cell volume and organised PG synthesis. Especially as the PG cross-linking is almost at the same level as the T362 or D484 mutant.

      What is the status of PG cross-linking in ΔmreB Δpflu4921-4925 (Line 7)?

      What is the morphology of the cells in Line 2 and Line 5? It may be interesting to see if PG cross-linking and cell wall synthesis is also altered in the cells from these lines.

      The data presented in 4B should be quantified with appropriate input controls.

      What are the statistical analyses used in 4A and what is the significance value?

      A more rigorous statistical analysis indicating the number of replicates should be done throughout.

    4. Reviewer #3 (Public Review):

      This paper addresses an understudied problem in microbiology: the evolution of bacterial cell shape. Bacterial cells can take a range of forms, among the most common being rods and spheres. The consensus view is that rods are the ancestral form and spheres the derived form. The molecular machinery governing these different shapes is fairly well understood but the evolutionary drivers responsible for the transition between rods and spheres are not. Enter Yulo et al.'s work. The authors start by noting that deletion of a highly conserved gene called MreB in the Gram-negative bacterium Pseudomonas fluorescens reduces fitness but does not kill the cell (as happens in other species like E. coli and B. subtilis) and causes cells to become spherical rather than their normal rod shape. They then ask whether evolution for 1000 generations restores the rod shape of these cells when propagated in a rich, benign medium.

      The answer is no. The evolved lineages recovered fitness by the end of the experiment, growing just as well as the unevolved rod-shaped ancestor, but remained spherical. The authors provide an impressively detailed investigation of the genetic and molecular changes that evolved. Their leading results are:

      (1) The loss of fitness associated with MreB deletion causes high variation in cell volume among sibling cells after cell division.

      (2) Fitness recovery is largely driven by a single, loss-of-function point mutation that evolves within the first ~250 generations that reduces the variability in cell volume among siblings.

      (3) The main route to restoring fitness and reducing variability involves loss of function mutations causing a reduction of TPase and peptidoglycan cross-linking, leading to a disorganized cell wall architecture characteristic of spherical cells.

      The inferences made in this paper are on the whole well supported by the data. The authors provide a uniquely comprehensive account of how a key genetic change leads to gains in fitness and the spectrum of phenotypes that are impacted and provide insight into the molecular mechanisms underlying models of cell shape.

      Suggested improvements and clarifications include:

      (1) A schematic of the molecular interactions governing cell wall formation could be useful in the introduction to help orient readers less familiar with the current state of knowledge and key molecular players.

      (2) More detail on the bioinformatics approaches to assembling genomes and identifying the key compensatory mutations are needed, particularly in the methods section. This whole subject remains something of an art, with many different tools used. Specifying these tools, and the parameter settings used, will improve transparency and reproducibility, should it be needed.

      (3) Corrections for multiple comparisons should be used and reported whenever more than one construct or strain is compared to the common ancestor, as in Supplementary Figure 19A (relative PG density of different constructs versus the SBW25 ancestor).

      (4) The authors refrain from making strong claims about the nature of selection on cell shape, perhaps because their main interest is the molecular mechanisms responsible. However, I think more can be said on the evolutionary side, along two lines. First, they have good evidence that cell volume is a trait under strong stabilizing selection, with cells of intermediate volume having the highest fitness. This is notable because there are rather few examples of stabilizing selection where the underlying mechanisms responsible are so well characterized. Second, this paper succeeds in providing an explanation for how spherical cells can readily evolve from a rod-shaped ancestor but leaves open how rods evolved in the first place. Can the authors speculate as to how the complex, coordinated system leading to rods first evolved? Or why not all cells have lost rod shape and become spherical, if it is so easy to achieve? These are important evolutionary questions that remain unaddressed. The manuscript could be improved by at least flagging these as unanswered questions deserving of further attention.
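      Point (3) above asks for corrections for multiple comparisons when several strains are compared to a common ancestor. As a minimal sketch of one common option, the Holm-Bonferroni step-down adjustment, the following assumes a list of raw p-values; the values shown are hypothetical and are not taken from the manuscript, and the function name `holm_adjust` is introduced here for illustration only.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four strain-versus-ancestor comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
print(adj)
```

      Holm's procedure controls the family-wise error rate and is never less powerful than a plain Bonferroni correction, which is why it is often a reasonable default for a handful of planned comparisons.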

      The value of this paper stems both from the insight it provides on the underlying molecular model for cell shape and from what it reveals about some key features of the evolutionary process. The paper, as it currently stands, provides more on which to chew for the molecular side than the evolutionary side. It provides valuable insights into the molecular architecture of how cells grow and what governs their shape. The evolutionary phenomena emphasized by the authors - the importance of loss-of-function mutations in driving rapid compensatory fitness gains and that multiple genetic and molecular routes to high fitness are often available, even in the relatively short time frame of a few hundred generations - are well-understood phenomena and so arguably of less broad interest. The more compelling evolutionary questions concern the nature and cause of stabilizing selection (in this case cell volume) and the evolution of complexity. The paper misses an opportunity to highlight the former and, while claiming to shed light on the latter, provides rather little useful insight.

    5. Author response:

      Thank you for handling our paper and our thanks to the reviewers for their engagement, comments and valuable suggestions. We will take the opportunity to provide a full response and submit a revised version in the coming weeks.

    1. Joint Public Review:

      An outside expert evaluated your responses to the original reviewers and offered the following comments:

      The main criticism was whether deleterious variants were appropriately classified in the work. The authors use two different methods to characterize the effect of alleles to satisfy these comments. The result is somewhat complex. The authors do replicate the effect of dominance on fixation and segregation of deleterious alleles by classifying polymorphisms as synonymous or nonsynonymous with SNPeff. This is not entirely surprising as it is approximately equivalent to classifying based on fold degeneracy (but it includes sites that have other than 0- or 4-fold degeneracy). However, the authors do not mention in the text that their observation of increased segregating deleterious mutations in recessive alleles was only statistically significant in A. halleri (for both analyses). Using SIFT, the authors only find an effect of dominance in A. lyrata. So in reality, while the trends are the same across the analyses, the statistical significance of the effects of dominance was not consistent.

      Reviewer 2 had several more detailed criticisms of the manuscript. The first was that the authors should explore the dominance of linked deleterious mutations themselves. I agree that this would be interesting, but it is very difficult to accomplish, and I agree with the authors' reluctance to do much more here. The reviewer also criticized the authors' simulation approach. The authors provided their simulation script as requested, but declined to do additional simulations under varied selection coefficients. I felt this was a minimally adequate response to the reviewer's concerns, but the authors could have reasonably conducted a few additional simulations under varied selection coefficients.

      I think that the scope of the findings described in the assessment was reasonable. This is interesting work, but despite the authors' arguments, the system is somewhat unique if for no other reason than that balancing selection at S-loci is uniquely strong.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The paper combines phenotypic and genomic analyses of the "sheltered load" (i.e. the accumulation of deleterious mutations linked to S-loci that are hidden from selection in the homozygous state) in Arabidopsis. The authors compare results to previous theoretical predictions concerning the extent of the load in dominant vs recessive S-alleles, and further develop exciting theory to reconcile differences between previous theory and observed results.

      Strengths:

      This is a very nice combination of theory and data to address a classical question in the field.

      We thank the reviewer for this positive feedback.

      Weaknesses:

      The "genetic load" is a poorly defined concept in general, and its quantification via the number of putatively deleterious mutations is quite difficult. Furthermore counting up the number of derived mutations at fully constrained nucleotides may not be a great estimate of the load, and certainly does not allow for evaluation of recessivity -- a concept critical to ideas concerning the sheltered load. Alternative approaches - including estimating the severity of mutations - could be helpful as well. This imperfection in available approaches to test theory must be acknowledged more strongly by the authors.

      As suggested by the reviewer, we implemented alternative approaches to estimate the severity of deleterious mutations and now report the results of SNPeff and SIFT4G analyses in Table S6. The results we obtained with these other metrics were overall very similar to those based on our previous counting of mutations at 0-fold and 4-fold degenerate sites. More generally, we tried to improve the presentation of our strategy to estimate the genetic load (clarified in lines 262-268, 271, 292-295, 297). In particular, we made it clear that our population genetic analysis cannot assess the recessivity of the observed mutations (lines 428-434).

      Reviewer #2 (Public Review):

      Summary:

      This study looks into the complex dominance patterns of S-allele incompatibilities in Brassicaceae, through which it attempts to learn more about the sheltering of deleterious load. I found several weak points in the analyses that diminished my excitement about the results. In particular, the way in which deleterious mutations were classified lacked the ability to distinguish the severity of the mutations and thus their expected associated dominance.

      First, we would like to clarify that our goal with this study is NOT to learn something about dominance of the linked deleterious mutations (we cannot). Instead, we compare the accumulation of deleterious mutations linked to dominant vs recessive S-ALLELES, but are agnostic regarding the dominance level of the LINKED mutations themselves. The rationale is that the different intensities of natural selection between dominant vs recessive S-alleles provide a powerful way to examine the process by which deleterious mutations are sheltered in general. We further clarified this aspect on lines 70-73 and 399-401.

      Second, as mentioned above in response to Reviewer 1, we complemented the analysis by predicting the severity of the deleterious mutations by SIFT4G and SNPeff. The results were largely consistent, with the exception that the number of sites included in SIFT4G was low, such that the statistical power was reduced (lines 296-300).

      Furthermore, the simulation approach could have provided this exact sort of insight but was not designed to do so, making this comparison to the empirical data also less than exciting for me.

      As explained above, studying dominance of the linked mutations we observed is an interesting research question (albeit a difficult one), but it was not our goal here. Instead, our study was designed as an empirical test of the predictions presented in Llaurens et al (2009), and we re-analysed some aspects of the model outcome to illustrate our points.

      We now better explain that we based our choice of parameters on the fact that in the theoretical study by Llaurens et al (2009), recessive deleterious mutations are predicted to accumulate in a much more straightforward manner (line 316-318).

      We now dedicate a paragraph of the discussion to explain how our stochastic simulations could be improved, and acknowledge that a full exploration of the interaction between dominance of the S-alleles and dominance of the linked deleterious mutations would be an interesting follow-up - albeit beyond the scope of our study (line 437-441).

      Major and minor comments:

      I think the introduction (or somewhere before we dive into it in the results) of the dominance hierarchy for the S-alleles needs a more in-depth explanation. Not being familiar with this beforehand really made this paper inaccessible to me until I then went to find out more before continuing. I would expect this paper to be broad enough that self-contained information makes it accessible to all readers. For example, lines 110-115 could be in the Introduction.

      We thank the reviewer for this useful remark. We now give a more comprehensive description of the dominance hierarchy and introduce the classes of dominance in A. lyrata already in the introduction, on lines 64-70.

      Along with my above comment, perhaps it is not my place to comment, but I find the paper not of a broad enough scope to be of interest to a broad readership. This S-allele dominance system is more than simple balancing selection, it is a very complex and specific form of dominance between several haplotypes, and the mechanism of dominance does not seem to be genetic. I am not sure that it thus extrapolates to broad comments on general dominance and balancing selection, e.g. it would not be the same as considering inversions and this form of balancing selection where we also expect recessive deleterious mutations to accumulate.

      We disagree with these interpretations by the reviewer, for two reasons:

      First, the mechanism of dominance is actually entirely genetic. In fact, we uncovered some years ago that it is based on the molecular interaction between small non-coding RNAs from dominant alleles and their target sites on recessive alleles (Durand et al. Science 2014, see lines 68-70). If there is something specific with this system, it is that the dominance phenomenon is better understood at the mechanistic level than in most other cases, but the resulting phenomenon in itself (a dominance hierarchy) is rather common.

      Second, the kind of variation in the intensity of linked selection created by this mechanism is actually a general phenomenon, so our results have broad relevance beyond our particular study system. We modified the introduction to explain this point more clearly, highlighting in particular the fact that the situation we study closely resembles the case of sex chromosomes, where X (or Z) chromosomes are genetically recessive and Y (or W) chromosomes are genetically dominant. We cite this example in lines 83-87 of the introduction and also several well-studied other examples on lines 480-489 of the discussion.

      It would have been particularly interesting, or a nice addition, to see deleterious mutations classed by something like SNPeff or GERP where you can have different classes of moderate to severe deleterious variants, which we would expect also to be more recessive the more deleterious they are. In line with my next comment on the simulations, I think relative differences between mutations expected to be more or less dominant may be even more insightful into the process of sheltering which may or may not be going on here.

      We agree with the reviewer, and as detailed above we have now integrated such analyses with SNPeff and SIFT4G (Table S6). These new results reinforce our conclusion that while S-allele dominance influences the fixation of deleterious mutations, it has no effect on their total number. See lines 270-272 and 296-300.

      In the simulations, h=0 and s=0.01 (as in Figure 5) for all deleterious mutations seems overly simplistic, and at the convenient end for realistic dominance. I think besides recessive lethals which we expect to be close to h=0 would have a much larger selection coefficient, and other deleterious mutations would only be partially recessive at such an s value. I expect this would change some of the simulation results seen, though to what degree I am not certain. It would be nice to at least check the same exact results for h=0.3 or 0.2 (or additionally also for recessive lethals, e.g. h=0 and s=-0.9). I would also disagree with the statement in line 677, many studies have shown, particularly those on balancing selection, that partially recessive deleterious mutations are not eliminated by natural selection and do play a role in population genetic dynamics. I am also not surprised that extinction was found for higher s values when the mutation rate for such mutations was very high and the distribution of s values was constant. An influx of such highly deleterious mutations is unlikely to ever let a population survive, yet that does NOT mean that in nature, the rare influx of such mutations does lead to them being sheltered. I find overall that the simulation results contribute very little, to none, to this paper, as without something more realistic, like a simultaneous distribution of s and h values, you cannot say which, if any class of these mutations are the ones expected to accumulate because of S-allele dominance.

      We understand that the previous version of our manuscript was confusing between dominance of the S-alleles and dominance of the linked deleterious mutations. We clarified that our study focuses on the effect of the former only (lines 99, 263-264 and 581-583).

      We agree that a complete exploration of the interaction between dominance of the S-alleles and dominance of the linked mutations being sheltered would have been an asset, but as explained above this is not the focus of our study. The previous work by Llaurens et al (2009) has already established that deleterious mutations can fix within S-allele lineages, especially when linked to dominant S-alleles, and when the number of S-alleles is large. Under the conditions they examined, deleterious mutations were much more strongly eliminated if not fully recessive (h=0 vs h=0.2), so for the present study we decided to simulate fully recessive mutations only. We now formally acknowledge the possibility that some complex interaction may take place between dominance of the S-alleles and dominance of the linked deleterious mutations (lines 440-442). However, as explained above we feel that fully exploring this complex interaction would require a detailed investigation, which is clearly beyond the scope of the present study.

      Rather they only show the disappointing or less exciting result that fully recessive, weakly deleterious mutations (which I again think do not even exist in nature as I said above) have minor, to no effect across the classes of S-allele dominance. They provide no insight into whether any type of recessive deleterious mutation can accumulate under the S-allele dominance hierarchy, and that is the interesting question at hand. I would either remove these simulations or redo them in another approach. The authors never mention what simulation approach was used, so I can only assume this is custom, in-house code. Yet I do not find that code provided on the github page. I do not know if the lack of a distribution for h and s values is then a choice or a programming limitation, but I see it as one that should be overcome if these simulations are meant to be meaningful to the results of the study.

      The code we used (in C) was adapted from the previous study by Llaurens et al. (2009), which at the time was unfortunately not deposited in a data repository. With the agreement of the authors of that study, this code is now available on GitHub (https://github.com/leveveaudrey/model_ssi_Llaurens; line 723).

      It is correct that our simulations were not aimed at determining whether “any type of recessive deleterious mutation can accumulate”, but we strongly believe that they help to interpret the observations made in the genomic data.

      Recommendations for the authors:

      Notes from the editor:

      I found Table 1 confusing, with column headings of observed proportion but perhaps numbers reflecting counts.

      Thank you for pointing out this confusion. There was indeed an error in the last column, which we have now corrected.

      I found Figure 2 a bit hard to parse, with the vertical lines being unclear and the x-axis ticks of insufficient resolution to evaluate the physical extent of the signals.

      We increased the size of the x-axis label and added detail to it in Figure 2, which is now hopefully clearer. Moreover, we increased the size of the vertical lines.

      Finally, I wonder, given the rapid decay of signal in lyrata, whether 25kb is the right choice for evaluating load and whether the pattern may look different on a smaller scale.

      It is true that the signal decays rapidly in A. lyrata, as can be seen in the haplotype structure analysis and in line with our previous analysis of the same populations (Le Veve et al., MBE 2023, in which we explored the effect of the choice of the size of the chromosomal region analyzed; lines 266-269). However, for the sake of comparison, we prefer to stick to the same window size. The fact that we still see an effect of dominance in spite of the lower statistical power associated with the more rapid decay (because a smaller number of genes is expected to be impacted) actually reinforces our conclusions.

      Reviewer #1 (Recommendations For The Authors):

      I have a few additional suggestions to improve the manuscript.

      (1) How does the load linked to the S-locus compare to that observed in other genomic regions? It would be useful to provide a comparison of the results quantified in Figures three and four to comparable genomic regions unlinked to the S-locus. How severe is the linked load?

      This comparison to the genomic background was actually the core of our previous study (Le Veve et al., MBE 2023), which was based on the same populations. That analysis revealed that polymorphism at 0-fold degenerate sites was more than twice as high in the 25 kb immediately flanking the S-locus as in a series of 100 unlinked control regions. The main focus of the present study is instead on the effect of linkage to particular S-alleles (which was not possible previously because haplotypes had to be phased).

      (2) Details of the GLM for data underlying Figures 3 and 4 are somewhat unclear. Is the key explanatory variable (Dominance) treated as continuous? Categorical? Ordinal etc…

      Dominance is treated as a continuous variable. We specify this in line 162 of the results, in the legends of Figures 3 and 4, in the Materials and Methods (lines 627 and 660) and in the legend of Table S4.

      (3) I had some trouble understanding the two different p-values in columns five and six of table one. Please provide more detail.

      We understand that the two p-values in Table 1 were confusing. The first was from the binomial test and the second from the permutation test. To be consistent with the rest of the manuscript, we retained only the p-value of the permutation test.
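      For readers less familiar with permutation tests, the following is a generic sketch of a two-sample permutation test on the difference in means. It is an illustration only: the test statistic, the data and the function name `permutation_p_value` are assumptions made here, and this is not the procedure used for Table 1.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical counts for two groups (not data from the study).
p_separated = permutation_p_value([1, 2, 3, 4, 5, 6], [101, 102, 103, 104, 105, 106])
p_identical = permutation_p_value([3, 3, 3], [3, 3, 3])
print(p_separated, p_identical)
```

      The appeal of this kind of test is that it makes no distributional assumption: the null distribution is built directly from relabelled data.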

      (4) As mentioned in the "weaknesses" above, the authors should be more clear about what they are quantifying. They are explicitly counting the number of variants at 0-fold degenerate sites as a proxy for the genetic load. How good this proxy is is unclear. The most egregious misstatement here was on line 314 in which they make reference to the "total load." However, this limitation should be acknowledged throughout the manuscript and deserves more attention in the methods and discussion.

      As mentioned above, we now integrate additional methods to define and quantify the load (SIFT4G and SNPeff), which reinforced our previous conclusions (lines 271-272, 297-302).

      We clarified our wording and replaced the mention of “total load” by “mean number of linked deleterious mutations per copy of S-allele” (line 324-325). In the discussion we tried to better explain the limitations of approaches to estimate the genetic load (line 431-437).

      Reviewer #2 (Recommendations For The Authors):

      Line 60, it should be specified that this is only for recessive deleterious mutations. Non-recessive deleterious mutations would certainly not be expected to accumulate.

      As explained in detail above, the question of whether and how non-recessive deleterious mutations can accumulate when linked to the S-locus is difficult and would in itself deserve a full treatment, which is clearly beyond the scope of the present study. We clarified this point on line 56.

    3. eLife assessment

      This study presents valuable empirical work and simulations that are relevant for the evolution of genetic load linked to self-incompatibility alleles in two Arabidopsis species. The evidence supporting the findings is solid, although it remains to be seen how generalizable the conclusions are beyond the specific system investigated here, not least because the statistical significance varied between the two species. The work will be of relevance to geneticists interested in the evolution of allelic diversity in similar systems.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The authors provide solid data on a functional investigation of potential nucleoid-associated proteins and the modulation of chromosomal conformation in a model cyanobacterium. While the experiments presented are convincing, the manuscript could benefit from restructuring towards the precise findings; alternatively, additional data buttressing the claims made would significantly enhance the study. These valuable findings will be of interest to the chromosome and microbiology fields.

      We thank the editors for taking the time to assess our work and the reviewers for their critical suggestions. Both reviewers were concerned about our interpretation of the 3C data, and Reviewer #2 suggested biochemical characterization of cyAbrB2 to reinforce our claim. We agree with the concern and suggest that the editors add the sentence “How cyAbrB2 affects chromosome structure is still elusive from this study, and the biochemical assays are needed in the future experiment.” to the eLife assessment.

      The major revision points are the following:

      Reconstruction of Figures

      Previous Figure 5E has been omitted

      Additional 3C data on the nifJ region

      Rephrasing the conclusion of 3C data

      Additional discussion on cyAbrB2 and NAPs

      Reviewer #1 (Public Review): 

      Strength: 

      At first glance, I had a very positive impression of the overall manuscript. The experiments were well done, the data presentation looks very structured, and the text reads well in principle.

      Weakness: 

      Having a closer look, the red line of the manuscript is somewhat blurry. Reading the abstract, the introduction, and parts of the discussion, it is not really clear what the authors exactly aim to target. Is it the regulation of fermentation in cyanobacteria because it is under-investigated? Is it to bring light to the transcriptional regulation of hydrogenase genes? The regulation by SigE? Or is it to get insight into the real function of cyAbrB2 in cyanobacteria? All of this would be good of course. But it appears that the authors try to integrate all these aspects, which in the end is a little bit counterintuitive and in some places even confusing. From my point of view, the major story is a functional investigation of the presumable transcriptional regulator cyAbrB2, which turned out to be a potential NAP. To demonstrate/prove this, the hox genes have been chosen as an example due to the fact that a regulatory role of cyAbrB2 has already been described. In my eyes, it would be good to restructure or streamline the introduction according to this major outcome. 

      As you pointed out, the major focus of this study is cyAbrB2 as a potential NAP. To focus on NAPs, we simplified the first paragraph of the discussion (ll.246-263) and added a section comparing cyAbrB2 with other known NAPs (ll.269-299). To emphasize the description of cyAbrB2, we also rearranged the figures and divided the analysis of cyAbrB2 ChIP into two figures. We shortened the first paragraph of the introduction but mostly preserved its composition to keep the general-to-specific pattern, even though the reviewer found the red line of the manuscript blurry.

      Points to consider: 

      The authors suggest that the microoxic condition is the reason for the downregulation of e.g. photosynthesis (l.112-114). But of course, they also switched off the light to achieve a microoxic environment, which presumably is the trigger signal for photosynthesis-related genes. I suggest avoiding making causal conclusions exclusively related to oxygen and recommend rephrasing (for example, "were downregulated under the conditions applied").

      We agree with this point. We rephrased l.114 to “by the transition to dark microoxic conditions from light aerobic conditions” (ll.108-109).

      The authors hypothesized that cyAbrB2 modulates chromosomal conformation and conducted a 3C analysis. But if I read the data in Figure 5B & C correctly, there is a lot of interaction in a range of 1650 and 1700 kb, not only at marked positions c and j. Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant? In the case of position j the variation between the replicates seems quite high, in the case of position c the mean difference is not that high. Moreover, does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A? If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT. That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown. But I have to mention that I am not an expert in these kinds of assays. Nevertheless, if there is a biological function that shall be revealed by an experiment, the data must be crystal clear on that. At least the descriptions of the 3C data and the corresponding conclusions need to be improved. For me, it is hard to follow the authors' thoughts in this context. 

      Following your suggestion, we carefully re-examined the 3C data. Furthermore, we conducted an additional 3C experiment on the nifJ region (Figures 7F-J). We acknowledge that we had overinterpreted the 3C data. Therefore, we rewrote the results and discussion of the 3C assay in line with the data (ll.220-245) and removed the previous Figure 5E. Our individual responses follow.

      Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant?

      We could not find statistically significant differences at loci c and j. Therefore, we added the following to the Results section: “Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.231-232)

      does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A?

      As you suspected, interaction frequency and cyAbrB2 binding do not correlate. Therefore, we withdrew the previous claim and stated the following: “Moreover, our 3C data did not support bridging at least in hox region and nifJ region, as the high interaction locus and cyAbrB2 binding region did not seem to correlate (Figure 7).” (ll.280-282)

      If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT.

      We rewrote it as follows: “Then we compared the chromatin conformation of wildtype and cyabrb2∆. Although overall shapes of graphs did not differ, some differences were observed in wildtype and cyabrb2∆ (Figures 7B and 7G); interaction of locus (c) with hox region were slightly lower in cyabrb2∆ and interaction of loci (f’) and (g’) with nifJ region were different in wildtype and cyabrb2∆. Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.228-232)

      That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown.

      We rewrote the sentence as follows: “While the interaction scores exhibit considerable variability, the individual data over time demonstrate declining trends of the wildtype at locus (c) and (j) (Figure S8). In ∆cyabrb2, by contrast, the interaction frequency of loci (c) and (j) was unchanged in the aerobic and microoxic conditions (Figure 7E). The interaction frequency of locus (c) in ∆cyabrb2 was as low as that in the microoxic condition of wildtype, while that of locus (j) in ∆cyabrb2 was as high as that in the aerobic condition of wildtype (Figures 7B and 7C).” (ll.238-243)

      The figures are nicely prepared, albeit quite complex and in some cases not really supportive of the understanding of the results description. Moreover, they show a rather loose organization that sometimes does not fit the red line of the results section. For example, Figure 1D is not mentioned in the paragraph that refers to several other panels of the same figure (see lines 110-128). Panel 1D is mentioned later in the discussion. Does 1D really fit into Figure 1 then? Are all the panels indeed required to be shown in the main document? As some elements are only briefly mentioned, the authors might also consider moving some into the supplement (e.g. left part of Figure 1C, Figure 2A, Figure 3B ...) or at least try to distribute some panels into more figures. This would reduce complexity and increase comprehensibility for future readers. Also, Figure 3 is way too complex. Panel G could be an alone-standing figure. The latter would also allow for an increase in font sizes or to show ChIP data of both conditions (L+O2 and D-O2) separately. Moreover, a figure legend typically introduces the content as a whole by one phrase but here only the different panels are described, which fits to the impression that all the different panels are not well connected. Of course, it is the decision of the authors what to present and how but may they consider restructuring and simplifying.

      Following this advice, we have rearranged the figure composition.

      The left side of Figure 1C has been moved to the supplement. Instead, representative expression fold changes of “Transient”, “Plateau”, “Continuous”, and “Late” genes are shown for comprehensibility. We left Figure 1D in Figure 1, as this diagram shows our motive to focus on hox and nifJ. We moved Figure 2A to the supplement. We did not move Figure 3B, as this figure shows the distribution of cyAbrB2 (“long tract of AT-rich DNA”) comprehensively and simply. We agree that Figure 3 was too complex. Therefore, we moved Figures 3F and 3G to a new independent figure (Figure 4). In Figure 4C (former 3G), we show the ChIP data of the L+O2 condition only, and the change of ChIP data under the D-O2 condition is shown in Figure 5. The schematic image showing the cyanobacterial chromosome and NAPs (previous Figure 5E) was omitted because it overinterpreted the data.

      The authors assume a physiological significance of transient upregulation of e.g. hox genes under microoxic conditions. But does the hydrogenase indeed produce hydrogen under the conditions investigated and is this even required? Moreover, the authors use the term "fermentative gene". But is hydrogen indeed a fermentation product, i.e. are protons the terminal electron acceptor to achieve catabolic electron balance? Then huge amounts of hydrogen should be released. Comment should be made on this.

      This is a very important point. Yes, hydrogenase does produce hydrogen under the conditions we investigated, and protons accept a majority of the reducing power under the dark microoxic condition. We wrote in the introduction as follows: “Hydrogen is generated in quantities comparable to lactate and dicarboxylic acids as the result of electron acceptance in the dark microoxic condition (Akiyama and Osanai 2023; Iijima et al. 2016)” (ll.54-55). A detailed explanation follows, although it is omitted from the manuscript.

      A recent study (Akiyama and Osanai 2023) quantified the consumed glycogen and secreted fermentative products (hydrogen, lactate, dicarboxylic acid, and acetate) in Synechocystis under the dark microoxic condition, the same conditions as we investigated. The system of that study consists of a 10 mL liquid layer and a 10 mL gas layer, cultivated for 3 days under dark microoxic conditions. The amounts of lactic acid, dicarboxylic acid, and hydrogen were then approximately 2 µmol, 3.5 µmol, and 11 µmol (assuming the gas layer was at 1 atm and ignoring the hydrogen dissolved in the liquid layer), respectively. Meanwhile, glycogen equivalent to 15 µmol of glucose was consumed in the system. This estimate supports the view that hydrogen accounts for a substantial portion of fermentative products during dark microoxic conditions.

      The necessity of hydrogen production under dark microoxic conditions was demonstrated by Gutekunst et al. (2014), who showed that hydrogenase activity is required for mixotrophic growth in the light-dark and microoxic cycle with arginine. The necessity remains unclear in our conditions because we only tested continuous dark microoxic conditions without glucose.
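      The arithmetic behind this estimate can be sketched as follows. This is only an illustrative calculation using the approximate amounts quoted above; the variable and function names are hypothetical, acetate is left out because no amount is given for it here, and molar shares do not account for the different number of reducing equivalents carried by each product.

```python
# Approximate amounts (in µmol) quoted in the response for the
# 10 mL liquid + 10 mL gas system after 3 days of dark microoxic culture.
products_umol = {"lactate": 2.0, "dicarboxylic acids": 3.5, "hydrogen": 11.0}
glucose_consumed_umol = 15.0  # glycogen consumed, expressed as glucose equivalents


def molar_shares(amounts):
    """Return each product's fraction of the summed molar amount."""
    total = sum(amounts.values())
    return {name: value / total for name, value in amounts.items()}


shares = molar_shares(products_umol)
# Moles of hydrogen released per mole of glucose equivalent consumed.
h2_per_glucose = products_umol["hydrogen"] / glucose_consumed_umol

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the quantified products (molar basis)")
print(f"H2 per glucose equivalent: {h2_per_glucose:.2f} mol/mol")
```

      On a molar basis, hydrogen is the largest of the three quantified products, which is the sense in which it is described as a substantial portion of the fermentative output.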

      The authors also mention a reverse TCA cycle. But is its existence an assumption or indeed active in cyanobacteria, i.e. is it experimentally proven? The authors are a little bit vague in this regard (see lines 241-246).

      We misused the terminology. We meant the “reductive branch of TCA”. Cyanobacteria conduct the branched TCA cycle under microoxic conditions. One of the branches is the reductive branch, which reduces oxaloacetate to produce malate. We corrected “reverse TCA cycle” to “reductive branch of TCA”. (Figure 1D and ll.260-262)

      Reviewer #2 (Public Review): 

      This work probes the control of the hox operon in the cyanobacterium Synechocystis, where this operon directs the synthesis of a bidirectional hydrogenase that functions to produce hydrogen. In assessing the control of the hox system, the authors focused on the relative contributions of cyAbrB2, alongside SigE (and to a lesser extent, SigA and cyAbrB1) under both aerobic and microoxic conditions. In mapping the binding sites of these different proteins, they discovered that cyAbrB2 bound many sites throughout the chromosome, repressed many of its target genes, and preferentially bound regions that were (relatively) rich in AT-residues. These characteristics led the authors to consider that cyAbrB2 may function as a nucleoid-associated protein (NAP) in Synechocystis, given its functional similarities with other NAPs like H-NS. They assessed the local chromosome conformation in both wild-type and cyabrB2 mutant strains at multiple sites within a 40 kb window on either side of the hox locus, using a region within the hox operon as bait. They concluded that cyAbrB2 functions as a nucleoid-associated protein that influences the activity of SigE through its modulation of chromosome architecture.

      The authors approached their experiments carefully, and the data were generally very clearly presented and described.

      Based on the data presented, the authors make a strong case for cyAbrB2 as a nucleoid-associated protein, given the multiple ways in which it seems to function similarly to the well-studied Escherichia coli H-NS protein. It would be helpful to provide some additional commentary within the discussion around the similarities and differences of cyAbrB2 to other nucleoid-associated proteins, and possible mechanisms of cyAbrB2 control (post-translational modification; protein-protein interactions; etc.). The manuscript would also be strengthened with the inclusion of biochemical experiments probing the binding of cyAbrB2, particularly focusing on its oligomerization and DNA polymerization/bridging potential.

      We agree that biochemical experiments will deepen our insight into cyAbrB2 and chromatin conformation. As the reviewer pointed out, biochemical assays will provide valuable information on mechanisms of cyAbrB2 control, such as post-translational modification, cooperation with cyAbrB1, oligomerization, and the structure of cyAbrB2-bound DNA. However, we think those potential findings are worthy of a new, independent research paper rather than a part of this paper. Therefore, we added a discussion mentioning biochemistry as future work (ll.275-290; the section “The biochemistry of cyAbrB2 will shed light on the regulation of chromatin conformation in the future”).

      Previous work had revealed a role for SigE in the control of hox cluster expression, which nicely justified its inclusion (and focus) in this study. However, the results of the SigA studies here suggested that SigA both strongly associated with the hox promoter, and its binding sites were shared more frequently than SigE with cyAbrB2. The focus on cyAbrB2 is also well-justified, given previous reports of its control of hox expression; however, it shares binding sites with an essential homologue cyAbrB1. Interestingly, while the B1 protein appears to bind similar sites, instead of repressing hox expression, it is known as an activator of this operon. It seems important to consider how cyAbrB1 activity might influence the results described here.

      We infer that the minor side of the bimodal SigE peak is the genuine population that contributes to hox transcription, as hox genes are expressed in a SigE-dependent manner (Figure S2). We consider that the strong SigA peak upstream of the hox operon reflects binding to the promoter of TU1715, which is transcribed in the opposite direction to the hox operon. We added a description of the single SigA peak and the bimodal SigE peak near the TSS of the hox operon as follows:

      “A bimodal peak of SigE was observed at the TSS of the hox operon in a microoxic-specific manner (Figure 6C bottom panel). The downstream side of the bimodal SigE peak coincides with SigA peak and the TSS of TU1715. Another side of the bimodal peak lacked SigA binding and was located at the TSS of the hox operon (marked with an arrow in Figure 6C), although the peak caller failed to recognize it as a peak.” (ll.206-209)

      The point that cyAbrB1 binds similar sites to cyAbrB2, despite regulating hox expression in the opposite direction, is very interesting. Therefore, we referred to the transcriptome data of the cyAbrB1 knockdown strain and compared the impact of cyAbrB1 knockdown and cyAbrB2 deletion. We describe this in the Results and Discussion as follows:

      “we referred to the recent study performing transcriptome of cyAbrB1 knockdown strain, whose cyAbrB1 protein amount drops by half (Hishida et al. 2024). Among 24 genes induced by cyAbrB1 knockdown, 12 genes are differentially downregulated genes in cyabrb2∆ in our study (Figure S5D).” (ll.162-165)

      “CyAbrB1, the homolog of cyAbrB2, may cooperatively work, as cyAbrB1 directly interacts with cyAbrB2 (Yamauchi et al. 2011), their distribution is similar, and they partially share their target genes for suppression (Figures 3A S5C and S5D). The possibility of cooperation would be examined by the electrophoretic mobility shift assay of cyAbrB1 and cyAbrB2 as a complex. Despite their similar repressive function, cyAbrB1 and cyAbrB2 regulate hox expression in the opposite directions, and their mechanism remains elusive.” (ll.292-296)

      The hox operon differs from this general tendency. To see if cyAbrB1 behaves differently from cyAbrB2 in the hox operon, we did an additional ChIP-qPCR experiment on cyAbrB1 in the aerobic condition and the dark microoxic condition (Figure 5C). However, we could not find a difference.

      Reviewer #1 (Recommendations For The Authors): 

      Figure 1B: I recommend changing the header in the grey bar to terms like "upregulated" and "downregulated", which are also used in the legend description. Upregulation of genes can also be a result of de-repression, which is why the term "activated" is somewhat misleading.

      Corrected.

      Lines 114-116: It is unclear what the authors exactly mean here. Please clarify. 

      We rephrased the sentence as follows: “The enrichment in the butanoate metabolism pathway indicates the upregulation of genes involved in carbohydrate metabolism. We further classified genes according to their expression dynamics.” (ll.110-111)

      Reviewer #3 (Recommendations For The Authors): 

      Major/experimental comments: 

      (1) For the chromosome conformation capture experiments, it is indicated that these were conducted at aerobic (1hr) and microoxic (4 hr) conditions. But the data presented in Figure 1 suggest that 1 hr corresponds to the beginning of microoxic growth, and that time 0 is aerobic. The composite 3C data in Figure 5 show some interesting but specific differences. It is appreciated that the authors presented the profiles for individual samples in Figure S7, and the differences here do not seem to be as compelling. Are the major differences being highlighted significantly (statistically) different (e.g. at the (c) and (j) loci)? Might the differences be starker if an earlier aerobic condition (e.g. time 0) had been used instead of the 1 hr - microoxic - timepoint?

      Previous Figure 5 consisted of three time points (solid line: aerobic condition; dashed line: 1 hr of microoxic condition; dotted line: 4 hr of microoxic condition). We omitted the 4 hr data from the main figure (Figure 7) because they complicate the figure. All three time points are shown in the profiles of individual loci (Figure S8).

      A t-test found no statistically significant difference at loci (c) and (j). Therefore, we have toned down the interpretation of the 3C data as follows: “Our 3C result demonstrated that cyAbrB2 influences the chromosomal conformation of hox and nifJ region to some extent (Figure 7).” (ll.325-326)
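      To illustrate why variable replicates can leave such a comparison non-significant, the sketch below computes a two-sample t statistic. It assumes Welch's unequal-variance form, which the response does not specify, and the replicate values are hypothetical placeholders rather than data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical interaction scores for three replicates per strain.
wildtype_scores = [1.0, 2.0, 3.0]
mutant_scores = [2.0, 3.0, 4.0]

t_stat, dof = welch_t(wildtype_scores, mutant_scores)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

      With only three replicates per group and a spread as large as the difference in means, the statistic stays small, so a visible difference in the mean profiles need not reach significance.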

      (2) This is a complicated system that involves multiple regulatory proteins, each of which is differentially affected by the growth conditions (aerobic/microoxic). It is obviously beyond the scope of this work to probe deeply into all of these proteins. The focus here was on cyAbrB2, and to a slightly lesser extent SigE; however, based on the data presented, it seems that SigA and cyAbrB1 may be equally important contributors to hox control/expression, and in the case of cyAbrB1, possibly also to chromosome conformation. cyAbrB1 appears to have the same binding sites as cyAbrB2, and has been reported to interact with cyAbrB2. Given this association, it is possible that the two proteins may affect the binding of each other, and that loss of one might lead to enhanced binding by the other (or binding may require heterooligomerization?). Probing the regulatory interplay between these two proteins (or at least discussing it) feels important. Conducting e.g. mobility shift assays with each protein, both individually and together, could possibly allow for some understanding of how they function together. 

      We agree that the biochemistry of cyAbrB2 and cyAbrB1 may explain why cyAbrB1 and cyAbrB2 bind long tracts of AT-rich genome regions in vitro. We would like to leave the biochemistry as future work, as we think biochemical data are beyond the scope of the present study.

      The idea that cyAbrB1 and cyAbrB2 cooperate to form heterooligomers and bind broadly to the genome is a very rational and interesting prediction. We added this idea to the discussion: “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290). We also compared our transcriptome of ∆cyabrb2 with the recent study of cyabrb1 knockdown (ll.162-165), and concluded that “they partially share their target genes for suppression (Figures 3A S5C and S5D)” (l.293).

      (3) Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means. It appears that when cyAbrB2 binds, any given protected region can be quite extensive, which can be suggestive of polymerization along the chromosome. Are the boundaries for binding sites typically clearly delineated, and this changes when the cultures are growing under microoxic conditions? There is also no mention made anywhere about oligomerization potential for cyAbrB2, which would be important for the polymerization, and bridging suggested for cyAbrB2 in the model presented in Figure 5. Previous publications (Song et al., 2022; Ishi et al., 2008) have suggested that it can exist as a dimer in vivo, but that in vitro it is largely monomeric. The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means.

      To describe “cyAbrB2 binding becomes blurry” clearly, we rearranged the figure composition and made a dedicated figure (Figure 5). We also rephrased the description by adopting the reviewer’s phrase “boundaries for binding sites”, as it describes the change well: “When cells entered microoxic conditions, the boundaries of the cyAbrB2 binding region and cyAbrB2-free region became obscure (Figure 5),” (ll.319-320)

      There is also no mention made anywhere about oligomerization potential for cyAbrB2,

      We added a discussion of oligomerization: “DNA-bound cyAbrB2 is expected to oligomerize, based on the long tract of cyAbrB2 binding region in our ChIP-seq data. However, no biochemical data mentioned the DNA deforming function or oligomerization of cyAbrB2 in the previous studies and preference for AT-rich DNA is not fully demonstrated in vitro (Dutheil et al. 2012; Ishii and Hihara 2008; Song et al. 2022)” (ll.277-280) and “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290)

      The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      We added a discussion that jointly considers the known features of cyAbrB2, our novel findings on cyAbrB2, and the comparison with known NAPs (ll.269-290).

      (4) Given that the major take-away for the authors (based on the title) seems to be the nucleoid-associated protein potential for cyAbrB2, the Discussion would benefit from some additional focus in this area. How similar is cyAbrB2 to other nucleoid-associated proteins? (e.g. H-NS, Lsr2) How does counter-silencing work for other nucleoid-associated proteins? Can the authors definitively exclude the possibility of binding site competition/occlusion, given that cyAbrB2 covers the promoter region of hox? What other nucleoid-associated proteins have been characterized in the cyanobacteria? 

      We agree with this point, so we added a discussion comparing cyAbrB2 with H-NS and Lsr2, the canonical NAPs (ll.269-290).

      We did not deny the possibility that cyAbrB2 excludes RNAP, but the previous manuscript discussed it insufficiently. To emphasize that cyAbrB2 excludes RNA polymerase, we simplified Figure 6 and employed mosaic plots showing the anti-co-occurrence of cyAbrB2 binding regions and SigE peaks. Furthermore, we added a discussion of SigE exclusion by cyAbrB2 (ll.355-359).

      We mention the possibility of other nucleoid-associated proteins in cyanobacteria in the discussion. “Furthermore, the conformational changes by deletion of cyAbrB2 were limited, suggesting there are potential NAPs in cyanobacteria yet to be characterized.” (ll.336-339)

      (5) Previous work (Song et al., 2022) showed that changing the AT content of cyAbrB2 binding sites did not affect its ability to bind DNA. There are also previous papers suggesting that cyAbrB2 may be subject to diverse post-translational modifications (e.g. phosphorylation - Spat et al., 2023; glutathionylation - Sakr et al., 2013), as well as association with cyAbrB1. These collectively suggest there may be other factors that contribute to cyAbrB2 binding specificity/activity. These seem like relevant points to discuss, particularly given the transient nature of the cyAbrB2 effects on some genes.

      We have included a discussion of AT content, post-translational modifications and transient regulation, and association with cyAbrB1 (ll.284-295).

      (6) Given the major binding site for SigA upstream of the hox operon, it seems that it likely also contributes to hox cluster expression, together with SigE. Is there a sense for the relative contribution of each sigma factor to hox cluster expression? And whether both are subject to the same inhibitory effect of cyAbrB2? 

      As described above in our response to the public review, the SigA binding site upstream of the hox operon should be assigned to the TSS of TU1715 (Figure 6C). Transcription of the hox operon is highly dependent on SigE, as shown in Figure S2, and residual transcription in the sigE∆ strain is derived from other sigma factors (SigABCD). Estimating the relative contribution of sigma factors other than SigE is difficult at present because SigABCDE can partially compensate for each other.

      Because H-NS is reported to affect the primary and alternative sigma factors differently (Shin et al. 2005), whether both the primary sigma factor (SigA) and the alternative sigma factor (SigE) are inhibited by cyAbrB2 to the same extent is a very interesting question.

      We calculated the odds ratio of SigE and SigA being in the cyAbrB2-free region and reported it in the Results: “SigE preferred the cyAbrB2-free region in the aerobic condition more than SigA did (Odds ratios of SigE and SigA being in the cyAbrB2-free region were 4.88 and 2.74, respectively).” (ll.193-195). We also discussed it as follows: “The higher exclusion pressure of cyAbrB2 on SigE may contribute to sharpening the transcriptional response of hox and nifJ on entry to microoxic conditions.” (ll.357-359)
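      For readers unfamiliar with the measure, an odds ratio of this kind can be computed from a 2×2 contingency table as sketched below. The table layout (genomic units with or without a sigma-factor peak, split by cyAbrB2-free versus cyAbrB2-bound regions) is an assumption about how such a comparison could be set up, and the counts are hypothetical; neither is taken from the manuscript.

```python
def odds_ratio(peak_free, peak_bound, no_peak_free, no_peak_bound):
    """Odds ratio for a 2x2 table.

    peak_free / peak_bound: units with a peak in cyAbrB2-free / cyAbrB2-bound regions.
    no_peak_free / no_peak_bound: units without a peak in those same regions.
    """
    if peak_bound == 0 or no_peak_free == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    # (odds of lying in the free region given a peak) divided by
    # (odds of lying in the free region given no peak)
    return (peak_free * no_peak_bound) / (peak_bound * no_peak_free)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = odds_ratio(peak_free=40, peak_bound=10, no_peak_free=20, no_peak_bound=30)
print(f"odds ratio = {example:.2f}")
```

      A value above 1 indicates that peaks fall in cyAbrB2-free regions more often than expected from the background, so a larger ratio for SigE than for SigA corresponds to stronger exclusion of SigE from cyAbrB2-bound regions.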

      (7) The 3C experiments suggest there are indeed changes in chromosome architecture in the hox region as growth conditions change and when different regulators are present. Across the chromosome, analogous changes are expected; however, it may be premature to draw this conclusion based on changes at one locus. Is there a reason that the authors did not take full advantage of their 3C samples and sequence them, to capture the full chromosome interactome at the two time-points? This would allow broader conclusions to be drawn regarding changes in chromosome structure and the impact of cyAbrB2.

      In response to the suggestion, we performed an additional 3C assay on the nifJ region using the residual 3C samples. Expanding to genome-wide sequencing (Hi-C) requires enrichment of ligated fragments by biotinylation, a step that was omitted from our 3C samples.

      We rewrote the result as obtained from the 3C data of hox and nifJ (ll.220-245) and omitted the schematic image of an entire chromosome of cyanobacteria (previous Figure 5E).

      Editorial comments: 

      (1) The data presentation in Figure 1 is very effective. 

      (2) Line 87: please rephrase - you can have 'high similarity' or 'high levels of identity', but not high levels of homology - genes/proteins are either homologous or not.

      (3) Line 118: classified into four 'groups'? 

      (4) Line 590: remove 'the'. 

      (5) Figure 2S, panel B: please define acronyms in the legend (GT, IP) and write out 'FLAG' in full for AbrB1.

      (2) to (5) have been corrected.

      (6) Please provide information on or a reference for the tagging of SigA for use in the ChIP-seq experiments within the Materials and Methods.

      Added (l.365)

      (7) Line 648: space between 'binding' and 'regions'. 

      Corrected.

      (8) Fig 4E: please make the solid lines thicker - they are currently difficult to see.

      We have made Figure 6C (former 4E) larger and the line thicker.

      (9) Line 666: location. 

      (10) Line 673: Individual. 

      (11) Figure S5, panel C graph title: should this be 'Relative'? 

      (12) Figure S7: What is 'GT'? Should this be 'WT'? 

      (9) to (12) have been corrected.

      (13) In addition to the data presented in Figure 3G, it would be nice to have a small table or Venn diagram summarizing the number of cyAbrB2 binding sites that fall into the different categories (full gene/operon; downstream of a gene; within a gene; promoter region). 

      In response to the comment, we noticed the categories we had applied (full gene/operon; downstream of a gene; within a gene; promoter region) were arbitrary. Therefore, we categorized transcriptional units (TUs) according to the extent of occupancy by cyAbrB2. (Figures 4B and 4C)
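      One way to quantify the extent of occupancy of a transcriptional unit is the fraction of its length covered by binding regions, sketched below. The coordinates are hypothetical half-open intervals, the function name is illustrative, and the cut-offs the authors used to group TUs are not stated here, so none are applied.

```python
def covered_fraction(tu, regions):
    """Fraction of a TU (start, end) covered by a list of (start, end) regions.

    Intervals are half-open [start, end); overlapping regions are counted once.
    """
    tu_start, tu_end = tu
    # Keep only regions that overlap the TU, clipped to the TU boundaries.
    clipped = sorted(
        (max(start, tu_start), min(end, tu_end))
        for start, end in regions
        if start < tu_end and end > tu_start
    )
    covered = 0
    cursor = tu_start  # right-most position already counted
    for start, end in clipped:
        start = max(start, cursor)
        if end > start:
            covered += end - start
            cursor = end
    return covered / (tu_end - tu_start)


# Hypothetical TU and binding regions (genome coordinates in bp).
fraction = covered_fraction((100, 200), [(50, 120), (110, 150), (180, 300)])
print(f"occupied fraction = {fraction:.2f}")
```

      Binning TUs by this fraction avoids having to decide in advance whether a binding region counts as "promoter", "within a gene", or "downstream".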

      (14) Line 280-281: suggest replacing 'mediates' with 'influences'. 'Mediates' sounds like a direct interaction (for which the evidence is not currently strong without some additional biochemical data), but 'influences' could better accommodate both direct and indirect possibilities. 

      (15) Line 410: it is not clear what this means. 

      We have omitted “As a result, DNA ~600-fold condensed DNA than 3C samples were ligated.”, as it does not give any information about the experimental procedure.

    2. eLife assessment

      The authors provide solid data on a functional investigation of potential nucleoid-associated proteins and the modulation of chromosomal conformation in a model cyanobacterium. These valuable findings will be of interest to the chromosome and microbiology fields. Additional analysis and the tempering of conclusions has helped to improve the work, although further refinement remains possible.

    3. Reviewer #3 (Public Review):

      This work probes the control of the hox operon in the cyanobacterium Synechocystis, where this operon directs the synthesis of a bidirectional hydrogenase that functions to produce hydrogen. In assessing the control of the hox system, the authors focused on the relative contributions of cyAbrB2, alongside SigE (and to a lesser extent, SigA and cyAbrB1) under both aerobic and microoxic conditions. In mapping the binding sites of these different proteins, they discovered that cyAbrB2 bound many sites throughout the chromosome, repressed many of its target genes, and preferentially bound regions that were (relatively) rich in AT-residues. These characteristics led the authors to consider that cyAbrB2 may function as a nucleoid-associated protein (NAP) in Synechocystis, given the functional similarities with other NAPs like H-NS. They assessed the local chromosome conformation in both wild type and cyabrB2 mutant strains at multiple sites within a 40 kb window on either side of the hox locus, using a region within the hox operon as bait. They concluded that cyAbrB2 functions as a nucleoid associated protein that influences the activity of SigE through its modulation of chromosome architecture.

      The authors approached their experiments carefully, and the data were generally very clearly presented. At the same time, the overall work contains many lines of inquiry and different protein investigations that in some ways made it more challenging to identify the overall take-away message(s).

      Based on the data presented, the authors make a strong case for cyAbrB2 as a nucleoid-associated protein, given the multiple ways in which it seems to function similarly to the well-studied Escherichia coli H-NS protein. They now provide additional commentary that relates cyAbrB2 with other nucleoid-associated proteins.

      Previous work had revealed a role for SigE in the control of hox cluster expression, which nicely justified its inclusion (and focus) in this study. The focus on cyAbrB2 is also well-justified, given previous reports of its control of hox expression; however, it shares binding sites with an essential homologue cyAbrB1. Interestingly, while the B1 protein appears to bind similar sites, instead of repressing hox expression, it is known as an activator of this operon. If the information on cyAbrB1 is retained in the manuscript, it would be important to consider how cyAbrB1 activity might influence the results described here (although the authors could also consider removing the cyAbrB1 information to help improve the focus of the manuscript).

    1. eLife assessment

      In this manuscript the authors present high-speed atomic force microscopy (HSAFM) to analyze real-time structural changes in actin filaments induced by cofilin binding. This important study enhances our understanding of actin dynamics which plays a crucial role in a broad spectrum of cellular activities based on solid experimental evidence. Some technical questions, however, remain, making the data interpretation incomplete.

    2. Reviewer #1 (Public Review):

      The authors provided a detailed analysis of the real-time structural changes in actin filaments resulting from cofilin binding, using High-Speed Atomic Force Microscopy (HSAFM). The cofilin family controls the lifespan of actin filaments in cells by severing the filament and promoting depolymerization. Understanding the effects of cofilin on actin filament structure is critical. It is widely acknowledged that cofilin binding significantly shortens the pitch of the actin helix. The authors previously reported (1) that this shortening extends to the unbound region of the actin filament on the pointed end side of the cofilin binding cluster. In this study, the authors presented substantially improved AFM images and provided detailed accounts of the dynamics observed. It was found that a minimal cofilin-binding cluster, consisting of 2-4 molecules, could induce changes in the helical parameters over one or more actin crossover repeats. Adjacent to the cofilin-binding clusters, the actin crossovers were observed to shorten within seconds, and this shortening was limited to one side of the cluster. Additionally, the phosphate binding to the actin filament was observed to stabilize the helical twist, suggesting a mechanism in which cofilin preferentially binds to ADP-bound actin filaments. These findings significantly advance our understanding of actin filament dynamics, which is essential for a wide range of cellular processes.

      However, two aspects remain insufficiently addressed. Readers should be aware of possible errors in the Mean Axial Distance (MAD) analysis and the limitations of discussions about the actin subunit structure.

      The authors have presented findings that the MAD within actin filaments exhibits a significant dependency on the helical twist. However, difficulty in determining each subunit interval from the AFM image might affect the analysis. For example, the observation of three peaks in HHP6 of Figure Supplement 6C, corresponding to 4.5 pairs, showed peak intervals of 5, 11.8, 8.7, and 5.7 nm (measured from the figure). The second region (11.8 nm) appears excessively long. If one peak is hidden in the second region, the MAD becomes 5.5 nm.

      The authors also suggest a strong link between the C-form (cofilin binding form of actin found in cofilactin) and the formation of regions of the short pitch helix outside the cofilin binding cluster. However, the AFM observation did not provide any evidence about the actin form in these regions because of measurement limitations. Additionally, Oda et al. (2) have demonstrated that the C-form is highly unstable in the absence of cofilin binding, casting doubt on the possibility of the C-form propagating without cofilin binding. The "C-actin-like structure" in the paper is not necessarily related to the C-form actin. It might be one of the G-forms (monomeric actin forms) or another unknown form.

      (1) K. X. Ngo et al., Cofilin-induced unidirectional cooperative conformational changes in actin filaments revealed by high-speed atomic force microscopy. eLife 4, (2015).
      (2) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).

    3. Reviewer #2 (Public Review):

      Summary:

      This study by Ngo et al. uses mostly high-speed AFM to estimate conformational changes within actin filaments, as they get decorated by cofilin. The authors build on their earlier study (Ngo et al. eLife 2015) where they used the same technique to monitor the expansion of cofilin clusters on actin filaments, and the propagation of the associated conformational changes in the filament (reduction of the helical pitch). Here, they propose a higher-resolution description of the binding of cofilin to actin filaments.

      Strengths:

      The high speed AFM technique used here is quite original to address this question, compared to more classical light and electron microscopy techniques. It can certainly bring valuable information as it provides a high spatial resolution while monitoring live events. Also, in this paper, a nice effort was made to make the 3D structures and conformational changes clear and understandable.

      Weaknesses:

      In spite of the authors' response to my earlier comments, I still have concerns regarding the AFM technique. In particular, regarding the interactions of the filaments with the surface, which I still find unclear and potentially problematic.

      The filaments appear densely packed on the surface, and even clearly in register in some images (if not most images, e.g., Figs 3AD, 4BC, 5A, 8AC). I understand that there are practical reasons for this, but isn't there a risk that this could affect the result? Maybe I did not understand the authors' response well enough, but I did not see a clear control that would alleviate my concern.

      The properties of the lipid layer and its interaction with the actin filaments are still unclear to me. A poor control of these interactions is a problem if one aims to measure conformational changes at high resolution. The strength of the interaction appears tuned by the ratio of lipids put on the surface to change its electrostatic charge. A strong attachment likely does more than suppress torsional motion (as claimed in Fig 8A). It may also hinder cofilin binding in several ways (lower availability of binding sites on the filament facing the surface, electrostatic interactions between cofilin and the surface, etc.). Here again, I was not fully reassured by the authors' response.

      The identification of cofilactin regions relies on the additional height of the "peaks", due to the presence of cofilin. It thus seems that cofilin is detected every half helical pitch (HHP), and I still don't understand how the authors can make reliable claims regarding the presence or absence of cofilin between these peaks.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The authors provided a detailed analysis of the real-time structural changes in actin filaments resulting from cofilin binding, using High-Speed Atomic Force Microscopy (HSAFM). The cofilin family controls the lifespan of actin filaments in the cells by severing the filament and promoting depolymerization. Understanding the effects of cofilin on actin filament structure is critical. It is widely acknowledged that cofilin binding significantly shortens the pitch of the actin helix. The authors previously reported (1) that this shortening extends to the unbound region of the actin filament on the pointed end side of the cluster. In this study, the authors presented substantially improved AFM images and provided detailed accounts of the dynamics observed. It was found that a minimal cofilin-binding cluster, consisting of 2-4 molecules, could induce changes in the helical parameters over one or more actin crossover repeats. Adjacent to the cofilin-binding clusters, the actin crossovers were observed to shorten within seconds, and this shortening was limited to one side of the cluster. Additionally, the phosphate binding to the actin filament was observed to stabilize the helical twist, suggesting a mechanism in which cofilin preferentially binds to ADP-bound actin filaments. These findings significantly advance our understanding of actin filament dynamics, which is essential for a wide range of cellular processes.
      However, I propose that the sections about MAD and certain parts of the discussions need substantial revisions.

      In this study, we leverage the high spatiotemporal resolution of high-speed atomic force microscopy (HS-AFM) to analyze real-time structural changes in actin filaments induced by cofilin binding. Furthermore, we experimentally demonstrate the inherent variability in twist conformations of bare actin filaments. Our study integrates HS-AFM with Principal Component Analysis (PCA) to elucidate the actin structure-dependent preferential cooperative binding of cofilin. We provide experimental evidence to substantiate a "proof of principle" regarding the flexible helical twists of actin filaments that regulate the functions of actin-binding proteins. This important study enhances our understanding of actin filaments’ dynamics and polymorphic structures, which play crucial roles in a broad spectrum of cellular activities.

      We appreciate the comments from Reviewer 1. Below, we address their concerns point by point.

      MAD analysis

      The authors have presented findings that the mean axial distance (MAD) within actin filaments exhibits a significant dependency on the helical twist, a conclusion not previously derived despite extensive analyses through electron microscopy (EM) and molecular dynamics (MD) simulations. Notably, the MAD values span from 4.5 nm (8.5 pairs per half helical pitch, HHP) to 6.5 nm (4.5 pairs/HHP) as depicted in Figure 3C. The inner domain (ID) of actin remains very similar across C, G, and F forms (2, 3), maintaining similar ID-ID interactions in both cofilactin and bare actin filaments, keeping the axial distance between subunits identical in both states. This suggests that the ID is unlikely to undergo significant structural changes, even with fluctuations in the filament's twist, keeping the ID-ID interactions and the axial distances. The broad range of MAD values reported poses a challenge for explanation. A careful reassessment of the MAD analysis is recommended to ensure accuracy.

      The central challenge in studying protein dynamics in real time lies in bridging the gap in time scales: HS-AFM captures dynamics of proteins within the milliseconds to seconds range, whereas molecular dynamics (MD) simulations typically operate within the femtoseconds to microseconds domain. Protein dynamics encompass a spectrum of temporal scales, from atomic vibrations to molecular tumbling and collective motions in simulations. HS-AFM stands out as a potent technique for delving into protein dynamics, including processes like protein folding and conformational changes triggered by drugs or protein interactions. Additionally, a significant limitation of MD simulation is the spatial modeling constraint (~50 x 50 nm unit), which restricts the study of large complex biological systems. However, utilizing HS-AFM enables the construction of intricate protein models, facilitating the real-time imaging of their structures and dynamics during functional activity.

      Regarding the suggestion about ID-ID interactions in both cofilactin and bare actin filaments, maintaining identical axial distances (ADs) between subunits in both states, our HS-AFM cannot provide atomic-level structural insights to address this issue. However, we demonstrate that the variability of OD twists in actin protomers could potentially lead to globally shorter half helical pitches (HHPs) and fewer protomer pairs per HHP (Figure 2, Figure supplement 2) (see lines 218-222). The fluctuation in filament’s twist is further supported by currently available experimental data, including our findings (Figure 3C) in this study (see our Discussion in lines 555-560).

      The minimal change in local ID-ID interactions results in an unchanged global length of actin filaments in both cofilin-bound and unbound cases (Figure supplement 2). However, filament twists, as experimentally detected by EM, high-resolution interferometric scattering microscopy (iSCAT), HS-AFM, and in pseudo AFM, are variable (see lines 555-560).

      We have additionally reassessed the fluctuation and dynamics of MAD in F-ADP-actin and F-ADP.Pi-actin over time at high temporal resolution (Figure supplement 3, Video 3, Table supplement 5). These data are further explained in the Results section (lines 264-270).

      Furthermore, we reassessed the broad range of MAD values in F-ADP-actin segments on both sides of large cofilin clusters over time (Figure supplement 8, Video 5). These findings are explained in the Results section (lines 333-337) and further discussed in the new results (lines 555-560).

      In determining axial distances, the authors extracted measurements from filament line profiles. It is advised to account for potential anomalies such as missing peaks or pseudo peaks, which could arise from noise interference. An example includes the observation of three peaks in HHP6 of Figure Supplement 5C, corresponding to 4.5 pairs. Peak intervals measured from the graph were 5, 11.8, 8.7, and 5.7 nm. The second region (11.8 nm) appears excessively long. If one peak is hidden in the second region, the MAD becomes 5.5 nm.

      We acknowledge the difficulty in identifying peaks within the regions of bare actin segments adjacent to cofilin clusters or within the cofilactin region. In the revised Figure supplement 6C (originally Figure supplement 5C), we did not assess peak intervals as suggested by Reviewer 1. The measurement of axial distance (AD) and the number of peaks within a HHP to calculate the correct MAD is further detailed in the Methods section (see HS-AFM data analysis and processing, highlighted in purple).

      Additionally, the purpose of presenting Figure supplements 6-7 is to directly compare the half helices and the number of protomer pairs per HHP between bare actin filaments and actin segments near the boundary between cofilactin and bare actin segments on the PE side in the same AFM images. In the original version of this paper, we avoided including the MAD values measured in the cofilactin region (HHP6, HHP7) in Figure Supplement 7E, to mitigate measurement errors.

      Compiling histograms of axial distances (ADs) rather than focusing solely on MAD may provide deeper insights. If the AD is too long or too short, the authors should suspect the presence of missing peaks or pseudo-peaks due to noise. If 4.4 or 5.5 pairs/HHP regions tend to contain missing peaks and 7.5-8.5 pairs/HHP regions tend to contain pseudo peaks, this may explain the MAD dependency on the helical twist.

      The measurement of axial distance (AD) and the number of peaks within a HHP to calculate the correct MAD is further detailed in the Methods section (see Analyses of pseudo AFM images of F-actin and C-actin structures constructed from existing PDB structures (e.g., Figure supplement 2); and HS-AFM data analysis and processing, highlighted in purple).

      We disagree with Reviewer 1’s suggestion that compiling histograms of ADs, rather than focusing solely on MAD, may provide deeper insights. AFM imaging provides only a 2-dimensional (2D) surface structure, unlike the 3-dimensional (3D) structure offered by Cryo-EM. In AFM imaging, we cannot capture the object from different angles as Cryo-EM does. Therefore, AD values measured in 2D AFM images do not accurately represent the axial distance between two adjacent protomers along the same actin filament. Consequently, we relied on MAD values. Our results, including the fluctuation in the number of protomer pairs per HHP, are further supported by other studies (see our Discussion in lines 555-560).
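
      For readers who want a concrete picture of the interval check that Reviewer 1 describes, the following is a minimal sketch and not the authors' analysis. It labels each peak-to-peak interval as suspiciously long (a possible missing peak) or suspiciously short (a possible pseudo-peak) relative to a nominal spacing. The function name `flag_intervals`, the nominal spacing of 5.5 nm, and both threshold factors are illustrative assumptions; the example intervals are the ones the reviewer read from the figure.

```python
def flag_intervals(intervals_nm, nominal_nm=5.5, long_factor=1.75, short_factor=0.5):
    """Label each peak-to-peak interval against a nominal spacing.

    nominal_nm, long_factor and short_factor are illustrative assumptions,
    not values taken from the manuscript.
    """
    labels = []
    for d in intervals_nm:
        if d > long_factor * nominal_nm:
            # Much longer than one spacing: a peak may have been missed.
            labels.append((d, "possible missing peak"))
        elif d < short_factor * nominal_nm:
            # Much shorter than one spacing: a noise peak may have been counted.
            labels.append((d, "possible pseudo-peak"))
        else:
            labels.append((d, "ok"))
    return labels


# Intervals (nm) quoted by Reviewer 1 for HHP6.
for interval, label in flag_intervals([5.0, 11.8, 8.7, 5.7]):
    print(f"{interval:5.1f} nm  {label}")
```

      With these assumed thresholds only the 11.8 nm interval is flagged; a different choice of thresholds would change which intervals stand out, so any such check would need to be calibrated against the imaging noise.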

      Additionally, Figure 3E indicates a first decay constant of 0.14 seconds, substantially shorter than the frame rate (0.5 sec/frame). This suggests significant variations in line profiles between frames, attributable either to overly rapid dynamics or a low signal-to-noise ratio. Implementing running frame averages (of 2-3 frames) is recommended to distinguish between these scenarios. If the dynamics are indeed fast, the averaged frame's line profile may degrade, complicating peak identification. Conversely, if poor signal-to-noise ratio is the cause, averaging frames could facilitate peak detection. In the latter case, the authors can find the optimal number of frame averages and obtain better line profiles with fewer missing and pseudo-peaks.

      We utilized state-of-the-art HS-AFM with high temporal and spatial resolution to capture the dynamic structures of F-ADP-actin and F-ADP.Pi-actin segments at higher frame rates of 0.2 sec/frame and 0.1 sec/frame, respectively (Figure supplement 3). As suggested, we implemented running frame averages (3 frames) in the ACF analyses. Consistently, our results indicate that the first time constant (t1) remains around 0.1-0.4 seconds, independent of the imaging rates (0.1 – 0.5 sec/frame), for AD between two adjacent actin protomers in F-actin bound with ADP or ADP.Pi (Table Supplement 5), which is similar to the range of t1 shown in Figure 3E. These significant experimental results support the notion that helical twists, the number of actin protomers per HHP, and MAD in bare F-actin segments are intrinsically dynamic and fluctuate around the mean values over time (see further in lines 264-270; 333-337; and 555-560). It should be noted that our original ACF analyses did not include the averaging of running frames, thus eliminating the possibility of low signal/noise ratio in our analysis, as shown in Figure 3E-F.
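
      The two operations discussed in this exchange, running frame averages and an autocorrelation function (ACF) of a time series, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' pipeline: the function names, the list-of-lists layout for line profiles, and the normalisation of the ACF by the series variance are all choices made for the example.

```python
def running_average(frames, window=3):
    """Average each line profile with its neighbours in time.

    frames: list of line profiles (equal-length lists of heights), one per frame.
    Returns one averaged profile per full window, so len(frames) - window + 1 profiles.
    """
    if window < 1 or window > len(frames):
        raise ValueError("window must be between 1 and the number of frames")
    n_points = len(frames[0])
    averaged = []
    for start in range(len(frames) - window + 1):
        chunk = frames[start:start + window]
        averaged.append([sum(f[i] for f in chunk) / window for i in range(n_points)])
    return averaged


def autocorrelation(series, max_lag):
    """Normalised autocorrelation of a 1D time series for lags 0..max_lag."""
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for lag in range(max_lag + 1):
        # Mean product of deviations separated by `lag` frames.
        cov = sum(dev[t] * dev[t + lag] for t in range(n - lag)) / (n - lag)
        acf.append(cov / var)
    return acf
```

      In this picture, a decay constant shorter than the frame interval shows up as an ACF that has already fallen substantially between lag 0 and lag 1, which is why the reviewer asks whether the drop reflects fast dynamics or frame-to-frame noise.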

      Discussions

      The authors suggest a strong link between the C-form of actin and the formation of a short pitch helix. However, Oda et al. (3) have demonstrated that the C-form is highly unstable in the absence of cofilin binding, casting doubt on the possibility of the C-form propagating without cofilin binding. Moreover, in one strand of the cofilactin, interactions between actin subunits are limited to those between the inner domains (ID-ID interactions), which are quite similar to the interactions observed in bare actin filaments. This similarity implies that ID-ID interactions alone are insufficient to determine the helical parameters, suggesting that the presence of cofilin is essential for the formation of the short pitch helix in the cofilactin filament. Thus, crossover repeats are not necessarily shortened even if the actin form is C-form.

      We have experimentally observed a shortened bare half helix adjacent to cofilin clusters on the PE side at high spatial resolution, comprising fewer protomers than normal half helices. Thus, we hypothesized that crossover repeats are shortened if the actin protomers in the bare half helix neighboring the cofilin cluster on the PE side resemble a C-actin structure. This assumption is further explained by referring to the C-actin structure in Figure 2 and Figure supplement 2. Even though the C-form, as suggested in Oda et al., 2019, is unstable, it intrinsically fluctuates around the mean value over time and adopts various conformations. A single PDB structure resolved by Cryo-EM through the ensembles of averaging structural images should be referenced as a single atomistic structure, one of many possible conformations, regardless of whether it is resolved by Cryo-EM, X-ray diffraction or crystallography, or NMR (see Figure 1, legend of Figure supplement 1).

      We highlight two main points regarding this issue: (1) The short helical pitch at the global scale is associated with the twisting of the OD at the local scale for individual protomers; (2) Actins in different nucleotide or cofilin bound states exhibit varying ranges, distributions, and spectra of both local OD twist and global helical pitch (Figure 1-2, Figure supplement 1-2). The first point underscores that the twist/untwist of the OD determines the shortness of the helical pitches, rather than the ID-ID interactions. The latter are more closely related to the global length of the filament. The minimal change in local ID-ID interactions results in an unchanged global length of actin filaments in both cofilin-bound and unbound cases (see pseudo AFM images in Figure supplement 2 for canonical actin filament and cofilactin segments with the same length, comprising 62 protomers). However, filament twists, as experimentally detected by EM, high-resolution interferometric scattering microscopy (iSCAT), HS-AFM, and in pseudo AFM, are variable (see lines 555-560) and independent of the ID-ID interactions.

      Narita (4) proposes that the facilitation of cofilin binding may occur through a shortening in the helix pitch, independent of a change to the C-form of actin. Furthermore, the dissociation of the D-loop from an adjacent actin subunit leads directly to the transition of actin to the G-form, which is considered the most stable configuration for the actin molecule (3).

      See also our explanation above. We have incorporated these points in the Discussion section. See lines 497-499; 510-511.

      Furthermore, our PCA analysis indicates that the transition from C-actin to G-actin necessitates the opening of the nucleotide cleft (resulting in a decrease in PC1) and is more readily achieved than the direct transition from F-actin to G-actin (which requires decreases in both PC1 and PC2). Whether this transition is directly triggered by the dissociation of the D-loop remains a topic for our future investigations. Our PCA analysis reveals that the D-loop is deeply buried within the core of the filament (Figure 2). Further experiments will be conducted to elucidate its roles.

      The mechanism by which the shortened pitch propagates remains a critical and unresolved issue. It appears that this propagation is not a result of the C-form's propagation but likely involves an unidentified mechanism. Identifying and understanding this mechanism represents an essential direction for future research.

      It's worth mentioning that our HS-AFM data and spatial ACF analysis lend support to a hypothesis suggesting that 2-4 bare actin protomers adjacent to cofilin clusters on the PE side adopt C-actin-like structures. Additionally, we have proposed several hypotheses aimed at better understanding the mechanisms driving the unidirectional binding and expansion of cofilin clusters toward the PE side. These hypotheses will require further examination in future experiments. Additional information can be found in lines 328-329; 344-351; and 416-430.

      (1) K. X. Ngo et al., Cofilin-induced unidirectional cooperative conformational changes in actin filaments revealed by high-speed atomic force microscopy. eLife 4, (2015).
      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).
      (3) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).
      (4) A. Narita, ADF/cofilin regulation from a structural viewpoint. Journal of muscle research and cell motility 41, 141-151 (2020).

      We have cited them accordingly in the paper.

      Reviewer #2 (Public Review):

      Summary:

      This study by Ngo et al. uses mostly high-speed AFM to estimate conformational changes within actin filaments, as they get decorated by cofilin. The authors build on their earlier study (Ngo et al. eLife 2015) where they used the same technique to monitor the expansion of cofilin clusters on actin filaments, and the propagation of the associated conformational changes in the filament (reduction of the helical pitch). Here, they propose a higher-resolution description of the binding of cofilin to actin filaments.

      Strengths:

      The high speed AFM technique used here is quite original to address this question, compared to classical light and electron microscopy techniques. It can certainly bring valuable information as it provides a high spatial resolution while monitoring live events. Also, in this paper, a nice effort was made to make the 3D structures and conformational changes clear and understandable.

      We are grateful for the positive feedback from Reviewer 2.

      Weaknesses:

      The paper also has a number of limitations, which I detail below.

      In addition to AFM, the authors also propose a Principal Component Analysis (PCA) of existing structural data on actin protomers. However, this part seems very similar to another published work by others (Oda et al. JMB 2019), which is not even cited.

      We addressed this issue and explained it in the Methods section, lines 612-621.

      The asymmetrical growth of cofilin clusters has so far only been seen using AFM, by the same authors (Ngo et al. eLife 2015). Using fluorescent microscopy, others have reported a very symmetrical expansion of cofilin clusters (Wioland et al. Curr Biol 2017). This is not mentioned at all, here. It should be discussed, and explanations for this discrepancy could be proposed.

      We have cited this paper (Wioland et al. Curr Biol 2017) in the current manuscript (see lines 361-362). However, we are unable to evaluate the technical distinctions between our methods and theirs. Instead, we have referred to a more recent paper that employed similar techniques to those used by Wioland et al. in Current Biology 2017. Our findings align with those reported by Bibeau JP et al. in the Journal of Molecular Biology 2021 (see their Results on page 7, titled “Cofilin clusters elongate preferentially towards the actin filament pointed end”). At a minimum, we believe this is appropriate.

      Regarding the AFM technique, I have the following concerns.

      The filaments appear densely packed on the surface, and even clearly in register in some images (if not most images, e.g., Figs 3A, 4BC, 5A). Why is that? Isn't there a risk that this could affect the result? This suggests there is some interaction between the filaments.

      In this study, as well as in many similar studies of actin filaments alone or in interaction with other actin binding proteins (ABPs) including cofilin, we have carefully considered the density of filaments when designing experiments. We used highly dense, but not packed, actin filaments to minimize free space between filaments and the surface, which helps maintain stable tip-scanning during AFM imaging. This strategy technically allows us to capture high spatial and temporal resolutions of actin filaments’ structures.

      The actin filaments, resemble paracrystal structures, are represented as densely packed actin filaments (see our data in Ngo and Kodera et al., eLife 2015, Figure 1C). Thus, the data presented in this paper is technically appropriate and does not risk misinterpretation due to lateral interactions impacting the structures and function of actin filaments and cofilin.

      The properties of the lipid layer and its interaction with the actin filaments are not clear at all. A poor control of these interactions is a problem if one aims to measure conformational changes at high resolution. The strength of the interaction appears tuned by the ratio of lipids put on the surface to change its electrostatic charge. A strong attachment likely does more than suppress torsional motion (as claimed in Fig 8A). It may also hinder cofilin binding in several ways (lower availability of binding sites on the filament facing the surface, electrostatic interactions between cofilin and the surface, etc.)

      We are confident that our lipid membrane bilayer is the optimal choice for immobilizing actin filaments in a controlled manner for HS-AFM experiments, achieved through the variation of positively charged lipids. In this study, we have fine-tuned the surface charge for our specific purposes.

      As an example, to capture high-spatial resolution images of actin structures (Figure 5-6, Figure supplement 5B, 6), we strongly fixed the filaments on DPPC/DPTAP (50/50 wt%) after the binding reaction between actin filaments and cofilin in solution was completed. This experiment yielded valuable information, including: (i) the ability to replicate the conformation of cofilactin and hybrid cofilactin/bare actin segments in solution, akin to the first steps in sample preparation for Cryo-EM techniques; and (ii) the capability to capture these structures, reflecting their solution states, by firmly fixing them on a lipid surface. On the lipid surface, these structures were retained stably during AFM imaging.

      If there is a choice, we advise against using amino-silane and other positively charged polymers typically used for modifying glass surfaces to fix actin filaments in studies using fluorescence microscopy. The strong immobilization by these chemicals can alter the structural dynamics and functions of actin filaments, lead to non-specific binding of cofilin on the modified glass surface, and potentially affect data interpretation.

      On a local scale, the reviewer may argue about the "lower availability of binding sites on the filament facing the surface". However, on a global scale, we maintain that two single strands forming helical twists of long F-actin segments should have an equal chance to bind cofilin even when fixed on a lipid membrane. The evidence shown in Figure 8A and Video 7, which demonstrates that small cofilin clusters associate and dissociate locally without developing into large clusters along the actin filament, supports our conclusion that flexibility and dynamics in helical twists play a crucial role in facilitating the binding and growth of cofilin clusters.

      The lipid surface utilized in our study with actin filaments and cofilin provides an ideal surface, as it is flat and minimizes the nonspecific binding of cofilin to the lipid membrane (see an example of the lipid surface in Video 5).

      How do we know that the variations over time are not mostly experimental noise, i.e. variations between repeats of the same measurement? As shown in Fig 3, correlation is mostly lost from one image to the next, and rather stable after that.

      This question is similar to the above question of Reviewer 1. Please also refer to our response in lines 264-270; 333-337; 555-560, measurement Methods, and Figure supplement 3 and Table supplement 5.

      The identification of cofilactin regions relies on the additional height of the "peaks", due to the presence of cofilin. It thus seems that cofilin is detected every half helical pitch (HHP), but not in between, thereby setting the resolution for the localization of cluster borders to one HHP. It thus seems difficult to claim that there is a change in helicity without cofilin decoration over this distance. In Fig 7, the change in helicity could be due to cofilin decoration that is undetected because cofilins have not yet reached the next peak.

      There are several important criteria to distinguish the "supertwisted half helix" in the cofilactin region from the "normal half helix". As illustrated in the pseudo AFM images constructed for normal F-actin and C-actin segments (with and without cofilin decoration) from PDB structures, it is evident that these two structures differ significantly in length and the number of protomer pairs per HHP (see Figure Supplement 2). In both pseudo and experimental AFM images, these parameters can be easily detected by measuring the distance between two cross-over points. Furthermore, the height or thickness difference between the cofilactin and bare actin regions is approximately 10-15 Å, which is well resolved by HS-AFM due to its exceptional z-axis resolution of ~1 Å. Technically, we were able to detect these differences by creating a longitudinal section profile that covered both bare actin and cofilactin areas, as shown in Figure supplement 6.

      We experimentally reveal that a critical cofilin cluster comprising 2-4 molecules (Figures 5-6) or larger cofilin clusters (Figures 7-8, Figure Supplements 6-8) could equally supertwist a bare half helix on the PE side. The observation that a small cofilin cluster (2-4 molecules) can shorten a half helix by reducing the number of protomers per HHP to 9 or 11 (4.5 or 5.5 protomer pairs), which typically requires full decoration by 9-11 cofilin molecules, strongly suggests that supertwisting or the change in helicity does not always require complete cofilin decoration. We predicted that 2-4 bare actin protomers neighboring a cofilin cluster on the PE side can adopt the C-actin-like structure. See further in lines 324-329.
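
      As a rough arithmetic check on these numbers, the sketch below converts a protomer count per HHP into protomer pairs and an approximate HHP length. It assumes an axial rise of about 2.75 nm per protomer that is roughly unchanged by supertwisting, and a canonical bare half helix of 13 protomers. Both values, and the function names, are illustrative assumptions rather than measurements from this study.

```python
# Assumption: approximate axial rise per actin protomer along the filament axis.
AXIAL_RISE_NM = 2.75

def protomer_pairs(protomers_per_hhp):
    """Number of protomer pairs in one half helical pitch (HHP)."""
    return protomers_per_hhp / 2

def approx_hhp_nm(protomers_per_hhp, rise_nm=AXIAL_RISE_NM):
    """Approximate HHP length, assuming a constant axial rise per protomer."""
    return protomers_per_hhp * rise_nm

# 13 is an assumed canonical value for a bare half helix;
# 9 and 11 are the supertwisted counts quoted in the response.
for n in (13, 11, 9):
    print(n, protomer_pairs(n), approx_hhp_nm(n))
```

      Under these assumptions, 9 and 11 protomers correspond to about 24.75 nm and 30.25 nm, compared with about 35.75 nm for 13 protomers, which is in line with treating HHPs shorter than roughly 30 nm as supertwisted.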

      Figure 7 captures a live binding event of cofilin at low spatial resolution, yet (i) the half helical pitches and (ii) the thickness of the cofilactin and bare actin segments can still be clearly distinguished. This demonstrates that changes in helicity within the cofilactin region propagate to an unbound half helix on the PE side, rearranging the helical twist by reducing the number of actin protomers per HHP, prior to recruiting additional cofilin for binding and expanding clusters.

      Reviewer #1 (Recommendations For The Authors):

      I believe C-form and G-form are better than C-actin like structure or G-actin like structure.

      We avoid using terms like "G-form", "F-form", or "C-form", as defined by Cryo-EM (Oda et al., 2019), because they refer to specific nucleotide and cofilin-bound states in other original papers. Instead, we use “G-actin”, “F-actin”, “C-actin”, “G-actin-like”, and “C-actin-like” to emphasize "Structural Dynamics" and "Structural Polymorphism". This highlights that even F-actin structures without cofilin bound can adopt "C-actin-like" conformations with fewer OD twists, resulting in a shorter global helical pitch. ADP-bound F-actins exhibit greater variability in helical twists than ADP-Pi-bound F-actin (Figure 9), indicating that ADP-bound F-actin protomers can adopt more C-actin-like conformations than ADP-Pi-bound F-actin protomers (Figure 1, Figure supplement 1).

      Technical terms describing actin structures do not need to be the same between Cryo-EM and HS-AFM, as the two techniques are fundamentally different. Our work underscores the importance of considering “structural dynamics and heterogeneity” in different nucleotide states of filamentous actin structures, both with and without cofilin, over time.

      Figure 1A

      A very similar analysis has already been performed by Oda et al (1). The authors should describe the relationships with the previous analysis.

      We addressed this issue in Methods – Principal component analysis – in lines 612-621.

      Figure 1B, C

      A very similar analysis has already been performed by Tanaka et al. (2). The authors should describe the relationship with the previous analysis.

      We addressed this issue in Methods – Principal component analysis – in lines 612-621 and legend of Figure 1.

      Lines 397-398

      "However, we noted that in rare instances, cofilin clusters also grew on both sides in the regular bare half helices when ATP or ADP was present."

      I believe other experiments also contain ATP in the solution. I could not catch the meaning of this sentence.

      We addressed this issue in the Results section, line 412. "However, we noted that in rare instances, cofilin clusters also grew on both sides in the regular bare half helices when only ADP was present."

      Additionally, we enhanced the description in the Methods section to avoid any confusion regarding nucleotides in the buffer. Please refer to the Methods section under “HS-AFM imaging”, lines 702-738.

      Lines 427-429

      "Consequently, the proportion of naturally supertwisted half helices with HHPs shorter than 30 nm was 5.8% for F-ADP-actin but only 1.1% and 0.2% for F-ADP.Pi-actin and phalloidin-stabilized F-actin, respectively."

      Similar discussion was made in (3) for the actin filaments with tension. It might be comparable with the current data.

      We cited Okura et al., 2023 accordingly (line 447).

      Lines 553-557

      "Nonetheless, it remains plausible that the structural flexibility exhibited by ADP-bound actin protomers could result in subtle variations in the conformations of the DNase binding loop (Dloop) G46-M47-G48-N49, as suggested in (Chou and Pollard, 2019). We suggest that the absence of bound Pi possibly increases the torsional flexibilities during helical twisting of ADP bound actin filaments in contrast to their ADP.Pi-bound counterparts."

      The crystal structure of the F-form (4) showed that Pi in ADP.Pi connects the two large domains of the actin molecule, stabilizing F-form. Pi release largely weakens the connection. This might be useful for the discussion.

      We incorporated this point with the suggested citation in lines 582-584.

      (1) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).

      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).

      (3) K. Okura et al., Mechanical Stress Decreases the Amplitude of Twisting and Bending Fluctuations of Actin Filaments. Journal of molecular biology 435, 168295 (2023).

      (4) Y. Kanematsu et al., Structures and mechanisms of actin ATP hydrolysis. Proceedings of the National Academy of Sciences of the United States of America 119, e2122641119 (2022).

      Reviewer #2 (Recommendations For The Authors):

      Line 190: "Noticeably, PCA analysis revealed higher structural flexibility in F-ADP-actin (red dots), exploring a larger space than F-ADP-Pi-actin structures (orange dots) within the F-actin cluster (inset in Figure 1A)". Is there a quantification to support this claim? Visually, things are not so clear.

      We have improved Figure 1 by adding 2 circles to an inset, providing clearer quantification to support our claim.

      In the PCA part: isn't it a bit obvious, or at least expected, that the conformation adopted by actin in the cofilactin structure is the most favorable one for binding cofilin?

      We agree with the reviewer on this point and have added it to the Results section, lines 202-204.

      I found it a bit unclear how the structures in Fig 2 were obtained.

      We further explained it by adding “Zoom-in views of these long filaments are shown in Figure 2” in Methods section, line 661.

      In the AFM images, the authors always seem to know the polarity of the filaments. Unless I missed it, how they know this is not explained. In their earlier work (Ngo et al. 2015) they used a subfragment of myosin II which indicates polarity when bound to F-actin. I found no such explanation here.

      We have addressed this issue in the legend of each figure accordingly.

      For clarity, I suggest writing "C-actin-like structures" (with two hyphens) rather than "C-actin like structures".

      We agree and are currently incorporating this change in the text.

      The term "cluster" in PCA can be confusing because it is used for cofilin clusters throughout the text.

      "Cluster" is a common term in PCA. To clarify, we revised the legends of Figure 1 and Figure Supplement 1, using "PCA clusters" to distinguish them from “cofilin clusters” or “F-actin clusters”.

      There are many acronyms. Readability of the figure legends (which can be consulted independently from the main text) would be improved if acronyms were spelled out there as well.

      We have spelled out some of the acronyms in the legend of each figure accordingly, which we believe is sufficient.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation. 

      Strengths: 

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1. 

      Weaknesses: 

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function). 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6, sentences highlighted in blue).

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data. 

      This section has been re-written for better clarity (see page 7). We note that this assay was originally developed and published by Lee, M. S., M. Henry, and P. A. Silver in their 1996 paper in G&D and has since been reported in numerous subsequent studies. Reassuringly, our conclusion is bolstered by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates. 

      P-bodies consist of both RNA and proteins (reviewed in doi: 10.1021/acs.biochem.7b01162). The significance of this experiment lies in its contribution to further confirming the co-localization of Sfp1 with mRNAs and Rpb4. This observation could also yield valuable insights for future investigations into the role of Sfp1.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1. 

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here. 

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we discuss our hypothesis that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, that in turn lead to the destabilization of mRNAs. See the blue paragraph on page 20.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable. 

      Half-lives were determined mainly by GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Using both methods, we discovered that disruption of Sfp1 results in substantial mRNA destabilization. Nevertheless, in our revised manuscript, we show results obtained by subjecting cells to a temperature shift to 42°C, a natural method to inhibit transcription. This approach to determining half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-lives. Admittedly, this assay determines half-lives under heat stress; it therefore demonstrates that, at least during heat shock, Sfp1 stabilizes mRNAs. Since the results are similar to those obtained by the GRO method at 30°C, we concluded that Sfp1 stabilizes mRNAs under both optimal and heat-shock conditions.

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below: 

      Comments on methodology and results: 

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids. 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6).

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated. 

      We agree with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can lead to non-specific effects. It is evident that nup49-313 does not prevent Sfp1 export to the cytoplasm. In the case of rpb1-1, these non-specific effects are expected due to transcriptional arrest, which can eventually result in a reduction in protein content. However, this process takes some time, while the impact on export is more rapid. It is worth noting that this assay was developed and previously published by Pam Silver (Henry and Silver G&D 1996) and has been reported in many subsequent papers. Importantly, our conclusion is supported by the observation that Sfp1 binds both nascent RNA (co-transcriptionally) and mature mRNA (cytoplasmic). These observations, along with the reduced mRNA export upon transcription blocking, are consistent with our proposal that Sfp1 is exported in association with mRNA.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow to determine whether Rpb4-RFP colocalises with GFP-Sfp1. 

      The submitted PDF figure is of low quality. We believe that the high-quality figure in the final submission is convincing.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The NON-CRAC+ selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We would like to thank Reviewer 2 for bringing this issue up, as it helped us to clarify it in the revised paper.

      First, we emphasized in the Discussion that many CRAC+ genes do not fall into the category of highly transcribed genes. Please see more detailed discussion below.

      Secondly, we examined various features of the 264 genes - classified as CRAC+ - to estimate their specificity and biological significance. Our various experiments revealed that the CRAC+ genes represent a distinct group with many unique features.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments, none of which is consistent with a technical artifact. In fact, all the experiments and analyses that we have pursued indicate the unique nature of the CRAC+ genes. Some examples are:

      (1) Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.

      (2) Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif located near the 3’ ends of the mRNAs.

      (3) Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), while the vast majority of RiBi non-CRAC+ promoters do not (Fig. 3C).

      (4) Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi non-CRAC+ mRNAs do not. Fig. 4B shows similar results due to Sfp1 depletion.

      (5) Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for non-CRAC+ genes. This is most clearly visible in RiBi genes.

      (6) Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for non-CRAC+ genes.

      (7) In Fig. S4B, the chromatin binding profile of Sfp1 is shown to be different for CRAC+ and non-CRAC+ genes.

      Taken together, these many unique features (indeed, every feature that we examined) indicate the specificity and significance of this group, demonstrating that our CRAC results are biologically significant.

      Most importantly, these genes do not all fall into the category of highly transcribed genes.  On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes behaves differently from the Q1 group. Evidently, despite the heterogeneous transcription of CRAC+ genes (as mentioned above), the Rpb4/Rpb3 profile decreases more substantially than that of the highly transcribed genes (Q1).  Moreover, despite similar expression levels among all RiBi mRNAs, only a portion of them binds Sfp1.

      Thus, all our results indicate that CRAC+ genes represent a biologically significant group, irrespective of the expression of its members. In response to this comment, we included a new paragraph discussing the validity of our conclusions. See page 18, blue paragraph.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B.  The results of Fig. 3 led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." We maintain our stance on this conclusion. Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels compared to "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to be more reliant on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it becomes problematic to directly compare HL values of a specific mRNA when different methods are employed. The superiority of one particular method over others remains unclear (in my opinion). However, each method is highly reliable when it comes to comparing different strains under identical conditions using a single method.

      Estimating HLs through the GRO approach is a non-invasive method, applied to optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, our experience over the years has reassured us that this approach is among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we supplemented the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determining HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). The new results are shown in Fig. S3B. They are consistent with our conclusion that Sfp1 stabilizes mRNAs.
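
      To illustrate how a half-life is commonly derived from a transcription shut-off time course (whether the block is thiolutin or a heat shift), the minimal sketch below assumes first-order decay and fits a straight line to log-transformed mRNA levels. The function name and the time-course values are hypothetical and are not data from the manuscript; the quantification procedure actually used in the study may differ.

```python
import math

def half_life_from_time_course(times_min, levels):
    """Estimate an mRNA half-life (minutes) from a shut-off time course.

    Assumes first-order decay, level(t) = level(0) * exp(-k * t), and fits
    ln(level) against time by ordinary least squares.
    """
    logs = [math.log(x) for x in levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k for a decaying transcript
    if slope >= 0:
        raise ValueError("levels do not decay; half-life is undefined")
    return math.log(2) / -slope

# Hypothetical relative mRNA levels after transcription is blocked at t = 0.
times = [0, 10, 20, 30]
wild_type = [100.0, 50.0, 25.0, 12.5]   # halves every 10 min
mutant = [100.0, 25.0, 6.25, 1.5625]    # halves every 5 min

print(half_life_from_time_course(times, wild_type))
print(half_life_from_time_course(times, mutant))
```

      Because both strains are processed with the same method and conditions, method-specific biases affect both estimates alike, which is why the comparison between strains is more robust than either absolute value.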

      Using a repressible promoter to determine mRNA HL is, unfortunately, not suitable in this paper because the promoter itself is involved in HL regulation. This observation is supported by Bregman et al. (2011) and depicted in Fig. 3, which illustrates that the promoter is critical for mRNA imprinting, consequently regulating HL.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in polII presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes compared to the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it's important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, we do not think that heightened sensitivity of RP mRNA degradation in response to stress is responsible for the pronounced difference in the configuration of the Pol II elongation complex that is detected in CRAC+ genes, mainly because this experiment was performed under standard (non-stress) culture conditions.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The correlations shown in these panels are dependent on Sfp1. Indeed, RP genes are sensitive to stress. However, we used non-stressed conditions. Furthermore, CRAC+ genes did not display any apparent unusual destabilization but rather exhibited higher (not lower) mRNA stability compared to non-CRAC+ genes (Figure 7C).

    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below:

      Comments on methodology and results:

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids.

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow one to determine whether Rpb4-RFP colocalises with GFP-Sfp1.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Strengths:

      - Diversity of experimental approaches used
      - Validation of large-scale results with appropriate reporters

      Weaknesses:

      - Choice of evaluation method to test mRNA half-life
      - Lack of controls for the CRAC results

    1. eLife assessment

      This useful manuscript presents an analysis of different factors that are required for release of the lipid-linked morphogen Shh from cellular membranes. The evidence is still incomplete, as experiments rely on over-expression of Shh in a single cell line and are sometimes of a correlative nature. The study, which otherwise confirms and extends previous findings, will be of interest to developmental biologists who work on Hedgehog signaling.

    2. Reviewer #1 (Public Review):

      This manuscript presents a model in which combined action of the transporter-like protein DISP and the sheddases ADAM10/17 promote shedding of a mono-cholesteroylated Sonic Hedgehog (SHH) species following cleavage of palmitate from the dually lipidated precursor ligand. The authors propose that this leads to transfer of the cholesterol-modified SHH to HDL for solubilization. The minimal requirement for SHH release by this mechanism is proposed to be the covalently linked cholesterol modification because DISP could promote transfer of a cholesteroylated mCherry reporter protein to serum HDL. The authors used an in vitro system to demonstrate dependency on DISP/SCUBE2 for release of the cholesterol modified ligand. These results confirm previously published results from other groups (PMC3387659 and PMC3682496).

      A strength of the work is the use of a bicistronic SHH-Hhat system to consistently generate dually-lipidated ligand to determine the quantity and lipidation status of SHH released into cell culture media.

      Key shortcomings include the unusual normalization strategies used for many experiments and the lack of quantification/statistical analyses for several experiments. Due to these omissions, it is difficult to conclude that the data justify the conclusions. The significance of the data provided is overstated because many of the presented experiments confirm/support previously published work. The study provides a modest advance in understanding of the complex issue of SHH membrane extraction.

    3. Reviewer #2 (Public Review):

      Ehring et al. analyze contributions of Dispatched, Scube2, serum lipoproteins and Sonic Hedgehog lipid modifications to the generation of different Shh release forms. Hedgehog proteins are anchored in cellular membranes by N-terminal palmitate and C-terminal cholesterol modifications, yet spread through tissues and are released into the circulation. How Hedgehog proteins can be released, and in which form, remains controversial. The authors systematically dissect contributions of several previously identified factors, and present evidence that Disp, Scube2 and lipoproteins concertedly act to release a novel Shh variant that is cholesterol-modified but not palmitoylated. The results provide new insights into the function of Disp and Scube2 in Hedgehog release. The findings concerning the function of lipoproteins and cholesterol in Hedgehog release are largely confirmatory (PMID 23554573, 20685986). However, in light of the multitude of competing models for Hedgehog release, the present study is a valuable contribution that provides further insights into the relevance of lipoproteins in this process.

      A novel and surprising finding of the present study is the differential removal of Shh N- or C-terminal lipid anchors depending on the presence of HDL and/or Disp. In particular, the identification of a non-palmitoylated but cholesterol-modified Shh variant that associates with lipoproteins is potentially important. The authors use RP-HPLC and defined controls to assess the properties of processed Shh forms, but their precise molecular identity remains to be defined. A caveat is the strong reliance on over-expression of Shh in a single cell line. The authors detect Shh variants that are released independently of Disp and Scube2 in secretion assays, which however are excluded from interpretation as experimental artifacts. Thus, it would be important to demonstrate key findings in cells that secrete Shh endogenously.

    1. Reviewer #2 (Public Review):

      In this work, Sarkar et al. investigated the potential ability of adenosine triphosphate (ATP) as a solubilizer of protein aggregates by combining MD simulations and ThT/TEM experiments. They explored how ATP influences the conformational behaviors of Trp-cage and β-amyloid Aβ40 proteins. Currently, there are no experiments in the literature supporting their simulation results of ATP on Trp-cage. The simulation protocol employed for the Aβ40 monomer system is conventional MD simulation, while REMD simulation (an enhanced sampling method) is used for the Aβ monomer + ATP system. It is not clear whether the conformational difference is caused by ATP or by the different simulation methods used. ThT/TEM experiments should be performed on Aβ40 fibrils rather than on Aβ(16-22) aggregates. Moreover, to elucidate their experimental results that ATP can dissolve preformed Aβ fibrils, the authors need to study the influence of ATP on Aβ fibrils instead of on Aβ dimer in their MD simulations. The novelty of this study is limited. The role of ATP in inhibiting Aβ fibril formation and dissolving preformed Aβ fibrils has been reported in previous experimental and computational studies (Journal of Alzheimer's Disease, 2014, 41: 561; Science, 2017, 356: 753-756; J. Phys. Chem. B, 2019, 123: 9922-9933; Scientific Reports, 2024, 14: 8134). However, most of those papers are not discussed in this manuscript. Additionally, some details of MD simulations and data analysis are missing in the manuscript, including the initial structures of all the simulations, the method for free energy calculation, the dielectric constant used, etc.

    2. eLife assessment

      The authors combined molecular dynamics simulations and experiments to study the role of ATP as a hydrotrope of protein aggregates. The topic is of major current interest and thus the study potentially makes a useful contribution to the community. In the current form, however, the level of evidence from the computation is considered incomplete, due to several issues such as limited convergence test, analysis, and the very high ATP concentration used in the simulation.

    3. Reviewer #1 (Public Review):

      Summary:

      This work combines molecular dynamics (MD) simulations along with experimental elucidation of the efficacy of ATP as a biological hydrotrope. While ATP is broadly known as the energy currency, it has also been suggested to modulate the stability of biomolecules and their aggregation propensity. In the computational part of the work, the authors demonstrate that ATP increases the population of the more expanded conformations (higher radius of gyration) in both a soluble folded mini-protein Trp-cage and an intrinsically disordered protein (IDP) Aβ40. Furthermore, ATP is shown to destabilise the pre-formed fibrillar structures using both simulation and experimental data (ThT assay and TEM images). They have also suggested that the biological hydrotrope ATP has significantly higher efficacy as compared to the commonly used chemical hydrotrope sodium xylene sulfonate (NaXS).

      Strengths:

      This work presents a comprehensive and compelling investigation of the effect of ATP on the conformational population of two types of proteins: globular/folded and IDP. The role of ATP as an "aggregate solubilizer" of pre-formed fibrils has been demonstrated using both simulation and experiments. They also elucidate the mechanism of action of ATP as a multi-purpose solubilizer in a protein-specific manner. Depending on the protein, it can interact through electrostatic interactions (for predominantly charged IDPs like Aβ40), or primarily through van der Waals interactions (for Trp-cage).

      Weaknesses:

      The data presented by the authors are sound and adequately support the conclusions drawn by the authors. However, there are a few points that could be discussed or elucidated further to broaden the scope of the conclusions drawn in this work as discussed below:

      (i) The concentration of ATP used in the simulations is significantly higher (500 mM) as compared to those used in the experiments (6-20 mM) or cellular cytoplasm (~5 mM as mentioned by the authors). Since the authors mention already known concentration dependence of the effect of ATP, it is worth clarifying the possible limitations and implications of the high ATP concentrations in the simulations. It seems ATP can stabilise the proteins at low concentrations, but the current work does not address this possible effect. It would be interesting to see whether the effect of ATP on globular proteins and IDPs remains similar even at lower ATP concentrations.

      (ii) The authors make a somewhat ambitious statement that the role of ATP as a solubilizer of pre-formed fibrils could be used as a therapeutic strategy in protein aggregation-related diseases. However, it is not clear how it would be so since ATP is a promiscuous substrate in several biochemical processes and any additional administration of ATP beyond normal cellular concentration (~5 mM) could be detrimental.

      (iii) A natural question arises about what is so special about ATP as a solubilizer. The authors have also asked this question but in a limited scope of comparing to a commonly used chemical hydrotrope NaXS. However, a bigger question would be what kind of chemical/physical features make ATP special? For example, (i) if the amphiphilic property is important, what about some standard surfactants? (ii) how would ATP compare to other nucleotides like ADP or GTP? It might be useful to explore such questions in the future to further establish the special role of ATP in this regard.

      (iv) In Figure 2F, it seems that in the presence of 0.5 M ATP, the Rg increases (as expected), but the number of native contacts remains almost similar. The reduction in the number of native contacts at higher ATP concentrations is not as dramatic as the increase in Rg. This is somewhat counterintuitive and should be looked into. Normally one would expect a monotonic reduction in the number of native contacts as the protein unfolds (increase in Rg).

    4. Reviewer #3 (Public Review):

      Summary:

      Since its first experimental report in 2017 (Patel et al. Science 2017), there have been several studies on the phenomenon in which ATP functions as a biological hydrotrope of protein aggregates. In this manuscript, by conducting molecular dynamics simulations of three different proteins, Trp-cage, Abeta40 monomer, and Abeta40 dimer at a high concentration of ATP (0.1, 0.5 M), Sarkar et al. find that the amphiphilic nature of ATP, arising from its molecular structure consisting of a phosphate group (PG), sugar ring, and aromatic base, enables it to interact with proteins in a protein-specific manner, to prevent their aggregation, and to solubilize them if they do aggregate. The authors also point out that in comparison with NaXS, which is the traditional chemical hydrotrope, ATP is more efficient in solubilizing protein aggregates because of its amphiphilic nature.

      Trp-cage, featured with a hydrophobic core in its native state, is denatured at high ATP concentration. The authors show that the aromatic base group (purine group) of ATP is responsible for inducing the denaturation of helical motifs in the native state.

      For Abeta40, which can be classified as an IDP with charged residues, it is shown that ATP disrupts the salt bridge (D23-K28) required for the stability of beta-turn formation.

      By showing that ATP can disassemble preformed protein oligomers (Abeta40 dimer), the authors argue that ATP is "potent enough to disassemble existing protein droplets, maintaining proper cellular homeostasis," and enhancing solubility.

      Overall, the message of the paper is clear and straightforward to follow. I did not follow all the literature, but I see in the literature search, that there are several studies on this subject. (J. Am. Chem. Soc. 2021, 143, 31, 11982-11993; J. Phys. Chem. B 2022, 126, 42, 8486-8494; J. Phys. Chem. B 2021, 125, 28, 7717-7731; J. Phys. Chem. B 2020, 124, 1, 210-223).

      If this study is indeed the first one to test using MD simulations whether ATP is a solubilizer of protein aggregates, it may deserve some attention from the community. But, the authors should definitely discuss the content of existing studies, and make it explicit what is new in this study.

      Strengths:

      The authors showed that due to its amphiphilic nature, ATP can interact with different proteins in a protein-specific manner, a finding more general and specific than merely calling ATP a biological hydrotrope.

      Weaknesses:

      (1) My only major concern is that the simulations were performed at unusually high ATP concentrations (100 and 500 mM of ATP), whereas the real cellular concentration of ATP is 1-5 mM. Even if ATP is a good solubilizer of protein aggregates, the actual concentration should matter. I was wondering if there is a previous report on a titration curve of protein aggregates against ATP, and what is the transition mid-point of ATP-induced solubility of protein aggregates.

      For instance, urea and GdmCl have long been known as non-specific denaturants of proteins, and it is well established experimentally that their transition mid-point of protein unfolding is ~(1 - 6) M depending on the protein.

      (2) The sentence "... a clear shift of relative population of Abeta40 conformational subensemble towards a basin with higher Rg and lower number of contacts in the presence of ATP" is not a precise description of Figures 4A and 4B. It is not clear from the figures whether the Rg of Abeta40 is increased when Abeta40 is subject to ATP. The authors should give a more precise description of what is observed in the result from their simulations or consider a better-order parameter to describe the change in molecular structure. In addition, the disruption of beta-sheet from Figure 4E to 4F is not very clear. The authors may want to use an arrow to indicate the region of the contact map associated with this change.

      Although the full atomistic simulations were carried out, the analyses demonstrated in this study are a bit rudimentary and coarse-grained (e.g, Rg is a rather poor order parameter to discuss dynamics involved in proteins). The authors could go beyond and say more about how ATP interacts with proteins and disrupts the stable configurations.

      (3) Although the amphiphilic character of ATP is highlighted, a similar comment can be made as to GTP. Is GTP, whose cellular concentration is ~0.5 mM, also a good solubilizer of protein aggregates? If not, why? Please comment.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Major comments (Public Reviews)

      Generality of grid cells

      We appreciate the reviewers’ concern regarding the generality of our approach, and in particular for analogies in nonlinear spaces. In that regard, there are at least two potential directions that could be pursued. One is to directly encode nonlinear structures (such as trees, rings, etc.) with grid cells, to which DPP-A could be applied as described in our model. The TEM model [1] suggests that grid cells in the medial entorhinal cortex may form a basis set that captures structural knowledge for such nonlinear spaces, such as social hierarchies and transitive inference when formalized as a connected graph. Another would be to use eigen-decomposition of the successor representation [2], a learnable predictive representation of possible future states that has been shown by Stachenfeld et al. [3] to provide an abstract structured representation of a space that is analogous to the grid cell code. This general-purpose mechanism could be applied to represent analogies in nonlinear spaces [4], for which there may not be a clear factorization in terms of grid cells (i.e., distinct frequencies and multiple phases within each frequency). Since the DPP-A mechanism, as we have described it, requires representations to be factored in this way it would need to be modified for such purpose. Either of these approaches, if successful, would allow our model to be extended to domains containing nonlinear forms of structure. To the extent that different coding schemes (i.e., basis sets) are needed for different forms of structure, the question of how these are identified and engaged for use in a given setting is clearly an important one that is not addressed by the current work. We imagine that this is likely subserved by monitoring and selection mechanisms proposed to underlie the capacity for selective attention and cognitive control [5], though the specific computational mechanisms that underlie this function remain an important direction for future research. We have added a discussion of these issues in Section 6 of the updated manuscript.
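      As a rough illustration of the second direction, the sketch below computes the successor representation in closed form for a toy environment and takes its eigen-decomposition. The ring-shaped state space, the unbiased random-walk policy, the discount factor, and all function names are assumptions chosen for illustration; they are not taken from the manuscript or from the cited papers.

```python
import numpy as np

def ring_random_walk(n):
    """Transition matrix of an unbiased random walk on a ring of n states (toy assumption)."""
    T = np.zeros((n, n))
    idx = np.arange(n)
    T[idx, (idx + 1) % n] = 0.5   # step clockwise
    T[idx, (idx - 1) % n] = 0.5   # step counter-clockwise
    return T

def successor_representation(T, gamma):
    """Closed-form SR: M = sum_t gamma^t T^t = (I - gamma * T)^-1."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

n_states, gamma = 20, 0.9          # illustrative values
T = ring_random_walk(n_states)
M = successor_representation(T, gamma)

# T is symmetric for this walk, so M is symmetric and eigh applies.
# eigh returns eigenvalues in ascending order: the last column is the
# constant leading eigenvector, and the remaining columns are sinusoids
# whose spatial frequency increases as the eigenvalue decreases.
eigvals, eigvecs = np.linalg.eigh(M)
```

      On this toy ring the non-constant eigenvectors are periodic functions of position, which is the sense in which an eigenbasis of the successor representation can resemble a multi-frequency periodic code.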

      (1) Whittington, J.C., Muller, T.H., Mark, S., Chen, G., Barry, C., Burgess, N. and Behrens, T.E., 2020. The Tolman-Eichenbaum machine: unifying space and relational memory through generalization in the hippocampal formation. Cell, 183(5), pp.1249-1263.

      (2) Dayan, P., 1993. Improving generalization for temporal difference learning: The successor representation. Neural computation, 5(4), pp.613-624.

      (3) Stachenfeld, K.L., Botvinick, M.M. and Gershman, S.J., 2017. The hippocampus as a predictive map. Nature neuroscience, 20(11), pp.1643-1653.

      (4) Frankland, S., Webb, T.W., Petrov, A.A., O'Reilly, R.C. and Cohen, J., 2019. Extracting and Utilizing Abstract, Structured Representations for Analogy. In CogSci (pp. 1766-1772).

      (5) Shenhav, A., Botvinick, M.M. and Cohen, J.D., 2013. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron, 79(2), pp.217-240.

      Biological plausibility of DPP-A

      We appreciate the reviewers’ interest in the biological plausibility of our model, and in particular the question of whether and how DPP-A might be implemented in a neural network. In that regard, Bozkurt et al. [1] recently proposed a biologically plausible neural network algorithm using a weighted similarity matrix approach to implement a determinant maximization criterion, which is the core idea underlying the objective function we use for DPP-A, suggesting that the DPP-A mechanism we describe may also be biologically plausible. This could be tested experimentally by exposing individuals (e.g., rodents or humans) to a task that requires consistent exposure to a subregion, and evaluating the distribution of activity over the grid cells. Our model predicts that high frequency grid cells should increase their firing rate more than low frequency cells, since the high frequency grid cells maximize the determinant of the covariance matrix of the grid cell embeddings. It is also worth noting that Frankland et al. [2] have suggested that the use of DPPs may also help explain a mutual exclusivity bias observed in human word learning and reasoning. While this is not direct evidence of biological plausibility, it is consistent with the idea that the human brain selects representations for processing that maximize the volume of the representational space, which can be achieved by maximizing the DPP-A objective function defined in Equation 6. We have added a comment to this effect in Section 6 of the updated manuscript.

      (1) Bozkurt, B., Pehlevan, C. and Erdogan, A., 2022. Biologically-plausible determinant maximization neural networks for blind separation of correlated sources. Advances in Neural Information Processing Systems, 35, pp.13704-13717.

      (2) Frankland, S. and Cohen, J., 2020. Determinantal Point Processes for Memory and Structured Inference. In CogSci.

      Simplicity of analogical problem and comparison to other models using this task

      First, we would like to point out that analogical reasoning is a signature feature of human cognition, which supports flexible and efficient adaptation to novel inputs that remains a challenge for most current neural network architectures. While humans can exhibit complex and sophisticated forms of analogical reasoning [1, 2, 3], here we focused on a relatively simple form that was inspired by Rumelhart’s parallelogram model of analogy [4,5], which has been used to explain traditional human verbal analogies (e.g., “king is to what as man is to woman?”). Our model, like that one, seeks to explain analogical reasoning in terms of the computation of simple Euclidean distances (i.e., A - B = C - D, where A, B, C, D are vectors in 2D space). We have now noted this in Section 2.1.1 of the updated manuscript. It is worth noting that, despite the seeming simplicity of this construction, we show that standard neural network architectures (e.g., LSTMs and transformers) struggle to generalize on such tasks without the use of the DPP-A mechanism.
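      The parallelogram relation A - B = C - D can be made concrete with a minimal sketch. The coordinates, the candidate set, and the function names below are hypothetical and serve only to illustrate the relation; they do not reproduce the task generator or the models used in the manuscript.

```python
import numpy as np

def complete_analogy(a, b, c):
    """Return the D that satisfies A - B = C - D, i.e. D = C - (A - B)."""
    return c - (a - b)

def pick_answer(a, b, c, candidates):
    """Index of the candidate closest (in Euclidean distance) to the predicted D."""
    d_pred = complete_analogy(a, b, c)
    dists = np.linalg.norm(candidates - d_pred, axis=1)
    return int(np.argmin(dists))

# Hypothetical 2D points for A, B, C and three candidate answers.
a = np.array([2.0, 3.0])
b = np.array([5.0, 7.0])
c = np.array([10.0, 1.0])
candidates = np.array([[13.0, 5.0], [7.0, -3.0], [10.0, 5.0]])

d_pred = complete_analogy(a, b, c)           # array([13., 5.])
choice = pick_answer(a, b, c, candidates)    # 0
```

      Because the relation depends only on the displacement between points, it is unchanged when all four points are translated by the same offset, which is what makes extrapolation to unseen regions of the space a meaningful test.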

      Second, we are not aware of any previous work other than Frankland et al. [6] cited in the first paragraph of Section 2.2.1, that has examined the capacity of neural network architectures to perform even this simple form of analogy. The models in that study were hardcoded to perform analogical reasoning, whereas we trained models to learn to perform analogies. That said, clearly a useful line of future work would be to scale our model further to deal with more complex forms of representation and analogical reasoning tasks [1,2,3]. We have noted this in Section 6 of the updated manuscript.

      (1) Holyoak, K.J., 2012. Analogy and relational reasoning. The Oxford handbook of thinking and reasoning, pp.234-259.

      (2) Webb, T., Fu, S., Bihl, T., Holyoak, K.J. and Lu, H., 2023. Zero-shot visual reasoning through probabilistic analogical mapping. Nature Communications, 14(1), p.5144.

      (3) Lu, H., Ichien, N. and Holyoak, K.J., 2022. Probabilistic analogical mapping with semantic relation networks. Psychological review.

      (4) Rumelhart, D.E. and Abrahamson, A.A., 1973. A model for analogical reasoning. Cognitive Psychology, 5(1), pp.1-28.

      (5) Mikolov, T., Chen, K., Corrado, G. and Dean, J., 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

      (6) Frankland, S., Webb, T.W., Petrov, A.A., O'Reilly, R.C. and Cohen, J., 2019. Extracting and Utilizing Abstract, Structured Representations for Analogy. In CogSci (pp. 1766-1772).

      Clarification of DPP-A attentional modulation

      We would like to clarify several concerns regarding the DPP-A attentional modulation. First, we would like to make it clear that ω is not meant to correspond to synaptic weights, and thank the reviewer for noting the possibility for confusion on this point. It is also distinct from a biasing input, which is often added to the product of the input features and weights. Rather, in our model ω is a vector, and diag(ω) converts it into a matrix with ω on the diagonal and zeros elsewhere. In Equation 6, diag(ω) is matrix multiplied with the covariance matrix V, which results in elementwise multiplication of ω with the column vectors of V, so that ω acts more like a set of gates. We have noted this in Section 2.2.2 and have changed all instances of “weights (ω)” to “gates (ɡ)” in the updated manuscript. We have also rewritten the definition of Equation 6 and uses of it (as in Algorithm 1) to depict the application of a sigmoid nonlinearity (σ) to the gates, so that the resulting values are always between 0 and 1.
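      The gating interpretation can be checked numerically. The sketch below only illustrates the diag(·) product and the sigmoid bound described above; it does not implement the full objective in Equation 6, and the array sizes, random data, and variable names are assumptions.

```python
import numpy as np

def sigmoid(z):
    """Logistic function; maps any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples of 4 grid cell embeddings.
X = rng.normal(size=(200, 4))
V = np.cov(X, rowvar=False)        # 4 x 4 covariance matrix

w = rng.normal(size=4)             # unconstrained parameters (assumed)
g = sigmoid(w)                     # gates, each strictly between 0 and 1

gated = np.diag(g) @ V             # matrix product with the diagonal gate matrix
elementwise = g[:, None] * V       # each column of V multiplied elementwise by g
```

      The two results agree, so the matrix product with diag(g) scales the entries of every column of V by the corresponding gate value rather than mixing them, which is why the parameters behave as gates and not as synaptic weights.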

      Second, we would like to clarify that we don’t compute the inner product between the gates ɡ and the grid cell embeddings x anywhere in our model. The gates within each frequency were optimized (independent of the task inputs), according to Equation 6, to compute the approximate maximum log determinant of the covariance matrix over the grid cell embeddings individually for each frequency. We then used the grid cell embeddings belonging to the frequency that had the maximum within-frequency log determinant for training the inference module, which always happened to be grid cells within the top three frequencies. Author response image 1 (also added to the Appendix, Section 7.10 of the updated manuscript) shows the approximate maximum log determinant (on the y-axis) for the different frequencies (on the x-axis).

      Author response image 1.

      Approximate maximum log determinant of the covariance matrix over the grid cell embeddings (y-axis) for each frequency (x-axis), obtained after maximizing Equation 6.

      Third, we would like to clarify our interpretation of why DPP-A identified grid cell embeddings corresponding to the highest spatial frequencies, and why this produced the best OOD generalization (i.e., extrapolation on our analogy tasks). It is because those grid cell embeddings exhibited greater variance over the training data than the lower frequency embeddings, while at the same time the correlations among those grid cell embeddings were lower than the correlations among the lower frequency grid cell embeddings. The determinant of the covariance matrix of the grid cell embeddings is maximized when the variances of the grid cell embeddings are high (they are “expressive”) and the correlation among the grid cell embeddings is low (they “cover the representational space”). As a result, the higher frequency grid cell embeddings more efficiently covered the representational space of the training data, allowing them to efficiently capture the same relational structure across training and test distributions which is required for OOD generalization. We have added some clarification to the second paragraph of Section 2.2.2 in the updated manuscript. Furthermore, to illustrate this graphically, Author response image 2 (added to the Appendix, Section 7.10 of the updated manuscript) shows the results after the summation of the multiplication of the grid cell embeddings over the 2d space of 1000x1000 locations, with their corresponding gates for 3 representative frequencies (left, middle and right panels showing results for the lowest, middle and highest grid cell frequencies, respectively, of the 9 used in the model), obtained after maximizing Equation 6 for each grid cell frequency. The color code indicates the responsiveness of the grid cells to different X and Y locations in the input space (lighter color corresponding to greater responsiveness). 
Note that the dark blue area (denoting regions of least responsiveness to any grid cell) is greatest for the lowest frequency and nearly zero for the highest frequency, illustrating that grid cell embeddings belonging to the highest frequency more efficiently cover the representational space which allows them to capture the same relational structure across training and test distributions as required for OOD generalization.
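The argument above (high variance plus low correlation gives a large covariance determinant) can be checked on a toy two-cell case. The sketch below is only an illustration: the function name and the variance and correlation values are hypothetical and are not taken from the manuscript. For two embeddings with variances `var_a`, `var_b` and correlation `corr`, the determinant is `var_a * var_b * (1 - corr**2)`.

```python
import math


def log_det_cov_2x2(var_a, var_b, corr):
    """Log determinant of a 2x2 covariance matrix.

    Built from two variances and one correlation coefficient:
    det = var_a * var_b - cov_ab**2, with cov_ab = corr * sqrt(var_a * var_b).
    """
    cov_ab = corr * math.sqrt(var_a * var_b)
    det = var_a * var_b - cov_ab ** 2
    return math.log(det)


# Hypothetical numbers chosen only to mimic the qualitative claim:
# "low frequency"  = small variance, strongly correlated embeddings
# "high frequency" = large variance, weakly correlated embeddings
low_freq_score = log_det_cov_2x2(var_a=0.1, var_b=0.1, corr=0.9)
high_freq_score = log_det_cov_2x2(var_a=1.0, var_b=1.0, corr=0.1)

print(low_freq_score, high_freq_score)
```

Under these assumed values the second pair scores higher, which is the ordering the response attributes to the highest-frequency grid cell embeddings.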

      Author response image 2.

      Each panel shows the results after summation of the multiplication of the grid cell embeddings over the 2d space of 1000x1000 locations, with their corresponding gates for a particular frequency, obtained after maximizing Equation 6 for each grid cell frequency. The left, middle, and right panels show results for the lowest, middle, and highest grid cell frequencies, respectively, of the 9 used in the model. Lighter color in each panel corresponds to greater responsiveness of grid cells at that particular location in the 2d space.

      Finally, we would like to clarify how the DPP-A attentional mechanism is different from the attentional mechanism in the transformer module, and why both are needed for strong OOD generalization. Use of the standard self-attention mechanism in transformers over the inputs (i.e., A, B, C, and D for the analogy task) in place of DPP-A would lead to weightings of grid cell embeddings over all frequencies and phases. The objective function for the DPP-A represents an inductive bias that selectively assigns the greatest weight to all grid cell embeddings (i.e., for all phases) of the frequency for which the determinant of the covariance matrix, computed over the training space, is greatest. The transformer inference module then attends over the inputs with the selected grid cell embeddings based on the DPP-A objective. We have added a discussion of this point in Section 6 of the updated manuscript.
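The selection step described above can be sketched as a hard gate over frequencies. This is a simplified illustration, not the authors' implementation: the model learns continuous gates by maximizing Equation 6, whereas the sketch below simply picks the frequency with the largest score. The function names, the frequency-major layout of the embedding, and the example scores are all assumptions.

```python
def frequency_gates(log_dets, n_phases):
    """Return one gate per grid cell, assuming a frequency-major layout.

    log_dets: one (hypothetical) log-determinant score per frequency.
    n_phases: number of phases (grid cells) per frequency.
    All phases of the best-scoring frequency get gate 1.0, the rest 0.0.
    """
    best = max(range(len(log_dets)), key=lambda f: log_dets[f])
    gates = []
    for f in range(len(log_dets)):
        gates.extend([1.0 if f == best else 0.0] * n_phases)
    return gates


def apply_gates(embedding, gates):
    """Element-wise product of a grid cell embedding with its gates."""
    return [x * g for x, g in zip(embedding, gates)]


# Toy case: 3 frequencies with 2 phases each (the model uses 9 x 100).
gates = frequency_gates([0.1, 0.5, 2.0], n_phases=2)
gated = apply_gates([0.3, 0.7, 0.2, 0.9, 0.4, 0.6], gates)
print(gates, gated)
```

The gated embedding is what a downstream inference module would then attend over; ordinary self-attention alone has no such built-in preference for a single frequency.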

      We would like to thank the reviewers for their recommendations. We have tried our best to incorporate them into our updated manuscript. Below we provide a detailed response to each of the recommendations grouped for each reviewer.

      Reviewer #1 (Recommendations for the authors)

      (1) It would be helpful to see some equations for R in the main text.

      We thank the reviewer for this suggestion. We have now added some equations explaining the working of R in Section 2.2.3 of the updated manuscript.

      (2) Typo: p 11 'alongwith' -> 'along with'

      We have changed all instances of ‘alongwith’ to ‘along with’ in the updated manuscript.

      (3) Presumably, this is related to equivariant ML - it would be helpful to comment on this.

      Yes, this is related to equivariant ML, since the properties of equivariance hold for our model. Specifically, the probability distribution after applying softmax remains the same when the transformation (translation or scaling) is applied to the scores for each of the answer choices obtained from the output of the inference module, and when the same transformation is applied to the stimuli for the task and all the answer choices before presenting as input to the inference module to obtain the scores. We have commented on this in Section 2.2.3 of the updated manuscript.

      Reviewer #2 (Recommendations for the authors)

      (1) Page 2 - "Webb et al." temporal context - they should also cite and compare this to work by Marc Howard on generalization based on multi-scale temporal context.

      While we appreciate the important contributions that have been made by Marc Howard and his colleagues to temporal coding and its role in episodic memory and hippocampal function, we would like to clarify that his temporal context model is unrelated to the temporal context normalization developed by Webb et al. (2020) and mentioned on Page 2. The former (Temporal Context Model) is a computational model that proposes a role for temporal coding in the functions of the medial temporal lobe in support of episodic recall and spatial navigation. The latter (temporal context normalization) is a normalization procedure proposed for use in training a neural network, similar to batch normalization [1], in which tensor normalization is applied over the temporal instead of the batch dimension, which is shown to help with OOD generalization. We apologize for any confusion engendered by the similarity of these terms and for our failure to clarify the difference between them, which we have now attempted to do in a footnote on Page 2.

      (1) Ioffe, S. and Szegedy, C., 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (pp. 448-456). PMLR.
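The normalization procedure described above (normalizing over the temporal rather than the batch dimension) can be sketched as follows. This is a minimal illustration under stated assumptions: it z-scores each feature across the time steps of a single sequence and omits any learned gain or shift parameters; the function names and the toy sequences are hypothetical.

```python
import math


def zscore(values, eps=1e-8):
    """Standardize a list of numbers to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def temporal_context_norm(sequence):
    """Normalize each feature over the time steps of one sequence.

    sequence: list of T time steps, each a list of D feature values.
    Batch normalization would instead pool statistics across examples.
    """
    columns = list(zip(*sequence))            # D columns of length T
    normed_columns = [zscore(col) for col in columns]
    return [list(step) for step in zip(*normed_columns)]


# Two sequences with the same internal relations but different offsets.
near = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
far = [[101.0, 202.0], [102.0, 204.0], [103.0, 206.0]]

near_normed = temporal_context_norm(near)
far_normed = temporal_context_norm(far)
print(near_normed)
print(far_normed)
```

Because the statistics are computed within each sequence, a shifted copy of a sequence normalizes to the same values, which gives some intuition for why such a procedure could help with OOD generalization.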

      (2) page 3 - "known to be implemented in entorhinal" - It's odd that they seem to avoid citing the actual biology papers on grid cells. They should cite more of the grid cell recording papers when they mention the entorhinal cortex (i.e. Hafting et al., 2005; Barry et al., 2007; Stensola et al., 2012; Giocomo et al., 2011; Brandon et al., 2011).

      We have now cited the references mentioned below, on page 3 after the phrase “known to be implemented in entorhinal cortex”.

      (1) Barry, C., Hayman, R., Burgess, N. and Jeffery, K.J., 2007. Experience-dependent rescaling of entorhinal grids. Nature neuroscience, 10(6), pp.682-684.

      (2) Stensola, H., Stensola, T., Solstad, T., Frøland, K., Moser, M.B. and Moser, E.I., 2012. The entorhinal grid map is discretized. Nature, 492(7427), pp.72-78.

      (3) Giocomo, L.M., Hussaini, S.A., Zheng, F., Kandel, E.R., Moser, M.B. and Moser, E.I., 2011. Grid cells use HCN1 channels for spatial scaling. Cell, 147(5), pp.1159-1170.

      (4) Brandon, M.P., Bogaard, A.R., Libby, C.P., Connerney, M.A., Gupta, K. and Hasselmo, M.E., 2011. Reduction of theta rhythm dissociates grid cell spatial periodicity from directional tuning. Science, 332(6029), pp.595-599.

      (3) To enhance the connection to biological systems, they should cite more of the experimental and modeling work on grid cell coding (for example on page 2 where they mention relational coding by grid cells). Currently, they tend to cite studies of grid cell relational representations that are very indirect in their relationship to grid cell recordings (i.e. indirect fMRI measures by Constantinescu et al., 2016 or the very abstract models by Whittington et al., 2020). They should cite more papers on actual neurophysiological recordings of grid cells that suggest relational/metric representations, and they should cite more of the previous modeling papers that have addressed relational representations. This could include work on using grid cell relational coding to guide spatial behavior (e.g. Erdem and Hasselmo, 2014; Bush, Barry, Manson, Burgess, 2015). This could also include other papers on the grid cell code beyond the paper by Wei et al., 2015 - they could also cite work on the efficiency of coding by Sreenivasan and Fiete and by Mathis, Herz, and Stemmler.

      We thank the reviewer for bringing the additional references to our attention. We have cited the references mentioned below on page 2 of the updated manuscript.

      (1) Erdem, U.M. and Hasselmo, M.E., 2014. A biologically inspired hierarchical goal directed navigation model. Journal of Physiology-Paris, 108(1), pp.28-37.

      (2) Sreenivasan, S. and Fiete, I., 2011. Grid cells generate an analog error-correcting code for singularly precise neural computation. Nature neuroscience, 14(10), pp.1330-1337.

      (3) Mathis, A., Herz, A.V. and Stemmler, M., 2012. Optimal population codes for space: grid cells outperform place cells. Neural computation, 24(9), pp.2280-2317.

      (4) Bush, D., Barry, C., Manson, D. and Burgess, N., 2015. Using grid cells for navigation. Neuron, 87(3), pp.507-520.

      (4) Page 3 - "Determinantal Point Processes (DPPs)" - it is rather annoying that DPP is defined after DPP-A is defined. There ought to be a spot where the definition of DPP-A is clearly stated in a single location.

      We agree it makes more sense to define Determinantal Point Process (DPP) before DPP-A. We have now rephrased the sentences accordingly. In the “Abstract”, the sentence now reads “Second, we propose an attentional mechanism that operates over the grid cell code using Determinantal Point Process (DPP), which we call DPP attention (DPP-A) - a transformation that ensures maximum sparseness in the coverage of that space.” We have also modified the second paragraph of the “Introduction”. The modified portion now reads “b) an attentional objective inspired from Determinantal Point Processes (DPPs), which are probabilistic models of repulsion arising in quantum physics [1], to attend to abstract representations that have maximum variance and minimum correlation among them, over the training data. We refer to this as DPP attention or DPP-A.” Due to this change, we removed the last sentence of the fifth paragraph of the “Introduction”.

      (1) Macchi, O., 1975. The coincidence approach to stochastic point processes. Advances in Applied Probability, 7(1), pp.83-122.

      (5) Page 3 - "the inference module R" - there should be some discussion about how this component using LSTM or transformers could relate to the function of actual brain regions interacting with entorhinal cortex. Or if there is no biological connection, they should state that this is not seen as a biological model and that only the grid cell code is considered biological.

      While we agree that the model is not intended to be specific about the biological implementation of the R module, we assume that, as a standard deep learning component, it is likely to map onto neocortical structures that interact with the entorhinal cortex and, in particular, regions of the prefrontal-posterior parietal network widely believed to be involved in abstract relational processes [1,2,3,4]. In particular, the role of the prefrontal cortex in the encoding and active maintenance of abstract information needed for task performance (such as rules and relations) has often been modeled using gated recurrent networks, such as LSTMs [5,6], and the posterior parietal cortex has long been known to support “maps” that may provide an important substrate for computing complex relations [4]. We have added some discussion about this in Section 2.2.3 of the updated manuscript.

      (1) Waltz, J.A., Knowlton, B.J., Holyoak, K.J., Boone, K.B., Mishkin, F.S., de Menezes Santos, M., Thomas, C.R. and Miller, B.L., 1999. A system for relational reasoning in human prefrontal cortex. Psychological science, 10(2), pp.119-125.

      (2) Christoff, K., Prabhakaran, V., Dorfman, J., Zhao, Z., Kroger, J.K., Holyoak, K.J. and Gabrieli, J.D., 2001. Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage, 14(5), pp.1136-1149.

      (3) Knowlton, B.J., Morrison, R.G., Hummel, J.E. and Holyoak, K.J., 2012. A neurocomputational system for relational reasoning. Trends in cognitive sciences, 16(7), pp.373-381.

      (4) Summerfield, C., Luyckx, F. and Sheahan, H., 2020. Structure learning and the posterior parietal cortex. Progress in neurobiology, 184, p.101717.

      (5) Frank, M.J., Loughry, B. and O’Reilly, R.C., 2001. Interactions between frontal cortex and basal ganglia in working memory: a computational model. Cognitive, Affective, & Behavioral Neuroscience, 1, pp.137-160.

      (6) Braver, T.S. and Cohen, J.D., 2000. On the control of control: The role of dopamine in regulating prefrontal function and working memory. Control of cognitive processes: Attention and performance XVIII, (2000).

      (6) Page 4 - "Learned weighting w" - it is somewhat confusing to use "w" as that is commonly used for synaptic weights, whereas I understand this to be an attentional modulation vector with the same dimensionality as the grid cell code. It seems more similar to a neural network bias input than a weight matrix.

      We refer to the first paragraph of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (7) Page 4 - "parameterization of w... by two loss functions over the training set." - I realize that this has been stated here, but to emphasize the significance to a naïve reader, I think they should emphasize that the learning is entirely focused on the initial training space, and there is NO training done in the test spaces. It's very impressive that the parameterization is allowing generalization to translated or scaled spaces without requiring ANY training on the translated or scaled spaces.

      We have added the sentence “Note that learning of parameter occurs only over the training space and is not further modified during testing (i.e. over the test spaces)” to the updated manuscript.

      (8) Page 4 - "The first," - This should be specific - "The first loss function"

      We have changed it to “The first loss function” in the updated manuscript.

      (9) Page 4 - The analogy task seems rather simplistic when first presented (i.e. just a spatial translation to different parts of a space, which has already been shown to work in simulations of spatial behavior such as Erdem and Hasselmo, 2014 or Bush, Barry, Manson, Burgess, 2015). To make the connection to analogy, they might provide a brief mention of how this relates to the analogy space created by word2vec applied to traditional human verbal analogies (i.e. king-man+woman=queen).

      We agree that the analogy task is simple, and recognize that grid cells can be used to navigate to different parts of space over which the test analogies are defined when those are explicitly specified, as shown by Erdem and Hasselmo (2014) and Bush, Barry, Manson, and Burgess (2015). However, for the analogy task, the appropriate set of grid cell embeddings must be identified that capture the same relational structure between training and test analogies to demonstrate strong OOD generalization, and that is achieved by the attentional mechanism DPP-A. As suggested by the reviewer’s comment, our analogy task is inspired by Rumelhart’s parallelogram model of analogy [1,2] (and therefore similar to traditional human verbal analogies) inasmuch as it involves differences (i.e., A - B = C - D, where A, B, C, D are vectors in 2D space). We have now noted this in Section 2.1.1 of the updated manuscript.

      (1) Rumelhart, D.E. and Abrahamson, A.A., 1973. A model for analogical reasoning. Cognitive Psychology, 5(1), pp.1-28.

      (2) Mikolov, T., Chen, K., Corrado, G. and Dean, J., 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
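The parallelogram relation A - B = C - D mentioned in the response can be made concrete with a short sketch. It solves the relation directly for D and picks the closest candidate; this is only an illustration of the task structure, not of the model, which scores answer choices with a learned inference module. The function names and example points are hypothetical.

```python
def complete_analogy(a, b, c):
    """Solve A - B = C - D for D, i.e. D = C - (A - B)."""
    return tuple(ci - (ai - bi) for ai, bi, ci in zip(a, b, c))


def pick_answer(a, b, c, candidates):
    """Index of the candidate closest (squared distance) to the exact D."""
    target = complete_analogy(a, b, c)
    return min(
        range(len(candidates)),
        key=lambda i: sum((x - t) ** 2 for x, t in zip(candidates[i], target)),
    )


# Hypothetical integer points in 2D space.
A, B, C = (3, 7), (5, 4), (10, 20)
D = complete_analogy(A, B, C)
choice = pick_answer(A, B, C, [(12, 16), (12, 17), (0, 0)])
print(D, choice)
```

The word2vec example raised by the reviewer (king - man + woman = queen) has the same algebraic form, with word vectors in place of 2D points.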

      (10) Page 5 - The variable "KM" is a bit confusing when it first appears. It would be good to re-iterate that K and M are separate points and KM is the vector between these points.

      We apologize for the confusion on this point. KM is meant to refer to an integer value, obtained by multiplying K and M, which is added to both dimensions of A, B, C and D, which are points in ℤ², to translate them to a different region of the space. K is an integer value ranging from 1 to 9 and M is also an integer value denoting the size of the training region, which in our implementation is 100. We have clarified this in Section 2.1.1 of the updated manuscript.
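The translation by KM described above can be sketched in a few lines. The values M = 100 and K from 1 to 9 come from the response; the function name and the example points are hypothetical.

```python
M = 100  # size of the training region, as stated in the response


def translate(points, k, m=M):
    """Add the integer offset k * m to both dimensions of every point."""
    offset = k * m
    return [(x + offset, y + offset) for x, y in points]


# A hypothetical analogy (A, B, C, D) inside the training region.
analogy = [(3, 7), (5, 4), (10, 20), (12, 17)]

# Test regions for K = 1..9; K = 2 shifts every point by 200.
shifted = translate(analogy, k=2)
print(shifted)
```

Since the same offset is added to all four points, the differences A - B and C - D are unchanged, so the relational structure of the analogy is preserved in the translated region.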

      (11) Page 5 - "two continuous dimensions (Constantinescu et al._)" - this ought to give credit to the original study showing the abstract six-fold rotational symmetry for spatial coding (Doeller, Barry and Burgess).

      We have now cited the original work by Doeller et al. [1] along with Constantinescu et al. (2016) in the updated manuscript after the phrase “two continuous dimensions” on page 5.

      (1) Doeller, C.F., Barry, C. and Burgess, N., 2010. Evidence for grid cells in a human memory network. Nature, 463(7281), pp.657-661.

      (12) Page 6 - Np=100. This is done later, but it would be clearer if they right away stated that Np*Nf=900 in this first presentation.

      We have now added this sentence after Np=100. “Hence Np*Nf=900, which denotes the number of grid cells.”

      (13) Page 6 - They provide theorem 2.1 on the determinant of the covariance matrix of the grid code, but they ought to cite this the first time this is mentioned.

      We have cited Gillenwater et al. (2012) before mentioning theorem 2.1. The sentence just before that reads “We use the following theorem from Gillenwater et al. (2012) to construct :”

      (14) Page 6 - It would greatly enhance the impact of the paper if they could give neuroscientists some sense of how the maximization of the determinant of the covariance matrix of the grid cell code could be implemented by a biological circuit. OR at least to show an example of the output of this algorithm when it is used as an inner product with the grid cell code. This would require plotting the grid cell code in the spatial domain rather than the 900 element vector.

      We refer to our response above to the topic “Biological plausibility of DPP-A” and second, third, and fourth paragraphs of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contain our responses to this issue.

      (15) Page 6 - "That encode higher spatial frequencies..." This seems intuitive, but it would be nice to give a more intuitive description of how this is related to the determinant of the covariance matrix.

      We refer to the third paragraph of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (16) Page 7 - log of both sides... Nf is number of frequencies... Would be good to mention here that they are referring to equation 6 which is only mentioned later in the paragraph.

      As suggested, we now refer to Equation 6 in the updated manuscript. The sentence now reads “This is achieved by maximizing the determinant of the covariance matrix over the within frequency grid cell embeddings of the training data, and Equation 6 is obtained by applying the log on both sides of Theorem 2.1, and in our case where refers to grid cells of a particular frequency.”

      (17) Page 7 - Equation 6 - They should discuss how this is proposed to be implemented in brain circuits.

      We refer to our response above to the topic “Biological plausibility of DPP-A” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (18) Page 9 - "egeneralize" - presumably this is a typo?

      Yes. We have corrected it to “generalize” in the updated manuscript.

      (19) Page 9 - "biologically plausible encoding scheme" - This is valid for the grid cell code, but they should be clear that this is not valid for other parts of the model, or specify how other parts of the model such as DPP-A could be biologically plausible.

      We refer to our response above to the topic “Biological plausibility of DPP-A” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (20) Page 12 - Figure 7 - comparison to one-hots or smoothed one-hots. The text should indicate whether the smoothed one-hots are similar to place cell coding. This is the most relevant comparison of coding for those knowledgeable about biological coding schemes.

      Yes, smoothed one-hots are similar to place cell coding. We now mention this in Section 5.3 of the updated manuscript.

      (21) Page 12 - They could compare to a broader range of potential biological coding schemes for the overall space. This could include using coding based on the boundary vector cell coding of the space, band cell coding (one dimensional input to grid cells), or egocentric boundary cell coding.

      We appreciate these useful suggestions, which we now mention as potentially valuable directions for future work in the second paragraph of Section 6 of the updated manuscript.

      (22) Page 13 - "transformers are particularly instructive" - They mention this as a useful comparison, but they might discuss further why a much better function is obtained when attention is applied to the system twice (once by DPP-A and then by a transformer in the inference module).

      We refer to the last paragraph of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (23) Page 13 - "Section 5.1 for analogy and Section 5.2 for arithmetic" - it would be clearer if they perhaps also mentioned the specific figures (Figure 4 and Figure 6) presenting the results for the transformer rather than the LSTM.

      We have now rephrased to also refer to the figures in the updated manuscript. The phrase now reads “a transformer (Figure 4 in Section 5.1 for analogy and Figure 6 in Section 5.2 for arithmetic tasks) failed to achieve the same level of OOD generalization as the network that used DPP-A.”

      (24) Page 14 - "statistics of the training data" - The most exciting feature of this paper is that learning during the training space analogies can so effectively generalize to other spaces based on the right attention DPP-A, but this is not really made intuitive. Again, they should illustrate the result of the xT w inner product to demonstrate why this works so effectively!

      We refer to the second, third, and fourth paragraphs of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (25) Bibliography - Silver et al., go paper - journal name "nature" should be capitalized. There are other journal titles that should be capitalized. Also, I believe eLife lists family names first.

      We have made the changes to the bibliography of the updated manuscript suggested by the reviewer.

    1. eLife assessment

      This important modeling work demonstrates out-of-distribution generalization using a grid cell coding scheme combined with Determinantal Point Process Attention. The simulations provide convincing evidence that the model improves generalization performance across several tasks. The generality of the approach is unclear, however, and there is limited comparison to relevant prior work.

    2. Reviewer #1 (Public Review):

      Summary:
      This paper presents a cognitive model of out-of-distribution generalisation, where the representational basis is grid-cell codes. In particular, the authors consider the tasks of analogies, addition, and multiplication, and the out-of-distribution tests are shifting or scaling the input domain. The authors utilise grid cell codes, which are multi-scale as well as translationally invariant due to their periodicity. To allow for domain adaptation, the authors use DPP-A which is, in this context, a mechanism of adapting to input scale changes. The authors present simulation results demonstrating that this model can perform out-of-distribution generalisation to input translations and re-scaling, whereas other models fail.

      Strengths:
      This paper makes the point it sets out to - that there are some underlying representational bases, like grid cells, that when combined with a domain adaptation mechanism, like DPP-A, can facilitate out-of-distribution generalisation. I don't have any issues with the technical details.

      Weaknesses:
      The paper does leave open the bigger questions of 1) how one learns a suitable representation basis in the first place, 2) how to have a domain adaptation mechanism that works in more general settings other than adapting to scale. Overall, I'm left wondering whether this model is really quite bespoke or whether there is something really general here. My comments below are trying to understand how general this approach is.

      COMMENTS
      This work relies on being able to map inputs into an appropriate representational space. The inputs were integers so it's easy enough to map them to grid locations. But how does this transfer to making analogies in other spaces? Do the inputs need to be mapped (potentially non-linearly) into a space where everything is linear? In general, what are the properties of the embedding space that allows the grid code to be suitable? It would be helpful to know just how much leg work an embedding model would have to do.

      It's natural that grid cells are great for domain shifts of translation, rescaling, and rotation, because they themselves are multi-scaled and are invariant to translations and rotations. But grid codes aren't going to be great for other types of domain shifts. Are the authors saying that to make analogies grid cells are all you need? If not then what else? And how does this representation get learned? Are there lots of these invariant codes hanging around? And if so how does the appropriate one get chosen for each situation? Some discussion of the points is necessary as otherwise, the model seems somewhat narrow in scope.

      For effective adaptation of scale, the authors needed to use DPP-A. Being that they are relating to brains using grid codes, what processes are implementing DPP-A? Presumably, a computational module that serves the role of DPP-A could be meta-learned? I.e. if they change their task set-up so it gets to see domain shifts in its training data an LSTM or transformer could learn to do this. The presented model comparisons feel a bit of a straw man.

      I couldn't see it explained exactly how R works.

    3. Reviewer #2 (Public Review):

      Summary:
      This paper presents a model of out-of-distribution (OOD) generalization that focuses on modeling an analogy task, in which translation or scaling is tested with training in one part of the space and testing in other areas of the space progressively more distant from the training location. Similar tests were performed on arithmetic including addition and multiplication, and similarly impressive results appear for addition but not multiplication. The authors show that a grid cell coding scheme helps performance on these analogy and arithmetic tasks, but the most dramatic increase in performance is provided by a complex algorithm for determinantal point process attention (DPP-A) based on maximizing the determinant of the covariance matrix of the grid embeddings.

      Strengths:
      The results appear quite impressive. The results for generalization appear quite dramatic when compared to other coding schemes (i.e. one-hot) or when compared to the performance when ablating the DPP-A component but retaining the same inference modules using LSTM or transformers. This appears to be an important result in terms of generalization of results in an analogy space.

      Weaknesses:
      There are a number of ways that its impact and connection to grid cells could be enhanced. From the neuroscience perspective, the major comments concern making a clearer and stronger connection to the actual literature on grid cells and grid cell modeling, and discussing the relationship of the complex DPP-A algorithm to biological circuits.

      Major comments:
      1. They should provide more citations to other groups that have explored analogy using this type of task. Currently, they only cite one paper (Webb et al., 2020) by their own group in their footnote 1 which used the same representation of behavioral tasks for generalization of analogy. It would be useful if they could cite other papers using this simplified representation of analogy and also show the best performance of other algorithms from other groups in their figures, so that there is a sense of how their results compare to the best previous algorithm by other groups in the field (or they can identify which of their comparison algorithms corresponds to the best of previously published work).

      2. While the grid code they use is very standard and based on grid cell researchers (Bicanski and Burgess, 2019), the rest of the algorithm doesn't have a clear claim on biological plausibility. It has become somewhat standard in the field to ignore the problem of how the brain could biologically implement the latest complex algorithm, but it would be useful if they at least mention the problem (or difficulty) of implementing DPP-A in a biological network. In particular, does maximizing the determinant of the covariance matrix of the grid code correspond to something that could be tested experimentally?

      3. Related to major comment 2., it would be very exciting if they could show what the grid code looks like after the attentional modulation inner product xT w has been implemented. This could be highly useful for experimental researchers trying to connect these theoretical simulation results to data. This would be most intuitive to grid cell researchers if it is plotted in the same format as actual biological experimental data - specifically which grid cell codes get strengthened the most (beyond just the highest frequencies).

      4. To enhance the connection to biological systems, they should cite more of the experimental and modeling work on grid cell coding (for example on page 2 where they mention relational coding by grid cells). Currently, they tend to cite studies of grid cell relational representations that are very indirect in their relationship to grid cell recordings (i.e. indirect fMRI measures by Constantinescu et al., 2016 or the very abstract models by Whittington et al., 2020). They should cite more papers on actual neurophysiological recordings of grid cells that suggest relational/metric representations, and they should cite more of the previous modeling papers that have addressed relational representations. This could include work on using grid cell relational coding to guide spatial behavior (e.g. Erdem and Hasselmo, 2014; Bush, Barry, Manson, Burgess, 2015). This could also include other papers on the grid cell code beyond the paper by Wei et al., 2015 - they could also cite work on the efficiency of coding by Sreenivasan and Fiete and by Mathis, Herz, and Stemmler.

    1. Reviewer #4 (Public Review):

      In this manuscript by Sha et al. the authors test the role of TNFα in modulating tumor regression/recurrence under therapeutic pressure from castration (or enzalutamide) in both in vitro and in vivo models of prostate cancer. Using the PTEN-null genetic mouse model, they compare the effect of a TNFα ligand trap, etanercept, at various points pre- and post-castration. Their most interesting findings from this experiment were that etanercept given 3 days prior to castration prevented tumor regression, which is a common phenotype seen in these models after castration, but etanercept given 1 day prior to castration prevented prostate cancer recurrence after castration. They go on to perform RNA sequencing on tumors isolated from either sham or castrate mice from two time points post-castration to study acute and delayed transcriptional responses to androgen deprivation. They found enrichment of gene sets containing TNF-targets which initially decrease post-castration but are elevated by 35 days, the time at which tumors recur. The authors conduct a similar set of experiments using human prostate cancer cell lines treated with the androgen receptor inhibitor enzalutamide and observe that drug treatment leads to cells with basal stem-like features that express high levels of TNF. They noticed that CCL2 levels correlate with changes in TNF levels raising the possibility that CCL2 might be a critical downstream effector for disease recurrence. To this end, they treated PTEN-null and hi-MYC castrated mice with a CCR2-antagonist (CCR2a), because CCR2 is one receptor of CCL2, and monitored tumor growth dynamics. Interestingly, upon treatment with CCR2a, tumors did not recur according to their measurements. They go on to demonstrate that the tumors pre-treated with CCR2a had reduced levels of putative TAMs and increased CTLs in the context of TNF or CCR2 inhibition providing a cellular context associated with disease regression. Lastly, they perform single-cell RNA sequencing to further characterize the tumor microenvironment post-castration and report that the ratio of CTLs to TAMs is lower in a recurrent tumor.

      While the concepts behind the study have merit, the data are incomplete and do not fully support the authors' conclusions. The authors' definition of recurrence is subjective given that the amount of disease regression after castration is both variable (Figure 8) and relatively limited, particularly in the PTEN loss model. Critical controls are missing. For example, both drug experiments were completed without treating non-castrate plus drug controls, which raises the question of how specific these findings are to castration resistance. No validation was performed to ensure that either the TNF ligand trap or the CCR2 antagonist was acting on target. The single-cell sequencing experiments were done without replicates, which raises concern about their interpretation. At a conceptual level, the authors say that a major cause of disease recurrence is the immunosuppressive TME, but provide little functional data that macrophages and T cells are directly responsible for this phenotype. Statistical analyses were performed on only select experiments. In summary, further work is recommended to support the conclusions of this story.

    2. eLife assessment

      This study presents a potentially valuable finding regarding the role of cytokine signaling in the mechanism of response and resistance to castration therapy in prostate cancer. The evidence, although solid for some aspects of the work, is incomplete and only partially supports the main claims.

    3. Reviewer #1 (Public Review):

      Summary:

      Sha K et al aimed at identifying the mechanism of response and resistance to castration in the Pten knockout GEM model. They found elevated levels of TNF in castrated tumors associated with an expansion of basal-like stem cells during recurrence, which they show occurring in prostate cancer cells in culture upon enzalutamide treatment. Further, the authors carry out a time-dependent analysis of the role of TNF in regression and recurrence to show that TNF regulates both processes. Similarly, CCL2, which the authors had proposed as a chemokine secreted upon TNF induction following enzalutamide treatment, is also shown to be elevated during recurrence and associated with the remodeling of an immunosuppressive microenvironment through depletion of T cells and recruitment of TAMs.

      Strengths:

      The paper exploits a well-established GEM model to interrogate mechanisms of response to standard-of-care treatment. This is of utmost importance since prostate cancer recurrence after ADT or ARSi marks the onset of an incurable disease stage for which limited treatments exist. The work is relevant in confirming that recurrent prostate cancer is mostly an immunologically "cold" tumor with an immunosuppressive microenvironment.

      Weaknesses:

      While the data are consistent and the conclusions are mostly supported and justified, the findings overall are incremental and of limited novelty. The role of TNF and NF-kB signaling in tumor progression and the role of the CCL2-CCR2 axis in shaping the immunosuppressive microenvironment are well established.

      On the other hand, it is unclear why the authors decided to focus on the basal compartment when there is a wealth of literature suggesting that luminal cells are, if not exclusively, surely one of the cells of origin of prostate cancer and responsible for recurrence upon antiandrogen treatment. As a result, most of the data shown later have to be taken with caution, as it is not known if the same phenomena occur in the luminal compartment.

    4. Reviewer #2 (Public Review):

      Summary:

      In this study, Sha and Zhang et al. reported that androgen deprivation therapy (ADT) induces a switch to a basal-stemness status, driven by the TNF-CCL2-CCR2 axis. Their results also reveal that enhanced CCL2 coincides with increased macrophages and decreased CD8 T cells, suggesting that ADT resistance may be related to the TNF/CCL2/CCR2-dependent immunosuppressive tumor microenvironment (TME). Overall, this is a very interesting study with a significant amount of data.

      Strengths:

      The strengths of the study include various clinically relevant models, cutting-edge technology (such as single-cell RNA-seq), translational potential (TNF and CCR2 inhibitors), and novel insights connecting stemness lineage switch to an immunosuppressive TME. Thus, I believe this work would be of significant interest to the field of prostate cancer and journal readership.

      Weaknesses:

      (1) One of the key conclusions/findings of this study is the ADT-induced basal-stemness lineage switch driving ADT resistance. However, most of the presented evidence supporting this conclusion only selects a couple of marker genes. What exacerbates this issue is that different basal-stemness markers were often selected with different results. For example, Figure S1A uses CD166/EZH2 as markers, while Figure S1B uses ITGb1/EZH2. In contrast, Figure 1D uses Sca1/CD49, and Figure 2B-C uses CD49/CD166. Since many basal-stemness lineage gene signatures have been previously established, the study should examine various basal-stemness gene signatures rather than a couple of selected markers. Moreover, why were none of the stemness/basal-gene signatures significantly changed in the GO enrichment analysis in Figure 6A/B?

      (2) A related weakness is the lack of functional results supporting the stemness lineage switch. Although the authors present colony formation assay results, these could be influenced simply by promoted cell proliferation, which is not a convincing indicator of stemness. To support this key conclusion, widely accepted stemness assays, such as the prostasphere formation assay (in vitro) and Extreme Limiting Dilution Analysis (ELDA) xenograft assay (in vivo), should be carried out.

      (3) Another significant concern is that, in many key results, this study uses co-occurrence to argue for a causal relationship, which is an entirely different thing. For example, Figure S4A and S4B only show increased CCL2 and TNF secretion simultaneously, which cannot support that CCL2 is dependent on TNF. Similarly, Figure 5A only shows that CCL2 increased coincidentally with a rise in TNF, which cannot support a causal relationship. To support the causal relationship of this conclusion, it is necessary to show that TNF-KO/KD would abolish the increased CCL2 secretion.

      (4) Some of the selective data presentations are not explained and are difficult to understand. For example, why does CD49 staining in Figure S3A have data for all four time points, while CD166 in Figure S3D only has data for the last time point (day 21)? Similarly, although several TNF_UP gene signatures were highlighted in Figure 4B, several TNF_DN signatures were also enriched in the same table, such as RUAN_RESPONSE_TO_TNF_DN. What is the explanation for these contrasting results?

    5. Reviewer #3 (Public Review):

      Summary:

      The current manuscript evaluates the role of TNF in promoting AR targeted therapy regression and subsequent resistance through CCL2 and TAMs. The current evidence supports a correlative role for TNF in promoting cancer cell progression following AR inhibition. Weaknesses include a lack of descriptive methodology for the pre-clinical GEM model experiments, and it is not well defined which cell types are impacted in this pre-clinical model, which will be quite heterogeneous with regard to cancer, normal, and microenvironment cells.

      Strengths:

      (1) Appropriate use of pre-clinical models and GEM models to address the scientific questions.

      (2) Novel finding of TNF and interplay of TAMs in promoting cancer cell progression following AR inhibition.

      (3) Potential for developing novel therapeutic strategies to overcome resistance to AR blockade.

      Weaknesses:

      (1) There is a lack of description regarding the GEM model experiments - the age at which mice experiments are started.

      (2) Tumor volume measurements are provided but in this context, there is no discussion on how the mixed cancer and normal epithelial and microenvironment is impacted by AR therapy which could lead to the subtle changes in tumor volume.

      (3) There are no readouts for target inhibition across the therapeutic pre-clinical trials or dosing time courses.

      (4) The terminology of regression and resistance appears arbitrary. The data seems to demonstrate a persistence of significant disease that progresses, rather than a robust response with minimal residual disease that recurs within the primary tumor.

      (5) It is unclear if the increase in basal-like stem cells is from normal basal cells or cancer cells with a basal stem-like property.

      (6) In the Hi-MYC model, MYC expression is regulated by AR inhibition and is profoundly ARi responsive at early time points.

    1. eLife assessment

      This important work introduces a method to express fluorogenic DNA aptamers in E. coli, paving the way for genetically encoded fluorescent DNA. The evidence supporting the conclusions is solid, consisting of comparisons of the aptamer's activity in vitro and within bacterial cells. The advancement described in this study is likely to become a standard technique in the DNA aptamer field, and the work will be of interest and utility to researchers in synthetic biology, molecular imaging, and bacterial genetics.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors use an interesting expression system called a retron to express single-stranded DNA aptamers. Expressing DNA as a single-stranded sequence is very hard - DNA is naturally double-stranded. However, the successful demonstration by the authors of expressing Lettuce, which is a fluorogenic DNA aptamer, allowed visual demonstration of both expression and folding. This method will likely be the main method for expressing and testing DNA aptamers of all kinds, including fluorogenic aptamers like Lettuce and future variants/alternatives.

      Strengths:

      This has an overall simplicity which will lead to ready adoption. I am very excited about this work. People will be able to express other fluorogenic aptamers or DNA aptamers tagged with Lettuce with this system.

      Weaknesses:

      Several things are not addressed/shown:

      (1) How stable are these DNA in cells? Half-life?

      (2) What concentration do they achieve in cells/copy numbers? This is important since it relates to the total fluorescence output and, if the aptamer is meant to bind a protein, it will reveal if the copy number is sufficient to stoichiometrically bind target proteins. Perhaps the gels could have standards with known amounts in order to get exact amounts of aptamer expression per cell?

      (3) Microscopic images of the fluorescent E. coli - why are these not shown (unless I missed them)? It would be good to see that cells are fluorescent rather than just showing flow sorting data.

      (4) I would appreciate a better Figure 1 to show all the intermediate steps in the RNA processing, the subsequent beginning of the RT step, and then the final production of the ssDNA. I did not understand all the processing steps that lead to the final product, and the role of the 2'OH.

      (5) I would like a better understanding or a protocol for choosing insertion sites into MSD for other aptamers - people will need simple instructions.

      (6) Can the gels be stained with DFHBI/other dyes to see the Lettuce as has been done for fluorogenic RNAs?

      (7) Sometimes FLAPs are called fluorogenic RNA aptamers - it might be good to mention both terms initially since some people use fluorogenic aptamer as their search term.

      (8) What E. coli strains are compatible with this retron system?

      (9) What steps would be needed to use in mammalian cells?

      (10) Is the conjugated RNA stable and does it degrade to leave just the DNA aptamer?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript explores a DNA fluorescent light-up aptamer (FLAP) with the specific goal of comparing activity in vitro to that in bacterial cells. In order to achieve expression in bacteria, the authors devise an expression strategy based on retrons and test four different constructs with the aptamer inserted at different points in the retron scaffold. They only observe binding for one scaffold in vitro, but achieve fluorescence enhancement for all four scaffolds in bacterial cells. These results demonstrate that aptamer performance can be very different in these two contexts.

      Strengths:

      -Given the importance of FLAPs for use in cellular imaging and the fact that these are typically evolved in vitro, understanding the difference in performance between a buffer and a cellular environment is an important research question.

      -The retron strategy utilized by the authors is thoughtful and well-described.

      -The observation that some aptamers fail to show binding in vitro but do show enhancement in cells is interesting and surprising.

      Weaknesses:

      -This study hints toward an interesting observation, but would benefit from greater depth to more fully understand this phenomenon. Particularly challenging is that FLAP performance is measured in vitro by affinity and in cells by enhancement, and these may not be directly proportional. For example, it may be that some constructs have much lower affinity but a greater enhancement and this is the explanation for the seemingly different performance.

      -The authors only test enhancement at one concentration of fluorophore in cells (and this experimental detail is difficult to find and would be helpful to include in the figure legend). This limits the conclusions that can be drawn from the data and limits utility for other researchers aiming to use these constructs.

      -The FLAP that is used seems to have a relatively low fluorescence enhancement of only 2-3 fold in cells. It would be interesting to know if this is also the case in vitro. This is lower than typical FLAPs and it would be helpful for the authors to comment on what level of enhancement is needed for the FLAP to be of practical use for cellular imaging.

    1. eLife assessment

      The authors have developed a valuable approach that employs cell-free expression to reconstitute ion channels into giant unilamellar vesicles for biophysical characterisation. The work is solid and will be of particular interest to those studying ion channels that primarily occur in organelles and are therefore not amenable to be studied by more traditional methods.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors have developed a valuable method based on a fully cell-free system to express a channel protein and integrate it into a membrane vesicle in order to characterize it biophysically. The study presents a useful alternative to study channels that are not amenable to being studied by more traditional methods.

      Strengths:

      The evidence supporting the claims of the authors is solid and convincing. The method will be of interest to researchers working on ionic channels, allowing them to study a wide range of ion channel functions such as those involved in transport, interaction with lipids, or pharmacology.

      Weaknesses:

      The inclusion of a mechanistic interpretation of how the channel protein folds into a protomer or a tetramer to become functional in the membrane would strengthen the study.

    3. Reviewer #2 (Public Review):

      It is challenging to study the biophysical properties of organelle channels using conventional electrophysiology. The conventional reconstitution methods require multiple steps and can be contaminated by endogenous ionophores from the host cell lines after purification. To overcome this challenge, in this manuscript, Larmore et al. described a fully synthetic method to assay the functional properties of the TRPP channel family. The TRPP channels are an important organelle ion channel family that natively traffics to primary cilia and ER organelles. The authors utilized cell-free protein expression and reconstitution of the synthetic channel protein into giant unilamellar vesicles (GUVs), so that single-channel properties can be measured using voltage-clamp electrophysiology. Using this innovative method, the authors characterized the channels' membrane integration, orientation, and conductance, comparing the results to those of endogenous channels. The manuscript is well-written and may be of broad interest to the ion channel community studying organelle ion channels. In particular, because of the challenges of patching native ciliated cells, such functional characterization is highly concentrated in very few labs. This method may provide an alternative approach to investigate other channels resistant to biophysical analysis and pharmacological characterization.

    1. eLife assessment

      In this valuable study, Huffer et al posit that non-cold sensing members of the TRPM subfamily of ion channels (e.g., TRPM2, TRPM4, TRPM5) contain a binding pocket for icilin that overlaps with the one found in the cold-activated TRPM8 channel. By examining a body of TRP channel cryo-EM structures to identify the conserved site, this study presents convincing electrophysiological evidence supporting the identification of an icilin binding pocket within TRPM4. This study shows that icilin has modulatory effects on the TRPM4 channel and will be of direct interest to those working in the TRP-channel field, but it also has implications for studies of somatosensation, taste, as well as pharmacological targeting of the TRPM subfamily.

    2. Reviewer #1 (Public Review):

      In this important study, Huffer et al posit that non-cold sensing members of the TRPM subfamily of ion channels (e.g., TRPM2, TRPM4, TRPM5) contain a binding pocket for icilin which overlaps with the one found in the cold-activated TRPM8 channel.

      The authors identify the residues involved in icilin binding by analyzing the existing TRPM8-icilin complex structures and then use their previously published approach of structure-based sequence comparison to compare the icilin binding residues in TRPM8 to other TRPM channels. This approach uncovered that the residues are conserved in a number of TRPM members: TRPM2, TRPM4, and TRPM5. The authors focus on TRPM4, with the rationale that it has the simplest activation properties (a single Ca2+-binding site). Electrophysiological studies show that icilin by itself does not activate TRPM4, but it strongly potentiates the Ca2+ activation of TRPM4, and introducing the A867G mutation (the mutation that renders avian TRPM8 sensitive to icilin) further increases the potentiating effects of the compound. Conversely, the mutation of a residue that likely directly interacts with icilin in the binding pocket, R901H, results in channels whose Ca2+ sensitivity is not potentiated by icilin.

      The data indicate that, just like in TRPV channels, the binding pockets and allosteric networks might be conserved in the TRPM subfamily.

      The data are convincing, and the authors employ good experimental controls.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors set out to study whether the cooling agent binding site in TRPM8, which is located between the S1-S4 and the TRP domain, is conserved within the TRPM family of ion channels. They specifically chose the TRPM4 channel as the model system, which is directly activated by intracellular Ca2+. Using electrophysiology, the authors characterized and compared the Ca2+ sensitivity and the voltage dependence of TRPM4 channels in the absence and presence of the synthetic cooling agonist icilin. They also analyzed the mutational effects of residues (A867G and R901H; equivalent mutations in TRPM8 were shown to be involved in icilin sensitivity) on Ca2+ sensitivity and voltage-dependence of TRPM4 in the absence and presence of Ca2+. Based on the results as well as structure/sequence alignment, the authors concluded that icilin likely binds to the same pocket in TRPM4 and suggested that this cooling agonist binding pocket is conserved in TRPM channels.

      Strengths:

      The authors gave a very thorough introduction to the TRPM channels. They have nicely characterized the Ca2+ sensitivity and the voltage-dependence of TRPM4 channels and demonstrated icilin potentiates the Ca2+ sensitivity and diminishes the outward rectification of TRPM4. These results indicate icilin modulates TRPM4 activation by Ca2+.

      Weaknesses:

      The reviewer has a few concerns. First, icilin alone (at 25µM) and in the absence of Ca2+ does not activate the TRPM4 channel. Have the authors titrated a wide range of icilin concentrations (without Ca2+ present) for TRPM4 activation? This raises the question of whether icilin is indeed an agonist for the TRPM4 channel. This has not been tested, so it is unclear. One may argue that icilin needs Ca2+ as a co-factor for channel activation, just as in the TRPM8 channel. This leads to the second concern, which is a complication in the experimental design and data interpretation. TRPM4 itself requires Ca2+ for activation to begin with, thus it is hard to dissect whether the current observed here for TRPM4 is activated by Ca2+ or by icilin plus its cofactor Ca2+. This is the difference between TRPM8 and TRPM4: TRPM8 itself is not activated by Ca2+, thus TRPM8 activation is through icilin, and Ca2+ acts as a prerequisite for icilin activation.

      The results presented in this study are only sufficient to show that icilin modulates the Ca2+-dependent activation of TRPM4 and icilin at best may act as an allosteric modulator for TRPM4 function. One cannot conclude from the current work that icilin is an agonist or even specifically a cooling agonist for TRPM4. Icilin is a cooling agonist for TRPM8, but it does not mean that if icilin modulates TRPM4 activity then it serves as a cooling agonist for TRPM4.

      For the mutation data on A867G, Figure 4A-B, left panels, it looks like A867G has stronger Ca2+ sensitivity compared to the WT in the absence of icilin and the onset of current activation is faster than in the WT, or this may simply be because the scales of the figure panels differ between A867G and the WT. Overall, the mutagenesis data are too weak to support the conclusion that icilin binds to the S1-S4 pocket. The authors need to mutate more residues that are involved in direct interaction with icilin based on the available structural information, including but not limited to residues equivalent to Y745 and H845 in human TRPM8.

      The authors set out to study the conservation of the cooling agonist binding site in the TRPM family, but only tested a synthetic cooling agonist, icilin, on TRPM4. In order to draw a broad conclusion as the title and the discussion have claimed, the authors need to test more cooling compounds, including the most well-known natural cooling agonist menthol, and other cooling agonists such as WS-12 and/or C3, and test their effects on several TRPM channels, not just TRPM4. With the current data, the authors need to significantly tone down the claim of a conserved cooling agonist binding pocket in the TRPM family.

      On page 11, the authors suggest, based on the current data, that TRPM2 and TRPM5 may also be sensitive to cooling agonists because the key residues are conserved. TRPM2 is the closest homolog to TRPM8 but is menthol-insensitive. There are studies that attempted to confer menthol sensitivity on TRPM2; for example, Bandell 2006 attempted to introduce S2 and TRP domains from TRPM8 into TRPM2 but failed to make TRPM2 a menthol-sensitive channel. The sequence conservation or structural similarity is not sufficient for the authors to suggest a shared cooling agonist sensitivity or even a common binding site in the TRPM2 and TRPM5 channels. Again, as pointed out above, the authors need to establish the actual activation of other TRPM channels by these agonists first, before proceeding to functionally probe whether other TRPM channels adopt a conserved agonist binding site.

      Taken together, this current work presents data to show the modulatory effects of icilin on the Ca2+ dependent activation and voltage dependence of the TRPM4 channel.

    4. Reviewer #3 (Public Review):

      Summary:

      The family of transient receptor potential (TRP) channels are tetrameric cation selective channels that are modulated by a variety of stimuli, most notably temperature. In particular, the Transient receptor potential Melastatin subfamily member 8 (TRPM8) is activated by noxious cold and other cooling agents such as menthol and icilin and participates in cold somatosensation in humans. The abundance of TRP channel structural data that has been published in the past decade demonstrates clear architectural conservation within the ion channel family. This suggests the potential for unifying mechanisms of gating despite their varied modes of regulation, which are not yet understood. To address this question, the authors examine the 264 structures of TRP channels determined to date and observe a potential binding pocket for icilin in multiple members of the Melastatin subfamily, TRPM2, TRPM4, and TRPM5. Interestingly, none of the other Melastatin subfamily members had been shown to be sensitive to icilin apart from TRPM8. Each of these channels is activated by intracellular calcium (Ca2+) and a Ca2+ binding site neighbors the predicted pocket for icilin binding in all cryo-EM structures. The authors examined whether icilin could modulate the activation of TRPM4 in the presence of intracellular Ca2+. The addition of icilin enhances Ca2+-dependent activation of TRPM4, promotes channel opening at negative membrane potentials, and improves the kinetics of opening. Furthermore, mutagenesis of TRPM4 residues within the putative icilin binding pocket predicted to enhance or diminish TRPM4 activity elicit these behaviors. Overall, this study furthers our understanding of the Melastatin subfamily of TRP channel gating and demonstrates that a conserved binding pocket observed between TRPM4 and TRPM8 channel structures can function similarly to regulate channel gating.

      Strengths:

      This is a simple and elegant study capitalizing on a vast amount of high-resolution structural information from the TRP family of ion channels to identify a conserved binding pocket that was previously unknown in the Melastatin subfamily, which is interrogated by the authors through careful electrophysiology and mutagenesis studies.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    1. eLife assessment

      This fundamental work provides new mechanistic insight into the regulation of PDGF signaling through splicing controls. The evidence is compelling to demonstrate the involvement of Srsf3, an RNA-binding protein, in this new mechanism. The work will be of broad interest to developmental biologists in general and molecular biologists/biochemists in the field of growth factor signaling and RNA splicing.

    2. Reviewer #1 (Public Review):

      In their manuscript "PDGFRa signaling regulates Srsf3 transcript binding to affect PI3K signaling and endosomal trafficking" Forman and colleagues use iMEPM cells to characterize the effects of PDGF signaling on alternative splicing. They first perform RNA-seq using a one-hour stimulation with Pdgf-AA in control and Srsf3 knockdown cells. While Srsf3 manipulation results in a sizeable number of DE genes, PDGF does not. They then turn to examine alternative splicing, prompted by previous findings from this lab. They find that both PDGF and Srsf3 contribute much more to splicing than transcription. They find that the vast majority of PDGF-mediated alternative splicing depends upon Srsf3 activity and that skipped exons are the most common events, with PDGF stimulation typically promoting exon skipping in the presence of Srsf3. They used eCLIP to identify RNA regions bound to Srsf3. Under both PDGF conditions, the majority of peaks were in exons, with +PDGF having a substantially greater number of these peaks. Interestingly, they find differential enrichment of sequence motifs and GC content in stimulated versus unstimulated cells. They examine 2 transcripts encoding PI3K pathway (enriched in their GO analysis) members: Becn1 and Wdr81. They then go on to examine PDGFRa and Rab5, an endosomal marker, colocalization. They propose a model in which Srsf3 functions downstream of PDGFRa signaling to, in part, regulate PDGFRa trafficking to the endosome. The findings are novel and shed light on the mechanisms of PDGF signaling and will be broadly of interest. This lab previously identified the importance of PDGF signaling on alternative splicing. The combination of RNA-seq and eCLIP is an exceptional way to comprehensively analyze this effect. The results will be of great utility to those studying PDGF signaling or neural crest biology. There are some concerns that should be considered, however.

      (1) It took some time to make sense of the number of DE genes across the results section and Figure 1. The authors give the total number of DE genes across Srsf3 control and loss conditions as 1,629 with 1,042 of them overlapping across Pdgf treatment. If the authors would add verbiage to the point that this leaves 1,108 unique genes in the dataset, then the numbers in Figure 1D would instantly make sense. The same applies to PDGF in Figure 1F and the Venn diagrams in Figure 2.

      (2) The percentage of skipped exons in the +PSI on the righthand side of Figure 2F is not readable.

      (3) It would be useful to have more information regarding the motif enrichment in Figure 3. What is the extent of enrichment? The authors should also provide a more complete list of enriched motifs, perhaps as a supplement.

      (4) It is unclear what subset of transcripts represent the "overlapping datasets" on lines 280-315. The authors state that there are 149 unique overlapping transcripts, but the Venn diagram shows 270. Also, it seems that the most interesting transcripts are the 233 that show alternative splicing and are bound by Srsf3. Would the results shown in Figure 5 change if the authors focused on these transcripts?

      (5) In general, there is little validation of the sequencing results beyond qPCR on Arhgap12 and Cep55. The authors should additionally validate the PI3K pathway members that they analyze. Relatedly, is Becn1 expression downregulated in the absence of Srsf3, as would be predicted if it is undergoing NMD?

      (6) What is the alternative splicing event for Acap3?

      (7) The insets in Figure 6 C"-H" are useful but difficult to see due to their small size. Perhaps these could be made as their own figure panels.

      (8) In Figure 6A, it is not clear which groups have statistically significant differences. A clearer visualization system should be used.

      (9) Similarly in Figure 6B, is 15 vs 60 minutes in the shSrsf3 group the only significant difference? Is there a difference between scramble and shSrsf3 at 15 minutes? Is there a difference between 0 and 15 minutes for either group?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript builds upon the work of a previous study published by the group (Dennison, 2021) to further elucidate the coregulatory axis of Srsf3 and PDGFRa on craniofacial development. The authors in this study investigated the molecular mechanisms by which PDGFRa signaling activates the RNA-binding protein Srsf3 to regulate alternative splicing (AS) and gene expression (GE) necessary for craniofacial development. PDGFRa signaling-mediated Srsf3 phosphorylation drives its translocation into the nucleus and affects binding affinity to different proteins and RNA, but the exact molecular mechanisms were not known. The authors performed RNA sequencing on immortalized mouse embryonic palatal mesenchyme (MEPM) cells treated with shRNA targeting the 3' UTR of Srsf3 or scramble shRNA (to probe AS and DE events that are Srsf3 dependent) and with and without PDGF-AA ligand treatment (to probe AS and DE events that are PDGFRa signaling dependent). They found that PDGFRa signaling has more effect on AS than on DE. A matching eCLIP-seq experiment was performed to investigate how Srsf3 binding sites change with and without PDGFRa signaling.

      Strengths:

      (1) The work builds well upon the previous data and the authors employ a variety of appropriate techniques to answer their research questions.

      (2) The authors show that the Srsf3 binding pattern within the transcript, as well as its binding motifs, changes significantly upon PDGFRa signaling, providing a mechanistic explanation for the significant changes in AS.

      (3) By combining RNA-seq and eCLIP datasets together, the authors identified a list of genes that are directly bound by Srsf3 and undergo changes in GE and/or AS. Two examples are Becn1 and Wdr81, which are involved in early endosomal trafficking.

      Weaknesses:

      (1) The authors identify two genes whose AS is directly regulated by Srsf3 and that are involved in endosomal trafficking; however, they do not validate the differential AS results or test whether changes in these genes can affect endosomal trafficking. In Figure 6, they show that PDGFRa signaling is involved in endosome size and Rab5 colocalization, but do not show how Srsf3 and the two genes are involved.

      (2) The proposed model does not account for other proteins mediating the activation of Srsf3 after Akt phosphorylation. How do we know this is a direct effect (and not a secondary or tertiary effect)?

    1. eLife assessment

      This study provides valuable insights into the influence of sex on bile acid metabolism and the risk of hepatocellular carcinoma (HCC). The data to support that there are inter-relationships between sex, bile acids, and HCC in mice are solid, but for the most part, they are descriptive. At this point, there is not enough evidence to determine the clinical significance of the findings, given the differences in bile acid composition between mice and men.

    2. Reviewer #1 (Public Review):

      Summary:

      Liver cancer shows a higher incidence in males than females with incompletely understood causes. This study utilized a mouse model that lacks the bile acid feedback mechanisms (FXR/SHP DKO mice) to study how dysregulation of bile acid homeostasis and high circulating bile acid levels may underlie the gender-dependent prevalence and prognosis of HCC. By transcriptomics analysis comparing male and female mice, unique sets of gene signatures were identified and correlated with HCC outcomes in human patients. The study showed that the ovariectomy procedure increased HCC incidence in female FXR/SHP DKO mice that were otherwise resistant to age-dependent HCC development and that removing bile acids by blocking intestine bile acid absorption reduced HCC progression in FXR/SHP DKO mice. Based on these findings, the authors suggest that gender-dependent bile acid metabolism may play a role in the male-dominant HCC incidence, and that reducing bile acid levels and signaling may be beneficial in HCC treatment.

      Strengths:

      (1) Chronic liver diseases often precede the development of liver and bile duct cancer. Advanced chronic liver diseases are often associated with dysregulation of bile acid homeostasis and cholestasis. This study takes advantage of a unique FXR/SHP DKO model that develops high organ bile acid exposure and spontaneous age-dependent HCC development in males but not females to identify unique HCC-associated gene signatures. The study showed that the unique gene signature in female DKO mice that had lower HCC incidence also correlated with lower-grade HCC and better survival in human HCC patients.

      (2) The study also suggests that differentially regulated bile acid signaling or gender-dependent response to altered bile acids may contribute to gender-dependent susceptibility to HCC development and/or progression.

      Weaknesses:

      (1) HCC shows heterogeneity, and it is unclear what tissues (tumor or normal) were used from the DKO mice and human HCC gene expression dataset to obtain the gene signature, and how the authors reconcile these gene signatures with HCC prognosis.

      (2) The authors identified a unique set of gene expression signatures that are linked to HCC patient outcomes, but analysis of these gene sets to understand the causes of cancer promotion is still lacking. The studies of urea cycle metabolism and estrogen signaling were preliminary and inconclusive. These mechanistic aspects may be followed up in revision or future studies.

      (3) While high levels of bile acids are convincingly shown to promote HCC progression, their role in HCC initiation is not established. The DKO model may be limited to conditions of extremely high levels of organ bile acid exposure. The DKO mice do not model the human population of HCC patients with various etiologies and shared liver pathology (i.e. cirrhosis). Therefore, high circulating bile acids may not fully explain the male prevalence of HCC incidence.

      (4) The authors showed lower circulating bile acids and increased fecal bile acid excretion in female mice and hypothesized that this may be a mechanism underlying the lower bile acid exposure that contributed to lower HCC incidence in female DKO mice. Additional analysis of organ bile acids within the enterohepatic circulation may be performed because a more accurate interpretation of the circulating bile acids and fecal bile acids can be made in reference to organ bile acids and total bile acid pool changes in these mice.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript of Patton et al. shows that in mice in which both FXR and SHP are knocked out, the sex difference in liver cancer risk is recapitulated. Authors show that the protection against tumor development seen in female mice is dependent upon ovarian hormone secretion and higher fecal bile acid excretion in females compared to males. The female liver-specific gene signature correlates with low-grade tumors and better survival in human HCC patients.

      The combination of the double knockout mice with ovariectomy in female mice and a bile acid resin in male mice to underscore their conclusion is strong. However, there are also some shortcomings that should be addressed.

      Strengths:

      (1) Using computational modelling, Patton and colleagues correlate mouse DKO transcriptome data to the clinical outcomes of HCC patients using HCC transcriptome datasets.

      (2) The dependence of female protection on ovarian hormones and increased fecal bile acid excretion is nicely shown by combining ovariectomy and bile acid resin with the use of double knockout mice.

      Weaknesses:

      (1) The translational value to human HCC is not so strong yet. Authors show that there is a correlation between the female-selective gene signature and low-grade tumors and better survival in HCC patients overall. However, these data do not show whether this signature is more highly correlated with female tumor burden and survival. In other words, it remains unclear whether the mechanisms of female protection are similar between humans and mice. In that respect, it would also be good to elaborate on whether women have higher fecal BA excretion and lower serum BA concentration.

      (2) The authors should perform a thorough spelling and grammar check.

      (3) There are quite a few errors and inaccuracies in the Results section, figures, and legends. The authors should correct these.

    1. eLife assessment

      The findings are useful for understanding the disease's pathology and immune dysregulation, but the evidence is still incomplete regarding whether these immune changes are directly caused by copper metabolism alterations or are secondary to liver dysfunction.

    2. Reviewer #1 (Public Review):

      Summary:

      Wilson's Disease (WD) is a rare inherited pathological condition caused by a mutation in ATP7B that alters mitochondrial structure and function. Additionally, WD results in dysregulated copper metabolism in patients. These metabolic abnormalities affect the functions of the liver and can result in cholecystitis. Understanding the immune component and its contribution to WD and cholecystitis has been challenging. In this work, the authors have performed single-cell RNA sequencing of mesenchymal tissue from three WD patients and three liver hemangioma patients.

      Strengths:

      The authors describe the transcriptomic alterations in myeloid and lymphoid compartments.

      Weaknesses:

      In brief, this manuscript lacks a clear focus, and the writing needs vast improvement. Figures lack details (or are misrepresented), the results section only catalogs observations, and the discussion needs to focus on their findings' mechanistic and functional relevance. The major weakness of this manuscript is that the authors do not provide a mechanistic link between the absence of ATP7B and NK cells' impaired/altered functions. While the work is of high clinical relevance, there are various areas that could be improved.

    3. Reviewer #2 (Public Review):

      Summary:

      Wilson's disease is a rare genetic disorder caused by mutations in the ATP7B gene. Previous studies have documented that ATP7B mutations can disrupt copper metabolism, affecting brain and liver function. In this paper, the authors performed a retrospective clinical study and found that Wilson's disease has a high incidence of cholecystitis. Single-cell RNA-seq analysis revealed changes in the immune microenvironment, including the activation of immune responses and the exhaustion of natural killer cells.

      Strengths:

      A key finding of this study is that the predominant ATP7B gene mutation in the Chinese population is the 2333G>T (p. R778L) mutation. The authors reported associations between Wilson's disease and cholecystitis, as well as the exhaustion of natural killer cells.

      Weaknesses:

      The underlying mechanisms linking ATP7B mutations to cholecystitis and natural killer cell exhaustion remain unclear. Specifically, it is not yet determined whether copper metabolism alterations directly cause cholecystitis and natural killer cell exhaustion, or if these effects are secondary to liver dysfunction.

    1. eLife assessment

      This study investigates BMP signaling mechanisms in the developing chick cerebellum to better understand germinal layer formation, cellular amplification and neuronal differentiation. The data from human tissue is compelling and lends support to the possible links of these processes to medulloblastoma, although this study does raise exciting questions regarding the generalized role of BMP signaling during normal development and malignant growth. Overall, this is an important study with beautifully presented findings.

    2. Reviewer #1 (Public Review):

      Summary:

      Rook et al. examined the role of BMP signaling in cerebellum development, using chick as a model alongside human tissue samples. They first examined p-SMADs and found differences between the species, with human samples retaining high p-SMAD after foliation, while in chick, BMP signaling appears to decrease following foliation. To understand the role of BMP during early development, they then used early chick embryos to modulate BMP, either using a constitutively active BMP regulator to increase BMP signaling or overexpressing the negative intracellular BMP regulator to decrease BMP signaling. After validating the constructs in ovo, the authors then examined GNP morphology and migration. They then determined whether the effects were cell autonomous.

      Strengths:

      The experiments were well-designed and well-controlled. The figures were extremely clear and convincing, and the accompanying drawings help orient the reader to easily understand the experimental set up. These studies also help clarify the role of BMP at different stages of cerebellum development, suggesting early BMP signaling is required for dorsalization, not rhombic lip induction, and that later BMP signaling is needed to regulate the timing of migration and maturation of granule neurons.

      Weaknesses:

      While these studies certainly hint that BMP modulation may affect tumor growth, this was not explicitly tested here. Future studies are required to generalize the functional role of BMP signaling in normal cerebellum development to malignant growth.

    3. Reviewer #2 (Public Review):

      Summary:

      This is a fundamental and elegant study showing the role of BMP signaling in cerebellar development. This is an important question because there are multiple diseases, including aggressive childhood cancers, which involve granule cell precursors. Thus understanding of the factors that govern the formation of the granule cell layer is important both from a basic science and a disease perspective.

      Overall, the manuscript is clear and well-written. The figures are extremely clear, wonderfully informative, and overall quite beautiful.

      Figures 1-3 show the experimental design and report how BMP activity is altered over development in both the chick and the human developing cerebellum. Both datasets are very impressive and convincing.

      They then go on to modulate BMP activity in the developing chick, using a complex electroporation paradigm that allows them to label cells with GFP as well as with cell-specific reporters of BMP activity levels. They bidirectionally modulate BMP levels and then can look at both cell-specific and non-specific alterations in the formation of the external and internal granule cell layer, across different developmental timepoints. These are really elegant and rigorous experiments, as they look at both sagittal and transverse sections to collect this data. This makes the data extremely compelling. With these rigorous techniques, they show that BMP signaling serves more than one function across development: it is involved in the initial tangential migration from the rhombic lip, but at a later time, both up- and down-regulation of BMP activity reduces density of amplifying cells in the external granule cell layer.

      Strengths:

      Overall, I think the paper is interesting and important and the data is strong. The use of both chick and human tissue strengthens the findings. They are extremely rigorous, analyzing data from multiple planes at multiple ages, which also really strengthens their findings. The dual electroporation approach is extremely elegant, providing beautiful visual representations of their findings.

      Weaknesses:

      I find no significant weaknesses.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1.1) I thought the manuscript was very clear. While I realize the authors included the reference to medulloblastoma in the introduction based on previous reviewer comments, I think this speculation is better left in the discussion.

      Whilst we appreciate the reviewer’s feedback here, we felt it was important to include a reference to medulloblastoma and developmental disorders associated with the cerebellum to put this work into a broader context.

      We removed the sentence “Medulloblastoma can be a consequence of uncontrolled proliferation of granule cell progenitors, with BMP overexpression being a potential therapeutic avenue to inhibit this proliferation” to limit the speculation in this statement.

      (1.2) line 81: It would be better to cite the 2 original papers (Hendrikes et al 2022, Smith et al 2022) rather than the Phoenix commentary article. I'm not sure the Phoenix article needs to be cited at all within this paper.

      We have cited the two suggested papers and removed the citation to Phoenix et al.

      (1.3) line 102: confusing sentence with the unexpected separation of do and not: "the same conditional deletions of BMP pathway elements that fail to block early granule cell specification at the rhombic lip do result not in a larger cerebellum as might be expected, but either have no affect".

      We thank the reviewer for pointing out this error and have corrected the text to “do not result in a larger cerebellum”.

      (1.4) line 133: inconsistent acronyms (for example, W9 vs pcw9).

      This has been corrected to PCW in all occurrences.

      (1.5) line 139: coronal vs transverse? it seems like you show transverse sectioning but refer to it as coronal in the text.

      We thank the reviewer for highlighting this and have corrected the text to “transverse”.

      (1.6) fig 2C: would it be possible to provide a similar inset as 2D?

      We thank the reviewer for this suggestion and have added the insets in 2C. We agree that this is now clearer and more consistent with the rest of the figure.

      (1.7) line 368/369/435/436 missing arrows.

      The arrows have been re-added; it appears that they did not show up on the uploaded PDF.

      (1.8) line 517 missing word: rhombic-lip-derived.

      This typo has been corrected.

      Reviewer #2 (Public Review):

      (2.1) Fig. 3 M Why are there asterisks both above and below the brackets?

      This was a formatting error that has now been corrected.

      (2.2) Fig. 8. The arrows (BMP up and BMP down) are touching the right ")" in the figure, which makes it hard to read.

      This was also a formatting issue which has been corrected.

      (2.3) Fig. 4 and 8 legends. There are spaces in the text which I believe are for arrows to be inserted "(BMP )", but the arrows have been omitted in the PDF that I read.

      This is the same as Reviewer 1’s comment; the arrows have been re-added to the text, and their omission appears to have been an issue with the PDF upload.

      (2.4) Fig. 3 legend gets very hard to read at the end, where it seems some punctuation is missing.

      We have re-worded the legend for Fig. 3 to make it easier to read.

      (2.5) Significant figures in some of the text are probably too much given the accuracy with which they can be measured.

      We appreciate the reviewer’s concerns here; however, these were added in response to the original reviewer’s request to “provide some additional support to otherwise qualitative observations”.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      More details should be provided in terms of inclusion and exclusion criteria for the participants, as well as missing data due to the non-cooperation of newborns during the experimental process. Potential differences between preterm and full-term infants are worth exploring. Several aspects of EEG data analyses and data interpretation should be better clarified.

      Here I have several comments and questions to improve the manuscript.

      (1) It would be wise to know whether there was any missing data due to the non-cooperation of newborns during the experimental process.

      Thank you for the suggestion. While our initial aim was to include 120 neonates in the final data analysis, we actually recruited 198 neonatal participants for this study. Seventy-eight EEG datasets were excluded from the data analysis due to non-cooperation of neonates (n = 75) or technical issues (n = 3). We have incorporated this detailed information in the Subjects subsection (lines 375-383) in the revised manuscript.

      (2) The authors investigated the impact of gestational age on emotional perceptual sensitivity in newborns by grouping infants of varying gestational ages in the experiment. The methods section mentions that the study conducted experiments within 24 hours after the birth of the newborns. When do preterm infants (with a gestational age of 35 and 36 weeks) begin to exhibit emotional discrimination comparable to full-term newborns? 

      This is indeed an intriguing question that merits exploration. However, in our study, we recruited relatively healthy preterm neonates, many of whom were discharged from the hospital with their mothers within 3-5 days after birth. It would have been challenging to arrange for another EEG testing session once these preterm infants reached full-term age, as their parents were unwilling to return to the hospital.

      (3) When analyzing EEG data, excluding artifacts with peak deviations exceeding ±200 μV is a relatively lenient criterion, potentially resulting in the retention of some large-amplitude artifacts or noise. What is the rationale behind the author's choice of this criterion? Or, in other words, what considerations led to this specific selection?

      In our standard practice, we typically employ a stricter threshold of ±100 μV for artifact removal in studies involving healthy adults and a median threshold of ±150 μV for data from adult patients, such as those with schizophrenia. However, when analyzing neonatal data, we often resort to the loosest criterion of ±200 μV. This decision is primarily due to the inherent challenges associated with neonatal EEG recordings, as we cannot expect newborns to cooperate or remain quiet during the recording process. Consequently, neonatal EEG data tend to contain more artifacts compared to those from healthy adults. Furthermore, the excitability of the newborn brain is notably elevated. This heightened excitability arises from an imbalance in the distribution and function of excitatory and inhibitory neurotransmitter systems. Typically, the expression of excitatory neurotransmitters and their receptors surpasses that of inhibitory neurotransmitters, resulting in increased excitability in the immature brain. This heightened excitability can occasionally lead to the occurrence of paroxysmal electrical activity. As a result, neonatal EEG recordings may at times display large amplitudes, exceeding even 100 μV. In this revision, we have referenced other neonatal/infant EEG studies or technique pipelines that have used the threshold of ±200 μV to support this criterion (lines 483-484).    
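
The thresholding rule described in this response can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the epoch layout (channels × samples, in μV, assumed to be baseline-corrected) and the synthetic values are all assumptions made for illustration. An epoch is dropped when any sample on any channel deviates from zero by more than the chosen threshold.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: one epoch = list of channels, each a list of samples in microvolts.
Epoch = Sequence[Sequence[float]]


def peak_abs(epoch: Epoch) -> float:
    """Largest absolute deviation (in microvolts) across all channels and samples."""
    return max(abs(value) for channel in epoch for value in channel)


def reject_epochs(
    epochs: Sequence[Epoch], threshold_uv: float = 200.0
) -> Tuple[List[Epoch], List[bool]]:
    """Keep epochs whose peak deviation stays within +/- threshold_uv."""
    keep = [peak_abs(epoch) <= threshold_uv for epoch in epochs]
    clean = [epoch for epoch, kept in zip(epochs, keep) if kept]
    return clean, keep


# Synthetic data (invented values): 4 epochs, 2 channels, 3 samples each.
epochs = [
    [[12.0, -30.5, 8.0], [5.0, 22.0, -14.0]],     # quiet epoch
    [[40.0, 160.0, -20.0], [10.0, -35.0, 18.0]],  # peak of 160 uV
    [[-18.0, 25.0, 9.0], [30.0, -45.0, 2.0]],     # quiet epoch
    [[15.0, -10.0, 6.0], [-250.0, 70.0, 33.0]],   # peak of -250 uV
]

# Compare the three thresholds mentioned in the response.
for threshold in (100.0, 150.0, 200.0):
    _, keep = reject_epochs(epochs, threshold)
    print(threshold, keep)
```

With these invented values, the 160 μV epoch is rejected at ±100 and ±150 μV but retained at ±200 μV, which is the trade-off the reviewer raises: a looser threshold keeps more trials at the cost of admitting larger deflections.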

      (4) In the Discussion section, the authors mentioned the biomarkers, such as the fusiform gyrus and hippocampus, which have been identified as potential predictors of autism risk. It is suggested that the authors briefly elucidate the crucial role of these biomarkers in processing social information, which would enhance the readability and logicality of this manuscript.

      Thank you for the thoughtful suggestion. We have expanded the discussion concerning the involvement of the fusiform gyrus and hippocampus in social information processing (lines 314-319).

      Reviewer #2 (Public Review):

      First, readers need to see spectrograms that show the 0-4000 Hz range in more detail, rather than what is now shown (0-10,000 Hz). The vocal signals in clearer spectrograms will, I believe, show the initial consonant burst and formant frequencies that are unique to human speech and give rise to the perception of the consonant sounds in the vocal signals like 'dada' and 'tutu' that were tested. The control signals will presumably not show these abrupt acoustic changes at their onset, even though they appear (from the oscillograms) to approximate the amplitude envelope. The primary cue distinguishing the happy and neutral signals in both the vocal and control signals is the pitch of the signals (high vs low), but the burst of energy representing the consonants is only contained in the vocal signals; it has no comparable match in the control signals. It is possible that the presence of a sharp acoustic onset (a unique characteristic of consonants in human speech) is especially alerting to the infants, and that this acoustic cue, in the context of the pitch change, enhances discrimination in the vocal case. One way to test this would be to use only vowel sounds to represent the vocal signals, without consonants.

      Thank you for your expert comments and considerations. We have redrawn Figure 3 using Praat software with a frequency range of 0-5000 Hz, as suggested by Praat’s default parameters. Based on the spectrograms, we acknowledge the potential role of consonants in accounting for differences in stimuli. Consequently, we have included this consideration as one of the limitations of our study in this revised version (lines 325-330).

      Another critical detail that the authors need to include about the signals is an explanation of how the control signals were generated. The text states that the Fo and amplitude envelope of the vocal signals were mimicked in the control signals, but what was the signal used for the controls? Was a pure tone complex modulated, or was pink noise used to generate the control signals? Or were the original vocal signals simply filtered in some way to create the controls, which would preserve the Fo and amplitude envelope? If merely filtered, the control signals still may be perceived as 'vocal' signals, rather than as nonspeech (the Supplement contains the sounds, and some of the control sounds can be perceived, to my ear, as 'vocal' signals).

      We sincerely appreciate your attention to detail regarding the generation of control signals. As a non-specialized laboratory in audio editing, our approach involved filtering the original vocal sounds around the fundamental frequency (f0) and ensuring a balanced mean intensity between vocal and nonvocal stimuli (as now stated in lines 432-437). However, it became evident that certain “vocal” components persisted in the control sounds, particularly noticeable in the sound “tutu”. In this revision, we openly acknowledge this oversight (lines 331-333). We extend our gratitude once again for highlighting the importance of meticulous consideration when generating control sounds for a study.

      Second, there is no information in the manuscript or supplement about the auditory environment of the participants, nor discussion of the fetus' ability to hear in the womb. In the womb, infants are listening to the mothers' bone-conducted speech (which is full of consonant sounds), and we know from published studies that infants can discern differences not only in the prosody of the speech they hear in the womb, but the phonetic characteristics of the mother's speech. The ability at 37 weeks GA or beyond to discriminate the pitch changes in the vocal, but not control signals, could thus be due to additional experience in utero to speech. Another experiential explanation is that the infants born at 37 weeks GA and beyond may be exposed to greater amounts of speech after birth, when compared to those born at 35 and 36 weeks GA, from the attending nurses and from their caregivers, and this speech is also full of consonant sounds. What these infants hear is likely to be 'infant-directed speech,' which is significantly higher in pitch, mirroring the signals tested here. At 37 weeks GA, infants are likely more robust, may sleep less, and are likely more alert. If infants' exposure to speech, either after birth, or their auditory ability to discern differences in speech in utero, is enhanced at 37 weeks GA and beyond, then an 'experience-related' explanation is a viable alternative to a maturational explanation, and should be discussed. Perhaps both are playing a role. As the authors state, many more signals need to be tested to discern how the effect should be interpreted, and other viable interpretations of the current results discussed.

      We acknowledge the importance of considering the auditory environment of participants and the fetus' ability to hear in the womb. In our study, neonates were exposed to a native language environment both before and after birth (as added in lines 385-386), and we took efforts to minimize their exposure to speech stimuli other than those used in the experiment. Specifically, all neonates participated in the experiment and underwent EEG recording within the first 24 hours after birth (lines 386-387). They were promptly transported to a dedicated testing room for EEG recording as soon as their condition stabilized after birth. During recording sessions, they were separated from their mothers to minimize exposure to natural speech (as added in lines 459-461). As a result, we believe that both preterm and term neonates were exposed to comparable amounts of speech after birth and before the experiment. We also ensured that all participants were in a natural sleep state during EEG recording. However, it is possible that term neonates slept less and were more attentive to the limited speech stimuli in their environment before the experiment compared to preterm newborns.

      The debate surrounding nature versus nurture in neonate and infant development persists. We recognize the potential impact of prenatal auditory experiences on neonatal perceptual sensitivity. Therefore, we have added a brief discussion regarding innate- or experience-related explanations for emotional prosodic discrimination in neonates, aiming to shed light on future research directions (lines 343-351).

    2. eLife assessment

      This is an important study on changes in newborns' neural abilities to distinguish auditory signals at 37 weeks of gestation. The evidence of change in neural discrimination as a function of gestational age is convincing, but further analysis of the acoustic signals and control of the infants' language environment is necessary for the results to be used in clinical applications. The work contributes to the field of neurodevelopment.

    3. Reviewer #1 (Public Review):

      Summary:

      This manuscript aimed to investigate the emergence of emotional sensitivity and its relationship with gestational age. Using an oddball paradigm and event-related potentials, the authors conducted an experiment in 120 healthy neonates with a gestational age range of 35 to 40 weeks. A significant developmental milestone was identified at 37 weeks gestational age, marking a crucial juncture in neonatal emotional responsiveness.

      Strengths:

      This study has several strengths: it provides profound insights into the early development of social-emotional functioning and unveils the role of gestational age in shaping neonatal perceptual abilities. The methodology demonstrates rigor and a well-controlled experimental design, particularly in its use of matched control sounds, which enhances the reliability of the research. The findings not only contribute to the field of neurodevelopment, but also showcase potential clinical applications, especially in the context of autism screening and early intervention for neurodevelopmental disorders.

      Comments on the revised version:

      After reviewing the authors' response letter and the revised manuscript, I believe they have done a commendable job in addressing my comments.

      Additionally, I concur with the concerns raised by Reviewer #2 regarding several potential confounding factors that require better control in their experimental design. These include the differences in physical properties between vocal and nonvocal stimuli, as well as the infant's exposure to the speech/auditory environment. These concerns should be thoroughly and explicitly discussed in the manuscript, ensuring a clearer understanding for the readers.

    4. Reviewer #2 (Public Review):

      This is an important and very interesting report on a change in newborns' neural abilities to distinguish auditory signals as a function of the gestational age (GA) of the infant at birth (from 35 weeks GA to 40 weeks GA). The authors tested neural discrimination of sounds that were labeled 'happy' vs 'neutral' by listeners that represent two categories of sound, either human voices or auditory signals that mimic only certain properties of the human vocal signals. The finding is that a change occurs in neural discrimination of the happy and neutral auditory signals for infants born at or after 37 weeks of gestation, and not prior (at 35 or 36 weeks of gestation), and only for discrimination of the human vocal signals; no change occurs in discrimination of the nonhuman signals over the 35- to 40-week gestational ages tested. The neural evidence of discrimination of the vocal happy-neutral distinction and the absence of the discrimination of the control signals is convincing. The authors interpret this as a 'landmark' in infants' ability to detect changes in emotional vocal signals, and remark on the potential value of the test as a marker of the infants' interest in emotional signals, underscoring the fact that children at risk for autism spectrum disorder may not show the discrimination. Although the finding is novel and interesting, additional discussion is essential so that readers understand two potential caveats affecting this interpretation.

      Comments on the revised version:

      The revised manuscript does discuss the limitations of the control stimuli, as well as the limitations with regard to conclusions that can be drawn from this data set. I therefore expected the authors to temper somewhat their recommendation that this could be a 'screening' signal for autism, because these data are not sufficiently strong to make that recommendation. In the same vein, perhaps the title might be adjusted to suggest less certainty, for example, by using the word "change" rather than "milestone"? The data are of interest, but the limitations are genuine limitations.

    1. Author response:

      The following is the authors’ response to the previous reviews

      It is unclear to us why you did not adjust the title to better reflect the well-supported claims of the paper, i.e., that this is a valuable model for human loss-of-function mutations in IQCH.

      Thanks for the editor’s suggestion. We have changed the title to “Deficiency of IQCH causes male infertility in humans and mice.” Additionally, we have provided the original images of the gels or blots as a zipped folder.

    2. eLife assessment

      This valuable study describes mice with a knockout of the IQ motif-containing H (IQCH) gene, to model a human loss-of-function mutation in IQCH associated with male sterility. While the evidence for interaction between IQCH and potential RNA binding proteins is limited, the human infertility is reproduced in the mouse, making it a compelling model. The paper could be of interest to cell biologists and male reproductive biologists working on the sperm flagellar cytoskeleton and mitochondrial structure.

    3. Reviewer #3 (Public Review):

      In this study, Ruan et al. investigate the role of the IQCH gene in spermatogenesis, focusing on its interaction with calmodulin and its regulation of RNA-binding proteins. The authors examined sperm from a male infertility patient with an inherited IQCH mutation as well as Iqch CRISPR knockout mice. The authors found that both human and mouse sperm exhibited structural and morphogenetic defects in multiple structures, leading to reduced fertility in Iqch-knockout male mice. Molecular analyses such as mass spectrometry and immunoprecipitation indicated that RNA-binding proteins are likely targets of IQCH, with the authors focusing on the RNA-binding protein HNRPAB as a critical regulator of testicular mRNAs. The authors used in vitro cell culture models to demonstrate an interaction between IQCH and calmodulin, in addition to showing that this interaction via the IQ motif of IQCH is required for IQCH's function in promoting HNRPAB expression. In sum, the authors concluded that IQCH promotes male fertility by binding to calmodulin and controlling HNRPAB expression to regulate the expression of essential mRNAs for spermatogenesis. These findings provide new insight into molecular mechanisms underlying spermatogenesis and how important factors for sperm morphogenesis and function are regulated.

      The strengths of the study include the use of mouse and human samples, which demonstrate a likely relevance of the mouse model to humans; the use of multiple biochemical techniques to address the molecular mechanisms involved; the development of a new CRISPR mouse model; ample controls; and clearly displayed results. Assays are done rigorously and in a quantitative manner. Overall, the claims made by the authors in this manuscript are well-supported by the data provided.

    1. eLife assessment

      In this important study, the authors explore ER stress signaling mediated by ATF6 using a genome-wide gene depletion screen. They find that the ER chaperone Calreticulin binds and directly represses ATF6, a new and intriguing function for Calreticulin. The evidence presented is convincing, based on CHO genetics and biochemical analysis.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tung and colleagues identify Calreticulin as a repressor of ATF6 signaling using a CRISPR screen and characterize the functional interaction between ATF6 and CALR.

      Strengths:

      The manuscript is well written and interesting with an innovative experimental design which provides some new mechanistic insight into ATF6 regulation as well as crosstalk with the IRE1 pathway. The methods used were fit for purpose and reasonable conclusions were drawn from the data presented.

      Comments on latest version:

      The authors did a good job at addressing my comments even though they found several aspects to exceed the scope of the work. The manuscript is clearer now and the model pushed by the authors is better supported by the data. One point on which I would be curious to hear the authors' opinion is the status of ATF6alpha activation in pathological cells in which CALR is mutated (e.g., myeloproliferative neoplasms), although this challenges neither the conclusions of the manuscript nor my positive opinion of the work.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors explore ER stress signalling mediated by ATF6 using a genome-wide gene depletion screen. They find that the ER chaperone Calreticulin binds and directly represses ATF6; this proposed function for Calreticulin is intriguing and constitutes an important finding. The evidence presented is based on CHO genetic evidence and biochemical results and is convincing. 

      We thank the editors for their favourable assessment of our work.

      Reviewer #1 (Public Review): 

      Summary: 

      In this manuscript, Tung and colleagues identify Calreticulin as a repressor of ATF6 signalling using a CRISPR screen and characterize the functional interaction between ATF6 and CALR. 

      Strengths: 

      The manuscript is well written and interesting with an innovative experimental design that provides some new mechanistic insight into ATF6 regulation as well as crosstalk with the IRE1 pathway. The methods used were fit for purpose and reasonable conclusions were drawn from the data presented. Findings are novel and bring together glycoprotein quality control and activation of one sensor of the UPR. This is a novel perspective on how the integration of ER homeostasis signals could be sensed in the ER. 

      We thank the reviewer for their favourable assessment of our work.

      Weaknesses: 

      Several points remain to be documented to support the authors' model. 

      Major comments 

      (1) It is interesting that BiP, PDIs, and COPII are not identified in the screen. Might this indicate some bias in the system perhaps limiting its sensitivity or pleiotropic effects of the reporter? 

      The reviewer raises a valid concern. Our CRISPR screen aimed to identify genes that selectively modulate ATF6⍺. Therefore, we excluded from consideration genes whose inactivation had effects on the broader ER environment. This would disfavour the selection of genes encoding BiP, PDI and COPII components. Additionally, a positive selection screen inherently removes essential genes like BiP. The absence of COPII components among the hits could be due to essentiality, or to those components not being strong selective modulators of ATF6⍺ activation, given that the stronger ATF6⍺ modulators such as S1P, S2P and transcription factor S2P and NFY were among our top hits. Cell type specificity may also play a role. For example, ERp18, a small PDI previously implicated in ATF6⍺ activation (Oka et al 2019; PMID: 31368601), was not among the hits, despite the presence of sgRNAs targeting hamster ERp18 in the library. Interestingly, depletion of ERp18 in our dual UPR reporter CHO-K1 cell line did not affect the ATF6⍺ and IRE1⍺ UPR branches. This new information has been incorporated into the revised manuscript as Supplemental Figure S6E and the discussion has been edited in line with these comments.

      (2) CALR interacts with ATF6 independently of ATF6 glycans (and cysteines). How do the authors reconcile this observation with the lectin functions of CALR? What is the interaction mode then - if the CALR N (lectin) domain is not involved, is it the P domain that is responsible for the interaction? All the binding experiments are performed in the presence of 1 mM CaCl2, is calcium necessary for CALR to achieve binding? 

      These points merit clarification. The Biolayer Interferometry (BLI) assay reported on an interaction between ATF6 and CRT that is independent of ATF6⍺ glycans. However, cell-based experiments revealed a contribution of glycan-dependent interactions to the binding and repression. Therefore, we conclude that the interaction of CRT with ATF6⍺ likely involves both lectin-dependent and lectin-independent interactions (the latter dependent on the P-domain). Indeed, this hybrid model has previously been suggested as the mode of stable interaction of CRT with other substrates, as cited in the discussion section (Wijeyesakere et al., 2013; PMID: 24100026). CRT is a known calcium-dependent protein, and all the in vitro experiments were conducted in the presence of 1 mM CaCl2. We do not have data from experiments without CaCl2.

      (3) Does the introduction of the reporter system affect the normal BiP (or ATF6) protein levels in the cells? 

      To address this question, we have conducted new experiments comparing endogenous BiP protein levels between the reporter-containing cells and the parental CHO-K1 cells using immunoblotting and an anti-BiP antibody. These data indicate that the reporter system does not affect endogenous BiP protein levels. This new information has been incorporated as revised Supplemental Figure S1C.

      (4) Does the depletion of CRT affect BiP interaction with ATF6? The absence of CRT may lead to misfolding of glycoproteins and titration of BiP away from ATF6 leading to activation. An indicator of ER stress levels that is independent of ATF6 and IRE1 might be useful. 

      To further assess ER stress levels in CRT-depleted cells, we compared expression levels of endogenous ER resident proteins containing a KDEL signal (e.g., P3H1, GRP94, BiP and PDI) in parental CHO-K1 cells, dual UPR reporter cell lines (XC45-6S) and CRT-depleted cells (CRT∆#2P) under basal conditions and during ER stress by immunoblotting. This comparison confirmed the basal elevation in BiP protein level in cells lacking CRT, consistent with previous findings (Figure 2D) and more broadly the integrity of UPR signalling in cells lacking CRT. In the interest of time, we did not extend the analysis to other branches of the UPR. This new information has been incorporated as Supplemental Figure S5 and in the text of the revised manuscript.

      (5) Does CALR depletion alter ATF6 redox status? 

      We thank the reviewer for raising this interesting point. In response, we compared ATF6⍺ redox status in parental and CRT-depleted cells using non-reducing SDS-PAGE. Overall, the redox pattern was similar in parental and CRT-depleted cells with the detection of two redox forms: an inter-chain disulfide-stabilised dimer and the monomer. Under basal conditions, ATF6⍺ predominantly existed as a monomer, while under ER stress, the monomer band decreased with a corresponding increase in a disulfide-stabilised dimer form in parental cells, as previously reported (Oka et al, 2022; PMID: 35286189). However, under ER stress, CRT-depleted cells showed a significantly higher fraction of monomer versus dimer compared to parental cells. Taken together, these data suggest that the loss of CRT may favour the monomeric form of ATF6α, which is proposed to be more efficiently trafficked (Nadanaka, et al 2007; PMID: 17101776), aligning with our observations that CRT depletion is associated with constitutive activation of ATF6α. These new data have been included as Supplemental Figure S7 and are explained in detail in the results section of the revised manuscript.

      (6) Figure 4C would benefit from some immunoblotting against BiP.

      Although we acknowledge the validity of this suggestion and understand the referee's interest in comparing the amount of CRT in pulldown with that of BiP, the necessity of generating additional samples makes this experiment impractical. Consequently, we opted not to include in our conclusion any comparison regarding the retention of ATF6α by BiP relative to CRT.

      (7) Overlooked requirement of cysteines for ATF6 functionality (Figure 5B). 

      We interpret this comment to refer to the inactivity of the cysteine-free allele of ATF6⍺. Whilst this is a reproducible observation of significance to the structure-activity features of ATF6⍺’s luminal domain, it is less informative in terms of understanding trans-active regulators of ATF6⍺ and was therefore not explored further.

      (8) Without a clear definition of the role of CRT in ATF6 folding, one cannot infer that the observed phenotype is not based on defects in ATF6 "folding" and glycosylation considering the possibility of activation of newly synthesised un-glycosylated ATF6. 

      If the main role of CRT were to assist ATF6⍺ folding, one would expect that depletion of CRT would lead to a non-functional ATF6⍺, resulting in ER retention and less activity. However, our data indicate that the loss of CRT correlates with the constitutive activation of the ATF6⍺ fluorescent reporter and increased Golgi trafficking and processing of ATF6⍺. Therefore, these data suggest that in CRT-depleted cells, the majority of ATF6⍺ is likely to fold to a functional state.

      (9) ATF6 was defined in several studies as a natively unstable protein and shows a close relationship with the ERAD machinery, is the role of CALR also involved in a quality control mechanism for natively unfolded ATF6? 

      The reviewer also raises a valid point here. Although we have not closely evaluated the role of CRT in the quality control machinery, we observed that the loss of CRT was not associated with increased levels of ATF6⍺ in CRT-depleted cells in basal conditions compared with parental cells (Fig 3B.1, compare lane 1 and lane 7; Figure 3B.2, compare lane 1 and lane 5). If ATF6⍺ were degraded by ERAD and loss of CRT compromised ERAD functionality, CRT-depleted cells should exhibit increased levels of endogenous ATF6⍺. The fact that endogenous ATF6⍺ levels are slightly reduced in CRT-depleted cells does not support a role for CRT in the quality control mechanism for natively unfolded ATF6⍺.

      (10) C618 in ATF6 is located within the BiP binding site and in close proximity to an N-glycosylation site. Is this region of particular importance for CALR binding? 

      It is an interesting point that we have not explored in this study. Consequently, without experimental data, we cannot infer the possible implications of C618 in CRT binding.

      (11) The authors have mutated all the N glycosylation sites at once; they should be mutated one by one and the impact on ATF6 stability evaluated independently of the CALR status. 

      We agree that analysing each N-glycosylation site individually would provide further insight into their contributions to ATF6⍺ stability/functionality. However, given the scope of the paper in its present form, we have elected not to address this point.

      (12) The relationship between the absence of CALR and IRE1 remains weak. The authors do not exclude the possibility that CALR could have a direct effect on IRE1 itself. This should be either removed or further investigated. 

      We beg to differ. The relationship between the absence of CRT and IRE1 is not weak: loss of CRT in CHO-K1 cells represses IRE1. We readily concede that the relationship is incompletely understood. ATF6⍺ signalling involves crosstalk with the IRE1 pathway, partly mediated by direct heterodimerisation of N-ATF6⍺ with XBP1s (Yamamoto et al., 2007, 2004). Additionally, recent research has shown that ATF6⍺ activity can repress IRE1 signalling (Walter et al., 2018). Therefore, given that our results indicate that the loss of CRT leads to constitutive activation of ATF6⍺, we suggest that a negative feedback loop in which ATF6⍺ represses IRE1 contributes to the observations made here on the relationship between CRT and IRE1. This does not exclude other aspects to the relationship, a point that is now clarified further in the revised manuscript. 

      Minor point 

      In the introduction on page 3 it is mentioned that loss of ATF6 impairs survival in cellular and animal models, this is not completely true as ATF6a ko in mice has no clear deleterious phenotype and only the double ko ATF6a/b has some dramatic impact.

      We have modified that sentence in the revised manuscript. 

      Reviewer #2 (Public Review): 

      Summary: 

      In this study, the authors set out to use an unbiased CRISPR/Cas9 screen in CHO cells to identify genes encoding proteins that either increase or repress ATF6 signalling in CHO cells. 

      Strengths: 

      The strengths of the paper include the thoroughness of the screens, the use of a novel, double ATF6/IRE1 UPR reporter cell line, and follow-up detailed experiments on two of the findings in the screens, i.e. FURIN and CRT, to test the validity of involvement of each as direct regulators of ATF6 signalling. Additional strengths are the control experiments that validate the ATF6 specificity of the screens, as well as, for CRT, the finding of focus, determining roles for the glycosylation and cysteines in ATF6 as mechanistically involved in how CRT represses ATF6, at least in CHO cells. 

      We thank the reviewer for their favourable assessment of our work.  

      Weaknesses: 

      (1) The weaknesses of the paper are that the authors did not describe why they focused only on the top 100 proteins in each list of ATF6 activators and repressors. 

      We concede that the more genes one studies the better. However, in whole-genome CRISPR screens where thousands of hits arise, it is common practice for researchers to prioritise candidates with the greatest significance, as those genes are likely to have a more meaningful impact on the phenotype under investigation. Therefore, our decision to focus on the top 100 genes was based on a desire to identify the most prominent and potentially impactful candidates for further analysis, ensuring a manageable scope for in-depth study while maintaining a measure of relevance and significance. Moreover, setting the threshold at 100 hits to perform GEO enrichment analysis is a practice used by previous researchers (PMID: 30323222; PMID: 37251921). In our case, the top 100 hits included the genes with an adjusted P < 0.005. For interested readers, the full ranked list is accessible in the GEO databank (GSE254745) and as supplemental Table S1.

      (2) Additionally, there were a few methodology items missing, such as the location of the insertion site of the XBP1::mCherry reporter in the CHO cell genome. Since the authors go to great lengths to insert the other reporter for ATF6 activation in a "safe harbor" location, it leads to questions about whether the XBP1::mCherry reporter insertion is truly innocuous. 

      We appreciate the opportunity to clarify certain aspects of our experimental procedures. In order to generate a double UPR reporter cell line, we employed the previously established XC45 CHO-K1 clone with an integrated XBP1s::mCherry reporter (Harding et al., 2019; PMID: 31749445). Since the ROSA26 safe harbor locus was available in the XC45 CHO-K1 cell line, we directed the integration of the ATF6⍺ reporter there. To provide further clarity, the revised manuscript includes additional details in the Methods section regarding the creation of the XBP1 reporter.

      (3) An additional weakness is that the evidence for the physical interaction between ATF6LD and CRT is not strong, being dependent mainly on a single IP/IB experiment in Figure 4C that comprises only 1 lane on the gel for each of the test cases. Moreover, while that figure suggests that the interaction between CRT and ATF6 is decreased by mutating out the glycosylation sites in the ATF6LD, the BLI experiment in the same figure, 4B, suggests that there are no differences in the affinities of CRT for ATF6LD WT, deltaGly and deltaCys. 

      We would like to highlight that in the IP/IB experiments (see Figure 4C), where wildtype ATF6 (ATF6⍺_LDWT) and GFP-ATF6_LD∆Gly were transiently transfected, GFP-ATF6_LD∆Gly was expressed at lower levels than ATF6⍺_LDWT. This lower expression level might explain why CRT is more prominently immunoprecipitated with ATF6⍺_LDWT and could account for the differences observed between the in vitro and in vivo assays.

      (4) An additional detail is that I found Figure 6A to be difficult to interpret, and that 6B was required in order for me to best evaluate the points being made by the authors in this figure. 

      We have simplified Figure 6A in the revised manuscript to make it more interpretable by focussing the reader’s attention on the transfected population. 

      Overall, I believe that this work will positively impact the field as it provides a list of potential regulators of ATF6 activation and repression that others will be able to use as a launch point for discovering such interactions in cells and tissues of interest beyond CHO cells. However, I agree with the authors that these findings were in CHO cell lines and that it is possible, if not likely, that some of the interactions they found will be cell type/line specific. 

      We accept this point and re-emphasize the qualification that our conclusions cannot be glibly extrapolated to other cell lines.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews: 

      Reviewer #1 (Public Review): 

      The goal of the current study was to evaluate the effect of neuronal activity on blood-brain barrier permeability in the healthy brain, and to determine whether changes in BBB dynamics play a role in cortical plasticity. The authors used a variety of well-validated approaches to first demonstrate that limb stimulation increases BBB permeability. Using in vivo-electrophysiology and pharmacological approaches, the authors demonstrate that albumin is sufficient to induce cortical potentiation and that BBB transporters are necessary for stimulus-induced potentiation. The authors include a transcriptional analysis and differential expression of genes associated with plasticity, TGF-beta signaling, and extracellular matrix were observed following stimulation. Overall, the results obtained in rodents are compelling and support the authors' conclusions that neuronal activity modulates the BBB in the healthy brain and that mechanisms downstream of BBB permeability changes play a role in stimulus-evoked plasticity. These findings were further supported with fMRI and BBB permeability measurements performed in healthy human subjects performing a simple sensorimotor task. There is literature to suggest that there are sex differences in BBB dysfunction in pathophysiological conditions and the authors have acknowledged the use of only males as a minor limitation of the study that should be addressed in the future. Future studies should also test whether the upregulation of OAT3 plays a role in cortical plasticity observed following stimulation. Overall, this study provides novel insights into how neurovascular coupling, BBB permeability, and plasticity interact in the healthy brain. 

      Reviewer #2 (Public Review): 

      Summary: 

      This study builds upon previous work that demonstrated that brain injury results in leakage of albumin across the blood brain barrier, resulting in activation of TGF-beta in astrocytes. Consequently, this leads to decreased glutamate uptake, reduced buffering of extracellular potassium and hyperexcitability. This study asks whether such a process can play a physiological role in cortical plasticity. They first show that stimulation of a forelimb for 30 minutes in a rat results in leakage of the blood brain barrier and extravasation of albumin on the contralateral but not ipsilateral cortex. The authors propose that the leakage is dependent upon neuronal excitability and is associated with an enhancement of excitatory transmission. Inhibiting the transport of albumin or the activation of TGF-beta prevents the enhancement of excitatory transmission. In addition, gene expression associated with TGF-beta activation, synaptic plasticity and extracellular matrix are enhanced on the "stimulated" hemisphere. That this may translate to humans is demonstrated by a break down in the blood brain barrier following activation of brain areas through a motor task. 

      Strengths: 

      This study is novel and the results are potentially important as they demonstrate an unexpected break down of the blood brain barrier with physiological activity and this may serve a physiological purpose, affecting synaptic plasticity. 

      The strengths of the study are: 

      (1) The use of an in vivo model with multiple methods to investigate the blood brain barrier response to a forelimb stimulation. 

      (2) The determination of a potential functional role for the observed leakage of the blood brain barrier from both a genetic and electrophysiological viewpoint. 

      (3) The demonstration that, by inhibiting different points in the putative pathway from activation of the cortex to transport of albumin and activation of the TGF-beta pathway, the effect on synaptic enhancement could be prevented. 

      (4) Preliminary experiments demonstrating a similar observation of activity dependent break down of the blood brain barrier in humans. 

      Weaknesses: 

      The authors adequately addressed most of my points. A few remain: 

      (1) Although the authors have addressed the possible effects of anaesthesia on neuro-vascular coupling, they have not mentioned or addressed the possible effects of ketamine (an NMDA receptor antagonist) on synaptic plasticity. Indeed, the low percentage of SEP increase following potentiation (10-20%) could perhaps be explained by partial block of NMDA receptors by ketamine.

      We agree and apologize for this oversight. This important issue is now addressed in the Discussion.

      “Notably, the antagonistic effect of ketamine on NMDA receptors might attenuate the magnitude of SEP potentiation recorded in our experiments (Anis et al., 1983; Salt et al., 1988).”

      (2) The experimental paradigms remain unclear to me. Now, it appears that drugs are applied for 50 minutes and that the stimulation occurs during the "washout period". The more conventional approach would be to have the drug application during the stimulation period to determine if the drugs occlude or enhance the effects of stimulation and then washout the drugs. The problem is that drugs variably washout at different rates depending upon their lipid solubility.

      We agree that the more conventional approach would have been to continue applying the drug throughout the experiment and that differential rates of washout may add variability to our experiments. However, despite this limitation, within each treatment group we found that the SEP response at 50 minutes (immediately after the drug application window) does not differ from SEP response at 80 minutes (after 30 minutes of stimulation and washout) [Figure 3H&G]. This suggests that the drug effects were still present despite terminating drug application and performing potentiation-inducing stimulation. Moreover, our analysis showed that animals within each treatment group (except AP5) had similar SEP responses with little intra-group variability.

      (3) It is still not clear to what extent the experimenters and those doing the analysis were blinded to group. If one or both were blind to group, then please put this in the methods.

      Thank you for this comment. We revised the Methods section to clearly confirm that data was collected and analyzed blindly.  

      Reviewer #3 (Public Review): 

      Summary: 

      This study used prolonged stimulation of a limb to examine possible plasticity in somatosensory evoked potentials induced by the stimulation. They also studied the extent that the blood brain barrier (BBB) was opened by the prolonged stimulation and whether that played a role in the plasticity. They found that there was potentiation of the amplitude and area under the curve of the evoked potential after prolonged stimulation and this was long-lasting (>5 hrs). They also implicated extravasation of serum albumin, caveolae-mediated transcytosis, and TGFb signalling, as well as neuronal activity and upregulation of PSD95. Transcriptomics was done and implicated plasticity related genes in the changes after prolonged stimulation, but not proteins associated with the BBB or inflammation. Next, they address the application to humans using a squeeze ball task. They imaged the brain and suggest that the hand activity led to an increased permeability of the vessels, suggesting modulation of the BBB. 

      Strengths: 

      The strengths of the paper are the novelty of the idea that stimulation of the limb can induce cortical plasticity in a normal condition, and it involves opening of the BBB with albumin entry. In addition, there are many datasets and both rat and human data. 

      Weaknesses: 

      The conclusions are not compelling, however, because of a lack of explanation of methods.

      In the revised paper, we added a section titled ‘study design’ that presents an overview of the experimental approach.

      The explanation of why prolonged stimulation in the rat was considered relevant to normal conditions should be as clear in the paper as it is in the rebuttal.

      We added a new paragraph to the Discussion section explaining this point as we did in the rebuttal:  

      “Our animal experiments show that a 30 min limb stimulation (at 6Hz and 2mA) increases cross-BBB influx, while a 1 min stimulation (of similar frequency and magnitude) does not. We believe that both types of stimulations fall within the physiological range because our continuous electrophysiological recordings showed no signs of epileptiform or otherwise pathological activity. Moreover, the recorded SEP levels were similar to those reported in previous physiological LTP studies in rats (Eckert & Abraham, 2010; Han et al., 2015; Mégevand et al., 2009) and humans (McGregor et al., 2016). In humans, skill acquisition often involves motor training sessions that last ≥30 minutes (Bengtsson et al., 2005; Classen et al., 1998) and result in physiological plasticity of sensory and motor systems (Classen et al., 1998; Draganski et al., 2004; Sagi et al., 2012). Hence, the experimental task in our human study (30 minutes of repetitive squeezing of an elastic stress-ball) is likely to represent physiological activity, with neuronal activation in primarily motor and sensory areas (Halder et al., 2005). Future human and animal studies are needed to explore the BBB modulating effects of additional stimulation protocols – with varying durations, frequencies, and magnitudes. Such studies may also elucidate the temporal and ultrastructural characteristics that differentiate between physiological and pathological BBB modulation.”

      The authors need to ensure other aspects of the rebuttal are as clear in the paper as in the rebuttal too. 

      Thank you for this comment. This was addressed in the revised paper.

      The only remaining concern that is significant is that it is hard to understand the figures. 

      Thank you for this comment. We revised the figures according to the reviewer’s recommendations. We hope that these changes increase the legibility of the figures. 

      Reviewer #3 (Recommendations For The Authors): 

      The manuscript is improved but there are still suggestions that do not appear to have been addressed. More experiments are not involved in addressing these concerns but one wants the paper to be clarified in terms of what was done. 

      Figures. Please use arrows to point to the effect that the reader should see. Please note what the main point is. 

      Major concerns: 

      Please add explanations, exact p values, and other revisions in the rebuttal to the paper. 

      Rebuttal explanations were added to the paper and p values appear in figure legends.

      Fig 1d shows a seizure-like event which the authors don't think is a seizure because it lacks a depolarization shift. This explanation is not convincing because an LFP would not necessarily show a depolarization shift. Another argument or a discussion of the event as a seizure is warranted. Note that expanding the trace might also show it is unlike a seizure. Regarding the idea that 6Hz 2 mA stimuli for 30 min are physiological, the authors make three arguments which are not clear. First, no epileptiform activity was found, but in Fig. 1 it looks like a seizure occurred. Second, memory and skill acquisition in humans often involve a similar training duration - but what about 6Hz 2 mA?

      Rats are known to rhythmically move their whiskers at frequencies ranging between 5 and 15 Hz (Mégevand et al., 2009). We agree that there is no clear way to justify the similarity between the experimental design in humans and rats. However, we believe that both paradigms (paw stimulation in rats and ball squeeze in humans) represent non-pathological input that we found to modulate barrier permeability. This argument was added to the discussion of the paper:

      “We believe that both types of stimulations fall within the physiological range because in rats, activity between 5–15 Hz represents physiological rhythmic whisker movement during environment exploration (Mégevand et al., 2009).”

      Seizures are typically induced in rats via direct tetanic stimulation of the brain (at 50 Hz and 0.3-2.5mA) or maximal electroshock test to the cornea (at 50 Hz and 150 mA) (Swinyard et al., 1952). We, therefore, assert that the activity we observe represents physiological responses and not seizures. This argument is beyond the scope of the current paper. 

      Please note a limitation is that the high level of serum albumin is unlikely to be physiological but may not have been as high in the animal because of the low diffusion rate and degradation (please add the refs in the rebuttal). 

      Thank you, we added the following to the Results section: 

      “The relatively high concentration of albumin was chosen to account for factors that lower its effective tissue concentration such as its low diffusion rate and its likelihood to encounter a degradation site or a cross-BBB efflux transporter (Tao & Nicholson, 1996; Zhang & Pardridge, 2001).”

      Fig. 1. 

      Please consider a box in b to show where the expanded traces in the lower row came from. 

      Thank you for the suggestion. We added lines indicating where the trace excerpts were taken from.

      c. Please use arrows to point to the parts that the authors want the reader to note. In the legend, explain what t is, and delta HbT.

      Thank you. We implemented this suggestion.

      d. It is not clear what the double-sided arrows are meant to show compared to the arrow without two sides. 

      We replaced the two-headed arrow with two single ones.

      e. Please explain what the upward lines at the top signify. What does the red asterisk mean? 

      Thank you. We implemented this suggestion.

      f. Is the reader supposed to note the yellow area? Please make it with an arrow or circle if so. 

      Thank you, we added a white circle to mark the area of tracer accumulation.

      g. Please explain what the permeability index is or reference the part of the paper that does. 

      Further to this suggestion, we added a reference to the appropriate methods section to the legend.

      h. Please use arrows to point to the area of interest. 

      Thank you. We implemented this suggestion.

      m-n. Please mark areas of interest with arrows.  m. the top right two images are unclear. I suggest making them say ipsi inset and contra inset instead of using asterisks. 

      Thank you. We added the ipsi and contra labels to panels in m. The images in panel n represent a phenomenon with no particular region of interest, but rather peri-vascular tracer accumulation along the entire depicted blood vessel. We clarified that panel n represents a separate experiment from panel m: “n. In an animal injected with both EB and NaFlu post stimulation, fluorescence imaging shows extravascular accumulation of both tracers along a cortical small vessel in the stimulated hemisphere.”

      Figure 2. 

      (2) a. Middle. What are the vertical lines at the top? The rebuttal states that was explained in the revised legends but I don't see it. 

      Our apologies. We have now included an explanation that “an excerpt of the stimulation trace is shown above the middle LFP trace”.

      c and d are very different field potentials in shape and therefore hard to compare. The rebuttal addresses this but the explanation is not in the revised text. 

      We agree that there is variability in SEP responses between animals. We have now added a statement acknowledging this in the methods section: “To overcome potential variability in SEP morphology between animals (Mégevand et al., 2009), each animal’s plasticity measures (max amplitude and AUC of post stimulation SEP) were compared to the same measures at baseline.”
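
      The within-animal comparison described above can be sketched as follows. This is only an illustrative sketch: the function names, the use of the rectified trace, and the trapezoidal rule are assumptions made here, since the exact definitions of max amplitude and AUC are not given in this response.

```python
def max_amplitude(trace):
    """Largest absolute deflection in an SEP trace (assumed definition)."""
    return max(abs(v) for v in trace)


def area_under_curve(trace, dt):
    """Trapezoidal area under the rectified trace; dt is the sampling interval."""
    rectified = [abs(v) for v in trace]
    return sum((a + b) / 2.0 * dt for a, b in zip(rectified, rectified[1:]))


def relative_to_baseline(post, baseline, dt):
    """Express each post-stimulation measure as a ratio to the same animal's baseline."""
    return {
        "amplitude_ratio": max_amplitude(post) / max_amplitude(baseline),
        "auc_ratio": area_under_curve(post, dt) / area_under_curve(baseline, dt),
    }


# Hypothetical traces in arbitrary units, used only to exercise the functions.
baseline_trace = [0.0, 1.0, 2.0, 1.0, 0.0]
post_trace = [0.0, 2.0, 4.0, 2.0, 0.0]
ratios = relative_to_baseline(post_trace, baseline_trace, dt=1.0)
```

      Because each ratio is computed against the same animal's baseline, differences in SEP shape between animals cancel out of the comparison.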

      In d, it is not clear there is potentiation because the traces are not aligned. 

      All panels depicting SEP traces represent raw data with no alignment. The shift observed in panel d exemplifies why we compare post-stimulation parameters of max amplitude and area under curve to baseline in each animal. 

      Exact P values are said to have been added in the rebuttal but they were not. 

      Exact P values appear in Figure legends.

      (3) b. Use arrows to mark the area of interest. 

      Thank you. We added a white circle to mark the area of tracer accumulation similar to Figure 1f.

      d. Why is there an oscillation superimposed on all traces except CNQX? 

      We agree that this is an interesting question. Future studies should determine the source of this SEP pattern.   

      (4) What does the line and the number 2 mean? How were data normalized? What was counted? What area of cortex?

      The number 2 refers to the scale bar line, meaning a log fold change of 2 reflects the size of the scale bar line. 

      The plot shows the log fold change against the mean count of each gene in the contralateral somatosensory cortex between 1 and 24 hours after stimulation.

      The x axis title was changed to “mean expression” and the legend was modified to:

      “Scatter plot of gene expression from RNA-seq in the contralateral somatosensory cortex 24 vs. 1 h after 30 min stimulation. The y axis represents the log fold change, and the x axis represents the mean expression levels (see methods, RNA Sequencing & Bioinformatics). Blue dots indicate statistically significant differentially expressed genes (DEGs) by Wald Test (n=8 rats per group).”
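
      For readers unfamiliar with this kind of plot, the two axes can be illustrated with a naive calculation for a single gene. This is a sketch under stated assumptions, not the pipeline used in the paper: the base-2 logarithm, the pseudocount, and the function name are assumptions made here, and dedicated differential-expression tools also normalize counts and apply a statistical test such as the Wald test.

```python
import math


def ma_point(counts_early, counts_late, pseudocount=1.0):
    """Return (mean expression, log2 fold change of late over early) for one gene.

    counts_early and counts_late hold one normalized count per animal.
    """
    mean_early = sum(counts_early) / len(counts_early)
    mean_late = sum(counts_late) / len(counts_late)
    mean_expression = (mean_early + mean_late) / 2.0
    # The pseudocount (an assumption) avoids taking the log of zero.
    log_fold_change = math.log2((mean_late + pseudocount) / (mean_early + pseudocount))
    return mean_expression, log_fold_change


# Hypothetical counts for one gene at 1 h and 24 h after stimulation.
x_value, y_value = ma_point([3.0, 3.0], [15.0, 15.0])
```

      In this toy case the gene would sit at a mean expression of 9 on the x axis and a log fold change of 2 on the y axis, which corresponds to the height of the scale bar described above.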

      How were the pericytes, smooth muscle cells, etc. distinguished?

      This was explained under Methods->RNA Sequencing & Bioinformatics: “Analysis of cell-specific and vascular zonation genes was performed as described (Vanlandewijck et al., 2018), using the database provided in (http://betsholtzlab.org/VascularSingleCells/database.html).”

      What were the chi square statistics? If there were cells used instead of rats, please justify. 

      Thank you. The legend was expanded to include the following:

      “The contralateral somatosensory cortex was found to have a significantly higher number of DEGs related to synaptic plasticity, than the ipsilateral side (***p<0.001, Chi-square).”     

      (5) b. what do the icons mean? 

      We agree that the icons were confusing. We simplified this panel to just show when participants were asked to squeeze the ball (black icon). This explanation was added to the Figure legend.

      Abbreviations? 

      Abbreviations of MRI protocols were added to the figure legend for clarity.

      In c-e what are the units of measure? Fold-change? 

      The units represent t-statistics values for each voxel. The label ‘t-statistic’ was added to the figure.  

      What are the white lines, + and - signs?

      The white lines point to voxels of highest activation (t-statistic). This was added to the legend.

      These are not +/- signs; they are voxels with significant activation that only appear similar to such signs.

      f. Please explain f and g for clarity. 

      Thank you. The explanation was modified for added clarity.

      Supplemental Fig. 4. 

      Original question: If ipsilateral and contralateral showed many changes why do the authors think the effects were only contralateral? 

      The authors replied: Our gene analysis was designed to complement our in vivo and histological findings, by assessing the magnitude of change in differentially expressed genes (DEGs). This analysis showed that: (1) the hemisphere contralateral to the stimulus has significantly more DEGs than the ipsilateral hemisphere; and (2) the DEGs were related to synaptic plasticity and TGF-b signaling. These findings strengthen the hypothesis raised by our in vivo and histological experiments. 

      Could the authors clarify the answer to the question in the text? 

      Thank you. This section was added to the Discussion. 

      Papers referenced in this letter:

      Anis, N. A., Berry, S. C., Burton, N. R., & Lodge, D. (1983). The dissociative anaesthetics, ketamine and phencyclidine, selectively reduce excitation of central mammalian neurones by N-methyl-aspartate. British Journal of Pharmacology, 79(2), 565–575. https://doi.org/10.1111/j.1476-5381.1983.tb11031.x

      Bengtsson, S. L., Nagy, Z., Skare, S., Forsman, L., Forssberg, H., & Ullén, F. (2005). Extensive piano practicing has regionally specific effects on white matter development. Nature Neuroscience, 8(9), 1148–1150. https://doi.org/10.1038/nn1516

      Classen, J., Liepert, J., Wise, S. P., Hallett, M., & Cohen, L. G. (1998). Rapid plasticity of human cortical movement representation induced by practice. Journal of Neurophysiology, 79(2), 1117–1123. https://doi.org/10.1152/JN.1998.79.2.1117

      Draganski, B., Gaser, C., Busch, V., Schuierer, G., Bogdahn, U., & May, A. (2004). Changes in grey matter induced by training. Nature, 427(6972), 311–312. https://doi.org/10.1038/427311a

      Eckert, M. J., & Abraham, W. C. (2010). Physiological effects of enriched environment exposure and LTP induction in the hippocampus in vivo do not transfer faithfully to in vitro slices. Learning and Memory, 17(10), 480–484. https://doi.org/10.1101/lm.1822610

      Halder, P., Sterr, A., Brem, S., Bucher, K., Kollias, S., & Brandeis, D. (2005). Electrophysiological evidence for cortical plasticity with movement repetition. European Journal of Neuroscience, 21(8), 2271–2277. https://doi.org/10.1111/J.1460-9568.2005.04045.X

      Han, Y., Huang, M. De, Sun, M. L., Duan, S., & Yu, Y. Q. (2015). Long-term synaptic plasticity in rat barrel cortex. Cerebral Cortex, 25(9), 2741–2751. https://doi.org/10.1093/cercor/bhu071

      McGregor, H. R., Cashaback, J. G. A., & Gribble, P. L. (2016). Functional Plasticity in Somatosensory Cortex Supports Motor Learning by Observing. Current Biology, 26(7), 921–927. https://doi.org/10.1016/j.cub.2016.01.064

      Mégevand, P., Troncoso, E., Quairiaux, C., Muller, D., Michel, C. M., & Kiss, J. Z. (2009). Long-term plasticity in mouse sensorimotor circuits after rhythmic whisker stimulation. Journal of Neuroscience, 29(16), 5326–5335. https://doi.org/10.1523/JNEUROSCI.5965-08.2009

      Sagi, Y., Tavor, I., Hofstetter, S., Tzur-Moryosef, S., Blumenfeld-Katzir, T., & Assaf, Y. (2012). Learning in the Fast Lane: New Insights into Neuroplasticity. Neuron, 73(6), 1195–1203. https://doi.org/10.1016/j.neuron.2012.01.025

      Salt, T. E., Wilson, D. G., & Prasad, S. K. (1988). Antagonism of N-methylaspartate and synaptic responses of neurones in the rat ventrobasal thalamus by ketamine and MK-801. British Journal of Pharmacology, 94(2), 443–448. https://doi.org/10.1111/j.1476-5381.1988.tb11546.x

      Swinyard, E. A., Brown, W. C., & Goodman, L. S. (1952). Comparative assays of antiepileptic drugs in mice and rats. The Journal of Pharmacology and Experimental Therapeutics, 106(3), 319–330. http://jpet.aspetjournals.org/content/106/3/319.abstract

      Tao, L., & Nicholson, C. (1996). Diffusion of albumins in rat cortical slices and relevance to volume transmission. Neuroscience, 75(3), 839–847. https://doi.org/10.1016/0306-4522(96)00303-X

      Vanlandewijck, M., He, L., Mäe, M. A., Andrae, J., Ando, K., Del Gaudio, F., Nahar, K., Lebouvier, T., Laviña, B., Gouveia, L., Sun, Y., Raschperger, E., Räsänen, M., Zarb, Y., Mochizuki, N., Keller, A., Lendahl, U., & Betsholtz, C. (2018). A molecular atlas of cell types and zonation in the brain vasculature. Nature, 554(7693), 475–480. https://doi.org/10.1038/nature25739

      Zhang, Y., & Pardridge, W. M. (2001). Mediated efflux of IgG molecules from brain to blood across the blood–brain barrier. Journal of Neuroimmunology, 114(1–2), 168–172. https://doi.org/10.1016/S0165-5728(01)00242-9

    2. eLife assessment

      This study builds upon previous work which demonstrated that brain injury results in the entry of a protein called albumin into the brain which then causes diverse effects. The present study shows that prolonged stimulation of a forelimb in a rat leads to albumin entry, and is associated with effects that suggest plasticity is enhanced in the stimulated side of the brain. The strength of evidence was convincing and results are important because they suggest a previously-considered pathological process may be relevant to the normal brain and have benefits.

    3. Reviewer #3 (Public Review):

      Summary:

      This study used prolonged stimulation of a limb to examine possible plasticity in somatosensory evoked potentials induced by the stimulation. They also studied the extent that the blood brain barrier (BBB) was opened by the prolonged stimulation and whether that played a role in the plasticity. They found that there was potentiation of the amplitude and area under the curve of the evoked potential after prolonged stimulation and this was long-lasting (>5 hrs). They also implicated extravasation of serum albumin, caveolae-mediated transcytosis, and TGFb signalling, as well as neuronal activity and upregulation of PSD95. Transcriptomics was done and implicated plasticity related genes in the changes after prolonged stimulation, but not proteins associated with the BBB or inflammation. Next, they address the application to humans using a squeeze ball task. They imaged the brain and suggested that the hand activity led to an increased permeability of the vessels, suggesting modulation of the BBB.

      Strengths:

      The strengths of the paper are the novelty of the idea that stimulation of the limb can induce cortical plasticity in a normal condition, and it involves the opening of the BBB with albumin entry. In addition, there are many datasets, both rat and human data.

      Weaknesses:

      The explanation of why prolonged stimulation in the rat was considered relevant to normal conditions is still somewhat weak. The authors argue that the stimulation frequency they used is similar to rhythmic whisker movement. That is a good argument. However, the intensity they used, 2 mA is in the range they say can elicit a seizure if stimulation is 50 Hz. So that weakens the argument.

      The authors made a lot of the requested changes but some questions were not addressed or the explanations were so brief that the confusion remained. Please go over the revisions again and make sure sentences are complete, jargon is explained, and arguments/justifications are clear. It will help the reader greatly.

      The authors responded to the previous comments of Reviewer 2 regarding experimental design and variability of washout periods. It would be useful to incorporate the response into the paper so the readers know why the authors think the variability was not an important factor in the results.

      Comments on the revised version:

      The manuscript is improved.

    1. eLife assessment

      This study provides an important cell type atlas of the gill of the mussel Gigantidas platifrons using a single nucleus RNA-seq dataset, a resource for the community of scientists studying deep sea physiology and metabolism and intracellular host-symbiont relationships. The evidence supporting the conclusions is convincing with high-quality single-nucleus RNA sequencing and transplant experiments. This work will be of broad relevance for scientists interested in host-symbiont relationships across ecosystems.

    2. Reviewer #1 (Public Review):

      Wang, He et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors.

      This paper makes available a high quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Wang, He et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors. 

      This paper makes available a high quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data. 

      Reviewer #2 (Public Review): 

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways. 

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons. 

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled. 

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution. 

      Reviewer #3 (Public Review): 

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization of many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate that a diversity of cell types supports the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive cells, ciliary, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement. 

      One particular area for future exploration surrounds the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations, and future work will uncover whether these are indeed proliferative cells that contribute to symbiont colonization.

      Overall, the significance of this work is identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work on symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems.

      We extend our sincere gratitude to all the reviewers for their positive comments and kind words. We highly value the substantial efforts they made in helping us improve and enhance our manuscript. Additionally, we thank the reviewers for pointing out the limitations of our current study, which will guide us in improving our future research.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      This study system is so interesting and this is a truly unique and exciting dataset. Most of my suggestions are aimed at improving readability and making it more accessible for a broader audience, since I predict many fields will find it interesting. 

      Line 60: which species of mussel? Is this the same one? 

      We appreciate the comments from the reviewer. The reference here is to deep-sea bathymodiolin mussels, which, in most cases, possess enlarged gill filaments that accommodate symbionts.

      Line 237-230: citation of previous findings missing 

      We appreciate the comments from the reviewer. After carefully reviewing these paragraphs, we believe that all the previous findings have now been properly cited.

      Line 256: it might be a good idea to give a brief description of what slingshot analysis is here 

      We appreciate the comments from the reviewer. We have revised the corresponding part of our manuscript to make it clear.

      This part of the manuscript now reads: “We performed Slingshot analysis, which uses a cluster-based minimum spanning tree (MST) and a smoothed principal curve to determine the developmental path of cell clusters. The result shows that the PEBZCs might be the origin of all gill epithelial cells, including the other two proliferation cells (VEPC and DEPC) and bacteriocytes (Supplementary Fig. S6).” Lines 203-207 of the revised manuscript.
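
      The cluster-based MST step mentioned in the quoted text can be illustrated with a small sketch. It is not the Slingshot implementation: it uses plain Euclidean distance between cluster centroids for simplicity, which may differ from the package's default distance, it omits the principal-curve smoothing entirely, and the cluster names and coordinates are hypothetical.

```python
import math


def centroid(points):
    """Mean position of a cluster's cells in the embedding."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def cluster_mst(clusters, root):
    """Build a minimum spanning tree over cluster centroids with Prim's algorithm.

    clusters maps a cluster name to a list of coordinate tuples.
    Returns (parent, child) edges in the order they are added, starting from root.
    """
    centers = {name: centroid(pts) for name, pts in clusters.items()}
    in_tree = {root}
    edges = []
    while len(in_tree) < len(centers):
        best = None
        for u in in_tree:
            for v in centers:
                if v in in_tree:
                    continue
                d = math.dist(centers[u], centers[v])
                if best is None or d < best[0]:
                    best = (d, u, v)
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges


# Hypothetical clusters A-D with toy 2-D coordinates.
toy_clusters = {
    "A": [(0.0, 0.0)],
    "B": [(1.0, 0.0)],
    "C": [(0.0, 1.5)],
    "D": [(3.0, 0.0)],
}
tree_edges = cluster_mst(toy_clusters, root="A")
```

      Reading the edges outward from the chosen root gives candidate lineages, which is the sense in which a root cluster can be proposed as the origin of the others.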

      Line 289: Wording is a bit confusing- what is meant by morphological analysis?

      We acknowledge that our wording might be a bit confusing here. We are referring to the TEM ultrastructural analysis. Therefore, we have changed “morphological analysis” to “ultrastructural analysis.” Line 231 in the revised manuscript.

      Line 351-354: how did you calculate distances? How many dimensions were used? 

      We calculated the centroid coordinates for each cell type in each state on the 2-dimensional UMAP plot (Fig. 6A). Then, for each cell type, we determined the Euclidean distance between the centroid coordinates of each pair of states. We have revised the manuscript with this more detailed description. Lines 292-295 of the revised manuscript.
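
      The calculation described above can be sketched as follows. The function name, the input layout, and the coordinates are hypothetical and serve only to make the two steps (centroid per cell type and state, then pairwise Euclidean distance between states) concrete.

```python
import math
from collections import defaultdict
from itertools import combinations


def state_shift_distances(cells):
    """Distance between state centroids for each cell type on a 2-D embedding.

    cells is an iterable of (cell_type, state, x, y) tuples.
    Returns {cell_type: {(state_a, state_b): distance}} with states sorted by name.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for cell_type, state, x, y in cells:
        acc = sums[(cell_type, state)]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    by_type = defaultdict(dict)
    for (cell_type, state), (sx, sy, n) in sums.items():
        by_type[cell_type][state] = (sx / n, sy / n)
    return {
        cell_type: {
            (a, b): math.dist(states[a], states[b])
            for a, b in combinations(sorted(states), 2)
        }
        for cell_type, states in by_type.items()
    }


# Hypothetical UMAP coordinates for one cell type in two states.
toy_cells = [
    ("bacteriocyte", "Fanmao", 0.0, 0.0),
    ("bacteriocyte", "Fanmao", 2.0, 0.0),
    ("bacteriocyte", "Starvation", 1.0, 3.0),
    ("bacteriocyte", "Starvation", 1.0, 5.0),
]
distances = state_shift_distances(toy_cells)
```

      A larger distance for a cell type indicates a larger shift of that population between the two states on the plot. Distances measured on a 2-D UMAP embedding are only a rough proxy for transcriptomic change.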

      Line 462: identify -> identified 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version. Line 396 of the revised manuscript.

      Line 509: what does the size of the dot represent? 

      In this context, the color and intensity of each dot represent a specific gene’s expression level in the single-cell cluster. The dot size is universal and therefore does not convey a specific meaning.

      Fig 3A: What is the blue cluster highlighted? 

      We apologize for our mistake. The label for the teal box was missing. We have corrected this in the revised manuscript.

      Fig 3K: Wording in key is confusing. 

      We have modified our description of Figure 3K in the figure legends. It now reads: “Schematic of water flow agitated by different ciliary cell types. The color of arrowheads corresponds to water flow potentially influenced by specific types of cilia, as indicated by their color code in Figure 3A.” Lines 462-464 in the revised manuscript.

      Fig 5B: which population of mussels was used to take these images? 

      Mussels from the “Fanmao” (methane-rich) site were used to take these images. We have revised our Materials and Methods to make this clear. Lines 602-603 of the revised manuscript.

      Fig 5E,5G,5H: panels not referenced in text 

      We apologize for our mistake and appreciate the reviewer’s thorough reading. This error has been corrected in the new version of the manuscript. Line 233 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments: 

      Fig. 3A - the teal box in the legend lacks a label 

      We apologize for our mistake. The label for the teal box was missing. We have corrected this in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      My enthusiasm for the manuscript remains high and I appreciate the authors care in responding to the various reviewer questions and concerns. 

      In regards to the cell proliferation results, I have modified my public review and look forward to your future work in this area. The data for both pHistone H3 and anti PCNA are compelling! 

      One typo I did catch occurs on line 520. I believe you meant to say "outer" not "otter." 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version.

    4. Reviewer #2 (Public Review):

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways.

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons.

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled.

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution.

    5. Reviewer #3 (Public Review):

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization of many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate that a diversity of cell types supports the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive, ciliary, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement.

      The one particular area for future exploration surrounds the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations and additional future work will uncover if these are indeed proliferative cells that contribute to symbiont colonization.

      Overall, the significance of this work lies in identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work on symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems.

    1. eLife assessment

      This manuscript provides important information on the calcification process, especially the properties and formation of freshly formed tests (the foraminiferan shells), in the miliolid foraminiferan species Pseudolachlanella eburnea. The evidence from the high-quality SEM images is convincing although the fluorescence images only provide indirect support for the calcification process.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Dubicka and co-workers on calcification in miliolid foraminifera presents an interesting piece of work. The study uses confocal and electron microscopy to show that the traditional picture of calcification in porcelaneous foraminifera is incorrect.

      Strengths:

      The authors present high-quality images and an original approach to a relatively solid (so I thought) model of calcification.

      Weaknesses:

      There are several major shortcomings. Despite the interesting subject and the wonderful images, the conclusions of this manuscript are simply not supported at all by the results. The fluorescent images may not have any relation to the process of calcification and should therefore not be part of this manuscript. The SEM images, however, do point to an outdated idea of miliolid calcification. I think the manuscript would be much stronger with the focus on the SEM images and with the speculation of the physiological processes greatly reduced.

      Comments on revised version:

      I continue to disagree. As the authors acknowledge: 'may be a hint indicating ACC...', but it may also be something else. This is really something else than showing ACC is involved in foraminiferal calcification. I still think the reasoning is shaky and below, I will clarify why the fluorescence may well not be related to ACC and in fact, some or even most of the vesicles may not play the role that the authors suggest. Even if they do, the conclusions are not supported by the data presented here. Unfortunately, I found some of the other answers to my question not satisfactory either.

    3. Reviewer #2 (Public Review):

      Summary:

      Dubicka et al. in their paper entitled "Biocalcification in porcelaneous foraminifera" suggest that in contrast to the traditionally claimed two different modes of test calcification by rotaliid and porcelaneous miliolid foraminifera, both groups produce calcareous tests via the intravesicular mineral precursors (Mg-rich amorphous calcium carbonate). These precursors are proposed to be supplied by endocytosed seawater and deposited in situ as mesocrystals formed at the site of new wall formation within the organic matrix. The authors did not observe the calcification of the needles within the transported vesicles, which challenges the previous model of miliolid mineralization. Although the authors argue that these two groups of foraminifera utilize the same calcification mechanism, they also suggest that these calcification pathways evolved independently in the Paleozoic.

      Comments on the revised version

      In my reply to the author's rebuttal letter, I will focus on one key point. The main observation supporting the author's conclusion, as expressed in the abstract, is:

      "We found that both groups [i.e., rotaliids and miliolids, the latter documented in the reviewed paper] produced calcareous shells via the intravesicular formation of unstable mineral precursors (Mg-rich amorphous calcium carbonates) supplied by endocytosed seawater and deposited at the site of new wall formation within the organic matrix. Precipitation of high-Mg calcitic mesocrystals took place in situ and formed a dense, chaotic meshwork of needle-like crystallites."

      In my review, I pointed out that there is no support for the existence of an intracellular, vesicular intermediate amorphous phase.

      The authors replied:

      "We used laser line 405 nm and multiphoton excitation to detect ACCs. These wavelengths (partly) permeate the shell to excite ACCs autofluorescence. The autofluorescence of the shells is present as well but not clearly visible in movie S4 as the fluorescence of ACCs is stronger. This may be related to the plane/section of the cell which is shown. The laser permeates the shell above the ACCs (short distance) but to excite the shell CaCO3 around foraminifera in the same three-dimensional section where ACCs are shown, the light must pass a thick CaCO3 area due to the three-dimensional structure of the foraminiferan shell. Therefore, the laser light intensity is reduced. In a revised version, a movie/image with reduced threshold is shown."

      This reply does not address the reviewer's concerns. Detection of ACC with 405 nm excitation is not sufficient; many organic components can fluoresce under violet light excitation. For example, Delvene et al. (2002) (https://doi.org/10.18261/let.55.4.7) showed that "the Pleistocene and Jurassic microborings emit in the blue-yellow spectral region (420-600 nm) with a laser excitation of 405 nm, which coincides with the emission due to NADPH [nicotinamide adenine dinucleotide], FAD [flavin adenine dinucleotide], and riboflavin pigments characteristic of some cyanobacteria." Traditionally, in geological or biogenic calcium carbonate samples, Raman spectroscopic characterization of ACC and its magnesium content can be used (e.g., Wang, D., Hamm, L. M., Bodnar, R. J. & Dove, P. M. Raman spectroscopic characterization of the magnesium content in amorphous calcium carbonates. J. Raman Spectrosc. 43, 543-548 (2012); Perrin, J. et al. Raman characterization of synthetic magnesian calcites. Am. Mineral. 101, 2525-2538 (2016)). However, in biological, living-cell systems, Mehta et al. (2022) (doi: 10.1016/j.saa.2022.121262) successfully used FTIR spectroscopy to identify ACC by two characteristic FTIR vibrations at ca. 860 cm-1 and ca. 306 cm-1. Other methods such as STXM analyses at the C K-edge (Monteil et al. 2021, doi: 10.1038/s41396-020-00747-3) are also available. Because the core of the authors' interpretation (i.e., detection of ACC in vesicles) is not supported by hard evidence, the claim that the study represents a "paradigm shift" is far-fetched and the whole model is based on speculations. If the authors are able to unequivocally confirm the presence of ACC within the vesicles and its subsequent transformation into calcitic needles, the other problems noted in the paper will be relatively trivial.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Dubicka and co-workers on calcification in miliolid foraminifera presents an interesting piece of work. The study uses confocal and electron microscopy to show that the traditional picture of calcification in porcelaneous foraminifera is incorrect.

      Strengths:

      The authors present high-quality images and an original approach to a relatively solid (so I thought) model of calcification.

      Weaknesses:

      There are several major shortcomings. Despite the interesting subject and the wonderful images, the conclusions of this manuscript are simply not supported at all by the results. The fluorescent images may not have any relation to the process of calcification and should therefore not be part of this manuscript. The SEM images, however, do point to an outdated idea of miliolid calcification. I think the manuscript would be much stronger with the focus on the SEM images and with the speculation of the physiological processes greatly reduced.

      We agree that the fluorescence studies presented in the paper are not, by themselves, unequivocal proof of the calcification model utilised by the studied Miliolida species. However, the fluorescence data combined with the SEM studies, especially the overlap of the elements that show autofluorescence upon excitation at 405 nm (emission 420–480 nm) with the acidic vesicles marked by pH-sensitive LysoGlow84, may be a hint indicating ACC-bearing vesicles.

      We will tone down the physiological interpretation based on fluorescence studies in the revised version of the manuscript.

      Nevertheless, we think that our fluorescent live-imaging experiments provide important observations on Miliolida, which are scarce in the existing literature, and are therefore worth presenting, as they might be very helpful for a better understanding of the full calcification model in the future.

      Reviewer #2 (Public Review):

      Summary:

      Dubicka et al. in their paper entitled "Biocalcification in porcelaneous foraminifera" suggest that in contrast to the traditionally claimed two different modes of test calcification by rotaliid and porcelaneous miliolid foraminifera, both groups produce calcareous tests via the intravesicular mineral precursors (Mg-rich amorphous calcium carbonate). These precursors are proposed to be supplied by endocytosed seawater and deposited in situ as mesocrystals formed at the site of new wall formation within the organic matrix. The authors did not observe the calcification of the needles within the transported vesicles, which challenges the previous model of miliolid mineralization. Although the authors argue that these two groups of foraminifera utilize the same calcification mechanism, they also suggest that these calcification pathways evolved independently in the Paleozoic.

      We do not argue that Miliolida and Rotaliida utilize exactly the same calcification mechanism, but rather that both groups use less divergent crystallization pathways, in which mesocrystalline chamber walls are created by accumulating and assembling particles of a pre-formed liquid amorphous mineral phase.

      Strengths:

      The authors document various unknown aspects of calcification of Pseudolachlanella eburnea and elucidate some poorly explained phenomena (e.g., translucent properties of the freshly formed test); however, there are several problematic observations/interpretations which in my opinion should be carefully addressed.

      Weaknesses:

      (1) The authors (line 122) suggest that "characteristic autofluorescence indicates the carbonate content of the vesicles (Fig. S2), which are considered to be Mg-ACCs (amorphous MgCaCO3) (Fig. 2, Movies S4 and S5)". Figure S2 which the authors refer to shows only broken sections of organic sheath at different stages of mineralization. Movie S4 shows that only in a few regions some vesicles exhibit red autofluorescence interpreted as Mg-ACC (S5 is missing but probably the authors were referring to S3). In their previous paper (Dubicka et al 2023: Heliyon), the authors used exactly the same methodology to suggest that these are intracellularly formed Mg-rich amorphous calcium carbonate particles that transform into a stable mineral phase in the rotaliid Amphistegina lessonii. However, in Figure 1D (Dubicka et al 2023) the apparently carbonate-loaded vesicles show the same red autofluorescence as the test, whereas in their current paper, no evidence of autofluorescence of Mg-ACC grains accumulated within the "gel-like" organic matrix is given. The S3 and S4 movies show circulation of various fluorescing components, but no initial phase of test formation is observable (numerous mineral grains embedded within the organic matrix - Figures 3A and B - should be clearly observed also as autofluorescence of the whole layer). Thus the crucial argument supporting the calcification model (Figure 5) is missing.

      It is correct that we did not observe the initial phase of test formation in vivo. Therefore, it is not our crucial argument supporting the novel components of the new calcification model. We suspect that vesicles preparing and transporting Mg-ACC are produced well before their docking and deposition into the new wall, because such seawater vesicles were observed between the chamber formation stages (Goleń and Tyszka, 2024, personal communication based on independent experiments on a closely related miliolid taxon). This means that our in vivo experiments most likely represent a long, dynamic stage of vesicle formation via seawater endocytosis and vesicle modification (incl. Mg-ACC formation) before the stage of exocytosis during new chamber formation. Our crucial arguments supporting the calcification model come from the SEM imaging of the specimens fixed during chamber formation, as well as from the transparency of the new chamber wall during its progressive calcification.

      There is no support for the following interpretation (lines 199-203) "The existence of intracellular, vesicular intermediate amorphous phase (Mg-ACC pools), which supply successive doses of carbonate material to shell production, was supported by autofluorescence (excitation at 405 nm; Fig. 2; Movies S3 and S4; see Dubicka et al., 2023) and a high content of Ca and Mg quantified from the area of cytoplasm by SEM-EDS analysis (Fig. S6)."

      We used laser line 405 nm and multiphoton excitation to detect ACCs. These wavelengths (partly) permeate the shell to excite ACCs autofluorescence. The autofluorescence of the shells is present as well but not clearly visible in Movie S4 as the fluorescence of ACCs is stronger. This may be related to the plane/section of the cell which is shown. The laser permeates the shell above the ACCs (short distance) but to excite the shell CaCO3 around foraminifera in the same three-dimensional section where ACCs are shown, the light must pass a thick CaCO3 area due to the three-dimensional structure of the foraminiferan shell. Therefore, the laser light intensity is reduced. In a revised version, a movie/image with reduced threshold is shown.

      Author response image 1.

      Autofluorescence image of studied Miliolida species (exc. 405 nm) showing algal chlorophyll (blue) and CaCO3 (red), both ACC and calcite shell.

      It would be very convenient if it was possible to visualize ACC by illumination with a blacklight, but there are very many organic molecules that have an autofluorescence excited by ~405 nm. One of the examples is NADH (Lee et al., 2015. Kor J Physiol Pharmac 19(4): 373-382), an omnipresent molecule in any cell (couldn't copy the appropriate picture here, but the reference has a figure with the em/exc spectra).

      The paper of Lee et al. (2015) shows that the excitation spectrum of NADH ends close to 400 nm. This means that NADH is not excitable, or only very weakly excitable, at 405 nm, which we used as the excitation laser line.

      (2) The authors suggest that "no organic matter was detected between the needles of the porcelain structures (Figures 3E; 3E; S4C, and S5A)". Such a suggestion, which is highly unusual considering that biogenic minerals almost by definition contain various organic components, was made based only on FE-SEM observation. The authors should either provide clearcut evidence of the lack of organic matter (unlikely) or may suggest that intense calcium carbonate precipitation within organic matrix gel ultimately results in a decrease of the amount of the organic phase (but not its complete elimination), alike the pure calcium carbonate crystals are separated from the remaining liquid with impurities ("mother liquor"). On the other hand, if (249-250) "organic matrix involved in the biomineralization of foraminiferal shells may contain collagen-like networks", such "laminar" organization of the organic matrix may partly explain the arrangement of carbonate fibers parallel to the surface as observed in Fig. 3E1.

      We agree with the reviewer that biogenic minerals should by definition contain some organic components. We wrote only that “no organic matter was detected between the needles of the porcelain structures”, meaning that we did not detect any organic structures in our FE-SEM observations. We will rephrase this part of the text to avoid further confusion.

      (3) The author's observations indeed do not show the formation of individual skeletal crystallites within intracellular vesicles, however, do not explain either what is the structure of individual skeletal crystallites and how they are formed. Especially, what are the structures observed in polarized light (and interpreted as calcite crystallites) by De Nooijer et al. 2009? The author's explanation of the process (lines 213-216) is not particularly convincing "we suspect that the OM was removed from the test wall and recycled by the cell itself".

      Thank you for this comment. We will do our best to supplement our explanations. We are aware of the structures observed in polarized light by De Nooijer et al. (2009). However, Goleń et al. (2022, Protist; + 2 other citations) showed that organic polymers may also exhibit light polarization. Additional experimental studies are needed to separate these types of polarization. We will try to investigate this issue in our future research.

      (4) The following passage (lines 296-304) which deals with the concept of mesocrystals is not supported by the authors' methodology or observations. The authors state that miliolid needles "assembled with calcite nanoparticles, are unique examples of biogenic mesocrystals (see Cölfen and Antonietti, 2005), forming distinct geometric shapes limited by planar crystalline faces" (later in the same passage the authors say that "mesocrystals are common biogenic components in the skeletons of marine organisms" (are they thus unique or are they common)? It is my suggestion to completely eliminate this concept here until various crystallographic details of the miliolid test formation are well documented.

      Our intention was to express that mesocrystals are common biogenic components in the skeletons of marine organisms, whereas miliolid needles that form distinct geometric shapes limited by planar crystalline faces are unique.

      Reviewer #1 (Recommendations For The Authors):

      Below, I have summarized my main criticisms.

      (1) The movies S1-S4 do not indicate what is described. The videos show indeed seawater (S1), cell membranes (S2), and autofluorescence and acidic vesicles (S3 and S4). The presence of all these intracellular structures is not surprising: any eukaryotic cell will have those. The authors, however, claim that they participate in the process of calcification, which is simply not shown. One of the main arguments seems the presence of 'carbonate pools', in the caption these are even claimed to be 'Mg-ACC pools', but this is by no means revealed by an excitation of 405nm/ emission between 420 and 490 nm. It would be very convenient if it was possible to visualize ACC by illumination with a blacklight, but there are very many organic molecules that have an autofluorescence excited by ~405 nm. One of the examples is NADH (Lee et al., 2015. Kor J Physiol Pharmac 19(4): 373-382), an omnipresent molecule in any cell (couldn't copy the appropriate picture here, but the reference has a figure with the em/exc spectra).

      The paper of Lee et al. (2015) shows that the excitation spectrum of NADH ends close to 400 nm. This means that NADH is not excitable, or only very weakly excitable, at 405 nm, which we used as the excitation laser line.

      The fluorescence by this excitation/ emission couple unlikely indicates the vesicles in which these foraminifera calcify. Therefore, most of the interpretation of the authors on what happens with the calcitic needles is not based on results but remains pure speculation.

      Autofluorescence upon excitation at 405 nm (emission 420–480 nm) is typical of CaCO3, both biocalcite and amorphous calcium carbonate, as demonstrated by laboratory synthesis of amorphous calcium carbonate (Dubicka et al., in preparation).

      (2) The results mention 'granules', which are the supposed Mg-ACC-containing vesicles, but the movies simply don't show any granules. Only fluorescence. Again, the results show a lot of vesicles with autofluorescence, but these are not necessarily related to calcification. Proof could be supplied by showing that the same fluorescent vesicles are 'used up' when the specimens under observation are making a new chamber, but until that is done, the fate of all these vesicles remains uncertain and once more, may not be involved in calcification at all.

      We suspect that vesicles preparing and transporting Mg-ACC are produced well before their docking and deposition into the new wall, because such seawater vesicles were observed between the chamber formation stages (Goleń and Tyszka, 2024, personal communication based on independent experiments on a closely related miliolid taxon). This means that our in vivo experiments most likely represent a long, dynamic stage of vesicle formation via seawater endocytosis and vesicle modification (incl. Mg-ACC formation) before the stage of exocytosis during new chamber formation. Our crucial arguments supporting the calcification model come from the SEM imaging of the specimens fixed during chamber formation, as well as from the transparency of the new chamber wall during its progressive calcification.

      (3) The Methods are unclear. How long were the foraminifers kept before being placed under the microscope? Were they fed with anything? This is important since the chlorophyll should not be from any food source. I didn't know that this foraminiferal species has photosynthetic symbionts: genera like Quinqueloculina don't. Is there any reference for this? Normally, I wouldn't care that much, but the authors find the presence of (facultative) symbionts important (lines 305-336). I am a bit suspicious about this since the only evidence for the presence of photosynthetic symbionts is because of the autofluorescence. As the authors said, commonly these miliolid species are regarded as symbiont-barren, so additional proof for these symbionts is necessary.

      We agree that additional proof is needed for the presence of photosynthetic symbionts. We rephrased the manuscript accordingly.

      (4) It is also unclear (Methods) at what stage the miliolids were photographed (Figure 3). How did chamber formation proceed, what was the timing of the photographs, etc. These pictures are to me the most interesting finding of this study, but need to be described much better.

      All individuals of living foraminifera were fixed at the overall stage of chamber formation. However, every individual presents a complete set of successive steps (substages) of chamber wall calcification fixed at once. Fig. 3A and B present nearly the most proximal (youngest) part of the new chamber with a thick wall of calcite nanograins within a gel-like organic matrix. Fig. 3C and D present a slightly more distal (intermediate) part of the calcified chamber. Fig. 3E shows the most distal part of the new chamber. This part is anchored to the older, underlying solid calcified chamber (not shown in this figure). All these steps are synchronous but represent gradual, successive stages of calcification. The main text and Figs 4 and 5 explain this phenomenon in detail.

      There are many small issues with the text too. These include:

      Line 28/29: in many other groups, calcification is thought to be polyphyletic (e.g. sponges: Chombard et al., 1997. Biol Bull 193: 359-367).

      Corrected

      Line 29/30: there may be even more 'types of shells'. The first author has shown in earlier papers that nodosarids have a unique shell architecture. Spirillinids also seem to have their own way of calcification. It is unclear what is meant here by 'two contrasting models'.

      So far, only two models of foraminiferal calcification are known. Lagenida biocalcification has not been studied.

      Line 33: 'Both groups'? This paper only shows calcification in miliolids.

      However, we refer here to a previous study.

      Line 42: Perhaps, but there is no data on the pseudopodial network in this manuscript.

      We refer to the studies of Angell (1980).

      Line 43: Likely, but that is not what this manuscript is showing.

      Line 42-44: The authors should make a choice and be clear. The point of this paper is that miliolids and rotalids calcify in ways that are actually not as different as they seemed previously. Still, they are said to have different 'chamber formation modes'. If they are calcifying in a similar way (which I think is not necessarily supported by the results), isn't calcification in these groups like variations on the same theme? How does this relate to the independent origins of calcification within these two groups?

      Our intention is to show that Miliolida and Rotaliida utilize less divergent calcification pathways, following the recently discovered biomineralization principles.

      Line 49-51: is this a well-established distinction? If so, please add a reference. If not: what is fundamentally different between B and C? Does only the size of the intracellular vesicle matter?

      Rephrased

      Line 60: please include a reference for the intracellular calcification by coccolithophores.

      Added

      Line 67: this is wrong. It is the alignment of the needles at the surface that makes them all reflect light in the same way and gives the shells a porcelaneous appearance. A close-up of the miliolid's shell surface shows this arrangement. Underneath this layer, the orientation of the needles is more random.

      We referred to Johan Hohenegger's papers.

      Line 114: how else?

      Line 114-116: I don't see the relevance here. If seawater is taken up, the vesicle containing this seawater has to have a membrane around it. By definition. The text here ('These vesicles') suggests that Calcein and FM1-43 were combined (which they easily could have), but the methods describe that they are used successively.

      Yes, we used two dyes separately.

      Lines 122-130: I think the interpretation of this autofluorescence signal is wrong. Even if it was true, these lines belong to the Discussion.

      This paragraph has been moved to the Discussion.

      Line 138: What are 'mobile clusters'? I don't see a relation between the location of the symbionts and the other vesicles (Figure 2).

      Line 147-148: How can an SEM image show the absence of organic matter?

      We meant the absence of the gel-like OM visible in the previous stages of chamber formation.

      Line 148: Should be 'Figs. 3E; 3E1; S4C'.

      Corrected

      Lines 143-150: this can be merged with the following paragraph.

      Done

      Lines 151-169: why is there no indication of the time? Figures 3 and 4 link the pictures in time to show the development of the growing chamber wall. However, neither here nor in the methods, is there any recording of the time after the beginning of chamber formation. Now, the images are linked (Figure 4) as if they were taken at regular intervals, but this is not documented.

      Lines 170-184: this should go to the Discussion.

      Done

      Line 193-195: this is likely, but not visible in Figure 1.

      It was visible by optical microscopy and described by Angell, 1980

      Line 199-201: I don't understand this: the fluorescent vesicles were not observed during chamber formation so any link between the SEM and CLSM scans remains pure speculation.

      Line 203-204: needed for what?

      For better documentation of Miliolid ACC-bearing granules

      Line 220: is this shown in any of the images? 

      Angell, 1980

      Line 230: It sounds nice, but I don't think a 'paradigm shift' is appropriate here. However interesting and important foraminiferal biomineralization is, the authors show that the crystals of miliolids are likely formed differently than previously thought. If this is a 'paradigm shift', then most scientific findings are.

      In our opinion this is definitely a shift of paradigm

      Line 231: I don't think anyone suggested miliolids and coccolithophores share 'the same' pathway. They are shown (cocco's) and thought (miliolids) to secrete their calcite intracellularly.

      Changed to similar, intracellular

      Line 258: References should only be to peer-reviewed studies.

      Line 430: Burgers'

      Corrected

      Reviewer #2 (Recommendations For The Authors):

      Please separate clearly the results (observations) from the discussion (interpretations): various interpretational/commentary phrases should be removed from the Results section to Discussion e.g., lines 124-130, 131-135.

      Interpretations have been separated from the results, as suggested by the reviewer.

      [line 49] " living cells have evolved three major skeleton crystallization pathways". I would rather say "organisms" not "cells" as the coordination of the calcification process in multicellular organisms clearly involves processes that are beyond the individual cell activity.

      Corrected

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Original comment: There is no explanation for how this work could be a breakthrough in simulating gregarious feeding as is stated in the manuscript.

      Reviewer response: I think I understand where the authors are trying to take this next step. If the authors were to follow up on this study with the proposed implementation of inhalant/exhalent velocity profiles (or, preferably, velocity/pressure fields), then that study would be a breakthrough in simulating such gregarious feeding. Based on what has been done within the present study, I think the term "breakthrough" is overly emphatic. An additional note on this: the authors are correct that incorporating additional models could be used to simulate a population (as has been successfully done for several Ediacaran taxa despite computational limitations), but it is not the only way. The authors might explore using periodic boundary conditions on the external faces of the flow domain. This could require only a single Olivooid model to assess gregarious impacts - see the abundant literature on modeling flow through solar array fields.

      We thank Reviewer 1 for the suggestion. Modeling gregarious feeding with periodic boundary conditions is certainly a practical approach when computational resources are limited, and the modeling of flow through solar array fields is an inspiring case. However, to simulate gregarious feeding realistically on an uneven seabed with an irregular spatial distribution of organisms, periodic boundary conditions alone may not be sufficient (see Author response image 1 for a simple example). We will continue to explore ways of simulating large-scale gregarious feeding.

      Author response image 1.

      An example of modeling gregarious feeding behavior on an uneven seabed.
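
      To make the trade-off discussed above concrete, the following is a minimal sketch of what a periodic boundary condition does. It is not the solver setup used in the study: it is a one-dimensional toy advection-diffusion model, and the function names, grid size, and parameter values are all hypothetical. Whatever leaves one face of the domain re-enters at the opposite face, so a single cell stands in for an infinite, evenly spaced row of identical cells. This is also why the approach cannot by itself represent an uneven seabed or irregularly spaced individuals.

```python
def step_periodic(c, velocity, diffusivity, dx, dt):
    """Advance a 1D advection-diffusion field by one explicit time step.

    Neighbour indices are taken modulo the grid length, so the right face
    of the domain is connected to the left face (periodic boundary).
    Assumes velocity >= 0 (first-order upwind differencing).
    """
    n = len(c)
    new = []
    for i in range(n):
        left = c[(i - 1) % n]    # wraps around at the left face
        right = c[(i + 1) % n]   # wraps around at the right face
        advection = -velocity * (c[i] - left) / dx
        diffusion = diffusivity * (right - 2.0 * c[i] + left) / dx ** 2
        new.append(c[i] + dt * (advection + diffusion))
    return new


def run(n_cells=64, n_steps=200):
    """Release a hypothetical tracer pulse mid-domain and transport it."""
    dx, dt = 1.0 / n_cells, 1e-4   # illustrative values, chosen for stability
    c = [0.0] * n_cells
    c[n_cells // 2] = 1.0
    for _ in range(n_steps):
        c = step_periodic(c, velocity=1.0, diffusivity=1e-2, dx=dx, dt=dt)
    return c
```

      Because nothing can leave a periodic domain, the total amount of tracer stays constant over time, which is a convenient sanity check for such a setup.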

      Original comment: The claim that olivooid-type feeding was most likely a prerequisite transitional form to jet-propelled swimming needs much more support or needs to be tailored to olivooids. This suggests that such behavior is absent (or must be convergent) before olivooids, which is at odds with the increasing quantities of pelagic life (whose modes of swimming are admittedly unconstrained) documented from Cambrian and Neoproterozoic deposits. Even among just medusozoans, ancestral state reconstruction suggests that they would have been swimming during the Neoproterozoic (Kayal et al., 2018; BMC Evolutionary Biology) with no knowledge of the mechanics due to absent preservation.

      Author response: Thanks for your suggestions. Yes, we agree with you that the ancestral swimming medusae may have appeared before the early Cambrian, even in Neoproterozoic deposits. However, discussions on the affinities of Ediacaran cnidarians are severely limited because of the lack of information concerning their soft anatomy. It is therefore hard to infer the mechanics, owing to the absence of preservation. Olivooids found from the basal Cambrian Kuanchuanpu Formation can be reasonably considered as cnidarians based on their radial symmetry, external features, and especially the internal anatomies (Bengtson and Yue 1997; Dong et al. 2013; 2016; Han et al. 2013; 2016; Liu et al. 2014; Wang et al. 2017; 2020; 2022). The valid simulation experiment here was based on the soft tissue preserved in olivooids.

      Reviewer response: This response does not sufficiently address my earlier comment. While the authors are correct that individual Ediacaran affinities are an area of active research and that Olivooids can reasonably be considered cnidarians, this doesn't address the actual critique in my comment. Most (not all) Ediacaran soft-bodied fossils are considered to have been benthic, but pelagic cnidarian life is widely acknowledged to at least be present during later White Sea and Nama assemblages (and earlier depending on molecular clock interpretations). The authors have certainly provided support for the mechanics of this type of feeding being co-opted for eventual jet propulsion swimming in Olivooids. They have not provided sufficient justifications within the manuscript for this to be broadened beyond this group.

      Thank you for your candid commentary. We of course agree that swimming cnidarians may have emerged before the lowermost Cambrian Fortunian Stage. See lines 16-129: “Ediacaran fossil assemblages with complex ecosystems consist of exceptionally preserved soft-bodied eukaryotes of enigmatic morphology, which their affinities are mostly unresolved (Tarhan et al., 2018, Integrative and Comparative Biology, 58 (4), 688–702; Evans et al., 2022, PNAS, 11(46), e220747511).” Olivooids undoubtedly belong to the cnidarians, as shown by their external and internal biological structures. Limited by the fossil record, we can only speculate on the transition of ancestral cnidarians from a benthic to a swimming mode of life on the basis of well-preserved fossils such as olivooids. The transition may have required processes such as an increase in body size, thickening of the mesoglea, and degeneration of the periderm. These processes may have evolved independently or together. Moreover, the ecological behaviors of ancestral cnidarians may have evolved independently at different stages from the Ediacaran to the Cambrian. We therefore cannot provide further justification beyond olivooids.

      Original comment: L446: two layers of hexahedral elements is a very low number for meshing boundary layer flow

      Reviewer response: As the authors point out in the main text, these organisms are small (millimeters in scale) and certainly lived within the boundary layer range of the ocean. While the boundary layer is not the main point, it still needs to be accurately resolved as it should certainly affect the flow further towards the far field at this scale. I'm not suggesting the authors need to perfectly resolve the boundary layer or focus on using turbulence models more tailored to boundary layer flows (such as k-w), but the flow field still needs sufficient realism for a boundary bounded flow. The authors really should consider quantitatively assessing the number of hexahedral elements within their mesh refinement study.

      To address this concern, we ran four additional simulations based on mesh4 within our mesh refinement study to assess the effect of the number of hexahedral element layers (five and eight layers of hexahedral elements, each with different thicknesses of the boundary layer mesh, controlled by the thickness adjustment factor). The results have been added to Table supplement 2. They show that the number of layers of hexahedral elements does not appear to influence the results significantly, whereas the thickness of the boundary layer mesh can influence the maximum flow velocity of the contraction phase. Nevertheless, the results of all the simulations were generally consistent, as shown in Author response image 2. A description of these results was added to the section “Mesh sensitivity analysis”.

      Author response image 2.

      Results of mesh refinement study of different boundary layer mesh parameters.
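
      One simple way to quantify "generally consistent" in a mesh sensitivity comparison of this kind is sketched below. It is only an illustration: the configuration names, the numerical values, and the 5% tolerance are hypothetical placeholders and are not taken from Table supplement 2 or from the authors' workflow. The sketch compares a scalar output (for example, the maximum flow velocity of the contraction phase) from each boundary layer mesh variant against a reference configuration.

```python
def relative_deviation(value, reference):
    """Absolute deviation of value from reference, as a fraction of reference."""
    return abs(value - reference) / abs(reference)


def mesh_sensitivity(results, reference_key, tolerance=0.05):
    """Compare each mesh configuration against a reference configuration.

    results maps a configuration name to one scalar output of the simulation.
    Returns, per configuration, a (deviation, within_tolerance) pair.
    The default tolerance of 5% is an assumed threshold, not a standard.
    """
    reference = results[reference_key]
    report = {}
    for name, value in results.items():
        deviation = relative_deviation(value, reference)
        report[name] = (deviation, deviation <= tolerance)
    return report


# Placeholder values in arbitrary units; NOT data from the study.
example_results = {
    "2_layers": 1.00,
    "5_layers": 1.01,
    "8_layers": 1.02,
    "8_layers_thick": 1.10,
}
report = mesh_sensitivity(example_results, reference_key="2_layers")
```

      Reporting the deviation per configuration, rather than a single pass/fail verdict, makes it easier to see whether the number of layers or the boundary layer thickness drives any difference.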

    2. eLife assessment

      This important study advances our understanding of early Cambrian cnidarian paleoecology and suggests that the reconstructed ancestral feeding and respiration mechanisms predate jet-propelled swimming utilized by modern jellyfish. The work combines solid evidence of fluid and structural mechanics modeling, simulating for the first time the feeding and respiratory capacities in a microfossil (Quadrapyrgites), which in turn opens new possibilities using this approach for paleontological research. Assuming that the prior interpretations and assumptions concerning the modeled organism's soft part and skeletal anatomy are correct, the hypotheses that (1) the organism could alternately contract and expand the oral region and (2) such movement increased feeding efficiency seem plausible.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors utilize fluid-structure interaction analyses to simulate fluid flow within and around the Cambrian cnidarian Quadrapyrgites to reconstruct feeding/respiration dynamics. Based on vorticity and velocity flow patterns, the authors suggest that the polyp expansion and contraction ultimately develop vortices around the organism that are like what modern jellyfish employ for movement and feeding. Lastly, the authors suggest that this behavior is likely a prerequisite transitional form to swimming medusae.

      Strengths:

      While fluid-structure-interaction analyses are common in engineering, physics, and biomedical fields, they are underutilized in the biological and paleobiological sciences. Zhang et al. provide a strong approach to integrating active feeding dynamics into fluid flow simulations of ancient life. Based on their data, it is entirely likely the described vortices would have been produced by benthic cnidarians feeding/respiring under similar mechanisms. However, some of the broader conclusions require additional justification.

      Weaknesses:

      (1) The claim that olivooid-type feeding was most likely a prerequisite transitional form to jet-propelled swimming needs much more support or needs to be tailored to olivooids. This suggests that such behavior is absent (or must be convergent) before olivooids, which is at odds with the increasing quantities of pelagic life (whose modes of swimming are admittedly unconstrained) documented from Cambrian and Neoproterozoic deposits. Even among just medusozoans, ancestral state reconstruction suggests that they would have been swimming during the Neoproterozoic (Kayal et al., 2018; BMC Evolutionary Biology) with no knowledge of the mechanics due to absent preservation.

      (2) While the lack of ambient flow made these simulations computationally easier, these organisms likely did not live in stagnant waters even within the benthic boundary layer. The absence of ambient unidirectional laminar current or oscillating current (such as would be found naturally) biases the results.

      (3) There is no explanation for how this work could be a breakthrough in simulating gregarious feeding as is stated in the manuscript.

      Despite these weaknesses, the authors' dynamic fluid simulations convincingly reconstruct the feeding/respiration dynamics of the Cambrian Quadrapyrgites, though the large claims of transitionary stages for this behavior are not adequately justified. Regardless, the approach the authors use will be informative for future studies attempting to simulate similar feeding and respiration dynamics.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors seek to elucidate the early evolution of cnidarians through computer modeling of fluid flow in the oral region of very small, putative medusozoan polyps. They propose that the evolutionary advent of the free-swimming medusoid life stage was preceded by a sessile benthic life stage equipped with circular muscles that originally functioned to facilitate feeding and that later became co-opted for locomotion through jet propulsion.

      Strengths:

      Assumptions of the modeling exercise laid out clearly; interpretations of the results of the model runs in terms of functional morphology plausible. An intriguing investigation that should stimulate further discussion and research.

      Weaknesses:

      Speculation on the origin of the medusoid life stage in cnidarians heavily dependent on prior assumptions concerning the soft part anatomy and material properties of the skeleton of the modeled fossil organism that may be open to alternative interpretations. Logically, of course, the hypothesis that cnidarian medusae originated from benthic polyps must be evaluated along with the alternative hypotheses that the medusa came first and that the ancestral cnidarian exhibited both life stages.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The points raised let us critically rethink our approach, our results, and our conclusions. Furthermore, they gave us the chance to elaborate on some critical aspects that were mentioned. With the help of the reviewers, we made some clarifications in the point-by-point responses and implemented them in the manuscript. Furthermore, we modified the figures as suggested:

      - The colors in Figure 1C, D, G and H have been adapted as suggested

      - We added Figure 2-figure supplement 1, which strengthens our conclusion in Figure 2

      - As requested by reviewer #1 (weaknesses #3), we added the data on neutrophil numbers in the different organs (Figure 6-figure supplement 3C).

      Reviewer #1 (Public Review):

      Summary:

      - Extracellular ATP represents a danger-associated molecular pattern associated with tissue damage and can also act in an autocrine fashion in macrophages to promote proinflammatory responses, as observed in a previous paper by the authors on abdominal sepsis. The present study addresses an important aspect that may condition the outcome of sepsis, namely the release of ATP by bacteria. The authors show that sepsis-associated bacteria do in fact release ATP in a growth-dependent and strain-specific manner. However, whether this bacteria-derived ATP plays a role in the pathogenesis of abdominal sepsis has not been determined. To address this question, a number of mutant strains of E. coli were used, first to correlate bacterial ATP release with growth and then with outer membrane integrity and bacterial death. By using E. coli transformants expressing the ATP-degrading enzyme apyrase in the periplasmic space, the paper nicely shows that abdominal sepsis caused by these transformants results in significantly improved survival. This effect was associated with a reduction of peritoneal macrophages and CX3CR1+ monocytes, and an increase in neutrophils. To separate the function of bacterial ATP from the systemic response to microorganisms, the authors exploited bacterial OMVs, either loaded with ATP or not, to investigate the systemic effects in the absence of living microorganisms. This approach showed that ATP-loaded OMVs induced degranulation of neutrophils after lysosomal uptake, suggesting that this mechanism could contribute to sepsis severity.

      Strengths:

      - A strong part of the study is the analysis of E. coli mutants to address different aspects of bacterial release of ATP that could be relevant during systemic dissemination of bacteria in the host.

      We want to thank the reviewer for recognizing this important aspect of our experimental approach.

      Weaknesses:

      - As pointed out in the limitations of the study, whether ATP-loaded OMVs provide a mechanistic proof of the pathogenetic role of bacteria-derived ATP independently of live microorganisms in sepsis is interesting but not definitively convincing. It could be useful to see whether degranulation of neutrophils is differentially induced by apyrase-expressing vs control E. coli transformants.

      We thank the reviewer for raising several important points. In our study, we assessed local and systemic effects of released bacterial ATP. The consequences of local bacterial ATP release were assessed using an apyrase-expressing E. coli transformant. Locally, bacterial ATP resulted in a decrease in neutrophil numbers and we hypothesize that directly released bacterial ATP either leads to neutrophil death (e.g. via P2X7 receptor (Proietti et al., 2019)) or interferes with the recruitment of neutrophils (e.g. via P2Y receptors (Junger, 2011)).

      The systemic consequences were assessed using ATP-loaded and empty OMV. We have shown that degranulation is induced by OMV-derived bacterial ATP. ATP-containing OMV are engulfed by neutrophils, reach their endolysosomal compartment and might activate purinergic receptors, which then leads to aberrant degranulation. This concept, which needs to be explored in future studies, is fundamentally different from classical purinergic signaling via bacterial ATP released directly into the extracellular space.

      It is possible that neutrophil degranulation is also modulated by directly released bacterial ATP. We agree that this should be assessed in future studies. Also, the role of OMV-derived bacterial ATP should be assessed locally, and the relative importance of directly released versus OMV-mediated bacterial ATP should be dissected. Based on our measurements (Figure 4-figure supplement 1A and Figure 5C), we estimate that the effect of OMV-derived bacterial ATP might be much smaller than the effects of directly released bacterial ATP. Thus, direct ATP release might predominate locally. However, we fully agree that this has to be investigated in a future study to reconcile the different aspects of bacterial ATP signaling. A paragraph will be added to the manuscript, in which we discuss this particular issue.

      - Also, the increase of neutrophils in bacterial ATP-depleted abdominal sepsis, which has better outcomes than "ATP-proficient" sepsis, seems difficult to correlate to the hypothesized tissue damage induced by ATP delivered via non-infectious OMVs.

      We fully acknowledge the mentioned discrepancy. What we propose is that bacterial ATP exhibits different functions that are dependent on the release mechanism (see above). Locally, in the peritoneal cavity, neutrophil numbers are decreased by directly released bacterial ATP. Remotely, ATP is delivered via OMV and impacts on neutrophil function. We agree that, in particular, in the peritoneal cavity, both effects may play a role. However, the impact of directly released bacterial ATP seems to be dominant (see above).

      We propose that neutrophils are decreased locally because of directly released bacterial ATP, which prevents efficient infection control and, therefore, impairs sepsis survival. In addition, these fewer neutrophils might even be dysregulated by the engulfment of bacterial ATP delivered via OMV, which leads to an upregulated and possibly aberrant degranulation process worsening local and remote tissue damage. We agree that in addition to neutrophil numbers, the function of local neutrophils should be assessed with and without the influence of OMV-delivered bacterial ATP. This could be done by RNA sequencing of primary neutrophils from the peritoneal cavity or neutrophil cell lines as well as degranulation assays.

      - Are the neutrophils counts affected by ATP delivered via OMVs?

      This is difficult to show in the peritoneal cavity, where both directly released bacterial ATP and OMV-derived bacterial ATP are present. However, we assessed this putative difference for the systemic organs and the blood, where we did not find any differences in neutrophil numbers.

      Author response image 1.

      - A comparison of cytokine profiles in the abdominal fluids of E. coli and OMV treated animals could be helpful in defining the different responses induced by OMV-delivered vs bacterial-released ATP. The analyses performed on OMV treated versus E. coli infected mice are not closely related and difficult to combine when trying to draw a hypothesis for bacterial ATP in sepsis.

      We fully agree that there are several open questions that remain to be elucidated, in particular, to differentiate the local role of directly released versus OMV-delivered bacterial ATP. In this study, we laid the foundation for future in vivo research to examine the specific role of bacterial ATP in sepsis. Such future research avenues might be to investigate the local effects of OMV-delivered bacterial ATP, and how neutrophil migration, apoptosis and degranulation are altered. We agree that exploration of the local secretory immune response and cytokine profiles is relevant to understand the different mechanisms of how bacterial ATP alters sepsis. However, such experiments should ideally be performed in systems where the source and the delivery of ATP can be modulated locally.

      - Also it was not clear why lung neutrophils were used for the RNAseq data generation and analysis.

      Thank you for this remark. We have chosen primary lung neutrophils for four reasons:

      (1) Isolation of primary lung neutrophils allowed us to assess an in vivo response that would not have been possible with cell lines.

      (2) The lung and the respiratory system are among the clinically most important organs affected during sepsis, and their failure is a significant cause of mortality.

      (3) We show in Figure 6C that specifically in the lung, OMV are engulfed by neutrophils, which shows the relevance of the lung also in our study context.

      (4) And finally, lung neutrophils were chosen to examine specifically distant and not local effects.

      Reviewer #2 (Public Review):

      Summary:

      - In their manuscript "Released Bacterial ATP Shapes Local and Systemic Inflammation during Abdominal Sepsis", Daniel Spari et al. explored the dual role of ATP in exacerbating sepsis, revealing that ATP from both host and bacteria significantly impacts immune responses and disease progression.

      Strengths:

      - The study meticulously examines the complex relationship between ATP release and bacterial growth, membrane integrity, and how bacterial ATP potentially dampens inflammatory responses, thereby impairing survival in sepsis models. Additionally, this compelling paper implies a concept that bacterial OMVs act as vehicles for the systemic distribution of ATP, influencing neutrophil activity and exacerbating sepsis severity.

      We thank the reviewer for mentioning these key points and supporting the relevance of our study.

      Weaknesses:

      (1) The researchers extracted and cultivated abdominal fluid on LB agar plates, then randomly picked 25 colonies for analysis. However, they did not conduct 16S rRNA gene amplicon sequencing on the fluid itself. It is worth noting that the bacterial species present may vary depending on the individual patients. It would be beneficial if the authors could specify whether they've verified the existence of unculturable species capable of secreting high levels of Extracellular ATP.

      Most septic complications are caused by a limited spectrum of bacteria, belonging mainly either to the Firmicutes or the Proteobacteria phyla, including E. coli, K. pneumoniae, S. aureus or E. faecalis (Diekema et al., 2019; Mureșan et al., 2018). We validated this well documented existing evidence by randomly assessing 25 colonies. For the planned experiments, it was crucial to work with culturable bacteria; otherwise, ATP measurements, the modulation of ATP generation or loading of OMV would not have been possible. Using such culturable bacteria allowed us to describe mechanisms of ATP release.

      We fully agree that hard-to-culture or unculturable bacteria might contribute significantly to septic complications. This, however, would need to be explored in future studies using extensive culturing methods (Cheng et al., 2022).

      (2) Do mice lacking commensal bacteria show a lack of extracellular ATP following cecal ligation puncture?

      ATP is typically secreted by many cells of the host in active and passive manners in the case of any injury, including cecal ligation and puncture (Burnstock, 2016; Dosch et al., 2018; Eltzschig et al., 2012; Idzko et al., 2014). We hypothesize that bacterial ATP is a potential priming agent at early stages of sepsis, and indeed, at such early time points, a comparison of peritoneal ATP levels between germ-free and colonized mice could support our hypothesis. Future studies addressing this question must, however, correct for the different immune responses between germ-free and colonized mice. This is of utmost importance, especially for the cecal ligation and puncture model, since the cecum of germ-free mice is extremely large, making such experiments hard to control.

      (3) The authors isolated various bacteria from abdominal fluid, encompassing both Gram-negative and Gram-positive types. Nevertheless, their emphasis appeared to be primarily on the Gram-negative E. coli. It would be beneficial to ascertain whether the mechanisms of Extracellular ATP release differ between Gram-positive and Gram-negative bacteria. This is particularly relevant given that the Gram-positive bacterium E. faecalis, also isolated from the abdominal fluid, is recognized for its propensity to release substantial amounts of Extracellular ATP.

      We fully agree with this comment. In this paper, we used E. coli as our model organism to determine the principles of sepsis-associated bacterial ATP release and therefore focused on gram-negative bacteria. In addition to the direct, growth-dependent release, we found a relevant impact of OMV-delivered bacterial ATP. For this latter purpose, a gram-negative strain, in which OMV generation has been well described (Schwechheimer & Kuehn, 2015), was chosen. Recently, gram-positive bacteria have been shown to secrete ATP and OMV as well (Briaud & Carroll, 2020; Hironaka et al., 2013; Iwase et al., 2010). Given the fundamental differences in the structure of the cell wall of gram-positive bacteria and the mechanisms of OMV generation and release, future studies are required to assess the relevance of directly released and OMV-delivered ATP in gram-positive bacteria.

      (4) The authors observed changes in the levels of LPM, SPM, and neutrophils in vivo. However, it remains uncertain whether the proliferation or migration of these cells is modulated or inhibited by ATP receptors like P2Y receptors. This aspect requires further investigation to establish a convincing connection.

      We fully agree with this comment. The decrease in LPM and the consequential predomination of SPM have been well described after inflammatory stimuli in the context of the macrophage disappearance reaction (Ghosn et al., 2010). Also, it has been shown that purinergic signaling modulates infiltration of neutrophils and can lead to cell death as a consequence of P2Y and P2X receptor activation (Junger, 2011; Proietti et al., 2019). In our study, we propose that intracellular purinergic receptors contribute to neutrophil function during sepsis. Having introduced the general principles and fundamentals of bacterial ATP with our studies, we fully agree that additional experiments need to address downstream purinergic receptor activation. That, however, would go beyond the scope of our study.

      (5) Additionally, is it possible that the observed in vivo changes could be triggered by bacterial components other than Extracellular ATP? In this research field, a comprehensive collection of inhibitors is available, so it is desirable to utilize them to demonstrate clearer results.

      This question is of utmost importance and defined the choice of our model and experimental approach. When we started the project, we used two different E. coli mutants that release low (ompC) and high (eaeH) amounts of ATP. However, the limitation of this approach is that these are different bacteria, which may also differ in the components they secrete or the surface proteins they express. We, therefore, decided against that approach. With the approach we finally used (same bacterium, just with and without ATP), we aimed to minimize the influence of non-ATP bacterial components.

      (6) Have the authors considered the role of host-derived Extracellular ATP in the context of inflammation?

      Yes, the role of host-derived extracellular ATP in inflammation and sepsis is well established, although with contradictory results (Csóka et al., 2015; Ledderose et al., 2016). These conflicting data were the rationale for testing the relevance of bacterial ATP. We suggest that bacterial ATP is essential in the early phase of sepsis, when bacteria invade the sterile compartment and before an efficient host response, including the eukaryotic release of ATP, is established.

      (7) The authors mention that Extracellular ATP is rapidly hydrolyzed by ectonucleotidases in vivo. Are the changes of immune cells within the peritoneal cavity caused by Extracellular ATP released from bacterial death or by OMVs?

      This is a relevant question that was also asked by reviewer #1, and we answered it in detail above (weaknesses comments #1 and #2). From our ATP measurements (Figure 4-figure supplement 1A and Figure 5C), we conclude that locally, the role of directly released (extracellular) bacterial ATP predominates over that of OMV-derived bacterial ATP. Furthermore, the mechanisms of action of directly released and OMV-derived bacterial ATP (the latter being contained within OMV, engulfed and transported to the endolysosomal compartment) are different, and extracellular ATP in particular has been described to lead to apoptosis via P2X7 signaling.

      (8) In the manuscript, the sample size (n) for the data consistently remains at 2. I would suggest expanding the sample size to enhance the robustness and rigor of the results.

      Two biological replicates (independent cultures) were used only for the bacterial cultures in Figure 1, Figure 2, and Figure 3; these yielded similar results with very small standard deviations, indicating that the measurements are robust. In the in vitro experiments in Figure 5, we used a sample size of 6 (three biological replicates measured in technical duplicates), since we saw larger deviations in our measurements. For the in vivo experiments, we always used 5 or more animals in at least two independent experiments.

      Reviewer #2 (Recommendations For The Authors):

      (9) Line 37: 11 million sepsis-related deaths were reported "in" 2017.

      The passage has been corrected as suggested.

      (10) By the way, the similar colors used in Figure 1C and G are too chaotic, making it difficult to distinguish.

      We agree, the colors have been adapted.

      Author response image 2.

      (11) All "in vivo" and "in vitro" should be italicized.

      We italicized all of them.

      (12) The title of Figure 4 is confusing: "Impairs sepsis outcome in vivo?" Could you make it more specific?

      We agree, the title has been rephrased:

      “Bacterial ATP reduces neutrophil counts and reduces survival in a mouse model of abdominal sepsis.”

      (13) Line 314-316: The sentence "Potentially, despite the lack of a transporter, ATP may similarly to eukaryotic cells leak (Yegutkin et al., 2006) across the inner membrane into the periplasmic space that lacks the enzymes for ATP generation." sounds odd.

      This passage was reformulated in the manuscript.

      “Despite the lack of a transporter, ATP may leak across the inner membrane into the periplasmic space. Such leakage may be similar to baseline leakage in eukaryotic cells (Yegutkin et al., 2006).”

      (14) The numerical notation in the paper is odd: sometimes it uses a prime symbol as a superscript (such as line 504), and sometimes it does not (such as line 421). Should it be standardized to "3,200" and "150,000"?

      Thank you for this remark. The numbers have been standardized throughout the manuscript.

      (15) Line "0.4 mm EP cuvettes" should be "0.4 cm EP cuvettes"

      The specified passage has been corrected as suggested.

      References

      Briaud, P., & Carroll, R. K. (2020). Extracellular Vesicle Biogenesis and Functions in Gram-Positive Bacteria. Infection and Immunity, 88(12), 10.1128/iai.00433-20. https://doi.org/10.1128/iai.00433-20

      Burnstock, G. (2016). P2X ion channel receptors and inflammation. Purinergic Signalling, 12(1), 59–67. https://doi.org/10.1007/s11302-015-9493-0

      Cheng, A. G., Ho, P.-Y., Aranda-Díaz, A., Jain, S., Yu, F. B., Meng, X., Wang, M., Iakiviak, M., Nagashima, K., Zhao, A., Murugkar, P., Patil, A., Atabakhsh, K., Weakley, A., Yan, J., Brumbaugh, A. R., Higginbottom, S., Dimas, A., Shiver, A. L., … Fischbach, M. A. (2022). Design, construction, and in vivo augmentation of a complex gut microbiome. Cell, 185(19), 3617-3636.e19. https://doi.org/10.1016/j.cell.2022.08.003

      Csóka, B., Németh, Z. H., Törő, G., Idzko, M., Zech, A., Koscsó, B., Spolarics, Z., Antonioli, L., Cseri, K., Erdélyi, K., Pacher, P., & Haskó, G. (2015). Extracellular ATP protects against sepsis through macrophage P2X7 purinergic receptors by enhancing intracellular bacterial killing. The FASEB Journal, 29(9), 3626–3637. https://doi.org/10.1096/fj.15-272450

      Diekema, D. J., Hsueh, P.-R., Mendes, R. E., Pfaller, M. A., Rolston, K. V., Sader, H. S., & Jones, R. N. (2019). The Microbiology of Bloodstream Infection: 20-Year Trends from the SENTRY Antimicrobial Surveillance Program. Antimicrobial Agents and Chemotherapy, 63(7), e00355-19. https://doi.org/10.1128/AAC.00355-19

      Dosch, M., Gerber, J., Jebbawi, F., & Beldi, G. (2018). Mechanisms of ATP Release by Inflammatory Cells. International Journal of Molecular Sciences, 19(4), 1222. https://doi.org/10.3390/ijms19041222

      Eltzschig, H. K., Sitkovsky, M. V., & Robson, S. C. (2012). Purinergic Signaling during Inflammation. New England Journal of Medicine, 367(24), 2322–2333. https://doi.org/10.1056/NEJMra1205750

      Ghosn, E. E. B., Cassado, A. A., Govoni, G. R., Fukuhara, T., Yang, Y., Monack, D. M., Bortoluci, K. R., Almeida, S. R., Herzenberg, L. A., & Herzenberg, L. A. (2010). Two physically, functionally, and developmentally distinct peritoneal macrophage subsets. Proceedings of the National Academy of Sciences, 107(6), 2568–2573. https://doi.org/10.1073/pnas.0915000107

      Hironaka, I., Iwase, T., Sugimoto, S., Okuda, K., Tajima, A., Yanaga, K., & Mizunoe, Y. (2013). Glucose Triggers ATP Secretion from Bacteria in a Growth-Phase-Dependent Manner. Applied and Environmental Microbiology, 79(7), 2328–2335. https://doi.org/10.1128/AEM.03871-12

      Idzko, M., Ferrari, D., & Eltzschig, H. K. (2014). Nucleotide signalling during inflammation. Nature, 509(7500), 310–317. https://doi.org/10.1038/nature13085

      Iwase, T., Shinji, H., Tajima, A., Sato, F., Tamura, T., Iwamoto, T., Yoneda, M., & Mizunoe, Y. (2010). Isolation and Identification of ATP-Secreting Bacteria from Mice and Humans. Journal of Clinical Microbiology, 48(5), 1949–1951. https://doi.org/10.1128/JCM.01941-09

      Junger, W. G. (2011). Immune cell regulation by autocrine purinergic signalling. Nature Reviews Immunology, 11(3), 201–212. https://doi.org/10.1038/nri2938

      Ledderose, C., Bao, Y., Kondo, Y., Fakhari, M., Slubowski, C., Zhang, J., & Junger, W. G. (2016). Purinergic Signaling and the Immune Response in Sepsis: A Review. Clinical Therapeutics, 38(5), 1054–1065. https://doi.org/10.1016/j.clinthera.2016.04.002

      Mureșan, M. G., Balmoș, I. A., Badea, I., & Santini, A. (2018). Abdominal Sepsis: An Update. The Journal of Critical Care Medicine, 4(4), 120–125. https://doi.org/10.2478/jccm-2018-0023

      Proietti, M., Perruzza, L., Scribano, D., Pellegrini, G., D’Antuono, R., Strati, F., Raffaelli, M., Gonzalez, S. F., Thelen, M., Hardt, W.-D., Slack, E., Nicoletti, M., & Grassi, F. (2019). ATP released by intestinal bacteria limits the generation of protective IgA against enteropathogens. Nature Communications, 10(1), Article 1. https://doi.org/10.1038/s41467-018-08156-z

      Schwechheimer, C., & Kuehn, M. J. (2015). Outer-membrane vesicles from Gram-negative bacteria: Biogenesis and functions. Nature Reviews Microbiology, 13(10), 605–619. https://doi.org/10.1038/nrmicro3525

    2. eLife assessment

      This fundamental study advances our understanding of the role of bacterial-derived extracellular ATP in the pathogenesis of sepsis. The evidence supporting the conclusions is compelling, although not all concerns from a previous round of reviews were adequately addressed. The work will be of broad interest to researchers on microbiology and infectious diseases.

    3. Reviewer #2 (Public Review):

      Summary:

      In their manuscript, Daniel Spari et al. explored the dual role of ATP in exacerbating sepsis, revealing that ATP from both host and bacteria significantly impacts immune responses and disease progression.

      Strengths:

      The study meticulously examines the complex relationship between ATP release and bacterial growth, membrane integrity, and how bacterial ATP potentially dampens inflammatory responses, thereby impairing survival in sepsis models. Additionally, this compelling paper implies a concept that bacterial OMVs act as vehicles for the systemic distribution of ATP, influencing neutrophil activity and exacerbating sepsis severity.

      Weaknesses:

      (1) The researchers extracted and cultivated abdominal fluid on LB agar plates, then randomly picked 25 colonies for analysis. However, they didn't conduct 16S sequencing on the fluid itself. It's worth noting that the bacterial species present may vary depending on the individual patients. It would be beneficial if the authors could specify whether they've verified the existence of unculturable species capable of secreting high levels of Extracellular ATP.

      (2) Do mice lacking commensal bacteria show a lack of Extracellular ATP following cecal ligation puncture?

      (3) The authors isolated various bacteria from abdominal fluid, encompassing both Gram-negative and Gram-positive types. Nevertheless, their emphasis appeared to be primarily on the Gram-negative E. coli. It would be beneficial to ascertain whether the mechanisms of Extracellular ATP release differ between Gram-positive and Gram-negative bacteria. This is particularly relevant given that the Gram-positive bacterium E. faecalis, also isolated from the abdominal fluid, is recognized for its propensity to release substantial amounts of Extracellular ATP.

      (4) The authors observed changes in the levels of LPM, SPM, and neutrophils in vivo. However, it remains uncertain whether the proliferation or migration of these cells is modulated or inhibited by ATP receptors like P2Y receptors. This aspect requires further investigation to establish a convincing connection.

      (5) Additionally, is it possible that the observed in vivo changes could be triggered by bacterial components other than Extracellular ATP? In this research field, a comprehensive collection of inhibitors is available, so it is desirable to utilize them to demonstrate clearer results.

      (6) Have the authors considered the role of host-derived Extracellular ATP in the context of inflammation?

      (7) The authors mention that Extracellular ATP is rapidly hydrolyzed by ectonucleotidases in vivo. Are the changes in immune cells within the peritoneal cavity caused by Extracellular ATP released from bacterial death or by OMVs?

      (8) In the manuscript, the sample size (n) for the data consistently remains at 2. I would suggest expanding the sample size to enhance the robustness and rigor of the results.

    4. Reviewer #1 (Public Review):

      Summary:

      Extracellular ATP represents a danger-associated molecular pattern associated with tissue damage and can also act in an autocrine fashion in macrophages to promote proinflammatory responses, as observed in a previous paper by the authors on abdominal sepsis. The present study addresses an important aspect possibly conditioning the outcome of sepsis, namely the release of ATP by bacteria. The authors show that sepsis-associated bacteria do in fact release ATP in a growth-dependent and strain-specific manner. However, whether this bacteria-derived ATP plays a role in the pathogenesis of abdominal sepsis has not been determined. To address this question, a number of mutant strains of E. coli were used to correlate bacterial ATP release first with growth and then with outer membrane integrity and bacterial death. By using E. coli transformants expressing the ATP-degrading enzyme apyrase in the periplasmic space, the paper nicely shows that abdominal sepsis caused by these transformants results in significantly improved survival. This effect was associated with a reduction of small peritoneal macrophages and CX3CR1+ monocytes, and an increase in neutrophils. To isolate the function of bacterial ATP from the systemic response to microorganisms, the authors exploited bacterial OMVs either loaded or not with ATP to investigate the systemic effects in the absence of living microorganisms. This approach showed that ATP-loaded OMVs induced degranulation of neutrophils after lysosomal uptake, suggesting that this mechanism could contribute to sepsis severity.

      Strengths:

      The most compelling part of the study is the analysis of E. coli mutants to address different aspects of bacterial release of ATP that could be pathogenically relevant during systemic dissemination of bacteria in the host.

      Weaknesses:

      As pointed out in the limitations of the study, whether ATP-loaded OMVs can provide mechanistic proof of the pathogenetic role of bacteria-derived ATP independently of live microorganisms in sepsis is an interesting question, but the evidence is not definitively convincing. It could be useful to see whether degranulation of neutrophils is also induced differently by apyrase-expressing vs control E. coli transformants. Also, the increase of neutrophils in bacterial ATP-depleted abdominal sepsis, which has a better outcome than "ATP-proficient" sepsis, seems difficult to reconcile with the hypothesized tissue damage induced by ATP delivered via non-infectious OMVs. Is the neutrophil count affected by ATP delivered via OMVs? A comparison of cytokine profiles in the abdominal fluids of E. coli and OMV treated animals could probably be helpful in defining the different responses induced by OMV-delivered vs bacterial-released ATP.

      The analyses performed on OMV treated versus E. coli infected mice are not immediately related and are difficult to combine when trying to draw a pathogenetic hypothesis for bacterial ATP in sepsis.

      It's not clear why lung neutrophils were used for RNAseq.

    1. eLife assessment

      This useful modeling study explores how the biophysical properties of interneuron subtypes in the basolateral amygdala enable them to produce nested oscillations whose interactions facilitate functions such as spike-timing-dependent plasticity. The strength of evidence is currently viewed as incomplete because of insufficient grounding in prior experimental results and insufficient consideration of alternative explanations. This work will be of interest to investigators studying circuit mechanisms of fear conditioning as well as rhythms in the basolateral amygdala. (The authors explain why they disagree with this assessment in their author response.)

    1. Author response:

      Reviewer #1 (Public Review):

      This study excellently complements the previous one by unveiling the properties of NPRL2 in augmenting the effect of immune checkpoint inhibitors such as pembrolizumab in KRAS mutant lung cancer models.

      The following points should be clarified:

      (1) In KRAS mutant cell lines with LKB1 co-mutations or deletions, such as A549 cells, does treatment with NPRL2 not increase the efficacy of immunotherapy? Is this correct? Similarly, does the delivery of NPRL2 only potentiate the effect of immunotherapy in KRAS mutant cell lines without associated LKB1 mutations?

      NPRL2, when used as a single-agent immunotherapy, induces robust antitumor activity in immunotherapy-resistant (aPD1R) KRAS mutant models, such as A549 tumors (KRASmt/LKB1mt/aPD1R) and LLC2 (KRASmt/aPD1R), where immunotherapy is ineffective regardless of LKB1 co-mutation or deletion status. The antitumor effect of NPRL2 combined with aPD1 immunotherapy was not significantly different from NPRL2 alone in immunotherapy-resistant models but was significantly greater than immunotherapy alone. However, a synergistic antitumor effect was observed with NPRL2 and aPD1 immunotherapy in KRAS wild-type and immunotherapy-moderately-responsive models, such as H1299 (KRASwt/aPD1S).

      (2) Did the authors analyze by western blot whether NPRL2 influences or restores STING and LKB1 in the A549 cell line, which lacks LKB1 and STING?

      NPRL2 induces antitumor immunity in KRAS mutant, aPD1-resistant models regardless of LKB1 co-mutations or deletions. However, it would be interesting to examine the effect of NPRL2 on the STING pathway in the LKB1-deleted A549 cell line.

      (3) Mechanistically, is there any explanation as to why NPRL2 delivery increases the efficacy of immunotherapy? Is there any effect on FUS or MYC?

      NPRL2 is a multifunctional tumor suppressor gene that is downregulated or absent in many cancers. NPRL2 has been shown to induce apoptosis, inhibit cell proliferation, and cause cell cycle arrest in various cancer types. Compelling evidence highlights the critical role of NPRL2 in causing DNA damage and double-strand breaks, which can trigger dendritic cell (DC) activation, antigen presentation, and priming of tumor-specific CD8+ T cells in the tumor microenvironment (TME). Our data indicate that NPRL2 treatment is associated with the induction of DC activation and maturation.

      The cellular mechanism of NPRL2 suggests that NPRL2-mediated antitumor immunity depends on the presence of CD4+ T cells, CD8+ T cells, and macrophages. Interestingly, the expression of FUS1, another tumor suppressor gene, was mostly absent or severely downregulated in most non-small cell lung cancers (NSCLC) and was unaffected by NPRL2 treatment. While MYC expression was not assessed in this study, it remains an area of interest for future research.

      (4) Is there any way to carry out a clinical study of systematically delivering NPRL2 in KRAS lung cancer patients?

      In this preclinical study, a clinical-grade DOTAP-NPRL2 formulation was prepared, utilizing NPRL2 encapsulated within nanovesicles for delivery. Based on the promising preclinical data, a phase I clinical trial will be initiated to evaluate the safety and efficacy of this formulation.

      Reviewer #2 (Public Review):

      Summary:

      In "NPRL2 gene therapy induces effective antitumor immunity in KRAS/STK11 mutant anti-PD1 resistant metastatic non-small cell lung cancer (NSCLC) in a humanized mouse model", Meraz et al. investigated the antitumor immune responses to NPRL2 gene therapy in aPD1R/KRAS/STK11mt NSCLC in a humanized mouse model and found that NPRL2 gene therapy induces antitumor activity in KRAS/STK11mt/aPD1R tumors through DC-mediated antigen presentation and cytotoxic immune cell activation.

      Strengths:

      The novelty of the study.

      Weaknesses:

      (1) The effect of NPRL2 combined with pembrolizumab is inconsistent. Figure 2I-K shows a similar tumor intensity in the NPRL2 group and the combination group. However, NPRL2 combined with pembrolizumab was synergistic in the KRASwt/aPD1S H1299 tumors in Figure 4.

      NPRL2, as a single-agent immunogen therapy, induces robust antitumor activity both in immunotherapy-resistant (aPD1R) KRAS mutant models, such as A549 (KRASmt/LKB1mt/aPD1R) and LLC2 (KRASmt/aPD1R) tumors, and in an immunotherapy-sensitive model, H1299 (KRASwt/aPD1S), in which immunotherapy was ineffective or only modestly effective, respectively. A synergistic antitumor effect of the NPRL2 and pembrolizumab combination was found only in moderately immunotherapy-responsive models, not in immunotherapy-resistant models, where PD-1/PD-L1 signaling is impaired, as shown in Figure 1A.

      (2) The authors stated that NPRL2 combined with pembrolizumab was not synergistic in the KRAS/STK11mt/aPD1R tumors but was synergistic in the KRASwt/aPD1S H1299 tumors. How was the synergistic effect defined in the study? More details need to be provided here.

      Our biostatistician used generalized linear regression models to study tumor growth over time. Two-way ANOVA with the interaction of treatment group and time point was performed to compare the changes in tumor intensity from baseline between each pair of treatment groups at each time point. The nonparametric Mann-Whitney U test was applied to assess the significance of differences between treatment groups. Differences with P < 0.05, P < 0.01, and P < 0.001 were considered statistically significant. When the combined antitumor effect of NPRL2 and pembrolizumab was statistically significant compared with both single-agent effects, synergy was confirmed using the method of Huang et al.

      Huang L, Wang J, Fang B, Meric-Bernstam F, Roth JA, Ha MJ. CombPDX: a unified statistical framework for evaluating drug synergism in patient-derived xenografts. Sci Rep 12(1):12984, 7/2022. e-Pub 7/2022. PMCID: PMC9338066.
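
      As an illustration of the pairwise nonparametric comparison mentioned in this response, the following is a minimal sketch of a two-sided Mann-Whitney U test. It is not the authors' analysis code: the function names, the use of a tie-corrected normal approximation without continuity correction, and the example values are all assumptions made for illustration. For small groups, an exact test from an established statistics package would normally be preferred.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(group_a), len(group_b)
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p


if __name__ == "__main__":
    # Hypothetical changes in tumor intensity from baseline (arbitrary units).
    control = [6.0, 7.0, 8.0, 9.0, 10.0]
    treated = [1.0, 2.0, 3.0, 4.0, 5.0]
    u_stat, p_value = mann_whitney_u(treated, control)
    print(f"U = {u_stat}, p = {p_value:.4f}")
```

      In this sketch the two groups do not overlap at all, so U is zero and the approximate p-value falls below the 0.05 threshold named in the response.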

      (3) Nearly all of the work was performed pre-clinically. Validation in the clinical setting would provide stronger evidence for the conclusion.

      In this preclinical study, a clinical-grade DOTAP-NPRL2 formulation was prepared, utilizing NPRL2 encapsulated within nanovesicles for delivery. Based on the promising preclinical data, a phase I clinical trial will be initiated to evaluate the safety and efficacy of this formulation.

      (4) Figure 5 and Figure 6 have the same legend. These two figures could be merged into one.

      Agreed.

      (5) Figure 5B & C: n=9 in Figure 5B. However, the number of data points in Figure 5C is less than 9.

      At least 7 mice per group (n = 7-9) are shown in Figure 5C. We will revise accordingly.

      Reviewer #3 (Public Review):

      Summary:

      NPRL2/TUSC4 is a tumor suppressor gene whose expression is reduced in many cancers including NSCLC. This study presents a novel finding on NPRL2 gene therapy, which induces antitumor activity on aPD1-resistant tumors. Since KRAS/STK11 mutant tumors were reported to benefit less from ICIs, this study has potential clinical application value.

      Strengths:

      This work uncovers the advantage of NPRL2 gene therapy by using humanized models and multiple cell lines. Moreover, via immune cell depletion studies, the mechanism of NPRL2 gene therapy has been narrowed down to dendritic cells and CD8+ T cells.

      Weaknesses:

      A major concern is the lack of systematic and logical rigor. This work did not present a link between apoptosis and the antigen presentation induced by NPRL2 restoration. There is no evidence that the PI3K/AKT/mTOR signaling pathway is related to antigen presentation, which is the major reason for the NPRL2-induced antitumor response. Therefore, the two parts may not support each other logically.

      Thank you for your review and comments. We agree that future studies are necessary to establish a direct link between apoptosis and antigen presentation induced by NPRL2 restoration, as well as NPRL2-mediated downregulation of PI3K/AKT/mTOR signaling and its direct effect on antigen presentation. Nevertheless, NPRL2 restoration directly induced apoptosis in several cell lines, as shown in Figure 1C and Figure 8Q, and NPRL2 treatment significantly increased the number of antigen-presenting DCs in the tumor microenvironment. Similarly, NPRL2 restoration downregulated the PI3K/AKT/mTOR pathway, which was associated with increased antitumor immunity.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Gating of Kv10 channels is unique because it involves coupling between non-domain swapped voltage sensing domains, a domain-swapped cytoplasmic ring assembly formed by the N- and C-termini, and the pore domain. Recent structural data suggests that activation of the voltage sensing domain relieves a steric hindrance to pore opening, but the contribution of the cytoplasmic domain to gating is still not well understood. This aspect is of particular importance because proteins like calmodulin interact with the cytoplasmic domain to regulate channel activity. The effects of calmodulin (CaM) in WT and mutant channels with disrupted cytoplasmic gating ring assemblies are contradictory, resulting in inhibition or activation, respectively. The underlying mechanism for these discrepancies is not understood. In the present manuscript, Reham Abdelaziz and collaborators use electrophysiology, biochemistry and mathematical modeling to describe how mutations and deletions that disrupt inter-subunit interactions at the cytoplasmic gating ring assembly affect Kv10.1 channel gating and modulation by CaM. In the revised manuscript, additional information is provided to allow readers to identify within the Kv10.1 channel structure the location of E600R, one of the key channel mutants analyzed in this study. However, the mechanistic role of the cytoplasmic domains that this study focuses on, as well as the location of the ΔPASCap deletion and other perturbations investigated in the study remain difficult to visualize without additional graphical information. This can make it challenging for readers to connect the findings presented in the study with a structural mechanism of channel function.

      The authors focused mainly on two structural perturbations that disrupt interactions within the cytoplasmic domain, the E600R mutant and the ΔPASCap deletion. By expressing mutants in oocytes and recording currents using two-electrode voltage clamp (TEVC), it is found that both ΔPASCap and E600R mutants have biphasic conductance-voltage (G-V) relations and exhibit activation and deactivation kinetics with multiple voltage-dependent components. Importantly, the mutant-specific component in the G-V relations is observed at negative voltages where WT channels remain closed. The authors argue that the biphasic behavior in the G-V relations is unlikely to result from two different populations of channels in the oocytes, because they found that the relative amplitude between the two components in the G-V relations was highly reproducible across individual oocytes that otherwise tend to show high variability in expression levels. Instead, the G-V relations for all mutant channels could be well described by an equation that considers two open states O1 and O2, and a transition between them; O1 appeared to be unaffected by any of the structural manipulations tested (i.e. E600R, ΔPASCap, and other deletions) whereas the parameters for O2 and the transition between the two open states were different between constructs. The O1 state is not observed in WT channels and is hypothesized to be associated with voltage sensor activation. O2 represents the open state that is normally observed in WT channels and is speculated to be associated with conformational changes within the cytoplasmic gating ring that follow voltage sensor activation, which could explain why the mutations and deletions disrupting cytoplasmic interactions affect primarily O2.
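
      To make the idea of a biphasic G-V relation with two open states concrete, the following is a minimal sketch. The review does not reproduce the authors' equation, so the functional form here (one Boltzmann term per open state, weighted by a third Boltzmann term that shifts occupancy from O1 to O2) is an assumption, and every parameter name and value is hypothetical and chosen only to produce a curve with two components.

```python
import math


def boltzmann(v_mv, v_half_mv, slope_mv):
    """Fraction activated at voltage v_mv for a simple Boltzmann relation."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


def biphasic_gv(v_mv,
                a1=1.0, v1=-60.0, k1=10.0,   # hypothetical O1 component
                a2=1.0, v2=40.0, k2=15.0,    # hypothetical O2 component
                vt=0.0, kt=10.0):            # hypothetical O1 -> O2 transition
    """Illustrative normalized conductance with two open states.

    `shift` is the assumed fraction of channels that has moved from the
    O1-preferring regime to the O2-preferring regime at voltage v_mv.
    """
    shift = boltzmann(v_mv, vt, kt)
    g_o1 = a1 * boltzmann(v_mv, v1, k1) * (1.0 - shift)
    g_o2 = a2 * boltzmann(v_mv, v2, k2) * shift
    return g_o1 + g_o2


if __name__ == "__main__":
    # Print the curve at a few test voltages to show the two components.
    for v in range(-100, 101, 20):
        print(f"{v:+5d} mV  G = {biphasic_gv(float(v)):.3f}")
```

      With these made-up parameters, the conductance rises at negative voltages (the O1 component), dips as occupancy shifts away from O1, and rises again at depolarized voltages (the O2 component). Changing only the O2 and transition parameters while leaving the O1 parameters fixed mimics the reviewer's description that O1 was unaffected by the structural manipulations.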

      Severing the covalent link between the voltage sensor and pore reduced O1 occupancy in one of the deletion constructs. Although this observation is consistent with the hypothesis that voltage-sensor activation drives entry into O1, this result is not conclusive. Structural as well as functional data has established that the coupling of the voltage sensor and pore does not entirely rely on the S4-S5 covalent linker between the sensor and the pore, and thus the severed construct could still retain coupling through other mechanisms, which is consistent with the prominent voltage dependence that is observed. If both states O1 and O2 require voltage sensor activation, it is unclear why the severed construct would affect state O1 primarily, as suggested in the manuscript, as opposed to decreasing occupancy of both open states. In line with this argument, the presence of Mg2+ in the extracellular solution affected both O1 and O2. This finding suggests that entry into both O1 and O2 requires voltage-sensor activation because Mg2+ ions are known to stabilize the voltage sensor in its most deactivated conformations. 

      We agree with the reviewer that access to both states requires a conformational change in the voltage sensor. This was stated in our revised article: “In contrast, to enter O2, all subunits must complete both voltage sensor transitions and the collective gating ring transition.” We interpret the two gating steps as sequential; the effective rotation of the intracellular ring would happen only once the sensor is in its fully activated position.

      We also agree that the S4-S5 segment cannot be the only interaction mechanism, as we demonstrated in our earlier work (Lörinczi et al., 2015; Tomczak et al., 2017).  

      Activation towards and closure from O1 is slow, whereas channels close rapidly from O2. A rapid alternating pulse protocol was used to take advantage of the difference in activation and deactivation kinetics between the two open components in the mutants and thus drive an increasing number of channels towards state O1. Currents activated by the alternating protocol reached larger amplitudes than those elicited by a long depolarization to the same voltage. This finding is interpreted as an indication that O1 has a larger macroscopic conductance than O2. In the revised manuscript, the authors performed single-channel recordings to determine why O1 and O2 have different macroscopic conductance. The results show that at voltages where the state O1 predominates, channels exhibited longer open times and overall higher open probability, whereas at more depolarized voltages where occupancy of O2 increases, channels exhibited more flickery gating behavior and decreased open probability. These results are informative but not conclusive because additional details about how experiments were conducted, and group data analysis are missing. Importantly, results showing inhibition of single ΔPASCap channels by a Kv10-specific inhibitor are mentioned but not shown or quantitated - these data are essential to establish that the new O1 conductance indeed represents Kv10 channel activity.

      We observed the activity of a channel compatible with Kv10.1 ΔPASCap (long openings at low-to-moderate potentials, very short flickery activity at strong depolarizations) in 12 patches from oocytes obtained from different frog operations over a period of two and a half months, once the experimental conditions had been established. As stated in the text, we did not proceed to generate amplitude histograms because we could not resolve clear single-channel events at strong depolarizations. Astemizole abolished the activity and (remarkably) strongly reduced the noise in traces at strong depolarizations, which we interpret as partially caused by flicker openings.

      Author response image 1.

      We include two example recordings of astemizole application (100 µM) to two different patches. Both recordings were performed at -60 mV (to decrease the likelihood that the channel visits O2) with 100 mM internal and 60 mM external K+. In both cases, the traces in astemizole are shown in red.

      It is shown that conditioning pulses to very negative voltages result in mutant channel currents that are larger and activate more slowly than those elicited at the same voltage but starting from less negative conditioning pulses. In voltage-activated curves, O1 occupancy is shown to be favored by increasingly negative conditioning voltages. This is interpreted as indicating that O1 is primarily accessed from deeply closed states in which voltage sensors are in their most deactivated position. Consistently, a mutation that destabilizes these deactivated states is shown to largely suppress the first component in voltage-activation curves for both ΔPASCap and E600R channels.

      The authors then address the role of the hidden O1 state in channel regulation by calmodulin. Stimulating calcium entry into oocytes with ionomycin and thapsigargin, assumed to enhance CaM-dependent modulation, resulted in preferential potentiation of the first component in ΔPASCap and E600R channels. This potentiation was attenuated by including an additional mutation that disfavors deeply closed states. Together, these results are interpreted as an indication that calcium-CaM preferentially stabilizes deeply closed states from which O1 can be readily accessed in mutant channels, thus favoring current activation. In WT channels lacking a conducting O1 state, CaM stabilizes deeply closed states and is therefore inhibitory. It is found that the potentiation of ΔPASCap and E600R by CaM is more strongly attenuated by mutations in the channel that are assumed to disrupt interaction with the C-terminal lobe of CaM than mutations assumed to affect interaction with the N-terminal lobe. These results are intriguing but difficult to interpret in mechanistic terms. The strong effect that calcium-CaM had on the occupancy of the O1 state in the mutants raises the possibility that O1 can be only observed in channels that are constitutively associated with CaM. To address this, a biochemical pull-down assay was carried out to establish that only a small fraction of channels are associated with CaM under baseline conditions. These CaM experiments are potentially very interesting and could have wide physiological relevance. However, the approach utilized to activate CaM is indirect and could result in additional nonspecific effects on the oocytes that could affect the results.

      Finally, a mathematical model is proposed consisting of two layers involving two activation steps for the voltage sensor, and one conformational change in the cytoplasmic gating ring - completion of both sets of conformational changes is required to access state O2, but accessing state O1 only requires completion of the first voltage-sensor activation step in the four subunits. The model qualitatively reproduces most major findings on the mutants. Although the model used is highly symmetric and appears simple, the mathematical form used for the rate constants in the model adds a layer of complexity to the model that makes mechanistic interpretations difficult. In addition, many transitions that from a mechanistic standpoint should not depend on voltage were assigned a voltage dependence in the model. These limitations diminish the overall usefulness of the model which is prominently presented in the manuscript. The most important mechanistic assumptions in the model are not addressed experimentally, such as the proposition that entry into O1 depends on the opening of the transmembrane pore gate, whereas entry into O2 involves gating ring transitions - it is unclear why O2 would require further gating ring transitions to conduct ions given that the gating ring can already support permeation by O1 without any additional conformational changes.

      In essence, we agree with the reviewer; we already have addressed these points in our revised article:

      Regarding the voltage dependence we write “the κ/λ transition could reasonably be expected to be voltage independent because we related it to ring reconfiguration, a process that should occur as a consequence of a prior VSD transition. We have made some attempts to treat this transition as voltage independent but state-specific with upper-layer bias for states on the right and lower-layer bias for states on the left. This is in principle possible, as can already be gleaned from the similar voltage ranges of the left-right transition (α/β) and the κL/λ transition. However, this approach leads to a much larger number of free, less well constrained kinetic parameters and drastically complicated the parameter search. ” As you can see, we also formulated a strategy to free the model of the potentially spurious voltage dependence and (in bold here) explained why we did not follow this route in this study. 

      Regarding the need for gating ring transitions after O1, we wrote, “Thus, the underlying gating events can be separated into two steps: The first gating step involves only the voltage sensor without engaging the ring and leads to a pre-open state, which is non-conducting in the WT but conducting in our mutants. The second gating event operates at higher depolarizations, involves a change in the ring, and leads to an open state both in WT and in the mutants. ” 

      We interpret your statements such that you expect the conducting state to remain available once O1 is reached. However, the experimental evidence argues against the pore remaining available regardless of further gating steps beyond O1. The description of model construction is informative here: “... we could exclude many possible [sites at which O1 connects to closed states] because the attachment site must be sufficiently far away from the conventional open state [O2]. Otherwise, the transition from "O1 preferred" to "O2 preferred" via a few closed intermediate states is very gradual and never produces the biphasic GV curves [that we observed]. ” 

      In other words, voltage-dependent gating steps beyond the state that offers access to O1 appear to close the pore, after it was open. That might occur because only then (for states in which at least one voltage sensor exceeded the intermediate position) the ring is fixed in a particular state until all sensors completed activation. In the WT, closing the pore in deactivated states might rely on an interaction that is absent in the mutant because, at least in HERG: “the interaction between the PAS domain and the C-terminus is more stable in closed than in open KV11.1 (HERG) channels, and a single chain antibody binding to the interface between PAS domain and CNBHD can access its epitope in open but not in closed channels, strongly supporting a change in conformation of the ring during gating ”

      Reviewer #3 (Public Review):

      In the present manuscript, Abdelaziz and colleagues interrogate the gating mechanisms of Kv10.1, an important voltage-gated K+ channel in cell cycle and cancer physiology. At the molecular level, Kv10.1 is regulated by voltage and Ca-CaM. Structures solved using CryoEM for Kv10.1 as well as other members of the KCNH family (Kv11 and Kv12) show channels that do not contain a structured S4-S5 linker imposing therefore a non-domain swapped architecture in the transmembrane region. However, the cytoplasmic N- and C- terminal domains interact in a domain swapped manner forming a gating ring. The N-terminal domain (PAS domain) of one subunit is located close to the intracellular side of the voltage sensor domain and interacts with the C-terminal domain (CNBHD domain) of the neighbor subunit. Mutations in the intracellular domains have a profound effect on the channel gating. The complex network of interactions between the voltage-sensor and the intracellular domains makes the PAS domain a particularly interesting domain of the channel to study as responsible for the coupling between the voltage sensor domains and the intracellular gating ring.

      The coupling between the voltage-sensor domain and the gating ring is not fully understood and the authors aim to shed light into the details of this mechanism. In order to do that, they use well established techniques such as site-directed mutagenesis, electrophysiology, biochemistry and mathematical modeling. In the present work, the authors propose a two open state model that arises from functional experiments after introducing a deletion on the PAS domain (ΔPAS Cap) or a point mutation (E600R) in the CNBHD domain. The authors measure a bi-phasic G-V curve with these mutations and assign each phase as two different open states, one of them not visible on the WT and only unveiled after introducing the mutations.

      The hypothesis proposed by the authors could change the current paradigm in the current understanding for Kv10.1 and it is quite extraordinary; therefore, it requires extraordinary evidence to support it.

      STRENGTHS: The authors use adequate techniques such as electrophysiology and site-directed mutagenesis to address the gating changes introduced by the molecular manipulations. They also use appropriate mathematical modeling to build a Markov model and identify the mechanism behind the gating changes.

      WEAKNESSES: The results presented by the authors do not fully support their conclusions since they could have alternative explanations. The authors base their primary hypothesis on the bi-phasic behavior of a calculated G-V curve that does not match the tail behavior; the experimental conditions used in the present manuscript introduce uncertainties, weakening their conclusions and complicating the interpretation of the results. Therefore, their experimental conditions need to be revisited. 

      We respectfully disagree. We think that your suggestions for alternative explanations are addressed in the current version of the article. We will rebut them once more below, but we feel the need to point out that our arguments are already laid out in the revised article.

      I have some concerns related to the following points:

      (1) Biphasic gating behavior

      The authors use the TEVC technique in oocytes extracted surgically from Xenopus laevis frogs. The method is well established and is adequate to address ion channel behavior. The experiments are performed in chloride-based solutions which present a handicap when measuring outward rectifying currents at very depolarizing potentials due to the presence of calcium-activated chloride channels expressed endogenously in the oocytes; these channels will open and rectify chloride intracellularly adding to the outward rectifying traces during the test pulse. The authors calculate their G-V curves from the test pulse steady-state current instead of using the tail currents. The conductance measurements are normally taken from the 'tail current' because tails are measured at a fixed voltage hence maintaining the driving force constant. 

      We respectfully disagree. In contrast to other channels, like HERG, a common practice for Kv10 is not to use tail currents. It has long been known that in this channel, tail currents and test-pulse steady-state currents can appear to be at odds because the channels deactivate extremely rapidly, at the border of temporal resolution of the measurements and with intricate waveforms. This complicates the estimation of the instantaneous tail current. Therefore, the outward current is commonly used to estimate conductance (Terlau et al., 1996; Schönherr et al., 1999; Schönherr et al., 2002; Whicher and MacKinnon, 2019), while the latter authors also use the extreme of the tail for some mutants.

      Due to their activation at very negative voltage, the reversal potential in our mutants can be measured directly; we are, therefore, more confident with this approach. Nevertheless, we have determined the initial tail current in some experiments. The behavior of these is very similar to the average that we present in Figure 1. The biphasic behavior is unequivocally present.
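      For readers less familiar with this quantification, the conversion from steady-state current to a normalized G-V relation can be sketched as follows. This is only a minimal illustration in Python, assuming an ohmic open channel and a directly measured reversal potential; the function names, the reversal potential of -90 mV, and the current values are hypothetical and are not taken from our recordings.

```python
def chord_conductance(current_uA, voltage_mV, e_rev_mV):
    """Chord conductance G = I / (V - E_rev); µA divided by mV gives mS."""
    driving_force = voltage_mV - e_rev_mV
    if abs(driving_force) < 1e-9:
        # The conductance is undefined where the driving force vanishes.
        raise ValueError("test voltage equals the reversal potential")
    return current_uA / driving_force


def normalized_gv(currents_uA, voltages_mV, e_rev_mV):
    """Conductance at each test voltage, normalized to the largest value."""
    g = [chord_conductance(i, v, e_rev_mV)
         for i, v in zip(currents_uA, voltages_mV)]
    g_max = max(g)
    return [value / g_max for value in g]


# Hypothetical steady-state currents at the end of each test pulse.
voltages = [-60.0, -20.0, 20.0, 60.0, 100.0]
currents = [0.3, 1.4, 2.2, 4.5, 9.5]
gv = normalized_gv(currents, voltages, e_rev_mV=-90.0)
```

      Because the driving force is computed explicitly for every test voltage, this approach does not require an estimate of the instantaneous tail amplitude; it does rely on a reliable reversal potential.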

      Author response image 2.

      Calculating the conductance from the traces should not be a problem, however, in the present manuscript, the traces and the tail currents do not agree. 

      The referee’s observation is perfectly in line with the long-standing experience of several labs working with KV10: tail current amplitudes in KV10 appear to be out of proportion for the WT open state (O2). Importantly, this is due to the rapid closure, which is not present in O1. As a consequence, the initial amplitudes of tail currents from O1 are easier to estimate correctly, and they are much more obvious in the graphs. Taken together, these differences between O1 and O2 explain the misconception the reviewer describes next.

      The tail traces shown in Fig1E do not show an increasing current amplitude in the voltage range from +50mV to +120mV, they seem to have reached a 'saturation state', suggesting that the traces from the test pulse contain an inward chloride current contamination. 

      As stated in the text and indicated in Author response image 3, the tail currents in Figure 1E increase in amplitude between +50 and +120 mV, as can be seen in the examples below from different experiments (+50 is presented in black, +120 in red). As stated above, the increase is not as evident as in traces from other mutants because the predominance of O2 also implies a much faster deactivation.

      Author response image 3. 

      We are aware that Ca2+-activated Cl- currents can represent a problem when interpreting electrophysiological data in oocytes. In fact, we show in Supplement 1 to Figure 8 that this can be the case during the Ca2+-CaM experiments, where the increase in Ca2+ would certainly augment Cl- contribution to the outward current. This is why we performed these experiments in Cl--free solutions. As we show in Figure 8, the biphasic behavior was also present in those experiments. 

      Importantly, Cl- free bath solutions would not correct contamination during the tail, since this would correspond to Cl- exiting the oocyte. Yet, if there were contamination of the outward currents by Cl-, one would expect it to increase with larger depolarizations as the typical Ca2+-activated Cl- current in oocytes does. As the reviewer states, this does not seem to be the case.

      In addition, this second component identified by the authors as a second open state appears after +50mV and seems to never saturate. The normalization to the maximum current level during the test pulse, exaggerates this second component on the calculated G-V curve. 

      We agree that this second component continues to increase; the reviewer brought this up in the first review, and we have already addressed this in our reply and in the discussion of the revised version: “This flicker block might also offer an explanation for a feature of the mutant channels, that is not explained in the current model version: the continued increase in current amplitude, hundreds of milliseconds into a strong depolarization (Supp. 4 to Fig. 9). If the relative stability of O2 and C2 continued to change throughout depolarization, such a current creep-up could be reproduced. However, this would require either the introduction of further layers of On ↔Cn states, or a non-Markovian modification of the model’s time evolution.” With non-Markovian, we mean a Langevin-type diffusive process. 

      It's worth noticing that the ΔPASCap mutant experiments on Fig 5 in Mes based solutions do not show that second component on the G-V.

      For the readers of this conversation, we would like to clarify that the reviewer likely refers to experiments shown in Fig. 5 of the initial submission but shown in Fig. 6 of the revised version (“Hyperpolarization promotes access to a large conductance, slowly activating open state.” Fig. 5 deals with single channels). We agree that these data look different, but this is because the voltage protocols are completely different (compare Fig. 6A (fixed test pulse, varied pre-pulse) and Fig. 2A (varied test pulse, fixed pre-pulse)). Therefore, no biphasic behavior is expected. 

      Because these results are the foundation for their two open state hypotheses, I will strongly suggest the authors to repeat all their Chloride-based experiments in Mes-based solutions to eliminate the undesired chloride contribution to the mutants current and clarify the contribution of the mutations to the Kv10.1 gating.

      In summary, we respectfully disagree with all concerns raised in point (1). Our detailed arguments rebutting them are given above, but there is a more high-level concern about this entire exchange: the referee casts doubt on observations that are not new. Several labs have reported for a group of mutant KCNH channels: non-monotonic voltage dependence of activation (see, e.g., Fig. 6D in Zhao et al., 2017), multi-phasic tail currents (see e.g. Fig. 4A in Whicher and MacKinnon, 2019, in CHO cells where Cl- contamination is not a concern), and activation by high [Ca2+]i (Lörinczi et al., 2016). Our study replicates those observations and hypothesizes that the existence of an additional conducting state can alone explain all previously unexplained observations. We highlight the potency of this hypothesis with a Markov model that qualitatively reproduces all phenomena. We not only factually disagree with the individual points raised, but we also think that they don't touch on the core of our contribution.

      (2) Two step gating mechanism.

      The authors interpret the results obtained with the ΔPASCap and the E600R as two step gating mechanisms containing two open states (O1 and O2) and assign them to the voltage sensor movement and gating ring rotation respectively. It is not clear, however how the authors assign the two open states.

      The results show how the first component is conserved amongst mutations; however, the second one is not. The authors attribute the second component, hence the second open state to the movement of the gating ring. This scenario seems unlikely since there is a clear voltage dependence of the second component that will suggest an implication of a voltage-sensing current.

      We do not suggest that the gating ring motion is not voltage dependent. We would like to point out that voltage dependence can be conveyed by voltage sensor coupling to the ring; this is the widely accepted theory of how the ring can be involved. Should the reviewer mean it in a narrow sense, that the model should be constructed such that all voltage-dependent steps occur before and independently of ring reconfiguration and that only then an additional step occurs that reflects solely the (voltage-independent) reconfiguration, we would like to point the reviewer to the article, where we write: “the κ/λ transition could reasonably be expected to be voltage independent because we related it to ring reconfiguration, a process that should occur as a consequence of a prior VSD transition. We have made some attempts to treat this transition as voltage independent but state-specific with upper-layer bias for states on the right and lower-layer bias for states on the left. This is in principle possible, as can already be gleaned from the similar voltage ranges of the left-right transition (α/β) and the κL/λ transition. However, this approach leads to a much larger number of free, less well constrained kinetic parameters and drastically complicated the parameter search. ” As you can see, we also formulated a strategy to free the model from the potentially spurious voltage dependence and (in bold here) explained why we did not follow this route in this study. 

      The split channel experiment is interesting but needs more explanation. I assume the authors expressed the 2 parts of the split channel (1-341 and 342-end), however Tomczak et al showed in 2017 how the split presents a constitutively activated function with inward currents that are not visible here, this point needs clarification.

      As stated in the panel heading, the figure legend, and the main text, we did not use 1-341 and 342-end as done in Tomczak et al. Instead, “we compared the behavior of ∆2-10 and ∆2-10.L341Split,”. Evidently, the additional deletion (2-10) causes a shift in activation that explains the difference you point out. However, as we do not compare L341Split and ∆2-10.L341Split but ∆2-10 and ∆2-10.L341Split, our conclusion remains that “As predicted, compared to ∆2-10, ∆2-10.L341Split showed a significant reduction in the first component of the biphasic GV (Fig. 2C, D).” Remarkably, the behavior of the ∆3-9 L341Split described in Whicher and MacKinnon, 2019 (Figure 5) matches that of our ∆2-10 L341Split, which we think reinforces our case.

      Moreover, the authors assume that the mutations introduced uncover a new open state, however the traces presented for the mutations suggest that other explanations are possible. Other gating mechanisms like inactivation from the closed state, can be introduced by the mutations. The traces presented for ΔPASCap but especially E600R present clear 'hooked tails', a direct indicator of a population of inactive channels during the test pulse that recover from inactivation upon repolarization (Tristani-Firouzi M, Sanguinetti MC. J Physiol. 1998). 

      There is a possibility that we are debating nomenclature here. In response to the suggestion that all our observations could be explained by inactivation, we attempted a disambiguation of terms in the reply and the article. As the argument is brought up again without reference to our clarification attempts, we will try to be more explicit here:

      If, starting from deeply deactivated states, an open state is reached first, and then, following further activation steps, closed states are reached, this might be termed “inactivation”. In such a reading, our model features many inactivated states. The shortest version of such a model is C-O-I. It is for instance used by Raman and Bean (2001; DOI: 10.1016/S0006-3495(01)76052-3) to explain NaV gating in Purkinje neurons. If “inactivation” is meant in the sense that a gating transition exists, which is orthogonal to an activation/deactivation axis, and that after this orthogonal transition, an open state cannot be reached anymore, then all of the upper floor in our model is inactivated with respect to the open state O1. Finally, the state C2 is an inactivated state to O2. In this view, “inactivation” explains the observed phenomena. 

      However, we must disagree if the referee means that a parsimonious explanation exists in which a single conducting state is the only source for all observed currents.   

      There is a high-level reason: we found a single assumption that explains three different phenomena, while the inactivation hypothesis with one conducting state cannot explain one of them (the increase of the first component under raised CaM). But there is also a low-level reason: the tails in Tristani-Firouzi and Sanguinetti 1998 are fundamentally different from what we report herein in that they lack a third component. Thus, those tails are consistent with recovery from inactivation through a single open state, while a three-component tail is not. In the framework of a Markov model, the time constants of transitions from and to a given state (say O2), cannot change unless the voltage changes. During the tail current, the voltage does not change, yet we observe: 

      i) a rapid decrease with a time constant of at most a few milliseconds (Fig 9 S2, 1-> 2),  ii) a slow increase in current, peaking after approximately 25 milliseconds and iii) a relaxation to zero current with a time constant of >50 ms. 

      According to the reviewer’s suggestion, these processes on three timescales should all be explained by depopulating and repopulating the same open state while all rates are constant. There might well be a complicated multi-level state diagram with a single open state in different variants (such as open and open-inactivated) that could produce triphasic tails with these properties if the system had not reached a steady-state distribution at the end of the test pulse. It cannot, however, achieve it from an equilibrated system, and certainly, it cannot at the same time produce “biphasic activation” and “activation by CaM”. 
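      The constraint that fixed rates place on a single open state can be illustrated with the minimal C-O-I scheme mentioned above. The sketch below is for illustration only and is not the model used in the article: the rate values are hypothetical, and larger schemes with one open state have more time constants and are not covered by it. At constant voltage this scheme has exactly two relaxation time constants, so the open probability is a sum of two exponentials and can change direction at most once, whereas a decrease, an increase, and a final decay require two turning points.

```python
import math


def coi_time_constants(a, b, c, d):
    """Two relaxation time constants (ms) of C <-> O <-> I at fixed voltage.

    a: C->O, b: O->C, c: O->I, d: I->O, all in 1/ms and constant in time.
    The nonzero relaxation rates solve r**2 - s*r + p = 0.
    """
    s = a + b + c + d
    p = a * c + a * d + b * d
    root = math.sqrt(s * s - 4.0 * p)
    return 2.0 / (s + root), 2.0 / (s - root)  # (fast, slow)


def count_extrema(a, b, c, d, start, t_end=300.0, dt=0.01):
    """Forward-Euler integration; counts sign changes of d/dt of P(open)."""
    C, O, I = start
    extrema, last_sign = 0, 0
    for _ in range(int(t_end / dt)):
        dC = -a * C + b * O
        dO = a * C - (b + c) * O + d * I
        dI = c * O - d * I
        # Ignore derivatives that are numerically indistinguishable from zero.
        sign = 0 if abs(dO) < 1e-12 else (1 if dO > 0 else -1)
        if sign and last_sign and sign != last_sign:
            extrema += 1
        if sign:
            last_sign = sign
        C, O, I = C + dt * dC, O + dt * dO, I + dt * dI
    return extrema


# Hypothetical rates during the tail; most channels start in the I state.
rates = (0.001, 0.5, 0.01, 0.02)
tau_fast, tau_slow = coi_time_constants(*rates)
turning_points = count_extrema(*rates, start=(0.0, 0.01, 0.99))
```

      With only two time constants available, a hooked tail (one turning point) is within reach of this scheme, but a three-component tail is not, whatever the occupancies at the end of the test pulse.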

      The results presented by the authors can be alternatively explained with a change in the equilibrium between the close to inactivated/recovery from inactivation to the open state. 

      Again, we disagree. The model construction explains in detail that the transition from the first to the second phase is not gradual. Shifting equilibria cannot reproduce this. We have extensively tested that idea and can exclude this possibility.

      Finally, the authors state that they do not detect "cumulative inactivation after repeated depolarization" but that is considering inactivation only from the open state and ignoring the possibility of the existence of closed-state inactivation or, that like in hERG, that the channel inactivates faster than it activates (Smith PL, Yellen G. J Gen Physiol. 2002). 

      We respectfully disagree. We explicitly model an open state that inactivates faster (O2->C2) than it activates. Once more, this is stated in the revised article, which we point to for details. Again, this alternative mechanism does not have the potential to explain all three effects. As discussed above about the chloride contamination concerns, this inactivation hypothesis was mentioned in the first review round and, therefore, addressed in our reply and the revised article. We also explained that “inactivation” has no specific meaning in Markov models. In the absence of O1, all transitions towards the lower layer are effectively “inactivation from closed states”, because they make access to the only remaining open state less likely. But this is semantics. What is relevant is that no network of states around a single open state can reproduce the three effects in a more parsimonious way than the assumption of the second open state does.

      (3) Single channel conductance.

      The single channels experiments are a great way to assess the different conductance of single channel openings, unfortunately the authors cannot measure accurately different conductances for the two proposed open states. The Markov Model built by the authors, disagrees with their interpretation of the experimental results assigning the exact same conductance to the two modeled open states. To interpret the mutant data, it is needed to add data with the WT for comparison and in presence of specific blockers. 

      We respectfully disagree. As previously shown, the conductance of the flickering wild-type open state is very difficult to resolve. Our recordings do not show that the two states have different single-channel conductances, and therefore the model assumes identical single-channel conductance. 

      The important point is that the single-channel recordings clearly show two different gating modes associated with the voltage ranges in which we predict the two open states. One has a smaller macroscopic current due to rapid flickering (aka “inactivation”). These recordings are another proof of the existence of two open states because the two gating modes occur.  Wild-type data can be found in Bauer and Schwarz, (2001, doi:10.1007/s00232-001-0031-3) or Pardo et al., (1998, doi:10.1083/jcb.143.3.767) for comparison.

      We appreciate the effort editors and reviewers invested in assessing the revised manuscript. Yet, we think that the demanded revision of experimental conditions and quantification methods contradicts the commonly accepted practice for KV10 channels. Some of the reviewer comments are skeptical about the biphasic behavior, which is an established and replicated finding for many mutants and by many researchers. The alternative explanations for these disbelieved findings are either “semantics” or cannot quantitatively explain the measurements. Therefore, only the demand for more explanations and unprecedented resolution in single-channel recordings remains. We share these sentiments.

      ———— The following is the authors’ response to the original reviews.

      (1) The authors must show that the second open state is not just an artifact of endogenous activity but represents the activity of the same EAG channels. I suggest that the authors repeat these experiments in Mes-based solutions. 

      (2) Along the same lines, it is necessary to show that these currents can be blocked using known EAG channel blockers such as astemizole. Ultimately, it will be important to demonstrate using single-channel analysis that these do represent two distinct open states separated by a closed state. 

      We have addressed these concerns using several approaches. The most substantial change is the addition of single-channel recordings on ΔPASCap. In those experiments, we could provide evidence of the two types of events in the same patch, and the presence of an outward current at -60 mV, 50 mV below the equilibrium potential for chloride. The channels were never detected in uninjected oocytes, and astemizole silenced the activity in patches containing multiple channels. These observations, together with the maintenance of the biphasic behavior that we interpret as evidence of the presence of O1 in methanesulfonate-based solutions, strongly suggest that both O1 and O2 depend on the expression of KV10.1 mutants.

      (3) Currents should be measured by increasing the pulse lengths as needed in order to obtain the true steady-state G-V curves. 

      We agree that the endpoint of activation is ill-defined in the cases where a steady-state is not reached. This does indeed hamper quantitative statements about the relative amplitude of the two components. However, while the overall shape does change, its position (voltage dependence) would not be affected by this shortcoming. The data, therefore, supports the claim of the “existence of mutant-specific O1 and its equal voltage dependence across mutants.”

      (4) A more clear and thorough description should be provided for how the observations with the mutant channels apply to the behavior of WT channels. How exactly does state O1 relate to WT behavior, and how exactly do the parameters of the mathematical model differ between WT and mutants? How can this be interpreted at a structural level? What could be the structural mechanism through which ΔPASCap and E600R enable conduction through O1? It seems contradictory that O1 would be associated exclusively with voltage-sensor activation and not gating ring transitions, and yet the mutations that enable cation access through O1 localize at the gating ring - this needs to be better clarified. 

      We have undertaken a thorough rewriting of all sections to clarify the structural correlates that may explain the behavior of the mutants. In brief, we propose that when all four voltage sensors move towards the extracellular side, the intracellular ring maintains the permeation path closed until it rotates. If the ring is altered, this “lock” is incompetent, and permeation can be detected (page 34). By fixing the position of the ring, calmodulin would preclude permeation in the WT and promote the population of O1 in the mutants.

      (5) Rather than the t80% risetime, exponential fits should be performed to assess the kinetics of activation. 

      We agree that the assessment of kinetics by a t80% is not ideal. We originally refrained from exponential fits because they introduce other issues when used for processes that are not truly exponential (as is the case here). We had planned to perform exponential fits in this revised version, but because the activation process is not exponential, the time constants we could provide would not be accurate, and the result would remain qualitative as it is now. In the experiments where we did perform the fits (Fig. 3), the values obtained support the statement made. 
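      For clarity, a rise-time measure of this kind can be computed without assuming any particular time course. The following Python sketch is illustrative only; it assumes that the reference amplitude is the current at the end of the test pulse and that the trace rises through the threshold, and the function name and the synthetic traces are hypothetical.

```python
import math


def rise_time(times_ms, current, fraction=0.8):
    """Time at which the trace first reaches `fraction` of its final value."""
    target = fraction * current[-1]
    for i in range(1, len(current)):
        lo, hi = current[i - 1], current[i]
        if lo < target <= hi:
            # Linear interpolation between the two bracketing samples.
            step = times_ms[i] - times_ms[i - 1]
            return times_ms[i - 1] + (target - lo) / (hi - lo) * step
    return None  # threshold never crossed


# Two synthetic activation time courses sampled every 0.1 ms for 200 ms.
times = [0.1 * i for i in range(2001)]
exponential = [1.0 - math.exp(-t / 10.0) for t in times]
sigmoidal = [(1.0 - math.exp(-t / 10.0)) ** 4 for t in times]

t80_exponential = rise_time(times, exponential)
t80_sigmoidal = rise_time(times, sigmoidal)
```

      For a single exponential, t80% equals the time constant multiplied by ln 5, so the two measures carry the same information. For a sigmoidal time course no single time constant exists, and t80% remains a model-free, if qualitative, descriptor.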

      (6) It is argued based on the G-V relations in Figure 2A that none of the mutations or deletions introduced have a major effect on state O1 properties, but rather affect state O2. However, the occupancy of state O2 is undetermined because activation curves do not reach saturation. It would be interesting to explore the fitting parameters on Fig.2B further to test whether the data on Fig 2A can indeed only be described by fits in which the parameters for O1 remain unchanged between constructs. 

      We agree that the absolute occupancy of O2 cannot be properly determined if a steady state is not reached. This is, however, a feature of the channel. During very long depolarizations in WT, the current visually appears to reach a plateau, but a closer look reveals that the current keeps increasing after very long depolarizations (up to 10 seconds; see, e.g., Fig. 1B in Garg et al., 2013, Mol Pharmacol 83, 805-813. DOI: 10.1124/mol.112.084384). Interestingly, although the model presented here does not account for this behavior, we propose changes in the model that could. “If the relative stability of O2 and C2 continued to change throughout the depolarization such a current creep-up could be reproduced. However, this would require either the introduction of further layers of On↔Cn states or a non-Markovian modification of the model’s evolution.” Page 34.

      (7) The authors interpret the results obtained with the mutants DPASCAP and E600R -tested before by Lorinczi et al. 2016, to disrupt the interactions between the PASCap and cNBHD domains- as a two-step gating mechanism with two open states. All the results obtained with the E600R mutant and DPASCap could also be explained by inactivation/recovery from inactivation behavior and a change in the equilibrium between the closed states closed/inactivated states and open states. Moreover, the small tails between +90 to +120 mV suggest channels accumulate in an inactive state (Fig 1E). It is not convincing that the two open-state model is the mechanism underlying the mutant's behavior.  

      We respectfully disagree with the notion that a single open state can provide a plausible explanation for "All the results obtained with the E600R mutant and ΔPASCap". We think that our new single-channel results settle the question, but even without this direct evidence, a quantitative assessment of the triphasic tail currents all but excludes the possibility of a single open state. We agree that it is, in principle, possible to obtain some form of multiphasic tail with a single open state using the scheme suggested in this comment: at the end of the test pulse, a large fraction of the channels must have accumulated in inactive states, and a few are in the open state. The hyperpolarization to -100 mV then induces a rapid depopulation of the open state, followed by slower replenishment from the inactive state. Exactly this process occurs in our model when C2 empties through O2 (Supp. 5 to Fig 9, E600R model variant). However, this alone is highly unlikely to explain the measured tail currents quantitatively, because of the drastically different time scales of the initial current decay (submillisecond to at most a few milliseconds lifetime), the much slower transient increase in current (several tens of milliseconds), and the final decay with time constants of >100 ms (see for instance the data in Fig. 1 E for E600R, +50 to +120 mV test pulses). Sustaining the substantial magnitude of slowly decaying current by slow replenishment of an open state with a lifetime of 1 ms requires vast numbers of inactivated channels. A rough estimate based on the current integral of the initial decay and the current integral of the slowly decaying current suggests that, at the end of the test pulse, the ratio of inactivated to open channels would have to be 500 to 1500 for this mechanism to explain the observed tail currents quantitatively. To put this in perspective: this would suggest that, without inactivation, all the expressed channels in an oocyte would provide 6 mA of current during the +100 mV test pulse. While theoretically possible, we consider this a less likely explanation than a second open state.
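
      The logic of this rough estimate can be sketched as follows. This is a minimal illustration, not the calculation actually performed for the response: it assumes that every channel leaving the inactivated pool visits the open state once, with the same mean dwell time as the channels that are open at the end of the pulse, so that the ratio of the two current integrals equals the ratio of inactivated to open channels. The amplitudes and time constants are hypothetical placeholders chosen only to fall within the ranges quoted above (an open-state lifetime of about 1 ms and a final decay slower than 100 ms), and the function names are illustrative.

```python
import math

def exp_charge(amplitude, tau_ms, duration_ms):
    """Integral of amplitude * exp(-t / tau_ms) from t = 0 to duration_ms."""
    return amplitude * tau_ms * (1.0 - math.exp(-duration_ms / tau_ms))

def inactivated_to_open_ratio(a_fast, tau_fast_ms, a_slow, tau_slow_ms, duration_ms):
    """Charge in the slow tail component divided by charge in the initial decay.

    Assumption: one open-state visit per channel, so charge scales with channel count.
    """
    q_fast = exp_charge(a_fast, tau_fast_ms, duration_ms)  # channels open at pulse end
    q_slow = exp_charge(a_slow, tau_slow_ms, duration_ms)  # channels replenished from inactivation
    return q_slow / q_fast

# Hypothetical tail: fast decay (tau = 1 ms) and a slow component with five times
# the amplitude (tau = 150 ms), integrated over a 2 s tail.
ratio = inactivated_to_open_ratio(1.0, 1.0, 5.0, 150.0, 2000.0)
print(f"inactivated/open ratio ~ {ratio:.0f}")
```

      With these placeholder values the ratio lands inside the 500 to 1500 range; because the ratio scales linearly with the slow time constant and the relative amplitude, any tail with a millisecond fast phase and a slow phase of comparable amplitude lasting hundreds of milliseconds gives a ratio of this order.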

      (8) Different models should be evaluated to establish whether the results in Figure 4 can also be explained by a model in which states O1 and O2 have the same conductance. It would be desirable if the conductance of both states were experimentally determined - noise analysis could be applied to estimate the conductance of both states. 

      In the modified model, O1 and O2 have the same single-channel conductance. The small conductance combined with the fast flickering did not allow an accurate determination, but we can state that there is no evidence that the single-channel conductance of the states is different.

      (9) Although not included, it looks like the model predicts some "conventional inactivation". This can be appreciated in Fig 8 and in the traces at -60 mV. Interestingly, the traces obtained in the absence of Cl- also undergo slow inactivation, or 'conventional inactivation' as referred to by the authors. Please revise the following statement: "Conventional inactivation was never detected in any mutants after repeated or prolonged depolarization. In the absence of inactivation, the pre-pulse dependent current increase at +40 mV could be related to changes in the relative occupancy of the open states".

      We have carefully edited the manuscript to address this concern. The use of the term inactivation admittedly represents a challenge. We agree that the state that results from the flickering block (C2) could be defined as “inactivated” because it is preceded by an open state. Yet, in that case, the intermediate states that the channel traverses between O1 and O2 would also be sensu stricto “inactivated”, but only in the mutants. We have made this clear on page 17.

      Recommendations for improving the writing and presentation.

      (1) Methods section: Please state the reversal potential calculated for the solution used. It looks like the authors used an Instantaneous I-V curve method to calculate the reversal potential; if that's correct, please show the I-V and the traces together with the protocol used. 

      We have provided the calculated reversal potentials for excised patches. We cannot predict the reversal potential in whole oocytes because we have no control over the intracellular solution. The reversal potential was determined in the mutants through the current at the end of the stimulus because the mutants produced measurable inward currents. The differences in reversal potential were not significant among mutants.

      Pulse protocols have been added to the figures.

      (2) Figure 1 suggestion: Combine the two panels in panel D and move the F panel up so the figure gets aligned in the lower end.

      Thank you, this has been done.

      (3) Please clarify the rationale for using the E600R-specific mutant. I assume it is based on the effect reported by Lorinczi et al. 2016 and its similarity to the ΔPASCap phenotype, or is it due to the impact of this mutation on the interactions between the N-terminus and the cNBHD?

      We have explained the rationale for the use of E600R explicitly on page 6.

      (4) Fig S1A is not present in the current version of the manuscript. Include a cartoon as well as a structural figure clearly depicting the perturbations introduced by E600R, ΔPASCap, and the other deletions that are tested. Additional structural information supporting the discussion would also be helpful to establish clearer mechanistic links between the experimental observations described here and the observed conformational changes between states in Kv10 channel structures. 

      We have corrected this omission, thank you for pointing it out.

      (5) It would be informative to see the traces corresponding to the I-V shown in Fig 7 A and B at the same indicated time points (0, 60, 150, and 300s). Did the authors monitor the Ca2+ signal rise after the I&T treatment to see if it coincides with the peak in the 60s? 

      In Figure 7 (now Figure 8) we used voltage ramps instead of discrete I-V protocols because of the long time required for recording the latter. This is stated on page 19. Ca2+ was monitored through Cl- current after ionomycin/thapsigargin. The duration of the Ca2+ increase was reproducible among oocytes and in good agreement with the changes observed in the biphasic behavior of the mutants (Supplement 1 to Figure 8).

      (6) Fig 4. Please state in the legend what the different color traces correspond to in E600R and ΔPASCap. Is there a reason to change the interpulse on ΔPASCap to -20 mV and not allow this mutant to close? Please state. How did the authors decide on the 10 ms interval for the experiments in Fig 2?

      Thank you for pointing this out, we have added the description. We have explained why we use a different protocol for ΔPASCap and the reason for using 10 ms interval (we believe the referee means Figure 4) on page 12.  

      (7) Fig. 5. The pre-pulse is supposed to be 5 s, but the time scale doesn't correspond to a 5 s pre-pulse before the test pulse to +40 mV. Has the pre-pulse been trimmed for representation purposes? If so, please state.

      The pre-pulse was 5s, but as the reviewer correctly supposed, the trace is trimmed to keep the +40 mV stimulus visible. This has now been clearly stated in the legend.

      (8) The mutant L322H is located within the S4 helix according to the Kv10.1 structure (PDB 5K7L), not in the 'S3-S4 linker'; please correct. 

      This has been done, thank you.

      The introduction of this mutant should also shift the voltage dependence toward more hyperpolarizing potentials (by around 30 mV, according to Schoenherr et al. 1999). It looks like that shift is present within the first component of the G-V. Still, since the maximum amplitude of the second component could be contaminated by endogenous Cl- currents, this effect is minimized. Repeating these experiments in the no Cl- solutions will help clarify this point and show the effect of ΔPASCap and E600R in the background of a mutation that accelerates the transitions between the closed states (see Major comment 1). Did the authors record L322H alone for control purposes?

      We have decided not to measure L322H alone or repeat the measurements in chloride-free solutions because we do not see a way to use the quantitative assessment of the voltage dependence of L322H and the L322H variants of the eag domain mutants. As in our answer to main point 3, we base our arguments not on the precise voltage dependence of the second component but on the shape of the G-V curves instead, specifically the consistent appearance of the first component and the local conductance minimum between the first and second components. After the introduction of L322H the first component is essentially absent.

      We think that the measurements of the L322H mutants cannot be interpreted as a hyperpolarizing shift in the first component. The peak of the first conductance component occurs around -20 mV in ΔPASCap and E600R (Fig. 7 C, D). After a -30 mV shift, in L322H+ΔPASCap and L322H+E600R, this first peak would still be detected within the voltage range in our experiments, but it is not. A contamination of the second component would have little impact on this observation, which is why we refrain from the suggested measurements.

      (9) The authors differentiate between an O1 vs. O2 state with different conductances, and maybe I missed it, but there's no quantitative distinction between the components; how are they different?

      Please see the response to the main comments 1 and 2. This has been addressed in single-channel recordings.

      (10) Please state the voltage protocols, holding voltages, and the solutions (K+ concentration and Cl- presence/absence) used for the experiments in the figure legends, so that the experiments presented are easier to interpret.

      Thank you, this has been done.

      (11) The authors state on page 7 that "with further depolarizations, the conductance initially declined to rise again in response to strong depolarizations. This finding matches the changes in amplitude of the tail currents, which, therefore, probably reflect a true change in conductance". However, the tails in the strong voltage range (+50 to +120 mV) for the E600R mutant argue against this result. Please review.

      The increase in the amplitude of the tail current is also present in E600R, but the relative increase is smaller. We have decided against rescaling these traces because the Figure is already rather complex. We indicated this fact with a smaller arrow and clarified it in the text (page 8).

      (12) The authors mention that the threshold of activation for the WT is around -20mV; however, the foot of the G-V is more around -30 or -40mV. Please revise. 

      Thank you. We have done this. 

      (13) The authors state on page 9 that the "second component occurs at progressively more depolarized potentials for increasingly larger N-terminal deletions". However, the E600R mutant, which keeps the N-terminus intact, shows a shift as pronounced as that of ΔPASCap and larger than that of Δ2-10. How do the authors interpret this result?

      We have corrected this statement on page 10: “…the second component occurs at progressively more depolarized potentials for increasingly larger N-terminal deletions and when the structure of the ring is altered through disruption of the interaction between N- and C-termini (E600R)”.

      (14) The equation defined to fit the G-Vs, can also be used to describe the WT currents. If the O1 is conserved and present in the WT, this equation should also fit the WT data properly. The 1-W component shown could also be interpreted as an inactivating component that, in the WT, shifts the voltage-dependence of activation towards depolarizing potentials and is not visible. Still, the mutants do show it as if the transition from closed-inactivated states is controlled by interactions in the gating ring, and disturbing them does affect the transitions to the open state. 

      Out of the two open states in the mutant, O2 is the one that shares properties with the WT (e.g. it is inaccessible during Ca2+-CaM binding) while O1 is the open state with the voltage dependence that is conserved across the mutants. We, therefore, believe that this question is based on a mix-up of the two open states. We appreciate the core of the question: does the pattern in the mutants’ G-V curves find a continuation in the WT channel? 

      Firstly, the component that is conserved among mutants does not lead to current in the WT because the corresponding open state (O1) is not observed in WT. However, the gating event represented by this component should also occur in WT and, given its apparent insensitivity to eag domain mutations, this gating step should occur in WT with the same voltage dependence as in all the mutants. This means that this first component sets a hard boundary for the most hyperpolarized G-V curve we can expect in the WT, based on our mutant measurements. Secondly, the second component shows a regular progression across mutants: the more intact the eag domain is, the more hyperpolarized the Vhalf values of the transition term (1-W) and of O2 activation. In Δ2-10, the transition term already almost coincides with O1 activation (estimated Vhalf values of -33.57 and -33.47 mV). A further shift of (1-W) in the WT is implausible because, if O1 activation is coupled to the earliest VSD displacement, the transition should not occur before O1 activation. Still, the second component might shift to more hyperpolarized values in the WT, depending on the impact of amino acids 2 to 10 on the second VSD transition.

      In summary, in WT the G-V should not be more hyperpolarized than the first component of the mutants, and the (1-W)-component probably corresponds to the Δ2-10 (1-W)-component. In WT the second component should be no more depolarized than the second component of Δ2-10. The WT G-V (Fig. 1B) meets all these predictions derived from the pattern in the mutant G-Vs: when we use Eq. 4 to fit the WT G-V with A1 = 0 (O1 is not present in WT) and the parameters of the transition term (1-W) fixed to the values attained in Δ2-10, we obtain a fit for the O2 component with Vhalf = +21 mV. This value falls neatly into the succession of Vhalf values for Δeag, ΔPASCap, and Δ2-10 (+103 mV, +80 mV, +52 mV) and, at the same time, it is not more hyperpolarized than the conserved first component (Vhalf -34 mV). Our measurements therefore support that the O2 component in the mutants corresponds to the single open state in the WT.
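
      To make the shape argument concrete, the sketch below evaluates a two-component G-V of the general kind discussed here. Eq. 4 itself is not reproduced in this response, so the functional form used (a Boltzmann term for O1 weighted by 1-W plus a Boltzmann term for O2 weighted by W, with W itself a Boltzmann function) is an assumption made for illustration only. The Vhalf values of -34 mV, +80 mV and +21 mV are taken from the text; every slope factor, the midpoint of W and the amplitudes are hypothetical.

```python
import math

def boltzmann(v, v_half, k):
    """Sigmoid rising from 0 to 1 with midpoint v_half (mV) and slope factor k (mV)."""
    return 1.0 / (1.0 + math.exp(-(v - v_half) / k))

def gv(v, a1, vh1, k1, vhw, kw, a2, vh2, k2):
    """Assumed two-component G-V: O1 weighted by (1 - W), O2 weighted by W."""
    w = boltzmann(v, vhw, kw)
    return a1 * boltzmann(v, vh1, k1) * (1.0 - w) + a2 * boltzmann(v, vh2, k2) * w

voltages = list(range(-100, 125, 5))  # mV

# Mutant-like curve: Vhalf of -34 mV (first component) and +80 mV (second
# component of ΔPASCap) come from the text; all other values are hypothetical.
mutant = [gv(v, 1.0, -34.0, 8.0, -10.0, 8.0, 1.0, 80.0, 20.0) for v in voltages]

# WT-like curve: A1 = 0 (no conducting O1) and Vhalf = +21 mV for O2, as in the text.
wt = [gv(v, 0.0, -34.0, 8.0, -10.0, 8.0, 1.0, 21.0, 20.0) for v in voltages]

for v, gm, gw in zip(voltages, mutant, wt):
    print(f"{v:5d} mV  mutant {gm:.3f}  WT {gw:.3f}")
```

      Under these assumptions the mutant-like curve rises to a first peak near -20 mV, passes through a local minimum, and rises again at strongly depolarized potentials, whereas setting A1 = 0 leaves a single monotonic component, which is the qualitative pattern described above.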

      (15) Page 15, the authors state that 'The changes in amplitude and kinetics in response to rising intracellular Ca2+ support our hypothesis that Ca-CaM stabilized O1, possibly by driving the channels to deep closed states (Fig 5 and 6)' (pg 15). This statement seems contradictory; I can't quite follow the rationale since Ca2+ potentiates the current (Fig 7), and the addition of the L322H mutant in Fig 7 makes the shift of the first component to negative potentials visible.

      Please check the rationale for this section. 

      We have explained this more explicitly in the discussion (page 32). “Because access to O1 occurs from deep closed states, this could be explained by an increased occupancy of such deactivated states in response to CaM binding. This appears to be the case since CaM induces a biphasic behavior in the mutant channels that show reduced access to deep closed states; thus, L322H mutants behave like the parental variants in the presence of Ca2+-CaM. This implies a mechanistic explanation for the effect of Ca2+-CaM on WT since favoring entry into deep closed states would result in a decrease in current amplitude in the absence of (a permeable) O1”.

      Also, Figs 5 and 6 seem miscited here. 

      Thank you, we have corrected this.

      (16) For Figure 5, it would be helpful if each of the current traces corresponding to a particular voltage had a different color. That way, it will be easier to see how the initial holding voltage modulates current. 

      We have considered this suggestion, and we agree that it would make it easier to follow. Yet, since we have identified the mutants with different colors, it would be inconsistent if we used another color palette for this Figure. Supplement 3 to Figure 9 shows the differences in a clearer way.

      (17) Add zero-current levels to all current traces.

      We have done this.

      (18) The mathematical model should be described better. Particularly, the states from which O1 can be accessed should be described more clearly, as well as whether the model considers any direct connectivity between states O1 and O2. The origin of the voltage-dependence for transitions that do not involve voltage-sensor movements should be discussed. Also, the separation of kappa into kappa-l and kappa-r should be described.

      We have extensively rewritten the description of the mathematical model to address these concerns.

      (19) Page 4, "reveals a pre-open state in which the transmembrane regions of the channel are compatible with ion permeation, but is still a nonconducting state". Also, page 27, "renders a hydrophobic constriction wider than 8 Å, enough to allow K+ flow, but still corresponds to a non-conducting state". These sentences are confusing - how can the regions be compatible with ion permeation, and still not be conducting? Is cation conductance precluded by a change in the filter, or elsewhere? How is it established that it represents a non-conducting state? 

      We have rephrased to clarify this apparent inconsistency. Page 4: “(…) in which the transmembrane regions of the channel are compatible with ion permeation (the permeation path is dilated, like in open states) but the intracellular gate is still in the same conformation as in closed states (Zhang et al., 2023).” Page 31: “The presence of an intact intracellular ring would preclude ionic flow in the WT, and its alteration would explain the permeability of this state in the mutants.”

    2. eLife assessment

      This valuable study examines the role of the interaction between cytoplasmic N- and C-terminal domains in voltage-dependent gating of Kv10.1 channels. The authors claim to have identified a hidden open state in Kv10.1 mutant channels, thus providing a window for observing early conformational transitions associated with channel gating. The evidence supporting the major conclusions is incomplete, however, and additional work is required to determine the molecular mechanism underlying the observations in this study. With the experimental conditions clarified and the mechanistic interpretations addressed, this work could be significant in understanding the gating mechanisms of the KCNH family and will appeal to biophysicists interested in ion channels and physiologists interested in cancer biology.

    3. Reviewer #1 (Public Review):

      Gating of Kv10 channels is unique because it involves coupling between non-domain swapped voltage sensing domains, a domain-swapped cytoplasmic ring assembly formed by the N- and C-termini, and the pore domain. Recent structural data suggests that activation of the voltage sensing domain relieves a steric hindrance to pore opening, but the contribution of the cytoplasmic domain to gating is still not well understood. This aspect is of particular importance because proteins like calmodulin interact with the cytoplasmic domain to regulate channel activity. The effects of calmodulin (CaM) in WT and mutant channels with disrupted cytoplasmic gating ring assemblies are contradictory, resulting in inhibition or activation, respectively. The underlying mechanism for these discrepancies is not understood. In the present manuscript, Reham Abdelaziz and collaborators use electrophysiology, biochemistry and mathematical modeling to describe how mutations and deletions that disrupt inter-subunit interactions at the cytoplasmic gating ring assembly affect Kv10.1 channel gating and modulation by CaM. In the revised manuscript, additional information is provided to allow readers to identify within the Kv10.1 channel structure the location of E600R, one of the key channel mutants analyzed in this study. However, the mechanistic role of the cytoplasmic domains that this study focuses on, as well as the location of the ΔPASCap deletion and other perturbations investigated in the study remain difficult to visualize without additional graphical information. This can make it challenging for readers to connect the findings presented in the study with a structural mechanism of channel function.

      The authors focused mainly on two structural perturbations that disrupt interactions within the cytoplasmic domain, the E600R mutant and the ΔPASCap deletion. By expressing mutants in oocytes and recording currents using Two Electrode Voltage-Clamp (TEVC), it is found that both ΔPASCap and E600R mutants have biphasic conductance-voltage (G-V) relations and exhibit activation and deactivation kinetics with multiple voltage-dependent components. Importantly, the mutant-specific component in the G-V relations is observed at negative voltages where WT channels remain closed. The authors argue that the biphasic behavior in the G-V relations is unlikely to result from two different populations of channels in the oocytes, because they found that the relative amplitude between the two components in the G-V relations was highly reproducible across individual oocytes that otherwise tend to show high variability in expression levels. Instead, the G-V relations for all mutant channels could be well described by an equation that considers two open states O1 and O2, and a transition between them; O1 appeared to be unaffected by any of the structural manipulations tested (i.e. E600R, ΔPASCap, and other deletions) whereas the parameters for O2 and the transition between the two open states were different between constructs. The O1 state is not observed in WT channels and is hypothesized to be associated with voltage sensor activation. O2 represents the open state that is normally observed in WT channels and is speculated to be associated with conformational changes within the cytoplasmic gating ring that follow voltage sensor activation, which could explain why the mutations and deletions disrupting cytoplasmic interactions affect primarily O2.

      Severing the covalent link between the voltage sensor and pore reduced O1 occupancy in one of the deletion constructs. Although this observation is consistent with the hypothesis that voltage-sensor activation drives entry into O1, this result is not conclusive. Structural as well as functional data has established that the coupling of the voltage sensor and pore does not entirely rely on the S4-S5 covalent linker between the sensor and the pore, and thus the severed construct could still retain coupling through other mechanisms, which is consistent with the prominent voltage dependence that is observed. If both states O1 and O2 require voltage sensor activation, it is unclear why the severed construct would affect state O1 primarily, as suggested in the manuscript, as opposed to decreasing occupancy of both open states. In line with this argument, the presence of Mg2+ in the extracellular solution affected both O1 and O2. This finding suggests that entry into both O1 and O2 requires voltage-sensor activation because Mg2+ ions are known to stabilize the voltage sensor in its most deactivated conformations.

      Activation towards and closure from O1 is slow, whereas channels close rapidly from O2. A rapid alternating pulse protocol was used to take advantage of the difference in activation and deactivation kinetics between the two open components in the mutants and thus drive an increasing number of channels towards state O1. Currents activated by the alternating protocol reached larger amplitudes than those elicited by a long depolarization to the same voltage. This finding is interpreted as an indication that O1 has a larger macroscopic conductance than O2. In the revised manuscript, the authors performed single-channel recordings to determine why O1 and O2 have different macroscopic conductance. The results show that at voltages where the state O1 predominates, channels exhibited longer open times and overall higher open probability, whereas at more depolarized voltages where occupancy of O2 increases, channels exhibited more flickery gating behavior and decreased open probability. These results are informative but not conclusive because additional details about how experiments were conducted, and group data analysis are missing. Importantly, results showing inhibition of single ΔPASCap channels by a Kv10-specific inhibitor are mentioned but not shown or quantitated - these data are essential to establish that the new O1 conductance indeed represents Kv10 channel activity.

      It is shown that conditioning pulses to very negative voltages result in mutant channel currents that are larger and activate more slowly than those elicited at the same voltage but starting from less negative conditioning pulses. In voltage-activated curves, O1 occupancy is shown to be favored by increasingly negative conditioning voltages. This is interpreted as indicating that O1 is primarily accessed from deeply closed states in which voltage sensors are in their most deactivated position. Consistently, a mutation that destabilizes these deactivated states is shown to largely suppress the first component in voltage-activation curves for both ΔPASCap and E600R channels.

      The authors then address the role of the hidden O1 state in channel regulation by calmodulin. Stimulating calcium entry into oocytes with ionomycin and thapsigargin, assumed to enhance CaM-dependent modulation, resulted in preferential potentiation of the first component in ΔPASCap and E600R channels. This potentiation was attenuated by including an additional mutation that disfavors deeply closed states. Together, these results are interpreted as an indication that calcium-CaM preferentially stabilizes deeply closed states from which O1 can be readily accessed in mutant channels, thus favoring current activation. In WT channels lacking a conducting O1 state, CaM stabilizes deeply closed states and is therefore inhibitory. It is found that the potentiation of ΔPASCap and E600R by CaM is more strongly attenuated by mutations in the channel that are assumed to disrupt interaction with the C-terminal lobe of CaM than by mutations assumed to affect interaction with the N-terminal lobe. These results are intriguing but difficult to interpret in mechanistic terms. The strong effect that calcium-CaM had on the occupancy of the O1 state in the mutants raises the possibility that O1 can be only observed in channels that are constitutively associated with CaM. To address this, a biochemical pull-down assay was carried out to establish that only a small fraction of channels are associated with CaM under baseline conditions. These CaM experiments are potentially very interesting and could have wide physiological relevance. However, the approach utilized to activate CaM is indirect and could result in additional non-specific effects on the oocytes that could affect the results.

      Finally, a mathematical model is proposed consisting of two layers involving two activation steps for the voltage sensor, and one conformational change in the cytoplasmic gating ring - completion of both sets of conformational changes is required to access state O2, but accessing state O1 only requires completion of the first voltage-sensor activation step in the four subunits. The model qualitatively reproduces most major findings on the mutants. Although the model used is highly symmetric and appears simple, the mathematical form used for the rate constants in the model adds a layer of complexity to the model that makes mechanistic interpretations difficult. In addition, many transitions that from a mechanistic standpoint should not depend on voltage were assigned a voltage dependence in the model. These limitations diminish the overall usefulness of the model which is prominently presented in the manuscript. The most important mechanistic assumptions in the model are not addressed experimentally, such as the proposition that entry into O1 depends on the opening of the transmembrane pore gate, whereas entry into O2 involves gating ring transitions - it is unclear why O2 would require further gating ring transitions to conduct ions given that the gating ring can already support permeation by O1 without any additional conformational changes.

    4. Reviewer #3 (Public Review):

      In the present manuscript, Abdelaziz and colleagues interrogate the gating mechanisms of Kv10.1, an important voltage-gated K+ channel in cell cycle and cancer physiology. At the molecular level, Kv10.1 is regulated by voltage and Ca-CaM. Structures solved using Cryo-EM for Kv10.1 as well as other members of the KCNH family (Kv11 and Kv12) show channels that do not contain a structured S4-S5 linker, therefore imposing a non-domain-swapped architecture in the transmembrane region. However, the cytoplasmic N- and C-terminal domains interact in a domain-swapped manner, forming a gating ring. The N-terminal domain (PAS domain) of one subunit is located close to the intracellular side of the voltage sensor domain and interacts with the C-terminal domain (CNBHD domain) of the neighboring subunit. Mutations in the intracellular domains have a profound effect on channel gating. The complex network of interactions between the voltage sensor and the intracellular domains makes the PAS domain a particularly interesting domain of the channel to study as responsible for the coupling between the voltage sensor domains and the intracellular gating ring.

      The coupling between the voltage-sensor domain and the gating ring is not fully understood and the authors aim to shed light on the details of this mechanism. In order to do that, they use well established techniques such as site-directed mutagenesis, electrophysiology, biochemistry and mathematical modeling. In the present work, the authors propose a two open state model that arises from functional experiments after introducing a deletion in the PAS domain (ΔPASCap) or a point mutation (E600R) in the CNBHD domain. The authors measure a biphasic G-V curve with these mutations and assign the two phases to two different open states, one of them not visible in the WT and only unveiled after introducing the mutations. The hypothesis proposed by the authors could change the current paradigm for Kv10.1 and it is quite extraordinary; therefore, it requires extraordinary evidence to support it.

      STRENGTHS: The authors use adequate techniques such as electrophysiology and site-directed mutagenesis to address the gating changes introduced by the molecular manipulations. They also use appropriate mathematical modeling to build a Markov model and identify the mechanism behind the gating changes.

      WEAKNESSES: The results presented by the authors do not fully support their conclusions since they could have alternative explanations. The authors base their primary hypothesis on the biphasic behavior of a calculated G-V curve that does not match the tail behavior. The experimental conditions used in the present manuscript introduce uncertainties, weakening their conclusions and complicating the interpretation of the results. Therefore, their experimental conditions need to be revisited.

      I have some concerns related to the following points:

      (1) Biphasic gating behavior

      The authors use the TEVC technique in oocytes extracted surgically from Xenopus laevis frogs. The method is well established and is adequate to address ion channel behavior. The experiments are performed in chloride-based solutions, which present a handicap when measuring outward rectifying currents at very depolarizing potentials due to the presence of calcium-activated chloride channels expressed endogenously in the oocytes; these channels will open and rectify chloride intracellularly, adding to the outward rectifying traces during the test pulse.

      The authors calculate their G-V curves from the test pulse steady-state current instead of using the tail currents. The conductance measurements are normally taken from the 'tail current' because tails are measured at a fixed voltage, hence maintaining the driving force constant. Calculating the conductance from the traces should not be a problem; however, in the present manuscript, the traces and the tail currents do not agree. The tail traces shown in Fig 1E do not show an increasing current amplitude in the voltage range from +50 mV to +120 mV; they seem to have reached a 'saturation state', suggesting that the traces from the test pulse contain an inward chloride current contamination. In addition, this second component identified by the authors as a second open state appears after +50 mV and seems to never saturate. The normalization to the maximum current level during the test pulse exaggerates this second component in the calculated G-V curve. It's worth noting that the ΔPASCap mutant experiments in Fig 5 in Mes-based solutions do not show that second component in the G-V.

      Because these results are the foundation for their two open state hypothesis, I strongly suggest that the authors repeat all their chloride-based experiments in Mes-based solutions to eliminate the undesired chloride contribution to the mutants' currents and clarify the contribution of the mutations to Kv10.1 gating.
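
      The difference between the two ways of estimating conductance can be illustrated with a short sketch. It is a toy calculation, not an analysis of the data: the reversal potential, voltages, conductance and contaminating current below are hypothetical, and the contaminating current is assumed, for simplicity, to contribute only during the test pulse and not to the tail measurement.

```python
def chord_conductance(current, voltage, e_rev):
    """Chord conductance G = I / (V - E_rev)."""
    return current / (voltage - e_rev)

E_REV = -90.0    # mV, hypothetical reversal potential
V_TEST = 100.0   # mV, test pulse
V_TAIL = -100.0  # mV, fixed tail voltage
G_TRUE = 1.0     # uS, hypothetical channel conductance at the end of the test pulse

i_test = G_TRUE * (V_TEST - E_REV)  # nA, channel current during the test pulse
i_tail = G_TRUE * (V_TAIL - E_REV)  # nA, instantaneous tail current
i_extra = 38.0                      # nA, hypothetical contaminating outward current

g_from_test = chord_conductance(i_test + i_extra, V_TEST, E_REV)
g_from_tail = chord_conductance(i_tail, V_TAIL, E_REV)

print(f"G from test pulse: {g_from_test:.2f} uS")
print(f"G from tail:       {g_from_tail:.2f} uS")
```

      In this toy case the test-pulse estimate is inflated by the extra current while the tail estimate recovers the assumed conductance, which is the kind of discrepancy the comment above is concerned with.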

      (2) Two-step gating mechanism.<br /> The authors interpret the results obtained with ΔPASCap and E600R as a two-step gating mechanism containing two open states (O1 and O2), which they assign to the voltage sensor movement and the gating ring rotation, respectively. It is not clear, however, how the authors assign the two open states.<br /> The results show that the first component is conserved among the mutations, whereas the second one is not. The authors attribute the second component, and hence the second open state, to the movement of the gating ring. This scenario seems unlikely, since the second component shows a clear voltage dependence, which suggests the involvement of a voltage-sensing current.

      The split channel experiment is interesting but needs more explanation. I assume the authors expressed the two parts of the split channel (1-341 and 342-end); however, Tomczak et al. showed in 2017 that the split channel is constitutively active, with inward currents that are not visible here. This point needs clarification.

      Moreover, the authors assume that the mutations introduced uncover a new open state; however, the traces presented for the mutations suggest that other explanations are possible. Other gating mechanisms, such as inactivation from the closed state, can be introduced by the mutations. The traces presented for ΔPASCap, but especially E600R, show clear 'hooked tails', a direct indicator of a population of inactivated channels during the test pulse that recover from inactivation upon repolarization (Tristani-Firouzi M, Sanguinetti MC. J Physiol. 1998). The results presented by the authors can alternatively be explained by a change in the equilibrium between the closed-to-inactivated transition and the recovery from inactivation to the open state. Finally, the authors state that they do not detect "cumulative inactivation after repeated depolarization", but that considers inactivation only from the open state and ignores the possibility of closed-state inactivation or that, as in hERG, the channel inactivates faster than it activates (Smith PL, Yellen G. J Gen Physiol. 2002).

      (3) Single channel conductance.<br /> The single-channel experiments are a great way to assess the conductances of single-channel openings; unfortunately, the authors cannot accurately measure different conductances for the two proposed open states. The Markov model built by the authors disagrees with their interpretation of the experimental results, because it assigns exactly the same conductance to the two modeled open states. To interpret the mutant data, WT data are needed for comparison, as are recordings in the presence of specific blockers.

    1. eLife assessment

      This manuscript probes the ways in which a protein tag might influence the structure, dynamics and stability of a covalently attached substrate protein. Such findings are important to several fields, particularly in understanding how these influences control the abundance of proteins within a cell. The evidence provided to support the authors' conclusions is, however, incomplete, and further control experiments are necessary to fully support the proposed model.

    2. Reviewer #1 (Public Review):

      This manuscript by Negi et al. investigates the effects of different ubiquitin and ubiquitin-like modifications on the stability of substrate proteins, seeking to provide mechanistic insights into known effects of these modifications on cellular protein abundance. The authors focus on comparative studies of two modifications, ubiquitin and FAT10 (a protein with two ubiquitin-like domains), on a panel of substrate proteins; prior work had established that FAT10-conjugated proteins had lower stability to proteasomal degradation than Ub-modified counterparts.

      Strengths of the work include its integration of data across diverse approaches, including molecular dynamics simulations, solution NMR spectroscopy, and in vitro and cellular stability assays. From these, the authors provide provocative mechanistic insight into the lower stability of FAT10 on its own, and in FAT10-mediated destabilization of substrate proteins in computational and experimental findings. Notably, such destabilization impacts both the tag and tagged proteins, raising some provocative questions about mechanism. The data here are generally compelling, albeit with minor concerns on presentation in parts. Conclusions from this work will be interesting to scientists in several fields, particularly those interested in cellular proteostasis and in vitro protein design / long-range communication.

      The most substantial weakness of this work from my perspective is the specificity of these destabilization effects. In particular, the technical challenges of producing bona fide Ub- or FAT10-conjugated substrates with native linkages limit the ability to conduct in vitro studies on exactly the same molecules as those being studied in cellular environments. Given some discussion in the manuscript about the importance of linkage location for the specificity of certain tag/substrate interactions, this raises an understandable but unfortunate caveat that needs to be considered more fully, both in general and in light of data from other fields (e.g. single molecule pulling) showing site-dependence of comparable effects. I note that these concerns do not impact the caliber of the conclusions themselves, but perhaps suggest an area for caution as to their potential impact at this time.

    3. Reviewer #2 (Public Review):

      "Plasticity of the proteasome-targeting signal Fat10 enhances substrate degradation" is a nice study where the authors have shown the differences between two protein degradation tags namely, FAT10 and ubiquitin. Even though these tags are closely related in terms of folds, they have differential efficiency in degrading the substrates covalently attached to them. The authors have utilised extensive MD simulations combined with biophysics and cell biology to show the structural dynamics these tags provide for proteasomal degradation.

    1. eLife assessment

      This study presents a valuable finding on the precision conferred by dynamical interpretation of morphogen gradients. The evidence supporting the claims of the authors is convincing, with compelling theoretical analysis and solid yet incomplete experimental data. With the experimental part strengthened, the work could be of interest to the developmental biology and developmental systems biology communities.

    2. Reviewer #1 (Public Review):

      This work focuses on the trade-off between precision and robustness in morphogen gradients of Hedgehog signaling. It presents a framework for how Hedgehog signaling gives rise to precise responses and robust responses. This framework is based on the characteristics of the Hedgehog signaling pathway and specifically on the characteristics of the dynamical and stationary gradients that it forms in the Drosophila wing disc. On the one hand, the manuscript takes into account known results showing that the Hedgehog stationary gradient is robust due to a self-enhanced degradation (via activation of the Patched receptor). On the other hand, it uses the concept of dynamic interpretation of the gradient introduced by the leading author of this manuscript. According to this interpretation, different targets may be responding to a single signaling threshold and what differentiates the targets is whether they respond to the transient gradient, which extends over more cells, or to the stationary gradient. The framework presented in this manuscript takes this prior knowledge and builds on it. It proposes that the response from different targets will not be equally robust. Specifically, if the target responds to the stationary gradient, its response will be robust. Conversely, if the target responds to the gradient while it is being built, then its response will be less robust but more precise. This framework is analyzed using mathematical models. Finally, experimental data that partially corroborate this framework are presented, focusing on the col and Dpp targets, which, according to previous results, read the stationary and transient gradients, respectively. The col pattern is more robust to changes in Hh levels than the Dpp pattern. Furthermore, it is shown that this robustness decreases if the Patched receptor is not regulated. Hence, these experimental results confirm that the robustness is target-specific, as predicted by the models. The precision of the Dpp pattern is not tested experimentally.

    3. Reviewer #2 (Public Review):

      This paper presents a modeling analysis of a diffusing morphogen (hh) that patterns the wing disk by controlling the expression of dpp and col. Two modes of gene expression control/interpretation are analyzed and presented, one is a response using a steady state threshold (col), which could be robust (defined as a small spatial shift of the gene expression when hh dosage changes) by a ptch mediated negative feedback mechanism; the other is the "overshoot" where an earlier hh gradient profile pre-steady state is read at a threshold to activate the gene (dpp), which is less robust to dosage changes but has better boundary features. Experimental measurements of pattern widths of col and dpp were performed under different hh dosage to test the models. How these different modes were achieved by each gene was unclear.

      The reviewer found this study presents at best incremental advances to the field. It doesn't provide substantial progress conceptually or experimentally from Eldar et al., 2003, Adleman et al., 2022 and particularly Nahmad and Stathopoulos, 2009. The experimental data and interpretation appear to lack the rigor needed to challenge the model predictions.

      The authors pitched the difference between dpp and col in their response to hh dosage change as a tradeoff between robustness and precision. Specifically, robustness refers to positioning and precision refers to sharpness, which is somewhat arbitrary, as robustness could also refer to maintaining the sharpness of an expression boundary and precision could also refer to the position. This is particularly true for dpp, for which the developmental significance of stripe position and sharpness is not analyzed (disc growth, pSmad, etc.; for example, does a sharper but more mislocated dpp domain help the tissue?). The relationship between positioning and sharpness of a pattern in a morphogen system has been extensively discussed by many authors on a theoretical level. The authors' theoretical analysis is clear and simple but not new. Experimental evidence indicates that dpp and col are regulated very differently by hh, particularly in terms of timing of response (Nahmad and Stathopoulos, 2009). No comparison of the GRNs from hh to these two genes was made or experimentally tested. It is difficult to conclude that their behaviors in response to hh dosage change indeed arise from the hh gradient profile. It is also difficult to speculate whether either of these genes (particularly dpp) is facing a true biological tradeoff or tuning back and forth between positioning and sharpness during evolution.

      Methods 4.5: To measure widths of gene expression patterns, the authors used a background subtraction, followed by normalization, and then thresholded the boundary at 0.2. First, this approach oversimplifies the expression profile, which could be informative in model testing (e.g., sharpness of dpp?). Second, the sequence of the analysis steps may introduce larger errors in lower signal-to-noise images, where the subtraction narrows the pattern more than in those with higher signal-to-noise (e.g., the 18 degree vs 25 degree images, Fig. 6A); this would result in errors in the width measurements. Importantly, disk size and wing size controls are not reported.
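
The second concern can be made concrete with a small sketch of a subtract, normalize and threshold pipeline. This is not the authors' code: the function name `pattern_width`, the one-dimensional intensity profiles and the size of the background error are all hypothetical. It shows one way the same pattern shape could be measured as narrower in a dim image than in a bright one, namely when the background estimate is too high by a fixed amount.

```python
def pattern_width(profile, background, threshold=0.2):
    """Count samples at or above `threshold` after background subtraction
    and normalization to the profile maximum (width in samples)."""
    subtracted = [max(0.0, x - background) for x in profile]
    peak = max(subtracted)
    if peak == 0:
        return 0
    normalized = [x / peak for x in subtracted]
    return sum(1 for x in normalized if x >= threshold)

# Hypothetical pattern shape (arbitrary units), identical in both images.
shape = [0, 5, 15, 25, 35, 45, 55, 65, 75, 85, 95,
         85, 75, 65, 55, 45, 35, 25, 15, 5, 0]
TRUE_BACKGROUND = 100

# Same shape at two signal levels on the same background.
bright = [TRUE_BACKGROUND + 10 * s for s in shape]  # high signal-to-noise
dim = [TRUE_BACKGROUND + s for s in shape]          # low signal-to-noise

# With the true background, both images give the same width.
print(pattern_width(bright, TRUE_BACKGROUND), pattern_width(dim, TRUE_BACKGROUND))

# With the background overestimated by 10 units, only the dim image narrows.
print(pattern_width(bright, TRUE_BACKGROUND + 10), pattern_width(dim, TRUE_BACKGROUND + 10))
```

Reporting widths at several thresholds, or fitting the full profile, would be one way to check whether the measured differences depend on this step.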

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      fMRI was used to address an important aspect of human cognition - the capacity for structured representations and symbolic processing - in a cross-species comparison with non-human primates (macaques); the experimental design probed implicit symbolic processing through reversal of learned stimulus pairs. The authors present solid evidence in humans that helps elucidate the role of brain networks in symbolic processing, however the evidence from macaques was incomplete (e.g., sample size constraints, potential and hard-to-quantify differences in attention allocation, motivation, and lived experience between species).

      Thank you very much for your assessment. We would like to address the potential issues that you raise point-by-point below.

      We agree that for macaque monkey physiology, sample size is always a constraint, for both financial and ethical reasons. We addressed this concern by combining the results from two different labs, which allowed us to test 4 animals in total, twice as many as is common practice in the field of primate physiology. (We discuss this now on lines 473-478.)

      Interspecies differences in motivation, attention allocation, task strategies etc. could also be limiting factors. Note that we did address the potential lack of attention allocation directly in Experiment 2 using implicit reward association, which was successful as evidenced by the activation of attentional control areas in the prefrontal cortex. We cannot guarantee that the strategies that the two species deploy are identical, but we tentatively suggest that this might be a less important factor in the present study than in other interspecies comparisons that use explicit behavioral reports. In the current study, we directly measured surprise responses in the brain in the absence of any explicit instructions in either species, which allowed us to measure the spontaneous reversal of learned associations, which is a very basic element of symbolic representation. Our reasoning is that such spontaneous responses should be less dependent on attention allocation and task strategies. (We discuss this now in more detail on lines 478-485.)

      Finally, lived experience could be a major factor. Indeed, obvious differences include a lifetime of open-field experiences and education in our human adult subjects, which was not available to the monkey subjects, and which includes a strong bias towards explicit learning of symbolic systems (e.g. words, letters, digits, etc.). However, we have previously shown using EEG that 5-month-old human infants spontaneously generalize learning to the reversed pairs after a short learning session in the lab (Kabdebon et al, PNAS, 2019). This indicates that, even with very limited experience, humans spontaneously reverse learned associations. (We discuss this now in more detail on lines 478-485.) It could be very interesting to investigate whether spontaneous reversal could be present in infant macaque monkeys, as there might be a critical period for this effect. Although neurophysiology in awake infant monkeys is highly challenging, it would be very relevant for future work. (We discuss this in more detail on lines 493-498.)

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Kerkoerle and colleagues present a very interesting comparative fMRI study in humans and monkeys, assessing neural responses to surprise reactions at the reversal of a previously learned association. The implicit nature of this task, assessing how this information is represented without requiring explicit decision-making, is an elegant design. The paper reports that both humans and monkeys show neural responses across a range of areas when presented with incongruous stimulus pairs. Monkeys also show a surprise response when the stimuli are presented in a reversed direction. However, humans show no such surprise response based on this reversal, suggesting that they encode the relationship reversibly and bidirectionally, unlike the monkeys. This has been suggested as a hallmark of symbolic representation, that might be absent in nonhuman animals. 

      I find this experiment and the results quite compelling, and the data do support the hypothesis that humans are somewhat unique in their tendency to form reversible, symbolic associations. I think that an important strength of the results is that the critical finding is the presence of an interaction between congruity and canonicity in macaques, which does not appear in humans. These results go a long way to allay concerns I have about the comparison of many human participants to a very small number of macaques. 

      We thank the reviewer for the positive assessment. We also very much appreciate the point about the interaction effect in macaque monkeys – indeed, we do not report just a negative finding. 

      I understand the impossibility of testing 30+ macaques in an fMRI experiment. However, I think it is important to note that differences necessarily arise in the analysis of such datasets. The authors report that they use '...identical training, stimuli, and whole-brain fMRI measures'. However, the monkeys (in experiment 1) actually required 10 times more training. 

      We agree that this description was imprecise. We have changed it to “identical training stimuli” (line 151); indeed, the movies used for training were strictly identical. Furthermore, please note that we do report the fMRI results after the same training duration. In experiment 1, after 3 days of training, the monkeys did not show any significant results, even in the canonical direction. However, in experiment 2, with increased attention and motivation, a significant effect was observed on the first day of scanning after training, as was found in human subjects (see Figure 4 and Table 3).

      More importantly, while the fMRI measures are the same, group analysis over 30+ individuals is inherently different from comparing only 2 macaques (including smoothing and averaging away individual differences that might be more present in the monkeys, due to the much smaller sample size). 

      Thank you for understanding that a limited sample size is intrinsic to macaque monkey physiology. We also agree that data analysis in humans and monkeys is necessarily different. As suggested by the reviewer, we added an analysis to address this; see the corresponding reply to the ‘Recommendations for the authors’ section below.

      Despite this, the results do appear to show that macaques show the predicted interaction effect (even despite the sample size), while humans do not. I think this is quite convincing, although had the results turned out differently (for example an effect in humans that was absent in macaques), I think this difference in sample size would be considerably more concerning. 

      Thank you for noting this. Indeed, the interaction effect is crucial, and the task design was explicitly made to test this precise prediction, described in our manuscript as the “reversibility hypothesis”. The congruity effect in the learned direction served as a control for learning, while the corresponding congruity effect in the reversed direction tested for spontaneous reversal. The reversibility hypothesis stipulates that in humans there should not be a difference between the learned and the reversed direction, while there should be for monkeys. We already wrote about that in the result section of the original manuscript and now also describe this more explicitly in the introduction and beginning of the result section.

      I would also note that while I agree with the authors' conclusions, it is notable to me that the congruity effect observed in humans (red vs blue lines in Fig. 2B) appears to be far more pronounced than any effect observed in the macaques (Fig. 3C-3). Again, this does not challenge the core finding of this paper but does suggest methodological or possibly motivational/attentional differences between the humans and the monkeys (or, for example, that the monkeys had learned the associations less strongly and clearly than the humans). 

      As also explained in response to the eLife assessment above, we expanded the “limitations” section of the discussion, with a deeper description of the possible methodological differences between the two species (see lines 478-485).

      With the same worry in mind, we did increase the attention and motivation of monkeys in experiment 2, and indeed obtained a greater activation to the canonical pairs and their violation, notably in the prefrontal cortex, but crucially still without reversibility.

      In the end, we believe that the striking interspecies difference in size and extent of the violation effect, even for purely canonical stimuli, is an important part of our findings and points to a more efficient species-specific learning system, that our experiment tentatively relates to a symbolic competence.

      This is a strong paper with elegant methods and makes a worthwhile contribution to our understanding of the neural systems supporting symbolic representations in humans, as opposed to other animals. 

      We again thank the reviewer for the positive review.

      Reviewer #2 (Public Review): 

      In their article titled "Brain mechanisms of reversible symbolic reference: a potential singularity of the human brain", van Kerkoerle et al address the timely question of whether non-human primates (rhesus macaques) possess the ability for reverse symbolic inference as observed in humans. Through an fMRI experiment in both humans and monkeys, they analyzed the BOLD signal in both species while the subjects observed audio-visual and visual-visual stimulus pairs that had been previously learned in a particular direction. Remarkably, the findings pertaining to humans revealed that a broad brain network exhibited increased activity in response to surprises occurring in both the learned and reverse directions. Conversely, in monkeys, the study uncovered that the brain activity within sensory areas only responded to the learned direction but failed to exhibit any discernible response to the reverse direction. These compelling results indicate that the capacity for reversible symbolic inference may be unique to humans. 

      In general, the manuscript is skillfully crafted and highly accessible to readers. The experimental design exhibits originality, and the analyses are tailored to effectively address the central question at hand.

      Although the first experiment raised a number of methodological inquiries, the subsequent second experiment thoroughly addresses these concerns and effectively replicates the initial findings, thereby significantly strengthening the overall study. Overall, this article is already of high quality and brings new insight into human cognition. 

      We sincerely thank the reviewer for the positive comments. 

      I identified three weaknesses in the manuscript: 

      - One major issue in the study is the absence of significant results in monkeys. Indeed, authors draw conclusions regarding the lack of significant difference in activity related to surprise in the multidemand network (MDN) in the reverse congruent versus reverse incongruent conditions. Although the results are convincing (especially with the significant interaction between congruency and canonicity), the article could be improved by including additional analyses in a priori ROI for the MDN in monkeys (as well as in humans, for comparison). 

      First, we disagree with the statement about “absence of significant results in monkeys”. We do report a significant interaction which, as noted by the referee, is a crucial positive finding.

      Second, we performed the suggested analysis for experiment 2, using the bilateral ROIs of the putative monkey MDN from previous literature (Mitchell, et al. 2016), which are based on the human study by Fedorenko et al. (PNAS, 2013). 

      Author response table 1.

      Congruity effect for monkeys in Experiment 2 within the ROIs of the MDN (n=3). Significance was assessed with one-sided one-sample t-tests.

      As can be seen, none of the regions within the monkey MDN showed an FDR-corrected significant difference or interaction. Although the absence of a canonical congruity effect makes it difficult to draw strong conclusions, it did approach significance at an uncorrected level in the lateral frontal posterior region, similar to the large prefrontal effect we report in Figures 4 and 5. Furthermore, for the reversed congruity effect there was never even a trend at the uncorrected level, and the crucial interaction of canonicity and congruity again approached significance in the lateral prefrontal cortex.
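
For readers unfamiliar with the distinction between uncorrected and FDR-corrected significance, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. The response does not state which FDR method was used, so Benjamini-Hochberg is an assumption here, and the p-values are invented for illustration; they are not those of Author response table 1.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null hypothesis is
    rejected while controlling the false discovery rate at `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis whose rank is at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for seven ROIs (not taken from the response tables).
roi_p_values = [0.03, 0.20, 0.45, 0.04, 0.60, 0.11, 0.80]

uncorrected = [p < 0.05 for p in roi_p_values]
corrected = benjamini_hochberg(roi_p_values)
print(uncorrected)
print(corrected)
```

With these invented values, two ROIs pass p < 0.05 uncorrected but none survives the correction, which is the kind of pattern the paragraph above describes.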

      We also performed an ANOVA in the human participants of the VV experiment on the average betas across the 7 different fronto-parietal ROIs used by Mitchell et al to define their equivalent in the monkey brain (Fig 1a, right in Mitchell et al. 2016), with congruity, canonicity and hemisphere (except for the anterior cingulate, which is a bilateral ROI) as within-subject factors. We confirmed the results presented in the manuscript (Figure 4C), with notably no significant interaction between congruity and canonicity in any of these ROIs (all F-values (except insula) <1). A significant main effect of congruity was observed in the posterior middle frontal gyrus (MFG) and inferior precentral sulcus at the FDR-corrected level. Analyses restricted to the canonical trials found a congruity effect in these two regions plus the anterior insula and anterior cingulate/presupplementary motor area, whereas no ROIs were significant at an FDR-corrected level for reversed trials. There was a trend in the middle MFG and inferior precentral region for reversed trials. Crucially, there was not even a trend for the interaction between congruity and canonicity at the uncorrected level. The difference in the effect size between the canonical and reversed direction can therefore be explained by the larger statistical power due to the larger number of congruent trials (70%, versus 10% for the other trial conditions), not by a significant difference between the canonical and the reversed direction. 

      Author response table 2.

      Congruity effect for humans in Experiment 2 within the ROIs of the MDN (n=23).

      These results support our contention that the type of learning of the stimulus pairs was very different in the two species. We thank the reviewer for suggesting these relevant additional analyses.

      - While the authors acknowledge in the discussion that the number of monkeys included in the study is considerably lower compared to humans, it would be informative to know the variability of the results among human participants. 

      We agree that this is an interesting question, although it is also very open-ended. For instance, we could report each subject’s individual whole-brain results, but this would take too much space (and the interested reader will be able to do so from the data that we make available as part of this publication). As a step in this direction, we provide below a figure showing the individual congruity effects, separately for each experiment and for each ROI of Table 5, and for each of the 52 participants for whom an fMRI localizer was available:

      Author response image 1.

      Difference in mean betas between congruent and incongruent conditions in a priori linguistic and mathematical ROIs (see definition and analyses in Table 5) in both experiments (experiment 1 = AV, left panel; experiment 2 = VV, right panel). Dots correspond to participants (red: canonical trials; green: reversed trials). The boxplot notch is located at the median and the lower and upper box hinges at the 25th and 75th centiles. Whiskers extend to 1.5 inter-quartile ranges on either side of the hinges. ROIs are ranked by the median of the Incongruent-Congruent difference across canonical and reversed order, within a given experiment. For purposes of comparison between the two experiments, we have underlined with colors the top-five common ROIs between the two experiments. N.s.: non-significant congruity effect (p>0.05)
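
The boxplot elements named in the caption can be computed as in the sketch below. Two details are assumptions rather than statements from the response: the quartiles use linear interpolation (the `inclusive` method of Python's `statistics.quantiles`), and the whiskers stop at the most extreme data point within 1.5 inter-quartile ranges of the hinges, as in the common Tukey convention. The input values are invented.

```python
import statistics

def boxplot_summary(values, whisker_factor=1.5):
    """Median, hinges, whisker ends and outliers for one group of values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - whisker_factor * iqr
    high_limit = q3 + whisker_factor * iqr
    # Whiskers end at the most extreme data points inside the limits.
    inside = [v for v in values if low_limit <= v <= high_limit]
    outliers = sorted(v for v in values if v < low_limit or v > high_limit)
    return {
        "median": median,
        "lower_hinge": q1,
        "upper_hinge": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": outliers,
    }

# Hypothetical per-participant beta differences for one ROI.
betas = [1, 2, 3, 4, 5, 6, 7, 8, 30]
summary = boxplot_summary(betas)
print(summary)
```

Here the value 30 lies beyond the upper whisker limit (7 + 1.5 × 4 = 13), so it would be drawn as an individual point rather than stretching the whisker.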

      Several regions show a rather consistent difference across subjects (see, for instance, the posterior STS in experiment 1, left panel). Overall, only 3 of the 52 participants did not show any beta greater than 2, in canonical or reversed trials, in any ROI. The consistency is quite striking, given the limited number of test trials (in total only 16 incongruent trials per direction per participant), and the fact that these ROIs were selected for their responses to spoken or written sentences, as part of a subsidiary task quite different from the main task.

      - Some details are missing in the methods.  

      Thank you for these comments, we reply to them point-by-point below.

      Reviewer #3 (Public Review): 

      This study investigates the hypothesis that humans (but not non-human primates) spontaneously learn reversible temporal associations (i.e., learning a B-A association after only being exposed to A-B sequences), which the authors consider to be a foundational property of symbolic cognition. To do so, they expose humans and macaques to 2-item sequences (in a visual-auditory experiment, pairs of images and spoken nonwords, and in a visual-visual experiment, pairs of images and abstract geometric shapes) in a fixed temporal order, then measure the brain response during a test phase to congruent vs. incongruent pairs (relative to the trained associations) in canonical vs. reversed order (relative to the presentation order used in training). The advantage of neuroimaging for this question is that it removes the need for a behavioral test, which non-human primates can fail for reasons unrelated to the cognitive construct being investigated. In humans, the researchers find statistically indistinguishable incongruity effects in both directions (supporting a spontaneous reversible association), whereas in monkeys they only find incongruity effects in the canonical direction (supporting an association but a lack of spontaneous reversal). Although the precise pattern of activation varies by experiment type (visual-auditory vs. visual-visual) in both species, the authors point out that some of the regions involved are also those that are most anatomically different between humans and other primates. The authors interpret their finding to support the hypothesis that reversible associations, and by extension symbolic cognition, is uniquely human. 

      This study is a valuable complement to prior behavioral work on this question. However, I have some concerns about methods and framing. 

      We thank the reviewer for the careful summary of the manuscript, and the positive comments.

      Methods - Design issues: 

      The authors originally planned to use the same training/testing protocol for both species but the monkeys did not learn anything, so they dramatically increased the amount of training and evaluation. By my calculation from the methods section, humans were trained on 96 trials and tested on 176, whereas the monkeys got an additional 3,840 training trials and 1,408 testing trials. The authors are explicit that they continued training the monkeys until they got a congruity effect. On the one hand, it is commendable that they are honest about this in their write-up, given that this detail could easily be framed as deliberate after the fact. On the other hand, it is still a form of p-hacking, given that it's critical for their result that the monkeys learn the canonical association (otherwise, the critical comparison to the non-canonical association is meaningless). 

      Thank you for this comment. 

      Indeed, for experiment 1, the amount of training and testing was not equal for the humans and monkeys, as also mentioned by reviewer 2. We now describe in more detail how many training and imaging days we used for each experiment and each species, as well as the number of blocks per day and the number of trials per block (see lines 572-577). We also added the information on the amount of training received to the legends of all of the Tables.

      We are sorry for giving the impression that we trained until the monkeys learned this. This was not the case. Based on previous literature, we actually anticipated that the short training would not be sufficient, and therefore planned additional training in advance. Specifically, Meyer & Olson (2011) had observed pair learning in the inferior temporal cortex of macaque monkeys after 816 exposures per pair. This is similar to the additional training we gave, about 80 blocks with 12 trials per pair per block. This is  now explained in more detail (lines 577-580).
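      As a rough check on the figures quoted above, the short sketch below multiplies out the additional training. It assumes 4 pairs per block (as stated elsewhere in this response) and treats "about 80 blocks" as exactly 80; the variable names are illustrative only.

```python
# Worked arithmetic for the additional monkey training in experiment 1.
# Assumptions: exactly 80 blocks (the text says "about 80") and 4 pairs per block.
BLOCKS = 80
TRIALS_PER_PAIR_PER_BLOCK = 12
PAIRS_PER_BLOCK = 4
MEYER_OLSON_EXPOSURES_PER_PAIR = 816  # figure cited from Meyer & Olson (2011)

exposures_per_pair = BLOCKS * TRIALS_PER_PAIR_PER_BLOCK
total_training_trials = exposures_per_pair * PAIRS_PER_BLOCK

print(f"Exposures per pair: {exposures_per_pair}")
print(f"Total additional training trials: {total_training_trials}")
print(f"Ratio to Meyer & Olson: {exposures_per_pair / MEYER_OLSON_EXPOSURES_PER_PAIR:.2f}")
```

      Under these assumptions the total matches the 3,840 additional training trials calculated by the reviewer, and the per-pair exposure is of the same order as the 816 exposures cited.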

      Furthermore, we strongly disagree with the pejorative term p-hacking. The aim of the experiment was not to show a congruency effect in the canonical direction in monkeys, but to track and compare their behavior in the same paradigm as that of humans for the reverse direction. It would have been unwise to stop after human-identical training and only show that humans learn better, which is a given. Instead, we looked at brain activations at both times, at the end of human-identical training and when the monkeys had learned the pairs in the canonical direction. 

      Finally, in experiment 2, monkeys were tested after the same 3 days of training as humans. We wrote: “Using this design, we obtained significant canonical congruity effects in monkeys on the first imaging day after the initial training (24 trials per pair), indicating that the animals had learned the associations” (lines 252-253).

      (2) Between-species comparisons are challenging. In addition to having differences in their DNA, human participants have spent many years living in a very different culture than that of NHPs, including years of formal education. As a result, attributing the observed differences to biology is challenging. One approach that has been adopted in some past studies is to examine either young children or adults from cultures that don't have formal educational structures. This is not the approach the authors take. This major confound needs to minimally be explicitly acknowledged up front. 

      Thank you for raising this important point. We already had a section on “limitations” in the manuscript, which we have now extended (lines 478-485). Indeed, this study follows a previous EEG study in 5-month-old infants, in which we showed that after learning associations between labels and categories, infants spontaneously generalize learning to the reversed pairs after a short learning period in the lab (Kabdebon et al., PNAS, 2019). We also cited preliminary results from the same paradigm as in the current study, but using EEG in 4-month-old infants (Ekramnia and Dehaene-Lambertz, 2019), which replicated the results of Kabdebon et al. (2019) showing that preverbal infants spontaneously generalize learning to the reversed pairs.

      Functional MRI in awake infants remains a challenge at this age (but see our own work, Dehaene-Lambertz et al., Science, 2002), especially because the experimental design yields only a few trials in the conditions of interest (10%) and thus requires a long experimental duration that exceeds infants’ quietness and attentional capacities in the noisy MRI environment. (We discuss this on lines 493-496.)

      (3) Humans have big advantages in processing and discriminating spoken stimuli and associating them with visual stimuli (after all, this is what words are in spoken human languages). Experiment 2 ameliorates these concerns to some degree, but still, it is difficult to attribute the failure of NHPs to show reversible associations in Experiment 1 to cognitive differences rather than the relative importance of sound string to meaning associations in the human vs. NHP experiences. 

      As the reviewer wrote, we deliberately performed Experiment 2 with visual shapes to control for various factors that might have explained the monkeys' failure in Experiment 1. 

      (4) More minor: The localizer task (math sentences vs. other sentences) makes sense for math but seems to make less sense for language: why would a language region respond more to sentences that don't describe math vs. ones that do? 

      The referee is correct: our use of the word “reciprocally” was improper (although see Amalric and Dehaene, 2016 for significant differences in both directions when non-mathematical sentences concern specific knowledge). We changed the formulation to clarify this as follows: “In these ROIs, we recovered the subject-specific coordinates of each participant’s 10% best voxels in the following comparisons: sentences vs rest for the 6 language ROIs; reading vs listening for the VWFA; and numerical vs non-numerical sentences for the 8 mathematical ROIs.” (lines 678-680).

      Methods - Analysis issues: 

      (5) The analyses appear to "double dip" by using the same data to define the clusters and to statistically test the average cluster activation (Kriegeskorte et al., 2009). The resulting effect sizes are therefore likely inflated, and the p-values are anticonservative. 

      It is not clear to us which result the reviewer is referring to. In Tables 1-4, we report the values that we found significant in the whole-brain analysis; we do not report additional statistical tests for these data. For Table 5, the subject-specific voxels were identified through a separate localizer experiment, which was designed to pinpoint the precise activation areas for each subject in the domains of oral and written language processing and math. Subsequently, we compared the activation at these voxel locations across different conditions of the main experiment. Thus, the two datasets were distinct, and there was no double dipping. In both interpretations of the comment, we therefore disagree with the reviewer.

      Framing: 

      (6) The framing ("Brain mechanisms of reversible symbolic reference: A potential singularity of the human brain") is bigger than the finding (monkeys don't spontaneously reverse a temporal association but humans do). The title and discussion are full of buzzy terms ("brain mechanisms", "symbolic", and "singularity") that are only connected to the experiments by a debatable chain of assumptions. 

      First, this study shows relatively little about brain "mechanisms" of reversible symbolic associations, which implies insights into how these associations are learned, recognized, and represented. But we're only given standard fMRI analyses that are quite inconsistent across similar experimental paradigms, with purely suggestive connections between these spatial patterns and prior work on comparative brain anatomy. 

      We agree with the referee that the term “mechanism” is ambiguous and, for systems neuroscientists, may suggest more than we are able to do here with functional MRI. We changed the title to “Brain areas for reversible symbolic reference, a potential singularity of the human brain”. This title better describes our specific contribution: mapping out the areas involved in reversibility in humans, and showing that they do not seem to respond similarly in macaque monkeys.

      Second, it's not clear what the relationship is between symbolic cognition and a propensity to spontaneously reverse a temporal association. Certainly, if there are inter-species differences in learning preferences this is important to know about, but why is this construed as a difference in the presence or absence of symbols? Because the associations aren't used in any downstream computation, there is not even any way for participants to know which is the sign and which is the signified: these are merely labels imposed by the researchers on a sequential task. 

      As explained in the introduction, the reversibility test addressed a very minimal core property of symbolic reference. There cannot be a symbol if its attachment doesn’t operate in both directions. Thus, this property is necessary – but we agree that it is not sufficient. Indeed, more tests are needed to establish whether and how the learned symbols are used in further downstream compositional tasks (as discussed in our recent TICS papers, Dehaene et al. 2022). We added a sentence in the introduction to acknowledge this fact:

      “Such reversibility is a core and necessary property of symbols, although we readily acknowledge that it is not sufficient, since genuine symbols present additional referential and compositional properties that will not be tested in the present work.” (lines 89-92).

      Third, the word "singularity" is both problematically ambiguous and not well supported by the results. "Singularity" is a highly loaded word that the authors are simply using to mean "that which is uniquely human". Rather than picking a term with diverse technical meanings across fields and then trying to restrict the definition, it would be better to use a different term. Furthermore, even under the stated definition, this study performed a single pairwise comparison between humans and one other species (macaques), so it is a stretch to then conclude (or insinuate) that the "singularity" has been found (see also pt. 2 above). 

      We have published an extensive review including a description of our use of the term “singularity” (Dehaene et al., TICS 2022). Here is a short excerpt: “Humans are different even in domains such as drawing and geometry that do not involve communicative language. We refer to this observation using the term “human cognitive singularity”, the word singularity being used here in its standard meaning (the condition of being singular) as well as its mathematical sense (a point of sudden change). Hominization was certainly a singularity in biological evolution, so much so that it opened up a new geological age (the Anthropocene). Even if evolution works by small continuous change (and sometimes it doesn’t [4]), it led to a drastic cognitive change in humans.”

      We find the referee’s use of the pejorative term “insinuate” quite inappropriate. From the title on, we are quite nuanced and refer only to a “potential singularity”. Furthermore, as noted above, we explicitly mention in the discussion the limitations of our study, and in particular the fact that only a single non-human species was tested (see lines 486-493). We are working hard to get chimpanzee data, but this is remarkably difficult for us, and we hope that our paper will incite other groups to collect more evidence on this point.

      (7) Related to pt. 6, there is circularity in the framing whereby the authors say they are setting out to find out what is uniquely human, hypothesizing that the uniquely human thing is symbols, and then selecting a defining trait of symbols (spontaneous reversible association) *because* it seems to be uniquely human (see e.g., "Several studies previously found behavioral evidence for a uniquely human ability to spontaneously reverse a learned association (Imai et al., 2021; Kojima, 1984; Lipkens et al., 1988; Medam et al., 2016; Sidman et al., 1982), and such reversibility was therefore proposed as a defining feature of symbol representation reference (Deacon, 1998; Kabdebon and Dehaene-Lambertz, 2019; Nieder, 2009).", line 335). They can't have it both ways. Either "symbol" is an independently motivated construct whose presence can be independently tested in humans and other species, or it is by fiat synonymous with the "singularity". This circularity can be broken by a more modest framing that focuses on the core research question (e.g., "What is uniquely human? One possibility is spontaneous reversal of temporal associations.") and then connects (speculatively) to the bigger conceptual landscape in the discussion ("Spontaneous reversal of temporal associations may be a core ability underlying the acquisition of mental symbols").

      We fail to understand the putative circularity that the referee sees in our introduction. We urge him/her to re-read it, and hope that, with the changes that we introduced, it does boil down to his/her summary, i.e. “What is uniquely human? One possibility is spontaneous reversal of temporal associations."

      Reviewer #1 (Recommendations For The Authors): 

      In general, the manuscript was very clear, easy to read, and compelling. I would recommend the authors carefully check the text for consistency and minor typos. For example: 

      The sample size for the monkeys kept changing throughout the paper. E.g., Experiment 1: n = 2 (line 149); n = 3 (line 205).  

      Thank you for catching this error; we have corrected it. The number of animals was indeed 2 for experiment 1, and 3 for experiment 2. (Animals JD and YS participated in experiment 1, and JD, JC and DN in experiment 2, so only JD participated in both experiments.)

      Similarly, the number of stimulus pairs is reported inconsistently (4 on line 149, 5 pairs later in the paper). 

      We’re sorry that this was unclear. We used 5 sets of 4 audio-visual pairs each. We now clarify this, on line 157 and on lines 514-516.

      At least one case of p>0.0001, rather than p < 0.0001 (I assume). 

      Thank you once again; we have now corrected this.

      Reviewer #2 (Recommendations For The Authors): 

      One major issue in the study is the absence of significant results in monkeys. Indeed, the authors draw conclusions regarding the lack of significant difference in activity related to surprise in the multidemand network (MDN) in the reverse congruent versus reverse incongruent conditions. Although the results are convincing (especially with the significant interaction between congruency and canonicity), the article could be improved by including additional analyses in a priori ROI for the MDN in monkeys (as well as in humans, for comparison). In other words: what are the statistics for the MDN regarding congruity, canonicity, and interaction in both species? Since the authors have already performed this type of analysis for language and Math ROIs (table 5), it should be relatively easy for them to extend it to the MDN. Demonstrating that results in monkeys are far from significant could further convince the reader. 

      Furthermore, while the authors acknowledge in the discussion that the number of monkeys included in the study is considerably lower compared to humans, it would be informative to know the variability of the results among human participants. Specifically, it would be valuable to describe the proportion of human participants in which the effects of congruency, canonicity, and their interaction are significant. Additionally, stating the variability of the F-values for each effect would provide reassurance to the reader regarding the distinctiveness of humans in comparison to monkeys. Low variability in the results would serve to mitigate concerns that the observed disparity is merely a consequence of testing a unique subset of monkeys, which may differ from the general population. Indeed, this would be a greater support to the notion that the dissimilarity stems from a genuine distinction between the two species. 

      We responded to both of these points above.

      In terms of methods, details are missing: 

      - How many trials of each condition are there exactly? (10% of 44 trials is 4.4) : 

      We wrote: “In both humans and monkeys, each block started with 4 trials in the learned direction (congruent canonical trials), one trial for each of the 4 pairs (2 O-L and 2 L-O pairs). The rest of the block consisted of 40 trials in which 70% of trials were identical to the training; 10% were incongruent pairs but the direction (O-L or L-O) was correct (incongruent canonical trials), thus testing whether the association was learned; 10% were congruent pairs but the direction within the pairs was reversed relative to the learned pairs (congruent reversed trials) and 10% were incongruent pairs in reverse (incongruent reversed trials).”(See lines 596-600.)

      Thus, each block comprised 4 initial trials, 28 canonical congruent trials, 4 canonical incongruent, 4 reverse congruent and 4 reverse incongruent trials, i.e. 4+28+3x4=44 trials.
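      The block composition described above can be tallied with a small sketch. The condition names and the helper function are illustrative labels, not identifiers from the study's code; the counts follow directly from the quoted percentages.

```python
# Tally of trials per block, following the quoted methods text:
# 4 initial congruent canonical trials, then 40 trials split 70/10/10/10.
INITIAL_TRIALS = 4
REMAINING_TRIALS = 40

# Percentages of the remaining 40 trials (integers avoid float rounding).
PERCENTAGES = {
    "canonical_congruent": 70,
    "canonical_incongruent": 10,
    "reversed_congruent": 10,
    "reversed_incongruent": 10,
}


def block_composition():
    """Return the number of trials per condition in one block."""
    counts = {
        name: pct * REMAINING_TRIALS // 100 for name, pct in PERCENTAGES.items()
    }
    counts["initial_canonical_congruent"] = INITIAL_TRIALS
    return counts


counts = block_composition()
total_trials = sum(counts.values())
for name, n in counts.items():
    print(f"{name}: {n}")
print(f"total per block: {total_trials}")
```

      The total comes to 44 trials per block, which is the figure the reviewer used when asking about the 10% conditions.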

      - How long is one trial? 

      As written in the method section: “In each trial, the first stimulus (label or object) was presented during 700ms, followed by an inter-stimulus-interval of 100ms then the second stimulus during 700ms. The pairs were separated by a variable inter-trial-interval of 3-5 seconds” i.e. 700+100+700=1500 ms per trial, plus 3 to 4.75 seconds of blank between trials (see lines 531-533).

      - How are the stimulus presentations jittered? 

      See: “The pairs were separated by a variable inter-trial-interval randomly chosen among eight different durations between 3 and 4.75 seconds (step=250 ms). The series of 8 intervals was randomized again each time it was completed.” (lines 533-535).
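      One way to read the jitter scheme quoted above is sketched below: the eight durations are shuffled, used in order, and reshuffled once the series is exhausted. The function name and the use of a seeded generator are assumptions for illustration; the actual stimulus-presentation code is not described in this response.

```python
import random

# Eight inter-trial intervals from 3.00 s to 4.75 s in 250 ms steps.
ITI_DURATIONS_S = [3.0 + 0.25 * i for i in range(8)]
TRIAL_DURATION_MS = 700 + 100 + 700  # stimulus + inter-stimulus interval + stimulus


def iti_schedule(n_trials, rng=None):
    """Return n_trials inter-trial intervals (in seconds).

    The series of eight durations is shuffled, consumed in order, and
    reshuffled each time it has been used up.
    """
    rng = rng or random.Random()
    schedule = []
    while len(schedule) < n_trials:
        series = ITI_DURATIONS_S[:]
        rng.shuffle(series)
        schedule.extend(series)
    return schedule[:n_trials]


if __name__ == "__main__":
    itis = iti_schedule(44, random.Random(0))
    print(itis[:8])
    print(f"mean ITI over the block: {sum(itis) / len(itis):.3f} s")
```

      Because every complete series contains each duration exactly once, the intervals are balanced over any multiple of eight trials while their order remains unpredictable.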

      - What is the statistical power achieved for humans? And for monkeys? 

      We know of no standard way to define power for fMRI experiments. Power will depend on so many parameters, including the fMRI signal-to-noise ratio, the attention of the subject, the areas being considered, the type of analysis (whole-brain versus ROIs), etc.

      - Videos are mentioned in the methods, is it the image and sound? It is not clear. 

      We’re sorry that it was unclear. Videos were only used for the training of the human subjects. We have now corrected this in the method section (lines 552-554).

      Reviewer #3 (Recommendations For The Authors): 

      The main recommendations are to adjust the framing (making it less bold and more connected to the empirical evidence) and to ensure independence in the statistical analyses of the fMRI data. 

      See our replies to the reviewer’s comments on “Framing” above. In particular, we changed the title of the paper from “Brain mechanisms of reversible symbolic reference” to “Brain areas for reversible symbolic reference”.

      References cited in this response

      Dehaene, S., Al Roumi, F., Lakretz, Y., Planton, S., & Sablé-Meyer, M. (2022). Symbols and mental programs : A hypothesis about human singularity. Trends in Cognitive Sciences, 26(9), 751‑766. https://doi.org/10.1016/j.tics.2022.06.010.

      Dehaene-Lambertz, Ghislaine, Stanislas Dehaene, and Lucie Hertz-Pannier. Functional Neuroimaging of Speech Perception in Infants. Science 298, no. 5600 (2002): 2013-15. https://doi.org/10.1126/science.1077066.

      Ekramnia M, Dehaene-Lambertz G. 2019. Investigating bidirectionality of associations in young infants as an approach to the symbolic system. Presented at the CogSci. p. 3449.

      Fedorenko E, Duncan J, Kanwisher N (2013) Broad domain generality in focal regions of frontal and parietal cortex. Proc Natl Acad Sci U S A 110:16616-16621.

      Kabdebon, Claire, and Ghislaine Dehaene-Lambertz. Symbolic Labeling in 5-Month-Old Human Infants. Proceedings of the National Academy of Sciences 116, no. 12 (2019): 5805-10. https://doi.org/10.1073/pnas.1809144116.

      Mitchell, D. J., Bell, A. H., Buckley, M. J., Mitchell, A. S., Sallet, J., & Duncan, J. (2016). A Putative Multiple-Demand System in the Macaque Brain. Journal of Neuroscience, 36(33), 8574‑8585. https://doi.org/10.1523/JNEUROSCI.0810-16.2016

    2. eLife assessment

      fMRI was used to address an important aspect of human cognition - the capacity for structured representations and symbolic processing - in a cross-species comparison with macaques; the experimental design probed implicit symbolic processing through reversal of learned stimulus pairs. The authors present solid evidence in humans that helps elucidate the role of brain networks in symbolic processing; however, the evidence from macaques was necessarily incomplete (e.g., hard-to-quantify differences in learning trajectories and lived experience between species).

    3. Reviewer #1 (Public Review):

      Kerkoerle and colleagues present a very interesting comparative fMRI study in humans and monkeys, assessing neural responses to surprise reactions at the reversal of a previously learned association. The implicit nature of this task, assessing how this information is represented without requiring explicit decision making, is an elegant design. The paper reports that both humans and monkeys show neural responses across a range of areas when presented with incongruous stimulus pairs. Monkeys also show a surprise response when the stimuli are presented in the reversed direction. However, humans show no such surprise response based on this reversal, suggesting that they encode the relationship reversibly and bidirectionally, unlike the monkeys. This has been suggested as a hallmark of symbolic representation, that might be absent in nonhuman animals.

      I find this experiment and the results quite compelling, and the data do support the hypothesis that humans are somewhat unique in their tendency to form reversible, symbolic associations. I think that an important strength of the results is that the critical finding is the presence of an interaction between congruity and canonicity in macaques, which does not appear in humans. These results go a long way to allay concerns I have about the comparison of many human participants to a very small number of macaques.

      The results do appear to show that macaques show the predicted interaction effect (even despite the sample size), while humans do not. I think this is quite convincing. (Although had the results turned out differently (for example an effect in humans that was absent in macaques), I think this difference in sample size would be considerably more concerning.)

      I would also note that while I agree with the authors' conclusions, it is also notable to me that the congruity effect observed in humans (red vs blue lines in Fig. 2B) appears to be far more pronounced than any effect observed in the macaques (Fig. 3C-3). Again, this does not challenge the core finding of this paper but does suggest methodological or possibly motivational/attentional differences between the humans and the monkeys (or, for example, that the monkeys had learned the associations less strongly and clearly than the humans). The authors now discuss this more fully.

      This is a strong paper with elegant methods and makes a worthwhile contribution to our understanding of the neural systems supporting symbolic representations in humans, as opposed to other animals.

    4. Reviewer #2 (Public Review):

      In their article, van Kerkoerle et al. address the timely question of whether non-human primates (rhesus macaques) possess the ability for reverse symbolic inference as observed in humans. Through an fMRI experiment in both humans and monkeys, they analyzed the BOLD signal in both species while observing audio-visual and visual-visual stimulus pairs that had been previously learned in a particular direction. Remarkably, the findings pertaining to humans revealed that a broad brain network exhibited increased activity in response to surprises occurring in both the learned and reverse directions. Conversely, in monkeys, the study uncovered that the brain activity within sensory areas only responded to the learned direction but failed to exhibit any discernible response to the reverse direction. These compelling results indicate that the capacity for reversible symbolic inference may be specific to humans, even though it remains to be tested in other species.

      In general, the manuscript is skillfully crafted and highly accessible to readers. The experimental design exhibits originality, and the analyses are tailored to effectively address the central question at hand. Although the first experiment raised a number of methodological inquiries, the subsequent second experiment thoroughly addresses these concerns and effectively replicates the initial findings, thereby significantly strengthening the overall study. Overall, this article is of high quality and brings new insight into human cognition.

      The main limitation of the studies is the sample size of the non-human primate group (n=2 and n=3). Nevertheless, this limitation is carefully addressed and discussed in the manuscript.

    5. Reviewer #3 (Public Review):

      Original review

      This study investigates the hypothesis that humans (but not non-human primates) spontaneously learn reversible temporal associations (i.e., learning a B-A association after only being exposed to A-B sequences), which the authors consider to be a foundational property of symbolic cognition. To do so, they expose humans and macaques to 2-item sequences (in a visual-auditory experiment, pairs of images and spoken nonwords, and in a visual-visual experiment, pairs of images and abstract geometric shapes) in a fixed temporal order, then measure the brain response during a test phase to congruent vs. incongruent pairs (relative to the trained associations) in canonical vs. reversed order (relative to the presentation order used in training). The advantage of neuroimaging for this question is that it removes the need for a behavioral test, which non-human primates can fail for reasons unrelated to the cognitive construct being investigated. In humans, the researchers find statistically indistinguishable incongruity effects in both directions (supporting a spontaneous reversible association), whereas in monkeys they only find incongruity effects in the canonical direction (supporting an association but a lack of spontaneous reversal). Although the precise pattern of activation varies by experiment type (visual-auditory vs. visual-visual) in both species, the authors point out that some of the regions involved are also those that are most anatomically different between humans and other primates. The authors interpret their findings to support the hypothesis that reversible associations, and by extension symbolic cognition, is uniquely human.

      This study is a valuable complement to prior behavioral work on this question. However, I have some concerns about methods and framing.

      Methods - Design issues:

      (1) The authors originally planned to use the same training/testing protocol for both species but the monkeys did not learn anything, so they dramatically increased the amount of training and evaluation. By my calculation from the methods section, humans were trained on 96 trials and tested on 176, whereas the monkeys got an additional 3,840 training trials and 1,408 testing trials. The authors are explicit that they continued training the monkeys until they got a congruity effect. On the one hand, it is commendable that they are honest about this in their write-up, given that this detail could easily be framed as deliberate after the fact. On the other hand, it is still a form of p-hacking, given that it's critical for their result that the monkeys learn the canonical association (otherwise, the critical comparison to the non-canonical association is meaningless).

      (2) Between-species comparisons are challenging. In addition to having differences in their DNA, human participants have spent many years living in a very different culture than that of NHPs, including years of formal education. As a result, attributing the observed differences to biology is challenging. One approach that has been adopted in some past studies is to examine either young children or adults from cultures that don't have formal educational structures. This is not the approach the authors take. This major confound needs to minimally be explicitly acknowledged up front.

      (3) Humans have big advantages in processing and discriminating spoken stimuli and associating them to visual stimuli (after all, this is what words are in spoken human languages). Experiment 2 ameliorates these concerns to some degree, but still it is difficult to attribute the failure of NHPs to show reversible associations in Experiment 1 to cognitive differences rather than the relative importance of sound string to meaning associations in the human vs. NHP experiences.

      (4) More minor: The localizer task (math sentences vs. other sentences) makes sense for math but seems to make less sense for language: why would a language region respond more to sentences that don't describe math vs. ones that do?

      Methods - Analysis issues:

      (5) The analyses appear to "double dip" by using the same data to define the clusters and to statistically test the average cluster activation (Kriegeskorte et al., 2009). The resulting effect sizes are therefore likely inflated, and the p-values are anticonservative.

      Framing:

      (6) The framing ("Brain mechanisms of reversible symbolic reference: A potential singularity of the human brain") is bigger than the finding (monkeys don't spontaneously reverse a temporal association but humans do). The title and discussion are full of buzzy terms ("brain mechanisms", "symbolic", and "singularity") that are only connected to the experiments by a debatable chain of assumptions.

      First, this study shows relatively little about brain "mechanisms" of reversible symbolic associations, which implies insights about how these associations are learned, recognized, and represented. But we're only given standard fMRI analyses that are quite inconsistent across similar experimental paradigms, with purely suggestive connections between these spatial patterns and prior work on comparative brain anatomy.

      Second, it's not clear what the relationship is between symbolic cognition and a propensity to spontaneously reverse a temporal association. Certainly if there are inter-species differences in learning preferences this is important to know about, but why is this construed as a difference in the presence or absence of symbols? Because the associations aren't used in any downstream computation, there is not even any way for participants to know which is the sign and which is the signified: these are merely labels imposed by the researchers on a sequential task.

      Third, the word "singularity" is both problematically ambiguous and not well supported by the results. "Singularity" is a highly loaded word that the authors are simply using to mean "that which is uniquely human". Rather than picking a term with diverse technical meanings across fields and then trying to restrict the definition, it would be better to use a different term. Furthermore, even under the stated definition, this study performed a single pairwise comparison between humans and one other species (macaques), so it is a stretch to then conclude (or insinuate) that the "singularity" has been found (see also pt. 2 above).

      (7) Related to pt. 6, there is circularity in the framing whereby the authors say they are setting out to find out what is uniquely human, hypothesizing that the uniquely human thing is symbols, and then selecting a defining trait of symbols (spontaneous reversible association) *because* it seems to be uniquely human (see e.g., "Several studies previously found behavioral evidence for a uniquely human ability to spontaneously reverse a learned association (Imai et al., 2021; Kojima, 1984; Lipkens et al., 1988; Medam et al., 2016; Sidman et al., 1982), and such reversibility was therefore proposed as a defining feature of symbol representation reference (Deacon, 1998; Kabdebon and Dehaene-Lambertz, 2019; Nieder, 2009).", line 335). They can't have it both ways. Either "symbol" is an independently motivated construct whose presence can be independently tested in humans and other species, or it is by fiat synonymous with the "singularity". This circularity can be broken by a more modest framing that focuses on the core research question (e.g., "What is uniquely human? One possibility is spontaneous reversal of temporal associations.") and then connects (speculatively) to the bigger conceptual landscape in the discussion ("Spontaneous reversal of temporal associations may be a core ability underlying the acquisition of mental symbols").

      Comments on revised version:

      I thank the authors for engaging constructively with my comments. I'm convinced by the responses to my original points 1, 2, 3, and 4. I'm also partially convinced by the response to point 6 (with qualifications discussed below). I do want to clear the record on points 1 and 6 (about which the authors expressed offense at aspects of my original comments), and to press on points 5 and 7.

      (1) It's very helpful to know that the plan was always to extend training in Expt 1. The rationale is now clear in the methods, although I'd encourage the authors to also emphasize this if space permits in the vicinity of lines 211-216, which still read as if the extended training was a post hoc decision ("the canonical congruity effect... was not significant... after 3 days of exposure... Thus... monkeys were further exposed..."). The authors have objected to my original use of "p hacking", which I agree was too strong (my apologies). My intention was only to point out that *if it were the case that training duration was conditional on the monkeys' success at learning the canonical association* (which the authors have now clarified was not the case), then this would be steering the study post hoc to achieve a desired outcome. I recognize the authors' point that the canonical direction was a sanity check, not the effect of interest (reversed association), but it's still true that they needed to achieve this sanity check in order for the absence of a reversed effect to be meaningful. This was the source of my original concern. This point is only clarificational (no action is recommended).

      (5) The authors have said they don't understand my concern about "double-dipping" in the statistical analyses, so I will attempt to clarify. First, I should stress that this concern applies only to the whole-brain results (Tables 1-4), not the fROI results. As the authors point out, this was indeed unclear, and I apologize. My concern about Tables 1-4 is that they seem to be derived using the classical technique of thresholding contrasts at some significance level to define clusters and then reporting cluster statistics (in this case, t-values) derived from *the same contrast in the same activation maps*. If this is not what was done (i.e., if orthogonal data and/or contrasts were used to define clusters and quantify contrasts within clusters, as in the fROI analyses), then this point is moot (and clarification in the paper would be helpful). But if this is what was done, then this procedure is known to be distortionary (e.g., Kriegeskorte et al 2009, "Nonindependent selective analysis is incorrect and should not be acceptable in neuroscientific publications").

      (6) The authors have objected to my use of the term "insinuate" as pejorative. I don't share this impression (and insult was certainly not my intent) but I'm happy to concede that a less loaded term (e.g., "suggest") would have been a better choice. I apologize. In any case, I stand by my intended original concern that a key idea in this piece (that reversible symbolic inference is a singularity of the human brain) is being advanced rhetorically rather than empirically, by repeatedly supplying it to readers (albeit with qualifiers like "potential") as an interpretive lens through which to view empirical results that only directly support a more modest claim (that macaques spontaneously reverse sequential associations less readily than humans do). To be clear, it is good that the authors don't make this stronger claim outright, and it is fine to motivate a more modest research question (e.g., do species differ in spontaneous reversal of associations) on the grounds that it is a stepping stone to a bigger one (what is the singularity). But by placing the bigger framing front and center in this way, there's a risk that this paper will be received by the community as establishing a conclusion that it does not actually establish.

      (7) The authors have said they don't understand the circularity I'm alleging. Having read the revision, I believe the issue is still there, so I'll make another attempt. The problem is most clearly apparent in the Discussion text quoted in my original comment (lines 347-350 of the revision, emphasis mine): "Several studies previously found behavioural evidence for a *uniquely human* ability to spontaneously reverse a learned association (Imai et al., 2021; Kojima, 1984; Lipkens et al., 1988; Medam et al., 2016; Sidman et al., 1982), and such reversibility was *therefore* proposed as a defining feature of symbol representation reference (Deacon, 1998; Kabdebon and Dehaene-Lambertz, 2019; Nieder, 2009)." In other words, reversal of associations is selected as a defining feature of symbols and targeted by this study *because* it is thought to be uniquely human. This is fine, but it prohibits you from then advocating the hypothesis that symbolic cognition is the singularity (lines 49-52), because "symbol" is being defined such that this is necessarily the case. To minimally paraphrase what I perceive to be the circular logic in the framing, the argument seems to go: "What is uniquely human? Symbols. What are symbols? That which is uniquely human." In my original comment, I suggested a reframing that would fix this issue, namely: "What is uniquely human? Spontaneous reversal of temporal associations." The authors say they don't see the difference between this framing and their own, so I'll try to clarify: the difference is that it sidesteps the notion of "symbol", and in so doing removes the circular definitions of "symbol" and "singularity" in terms of each other. This suggestion was given not as a prescription but as an example to show that the issue can be remedied by revisions to the framing without doing damage to the empirical claims. If the authors prefer a different remedy that avoids circular definitions of terms, that's fine.
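
      As a rough illustration of the selection concern raised in point (5), the following minimal simulation sketch shows why statistics computed on the same data used to define clusters are inflated. It is not the authors' pipeline: the function name, the number of simulated "voxels", the trial count, and the threshold are all hypothetical choices made only for illustration, and every simulated voxel has a true effect of zero.

      ```python
      import random
      import statistics


      def simulate_selection_bias(n_voxels=2000, n_trials=20, threshold=0.3, seed=0):
          """Compare circular and independent effect estimates on pure noise.

          Every simulated voxel has a true effect of zero. Voxels are selected
          because their mean in half A exceeds `threshold`; the effect is then
          estimated either from the same half A (circular) or from an
          independent half B. All parameters are illustrative assumptions.
          """
          rng = random.Random(seed)
          selected_a, selected_b = [], []
          for _ in range(n_voxels):
              half_a = [rng.gauss(0.0, 1.0) for _ in range(n_trials)]
              half_b = [rng.gauss(0.0, 1.0) for _ in range(n_trials)]
              mean_a = statistics.fmean(half_a)
              if mean_a > threshold:  # selection uses half A only
                  selected_a.append(mean_a)
                  selected_b.append(statistics.fmean(half_b))
          circular = statistics.fmean(selected_a)     # same data as the selection
          independent = statistics.fmean(selected_b)  # held-out data
          return circular, independent


      circular, independent = simulate_selection_bias()
      print(f"circular estimate:    {circular:.3f}")
      print(f"independent estimate: {independent:.3f}")
      ```

      By construction the circular estimate must exceed the selection threshold even though no effect exists, whereas the held-out estimate stays near zero. This is the sense in which cluster statistics taken from the defining contrast can be distortionary.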

    1. Author response:

      We thank the reviewers for their thorough comments on our manuscript. We appreciate their recognition of the strengths in our study, including addressing the significant problem of neonatal sepsis in preterm infants using a preterm piglet model, the robustness of our multi-omics dataset, and our multi-pronged approach to examining the physiological changes under different glucose management regimens.

      This document presents our initial responses to the main concerns of the three reviewers. We will provide more detailed responses to their comments and revise the manuscript at a later date.

      In response to Reviewer #1, we acknowledge the concern about high blood glucose levels in the control group. This work is a follow-up to our previous work (Muk et al., JCI Insight 2022), where we explored different PN glucose regimens. Taken together, our experiments suggest a linear relationship between glucose provision and infection severity, indicating increased glucose may heighten mortality risk, while radical reduction could reduce mortality due to sepsis, but cause hypoglycemia and brain damage. As for the discrepancy in survival rates between Figures 1B and 6B, this is due to a shortened follow-up time in the follow-up experiment. This was done to minimize animal suffering because relevant differences in immune responses were detectable within 12 hours in the primary experiment. As for the relationship between bacterial burdens and glucose, we agree that lower bacterial density in piglets receiving the reduced glucose PN may result from slower bacterial growth. However, we analyzed the relationship between bacterial burdens and mortality and found that the two did not correlate within each of the treatment groups. This finding inspired us to further explore the relationship between bacterial burdens and infection responses in our model, which has resulted in our recent preprint: Wu et al. Regulation of host metabolism and defense strategies to survive neonatal infection. bioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      For Reviewer #2, the distinction between early (EOS) and late onset sepsis (LOS) in the time cut-off makes sense clinically because they are likely to be caused by different organisms and origins (EOS with maternal origin and LOS with postnatal origin) and therefore require different empirical antibiotic regimens. However, it is also important to acknowledge that the pathophysiology of “sepsis” may be similar regardless of timing and pathogen, and depends on the degree of immune activation. Therefore, even though the infection in our model is initiated on the first day after birth, the organism that we use, Staphylococcus epidermidis (the most common bacterium detected in LOS), makes it a better model for LOS. As for neutrophil-specific transcripts, we only collected the whole blood transcript during the experiments, which reflects the transcriptomic profile of all the leucocytes. Since we did not do single-cell RNA sequencing during the experiment, there is no possibility of isolating the neutrophil transcriptome at this time. As for the question of a “safe glucose infusion rate”, there likely is none, as the immune responses to glucose intake do not seem binary but increase with glucose intake. Our reduced glucose PN was chosen as it corresponded with the low end of recommended guidelines for PN glucose intake. However, the reduced glucose intervention still resulted in significant morbidity and a 25% mortality within 22 hours. There is therefore still vast room for improvement, but even though further reduction in PN glucose intake would probably provide further protection, it would entail dangerous hypoglycemia. The findings in this paper have prompted us to explore several alternative strategies to both reduce infection-related mortality and maintain glucose homeostasis. Thus, the optimal PN for infected newborns would probably differ from standard PN in all macronutrient compartments and will require much more preclinical and clinical research.

      For Reviewer #3, we acknowledge the variability in data collected from animals at euthanasia. These endpoints represent snapshots of the animals' states at euthanasia, which is a clear limitation of our method. Therefore, we do not know what metabolic processes precede the development of lethal sepsis, although the increases in plasma lactate suggest a higher rate of glycolysis in animals on high glucose PN. However, we believe the data still heavily imply a causal relationship between energy metabolic processes, especially glycolytic breakdown of glucose, and the pro-inflammatory responses leading to sepsis. In our recent preprint mentioned above we further explored the metabolic responses in pigs that succumbed to sepsis, compared to those that survived and found that survival was strongly associated with increases in mitochondrial metabolism and reduction in glycolysis.

      We hope these clarifications and our commitment to further research address your concerns satisfactorily. Thank you for your valuable feedback.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      “but an obvious influencing factor that the authors could investigate in their own data set is the retinal input. In Fig1b, the authors even show these data in the form of gaze and pupil size. In these example data, by eye, it looks like the pupil size is positively correlated with the run speed. This would of course have large consequences on the activity in V1, but the authors do not do anything with these data. The study would improve substantially if the authors would correlate their run speed traces with other factors that they have recorded too, such as pupil size and gaze.”

      Absolutely. We have added a first level of eye movement (and pupil size) analyses to the revised manuscript, resulting in an additional figure. In short, we found that eye movements are unlikely to play a significant role in our primary results, as the patterns of eye movements differed only slightly between running and stationary periods, and the measured impacts of such eye movements were also quantitatively much smaller than the primary effect sizes.

      We also note that, in analyzing the eye movements, we found that pupil size was larger during running than during stationary periods. This is suggestive evidence that running is correlated with increases in arousal. Although more work will be needed to calibrate and quantify how much this factor affects neural responses (and perhaps to dissociate it from running per se), the simple analysis we present suggests that the large differences we observe are unlikely to be explained by a difference between how arousal and running are correlated in the monkey versus the mouse. Instead, it appears that both species have at least qualitatively similar relations between pupil size (a standard proxy for arousal) and running.

      On this issue, we have added extensive discussion of the relevant recent work by Talluri et al. (2023) who attempted a similar cross-species analysis that considered spontaneous body movements and their effect on cortical activity (as well as the possibility that eye movements are a critical mediator in these modulations). Due to delays in revising our manuscript, we regret that our earlier submission had not cited this work, but we now do our best to highlight its importance and the synergy between these two papers. The full citation is listed below:

      Talluri BC, Kang I, Lazere A, Quinn KR, Kaliss N, Yates JL, Butts DA, Nienborg H. Activity in primate visual cortex is minimally driven by spontaneous movements. Nat Neurosci. 2023 Nov;26(11):1953-1959. doi: 10.1038/s41593-023-01459-5.

      There is a finer level of analysis that we hope to do in the future along these lines. It would rely on detailed characterization of each receptive field, building an image-computable model linking those receptive fields to the neural activity, and doing so at a finer time grain that links individual eye movements and changes in the spike train within a stimulus presentation (as opposed to working at the level of spike counts per stimulus presentation). Because these steps need to be accomplished together— and each requires substantial additional work and would go beyond the first-order findings we report in this work— we hope to report on such finer analyses in a standalone paper later. We are working on being able to do this in both marmoset and mouse.

      More generally, we want to emphatically agree that what is missing from this paper is the “why?”! We have done our best to show that a fair comparison reveals quantitatively different phenomena in marmoset and mouse. In the revised discussion, we lay out many lines of work that we hope will gain traction on this deeper mechanistic point. There’s a lot to do, and several of the possibilities are already current topics of exploration in our ongoing work.

      “Looking at the raster plot, however, shows that this strong positive correlation must be due entirely to the lower half of the neurons significantly increasing their firing rate as the mouse starts to run; in fact, the upper 25% or so of the neurons show exactly the opposite (strong suppression of the neurons as the mouse starts running). It would be more balanced if this heterogeneity in the response is at least mentioned somewhere in the text.”

      We are also intrigued by the heterogeneity of effects at the single neuron level. That is why the next section of the paper is dedicated to analyzing effects on a cell-by-cell basis. The fractions of neurons showing either increases or decreases are described separately, to get at this very issue.

      Reviewer 2 (Public Review):

      “For example, it is known that the locomotion gain modulation varies with layer in the mouse visual cortex, with neurons in the infragranular layers expressing a diversity of modulations (Erisken et al. 2014 Current Biology). However, for the marmoset dataset, it was not reported from which cortical layer the neurons are from, leaving this point unanswered.”

      Reviewer 2 called for more consideration of details that have been addressed in the mouse literature, such as the cortical layer of the cells, and related aspects of circuitry. We have greatly re-worked the Discussion to address several of these issues. In short, the manuscript’s set of data were collected without strong traction on layers or cell types, and it will be quite interesting to get a better handle on this using both refinements to our recording procedures as well as new techniques that are now possible in the marmoset for future studies.

      “In this regard, it is worth noting that the authors report an interesting difference between the foveal and peripheral parts of the visual cortex in marmoset. It will be interesting to investigate these differences in more detail in future studies. Likewise, while running might be an important behavioral state for mice, other behavioral states might be more relevant for marmosets and do modulate the activity of the primate visual cortex more profoundly. Future work could leverage the opportunities that the marmoset model system offers to reveal new insights about behavioral-related modulation in the primate brain.”

      Same page! We have expanded the discussion to better emphasize these points and are already deep in follow-up experiments to explore the foveal and peripheral representations.

      Reviewer 3 (Public Review):

      “However, the authors did not take full advantage of the quantity and diversity of the marmoset visual cortex recordings in their analyses. They mention recording and analyzing the activity of peripheral V1 neurons but mainly present results involving foveal V1 neurons. Foveal neurons, with their small receptive fields strongly affected by precise eye position, would seem to be less likely to be comparable to rodent data. If the authors have a reason for not doing so, they should provide an explanation.”

      We agree, and hope the reviewer finds our overall reply, detailed response to Reviewer 1 (who raised a similar issue), and corresponding updates to the manuscript appropriate for this stage of understanding.

      “Given that the marmosets are motivated to run with liquid rewards, the authors should provide more context as to how this may or may not affect marmoset V1 activity. Additionally, the lack of consideration of eye movements or position presents a major absence for the marmoset results, and fails to take advantage of one of the key differences between primate and rodent visual systems - the marmosets have a fovea, and make eye movements that fixate in various locations on the screen during the task.”

      In addition to the response above, we have made edits to the manuscript to speak to issues of arousal and eye movements (also detailed in previous responses). Given the modest decrease in activity we see, the usual concerns about potential increases in neural activity related to eye movements (which we quantify in the revision) and other issues related to motivation are hard to specifically relate to existing literature. But in the revised Discussion we talk more about how future work can/should dissociate these factors, as has been done in the mouse literature.

      “Finally, the model provides a strong basis for comparison at the level of neuronal populations, but some methodological choices are insufficiently described and may have an impact on interpreting the claims.”

      We have also clarified the shared-gain model’s description, which we agree needed additional detail and clarification.

    2. eLife assessment

      This important work advances our understanding of the differences in locomotion-induced modulation in primate and rodent visual cortexes and underlines the significant contribution cross-species comparisons make to investigating brain function. The evidence in support of these differences across species is convincing. This work will be of broad interest to neuroscientists.

    3. Reviewer #1 (Public Review):

      More than ten years ago, it was shown that activity in the primary visual cortex of mice substantially increases when mice are running compared to when they are sitting still. This finding 'revolutionised' our thinking about visual cortex, turning away from it being a passive image processor and highlighting the influence of non-visual factors. The current study now for the first time repeats this experiment in marmosets. The authors find that in contrast to mice, marmoset V1 activity is slightly suppressed during running, and they relate this to differences in gain modulations of V1 activity between the two species.

      Strengths

      - Replication in primates of the original finding in mice partly took so long, because of the inherent difficulties with recording from the brain of a running primate. In fact one recent, highly related study on macaques looked at spontaneous limb movements as the macaque was sitting. The treadmill for the marmosets in the current study is a very elegant solution to the problem of running in primates. It allows for true replication of the 'running vs stationary' experiment and undoubtedly opens up many possibilities for other experiments recording from a head-fixed but active marmoset.
      - In addition to their own data in marmoset, the authors run their analyses on a publicly available data set in mouse. This allows them to directly compare mouse and marmoset findings, which significantly strengthens their conclusions.
      - Marmoset vision is fundamentally different from mouse vision as they have a fovea and make goal-directed eye movements. In this revised version of their paper, the authors acknowledge this and investigate the possible effect of eye movements and pupil size on the differences they find between running and stationary. They conclude that eye input does not explain all these differences.

      Significance

      The paper contributes interesting new evidence to the ongoing discussion about the influence of non-visual factors in general, and running in particular, on visual cortex activity. As such, it helps to extend this discussion beyond the rodent field and into primate research. The bigger question of *why* there are differences between rodents and primates still remains unanswered, but the authors do their best to provide possible explanations. The elegant experimental set-up of the marmoset on a treadmill will certainly add new findings to this issue in the years to come.

    4. Reviewer #2 (Public Review):

      This work aims at answering whether activity in primate visual cortex is modulated by locomotion, as was reported for mouse visual cortex. The finding that the activity in mouse visual cortex is modulated by running has changed the concept of primary sensory cortical areas. However, it was an open question whether this modulation generalizes to primates.

      To answer this fundamental question the authors established a novel paradigm in which a head-fixed marmoset was able to run on a treadmill while watching a visual stimulus on a display. In addition, eye movements and running speed were monitored continuously and extracellular neuronal activity in primary visual cortex was recorded using high-channel-count electrode arrays. This paradigm uniquely permitted investigation of whether locomotion modulates sensory evoked activity in visual cortex of marmoset. Moreover, to directly compare the responses in marmoset visual cortex to responses in mouse visual cortex the authors made use of a publicly-available mouse dataset from the Allen Institute. In this dataset the mouse was also running on a treadmill and observing a set of visual stimuli on a display. The authors took extra care to have the marmoset and mouse paradigms as comparable as possible.

      To characterize the visually driven activity the authors present a series of moving gratings and estimate receptive fields with sparse noise. To estimate the gain modulation by running the authors split the dataset into epochs of running and non-running which allowed them to estimate the visually evoked firing rates in both behavioral states.
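
      The epoch-splitting step described above can be sketched as follows. This is a minimal illustration only, not the authors' analysis: the function name, the speed threshold, and the particular modulation index are assumptions introduced here, since the exact criteria are not given in this review.

      ```python
      from statistics import fmean


      def running_modulation(spike_counts, run_speeds, speed_threshold=1.0):
          """Split trials by running speed and compare mean evoked responses.

          spike_counts    : per-trial spike counts for one neuron
          run_speeds      : per-trial mean treadmill speed (same length)
          speed_threshold : hypothetical cut-off separating running from
                            stationary trials (an assumption, not from the paper)

          Returns (mean_running, mean_stationary, index), where index is
          (running - stationary) / (running + stationary); positive values
          mean stronger responses during running.
          """
          running = [c for c, s in zip(spike_counts, run_speeds) if s > speed_threshold]
          stationary = [c for c, s in zip(spike_counts, run_speeds) if s <= speed_threshold]
          if not running or not stationary:
              raise ValueError("need at least one running and one stationary trial")
          mean_run = fmean(running)
          mean_stat = fmean(stationary)
          total = mean_run + mean_stat
          index = (mean_run - mean_stat) / total if total > 0 else 0.0
          return mean_run, mean_stat, index


      # Toy example: two running trials and two stationary trials.
      mean_run, mean_stat, index = running_modulation(
          spike_counts=[10, 12, 4, 6],
          run_speeds=[5.0, 6.0, 0.0, 0.1],
      )
      print(mean_run, mean_stat, index)
      ```

      A bounded index of this kind is one common way to summarise gain changes per neuron; a simple ratio of the two means would serve equally well for the comparison described here.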

      Strengths:

      The novel paradigm of head-fixed marmosets running on a treadmill while being presented with a visual stimulus is unique and ideally tailored to answering the question that the authors aimed to answer. Moreover, the authors took extra care to ensure that the paradigm in marmoset matched as closely as possible to the conditions in the mouse experiments such that the results can be directly compared. To directly compare their data the authors re-analyzed publicly available data from visual cortex of mice recorded at the Allen Institute. Such a direct comparison, and reuse of existing datasets, is another strong aspect of the work. Finally, the presented new marmoset dataset appears to be of high quality, the comparison between mouse and marmoset visual cortex is well done and the results and interpretation straightforward.

      Weaknesses:

      It is known that the locomotion gain modulation varies with layer in mouse visual cortex, with neurons in the infragranular layers expressing a diversity of modulations (Erisken et al. 2014 Current Biology). However, for the marmoset dataset the layer information was unfortunately not recorded, leaving this point open for future studies.

      Nonetheless, the aim of comparing the locomotion induced modulation of activity in primate and mouse primary visual cortex was convincingly achieved by the authors. The results shown in the figures support the conclusion that locomotion modulates the activity in primate and mouse visual cortex differently. While mice show a profound gain increase, neurons in primate visual cortex show little modulation or even a reduction in response strength.

      This work will have a strong impact on the field of visual neuroscience but also on neuroscience in general. It revives the debate of whether results obtained in the mouse model system can be simply generalized to other mammalian model systems, such as non-human primates. Based on the presented results, the comparison between the mouse and primate visual cortex is not as straightforward as previously assumed. This will likely trigger more comparative studies between mice and primates in the future, which is important and absolutely needed to advance our understanding of the mammalian brain.

      Moreover, the reported finding that neurons in primary visual cortex of marmosets do not increase their activity during running is intriguing, as it makes you wonder why neurons in the mouse visual cortex do so. The authors discuss a few ideas in the paper which can be addressed in future experiments. In this regard it is worth noting that the authors report an interesting difference between the foveal and peripheral part of the visual cortex in marmoset. It will be interesting to investigate these differences in more detail in future studies. Likewise, while running might be an important behavioral state for mice, other behavioral states might be more relevant for marmosets and do modulate the activity of primate visual cortex more profoundly. Future work could leverage the opportunities that the marmoset model system offers to reveal new insights about behavioral related modulation in the primate brain.

    5. Reviewer #3 (Public Review):

      Prior studies have shown that locomotion (e.g., running) modulates mouse V1 activity to a similar extent as visual stimuli. However, it's unclear if these findings hold in species with more specialized and advanced visual systems such as nonhuman primates. In this work, Liska et al. leverage population and single neuron analyses to investigate potential differences and similarities in how running modulates V1 activity in marmosets and mice. Specifically, they discovered that although a shared gain model could describe well the trial-to-trial variations of population-level neural activity for both species, locomotion more strongly modulated V1 population activity in mice. Furthermore, they found that at the level of individual units, marmoset V1 neurons, unlike mouse V1 neurons, experience suppression of their activity during running.

      A major strength of this work is the introduction and completion of primate electrophysiology recordings during locomotion. Data of this kind were previously limited, and this work moves the field forward in terms of data collection in a domain previously inaccessible in primates. Another core strength of this work is that it adds to a limited collection of cross-species data collection and analysis of neural activity at the single-unit and population level, attempting to standardize analysis and data collection to be able to make inferences across species. In particular, the findings on how the primate peripheral and foveal V1 representations functionally relate to and differ from the mouse V1 representations speak to the power of these cross-species comparisons.

      However, there are still some lingering potential extensions to this work, largely acknowledged by the authors. One of these extensions involves more detailed eye movement analysis within species, such as microsaccades in marmosets and the potential impact on marmoset V1 activity. In the mouse data, similar eye-related analyses were not possible, in part due to instability in the eye recordings of many mouse sessions that made it challenging to replicate partnered analyses for the marmosets. We agree with the authors' assessment that these analyses can be targeted in future work and still believe that the marmoset eye-movement findings provide novel insights that will inform future cross-species comparisons of the visual system. Furthermore, another important issue not fully explored is the possible effects of the reward scheme during marmoset locomotion on V1 activity. The authors note that, unlike their mice counterparts, the marmosets were encouraged to run via liquid rewards, given after subjects traversed a specific distance. While the authors discuss the changes in arousal present when marmosets were running, there are still some unanswered questions on how their reward scheme may affect biomarkers (e.g., pupil sizes) and marmoset V1 activity.

      Overall, the methods and data support the work's main claims. Single neuron and population level approaches demonstrate that the activity of V1 in mice and marmosets is categorically different. Since primate V1 is so diverse and differs from mouse V1, this presents important limitations on direct inferences from mouse V1 to primate V1. This work is a great step forward in the field, especially with the novel methodology of collecting neural activity from running primates.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a useful comparison of the dynamic properties of two RNA-binding domains. The data collection and analysis are solid, making excellent use of a suite of NMR methods. However, evidence to support the proposed model linking dynamic behavior to RNA recognition and binding by the tandem domains remains incomplete. The work will be of interest to biophysicists working on RNA-binding proteins.

      We thank eLife for taking the time and effort to review our manuscript. Evidence from the literature and our study shows a close correspondence between the dynamic behavior of dsRBDs and their dsRNA recognition and binding, which led us to propose a reasonable model. As already mentioned in the manuscript, we have been working on the suggested experiments to support our proposed model further.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Differential conformational dynamics in two type-A RNA-binding domains drive the double-stranded RNA recognition and binding," Chugh and co-workers utilize a suite of NMR relaxation methods to probe the dynamic landscape of the TAR RNA binding protein (TRBP) double-stranded RNA-binding domain 2 (dsRBD2) and compare these to their previously published results on TRBP dsRBD1. The authors show that, unlike dsRBD1, dsRBD2 is a rigid protein with minimal ps-ns or us-ms time scale dynamics in the absence of RNA. They then show that dsRBD2 binds to canonical A-form dsRNA with a higher affinity compared to dsRBD1 and does so without much alteration in protein dynamics. Using their previously published data, the authors propose a model whereby dsRBD2 recognizes dsRNA first and brings dsRBD1 into proximity to search for RNA bulge and internal loop structures.

      We thank the Reviewer for sending us an encouraging review. We have combined the findings reported in the literature with new ones that led us to propose the dsRNA-binding model by tandem A-form dsRBDs.

      We propose that dsRBD1 can first recognize a variety of sequential and structurally different dsRNAs. dsRBD2 assists the interaction with a higher affinity, thus fortifying the interaction between TRBP and a possible substrate. This may enable the other associated proteins like Dicer and Ago2 to perform critical biological functions.

      However, we feel that a few statements in the comment above are factually incorrect.

      Statement 1. “They then show that dsRBD2 binds to canonical A-form dsRNA with a higher affinity compared to dsRBD1 and does so without much alteration in protein dynamics.”

      We have explicitly shown the perturbation in dsRBD2 dynamics upon RNA binding.

      Statement 2. “Using their previously published data, the authors propose a model whereby dsRBD2 recognizes dsRNA first and brings dsRBD1 into proximity to search for RNA bulge and internal loop structures.”

      Our previously published data suggests that dsRBD1, owing to its high conformational dynamics in solution, is able to recognize a variety of structurally and sequentially different dsRNAs ([Paithankar et al., 2022]). dsRBDs preferably bind to the double-stranded region (minor-major-minor-groove) of an A-form RNA ([Acevedo et al., 2016]; [Vuković et al., 2014]) and do not search for bulge and internal loop structures as a part of the binding event. Even though dsRBDs preferably bind to the double-stranded region, they can still accommodate perturbation in the A-form helix due to mismatch and bulges with decreased binding affinity ([Acevedo et al., 2015]). However, it is a matter of future research to identify how much of a deviation from the A-form structure can be accommodated by the dsRBDs. The diffusion event observed in the literature ([Koh et al., 2013]) also does not show any direct implication for searching for bulge and internal loop structures.

      Strengths:

      The authors expertly use a variety of NMR techniques to probe protein motions over six orders of magnitude in time. Other NMR titration experiments and ITC data support the RNA-binding model.

      Weaknesses:

      The data collection and analysis are sound. The only weakness in the manuscript is the lack of context with the much broader field of RNA-binding proteins. For example, many studies have shown that RNA recognition motif (RRM) domains have similar dynamic characteristics when binding diverse RNA substrates. Furthermore, there was no discussion about the entropy of binding derived from ITC. It might be interesting to compare with dynamics from NMR.

      We understand the reviewer’s point that this study is focused on a dsRNA-binding mechanism rather than addressing the much broader field of RNA-binding. There are multiple challenges in finding a single mechanism that works for all RNA-binding proteins. For instance, RRM is a single-stranded RNA binding domain that is able to read out the substrate base sequence. RRM behaves entirely differently than the dsRBD in terms of target specificity. Besides, several other RNA-binding domains, like the KH-domain, Puf domains, Zinc finger domains, etc., showcase a unique RNA-binding behavior. Thus, it would be really difficult to draw a single rule of thumb for RNA-recognition behavior for all these diverse domains.

      Thank you for pointing out the entropy of binding from ITC. We have now included the entropy of binding discussion in the main text, page 7.

      Reviewer #2 (Public Review):

      Summary:

      Proteins that bind to double-stranded RNA regulate various cellular processes, including gene expression and viral recognition. Such proteins often contain multiple double-stranded RNA-binding domains (dsRBDs) that play an important role in target search and recognition. In this work, Chugh and colleagues have characterized the backbone dynamics of one of the dsRBDs of a protein called TRBP2, which carries two tandem dsRBDs. Using solution NMR spectroscopy, the authors characterize the backbone motions of dsRBD2 in the absence and presence of dsRNA and compare these with their previously published results on dsRBD1. The authors show that dsRBD2 is comparatively more rigid than dsRBD1 and claim that these differences in backbone motions are important for target recognition.

      Strengths:

      The strengths of this study are multiple solution NMR measurements to characterize the backbone motions of dsRBD2. These include 15N-R1, R2, and HetNOE experiments in the absence and presence of RNA and the analysis of these data using an extended-model-free approach; HARD-15N-experiments and their analysis to characterize the kex. The authors also report differences in binding affinities of dsRBD1 and dsRBD2 using ITC and have performed MD simulations to probe the differential flexibility of these two domains.

      Weaknesses:

      While it may be true that dsRBD2 is more rigid than dsRBD1, the manuscript lacks conclusive and decisive proof that such changes in backbone dynamics are responsible for target search and recognition and the diffusion of TRBP2 along the RNA molecule. To conclusively prove the central claim of this manuscript, the authors could have considered a larger construct that carries both RBDs. With such a construct, authors can probe the characteristics of these two tandem domains (e.g., semi-independent tumbling) and their interactions with the RNA. Additionally, mutational experiments may be carried out where specific residues are altered to change the conformational dynamics of these two domains. The corresponding changes in interactions with RNA will provide additional evidence for the model presented in Figure 8 of the manuscript. Finally, there are inconsistencies in the reported data between different figures and tables.

      We thank the reviewer for the comprehensive and insightful review. A larger construct carrying both RBDs was not used because of the multiple challenges pertaining to dynamics study by NMR spectroscopy (intrinsic R2 rates of the dsRBD1-dsRBD2 construct would be high, resulting in broadened peaks) as per our previous experience ([Paithankar et al., 2022]). There would be additional dynamics in that construct coming from domain-domain relative motions, and it is difficult to deconvolute the dynamics information. Further, the dsRNA needed to bind to this construct will be longer, causing further line broadening in NMR.

      Coming to mutational studies, careful design of domain mutants remains a challenge because the conformational dynamics in both domains are distributed throughout the backbone rather than only in the RNA-binding residues. The mutational studies would need an exhaustive number of mutations in the protein as well as the RNA to draw a parallel between binding and dynamics. Having said that, we are working on making such mutations in the protein (at several locations, to freeze the dynamics site-specifically) and the RNA (to change the shape of the dsRNA) to study this mechanism systematically, which is beyond the scope of this manuscript.

      The reviewer has rightly pointed out some subtle differences in the reported data between different figures and tables. These differences arise from the context in which we describe the data. For example, in Figure S4, we report the average relaxation rates and nOe values for only the residues we were able to analyze at both magnetic field strengths, 600 and 800 MHz, whereas in Figure 6 we compare the averages of the core (159-227) dsRBD residues at 600 MHz in the presence and absence of D12RNA. The differences, however, are minute and fall well within the error range.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments -

      In regards to ITC data, dsRBD1 does not bind canonical A-form RNA with high affinity. What is dsRBD1 and dsRBD2 affinity to the miR-16 RNA?

      We have not performed ITC-based studies with miR-16 RNA for the domains. The study by Acevedo et al. has shown the effect of lengths of Watson-Crick duplex RNAs upon TRBP2 dsRBD binding. In this study, they have compared the ds22 RNA to miRNA/miRNA* duplex. By using EMSA, they show that the Kd,app (μM) for dsRBD1 is 3.5±0.2 and for dsRBD2 is 1.7±0.1, indicating a higher affinity by the latter ([Acevedo et al., 2015]).

      What was the amount of time used for the 1H saturation in the heteronuclear NOE experiment? Based on the average T1 (1/1.44 s-1) = 0.69 s, a recovery delay of >7 s should have been used for this experiment.

      According to Cavanagh et al., the recovery/recycle delay should be greater than 5*1/R1 to make sure that 99% of the 1HN and 15N magnetizations are restored ([Cavanagh et al., Protein NMR Spectroscopy, Principles and Practice, 1995]). In our study, we have used a relaxation delay of 5 s, which is greater than 7*1/R1avg, thus ensuring that at least 99% of the 1HN and 15N magnetization is recovered.
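
      The arithmetic behind this delay can be checked with a short sketch. It assumes simple mono-exponential longitudinal recovery, 1 - exp(-t*R1), and uses only the average 15N R1 quoted by the reviewer (1.44 s-1) and the 5 s delay stated in the response; the function name is illustrative and is not taken from the manuscript.

```python
import math


def recovered_fraction(delay_s: float, r1_per_s: float) -> float:
    """Fraction of equilibrium magnetization restored after `delay_s`.

    Assumes mono-exponential recovery with rate R1 (a simplifying assumption).
    """
    return 1.0 - math.exp(-delay_s * r1_per_s)


R1_AVG = 1.44  # s^-1, average 15N R1 quoted in the review
DELAY = 5.0    # s, relaxation delay stated in the response

t1 = 1.0 / R1_AVG                  # average T1 in seconds
delay_in_t1 = DELAY / t1           # delay expressed in multiples of T1
fraction = recovered_fraction(DELAY, R1_AVG)

print(f"T1 = {t1:.2f} s")
print(f"delay = {delay_in_t1:.1f} x T1")
print(f"recovered fraction = {fraction:.4f}")
```

      Under this assumption the 5 s delay exceeds seven times the average 15N T1. The sketch does not address 1HN recovery, which depends on the proton T1 and is not quoted in this exchange.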

      Recommendations for improving writing and presentation -

      Figure 3 - The legend in panel C is incomplete.

      Figure 3 (Figure 4 in the revised manuscript) has been updated, and the legend now reads complete.

      Figures 3 E and F - The three views can be combined into one as is done in Figures 4 C and D.

      Thanks for the kind suggestion. We have depicted the kex in the three ranges to highlight the difference between the two domains at each range. Since there are three different exchange regimes with different populations, we believe this gives us an uncomplicated picture while classifying and comparing the dynamics between the two. Combining the three views into one becomes too overwhelming to visualize kex and population distribution in the protein.

      Figure 3 - The residues indicated in the text (e.g., R200, L212, and R224) should be indicated in panels E and F.

      We have marked the residues described in the text in Figure 4C (revised Figure 5C), and thus, they are not mentioned in Figures 3E and 3F (revised Figures 4E and 4F).

      The results and discussion put these findings into minimal context. Most comparisons are made between dsRBD1 and dsRBD2. What about other RNA-binding proteins? There is a wealth of structure/dynamics/functional data about RNA recognition motifs, which do exactly the same thing as described here but are missing.

      We understand the reviewer’s point that this study is focused on a dsRNA-binding mechanism rather than addressing the much broader field of RNA-binding. There are multiple challenges in finding a single mechanism that works for all RNA-binding proteins. For instance, RRM is a single-stranded RNA-recognition motif that can read out the substrate base sequence. RRM behaves entirely differently than the dsRBD in terms of sequence specificity. Besides, several other RNA-binding domains, like the KH-domain, Puf domains, Zinc-finger domains, etc., showcase a unique RNA-binding behaviour. Thus, with the current knowledge, it would not be possible to draw a single rule of thumb for RNA-recognition behaviour for all these diverse domains. Hence, the findings of this study are not comparable to those of other RNA-binding domains and are beyond the scope of this study.

      Results, page 8 - I'm not sure that allosteric quenching is appropriately invoked here. The amount of residues showing dynamics in the apo state is small and the number only moderately increases upon RNA binding. The observation that some residues show an increase and a neighboring residue shows a decrease (or vice versa) upon RNA binding could just be random with the small number of observations. This observation would be more convincing if it were happening to larger regions within the protein.

      We agree with the reviewer that the number of residues showing dynamics in the apo-state of the dsRBD2 is small when compared with that of dsRBD1, and the number only moderately increases upon RNA-binding. However, we believe it is important to invoke allosteric quenching, as all the new residues in which dynamics are induced lie in spatial proximity, as also observed in dsRBD1 ([Paithankar et al., 2022]). It serves not only to compare the differences and similarities between the two domains but also to highlight that this phenomenon is common to both type-A dsRBDs of TRBP.

      Minor corrections -

      Introduction, page 2 - The order parameter should be defined for non-NMR experts.

      Thank you for the suggestion. The definition of order parameter has now been included on page 2 of the revised manuscript.

      Introduction, page 2 - TRBP should be defined in the main text the first time used.

      We have now defined TRBP on page 2 of the revised manuscript, where it is used in the main text for the first time.

      Results, page 5 - The reference for the HARD experiment should be given earlier in that paragraph.

      Thank you for the suggestion. We have now referenced the HARD experiment earlier in the last paragraph on page 5 of the revised manuscript.

      Results, page 7 - What is the limiting amount of RNA used for the D12-bound dsRBD2 spin relaxation measurements?

      The limiting amount of RNA used for the D12-bound dsRBD2 spin relaxation measurements is 0.05 equivalent (RNA:Protein = 50 μM:1000 μM). It has now been included on page 7 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, NMR datasets are not consistent with one another (a few examples are listed below).

      Figures S4, 6, and Table S4: (a) It is unclear why relaxation data for certain residues are missing in Table S4 (e.g., S156, V168, E177, F192, etc.).

      We thank the reviewer for pointing this out. We have now reanalyzed the data for all the above-mentioned residues and other missing residues. In the revised manuscript, we have added the data for the above-mentioned residues like E177, R189, and many more N- and C-terminal residues. Unfortunately, for some residues like V168, S184, F192, S209, and L222, we witnessed severe peak broadening while measuring the R2 rates and/or nOe. Hence, data for V168, S184, F192, S209, and L222 are missing in Table S4. We have explicitly mentioned this in the table legends about missing data for a few residues.

      (b) The reported values are not consistent. For example, Figure S4 says that the average 15N-R2 rate is 10.85 +/- 0.36 s-1 whereas Figure 6 says the 15N-R2 rate is 11.02 +/- 0.39 s-1 for the same dataset.

      The superficial differences are present because of the context in which we are describing the data (now mentioned in the methods section on page 13). In Figure S4, we are talking about the average relaxation rates and nOe values for only the common residues we could analyze between two magnetic field strengths, 600 and 800 MHz. Whereas in Figure 6 (revised figure 3), we compare the averages of all the analyzed core dsRBD residues at 600 MHz in the presence and absence of D12RNA. The differences, however, are insignificant, falling well within the error range.
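
      As a hedged illustration of why the two averages differ, the sketch below averages the same per-residue values over two different residue subsets. All numbers are hypothetical placeholders, not the published relaxation rates; only the core range 159-227 comes from the text.

```python
from statistics import mean

# Hypothetical per-residue 15N-R2 values (s^-1) at 600 MHz; NOT the published data.
r2_600 = {159: 10.4, 170: 11.2, 185: 10.9, 189: 15.4, 200: 10.7, 212: 11.0}

# Hypothetical set of residues that could also be analyzed at 800 MHz.
analyzed_800 = {159, 170, 185, 200, 212}

CORE = range(159, 228)  # core dsRBD residues 159-227, as stated in the response

# Average over residues common to both field strengths (the Figure S4 convention).
common_avg = mean(v for r, v in r2_600.items() if r in analyzed_800)

# Average over all analyzed core residues at 600 MHz (the Figure 6 convention).
core_avg = mean(v for r, v in r2_600.items() if r in CORE)

print(f"common-residue average: {common_avg:.2f} s^-1")
print(f"core average:           {core_avg:.2f} s^-1")
```

      A single residue with a high R2 that is present in one subset but not the other is enough to shift the mean, which is also why one outlying value can appear in one figure and be absent from another.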

      (c) There is also a discrepancy in reported R2 values (at 600 MHz) in Table S4. It is unclear to me what the reported values are, as most of these are below 1 s-1.

      Thank you very much for pointing out our mistake here. The Table S4 seems to have the wrong values for R2 at 600 MHz. However, the raw data submitted to the BMRB as entry 52077 holds the correct information. We have now updated the Table S4.

      (d) It is also unclear as to why perfectly resolved residues (e.g., L230, A232, D234, etc.) have been omitted from these data (and other datasets such as 15N-CPMGs shown in Figure S6).

      The residues L230, A232, D234, etc., are the C-terminal residues of TRBP-dsRBD2 beyond the core (159-227 aa) fold of dsRBD. They have now been included in the revised figures S6 and S11 for completeness.

      (e) Figure 6 reports a 15N-R2 of 21 s-1 for one of the residues in the absence of RNA. This data point has been omitted from Figure S4.

      In Figure S4, we are talking about relaxation rates and nOe values only for the common residues we could analyze between the two magnetic field strengths, 600 and 800 MHz. Thus, that 15N-R2 value has been omitted.

      The S2 order parameters reported in Figures S5 and S10 are inconsistent with one another, as additional residues are shown in S10 (e.g., N159).

      Thank you for pointing it out. We have now reanalyzed the data for S2 order parameter and Rex by including more residues (e.g., N159, R189, etc) in the core and have updated both Figures S5 and S10. Please see the revised supplementary information.

      Tables S6 and S7 report values for residue R189. This residue has been omitted in every other dataset. Based on the 1H-15N HSQC spectrum shown in Figure S3, this residue gives a well-resolved crosspeak (which lies adjacent to V228). Can the authors explain why they omit data for this residue in Figures S4, 6, and Table S4?

      The reviewer is correct in pointing out that data for R189 is missing in the fast dynamics data, such as Figure S4, Figure 6 (revised figure 3), and Table S4. We have now reanalyzed our raw data and included data for R189 and other missing residues in our updated manuscript. Please see the revised figures S4 and 6 (revised figure 3) and the revised table S4.  

      Moreover, this residue lies in the loop2 region of this domain. Based on the MD simulations (Figure 2), this region is more flexible compared to the rest of the domain. Does the corresponding 15N-relaxation data support this claim?

      Yes, the apo 15N-relaxation data do strongly support this claim. R189 showed a higher than core average R2 rate (R189 = 15.44 +/- 0.69 s-1; core = 10.92 +/- 0.37 s-1) and a lower than core average nOe (R189 = 0.49 +/- 0.05; core = 0.73 +/- 0.03) which indicate a higher flexibility than the rest of the core (updated Figure 3 and Table S4). Additionally, the S2 order parameter for R189 was found to be 0.52 +/- 0.03, slightly lower than the core average of 0.59 +/- 0.03, indicating a more flexible region than the core (updated Table S14). Moreover, the dynamics parameters extracted from HARD experimental data using the geoHARD method for apo TRBP2-dsRBD2 shown in Table S18 depict a high kex value of 31748.72 +/- 955.20 Hz for R189. This supports the claim that this residue is highly flexible with a high exchange rate.

      Figure S9. I was not able to follow this dataset as the data points are not consistent between different residues.

      In Figure S9, the residue-wise peak intensities plotted against the RNA concentration indicate that line broadening was witnessed for all the core residues (irrespective of the initial peak intensity). Another interesting observation is that the terminal residues do not undergo the same line broadening as seen in the core residues.

      It is also unclear why residue G185 is highlighted.

      It is taken as an example and magnified to show the extent of line broadening. This is now explicitly mentioned in the figure caption in the revised supplementary information.

      It is also not clear exactly what the authors are trying to fit, as I see no chemical shift changes upon the addition of RNA (Fig. S8), and the equation used for data fitting (pg. 11) uses chemical shift changes (and not the changes in intensities).

      The same equation can be used to fit the chemical shift perturbation and peak intensity perturbation as a function of ligand concentration. Here, we have tried to fit the intensity perturbation. We have now modified the statement on page 11 in the revised manuscript.
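
      The equation on page 11 is not reproduced in this response, so the sketch below assumes the standard one-site (1:1) binding isotherm with ligand depletion, which may differ from the authors' actual expression. It is meant only to show how a single bound-fraction term can scale either a chemical shift change or an intensity change; the function names and numerical values are hypothetical.

```python
import math


def fraction_bound(p_total: float, l_total: float, kd: float) -> float:
    """Fraction of protein bound for 1:1 binding (assumed model).

    All concentrations must be in the same units.
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)


def observable(p_total: float, l_total: float, kd: float,
               free_value: float, bound_value: float) -> float:
    """Population-weighted observable between free and bound end points.

    The same form applies to a chemical shift or a peak intensity; only the
    end-point values differ. Treating intensity as a simple population
    average is itself an approximation.
    """
    fb = fraction_bound(p_total, l_total, kd)
    return free_value + fb * (bound_value - free_value)


# Hypothetical titration point: 50 units protein, 5 units RNA, Kd = 1 unit.
fb = fraction_bound(50.0, 5.0, 1.0)
# Hypothetical normalized intensity falling from 1.0 (free) to 0.2 (bound).
intensity = observable(50.0, 5.0, 1.0, free_value=1.0, bound_value=0.2)

print(f"fraction bound = {fb:.3f}")
print(f"predicted relative intensity = {intensity:.3f}")
```

      In a fit, the dissociation constant and the bound end-point value would be the adjustable parameters, with the measured intensities at each RNA concentration as the data.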

      Table S2: The ITC analysis reports an n value of ~3. Can authors elaborate as to what this means?

      The stoichiometry ~3 indicates the number of TRBP2-dsRBD2 domains that can interact with D12 RNA in a single binding event. The minimum binding register for dsRBDs is known to be >8 bp (12 bp for optimal binding) ([Ramos et al., 2000]), and one single domain only covers one-third of the face of the cylindrical RNA ([Masliah et al., 2018]). Hence, three dsRBD2 domains could interact with a 12-mer RNA in solution.

      The reported Kd values between the main text (page 7) and Figure 5 are not consistent with one another (one lists 1.18 uM while the other says 1.11 uM). Table S2 does not list the parameters for interactions between dsRBD1 and D12.

      Figure 5 (revised figure 6) depicts the information of a single isolated experiment out of a total of three, whereas in the main text, we say 1.18 μM as the average Kd value (table S2).

      Figure S4: The red axis should read "211" instead of "111".

      Thank you for your helpful insight. We have now changed it in the revised figure.

      Table S3 lists the structural motifs of the two dsRBDs, which are nearly identical to one another, and yet the manuscript claims that these are different (page 4, paragraph 1).

      We agree with the reviewer that the differences are minute but important, which we have tried to highlight in this paper. In particular, loop 2, critical for dsRNA-binding ([Masliah et al., 2012]), is 1 residue longer in dsRBD2, which may enhance substrate binding.

      Figure S8 shows severe signal attenuation for many residues upon the addition of 100 uM RNA. The most notable among these are residues M194, T195, and C196. Can the authors explain how they measure 15N-relaxation rates for these residues in the presence of 50 uM D12?

      First, we measured the 15N-relaxation rates for these residues in the presence of 50 μM D12 (RNA:Protein = 50 μM:1000 μM), corresponding to 0.05 equivalent RNA. This is less than the amount used for the HSQC-based titration shown in Figure S8, 0.1 equivalent RNA (RNA:Protein = 5 μM:50 μM), where we witness line broadening for residues like M194, T195, and C196. Second, we increased the overall protein concentration from 50 μM (used in HSQC-based titration) to 1000 μM (used in relaxation measurements) to ensure a better signal-to-noise ratio in all the spectra.

      Use the same coloring scheme for Figures S7 and S8.

      Thank you for the suggestion. We have now edited Figure S8 accordingly.

      Figures are often listed out-of-order, making it difficult to follow the manuscript.

      Thank you for the suggestion. We have now amended the main text to refer to the figures sequentially. While doing so, we have renumbered Figure 6 as Figure 3, Figure 3 as Figure 4, Figure 4 as Figure 5, and Figure 5 as Figure 6.

      Figure captions for the relaxation data should specify the temperature at which these datasets were collected.

      Thanks for the valuable suggestion. We have now added the temperature wherever applicable.

      References

      Acevedo R, Evans D, Penrod KA, Showalter SA. 2016. Binding by TRBP-dsRBD2 Does Not Induce Bending of Double-Stranded RNA. Biophys J 110:2610–2617. doi:10.1016/j.bpj.2016.05.012

      Acevedo R, Orench-Rivera N, Quarles KA, Showalter SA. 2015. Helical Defects in MicroRNA Influence Protein Binding by TAR RNA Binding Protein. PLoS ONE 10:e0116749. doi:10.1371/journal.pone.0116749

      Koh HR, Kidwell MA, Ragunathan K, Doudna JA, Myong S. 2013. ATP-independent diffusion of double-stranded RNA binding proteins.

      Masliah G, Barraud P, Allain FH-T. 2012. RNA recognition by double-stranded RNA binding domains: a matter of shape and sequence. Cell Mol Life Sci 70:1875–1895. doi:10.1007/s00018-012-1119-x

      Masliah G, Maris C, König SL, Yulikov M, Aeschimann F, Malinowska AL, Mabille J, Weiler J, Holla A, Hunziker J, Meisner‐Kober N, Schuler B, Jeschke G, Allain FH. 2018. Structural basis of siRNA recognition by TRBP double‐stranded RNA binding domains. EMBO J 37:e97089. doi:10.15252/embj.201797089

      Paithankar H, Tarang GS, Parvez F, Marathe A, Joshi M, Chugh J. 2022. Inherent conformational plasticity in dsRBDs enables interaction with topologically distinct RNAs. Biophys J 121:1038–1055. doi:10.1016/j.bpj.2022.02.005

      Protein NMR Spectroscopy, Principles and Practice, John Cavanagh, Wayne J. Fairbrother, Arthur G. Palmer III, and Nicholas J. Skelton. Academic Press, San Diego, 1995, 587 pages. ISBN: 0-12-164490-1. 1996. J Magn Reson, Ser B 113:277. doi:10.1006/jmrb.1996.0189

      Ramos A, Grünert S, Adams J, Micklem DR, Proctor MR, Freund S, Bycroft M, Johnston DS, Varani G. 2000. RNA recognition by a Staufen double‐stranded RNA‐binding domain. EMBO J 19:997–1009. doi:10.1093/emboj/19.5.997

      Vuković L, Koh HR, Myong S, Schulten K. 2014. Substrate Recognition and Specificity of Double-Stranded RNA Binding Proteins. Biochemistry 53:3457–3466. doi:10.1021/bi500352s

    1. Reviewer #3 (Public Review):

      Summary:

      The way an unavailable (distractor) alternative impacts decision quality is of great theoretical importance. Previous work, led by some of the authors of this study, had converged on a nuanced conclusion wherein the distractor can both improve (positive distractor effect) and reduce (negative distractor effect) decision quality, contingent upon the difficulty of the decision problem. In very recent work, Cao and Tsetsos (2022) reanalyzed all relevant previous datasets and showed that once distractor trials are referenced to binary trials (in which the distractor alternative is not shown to participants), distractor effects are absent. Cao and Tsetsos further showed that human participants heavily relied on additive (and not multiplicative) integration of rewards and probabilities.

      The present study by Wong et al. puts forward a novel thesis according to which interindividual differences in the way of combining reward attributes underlie the absence of detectable distractor effect at the group level. They re-analysed the 144 human participants and classified participants into a "multiplicative integration" group and an "additive integration" group based on a model parameter, the "integration coefficient", that interpolates between the multiplicative utility and the additive utility in a mixture model. They report that participants in the "multiplicative" group show a positive distractor effect while participants in the "additive" group show a negative distractor effect. These findings are extensively discussed in relation to the potential underlying neural mechanisms.

      Strengths:

      - The study is forward looking, integrating previous findings well, and offering a novel proposal on how different integration strategies can lead to different choice biases.
      - The authors did an excellent job in connecting their thesis with previous neural findings. This is a very encompassing perspective that is likely to motivate new studies towards better understanding of how humans and other animals integrate information in decisions under risk and uncertainty.
      - Although some aspects of the paper are very technical, methodological details are well explained and the paper is very well written.

      Weaknesses:

      - The authors quantify the distractor variable as "DV - HV", i.e., the relative distractor variable. Conclusions mostly hold when the distractor is quantified in absolute terms (as "DV", see also Cao & Tsetsos, 2023). However, it is not entirely clear why the impact of the distractor alternative is not identical when the distractor variable is quantified in absolute vs. relative terms. Although understanding this nuanced point seems to extend beyond the scope of the paper, it could provide valuable decision-theoretic (and mechanistic) insights.
      - The central finding of this study is that participants who integrate reward attributes multiplicatively show a positive distractor effect while participants who integrate additively show a negative distractor effect. This is a very interesting and intriguing observation. However, it does not explain why the integration strategy covaries with the direction of the distractor effect. As the authors acknowledge, the composite model is not explanatory. Although beyond the scope of this paper, it would be valuable to provide a mechanistic explanation of this covariation pattern.

    2. eLife assessment

      This manuscript provides a valuable demonstration that distractor effects in multi-attribute decision-making correlate with the form of attribute integration (additive vs. multiplicative). The evidence supporting the conclusions is convincing, but there are questions about how to interpret the findings. The manuscript will be interesting to decision-making researchers in neuroscience, psychology, and related fields.

    3. Reviewer #1 (Public Review):

      Summary:

      The current study provided a follow-up analysis using published datasets focused on the individual variability of both the distraction effect (size and direction) and the attribute integration style, as well as the association between the two. The authors tried to answer the question of whether the multiplicative attribute integration style concurs with a more pronounced and positively oriented distraction effect.

      Strengths:

      The analysis extensively examined the impacts of various factors on decision accuracy, with particular focus on using two-option trials as control trials, following the approach established by Cao & Tsetsos (2022). The statistical significance results were clearly reported.

      The authors meticulously conducted supplementary examinations, incorporating the additional term HV+LV into GLM3. Furthermore, they replaced the utility function from the expected value model with values from the composite model.

      Weaknesses:

      The authors did a great job addressing the weaknesses I raised in the previous round of review, except on the generalizability of the current result in the larger context of multi-attribute decision-making. It is not really a weakness of the manuscript but more of a limitation of the studied topic, so I want to keep this comment for public readers.

      The reward magnitude and probability information are displayed using rectangular bars of different colors and orientations. Would that bias subjects to choose an additive rule instead of the multiplicative rule? Also, could the conclusion be extended to other decision contexts such as quality and price, where a multiplicative rule is hard to formulate?

      Overall, the authors have achieved their aims after clarifying that the study was trying to establish a correlation between the integration style and the distractor effect. This result may be useful to inspire neuroimaging or neuromodulation studies that investigate multi-attribute decision making.

    4. Reviewer #2 (Public Review):

      This paper addresses the empirical demonstration of "distractor effects" in multi-attribute decision-making. It continues a debate in the literature on the presence (or not) of these effects, which domains they arise in, and their heterogeneity across subjects. The domain of the study is in a particular type of multi-attribute decision-making: choices over risky lotteries. The paper reports a re-analysis of lottery data from multiple experiments run previously by the authors and other labs involved in the debate.

      Methodologically, the analysis assumes a number of simple forms for how attributes are aggregated (additively, or multiplicatively, or both) and then applies a "reduced form" logistic regression to the choices with a number of interaction terms intended to control for various features of the choice set. One of these interactions, modulated by ternary/binary treatment, is interpreted as a "distractor effect."

      The claimed contribution of the re-analysis is to demonstrate correlation in the strength/sign of this treatment effect with another estimated parameter: the relative mixture of additive/multiplicative preferences.

      Major Issues

      (1) How to Interpret GLM 1 and 2

      This paper, and others before it, have used a binary logistic regression with a number of interaction terms to attempt to control for various features of the choice set and how they influence choice. It is important to recognize that this modelling approach is not derived from a theoretical claim about the form of the computational model that guides decision-making in this task, nor an explicit test for a distractor effect. This can be seen most clearly in the equations after line 321 and their corresponding log-likelihood after 354, which contain no parameter or test for "distractor effects". Rather, the computational model assumes a binary choice probability, and then shoehorns the test for distractor effects via a binary/ternary treatment interaction in a separate regression (GLM 1 and 2). This approach has already led to multiple misinterpretations in the literature (see Cao & Tsetsos, 2022; Webb et al., 2020). One of these misinterpretations occurred in the datasets the authors study, in which the lottery stimuli contained a confound with the interaction that Chau et al. (2014) were interpreting as a distractor effect (GLM 1). Cao & Tsetsos (2022) demonstrated that the interaction was significant in binary choice data from the study; therefore it cannot be caused by a third alternative. This paper attempts to address this issue with a further interaction with the binary/ternary treatment (GLM 2). Therefore the difference in the interaction across the two conditions is claimed to now be the distractor effect. The validity of this claim brings us to what exactly is meant by a "distractor effect."
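To make the structure of this argument concrete, the following is a minimal sketch of a reduced-form logit with a binary/ternary treatment interaction. It is not the authors' GLM 1 or GLM 2: the regressor names, the coefficient values, and the function `p_choose_hv` are hypothetical and chosen only for illustration. The sketch shows that, in such a regression, the "distractor effect" exists only as the coefficient on the treatment interaction.

```python
import math

def p_choose_hv(hv, lv, d, ternary, beta):
    """Reduced-form logit probability of choosing HV over LV.

    All regressor names are hypothetical labels for this sketch.
    `ternary` is 1 when a distractor was shown and 0 on matched two-option trials.
    """
    diff = hv - lv  # value difference between the two available options
    dist = d - hv   # (matched) distractor value relative to HV
    regressors = {
        "intercept": 1.0,
        "diff": diff,
        "dist": dist,
        "diff_x_dist": diff * dist,                      # GLM 1 style interaction
        "ternary": ternary,
        "diff_x_dist_x_ternary": diff * dist * ternary,  # GLM 2 style treatment interaction
    }
    z = sum(beta[name] * value for name, value in regressors.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: no treatment interaction versus a non-zero one.
beta_null = {"intercept": 0.0, "diff": 4.0, "dist": 0.5, "diff_x_dist": -2.0,
             "ternary": 0.0, "diff_x_dist_x_ternary": 0.0}
beta_effect = dict(beta_null, diff_x_dist_x_ternary=3.0)

p_binary_null = p_choose_hv(0.6, 0.4, 0.2, 0, beta_null)
p_ternary_null = p_choose_hv(0.6, 0.4, 0.2, 1, beta_null)
p_binary_effect = p_choose_hv(0.6, 0.4, 0.2, 0, beta_effect)
p_ternary_effect = p_choose_hv(0.6, 0.4, 0.2, 1, beta_effect)
```

With the treatment coefficients set to zero, the binary and ternary predictions coincide; they separate only when the three-way coefficient is non-zero. Nothing in this specification says which computational model would generate such a coefficient, which is the interpretive gap raised here.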

      The paper begins by noting that "Rationally, choices ought to be unaffected by distractors" (line 33). This is not true. There are many normative models which allow for the value of alternatives (even low-valued "distractors") to influence choices, including a simple random utility model. Since Luce (1959), it has been known that the axiom of "Independence of Irrelevant Alternatives" (that the probability ratio between any two alternatives not depend on a third) is an extremely strong axiom, and only a sufficiency axiom for a random utility representation (Block and Marschak, 1959). It is not a necessary condition of a utility representation, and if this is our definition of rational (which is highly debatable), not necessary for it either. Countless empirical studies have demonstrated that IIA is falsified, and a large number of models can address it, including a simple random utility model with independent normal errors (i.e. a multivariate Probit model). In fact, it is only the multinomial Logit model that imposes IIA. It is also why so much attention is paid to the asymmetric dominance effect, which is a violation of a necessary condition for random utility (the Regularity axiom).
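The IIA property of the multinomial Logit model can be checked numerically. The sketch below uses arbitrary, made-up utilities and a hypothetical helper `logit_probs`; it only illustrates that the probability ratio between two options is unchanged when a third option is added under Logit, and says nothing about the datasets discussed here.

```python
import math

def logit_probs(utilities, scale=1.0):
    """Multinomial Logit choice probabilities (softmax of utilities / scale)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((u - top) / scale) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Arbitrary illustrative utilities for HV, LV and a low-valued distractor.
u_hv, u_lv, u_d = 1.0, 0.5, 0.2

binary = logit_probs([u_hv, u_lv])
ternary = logit_probs([u_hv, u_lv, u_d])

ratio_binary = binary[0] / binary[1]
ratio_ternary = ternary[0] / ternary[1]

# Relative accuracy, P(HV) / (P(HV) + P(LV)), is likewise unchanged under Logit.
rel_acc_binary = binary[0] / (binary[0] + binary[1])
rel_acc_ternary = ternary[0] / (ternary[0] + ternary[1])
```

Both ratios equal exp(u_hv - u_lv) regardless of the distractor's utility. Other random utility specifications, such as a Probit with independent normal errors, do not share this invariance, so a change in relative accuracy between binary and ternary trials does not by itself rule them out.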

      So what do the authors even mean by a "distractor effect"? It is true that the form of IIA violations (i.e. their path through the probability simplex as the low-option varies) tells us something about the computational model underlying choice (after all, different models will predict different patterns). But we do not know how the interaction terms in the binary logit regression relate to the pattern of the violations because there is no formal theory that relates them. Any test for relative value coding is a joint test of the computational model and the form of the stochastic component (Webb et al., 2020). These interaction terms may simply be picking up substitution patterns that can be easily reconciled with some form of random utility. While we cannot check all forms of random utility in these datasets (because the class of such models is large), this paper doesn't even rule any of these models out.

      (2) How to Interpret the Composite (Mixture) model?

      On the other side of the correlation are the results from the mixture model for how decision-makers aggregate attributes. The authors report that most subjects are best represented by a mixture between additive and multiplicative aggregation models. The authors justify this with the proposal that these values are computed in different brain regions and then aggregated (which is reasonable, though raises the question of "where" if not the mPFC). But an equally reasonable interpretation is that the improved fit of the mixture model simply reflects a misspecification of two extreme aggregation processes (additive and EV), so the log-likelihood is maximized at some point in between them.

      One possibility is a model with utility curvature. How much of this result is just due to curvature in valuation? There are many reasonable theories for why we should expect curvature in utility for human subjects (for example, limited perception: Robson, 2001; Khaw, Li & Woodford, 2019; Netzer et al., 2022) and of course many empirical demonstrations of risk aversion for small stakes lotteries. The mixture model, on the other hand, has parametric flexibility.

      There is also a large literature on testing expected utility jointly with stochastic choice, and the impact of these assumptions on parameter interpretation (Loomes & Sugden, 1998; Apesteguia & Ballester, 2018; Webb, 2019). This relates back to the point above: the mixture may reflect the joint assumption of how choice departs from deterministic EV.
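The two extreme aggregation rules and a mixture between them can be written down in a few lines. This is a sketch under stated assumptions, not the authors' composite model: attributes are assumed to be normalised to [0, 1], and the mixing parameter `eta`, the attribute weight `weight`, and all function names are hypothetical.

```python
def additive_value(magnitude, probability, weight=0.5):
    """Additive rule: weighted sum of the two (normalised) attributes."""
    return weight * magnitude + (1.0 - weight) * probability

def multiplicative_value(magnitude, probability):
    """Multiplicative rule: expected value of the lottery."""
    return magnitude * probability

def composite_value(magnitude, probability, eta, weight=0.5):
    """Mixture: eta = 1 is purely multiplicative, eta = 0 purely additive."""
    return (eta * multiplicative_value(magnitude, probability)
            + (1.0 - eta) * additive_value(magnitude, probability, weight))

# Hypothetical lottery with normalised magnitude 0.8 and probability 0.5.
v_add = additive_value(0.8, 0.5)
v_mult = multiplicative_value(0.8, 0.5)
v_mix = composite_value(0.8, 0.5, eta=0.5)
```

Because the mixture nests both extremes, it can only fit at least as well as either one. An interior estimate of `eta` is therefore what one would expect whenever the true valuation rule is neither exactly additive nor exactly EV, whatever that rule is.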

      (3) So then how should we interpret the correlation that the authors report?

      On one side we have the impact of the binary/ternary treatment which demonstrates some impact of the low value alternative on a binary choice probability. This may reflect some deep flaw in existing theories of choice, or it may simply reflect some departure from purely deterministic expected value maximization that existing theories can address. We have no theory to connect it to, so we cannot tell. On the other side of the correlation we have the mixture between additive and multiplicative preferences over risk. This result may reflect two distinct neural processes at work, or it may simply reflect a misspecification of the manner in which humans perceive and aggregate attributes of a lottery (or even just the stimuli in this experiment) by these two extreme candidates (additive vs. EV). Again, this would entail some departure from purely deterministic expected value maximization that existing theories can address.

      It is entirely possible that the authors are reporting a result that points to the more exciting of these two possibilities. But it is also possible (and perhaps more likely) that the correlation is more mundane. The paper does not guide us to theories that predict such a correlation, nor reject any existing ones. In my opinion, we should be striving for theoretically-driven analyses of datasets, where the interpretation of results is clearer.

      (4) Finally, the results from these experiments might not have external validity for two reasons. First, the normative criterion for multi-attribute decision-making differs depending on whether the attributes are lotteries or not (i.e. multiplicative vs additive). Whether it does so for humans is a matter of debate. Therefore if the result is unique to lotteries, it might not be robust for multi-attribute choice more generally. The paper largely glosses over this difference and mixes literature from both domains. Second, the lottery information was presented visually and there is literature suggesting this form of presentation might differ from numerical attributes. Which is more ecologically valid is also a matter of debate.

      Minor Issues:

      The definition of EV as a normative choice baseline is problematic. The analysis requires that EV is the normative choice model (this is why the HV-LV gap is analyzed and the distractor effect defined in relation to it). But if the binary/ternary interaction effect can be accounted for by curvature of a value function, this should also change the definition of which lottery is HV or LV for that subject!
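A small numerical example illustrates how curvature can change which lottery counts as HV. The lotteries, the power utility form, and the exponent `alpha` below are all made up for this sketch and are not taken from the datasets or from the authors' model.

```python
def expected_value(magnitude, probability):
    """Linear (risk-neutral) valuation."""
    return probability * magnitude

def curved_value(magnitude, probability, alpha):
    """Expected utility with power curvature; alpha < 1 is concave."""
    return probability * magnitude ** alpha

# Two hypothetical lotteries, given as (normalised magnitude, probability).
lotteries = {"A": (0.9, 0.3), "B": (0.4, 0.6)}

def label_hv(value_fn):
    """Return the name of the lottery with the higher value under value_fn."""
    return max(lotteries, key=lambda name: value_fn(*lotteries[name]))

hv_under_ev = label_hv(expected_value)
hv_under_curvature = label_hv(lambda m, p: curved_value(m, p, alpha=0.5))
```

Under expected value, lottery A is the high-value option (0.27 against 0.24); with a concave exponent of 0.5 the ordering reverses and B is valued higher. For a subject with this curvature, any regressor built on the EV-based HV/LV labels, including the HV-LV gap, would be mislabelled on such trials.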

      Comments on latest version: the authors did respond to some of my comments with discussion points in the paper.

      References

      Apesteguia, J. & Ballester, M. Monotone stochastic choice models: The case of risk and time preferences. Journal of Political Economy (2018).

      Block, H. D. & Marschak, J. Random Orderings and Stochastic Theories of Responses. Cowles Foundation Discussion Papers (1959).

      Khaw, M. W., Li, Z. & Woodford, M. Cognitive Imprecision and Small-Stakes Risk Aversion. Rev. Econ. Stud. 88, 1979-2013 (2020).

      Loomes, G. & Sugden, R. Testing Different Stochastic Specifications of Risky Choice. Economica 65, 581-598 (1998).

      Luce, R. D. Individual Choice Behaviour. (John Wiley and Sons, Inc., 1959).

      Netzer, N., Robson, A. J., Steiner, J. & Kocourek, P. Endogenous Risk Attitudes. SSRN Electron. J. (2022) doi:10.2139/ssrn.4024773.

      Robson, A. J. Why would nature give individuals utility functions? Journal of Political Economy 109, 900-914 (2001).

      Webb, R. The (Neural) Dynamics of Stochastic Choice. Manage Sci 65, 230-255 (2019).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      Campbell et al investigated the effects of light on the human brain, in particular the subcortical part of the hypothalamus during auditory cognitive tasks. The mechanisms and neuronal circuits underlying light effects in non-image forming responses are so far mostly studied in rodents but are not easily translated in humans. Therefore, this is a fundamental study aiming to establish the impact light illuminance has on the subcortical structures using the high-resolution 7T fMRI. The authors found that parts of the hypothalamus are differently responding to illuminance. In particular, they found that the activity of the posterior hypothalamus increases while the activity of the anterior and ventral parts of the hypothalamus decreases under high illuminance. The authors also report that the performance of the 2-back executive task was significantly better in higher illuminance conditions. However, it seems that the activity of the posterior hypothalamus subpart is negatively related to the performance of the executive task, implying that it is unlikely that this part of the hypothalamus is directly involved in the positive impact of light on performance observed. Interestingly, the activity of the posterior hypothalamus was, however, associated with an increased behavioural response to emotional stimuli. This suggests that the role of this posterior part of the hypothalamus is not as simple regarding light effects on cognitive and emotional responses. This study is a fundamental step towards our better understanding of the mechanisms underlying light effects on cognition and consequently optimising lighting standards. 

      Strengths: 

      While it is still impossible to distinguish individual hypothalamic nuclei, even with the high-resolution fMRI, the authors split the hypothalamus into five areas encompassing five groups of hypothalamic nuclei. This allowed them to reveal that different parts of the hypothalamus respond differently to an increase in illuminance. They found that higher illuminance increased the activity of the posterior part of the hypothalamus encompassing the MB and parts of the LH and TMN, while decreasing the activity of the anterior parts encompassing the SCN and another part of TMN. These findings are somewhat in line with studies in animals. It was shown that parts of the hypothalamus such as SCN, LH, and PVN receive direct retinal input in particular from ipRGCs. Also, acute chemogenetic activation of ipRGCs was shown to induce activation of LH and also increased arousal in mice.

      Weaknesses: 

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values in a high correlation with the melanopic EDI, therefore we can't distinguish if the effects seen here are driven by melanopsin and/or other photoreceptors. In order to provide a more mechanistic insight into the light-driven effects on cognition ideally one would use a silent substitution approach to distinguish between different photoreceptors. This may be something to consider when designing the follow-up studies. 

      Reviewer #1 (Recommendations For The Authors): 

      (1) As suggested in the public review more information regarding the reasons behind the chosen light condition is needed. 

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values in a high correlation with the melanopic EDI, therefore we can't distinguish if the effects seen here are driven by melanopsin or cone opsins. In order to provide a more mechanistic insight into the light-driven effects on cognition ideally one would use a silent substitution approach to distinguish between different photoreceptors. 

      (2) In support of this work, it was shown in mice that acute activation of ipRGCs using chemogenetics induces c-fos in some of the hypothalamic brain areas discussed here including LH (Milosavljevic et al, 2016 Curr Biol). Another study to consider including in the discussion is by Sonoda et al 2020 Science, in which the authors showed that a subset of ipRGCs release GABA. 

      (3) Figure 1 looks squashed, especially the axes. Also, Figure 2 looks somewhat blurry. I would suggest that the authors edit the figures to correct this.

      We thank the reviewer for their positive comments and agree with the weaknesses they pointed out. 

      (1) The explanation regarding the choice of the illuminance is now included in the revised manuscript (PAGE 17): “Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al., 2010, 2011). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.”

      The revised discussion makes clear that these choices limit the interpretation about the photoreceptors involved (PAGES 12-13): “We based our rationale and part of our interpretations on ipRGC projections, which have been demonstrated in rodents to channel the NIF biological impact of light and incorporate the inputs from rods and cones with their intrinsic photosensitivity into a light signal that can impact the brain (Güler et al., 2008; Tri & Do, 2019). Given the polychromatic nature of the light we used, classical photoreceptors and their projections to visual brain areas are, however, very likely to have directly or indirectly contributed to the modulation by light of the regional activity of the hypothalamus.”

      The discussion also points out the promises of silent substitution (PAGE 13): “Future human studies could isolate the contribution of each photoreceptor class to the impact of light on cognitive brain functions by manipulating prior light history (Chellappa et al., 2014) or through the use of silent substitutions between metameric light exposures (Viénot et al., 2012)”.

      (2) We now refer to the studies by Milosavljevic et al. and Sonoda et al. 

      PAGE 9: “Our data may therefore be compatible with an increase in orexin release by the LH with increasing illuminance. In line with this assumption, chemoactivation of ipRGCs lead to increase c-fos production, a marker of cellular activation, over several nuclei of the hypothalamus, including the lateral hypothalamus (Milosavljevic et al., 2016). If this initial effect of light we observe over the posterior part of the hypothalamus was maintained over a longer period of exposure, this would stimulate cognition and maintain or increase alertness (Campbell et al., 2023) and may also be part of the mechanisms through which daytime light increases the amplitude in circadian variations of several physiological features (Bano-Otalora et al., 2021; Dijk et al., 2012).”

      PAGE 10: “Chemoactivation of ipRGCs in rodents led to an increase activity of the SCN, over the inferior anterior hypothalamus, but had no impact on the activity of the VLPO, over the superior anterior hypothalamus (Milosavljevic et al., 2016). How our findings fit with these fine-grained observations and whether there are species-specific differences in the responses to light over the different part of the hypothalamus remains to be established.”

      PAGE 10: “In terms of chemical communication, these changes in activity could be the results of an inhibitory signal from a subclass of ipRGCs, potentially through the release of γ-aminobutyric acid (GABA), as a rodent study found that a subset of ipRGCs release GABA at brain targets including the SCN (and intergeniculate leaflet and ventral lateral geniculate nucleus), leading to a reduction in the ability of light to affect pupil size and circadian photoentrainment (Sonoda et al., 2020). Whatever the signalling of ipRGC, our finding over the anterior hypothalamus could correspond to a modification of GABA signalling of the SCN which has been reported to have excitatory properties, such that the BOLD signal changes we report may correspond to a reduction in excitation arising in part from the SCN (Albers et al., 2017).”

      (3) Figures 1 and 2 were modified. We hope their quality is now satisfactory. We are willing to provide separate figures prior to publication of the Version of Record.

      Reviewer #2 (Public Review): 

      Summary 

      The interplay between environmental factors and cognitive performance has been a focal point of neuroscientific research, with illuminance emerging as a significant variable of interest. The hypothalamus, a brain region integral to regulating circadian rhythms, sleep, and alertness, has been posited to mediate the effects of light exposure on cognitive functions. Previous studies have illuminated the role of the hypothalamus in orchestrating bodily responses to light, implicating specific neural pathways such as the orexin and histamine systems, which are crucial for maintaining wakefulness and processing environmental cues. Despite advancements in our understanding, the specific mechanisms through which varying levels of light exposure influence hypothalamic activity and, in turn, cognitive performance, remain inadequately explored. This gap in knowledge underscores the need for high-resolution investigations that can dissect the nuanced impacts of illuminance on different hypothalamic regions. Utilizing state-of-the-art 7 Tesla functional magnetic resonance imaging (fMRI), the present study aims to elucidate the differential effects of light on the hypothalamic dynamics and establish a link between regional hypothalamic activity and cognitive outcomes in healthy young adults. By shedding light on these complex interactions, this research endeavours to contribute to the foundational knowledge necessary for developing innovative therapeutic strategies aimed at enhancing cognitive function through environmental modulation. 

      Strengths: 

      (1) Considerable Sample Size and Detailed Analysis: The study leverages a robust sample size and conducts a thorough analysis of hypothalamic dynamics, which enhances the reliability and depth of the findings. 

      (2) Use of High-Resolution Imaging: Utilizing 7 Tesla fMRI to analyze brain activity during cognitive tasks offers high-resolution insights into the differential effects of illuminance on hypothalamic activity, showcasing the methodological rigor of the study. 

      (3) Novel Insights into Illuminance Effects: The manuscript reveals new understandings of how different regions of the hypothalamus respond to varying illuminance levels, contributing valuable knowledge to the field. 

      (4) Exploration of Potential Therapeutic Applications: Discussing the potential therapeutic applications of light modulation based on the findings suggests practical implications and future research directions. 

      Weaknesses: 

      (1) Foundation for Claims about Orexin and Histamine Systems: The manuscript needs to provide a clearer theoretical or empirical foundation for claims regarding the impact of light on the orexin and histamine systems in the abstract. 

      (2) Inclusion of Cortical Correlates: While focused on the hypothalamus, the manuscript may benefit from discussing the role of cortical activation in cognitive performance, suggesting an opportunity to expand the scope of the manuscript. 

      (3) Details of Light Exposure Control: More detailed information about how light exposure was controlled and standardized is needed to ensure the replicability and validity of the experimental conditions. 

      (4) Rationale Behind Different Exposure Protocols: To clarify methodological choices, the manuscript should include more in-depth reasoning behind using different protocols of light exposure for executive and emotional tasks. 

      Reviewer #2 (Recommendations For The Authors): 

      Attention to English language precision and correction of typographical errors, such as "hypothalamic nuclei" instead of "hypothalamus nuclei," is necessary for enhancing the manuscript.

      We thank the reviewer for recognising the interest and strength of our study.

      (1) As detailed in the discussion, we do believe orexin and histamine are excellent candidates for mediating the results we report. As also pointed out, however, we are in no position to know which neurons, nuclei, neurotransmitters and neuromodulators underlie the results. The last sentence of the abstract (PAGE 2) was therefore removed as we agree the statement was too strong. We carefully reconsidered the discussion and believe that no such overstatement was present.

      (2) Hypothalamic nuclei are connected to multiple cortical (and subcortical) structures. The relevance of these projections will vary with the cognitive task considered. In addition, we have not yet considered the cortex in our analyses such that truly integrating cortical structures appears premature.

      We nevertheless added the following short statement (PAGE 11): “Subcortical structures, and particularly those receiving direct retinal projections, including those of the hypothalamus, are likely to receive light illuminance signal first before passing on the light modulation to the cortical regions involved in the ongoing cognitive process (Campbell et al., 2023).”

      (3) We now include the following as part of the method section (PAGES 16-17): “Illuminance and spectra could not be directly measured within the MRI scanner due to the ferromagnetic nature of measurement systems. The coil of the MRI and the light stand, together with the lighting system were therefore placed outside of the MR room to reproduce the experimental conditions in a completely dark room. A sensor was placed 2 cm away from the mirror of the coil that is mounted at eye level, i.e. where the eye of the first author of the paper would be positioned, to measure illuminance and spectra. The procedure was repeated 4 times for illuminance and twice for spectra and measurements were averaged. This procedure does not take into account interindividual variation in head size and orbit shape such that the reported illuminance levels may have varied slightly across subjects. The relative differences between illuminance are, however, very unlikely to vary substantially across participants such that statistics consisting of tests for the impact of relative differences in illuminance were not affected. The detailed values reported in Supplementary Table 2 were computed combining spectra and illuminance using the Excel calculator associated with a published work (Lucas et al., 2014).”

      (4) The explanation regarding the choice of the illuminance is now included in the revised manuscript (PAGE 17): “Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al., 2010, 2011). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.”

      (5) The manuscript was thoroughly rechecked, and we hope to have spotted all typos and language errors.

      Reviewer #3 (Public Review): 

      Summary: 

      Campbell and colleagues use a combination of high-resolution fMRI, cognitive tasks, and different intensities of light illumination to test the hypothesis that the intensity of illumination differentially impacts hypothalamic substructures that, in turn, promote alterations in arousal that affect cognitive and affective performance. The authors find evidence in support of a posterior-to-anterior gradient of increased blood flow in the hypothalamus during task performance that they later relate to performance on two different tasks. The results provide an enticing link between light levels, hypothalamic activity, and cognitive/affective function; however, clarification of some methodological choices will help to improve confidence in the findings.

      Strengths: 

      * The authors' focus on the hypothalamus and its relationship to light intensity is an important and understudied question in neuroscience. 

      Weaknesses: 

      (1) I found it challenging to relate the authors' hypotheses, which I found to be quite compelling, to the apparatus used to test the hypotheses - namely, the use of orange light vs. different light intensities; and the specific choice of the executive and emotional tasks, which differed in key features (e.g., block-related vs. event-related designs) that were orthogonal to the psychological constructs being challenged in each task. 

      (4) Given the small size of the hypothalamus and the irregular size of the hypothalamic parcels, I wondered whether a more data-driven examination of the hypothalamic time series would have provided a more parsimonious test of their hypothesis. 

      Reviewer #3 (Recommendations For The Authors): 

      (1) The authors may wish to explain the importance of the orange light condition in the early section of the results -- i.e., when they first present the task structure. As it stands, I don't have a good appreciation of why the orange light was included -- was it a control condition? And if the differences between the light conditions (e.g., the narrow- vs. wide-band of light) were indeed ignored by focussing on the illuminance levels, are there any potential issues that the authors could then mitigate against with further experiments/analyses? 

      (2) Are there other explanations for why illuminance levels might improve cognitive performance? For instance, the capacity to more easily perceive the stimuli in an experiment could plausibly make it easier to complete a given task. If this is the case, can the authors conceptualise a way to rule out this hypothesis? 

      (3) Did the authors control for the differences in the number of voxels in each hypothalamic subregion? Or perhaps consider estimating the variance across voxels within the larger parcels, to determine whether the mean time series was comparable to the time series of the smaller parcels? 

      (4) An alternative strategy that would mitigate against the differences in the size of hypothalamic parcels would be to conduct analyses on the hypothalamus without parcellation, but instead using dimensionality reduction techniques to observe the natural spread of responses across the hypothalamus. From the authors' results, my intuition is that these analyses will lead to similar conclusions, albeit without any of the potential issues with respect to differently-sized parcels. 

      We thank the reviewer for acknowledging the originality and interest of our study. We agree that some methodological choices needed more explanation. We will address the weaknesses they pointed out as follows:

      (1) The explanation regarding the choice of the illuminance is now included in the revised manuscript (PAGE 17): “Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al., 2010, 2011). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.”

      The revised discussion makes clear that these choices limit the interpretation about the photoreceptors involved (PAGE 12-13): “We based our rationale and part of our interpretations on ipRGC projections, which have been demonstrated in rodents to channel the NIF biological impact of light and incorporate the inputs from rods and cones with their intrinsic photosensitivity into a light signal that can impact the brain (Güler et al., 2008; Tri & Do, 2019). Given the polychromatic nature of the light we used, classical photoreceptors and their projections to visual brain areas are, however, very likely to have directly or indirectly contributed to the modulation by light of the regional activity of the hypothalamus.”

      We further mention that (PAGE 13): “Furthermore, we cannot exclude that colour and/or spectral differences between the orange and 3 blue-enriched light conditions may have contributed to our findings. Research in rodent model demonstrated that variation in the spectral composition of light was perceived by the suprachiasmatic nucleus to set circadian timing (Walmsley et al., 2015). No such demonstration has, however, been reported yet for the acute impact of light on alertness, attention, cognition or affective state.”

      Regarding the choice of tasks, we added the following to the methods section (PAGE 18): “Prior work of our team showed that the n-back task and emotional task included in the present protocol were successful probes to demonstrate that light illuminance modulates cognitive activity, including within subcortical structures (though resolution did not allow precise isolation of nuclei or subparts) (e.g. (Vandewalle et al., 2007, 2010)). When taking the step of ultra-high-field imaging, we therefore opted for these tasks as our goal was to show that illuminance affects brain activity across cognitive domains while not testing for task-specific aspects of these domains.”

      We further added to the discussion (PAGE 8): “The pattern of light-induced changes was consistent across an executive and an emotional task which consisted of block and an event-related fMRI design, respectively. This suggests that a robust anterior-posterior gradient of activity modulation by illuminance is present in hypothalamus across cognitive domains.”

      (2) We are unsure what the reviewer refers to when stating that the experiment could make it easier to perceive a stimulus. Aside from the fact that illuminance can increase alertness and attention such that a stimulus may be better or more easily perceived/processed, we do not see how blocks of ambient light, i.e. a long-lasting visual stimulus, may render auditory stimulation (letters or pseudo-words in the present study) easier to perceive. To our knowledge, multimodal or cross-modal integration has been robustly demonstrated for short visual/auditory cues that would precede or accompany auditory/visual stimulation.

      We are willing to clarify this issue in the text if we receive additional explanation from the reviewer.

      (3) We added subpart size as a covariate in the analyses (instead of subpart number) and it did not affect the output of the statistical analyses (Author response table 1).

      For completeness, we further computed the standard deviation of the activity estimates of the voxels within each parcel for the main analysis of the n-back task and found a main effect of subpart (Author response table 2), indicating that the variability of the estimates varied across subparts. Post hoc contrasts and the display included in Author response image 1 show, however, that the differences were not related to subpart size per se. It is in fact the largest subpart (subpart 4) that shows the largest variability, while one of the smallest subparts (subpart 2) shows the lowest variability. Though it may have contributed, subpart size is therefore unlikely to explain our findings. We consider the analyses reported in Author response tables 1 and 2 and Author response image 1 as very technical and did not include them in the supplementary material for conciseness. If the reviewer judges it essential, we can reconsider our decision.

      While computing these analyses, we realized that there were errors in Table 1, which reports the statistical outcomes of the main analyses of the emotional task. The main statistical outputs remain the same except for a nominal main effect of the task (emotional vs. neutral) and the fact that post hoc tests now show a consistent difference between the posterior subpart (subpart 3) and all the other subparts, rather than all the other subparts except the superior tubular hypothalamus subpart (p-corrected = 0.09). We apologise for this slight error, whose origin we were unable to isolate. It does not modify the rest of the analyses (which were also rechecked) or the interpretations.

      Author response table 1.

      Recomputations of the main GLMMs using subpart sizes rather than subpart numbers as covariate of interest.

      Author response image 1.

      Activity estimate variability per hypothalamus subpart and subpart size.  

      Author response table 2.

      Difference in activity estimate standard deviation between hypothalamus subparts during the n-back task.

      Outputs of the generalized linear mixed model (GLMM) with subject as the random factor (intercept and slope), and task and subpart as repeated measures (ar(1) autocorrelation).

      * The corrected p-value for multiple comparisons over 2 tests is p < 0.025.

      # Refer to Fig.2A for correspondence of subpart numbers

      The text referring to Table 1 was modified accordingly (PAGE 5): “A nominal main effect of the task was detected for the emotional task [p = 0.049; Table 1] but not for the n-back task. For both tasks, there was no significant main effect for any of the other covariates and post hoc analyses showed that the index of the illuminance impact was consistently different in the posterior hypothalamus subpart compared to the other subparts [pcorrected ≤ 0.05]”.

      (4) We agree that a data-driven approach could have constituted an alternative means to test our hypothesis. We opted for the approach that we mastered best, while still allowing us to conclusively test for regional differences in activity across the hypothalamus. Examination of the time series of the very same data we used would mainly confirm the results of our analyses (an anterior-posterior gradient in the impact of illuminance), while it may yield slight differences in the borders of the subparts of the hypothalamus undergoing decreased or increased activity with increasing illuminance. While the suggested approach may have been envisaged if we had been facing negative results (i.e. no differences between subparts, potentially because subparts would not reflect functional differences in response to illuminance change), it would constitute a circular confirmation of our main findings (i.e. using the same data). While we truly appreciate the suggestion, we do not consider that it would constitute a more parsimonious test of our hypothesis, now that we have successfully applied GLM/parcellation and GLMM approaches.

      We added the following statement to the discussion to take this comment into account (PAGE 12): “Future research may consider data-driven analyses of hypothalamus voxels time series as an alternative to the parcellation approach we adopted here. This may refine the delineation of the subparts of the hypothalamus undergoing decreased or increased activity with increasing illuminance.”

      Response references

      Albers, H. E., Walton, J. C., Gamble, K. L., McNeill, J. K., & Hummer, D. L. (2017). The dynamics of GABA signaling: Revelations from the circadian pacemaker in the suprachiasmatic nucleus. Frontiers in Neuroendocrinology, 44, 35–82. https://doi.org/10.1016/J.YFRNE.2016.11.003

      Bano-Otalora, B., Martial, F., Harding, C., Bechtold, D. A., Allen, A. E., Brown, T. M., Belle, M. D. C., & Lucas, R. J. (2021). Bright daytime light enhances circadian amplitude in a diurnal mammal. Proceedings of the National Academy of Sciences of the United States of America, 118(22), e2100094118. https://doi.org/10.1073/PNAS.2100094118/SUPPL_FILE/PNAS.2100094118.SAPP.PDF

      Campbell, I., Sharifpour, R., & Vandewalle, G. (2023). Light as a Modulator of Non-Image-Forming Brain Functions Positive and Negative Impacts of Increasing Light Availability. Clocks & Sleep, 5(1), 116. https://doi.org/10.3390/CLOCKSSLEEP5010012

      Chellappa, S. L., Ly, J. Q. M., Meyer, C., Balteau, E., Degueldre, C., Luxen, A., Phillips, C., Cooper, H. M., & Vandewalle, G. (2014). Photic memory for executive brain responses. Proceedings of the National Academy of Sciences of the United States of America, 111(16), 6087–6091. https://doi.org/10.1073/pnas.1320005111

      Dijk, D. J., Duffy, J. F., Silva, E. J., Shanahan, T. L., Boivin, D. B., & Czeisler, C. A. (2012). Amplitude reduction and phase shifts of melatonin, cortisol and other circadian rhythms after a gradual advance of sleep and light exposure in humans. PloS One, 7(2). https://doi.org/10.1371/JOURNAL.PONE.0030037

      Güler, A. D., Ecker, J. L., Lall, G. S., Haq, S., Altimus, C. M., Liao, H. W., Barnard, A. R., Cahill, H., Badea, T. C., Zhao, H., Hankins, M. W., Berson, D. M., Lucas, R. J., Yau, K. W., & Hattar, S. (2008). Melanopsin cells are the principal conduits for rod-cone input to non-image-forming vision. Nature, 453(7191), 102–105. https://doi.org/10.1038/nature06829

      Lucas, R. J., Peirson, S. N., Berson, D. M., Brown, T. M., Cooper, H. M., Czeisler, C. A., Figueiro, M. G., Gamlin, P. D., Lockley, S. W., O’Hagan, J. B., Price, L. L. A., Provencio, I., Skene, D. J., & Brainard, G. C. (2014). Measuring and using light in the melanopsin age. Trends in Neurosciences, 37(1), 1–9. https://doi.org/10.1016/j.tins.2013.10.004

      Milosavljevic, N., Cehajic-Kapetanovic, J., Procyk, C. A., & Lucas, R. J. (2016). Chemogenetic Activation of Melanopsin Retinal Ganglion Cells Induces Signatures of Arousal and/or Anxiety in Mice. Current Biology, 26(17), 2358–2363. https://doi.org/10.1016/j.cub.2016.06.057

      Sonoda, T., Li, J. Y., Hayes, N. W., Chan, J. C., Okabe, Y., Belin, S., Nawabi, H., & Schmidt, T. M. (2020). A noncanonical inhibitory circuit dampens behavioral sensitivity to light. Science (New York, N.Y.), 368(6490), 527–531. https://doi.org/10.1126/SCIENCE.AAY3152

      Tri, M., & Do, H. (2019). Melanopsin and the Intrinsically Photosensitive Retinal Ganglion Cells: Biophysics to Behavior. Neuron, 104, 205–226. https://doi.org/10.1016/j.neuron.2019.07.016

      Vandewalle, G., Hébert, M., Beaulieu, C., Richard, L., Daneault, V., Garon, M. Lou, Leblanc, J., Grandjean, D., Maquet, P., Schwartz, S., Dumont, M., Doyon, J., & Carrier, J. (2011). Abnormal hypothalamic response to light in seasonal affective disorder. Biological Psychiatry, 70(10), 954–961. https://doi.org/10.1016/j.biopsych.2011.06.022

      Vandewalle, G., Schmidt, C., Albouy, G., Sterpenich, V., Darsaud, A., Rauchs, G., Berken, P. Y., Balteau, E., Dagueldre, C., Luxen, A., Maquet, P., & Dijk, D. J. (2007). Brain responses to violet, blue, and green monochromatic light exposures in humans: Prominent role of blue light and the brainstem. PLoS ONE, 2(11), e1247. https://doi.org/10.1371/journal.pone.0001247

      Vandewalle, G., Schwartz, S., Grandjean, D., Wuillaume, C., Balteau, E., Degueldre, C., Schabus, M., Phillips, C., Luxen, A., Dijk, D. J., & Maquet, P. (2010). Spectral quality of light modulates emotional brain responses in humans. Proceedings of the National Academy of Sciences of the United States of America, 107(45), 19549–19554. https://doi.org/10.1073/pnas.1010180107

      Viénot, F., Brettel, H., Dang, T.-V., & Le Rohellec, J. (2012). Domain of metamers exciting intrinsically photosensitive retinal ganglion cells (ipRGCs) and rods. Journal of the Optical Society of America A, 29(2), A366. https://doi.org/10.1364/josaa.29.00a366

      Walmsley, L., Hanna, L., Mouland, J., Martial, F., West, A., Smedley, A. R., Bechtold, D. A., Webb, A. R., Lucas, R. J., & Brown, T. M. (2015). Colour As a Signal for Entraining the Mammalian Circadian Clock. PLOS Biology, 13(4), e1002127. https://doi.org/10.1371/journal.pbio.1002127

    2. eLife assessment

      This fundamental work describes the complex interplay between light exposure, hypothalamic activity, and cognitive function. The evidence supporting the conclusion is compelling with potential therapeutic applications of light modulation. The work will be of broad interest to basic and clinical neuroscientists.

    3. Reviewer #1 (Public Review):

      Summary:

      Campbell et al. investigated the effects of light on the human brain, in particular on the hypothalamus, a subcortical structure, during auditory cognitive tasks. The mechanisms and neuronal circuits underlying light effects in non-image forming responses have so far mostly been studied in rodents and are not easily translated to humans. Therefore, this is a fundamental study aiming to establish the impact light illuminance has on subcortical structures using high-resolution 7T fMRI. The authors found that parts of the hypothalamus respond differently to illuminance. In particular, they found that the activity of the posterior hypothalamus increases while the activity of the anterior and ventral parts of the hypothalamus decreases under high illuminance. The authors also report that the performance of the 2-back executive task was significantly better in higher illuminance conditions. However, it seems that the activity of the posterior hypothalamus subpart is negatively related to the performance of the executive task, implying that it is unlikely that this part of the hypothalamus is directly involved in the positive impact of light on performance observed. Interestingly, the activity of the posterior hypothalamus was, however, associated with an increased behavioural response to emotional stimuli. This suggests that the role of this posterior part of the hypothalamus is not as simple regarding light effects on cognitive and emotional responses. This study is a fundamental step towards our better understanding of the mechanisms underlying light effects on cognition and consequently optimising lighting standards.

      Strengths:

      While it is still impossible to distinguish individual hypothalamic nuclei, even with the high-resolution fMRI, the authors split the hypothalamus into five areas encompassing five groups of hypothalamic nuclei. This allowed them to reveal that different parts of the hypothalamus respond differently to an increase in illuminance. They found that higher illuminance increased the activity of the posterior part of the hypothalamus encompassing the MB and parts of the LH and TMN, while decreasing the activity of the anterior parts encompassing the SCN and another part of TMN. These findings are somewhat in line with studies in animals. It was shown that parts of the hypothalamus such as SCN, LH, and PVN receive direct retinal input in particular from ipRGCs. Also, acute chemogenetic activation of ipRGCs was shown to induce activation of LH and also increased arousal in mice.

      Weaknesses:

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values that are highly correlated with the melanopic EDI, so we cannot distinguish whether the effects seen here are driven by melanopsin and/or other photoreceptors. In order to provide a more mechanistic insight into the light-driven effects on cognition, one would ideally use a silent substitution approach to distinguish between different photoreceptors. This may be something to consider when designing the follow-up studies.

    4. Reviewer #2 (Public Review):

      Summary

      The interplay between environmental factors and cognitive performance has been a focal point of neuroscientific research, with illuminance emerging as a significant variable of interest. The hypothalamus, a brain region integral to regulating circadian rhythms, sleep, and alertness, has been posited to mediate the effects of light exposure on cognitive functions. Previous studies have highlighted the role of the hypothalamus in orchestrating bodily responses to light, implicating specific neural pathways such as the orexin and histamine systems, which are crucial for maintaining wakefulness and processing environmental cues. Despite advancements in our understanding, the specific mechanisms through which varying levels of light exposure influence hypothalamic activity and, in turn, cognitive performance, remain inadequately explored. This gap in knowledge underscores the need for high-resolution investigations that can dissect the nuanced impacts of illuminance on different hypothalamic regions. Utilizing state-of-the-art 7 Tesla functional magnetic resonance imaging (fMRI), the present study aims to elucidate the differential effects of light on hypothalamic dynamics and establish a link between regional hypothalamic activity and cognitive outcomes in healthy young adults. By shedding light on these complex interactions, this research endeavours to contribute to the foundational knowledge necessary for developing innovative therapeutic strategies aimed at enhancing cognitive function through environmental modulation.

      Strengths:

      (1) Considerable Sample Size and Detailed Analysis: The study leverages a robust sample size and conducts a thorough analysis of hypothalamic dynamics, which enhances the reliability and depth of the findings.

      (2) Use of High-Resolution Imaging: Utilizing 7 Tesla fMRI to analyze brain activity during cognitive tasks offers high-resolution insights into the differential effects of illuminance on hypothalamic activity, showcasing the methodological rigour of the study.

      (3) Novel Insights into Illuminance Effects: The manuscript reveals new understandings of how different regions of the hypothalamus respond to varying illuminance levels, contributing valuable knowledge to the field.

      (4) Exploration of Potential Therapeutic Applications: Discussing the potential therapeutic applications of light modulation based on the findings suggests practical implications and future research directions.

      The current version of the manuscript addresses previous weaknesses, including details about the illuminance levels, light spectral characteristics used in the MRI study, and light patterns during behavioural tasks. The authors effectively tackle open questions in the field and provide solid evidence that enhances our understanding of the mechanisms underlying the effects of light on cognition.

    5. Reviewer #3 (Public Review):

      Summary:

      Campbell and colleagues use a combination of high-resolution fMRI, cognitive tasks and different intensities of light illumination to test the hypothesis that the intensity of illumination differentially impacts hypothalamic substructures that, in turn, promote alterations in arousal that affect cognitive and affective performance. The authors find evidence in support of a posterior-to-anterior gradient of increased blood flow in the hypothalamus during task performance that they later relate to performance on two different tasks. The results provide an enticing link between light levels, hypothalamic activity and cognitive/affective function; however, clarification of some methodological choices will help to improve confidence in the findings.

      Strengths:

      * The authors' focus on the hypothalamus and its relationship to light intensity is an important and understudied question in neuroscience.

      Weaknesses:

      * I found it challenging to relate the authors' hypotheses, which I found to be quite compelling, to the apparatus used to test the hypotheses - namely, the use of orange light vs. different light intensities; and the specific choice of the executive and emotional tasks, which differed in key features (e.g., block-related vs. event-related designs) that were orthogonal to the psychological constructs being challenged in each task.

      * Given the small size of the hypothalamus and the irregular size of the hypothalamic parcels, I wondered whether a more data-driven examination of the hypothalamic time series would have provided a more parsimonious test of their hypothesis.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between the regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - with several respiratory outcomes up to several years later in time. There are 6 respiratory outcomes included: risk of influenza, pneumonia, COVID-19, other respiratory tract infections, as well as COVID-19 severity and mortality.

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication.

      We are grateful for your pointing out the strengths in our article, particularly the assessment of e-values and the comparison with another medication to mitigate confounding by indication. We extend our sincere gratitude to the reviewer for identifying multiple concerns and offering constructive feedback to help improve our manuscript. We will incorporate these suggestions into our revisions.

      Weaknesses:

      (1) The main exposure of interest seems to be measured at only one time point (at study enrollment), while patients are considered at risk for many years afterwards without knowing their exposure status at the time of experiencing the outcome. As indicated by the authors, PPI are sometimes used for only short amounts of time. It seems biologically implausible that an infection was caused by using PPI for a few weeks many years ago.

      We agree with the reviewer that PPIs are sometimes used for only short amounts of time, as indicated in our manuscript. We acknowledge that it is a limitation of the UK Biobank cohort, and we have discussed this in the discussion section as follows:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk.” (Page 14, Line 8-10)

      In addition, to alleviate these concerns, we have assessed effect moderation in the subgroup of potential long-term users, defined as participants with indications for PPI use. This information has been included in the discussion section:

      “In addition, no effect moderation was observed in subgroup analyses for the main outcome among PPI users with indications (more likely to regularly use PPIs for a long period) compared to those without indications, indicating the risks remained increased among long-term PPI users.” (Page 14, Line 12-15)

      We hope that in the future, the concerns highlighted by the reviewer can be resolved by utilizing datasets with close follow-up, especially regarding medication use:

      “Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 15-17)

      (2) Previous studies have shown that by focusing on prevalent users of drugs, one often induces several biases such as collider stratification bias, selection bias through depletion of susceptibles, etc.

      Because of the limitations of data from the UK Biobank, such as the absence of details on initiation of medications and regular monitoring, we were restricted to using a prevalent user design to assess the associations between PPI use and respiratory outcomes. We have discussed it in the limitation section:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk. However, the prevalent user design could underestimate the actual risks of PPI use for respiratory infections, which indicates the real effect might be stronger [38]……Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 8-17)

      (3) It seems Kaplan Meier curves are not adjusted for confounding through e.g. inverse probability weighting. As such the KM curves are currently not informative (or the authors need to make clearer that curves are actually adjusted for measured confounding).

      Your kind suggestions are greatly appreciated. We have plotted Kaplan Meier curves adjusted for confounding by inverse probability weighting with the measured confounders according to the reviewer’s advice. The methods and results are demonstrated as follows:

      “The event-free probabilities were compared by Kaplan-Meier survival curves with inverse probability weights adjusting for the measured covariates.” (Page 8, Line 13-15)

      “Regular PPI users had lower event-free probabilities for influenza and pneumonia compared to those of non-users (Supplementary Figure 2 A-B).” (Page 9, Line 21-23)

      “PPI users had lower event-free probabilities for COVID-19 severity and mortality, but not COVID-19 positivity compared to those of non-users (Supplementary Figure 2 C-E).” (Page 10, Line 9-10)
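      The adjusted curves described above can be illustrated with a minimal sketch of an inverse-probability-weighted Kaplan-Meier estimator. It assumes that propensity scores (the probability of regular PPI use given the measured covariates) have already been estimated elsewhere; the function names and the toy data are hypothetical and are not taken from the authors' analysis code.

```python
def ip_weights(treated, propensity):
    """Inverse probability of treatment weights.

    treated:    iterable of 0/1 exposure flags (1 = regular PPI user, assumed coding)
    propensity: iterable of estimated P(treated = 1 | covariates), assumed given
    """
    return [1.0 / p if a else 1.0 / (1.0 - p) for a, p in zip(treated, propensity)]


def weighted_kaplan_meier(time, event, weight):
    """Weighted Kaplan-Meier curve for one exposure group.

    time:   follow-up time per participant
    event:  1 if the outcome occurred at `time`, 0 if censored
    weight: non-negative weight per participant (e.g. from ip_weights)
    Returns a list of (event_time, event_free_probability) pairs.
    """
    event_times = sorted({t for t, e in zip(time, event) if e})
    curve = []
    surv = 1.0
    for t in event_times:
        # Weighted number still at risk just before t.
        at_risk = sum(w for ti, w in zip(time, weight) if ti >= t)
        # Weighted number of events occurring exactly at t.
        failed = sum(w for ti, e, w in zip(time, event, weight) if ti == t and e)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


# Toy example with unit weights, which reduces to the ordinary Kaplan-Meier estimate.
toy_curve = weighted_kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1], [1.0, 1.0, 1.0, 1.0])
```

      In practice the estimator would be applied separately to users and non-users, each with their own weights, so that the two weighted curves can be compared.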

      (4) Throughout the manuscript the authors seem to misuse the term multivariate (using one model with e.g. correlated error terms to assess multiple outcomes at once) when they seem to mean multivariable.

      We apologize for using the term “multivariate” where “multivariable” was meant in our previous manuscript. We have corrected the misused terms throughout the manuscript:

      “Univariate and multivariable Cox proportional hazards regression models were utilized to assess the association between regular use of PPIs and the selected outcomes.” (Page 7, Line 19-20)

      “The remaining imbalanced covariates (standardized mean difference ≥ 0.1) after propensity score matching were further adjusted by multivariable Cox regression models to calculate HRs and 95% CIs.” (Page 8, Line 23-25)
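      For readers unfamiliar with the balance criterion quoted above, the following is a minimal sketch of the standardized mean difference for a continuous covariate, using the common pooled-standard-deviation definition. The function names and the toy values are hypothetical; the exact formula used in the manuscript is not stated in this response.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation.

    Uses sqrt((var_a + var_b) / 2) as the pooled SD, a common convention
    for balance diagnostics after propensity score matching.
    """
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2.0)
    return (mean(group_a) - mean(group_b)) / pooled_sd


def is_imbalanced(smd, threshold=0.1):
    """Flag a covariate whose absolute SMD reaches the threshold (0.1 in the quoted text)."""
    return abs(smd) >= threshold


# Hypothetical ages in two matched groups (illustrative only).
smd_age = standardized_mean_difference([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

      A covariate flagged by `is_imbalanced` would, under the approach quoted above, be carried forward as an adjustment variable in the Cox model.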

      (5) Given multiple outcomes are assessed there is a clear argument for accounting for multiple testing, which following the logic of the authors used in terms of claiming there is no association when results are not significant may change their conclusions. More high-level, the authors should avoid the pitfall of stating there is evidence of absence if there is only an absence of evidence in a better way (no statistically significant association doesn't mean no relationship exists).

      We have revised our interpretation of the results, particularly for those without a statistically significant association, based on the reviewer’s advice, and clearly recognize that the conclusions should be interpreted with caution:

      “In contrast, the risk of COVID-19 infection was not significant with regular PPI use…” (Page 2, Line 11-12)

      “PPI users were associated with a higher risk of influenza (HR 1.74, 95%CI 1.19-2.54), but the risks with pneumonia or COVID-19-related outcomes were not evident.” (Page 2, Line 14-16)

      “…while the effects on pneumonia or COVID-19-related outcomes under PPI use were attenuated when compared to the use of H2RAs.” (Page 2, Line 18-19, in the Abstract)

      “…while their association with pneumonia and COVID-19-related outcomes is diminished after comparison with H2RA use and remains to be further explored.” (Page 15, Line 21-22, in the Conclusion)
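      The reviewer's point about multiple testing can be made concrete with a minimal sketch of two standard corrections, Bonferroni and Holm, applied across several outcomes. This is an illustration only: the response does not state that such a correction was applied, and the p-values below are hypothetical placeholders, not results from the study.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values for three outcomes (illustrative only).
example_p = [0.01, 0.04, 0.03]
example_holm = holm_adjust(example_p)
example_bonf = bonferroni_adjust(example_p)
```

      Holm's procedure controls the same family-wise error rate as Bonferroni while being uniformly at least as powerful, which is why it is often preferred when several outcomes are assessed at once.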

      (6) While the authors claim that the quantitative bias analysis does show results are robust to unmeasured confounding, I would disagree with this. The e-values are around 2 and it is clearly not implausible that there are one or more unmeasured risk factors that together or alone would have such an effect size. Furthermore, if one would use the same (significance) criteria as used by the authors for determining whether an association exists, the required effect size for an unmeasured confounder to render effects 'statistically non-significant' would be even smaller.

      We agree with the reviewer that there might still exist one or more unmeasured risk factors that have effect sizes larger than 2. Hence, we cannot affirm that the findings are robust to unmeasured confounding in the current analysis, which is a limitation of our study. We have deleted the previous statement, and added more discussion in the limitation section:

      “Moreover, patients with exacerbations of respiratory disorders (e.g., asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38]. Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)
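      The E-value discussed in this point can be computed with the standard formula E = RR + sqrt(RR × (RR − 1)) for a ratio estimate above 1. The sketch below is an illustration under stated assumptions: it treats the hazard ratio as an approximation of the risk ratio (reasonable only when the outcome is rare), and it uses the influenza estimate quoted earlier in this response (HR 1.74, 95% CI 1.19-2.54) purely as example input. The function names are hypothetical.

```python
import math


def e_value(ratio):
    """E-value for a point estimate on the risk-ratio scale.

    Estimates below 1 are inverted first, so protective and harmful
    effects of the same strength give the same E-value.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio < 1:
        ratio = 1.0 / ratio
    return ratio + math.sqrt(ratio * (ratio - 1.0))


def e_value_for_ci(lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval already includes the null value of 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(lower) if lower > 1.0 else e_value(upper)


# Example input: HR 1.74 (95% CI 1.19-2.54), used here as an approximate risk ratio.
point_e = e_value(1.74)
limit_e = e_value_for_ci(1.19, 2.54)
```

      The E-value for the confidence limit is always closer to 1 than the E-value for the point estimate, which is the quantity relevant to the reviewer's remark that a weaker unmeasured confounder would suffice to render an association statistically non-significant.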

      (7) Some patients are excluded due to the absence of follow-up, but it is unclear how that is determined. Is there potentially some selection bias underlying this where those who are less healthy stop participating in the UK biobank?

      Thank you for your question. The reasons for the absence of follow-up are classified into five categories: (1) Death reported to UK Biobank by a relative; (2) NHS records indicate they are lost to follow-up; (3) NHS records indicate they have left the UK; (4) UK Biobank sources report they have left the UK; (5) Participant has withdrawn consent for future linkage. According to the data from UK Biobank (https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=190), the major reason for loss to follow-up among participants is their departure from the UK (84.7% of participants who were lost to follow-up). In addition, excluding participants who were less healthy might lead to underestimation of the risk, i.e. lower estimated effects of PPIs on respiratory infections. We have supplemented this in our revised manuscript:

      “Among them, 1,297 participants without follow-up, which were mainly determined by reported death, departure from the UK, or withdrawn consent, had been removed after initial exclusion.” (Page 4, Line 25-27)

      (8) Given that the exposure is based on self-report how certain can we be that patients e.g. do know that their branded over-the-counter drugs are PPI (e.g. guardium tablets)? Some discussion around this potential issue is lacking.

      Thank you for your concerns. In the data collection by the UK Biobank, the participants can enter the generic or trade name of the treatment on the touchscreen to match the medications they used. We have added this important information to the method section:

      “The exposure of interest was regular use of PPIs. The participants could enter the generic or trade name of the treatment on the touchscreen to match the medications they used (Supplementary Table S1).” (Page 5, Line 6-8)

      We acknowledge that specific information on prescribed or over-the-counter use of medications is lacking in the UK Biobank. We have discussed it in the limitation section:

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      (9) Details about the deprivation index are needed in the main text as this is a UK-specific variable that will be unfamiliar to most readers.

      Thank you for your question on the definition of the deprivation index. We have provided the details about the deprivation index in the manuscript:

      “…socioeconomic status (deprivation index, which was defined using national census information on car ownership, household overcrowding, owner occupation, and unemployment combined for postcode areas of residence)…” (Page 6, Line 14-17)

      (10) It is unclear how variables were coded/incorporated from the main text. More details are required, e.g. was age included as a continuous variable and if so was non-linearity considered and how?

      We apologize for not elucidating how variables were incorporated into the main text. Previously, the linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses. For example, after evaluation with the Martingale residuals plot, age demonstrated non-linearity, and we incorporated it as a categorical variable for the analysis of COVID-related mortality.

      We have supplemented the information in the method section:

      “The linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses.” (Page 6, Line 28 to Page 7, Line 1)

      (11) The authors state that Schoenfeld residuals were tested, but don't report the test statistics. Could they please provide these, e.g. it would already be informative if they report that all p-values are above a certain value.

      We are sorry for not providing the statistics about the Schoenfeld residual in our previous manuscript. We have supplemented the information in our revisions:

      “Schoenfeld residuals tests were used to evaluate the proportional hazards assumptions, while no violation of the assumption was detected (Supplementary Table S3).” (Page 7, Line 27 to Page 8, Line 1)

      (12) The authors would ideally extend their discussion around unmeasured confounding, e.g. using the DAGs provided in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832226/, in particular (but not limited to) around severity and not just presence/absence of comorbidities.

      Thank you for your insightful suggestion that the discussion about unmeasured confounding should be extended. We agree with the reviewer that, in addition to the comorbidities themselves, their severity could also have an important impact on the use of PPIs. We have added this discussion to the limitation section and cited the article (PMC7832226):

      “Moreover, patients with exacerbations of comorbid disorders (e.g., diabetes, asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38] (Supplementary Figure S4). Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)

      (13) The UK biobank is known to be highly selected for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. The potential problems this might create in terms of collider stratification bias - as highlighted here for example: https://www.nature.com/articles/s41467-020-19478-2 - should be discussed in greater detail and also appreciated more when providing conclusions.

      We acknowledge the reviewer's point about the UK Biobank's highly selective nature potentially leading to collider stratification bias in the evaluation of COVID-19-related outcomes. We have discussed this in detail and are cautious when generating conclusions.

      “Furthermore, the highly selective nature of the UK Biobank might create collider stratification bias for the evaluation of COVID-19-related outcomes, and thus the conclusions should be interpreted with cautions [39].” (Page 15, Line 2-4)

      Reviewer #2 (Public Review):

      Summary:

      Zeng et al investigate in an observational population-based cohort study whether the use of proton pump inhibitors (PPIs) is associated with an increased risk of several respiratory infections among which are influenza, pneumonia, and COVID-19. They conclude that compared to non-users, people regularly taking PPIs have increased susceptibility to influenza, pneumonia, as well as COVID-19 severity and mortality. By performing several different statistical analyses, they try to reduce bias as much as possible, to end up with robust estimates of the association.

      Strengths:

      The study comprehensively adjusts for a variety of critical covariates and by using different statistical analyses, including propensity-score-matched analyses and quantitative bias analysis, the estimates of the associations can be considered robust.

      We are grateful to the reviewer for pointing out the merits of our article, which include adjusting for a wide range of covariates, employing diverse statistical analyses, and using robust data. We will revise our manuscript further based on the reviewer's suggestions.

      Weaknesses:

      As it is an observational cohort study there still might be bias. Information on the dose or duration of acid suppressant use was not available, but might be of influence on the results. The outcome of interest was obtained from primary care data, suggesting that only infections as diagnosed by a physician are taken into account. Due to the self-limiting nature of the outcome, differences in health-seeking behavior might affect the results.

      Thank you for your questions regarding the dose/duration of acid suppressants, the source of diagnosis, and the health-seeking behavior of participants. For the data from the UK Biobank, the dose or duration of acid suppressant use was not available since the information was not collected at baseline or follow-up. In addition, the outcome of interest was also retrieved from the hospital ICD diagnosis. We apologize for not clarifying this in our previous manuscript. Moreover, we agree with the reviewer that health-seeking behavior could have an impact on the analyses, but the corresponding data are not available from the UK Biobank. We have discussed these points in the method and limitation sections:

      “Briefly, the first reported occurrences of respiratory system-related conditions within primary care data, and hospital inpatient data defined by the International Classification of Diseases (ICD)-10 codes were categorized by the UK Biobank.” (Page 5, Line 21-25)

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      Reviewer #1 (Recommendations For The Authors):

      Analysis code should be made available.

      Thank you for your question. We have provided the sources of the analysis code we used for this study in our revised manuscript:

      “The codes used in this study can be found at: https://epirhandbook.com/en/ and https://cran.r-project.org/doc/contrib/Epicalc_Book.pdf.” (Page 16, Line 21-22)

      Reviewer #2 (Recommendations For The Authors):

      It might be interesting to study whether including self-reported infections changes the results, as people using PPI may more easily consult their GP even for a self-limiting disease such as influenza and therefore are more likely diagnosed/confirmed with such a respiratory infection.

      Thank you for your insightful suggestion on conducting analyses including self-reported infections. Accordingly, we have included the self-reported cases as sensitivity analyses, and the results were not significantly altered, which supports the robustness of our results:

      “Self-reported infections, except for COVID-19-related outcomes due to the lack of data, were also included for the outcomes as sensitivity analyses. The self-reported cases were reported at the baseline or subsequent UK Biobank assessment center visit.” (Page 8, Line 17-19)

      “Inclusion of the self-reported cases did not significantly alter the results (Supplementary Table S4).” (Page 9, Line 17-18)

      Moreover, to address the above-mentioned, sub-analyses differentiating between over-the-counter and prescribed medication might be interesting.

      Thank you for your questions on differentiating between over-the-counter and prescribed medication. We have thoroughly looked up the data provided by the UK Biobank, but it is a pity that they are not provided. We have discussed this in the limitation section:

      “Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

    2. eLife assessment

      This useful study aimed to quantify associations between regular use of proton-pump inhibitors (PPI) and the occurrence of respiratory infections, such as influenza, pneumonia, COVID-19, and others over a period of several years. PPI use was associated with increased risks of influenza and pneumonia, but not of COVID-19, although severity and mortality of COVID-19 infections were higher in PPI users. There are inevitable weaknesses of the study design used, such as the fact that PPI use was only measured at one time-point whereas infections were assessed over a long time period, but these are appropriately highlighted in the discussion. Overall, the study presents convincing evidence for the conclusions.

    3. Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - with several respiratory outcomes (6 in total: risk of influenza, pneumonia, COVID-19, other respiratory tract infections, as well as COVID-19 severity and mortality) up to several years later in time.

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication, iii)

      Weaknesses:

      While the original submission had several weaknesses, the authors have appropriately addressed all issues raised. There are inevitable weaknesses remaining, but these are appropriately highlighted in the discussion. They include the fact that the main exposure of interest is only measured at one time-point whereas outcomes are assessed over a long time period, the inclusion of prevalent users leading to potential bias (e.g. those experiencing bad outcomes already stopping because of side-effects before inclusion in the study), the possibility of unmeasured confounding explaining observations (e.g. severity of underlying comorbidities leading to PPI prescriptions combined with the absence of information about comorbidity severity), and potential selection bias.

    1. Reviewer #2 (Public Review):

      This paper examined how the activity of neurons in the entopeduncular nucleus (EPN) of mice relates to kinematics, value, and reward. The authors recorded neural activity during an auditory-cued two-alternative choice task, allowing them to examine how neuronal firing relates to specific movements like licking or paw movements, as well as how contextual factors like task stage or proximity to a goal influence the coding of kinematic and spatiotemporal features. The data shows that the firing of individual neurons is linked to kinematic features such as lick or step cycles. However, the majority of neurons exhibited activity related to both movement types, suggesting that EPN neuronal activity does not merely reflect muscle-level representations. This contradicts what would be expected from traditional action selection or action specification models of the basal ganglia.

      The authors also show that spatiotemporal variables account for more variability compared to kinematic features alone. Using demixed Principal Component Analysis, they reveal that at the population level, the three principal components explaining the most variance were related to specific temporal or spatial features of the task, such as ramping activity as mice approached reward ports, rather than trial outcome or specific actions. Notably, this activity was present in neurons whose firing was also modulated by kinematic features, demonstrating that individual EPN neurons integrate multiple features. A weakness is that what the spatiotemporal activity reflects is not well specified. The authors suggest some may relate to action value due to greater modulation when approaching a reward port, but acknowledge action value is not well parametrized or separated from variables like reward expectation.

      A key goal was to determine whether activity related to expected value and reward delivery arose from a distinct population of EPN neurons or was also present in neurons modulated by kinematic and spatiotemporal features. In contrast to previous studies (Hong & Hikosaka 2008 and Stephenson-Jones et al., 2016), the current data reveals that individual neurons can exhibit modulation by both reward and kinematic parameters. Two potential differences may explain this discrepancy: First, the previous studies used head-fixed recordings, where it may have been easier to isolate movement versus reward-related responses. Second, those studies observed prominent phasic responses to the delivery or omission of expected rewards - responses largely absent in the current paper. This absence suggests a possibility that neurons exhibiting such phasic "reward" responses were not sampled, which is plausible since in both primates and rodents, these neurons tend to be located in restricted topographic regions. Alternatively, in the head-fixed recordings, kinematic/spatial coding may have gone undetected due to the forced immobility.

      Overall, this paper offers needed insight into how the basal ganglia output encodes behavior. The EPN recordings from freely moving mice clearly demonstrate that individual neurons integrate reward, kinematic, and spatiotemporal features, challenging traditional models. However, the specific relationship between spatiotemporal activity and factors like action value remains unclear.

    1. eLife assessment

      The aim of this important study is to functionally characterize neuronal circuits underlying the escape behavior in Drosophila larvae. Upon detection of a noxious stimulus, larvae follow a series of stereotyped movements that include bending of their body, rolling and crawling away. This paper combines quantitative behavioral analyses, cell-type specific manipulations, optogenetics, calcium imaging, immunostaining, and connectomic analysis to provide convincing evidence of an inhibitory descending pathway that controls the switch from rolling to fast crawling behaviors of the larval escape response.

    2. Reviewer #1 (Public Review):

      Summary:

      Zhu et al. set out to better understand the neural mechanisms underlying Drosophila larval escape behavior. The escape behavior comprises several sequenced movements, including a lateral roll motion followed by fast crawling. The authors specifically were looking to identify neurons important for the roll-to-crawl transition.

      Strengths:

      This paper is clearly written, and the experiments are logical and complementary. They support the author's main claim that SeIN128 is a type of descending neuron that is both necessary and sufficient to modulate the termination of rolling. In general, the rigor is high.

      Weaknesses:

      -This manuscript is narrowly focused on Drosophila larval escape behavior. It would be more accessible to a broader audience if this work were put into a larger context of descending control.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have addressed the majority of my comments, and I believe the revised manuscript has improved significantly.

      The escape behavior of Drosophila larvae includes rolling followed by fast crawling, but the neural mechanism of this sequence was unclear. The authors determined the function of SeIN128, a group of descending neurons that terminate rolling and shorten crawling latency. SeIN128 receives inputs from Basin-2 and A00c neurons, which facilitate rolling, and makes reciprocal inhibitory synapses onto Basin-2 and A00c. SeIN128 shows a delayed activity peak upon Basins or A00c stimulation. Gad staining indicates that SeIN128 neurons are GABAergic, and blocking of SeIN128 function caused increased rolling probability and prolonged rolling. RNAi knockdown of GABA receptors in Basins suggests that several GABA receptors, especially GABA-A-R, mediate the SeIN128 to Basins inhibition. Among Basins subtypes, both Basin-2 and Basin-4 facilitate rolling but SeIN128 specifically terminates rolling elicited by Basin-2 activation. Overall, SeIN128 forms a feedback inhibition ensemble with Basin-2 and A00c that terminates rolling and shifts the animal to crawling.

      Overall, this study discovered a neural mechanism that serves as a switch from rolling to fast crawling behaviors in Drosophila larvae. It addressed important open questions of how neural circuits determine the sequence of locomotor behaviors and how animals switch from one behavior to another. Its results support the conclusions and are backed up with proper control experiments.

      Strengths:

      - The question (i.e., the neural circuitry of action selection) addressed by this study is important.
      - Larval and adult Drosophila is a powerful model system in neuroscience study, with rich genetic tools, diverse behaviors, and well-studied nervous systems. This study makes good use of them.
      - The experiments, analyses, and results are rigorous and support the major claims. This study combined multiple innovative approaches, such as automated, machine-learning-based behavioral assays, EM reconstruction of larval CNS neurons, and genetic manipulation of specific neurons. A wide range of control experiments enhanced the credibility of the results.
      - The graphical representations are clear and mindfully arranged.

      Weaknesses:

      I believe "Corkscrew-like rolling" is not an accurate term for larval rolling. The neuromuscular basis of rolling was recently studied by Cooney et al., showing that rolling is the circumferential propagation of muscle activity where all segments contract similarly and synchronously. So using another term instead of "Corkscrew-like rolling" may help.

    4. Reviewer #3 (Public Review):

      Summary:

      Combining the behavioral assays with optogenetics, imaging, and connectome approaches, this meticulous study characterizes the underlying neuronal mechanisms of escape behavior in Drosophila larvae. The authors identify the neurons and provide convincing evidence to support their function in the roll-to-crawl locomotor transition.

      Strengths:

      It is a very thorough characterization of locomotor sequences in terms of underlying neural circuits. The findings shed light on investigating the analogous behaviors in other systems.

      Weaknesses:

      None. The authors have revised the article to improve the presentation and clarity.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      In my opinion, the three most important controls (hopefully easy):

      (1) Include no ATR controls for optogenetic activation experiments (not all, just one or two, e.g., Figure 4B, C, or D, for the highest activation condition). The concern is that it can be quite hard to use light to both monitor neural responses while also using light to activate the function of other neurons.

      We thank the reviewer for the suggestions. We use a 2-photon 910-nm laser (which does not activate Chrimson) for imaging of GCaMP and a 624-nm LED (which does not activate GFP) for Chrimson activation. Calcium (GCaMP) signals are detected by PMT during Chrimson activation. With this setup, we are able to image GCaMP signals without crosstalk during activation of Chrimson.

      We performed calcium imaging in animals that were not fed ATR and found that SS04185 showed no response to LED stimulation at the strongest intensity (µW/mm) (New Figure 4 – figure supplement 1B).

      (2) Demonstrate that their RNAi constructs do indeed knock down the intended target gene. They showed nicely in Figure 5A that SeIN128 expresses GABA. Presumably, these neurons also express VGAT. Is it possible to check the expression of VGAT after RNAi knockdown? The concern is that using only a single RNAi introduces the possibility of off-target effects. Using multiple RNAi lines for VGAT or other parts of the pathway would also alleviate this (minor concern).

      We thank the reviewer for raising this point. We agree that using only one RNAi line (HMS02355) for VGAT in Figure 5A is a weakness. 

      Accordingly, we have performed additional experiments to quantify the effect of RNAi knockdown of VGAT using HMS02355 in all neurons, followed by subsequent immunostaining against GABA or VGAT. We found that both VGAT and GABA were significantly reduced in the neuropil (Figure 5 – figure supplement 1C and D). These data strongly suggest that HMS02355 knocks down VGAT and reduces GABA at axon terminals. We note that HMS02355 has been used previously for knocking down GABA signaling in the following studies.

      (1) Kallman BR, Kim H, Scott K (2015). Excitation and inhibition onto central courtship neurons biases Drosophila mate choice. eLife 4:e11188. https://doi.org/10.7554/eLife.11188

      (2) Zhao W, Zhou P, Gong C et al. (2019). A disinhibitory mechanism biases Drosophila innate light preference. Nat Commun 10, 124. https://doi.org/10.1038/s41467-018-07929-w

      (3) Yamagata N, Ezaki T, Takahashi T, Wu H, Tanimoto H (2021). Presynaptic inhibition of dopamine neurons controls optimistic bias. eLife 10:e64907. https://doi.org/10.7554/eLife.6490

      (3) Include genetic controls for their driver line.

      In Figure 1, it would be nice to see one half or the other half of their split GAL4 line in their manipulations. The concern is that perhaps the phenotype is coming from something unexpected in the genetic background.

      We thank the reviewer for the suggestion. We have added half of the split GAL4 lines (AD or DBD) as controls (New Figure 1 – figure supplement 2). We found that SS04185 showed a reduction of rolling, whereas the split-half controls (AD only or DBD only) did not.

      In the discussion:

      It seems that activation of SS04185 has additional effects beyond what the authors have quantified. Specifically, larvae do not appear to re-initiate rolling in the same manner as Basin activation alone. Also, there appears to be an off-response, turning.

      We appreciate the reviewer’s comments. We have included a section in the discussion to consider the different patterns of rolling observed during joint stimulation of Basins and SS04185 and during stimulation of Basins alone, as well as the increase in turning following the offset of joint stimulation of Basins and SS04185 compared with stimulation of Basins alone (lines 464 to 481). Although the reasons for these differences are beyond the scope of the paper, we have added Figure 2 – figure supplement 1K, which shows that co-activation of SS04185-MB and Basins is sufficient to evoke turning following the offset of stimulation, suggesting that the increased turning may be due to the activation of SS04185-MB neurons and independent of SS04185-DN neurons.

      The labeling of the Figure panels could be improved. In many places, it is not clear that Basins are being stimulated in the background, whereas in nearby panels, it is clearly labeled. This is confusing for the reader.

      We thank the reviewer for the constructive suggestions. We have modified all relevant figures to read “Basins>Chrimson” above the pink line indicating the period of optogenetic activation.

      Reviewer #2 (Recommendations For The Authors):

      Claims, rigorousness, repeatability, and accuracy of terms.

      (1) In line 254, the authors suggest that the slow response of SeIN128 neurons is due to the input they receive from SEZ, but in line 453, they suggest it is due to axo-axonal connections. However, their evidence does not support one factor over the other. Overall, only the axo-axonal connection was strongly suggested in the discussion. The authors could clarify that the delay of SeIN128 activity may also be caused by multisynaptic connections involving SEZ or other neurons in the last section of the Discussion.

      Although SeIN128 primarily receives inputs from the SEZ, it also receives inputs within the VNC from Basin-2 (Figure 4 – figure supplement 2). Specifically, in the VNC, the axons of SeIN128 make inhibitory synaptic contacts onto the axon of Basin-2, which in turn makes reciprocal excitatory contacts onto the axon of SeIN128, thereby forming a feedback loop. However, by the time we wrote the original discussion, we had inadvertently focused on the potential of the negative feedback loop formed by these axo-axonal synapses in the VNC to mediate the slow response of SeIN128, overlooking the possibility that other as yet unidentified pathways could convey Basin or A00c activity indirectly to SeIN128 dendrites in the SEZ. Therefore, we have revised the original text, which read “These data suggest that the main synaptic inputs onto SeIN128 neurons in the SEZ mediate the slow responses upon activation of Basins or A00c neurons” to “These data suggest that the delay of SeIN128 activity may be caused by multi-synaptic connections involving the SEZ or a feedback loop involving axo-axonal connections between SeIN128 and Basin-2 or A00c” (revised, Lines 259 and 261). Accordingly, we have also adjusted the relevant discussion section to be consistent with this change (Lines 460 and 466).

      (2) Please clarify the following: How does the algorithm define rolling and crawling? Healthy larvae complete 360° rolls; in each roll they rotate from dorsal up to dorsal up. It is possible that a larva rolls for an incomplete cycle and straightens up. Does the algorithm simply label individual frames as “roll”, “non-roll”, or “unknown”, and define rolling by the existence of “roll” frames? If so, then larvae that rolled for 90° and straightened would be counted as “rolling” though they failed to complete a full rolling bout. Also, how were “hunch”, “turn”, and “back” identified? Lastly, is there any manual quality control involved? Address this and related issues in the methods:

      a) Expand the description of the classifier algorithm.

      b) How are rolling and non-rolling animals defined in the "rolling%" assay? Were all "rolling" animals able to do at least one 360° roll?

      c) How are "rolling duration" and "end of 1st rolling" defined? Is the algorithm able to distinguish different rolling bouts? In these two assays, were the animals that rolled for <1 second (in total or in their "first roll") able to complete a 360° roll?

      The Multi-worm Tracker (MWT) records only the contours of animals (no real video image data). Thus, the data fed into the classifier algorithm only include features based on contour time-series data. The algorithm uses movement perpendicular to the body axis (the characteristic feature of larval rolling) to classify rollers and non-rollers. Although the algorithm cannot determine whether a rolling event involves a rotation of more than 360 degrees, we ensure that rolling events are at least 360 degrees by removing any events that are shorter than 0.2 s (the minimum time to complete a 360-degree roll).

      We have accordingly revised the section of “Behavior detection” relating to the behavior classification algorithm in the methods section as follows (Lines 600 to 620).

      “After extracting behavioral parameters from Choreography, we used an unsupervised machine learning behavior classification algorithm to detect and quantify the following behaviors: hunching (Hunch), headbending (Turn), stopping (Stop), and peristaltic crawling (Crawl) as previously reported (Masson et al., 2020). Escape rolling (Roll) was detected with a classifier developed using the Janelia Automatic Animal Behavior Annotator (JAABA) platform (Kabra et al., 2013; Ohyama et al., 2015). JAABA transforms the MWT tracking data into a collection of ‘per-frame’ behavioral parameters and regenerates 2D dorsal-view videos of the tracked larvae. Based on such videos, we defined rolling as a rotation around the body while the larva maintains a C-shape, which results in a movement perpendicular to larval body axis (Supplementary videos 1 and 2). Using this definition, we trained the algorithm in the JAABA platform by labeling ~10,000 randomly chosen frames as rolling or non-rolling to develop the rolling classifier. If a larva did not curl into a C-shape or move sideways, it was labeled as a “non-roller.” Every animal with at least one rolling event longer than 0.2 s in a given period was labeled as a “roller” (i.e., it was assumed to have rolled at least 360 degrees), based on the observation that when the start and end of rolling events were precisely measured, the algorithm could identify rolling events completed in 0.2 s.

      The rejection of false positives, especially at the beginning and the end of each rolling bout, enhanced accuracy. The algorithm integrated these training labels and parameters generated with Choreography in a time series, such as speed, crabspeed, and body curvature, to generate a score for rolling detection. Above a certain threshold, the classifier labeled the frame as rolling. This classifier, which has false negative and false positive rates of 7.4% and 7.8%, respectively (n = 102), was utilized to detect rolling in this paper.”
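
      The duration filter described above can be illustrated with a minimal Python sketch. It is not the JAABA classifier itself: it assumes that per-frame roll/non-roll labels are already available, and the function names, the `fps` parameter, and the example labels are hypothetical. Only the 0.2 s minimum bout duration comes from the text.

```python
from itertools import groupby


def rolling_bouts(frame_is_roll, fps, min_duration_s=0.2):
    """Return (start, end_exclusive) frame indices of roll bouts.

    frame_is_roll: sequence of booleans, one per tracked frame (assumed input).
    fps: tracking frame rate in frames per second (assumed, not from the source).
    min_duration_s: bouts shorter than this are discarded (0.2 s in the text).
    """
    bouts = []
    index = 0
    for is_roll, run in groupby(frame_is_roll):
        length = sum(1 for _ in run)  # number of consecutive identical labels
        if is_roll and length / fps >= min_duration_s:
            bouts.append((index, index + length))
        index += length
    return bouts


def is_roller(frame_is_roll, fps, min_duration_s=0.2):
    """An animal counts as a roller if it has at least one retained bout."""
    return len(rolling_bouts(frame_is_roll, fps, min_duration_s)) > 0


# Hypothetical labels at an assumed 10 frames per second:
# a single isolated roll frame (0.1 s) is dropped, a 3-frame run (0.3 s) is kept.
labels = [False, True, False, True, True, True, False]
print(rolling_bouts(labels, fps=10))
print(is_roller(labels, fps=10))
```

      Filtering on run length rather than on individual frames is what prevents a brief sideways movement from being counted as a completed roll.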

      Readability of text

      (1) I suggest giving the SS04185 line and SeIN128 neuron common names that are easier to remember and follow (after mentioning their full name once).

      We acknowledge the reviewer’s concerns. However, because SS04185 was initially named using the Janelia split-line pipeline, and SeIN128 was named independently in a more recent study (Ohyama et al., 2015), we have retained these designations in the present manuscript.

      Figures and figure legends

      (1) It would help if the authors could put visual representations of rolling and crawling, such as a cartoon larva performing the rolling-crawling switch, and still frames of rolling and crawling of real larvae, especially in Figure 1. Also, please consider including a video of rolling and crawling in real larvae (preferably comparing control and experimental groups).

      We appreciate the reviewer’s suggestion. We have added a cartoon of the behavioral sequence in Figure 1A, as well as a Figure 1 supplement video based on MWT data, which shows rolling followed by crawling. 

      (2) To give the reader a take-home message, it would help if the authors could make a simplified version of Figure 4A and put it at the end of the paper.

      We thank the reviewer for the suggestion. To assist the reader, we have added schematics depicting how the circuit may function in panel I of Figure 8.

      (3) In Figure 1A, add the text "activation " after the neuron names.

      We have added “Chrimson” following “Basins>” to the new Figure 1B (old Figure 1A) and other figures (Figure 1C and D, Figure 5A, Figure 6A, and figure supplements).

      (4) Figure 1G: a data point is misaligned (at the top of the graph). 

      We have aligned the data point accordingly.

      (5) Figure 1B can benefit from a better design. If possible, please separate the crawling speed into an independent graph (or at least use a different line shape to code for crawling speed and indicate it on the in-graph legend). Is the speed of Basin/SS04185 co-activation studied?

      We appreciate the reviewer’s suggestion. We have separated the plots for rolling and crawling speed into different panels (Figure 1C and D). As shown in Figure 1D, the crawling speed observed during coactivation of Basins and SS04185 was similar to that during activation of Basins alone.

      (6) Figure S1 uses a different color-coding scheme from Figure 1. I suggest making the color coding consistent between figures.

      We are grateful for the reviewer’s suggestion. We have adjusted the color-coding scheme accordingly.

      (7) Line 692 (Figure 2 legend), "Killer Zipper" is misspelled as "Kipper Zipper". Out of curiosity, is there a way to remove or reduce SS04185-DN expression in the same manner as SS04185-MB reduction?

      We have corrected the text in the legend for Figure 2. As for the reviewer’s question, we did attempt to reduce or abolish SS04185-DN expression with tsh-LexA and LexAop-Kip+ but found no effect. Other identified LexA constructs with SeIN128 expression, however, all showed SS04185-MB expression. Consequently, we could not use these constructs because they inhibit both SeIN128 and SS04185-DN.

      (8) The color coding of Figure 2 (especially in D) makes it hard to distinguish between the brown and red groups.

      We thank the reviewer for the suggestion. Accordingly, we have changed the color for the brown group to orange.

      (9) In line 926 (Figure S2 legends), the description of F and G seems inverted.

      We thank the reviewer for pointing out the error. We have revised the text from “(F) has only SS04185-MB expression, and (G) has both SS04185-DN and SS04185-MB expression” to “(F) has both SS04185-DN and SS04185-MB expression, and (G) has only SS04185-MB expression.”

      (10) Figure 7B: which line does the top group of asterisks belong to?

      The top group of asterisks indicates that each experimental group differs significantly (p < 0.001) from the control group. We have revised the figure to clarify the comparisons indicated by the asterisks in Figure 7B, as well as the figure legend below (Line 890-894).

      “(B) Cumulative plot of rolling duration. Statistics: Kruskal-Wallis test: H = 69.52, p < 0.001; Bonferroni-corrected Mann-Whitney test, p < 0.001 between control and the GABA-B-R11, GABA-B-R12 and GABA-B-R2 RNAi groups, p < 0.001 between GABA-A-R and all other experimental RNAi groups. Sample size for the colored bars from top (control, black) to bottom (GABA-A-R, red); n = 520, 488, 387, 582, 306.”
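
      As an illustration of the correction named in this legend, the following is a minimal Python sketch of a Bonferroni adjustment applied to pairwise p-values. The function name and the raw p-values are hypothetical and are not taken from the study; the sketch does not reproduce the Kruskal-Wallis or Mann-Whitney tests themselves.

```python
def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons (not study data).
raw = [0.0002, 0.01, 0.04]
adjusted = bonferroni(raw)

# A comparison counts as significant only if its adjusted p-value is below alpha.
alpha = 0.05
significant = [p < alpha for p in adjusted]
```

      With these assumed values, the third comparison no longer passes the 0.05 threshold after correction, which is the intended effect of controlling for multiple comparisons.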

      (11) Figure S8 D and F: indicate Basin-2 or Basin-4 activation on graph.

      We have revised Figure 8 – figure supplement D and F accordingly.

      Reviewer #3 (Recommendations For The Authors):

      (1) Lines 86-87: Text needs to be rewritten for clarity. Also, include the genotype in the corresponding figure legend (Figure 1B).

      We thank the reviewer for pointing this out. We have clarified the text accordingly and included the genotype in the figure legend (lines 86 and 87). Specifically, we have revised Figure 1B (New Figure 1C and D) and adjusted the legend accordingly as follows. 

      Lines 86 and 87: Crawling speed during the activation of all Basins following rolling was ~1.5 times that of the crawling speed at baseline (Figure 1D).

      (2) Include the protocol for heat shock-FLP out experiments

      We have added the following paragraph to the Methods section describing the heat shock-FlpOut experiments (lines 537 to 546).

      “Heat shock FlpOut mosaic expression

      First instar Drosophila larvae were exposed to heat shock in a water bath at 37°C for 12 min as previously described (Nern et al., 2015). With precise temporal and thermal control of heat shock, larvae with genotype w+, hs(KDRT.stop)FLP/13xLexAop2-IVS-CsChrimson::tdTomato; R54B01-Gal4.AD/72F11LexA;20xUAS-(FRT.stop)-CsChrimson::mVenus/R46E07-Gal4.DBD showed sporadic CsChrimson::mVenus expression driven by SS04185 split GAL4. As a result, the ratio of the larvae with SS04185-DN and SS04185-MB expression to those with only SS04185-MB expression was 1:1. Each larva was individually examined with optogenetic stimulation and behavior analysis. After behavioral experiments, mVenus expression in CNS was confirmed under the fluorescence microscope.”

      (3) In the immunohistochemistry, the authors exclude the steps for washings. Recommend the authors to cite the previous literature. Similar to the other protocols detailed in the methods.

      We have added a brief description of the steps involved in washing (lines 641 and 648). We have also provided a citation with similar immunohistology protocols (Patel, 1994).

      (4) Keeping the same Y-axis scale for similar graphical representation would be helpful to compare across different experimental conditions and genotypes-for example, 2E and 2H for the start of the first crawl.

      As suggested by the reviewer, we have adjusted the y-axis scales for Figure 2E and H to be identical.

      (5) The color schematics used for the graph make it hard to visualize the data. The author might reconsider the better presentation of the data by avoiding darker colors.

      We thank the reviewer for the constructive suggestion. We have lightened the shading of all violin plots. We have also modified the shading for the middle group in Figure 2C and E from dark brown to orange.

      (6) Co-activation of the SS04185 and Basins in the figures represented as Basins+SS04185 (Figure 1A) and SS04185 (rest of the figures). Authors might reconsider this terminology to define and distinguish the coactivation of SS04185 and Basins neurons from the activation of SS04185 or Basins alone. It needs to be clarified in the figures.

      We have adjusted the terminology by including “Basins>Chrimson” in all panels in which Basin neurons are optogenetically activated to trigger rolling in the background for all groups. Additionally, we have labeled the control group as “Control” and the experimental group as “SS04185”.

      (7) Figure 4A, summarizes the synaptic connection and strength between different neurons - SeIN128, Basins, A00c and mdIV. However, the nature of these synaptic connections - excitatory and inhibitory- is not represented. Based on the previous and current studies, the authors consider providing the schematic for circuit mechanisms of escape behavior sequences in larvae. Also, discussing these findings in light of the downstream output circuit and motor regulation might be informative (See Cooney et al. 2023, PNAS).

      As the reviewer correctly points out, the diagram of the connectome shown in Figure 4A does not indicate whether the connections are excitatory or inhibitory. Accordingly, we have added a new summary panel (Figure 8I) based on the results of examining GABAergic synapses (Figure 5A). The schematics in Figure 8I depict how the joint activity of inhibitory and excitatory synapses (indicated by arrowheads and blunt ends, respectively) may lead to rolling or fast crawling.

      We have also added a section to the Discussion on the premotor circuits for crawling and rolling (Lines 512 to 519).

      (8) Percentage rolling present in figure 5B and 6A correspond to the control larvae 13xLexAop2-IVS-CsChrimson::mVenus; R72F11-lexA/+; HMS02355/+ and 13xLexAop2-IVS- Cs-Chrimson::mVenus; R72F11-lexA/+; UAS-TeTxLC.tnt/+. How does the author interpret the observed variability across the experiments? The author might consider discussing the genetic background effect on the observed behaviors, if any.

      As pointed out by the reviewer, we noticed that rolling probability varied depending on genetic background. We have revised the text accordingly (Lines 277 to 280).

      (9) Recheck the arrowheads in Figure 5A.

      We have confirmed the positions of the arrowheads in Figure 5A and modified the figures by outlining the cells with dotted lines.

      (10) Lines 295-298: Data presented in the supplementary figure and p-values in the text (p=0.11) suggest that the first crawl's onset is comparable to controls. Rewrite this text for clarity and include the statistical values in the supplemental figure 6.

      We have revised the text as follows (Lines 302 to 305).

      “Although the duration of each rolling bout, time to onset of the first rolling bout, and time to onset of the first crawling bout did not differ from those of controls (Figure 6–figure supplement 1D, E and G), the time to offset of the first rolling bout was delayed relative to controls (p = 0.013 for Figure 6–figure supplement 1F).”

      (11) Lines 263-264: Data provide evidence for SS04185 receiving inputs Basin-2 and A00c neurons. SS04185, which provides inputs to other neurons, specifically A00c neurons, but still needs clarification.

      We have revised the text as follows (Lines 264 to 266).

      The results thus far indicate that activation of SeIN128 neurons inhibits rolling (Figure 1A–C); SeIN128 neurons receive functional inputs from Basin-2 and A00c (Figure 4A-C); and SeIN128 neurons make anatomical connections onto Basin-2 and A00c (Figure 4A).

      (12) In the table that lists the genotypes, instead of '-' or the blank space in the label column, the author might consider using 'control,' consistent with the figures.

      In accord with the reviewer’s suggestion, we have revised the notation of ‘-’ or the blank space, to ‘control’ for all figures.

      (13) Check the typographical errors throughout the manuscript. Some below:

      We have revised the text accordingly as suggested below.

      a.  Lines 100, 142: SS4185 should be SS04185

      b.  Line 230: A00C should be A00c

      c.  Line 180: Expand VNC

      d.  10xUAS-IVS-mry::GFP should be 10xUAS-IVS-myr::GFP

      e.  Lines 444, 449: drosophila should be Drosophila

    1. eLife assessment

      This important paper shows that the anti-gremlin-1 (GREM1) antibody is not effective at treating liver inflammation or fibrosis. Critically, the evidence also challenges existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells. The conclusions are supported by a convincing, well-controlled set of experiments.

    2. Reviewer #1 (Public Review):

      Summary:

      Horn and colleagues present data suggesting that the targeting of GREM1 has little impact on a mouse model of metabolic dysfunction-associated steatohepatitis. Importantly, they also challenge existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells.

      Strengths:

      This is an impressive tour-de-force study around the potential of targeting GREM1 in MASH.

      This paper will challenge many existing papers in the field around our ability to detect GREM1 in circulation, at least using antibody-mediated detection.

      Well-controlled, detailed studies like this are critically important in order to challenge less vigorous studies in the literature.

      The impressive volume of high-level, well-controlled data using an impressive range of in vitro biochemical techniques, rodent models, and human liver slices.

      Weaknesses: only minor.

      (1) The authors clearly show that heparin can limit the diffusion of GREM1 into the circulation-however, in a setting where GREM1 is produced in excess (e.g. cancer), could this "saturate" the available heparin and allow GREM1 to "escape" into the circulation?

      (2) Secondly, have the authors considered that GREM1 may be circulating bound to a chaperone protein like albumin, which would reduce its reactivity with GREM1 detection antibodies?

      (3) Statistics-there is no mention of blinding of samples-I assume this was done prior to analysis?

      (4) Line 211-I suggest adding the Figure reference at the end of this sentence to direct the reader to the relevant data.

      (5) Figure 1E Y-axis units are a little hard to interpret-can integers be used?

      (6) Did the authors attempt to detect GREM1 protein by IHC? There are published methods for this using the R&D Systems mouse antibody (PMID 31384391).

      (7) Did the authors ever observe GREM1 internalisation using their Atto-532 labelled GREM1?

      (8) Did the authors complete GREM1 ISH in the rat CDAA-HFD model? Was GREM1 upregulated, and if so, where?

      (9) Supplementary Figure 4C - why does the GFP level decrease in the GREM1 transgenic compared to the control GFP mouse? No such change is observed in Supplementary Figure 4E.

    3. Reviewer #2 (Public Review):

      It is controversial whether liver gremlin-1 expression correlates with liver fibrosis in metabolic dysfunction-associated steatohepatitis (MASH). Horn et al. developed an anti-Gremlin-1 antibody in-house and tested its ability to neutralize gremlin-1 and treat liver fibrosis. This article has the advantage of testing its hypothesis with different animal and human liver fibrosis models and using a variety of research methodologies.

      The experimental design and results support the conclusion that the anti-gremlin-1 antibody had no therapeutic effect on treating liver fibrosis, so there are no other suggestions for new experiments:

      (1) The authors used RNAscope in situ hybridization to establish the correlation between Gremlin-1 expression and NMSH livers or cell lines.

      (2) A luminescent oxygen channelling immunoassay was used to measure circulating Gremlin-1 concentration. They found that Gremlin-1 binds to heparin very efficiently, preventing Gremlin-1 from entering circulation, and restricting Gremlin-1's ability to mediate organ cross-communication.

      (3) The authors developed a suitable NMSH rat model which is a choline-deficient, L-amino acid defined high fat 1% cholesterol diet (CDAA-HFD) fed rat model of NMSH, and created a selective anti-Gremlin-1 antibody which is heparin-displacing 0030:HD antibody. They also used human cirrhotic precision-cut liver slices to test their hypotheses. They demonstrated that neutralization of Gremlin-1 activity with monoclonal therapeutic antibodies does not reduce liver inflammation or liver fibrosis.

      One concern is that several reagents and assays are made in-house without external validation. Also, will those in-house reagents and assays be available to the science community?

      Overall this manuscript provides useful information that gremlin-1 has a limited role in liver fibrosis pathogenesis and treatment.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      Horn and colleagues present data suggesting that the targeting of GREM1 has little impact on a mouse model of metabolic dysfunction-associated steatohepatitis. Importantly, they also challenge existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells.

      Strengths:

      This is an impressive tour-de-force study around the potential of targeting GREM1 in MASH.

      This paper will challenge many existing papers in the field around our ability to detect GREM1 in circulation, at least using antibody-mediated detection.

      Well-controlled, detailed studies like this are critically important in order to challenge less vigorous studies in the literature.

      The impressive volume of high-level, well-controlled data using an impressive range of in vitro biochemical techniques, rodent models, and human liver slices.

      We thank the reviewer for their time in assessing our manuscript and are very grateful for the positive response. Below, we give a point-by-point response to the reviewer’s comments and indicate where we plan to adjust the manuscript.

      Weaknesses: only minor.

      (1) The authors clearly show that heparin can limit the diffusion of GREM1 into the circulation-however, in a setting where GREM1 is produced in excess (e.g. cancer), could this "saturate" the available heparin and allow GREM1 to "escape" into the circulation?

      We thank the reviewer for their question. Indeed, theoretically, if the production of Gremlin-1 exceeds the capacity of heparin to immobilise Gremlin-1, the protein may be released into solution and thus may enter the circulation. Whilst we have not addressed this possibility in our studies, we agree that it may be a mechanism worth exploring in future studies.

      (2) Secondly, have the authors considered that GREM1 may be circulating bound to a chaperone protein like albumin, which would reduce its reactivity with GREM1 detection antibodies?

      We have considered the possibility that Gremlin-1 binds other proteins such as BMPs and thereby masks assay-antibody epitopes. To minimise this possibility, we used antibody pairs which bind different epitopes. We also used LC-MS for Gremlin-1 detection (data not shown in the manuscript), a method that is not affected by epitope masking. With the LC-MS analysis we did not pick up any Gremlin-1 signal in plasma. We will mention the LC-MS data in the updated manuscript.

      Also, we were able to detect circulating Gremlin-1 after treatment with anti-Gremlin-1 antibodies. As these were the same antibodies that were used in our assays, we should not have been able to detect Gremlin-1 if there had been a masking interaction with highly abundant circulating plasma proteins such as albumin.

      Finally, we believe that the assay antibodies would outcompete binding of any other proteins because of their high affinity and very high concentrations used in the assays.

      In summary, we are very confident that Gremlin-1 is not present in circulation. We will, however, make some minor adjustments to the manuscript in order to stress this important point.

      (3) Statistics-there is no mention of blinding of samples-I assume this was done prior to analysis?

      All reported results were derived from hard quantitative readouts obtained through assays that are not liable to subjective interpretation. This also applies to immunohistochemistry and RNAscope histologic quantification, using Visiopharm Integrator System software ver. 8.4 or HALO v3.5.3577 (Area Quantification v2.4.2 module), respectively. Therefore, no blinding was necessary prior to analysis.

      (4) Line 211-I suggest adding the Figure reference at the end of this sentence to direct the reader to the relevant data.

      We thank the reviewer for the suggestion and will add a reference to Figure 1F here.

      (5) Figure 1E Y-axis units are a little hard to interpret-can integers be used?

      As the y axis in Figure 1E is on the logarithmic scale, integer numbers would be very hard to read because of the large range of numbers. As we acknowledge that the notation used may be difficult to read, we will change it to superscript scientific notation.
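
      To illustrate the superscript notation mentioned here, the following is a minimal Python sketch that formats powers of ten as tick labels. The function name and the exponent range are assumptions for illustration only; they do not reflect the actual axis range of Figure 1E or the plotting software used.

```python
# Translation table mapping ASCII digits and the minus sign to Unicode superscripts.
_SUPERSCRIPTS = str.maketrans("-0123456789", "⁻⁰¹²³⁴⁵⁶⁷⁸⁹")


def power_of_ten_label(exponent: int) -> str:
    """Return a label such as 10³ for a tick at 10**exponent on a log axis."""
    return "10" + str(exponent).translate(_SUPERSCRIPTS)


# Hypothetical exponents spanning a wide logarithmic range.
labels = [power_of_ten_label(e) for e in range(-1, 4)]
```

      Labels of this kind stay compact however many orders of magnitude the axis covers, unlike integers written out in full.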

      (6) Did the authors attempt to detect GREM1 protein by IHC? There are published methods for this using the R&D Systems mouse antibody (PMID 31384391).

      Parallel to the work described in PMID 31384391 (Dutton et al., Oncotarget, 10: 4630-4639, 2019), we have tested a whole range of commercial and in-house gremlin-1 antibodies. We independently arrived at the same conclusion as Dutton et al., namely that the goat anti-gremlin antibody R&D Systems AF956 can stain the mouse or rat intestine in the muscularis layer and in the crypts/lower part of the villi, using FFPE sections. As per Dutton et al., we also corroborated this IHC staining by RNAscope: the mRNA was restricted to the muscularis and the connective tissue just below the crypts, suggesting that Gremlin-1 partially diffuses away from the cells that produce it. In contrast, none of the other commercial or in-house gremlin antibodies that we tested provided any useful staining on FFPE sections.

      We also used the R&D Systems AF956 antibody on several rat MASH liver samples. We saw little or no staining in livers from chow-fed rats, with only occasional weak staining around portal areas. Depending on the rat model, staining ranged from little or none to, at most, weak staining in portal areas and fibrotic areas. Among the various models tested, we observed the strongest staining in the rat CDAA-HFD+cholesterol model, in line with the ISH data.

      However, we were unable to establish IHC on human MASH liver samples using the R&D Systems AF956 antibody (or any other antibody) despite 98% sequence identity at the amino acid level between human and rat gremlin-1. Considering the results in Dutton et al. on rodent intestines, we tested the antibody on some human intestine samples, but the results on the available samples (inflamed appendices) were inconclusive.

      We will include representative IHC staining images for Gremlin-1 protein on rat livers as a Supplementary Figure and mention in the manuscript that IHC for human Gremlin-1 did not work with the available antibodies.

      (7) Did the authors ever observe GREM1 internalisation using their Atto-532 labelled GREM1?

      The Atto-532 Gremlin-1 cell association assay was mainly intended to visualise the association of Gremlin-1 with cell surface proteoglycans and how this interaction is affected by heparin-displacing and non-displacing antibodies. We observed a possible, but inconclusive intracellular association of Atto-532 Gremlin-1. However, this assay was not specifically designed for this purpose, and we did not follow up on this. Therefore, we cannot draw any conclusions on whether cell surface bound Gremlin-1 can be internalised. However, we appreciate that internalisation of Gremlin-1 would be an interesting biological mechanism worth following up in future studies.

      (8) Did the authors complete GREM1 ISH in the rat CDAA-HFD model? Was GREM1 upregulated, and if so, where?

      We have performed Grem1 ISH in the rat CDAA-HFD model and representative images of this are shown in Figure 1F. In chow-fed animals, Grem1 was expressed in a few cells in the portal tract, whereas after CDAA-HFD, Grem1 positive cells became more abundant in the portal tract and were also detectable in the fibrotic septa, as described in the respective results section. However, we performed no co-staining with other markers as we did for human liver samples.

      (9) Supplementary Figure 4C - why does the GFP level decrease in the GREM1 transgenic compared to the control GFP mouse? No such change is observed in Supplementary Figure 4E.

      In Supplementary Figure 4C we show expression of GFP mRNA and GREM1 mRNA in lysates of GFP-control and GREM1-GFP overexpressing LX-2 cells. The x-axis labels indicate the different lentiviruses. Therefore, the right panel in Supplementary Figure 4C shows that GREM1 overexpressing LX-2 cells expressed more GREM1 compared to GFP-control transduced LX-2, while GFP mRNA expression was comparable between the two.

      The results in Supplementary Figure 4E look different because – as can also be seen from the % of GFP+ cells in Supplementary Figure 4D – the GREM1 lentivirus here was more effective in transducing the cells, which is why both GFP and GREM1 mRNA were increased with GREM1 lentivirus compared to the GFP-only control. Unlike LX-2, the lentivirally transduced HHSC were not sorted on GFP positive cells prior to qPCR, which may explain the differences in GFP mRNA expression pattern between the two cell types.

      We acknowledge that the figure may be difficult to interpret and will adjust the figure annotation to improve on this.

      Reviewer #2 (Public Review):

      It is controversial whether liver gremlin-1 expression correlates with liver fibrosis in metabolic dysfunction-associated steatohepatitis (MASH). Horn et al. developed an anti-Gremlin-1 antibody in-house and tested its ability to neutralize gremlin-1 and treat liver fibrosis. This article has the advantage of testing its hypothesis with different animal and human liver fibrosis models and using a variety of research methodologies.

      The experimental design and results support the conclusion that the anti-gremlin-1 antibody had no therapeutic effect on treating liver fibrosis, so there are no other suggestions for new experiments:

      (1) The authors used RNAscope in situ hybridization to establish the correlation between Gremlin-1 expression and NMSH livers or cell lines.

      (2) A luminescent oxygen channelling immunoassay was used to measure circulating Gremlin-1 concentration. They found that Gremlin-1 binds to heparin very efficiently, preventing Gremlin-1 from entering circulation, and restricting Gremlin-1's ability to mediate organ cross-communication.

      (3) The authors developed a suitable NMSH rat model which is a choline-deficient, L-amino acid defined high fat 1% cholesterol diet (CDAA-HFD) fed rat model of NMSH, and created a selective anti-Gremlin-1 antibody which is heparin-displacing 0030:HD antibody. They also used human cirrhotic precision-cut liver slices to test their hypotheses. They demonstrated that neutralization of Gremlin-1 activity with monoclonal therapeutic antibodies does not reduce liver inflammation or liver fibrosis.

      One concern is that several reagents and assays are made in-house without external validation. Also, will those in-house reagents and assays be available to the science community?

      Overall this manuscript provides useful information that gremlin-1 has a limited role in liver fibrosis pathogenesis and treatment.

      We thank the reviewer for their time in assessing our manuscript and are very grateful for the positive response. We acknowledge that most of our results were derived from assays using in-house generated reagents, which will therefore be hard to reproduce externally. Whilst for legal reasons we cannot share the sequences of the monoclonal antibodies, we will be able to share aliquots with fellow scientists upon request. We will add a sentence to this effect to the data availability statement.

    1. eLife assessment

      This study presents a valuable methodological advancement in quantifying thoughts over time. A novel multi-dimensional experience-sampling approach is used to identify data-driven patterns that the authors use to interrogate fMRI data collected during naturalistic movie-watching. The experimentation is inventive and the analyses carried out are convincing, although the conceptualization of thoughts remains too vague to allow for a clear interpretation of results.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors used a novel multi-dimensional experience sampling (mDES) approach to identify data-driven patterns of experience samples, which they use to interrogate fMRI data collected during naturalistic movie-watching. They identify a set of multi-sensory features of a set of movies that delineate low-dimensional gradients of BOLD fMRI signal patterns that have previously been linked to fundamental axes of cortical organization.

      Strengths:

      The novel solution to challenges associated with experience sampling offers potential access to aspects of experience that have been challenging to assess. While inventive, I worry that the reliability of the mDES approach is currently under-investigated, making it challenging to interpret the import of the later analyses, which are themselves strong and compelling.

      Weaknesses:

      The lack of direct interrogation of individual differences/reliability of the mDES scores warrants some pause.

    3. Reviewer #2 (Public Review):

      Summary:

      The present study explores how thoughts map onto brain activity, a notoriously challenging question because of the dynamic, subjective, and abstract nature of thoughts. To tackle this question, the authors collected continuous thought ratings from participants watching a movie, and additionally made use of an open-source fMRI dataset recorded during movie watching as well as five established gradients of brain variation as identified in resting state data. Using a voxel-space approach, the results show that episodic knowledge, verbal detail, and sensory engagement of thoughts commonly modulate the activation of the visual and auditory cortex, while intrusive distraction modulates the frontoparietal network. Additionally, sensory engagement is mapped onto a gradient from the primary to the association cortex, while episodic knowledge is mapped onto a gradient from the dorsal attention network to the visual cortex. Building on the association between behavioral performance and neural activation, the authors conclude that sensory coupling to external input and frontoparietal executive control is key to comprehension in naturalistic settings.

      The manuscript stands out for its methodological advancements in quantifying thoughts over time and its aim to study the implementation of thoughts in the brain during naturalistic movie watching. However, the conceptualization of thoughts remains vague, its distinction from other concepts like attention is unclear, and interindividual differences are not sufficiently addressed, limiting the study's insights into brain function.

      Strengths:

      (1) The study raises a question that has been difficult to study in naturalistic settings so far but is key to understanding human cognition, namely how thoughts map onto brain activation.

      (2) The thought ratings introduce a novel method for continuously tracking thoughts, promising utility beyond this study.

      (3) The authors substantiated the effects of thinking from multiple perspectives, using diverse data types, metrics, and analyses.

      (4) The figures are highly informative, accessible, and consistent, aiding comprehension.

      Weaknesses:

      (1) The dimensions of thought seem to distinguish between sensory and executive processing states. However, it is unclear if this effect primarily pertains to thinking. I could imagine highly intrusive distractions in movie segments to correlate with stagnating plot development, little change in scenery, or incomprehensible events. Put differently, it may primarily be the properties of the movies that evoke different processing modes, but these properties are not accounted for. For example, I'm wondering whether a simple measure of engagement with stimulus materials could explain the effects just as much. How can the effects of thinking be distinguished from the perceptual and semantic properties of the movie, as well as attentional effects? Is the measure used here capturing thought processes beyond what other factors could explain?

      (2) I'm skeptical about taking human thought ratings at face value. Intrusive distraction might imply disengagement from stimulus materials, but it could also be an intended effect of the movie to trigger higher-level, abstract thinking. Can a label like intrusive distraction be misleading without considering the actual thought and movie content?

      (3) A jittered sampling approach is used to acquire thought ratings every 15 seconds. Are ratings for the same time point averaged across participants? If so, how consistent are ratings among participants? High consistency would suggest thoughts are mainly stimulus-evoked. Low consistency would question the validity of applying ratings from one (group of) participant(s) to brain-related analyses of another participant.

      (4) Using three different movies to conclude that different genres evoke different thought patterns (e.g., line 277) seems like an overinterpretation with only one instance per genre.

      (5) I see no indication that results were cross-validated, and no effect sizes are reported, leaving the robustness and strength of effects unknown.
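
      The consistency question raised in point (3) could be quantified in several ways; the following is a minimal Python sketch of one option, the mean pairwise Pearson correlation between participants' rating timecourses. It assumes that several participants rated the same time points, which may not hold for the jittered design described, and the ratings shown are hypothetical rather than study data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_pairwise_correlation(ratings):
    """Average Pearson correlation over all pairs of participants."""
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings: one row per participant, one column per time point.
ratings = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [5, 4, 3, 2, 1],
]
consistency = mean_pairwise_correlation(ratings)
```

      Values near 1 would suggest that ratings are largely stimulus-evoked, whereas values near 0 or below would indicate that group-level ratings describe individual participants poorly.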

    4. Reviewer #3 (Public Review):

      This study attempted to investigate the relationship between processing in the human brain during movie watching and corresponding thought processes. This is a highly interesting question, as movie watching presents a semi-constrained task, combining naturally occurring thoughts and common processing of sensory inputs across participants. This task is inherently difficult because in order to know what participants are thinking at any given moment, one has to interrupt the same thought process which is the object of study.

      This study attempts to deal with this issue by aggregating staggered experience sampling data across participants in one behavioral study and using the population-level thought patterns to model brain activity in different participants in an open-access fMRI dataset.

      The behavioral data come from 120 participants who watched three 11-minute movie clips. Participants responded to the mDES questionnaire (16 visual scales characterizing ongoing thought) five times, two minutes apart, in each clip. The 16 items are first reduced to 4 factors using PCA, and their levels are compared across the different movies. The factors are "episodic knowledge", "intrusive distraction", "verbal detail", and "sensory engagement". The factors differ between the clips; distraction is negatively correlated with movie comprehension, and sensory engagement is positively correlated with it.

      The components are aggregated across participants (transforming single-subject mDES answers into PCA space and concatenating responses of different participants), and are used as regressors in a GLM analysis. This analysis identifies brain regions corresponding to the components. The resulting brain maps reveal activations that are consistent with the proposed mental processes (e.g. negative loading for intrusion in the frontoparietal network, and positive loadings for visual and auditory cortices for sensory engagement).

      Then, the coordinates for brain regions that were significant for more than one component are entered into a paper search in neurosynth. It is not clear what this analysis demonstrates beyond the fact that sensory engagement contains both visual and auditory components.

      The next analysis projected group-averaged brain activation onto gradients (based on previous work) and used gradient timecourses to predict the behavioral report timecourses. This revealed that high activations in gradient 1 (sensory→association) predicted high sensory engagement, and that "episodic knowledge" thought patterns were predicted by increased visual cortex activations. Then, permutation tests were performed to see whether these thought pattern-related activations corresponded to well-defined regions on a given cluster.

      This paper is framed as presenting a new paradigm, but it does little to discuss what purpose this paradigm serves, what its limitations are, and how it should have been tested. I assume that the novelty is in using experience sampling from one sample to model the responses of a second sample.

      What are the considerations for treating high-order thought patterns that occur during film viewing as stable enough to be used across participants? What would be the limitations of this method? (Do all people reading this paper think comparable thoughts reading through the sections?)

      How does this approach differ from collaborative filtering (for example, as presented in Chang et al., 2021)?

      In conclusion, this study tackles a highly interesting subject and does it creatively and expertly. It fails to discuss and establish the utility and appropriateness of its proposed method.

      Luke J. Chang et al., Endogenous variation in ventromedial prefrontal cortex state dynamics during naturalistic viewing reflects affective experience. Sci. Adv. 7, eabf7129 (2021). DOI: 10.1126/sciadv.abf7129

    1. eLife assessment

      This important study provides new insights into the mechanisms that underlie perceptual and attentional impairments of conscious access. The paper presents convincing evidence of a dissociation between the early stages of low-level perception, which are impermeable to perceptual or attentional impairments, and subsequent stages of visual integration which are susceptible to perceptual impairment but resilient to attentional manipulations. This study will be of interest to scientists working on visual perception and consciousness.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing.

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text.

      (1) The perceptual performance in Fig. 1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to post-selection or exclusion of participants, but at the same time do not discuss this equally important point.

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and non-illusory stimuli share the same shape, more elaborate object processing could also be occurring. Please discuss.

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blink or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead.

      (4) The two paradigms developed here could be used jointly to highlight non-idiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer?

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy?
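
      To illustrate the cross-condition test asked about in point (4), the sketch below trains a classifier on trials from one paradigm and evaluates it on trials from the other. It is a toy example under stated assumptions, not the authors' decoding pipeline: a nearest-centroid classifier stands in for whatever classifier was actually used, and the simulated feature vectors, function names, and variable names are all hypothetical.

```python
import random


def fit_centroids(X, y):
    """Return a dict mapping each class label to its mean feature vector."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
    return min(centroids, key=sqdist)


def accuracy(centroids, X, y):
    hits = sum(predict(centroids, x) == t for x, t in zip(X, y))
    return hits / len(y)


def simulate(n_trials, offset, rng, n_features=8, noise=1.0):
    """Toy trials (assumption): class 1 shifts every feature by `offset`."""
    X, y = [], []
    for i in range(n_trials):
        label = i % 2
        X.append([label * offset + rng.gauss(0, noise)
                  for _ in range(n_features)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_ab, y_ab = simulate(200, 1.0, rng)      # stand-in for attentional blink trials
X_mask, y_mask = simulate(200, 1.0, rng)  # stand-in for masking trials

# Train on one paradigm, test on the other, in both directions.
acc_ab_to_mask = accuracy(fit_centroids(X_ab, y_ab), X_mask, y_mask)
acc_mask_to_ab = accuracy(fit_centroids(X_mask, y_mask), X_ab, y_ab)
```

      Above-chance accuracy in both directions would suggest a shared representation across the two manipulations; with real EEG data this would be repeated at each time point.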

    3. Reviewer #2 (Public Review):

      Summary:

      This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access: (i) an early (< 80 ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later and more integrated stage of processing (200-250 ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event.

      Strengths:

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing.

      Weaknesses:

      - The authors could improve clarity of the rich set of decoding analyses across conditions.
      - They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (revision of traditional concept of 'automaticity'), that may introduce some complexity in Results interpretation.
      - They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small group of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking).

    4. Reviewer #3 (Public Review):

      Summary:

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.

      Strengths:

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions.

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response).

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception.

      Weaknesses:

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB.

      The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which led to the assumption that local contrast, collinearity, and the illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. They all mention feedback connections, but none seem to point to lateral connections.

      Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions.

    1. eLife assessment

      This fundamental state-of-the-art modeling study explores neural mechanisms underlying walking control in cats, demonstrating the probability of three different states of operation of the spinal cord circuits generating locomotion at different speeds. The biophysical modeling sufficiently reproduces and provides explanations for experimental data on how the locomotor cycle and phase durations depend on treadmill walking speed. It also points to new principles of functional architecture and operating regimes underlying how spinal circuits interact with supraspinal signals and limb sensory feedback signals to produce different locomotor behaviors at different speeds, which are major unresolved problems in the field. The modeling evidence is compelling, especially in advancing our understanding of locomotion control mechanisms, and will interest neuroscientists studying the neural control of movement.

    2. Reviewer #1 (Public Review):

      Summary:

      It is suggested that for each limb the RG (rhythm generator) can operate in three different regimes: a non-oscillating state-machine regime, a flexor-driven oscillatory regime, and a classical half-center oscillatory regime. This means that the field can move away from the old concept that there is only room for the classic half-center organization.

      Strengths:

      A major benefit of the present paper is that a bridge was made between various CPG concepts ("a potential contradiction between the classical half-center and flexor-driven concepts of spinal RG operation"). Another important step forward is the proposal about the neural control of slow gait ("at slow speeds (≤ 0.35 m/s), the spinal network operates in a state regime and requires external inputs for phase transitions, which can come from limb sensory feedback and/or volitional inputs (e.g. from the motor cortex)").

      Weaknesses:

      Some references are missing.