  1. Nov 2024
    1. Reviewer #1 (Public review):

      Summary:

      Zhang et al. addressed the question of whether hyperaltruistic preference is modulated by decision context, and tested how oxytocin (OXT) may modulate this process. Using an adapted version of a previously well-established moral decision-making task, healthy human participants chose between an option that gained more money (or lost less, depending on the decision context) while delivering more painful shocks to either themselves or another person (the recipient), and an alternative that always offered less gain (or more loss) with less pain. Through a series of regression analyses, the authors reported that hyperaltruistic preference was found only in the gain context and not in the loss context; however, OXT reestablished the hyperaltruistic preference in the loss context, similar to that in the gain context.

      Strengths:

      This is a solid study that directly adapted a previously well-established task and the analytical pipeline to assess hyperaltruistic preference in separate decision contexts. Context-dependent decisions have gained more and more attention in literature in recent years, hence this study is timely. It also links individual traits (via questionnaires) with task performance, to test potential individual differences. The OXT study is done with great methodological rigor, including pre-registration. Both studies have proper power analysis to determine the sample size.

      Weaknesses:

      Despite the strengths, multiple analytical decisions have to be explained, justified, or clarified. Also, there is scope to enhance the clarity and coherence of the writing - as it stands, readers will have to go back and forth to search for information. Last, it would be helpful to add line numbers in the manuscript during the revision, as this will help all reviewers to locate the parts we are talking about.

      (1) Introduction: The introduction is somewhat unmotivated, with key terms/concepts left unexplained until relatively late in the manuscript. One of the main focuses in this work is "hyperaltruistic", but how is this defined? It seems that the authors take the meaning of "willing to pay more to reduce other's pain than their own pain", but is this what the task is measuring? Did participants ever need to PAY something to reduce the other's pain? Note that some previous studies indeed allow participants to pay something to reduce other's pain. And what makes it "HYPER-altruistic" rather than simply "altruistic"? Plus, in the intro, the authors mentioned that the "boundary conditions" remain unexplored, but this idea is never touched again. What do boundary conditions mean here in this task? How do the results/data help with finding out the boundary conditions? Can this be discussed within wider literature in the Discussion section? Last, what motivated the authors to examine the decision context? It comes somewhat out of the blue that the opening paragraph states that "We set out to [...] decision context", but why? Are there other important factors? Why is decision context more important to study than those others?

      (2) Experimental Design:

      (2a) The experiment per se is largely solid, as it followed a previously well-established protocol. But I am curious about how the participants were instructed. Did the experimenter ever mention the word "help" or "harm" to the participants? It would be helpful to include the exact instructions in the SI.

      (2b) Relatedly, the experimental details were not quite comprehensive in the main text. Indeed, the Methods come after the main text, but to be able to guide readers to understand what was going on, it would be very helpful if the authors could include some necessary experimental details at the beginning of the Results section.

      (3) Statistical Analysis:

      (3a) One of the main analyses uses the harm aversion model (Eq1) and the results section keeps referring to one of its key parameters (i.e., k). However, it is difficult to understand the text without going to the Methods section below. Hence it would be very helpful to repeat the equation in the main text. The same applies to the delta_m and delta_s terms: it would be very helpful to state their meaning clearly, as nearly all analyses rely on knowing what they mean.

      (3b) There is one additional parameter gamma (choice consistency) in the model. Did the authors also examine the task-related difference of gamma? This might be important as some studies have shown that the other-oriented choice consistency may differ in different prosocial contexts.

      (3c) I am not fully convinced that the authors needed to include two types of models: the harm aversion model and the logistic regression models. Indeed, the models look similar, and the authors have acknowledged that. But I wonder if there is a way to combine them? For example:

      Choice ~ delta_V * context * recipient (*Oxt_v._placebo)

      The calculation of delta_V follows Equation 1.

      Or the conceptual question is, if the authors were interested in the specific and independent contribution of delta_m and delta_s to behavior, as their logistic model did, why did the authors examine the harm aversion first, where a parameter k is controlling for the trade-off? One way to find it out is to properly run different models and run model comparisons. In the end, it would be beneficial to only focus on the "winning" model to draw inferences.

      (3d) The interpretation of the main OXT results needs to be more cautious. According to the operationalization, "hyperaltruistic" is the reduction of pain of others (higher % of choosing the less painful option) relative to the self. But relative to the placebo (as baseline), OXT did not increase the % of choosing the less painful option for others, rather, it decreased the % of choosing the less painful option for themselves. In other words, the degree of reducing other's pain is the same under OXT and placebo, but the degree of benefiting self-interest is reduced under OXT. I think this needs to be unpacked, and some of the wording needs to be changed. I am not very familiar with the OXT literature, but I believe it is very important to differentiate whether OXT is doing something on self-oriented actions vs other-oriented actions. Relatedly, for results such as that in Figure 5A, it would be helpful to not only look at the difference but also the actual magnitude of the sensitivity to the shocks, for self and others, under OXT and placebo.
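
      To make points (3a) to (3c) concrete, the following is a minimal sketch of a harm aversion model of the kind referred to above. It assumes the commonly used form in which the value difference is delta_V = (1 - k) * delta_m - k * delta_s and choices follow a logistic rule scaled by gamma; the manuscript's own Equation 1 is not reproduced in this review and may differ, so the function names, the sign conventions, and the AIC helper are illustrative assumptions rather than the authors' pipeline.

```python
import math


def delta_v(kappa, delta_m, delta_s):
    """Value of the more painful option relative to the less painful one.

    ASSUMED form (not taken from the manuscript):
    kappa   : harm aversion in [0, 1]; 0 = only money matters, 1 = only shocks
    delta_m : money advantage of the more painful option
    delta_s : additional shocks delivered by the more painful option
    """
    return (1.0 - kappa) * delta_m - kappa * delta_s


def p_more_painful(kappa, gamma, delta_m, delta_s):
    """Logistic probability of choosing the more painful option.

    gamma is the choice consistency (inverse temperature) parameter.
    """
    return 1.0 / (1.0 + math.exp(-gamma * delta_v(kappa, delta_m, delta_s)))


def neg_log_likelihood(kappa, gamma, trials):
    """trials: iterable of (delta_m, delta_s, chose_more_painful) tuples."""
    nll = 0.0
    for dm, ds, chose in trials:
        p = p_more_painful(kappa, gamma, dm, ds)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        nll -= math.log(p if chose else 1.0 - p)
    return nll


def aic(nll, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2.0 * n_params + 2.0 * nll
```

      Under this sketch, a hyperaltruistic preference would correspond to a larger fitted kappa for the other-recipient condition than for the self condition, and the model comparison requested in (3c) would amount to fitting each candidate model, computing a criterion such as `aic`, and drawing inferences from the winning model only.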

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors reported two studies where they investigated the context effect of hyperaltruistic tendency in moral decision-making. They replicated the hyperaltruistic moral preference in the gain domain, where participants inflicted electric shocks on themselves or another person in exchange for monetary profits for themselves. In the loss domain, such hyperaltruistic tendency is abolished. Interestingly, oxytocin administration reinstated the hyperaltruistic tendency in the loss domain. The authors also examined the correlation between individual differences in utilitarian psychology and the context effect of hyperaltruistic tendency.

      Strengths:

      (1) The research question - the boundary condition of hyperaltruistic tendency in moral decision-making and its neural basis - is theoretically important.

      (2) Manipulating the brain via pharmacological means offers a causal understanding of the neurobiological basis of the psychological phenomenon in question.

      (3) Individual difference analysis reveals interesting moderators of the behavioral tendency.

      Weaknesses:

      (1) The theoretical hypothesis needs to be better justified. There are studies addressing the neurobiological mechanism of hyperaltruistic tendency, which the authors unfortunately skipped entirely.

      (2) There are some important inconsistencies between the preregistration and the actual data collection/analysis, which the authors did not justify.

      (3) Some of the exploratory analysis seems underpowered (e.g., large multiple regression models with only about 40 participants).

      (4) Inaccurate conceptualization of utilitarian psychology and the questionnaire used to measure it.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors aimed to index individual variation in decision-making when decisions pit the interests of the self (gains in money, potential for electric shock) against the interests of an unknown stranger in another room (potential for unknown shock). In addition, the authors conducted an additional study in which male participants were either administered intranasal oxytocin or placebo before completing the task to identify the role of oxytocin in moderating task responses. Participants' choice data was analyzed using a harm aversion model in which choices were driven by the subjective value difference between the less and more painful options.

      Strengths:

      Overall I think this is a well-conducted, interesting, and novel set of research studies exploring decision-making that balances outcomes for the self versus a stranger, and the potential role of the hormone oxytocin (OT) in shaping these decisions. The pain component of the paradigm is well designed, as is the decision-making task, and overall the analyses were well suited to evaluating and interpreting the data. Advantages of the task design include the absence of deception, e.g., the use of a real study partner and real stakes, as a trial from the task was selected at random after the study and the choice the participant made was actually executed. 

      Weaknesses:

      The primary weakness of the paper concerns its framing. Although it purports to be measuring "hyper-altruism" it does not provide evidence to support why any of the behavior being measured is extreme enough to warrant the modifier "hyper" (and indeed throughout I believe the writing tends toward hyperbole, using, e.g., verbs like "obliterate" rather than "reduce"). More seriously, I do not believe that the task constitutes altruism, but rather the decision to engage, or not engage, in instrumental aggression.

      I found it surprising that a paradigm that entails deciding to hurt or not hurt someone else for personal benefit (whether acquiring a financial gain or avoiding a loss) would be described as measuring "altruism." Deciding to hurt someone for personal benefit is the definition of instrumental aggression. I did not see that any of the studies offered a possibility of acting to benefit the other participant in any condition. Altruism is not equivalent to refraining from engaging in instrumental aggression. True altruism would be to accept shocks to the self for the other's benefit (e.g., money). The interpretation of this task as assessing instrumental aggression is supported by the fact that only the Instrumental Harm subscale of the OUS was associated with outcomes in the task, but not the Impartial Benevolence subscale. By contrast, the IB subscale is the one more consistently associated with altruism (e.g., Kahane et al., 2018; Amormino et al., 2022). I believe it is important for scientific accuracy for the paper, including the title, to be re-written to reflect what it is testing.

      Relatedly: in the introduction I believe it would be important to discuss the non-symmetry of moral obligations related to help/harm--we have obligations not to harm strangers but no obligation to help strangers. This is another reason I do not think the term "hyper altruism" is a good description for this task--given it is typically viewed as morally obligatory not to harm strangers, choosing not to harm them is not "hyper" altruistic (and again, I do not view it as obviously altruism at all).

      The framing of the role of OT also felt incomplete. In introducing the potential relevance of OT to behavior in this task, it is important to pull in evidence from non-human animals on origins of OT as a hormone selected for its role in maternal care and defense (including defensive aggression). The non-human animal literature regarding the effects of OT is on the whole much more robust and definitive than the human literature. The evidence is abundant that OT motivates the defensive care of offspring of all kinds. My read of the present OT findings is that they increase participants' willingness to refrain from shocking strangers even when incurring a loss (that is, in a context where the participant is weighing harm to themselves versus harm to the other). It will be important to explain why OT would be relevant to refraining from instrumental aggression, again, drawing on the non-human animal literature.

      Another important limitation is the use of only male participants in Study 2. This was not an essential exclusion. It should be clear throughout sections of the manuscript that this study's effects can be generalized only to male participants.

    1. eLife Assessment

      This valuable contribution combines high-resolution histology with magnetic resonance imaging in a novel way to study the organisation of the human amygdala. The main findings convincingly show the axes of microstructural organisation within the amygdala and how they map onto the functional organisation. Overall, the approach taken in this paper showcases the utility of combining multiple modalities at different spatial scales to help understand brain organisation.

    2. Reviewer #1 (Public review):

      The paper by Auer et al. makes several contributions:

      (1) The study developed a novel approach to map the microstructural organization of the human amygdala by applying radiomics and dimensionality reduction techniques to high-resolution histological data from the BigBrain dataset.

      (2) The method identified two main axes of microstructural variation in the amygdala, which could be translated to in vivo 7 Tesla MRI data in individual subjects.

      (3) Functional connectivity analysis using resting-state fMRI suggests that microstructurally defined amygdala subregions had distinct patterns of functional connectivity to cortical networks, particularly the limbic, frontoparietal, and default mode networks.

      (4) Meta-analytic decoding was used to suggest that the superior amygdala subregion's connectivity is associated with autobiographical memory, while the inferior subregion was linked to emotional face processing.

      (5) Overall, the data-driven, multimodal approach provides an account of amygdala microstructure and possibly function that can be applied at the individual subject level, potentially advancing research on amygdala organization.

      Although these are meritorious contributions there are some concerns that I will summarize below.

      (1) The paper makes little-to-no contact with the monkey literature regarding the anatomy of amygdala subregions, their functionality, and their patterns of anatomical connectivity. This is surprising because such literature on non-human primates is a very important starting point for understanding the human amygdala. I recommend taking a careful look at the work by Helen Barbas, among others. There are too many papers to cite but a notable example is: Ghashghaei, H. T., Hilgetag, C. C., & Barbas, H. (2007). Sequence of information processing for emotions based on the anatomic dialogue between prefrontal cortex and amygdala. Neuroimage, 34(3), 905-923. The work of Amaral is also highly relevant. Furthermore, the authors subscribe to a model with LB, CM, and SF sectors. How does the SF sector relate to monkey anatomy?

      (2) The authors use meta-analytical decoding via NeuroSynth. If the authors like those results of course they should keep them but the quality of coordinate reporting in the literature is insufficient to conclude much in the context of amygdala subregion function in my opinion. I believe the results reported are at most "somewhat suggestive".

      (3) Another significant concern has to do with the results in Figure 3. The red and yellow clusters identified are quite distinct but the differences in functional connectivity are very modest. Figure 3C reveals very similar functional connectivity with the networks investigated. This is very surprising, and the authors should include a careful comparison with related findings in the literature. Overall, there is limited comparison between the observed results and those obtained via other methods. On a more pessimistic note, the results of Figure 3 seem to question the validity of the general approach.

      (4) Some statements in the Discussion feel unwarranted. For example, "significant dissociation in functional connectivity to prefrontal structures that support self-referential, reward-related, and socio-affective processes." This feels way beyond what can be stated based on the analyses performed.

    3. Reviewer #2 (Public review):

      Summary:

      This study bridges a micro- to macroscale understanding of the organization of the amygdala. First, using a data-driven approach, the authors identify structural clusters in the human amygdala from high-resolution post-mortem histological data. Next, they use multimodal imaging data to identify structural subunits of the amygdala and the functional networks in which they are involved. This approach is exciting because it permits the identification of both structural amygdalar subunits, and their functional implications, in individual subjects. There are, however, some differences in the macro and microscale levels of organization that should be addressed.

      Strengths:

      The use of data-driven parcellation on a structure that is important for human emotion and cognition, and the combination of this with high-resolution individual imaging-based parcellation, is a powerful and exciting approach, addressing both the need for a template-level understanding of organization as well as a parcellation that is valid for individuals. The functional decoding of rsfMRI permits valuable insight into the functional role of structural subunits. Overall, the combination of micro to macro, structure, and function, and general organization to individual relevance is an impressive holistic approach to brain mapping.

      Weaknesses:

      (1) UMAP 1, as calculated from the histological data, appears to correlate well across individuals, and decently with the MRI data, although the medial-lateral coordinate axis is an outlier. UMAP 2, on the other hand, does not appear to correlate well with imaging data or across individuals. This does pose a problem with the claim that this paper bridges micro- and macroscale parcellations. One might certainly expect, however, that different levels of organization might parcellate differently, but the authors should address this in the discussion and offer ways forward.

      (2) It would be interesting to see functional decoding for the right amygdala. This could be included in the supplementary material. A discussion of differences in the results in the two hemispheres could be illuminating.

      (3) The authors acknowledge that this mapping matches some but not all subunits that have been previously described in the amygdala. It would be helpful to neuroanatomists if the authors could discuss these differences in more detail in the discussion, to identify how this mapping differs and what the implications of this are.

      (4) The acronym UMAP is not explained. A brief explanation and description would be useful to the reader.

    1. eLife Assessment

      The focus of this study is the development of a compelling method for analyzing network communication in the brain through an exhaustive computational analysis of virtual lesions. Using human neuroimaging data, the authors identified brain regions that exert the greatest influence over others. These important results revealed the characteristic connectivity profile of such brain regions and provided a network analysis method that will find applicability beyond the datasets used.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Fakhar et al. use a game-theoretical framework to model interregional communication in the brain. They perform virtual lesioning using MSA to obtain a representation of the influence each node exerts on every other node, and then compare the optimal influence profiles of nodes across different communication models. Their results indicate that cortical regions within the brain's "rich club" are most influential.
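
      For readers unfamiliar with the lesioning logic, the following toy sketch shows the general idea of a Shapley-value analysis of virtual lesions. It assumes that MSA here refers to multi-perturbation Shapley value analysis; the three-node "network", the `toy_output` function, and the exhaustive enumeration of orderings are hypothetical and are not the authors' implementation, which estimates the influence of every node on every other node in full connectomes.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    players : list of hashable node labels
    value   : function mapping a frozenset of intact nodes to a scalar output
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        intact = frozenset()
        for p in order:
            with_p = intact | {p}
            # Marginal contribution of p given the nodes already intact.
            totals[p] += value(with_p) - value(intact)
            intact = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_output(intact):
    """Hypothetical network output: nothing without hub A; B and C add to A."""
    if "A" not in intact:
        return 0.0
    return 1.0 + ("B" in intact) + ("C" in intact)


phi = shapley_values(["A", "B", "C"], toy_output)
```

      In this toy game the hub receives the largest share of the output and the shares sum to the output of the intact network. Exhaustive enumeration grows factorially with the number of nodes, which is one reason such analyses are computationally demanding and usually rely on sampling orderings instead.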

      Strengths:

      Overall, the manuscript is well-written. Illustrative examples help to give the reader intuition for the approach and its implementation in this context. The analyses appear to be rigorously performed and appropriate null models are included.

      Weaknesses:

      The use of game theory to model brain dynamics relies on the assumption that brain regions are similar to agents optimizing their influence, and implies competition between regions. The model can be neatly formalized, but is there biological evidence that the brain optimizes signaling in this way? This could be explored further. Specifically, it would be beneficial if the authors could clarify what the agents (brain regions) are optimizing for at the level of neurobiology - is there evidence for a relationship between regional influence and metabolic demands? Identifying a neurobiological correlate at the same scale at which the authors are modeling neural dynamics would be most compelling.

      It is not entirely clear what Figure 6 is meant to contribute to the paper's main findings on communication. The transition to describing this Figure in line 317 is rather abrupt. The authors could more explicitly link these results to earlier analyses to make the rationale for this figure clearer. What motivated the authors' investigation into the persistence of the signal influence across steps?

      The authors used resting-state fMRI data to generate functional connectivity matrices, which they used to inform their model of neural dynamics. If I understand correctly, their functional connectivity matrices represent correlations in neural activity across an entire fMRI scan computed for each individual and then averaged across individuals. This approach seems limited in its ability to capture neural dynamics across time. Modeling time series data or using a sliding window FC approach to capture changes across time might make more sense as a means of informing neural dynamics.
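
      The contrast between static and sliding-window functional connectivity raised here can be illustrated with a small sketch. The two signals below are synthetic and hypothetical (coupled in the first half, anti-coupled in the second), and the window width and step are arbitrary choices made only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_window_fc(x, y, width, step=1):
    """Correlation computed in successive windows of the given width."""
    return [pearson(x[i:i + width], y[i:i + width])
            for i in range(0, len(x) - width + 1, step)]


# Hypothetical regional time series: y follows x, then flips sign halfway.
x = [math.sin(0.5 * t) for t in range(40)]
y = x[:20] + [-v for v in x[20:]]

static_fc = pearson(x, y)  # one value for the whole scan
dynamic_fc = sliding_window_fc(x, y, width=10, step=10)  # one value per window
```

      Here the whole-scan correlation is weak because the two regimes cancel, whereas the windowed estimates recover the strong positive coupling early on and the strong negative coupling later. This is the kind of temporal structure that a single scan-wide, group-averaged correlation matrix cannot capture.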

      The authors evaluated their model using three different structural connectomes: one inferred from diffusion spectrum imaging in humans, one inferred from anterograde tract tracing in mice, and one inferred from retrograde tract-tracing in macaque. While the human connectome is presumably an undirected network, the mouse and macaque connectomes are directed. What bearing does experimentally inferred knowledge of directionality have on the derivation of optimal influence and its interpretation?

      It would be useful if the authors could assess the performance of the model for other datasets. Does the model reflect changes during task engagement or in disease states in which relative nodal influence would be expected to change? The model assumes optimality, but this assumption might be violated in disease states.

      The MSA approach is highly computationally intensive, which the authors touch on in the Discussion section. Would it be feasible to extend this approach to task or disease conditions, which might necessitate modeling multiple states or time points, or could adaptations be made that would make this possible?

    3. Reviewer #2 (Public review):

      Summary:

      The authors provide a compelling method for characterizing communication within brain networks. The study engages important, biologically pertinent concerns related to the balance of dynamics and structure in assessing the focal points of brain communication. The methods are clear and seem broadly applicable; however, further clarity on this front is required.

      Strengths:

      The study is well-developed, providing an overall clear exposition of relevant methods, as well as in-depth validation of the key network structural and dynamical assumptions. The questions and concerns raised in reading the text were always answered in time, with straightforward figures and supplemental materials.

      Weaknesses:

      The narrative structure of the work at times conflicts with the interpretability. Specifically, in the current draft, the model details are discussed and validated in succession, leading to confusion. Introducing a "base model" and "core datasets" needed for this type of analysis would greatly benefit the interpretability of the manuscript, as well as its impact.

    1. eLife Assessment

      This valuable study combined whole-head magnetoencephalography (MEG) and subthalamic (STN) local field potential (LFP) recordings in patients with Parkinson's disease undergoing deep brain stimulation surgery. The paper provides solid evidence that cortical and STN beta oscillations are sensitive to movement context and may play a role in the coordination of movement redirection.

    2. Reviewer #1 (Public review):

      Summary:

      Winkler et al. present brain activity patterns related to complex motor behaviour by combining whole-head magnetoencephalography (MEG) with subthalamic local field potential (LFP) recordings from people with Parkinson's disease. The motor task involved repetitive circular movements with stops or reversals associated with either predictable or unpredictable cues. Beta and gamma frequency oscillations are described, and the authors found complex interactions between recording sites and task conditions. For example, they observed stronger modulation of connectivity in unpredictable conditions. Moreover, STN power varied across patients during reversals, which differed from stopping movements. The authors conclude that cortex-STN beta modulation is sensitive to movement context, with potential relevance for movement redirection.

      Strengths:

      This study employs a unique methodology, leveraging the rare opportunity to simultaneously record both invasive and non-invasive brain activity to explore oscillatory networks.

      Weaknesses:

      It is difficult to interpret the role of the STN in the context of reversals because no consistent activity pattern emerged.

    3. Reviewer #2 (Public review):

      Summary:

      This study examines the role of beta oscillations in motor control, particularly during rapid changes in movement direction among patients with Parkinson's disease. The researchers utilized magnetoencephalography (MEG) and local field potential (LFP) recordings from the subthalamic nucleus to investigate variations in beta band activity within the cortex and STN during the initiation, cessation, and reversal of movements, as well as the impact of external cue predictability on these dynamics. The primary finding indicates that beta oscillations more effectively signify the start and end of motor sequences than transitions within those sequences. The article is well-written, clear, and concise.

      Strengths:

      The use of a continuous motion paradigm with rapid reversals extends the understanding of beta oscillations in motor control beyond simple tasks. It offers a comprehensive perspective on subthalamo-cortical interactions by combining MEG and LFP.

      Weaknesses:

      (1) The small and clinically diverse sample size may limit the robustness and generalizability of the findings. Additionally, the limited exploration of causal mechanisms reduces the depth of its conclusions and focusing solely on Parkinson's disease patients might restrict the applicability of the results to broader populations.

      (2) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.

      (3) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.

      (4) Although the study identifies a relationship between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.

      (5) The study cohort focused on senior adults, who may exhibit age-related changes in cortical responses and neural mechanisms during movement planning. These aspects were not discussed in the study.

      (6) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial, because we cannot otherwise determine whether the observed findings are specific to PD or can be generalized. Additionally, the current title and the article, which are oriented toward understanding human motor control, may not be appropriate.
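
      As an illustration of the standardized statistics suggested in point (3), the sketch below computes two common effect sizes: partial eta squared recovered from a reported F statistic and its degrees of freedom, and a paired-samples Cohen's d (the d_z variant, the mean difference divided by the standard deviation of the differences). The function names are hypothetical, and which measure suits the authors' ANOVA design is an assumption to be checked against their actual analysis.

```python
from statistics import mean, stdev


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_paired(cond_a, cond_b):
    """Paired-samples Cohen's d (d_z) for two within-subject conditions."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    return mean(diffs) / stdev(diffs)
```

      Reporting such values alongside the test statistics would let readers compare the magnitude of effects across conditions and against other studies, independently of sample size.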

    4. Reviewer #3 (Public review):

      Summary:

      The study highlights how the initiation, reversal, and cessation of movements are linked to changes in beta synchronization within the basal ganglia-cortex loops. It was observed that different movement phases, such as starting, stopping briefly, and stopping completely, affect beta oscillations in the motor system.

      It was found that unpredictable cues lead to stronger changes in STN-cortex beta coherence. Additionally, specific patterns of beta and gamma oscillations related to different movement actions and contexts were observed. Stopping movements was associated with a lack of the expected beta rebound during brief pauses within a movement sequence.

      Overall, the results underline the complex and context-dependent nature of motor-control and emphasize the role of beta oscillations in managing movement according to changing external cues.

      Strengths:

      The paper is very well written, clear, and appears methodologically sound.

      The use of continuous movement (turning) with reversals is more naturalistic than many previous button push paradigms.

      Weaknesses:

      The generalizability of the findings is somewhat curtailed by the fact that this was performed peri-operatively during the period of the microlesion effect. Given the availability of sensing-enabled DBS devices now and HD-EEG, does MEG offer a significant enough gain in spatial localizability to offset the fact that it has to be done shortly postoperatively with externalized leads, with an attendant stun effect? Specifically, for paradigms that are not asking very spatially localized questions as a primary hypothesis?

      Further investigation of the gamma signal seems warranted, even though it has a slightly lower proportional change in amplitude than beta. Given that the changes in gamma here are relatively wide band, this could represent a marker of neural firing that could be interestingly contrasted against the rhythm account presented.

    1. eLife Assessment

      The central claim in this valuable manuscript is that microglia in the PVH sculpt the density of AgRP inputs to the PVH in a spatially restricted manner. The anatomical results are solid but the analysis of how microglia activity affects body weight when lactating dams are fed a high-fat diet is incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Mendoza-Romero et al. investigate the effects of maternal high-fat diet (MHFD) on microglia and AgRP synaptic terminals in the hypothalamus of postnatal mice during lactation. The study employs 3D microglial morphology reconstruction and genetically targeted axonal labeling, offering a detailed examination of microglial changes and their implications for AgRP terminal density and body weight regulation, focusing on the PVN and ARC nuclei. The authors also use pharmacological (e.g., PLX5622) elimination of microglia to test the sufficiency of microglia to shape PVN AgRP+ synapses.

      Strengths:

      This is a well-written paper with a thorough introduction and discussion.

      The impact of microglia on hypothalamic synaptic pruning is poorly characterized, so the findings herein are especially interesting.

      Weaknesses:

      (1) A cartoon paradigm of the HFD treatment window would be a helpful addition to Figure 1. Relatedly, the authors might consider qualifying MHFD as 'lactational MHFD.' Readers might miss the fact that the exposure window starts at birth.

      (2) More details on the modeling pipeline are needed either in Figure 1 or text. Of the ~50 microglia that were counted (based on Figure 1J), were all 50 quantified for the morphological assessments? Were equal numbers used for the control and MHFD groups? Were the 3D models adjusted manually for accuracy? How much background was detected by IMARIS that was discarded? Was the user blind to the treatment group while using the pipeline? Were the microglia clustered or equally spread across the PVN?

      (3) Suggest toning back some of the language. For example: "...consistent with enhanced activity and surveillance of their immediate microenvironment" (Line 195) could be "...perhaps consistent with...". Likewise, "profound" (Lines 194, 377) might be an overstatement.

      (4) Representative images for AgRP+ cells (quantified in Figure 2J) are missing. Why not a co-label of Iba1+/AgRP+ as per Figure 1, 3? Also, what was quantified in Figure 2J - soma? Total immunoreactivity?

      (5) For the PLX experiment:

      a) "...we depleted microglia during the lactation period" (Line 234). This statement suggests microglia decreased from the first injection at P4 and throughout lactation, which is inaccurate. PLX5622 effects take time, upwards of a week. Thus, if PLX5622 injections started at P4, it could be P11 before the decrease in microglia numbers is stable. Moreover, by the time microglia are entirely knocked down, the pups might be supplementing milk with some chow, making it unclear how much PLX5622 they were receiving from the dam, which could also impact the rate at which microglia repopulation commences in the developing brain. Quantifying microglia across the P4-P21 treatment window would be helpful, especially at P16, since this is when the PVN AgRP microglia phenotypes were demonstrated and roughly when pups might start eating some chow.

      b) I am surprised that ~70% of the microglia are present at P21. Does this number reflect that microglia are returning as the pups no longer receive PLX5622 from milk from the dam? Does it reflect the poor elimination of microglia in the first place?

      (6) Was microglia morphology examined for all microglia across the PVN? Is it possible that a focus on PVNmpd microglia would reveal a stronger phenotype? In Figure 4H, J, AgRP+ terminals are counted in PVN subregions - PVNmpd and PVNpml, with PVNmpd showing a decrease of ~300 AgRP+ terminals in MHFD/Veh (rescued in MHFD/PLX5622). In Figure 1K, AgRP+ terminals across what appears to be the entire PVN decrease by ~300, suggesting that PVNmpd is driving this phenotype. If true, then do microglia within the PVNmpd display this morphology phenotype?

      (7) What chow did the pups receive as they started to consume solid food? Is this only a MHFD challenge, or could the pups be consuming HFD chow that fell into the cage?

      (8) Figure 5: Does internalized AgRP+ co-localize with CD68+ lysosomes? How was 'internalized' determined?

      (9) Different sample sizes are used across experiments (e.g., Figure 4 NCD n=5, MHFD n=4). Does this impact statistical significance?

    3. Reviewer #2 (Public review):

      Summary:

      Microglia sense stressors and other environmental factors during the postnatal period in rodents and can sculpt developing circuits by promoting or pruning synaptic connections, depending on the brain region and context. Here, the authors examine the contributions of microglia to the effects of maternal high-fat diet during lactation (MHFD) to reduce the formation of projections from AgRP neurons in the ARH to the PVH, a critical node in circuits regulating energy balance. Using detailed histomorphometric analyses of Iba-1+ cells in 3 hypothalamic nuclei (ARH, PVH, and BNST) at two time points (P16 and P30), the authors show that microglial volume and complexity increase while cell numbers decrease across this period. Exposure to MHFD is associated with an increase in the complexity/volume of microglia at P16 in the PVH but not in the other brain regions or time points assessed. The authors cite this as evidence of "spatial-specific" effects. They also demonstrate that reducing the number of microglia using a pharmacological approach (injection of the CSF1R inhibitor from P4-P21) in pups exposed to MHFD enhances AgRP outgrowth to the PVH and reduces body weight at weaning, effectively reversing the effects of MHFD. The central claim in the manuscript is that microglia in the PVH "sculpt the density of AgRP inputs to the PVH" in a spatially restricted manner.

      Strengths:

      (1) Detailed 3-D reconstructions of Iba-1 staining in microglia are used to perform unbiased and comprehensive analyses of microglial complexity and to quantify the spatial relationship between microglial processes and AgRP terminals.

      (2) The rationale for exploring whether the effects of maternal HFD on the formation of AgRP projections to the PVH are mediated via changes in microglia is supported by the literature. For example, microglial development in the postnatal hippocampus and cortex is sensitive to maternal factors, such as inflammation, with lasting effects on circuit formation and function.

      (3) Here the authors explored whether changes in microglia contribute to the effects of maternal HFD feeding during lactation on the formation of AgRP to PVH circuits that are important for the regulation of food intake and energy expenditure.

      Weaknesses:

      (1) Under chow-fed conditions, there is a decrease in the number of microglia in the PVH and ARH between P16 and P30, accompanied by an increase in complexity/volume. With the exception of PVH microglia at P16, this maturation process is not affected by MHFD. This "transient" increase in microglial complexity could also reflect premature maturation of the circuit.

      (2) The key experiment in this paper, the ablation of microglia, was presumably designed to prevent microglial expansion/activation in the PVH of MHFD pups. However, it also likely accelerates and exaggerates the decrease in cell number during normal development regardless of maternal diet. Efforts to interpret these findings are further complicated because microglial and AgRP neuronal phenotypes were not assessed at earlier time points when the circuit is most sensitive to maternal influences.

      (3) Microglial loss was induced broadly in the forebrain. Enhanced AgRP outgrowth to the PVH could be caused by actions elsewhere, such as direct effects on AgRP neurons in the ARH or secondary effects of changes in growth rates.

      (4) Prior publications from the authors and other groups support the idea that the density of AgRP projections to the PVH is primarily driven by factors regulating outgrowth and not pruning. The failure to observe increased engulfment of AgRP fibers by PVH microglia is, therefore, not surprising. The possibility that synaptic connectivity is modulated by microglia was not explored.

    4. Reviewer #3 (Public review):

      Summary:

      The authors interrogated the putative role of microglia in determining AgRP fiber maturation in offspring exposed to a maternal high-fat diet. They found that changes in specific parts of the hypothalamus (but not in others) occur in microglia and that the effect of microglia on AgRP fibers appears to be beyond synaptic pruning, a classical function of these brain-resident macrophages.

      Strengths:

      The work is very strong in neuroanatomy. The images are clear and nicely convey the anatomical differences. The microglia depletion study adds functional relevance to the paper; however, the pitfalls of the technology regarding functional relevance should be discussed.

      Weaknesses:

      There was no attempt to interrogate microglia in different parts of the hypothalamus functionally. Morphology alone does not reflect a potential for significant signaling alterations that may occur within and between these and other cell types.

      The authors should discuss the limitations of their approach and findings and propose future directions to address them.

    1. eLife Assessment

      This study provides useful information on the potential role of ErbB4 expression in parvalbumin-positive cells on olfactory behaviour and circuit dynamics in the olfactory bulb. The question is timely and novel, and findings could shed light on the critical role that ErbB4 may play in modulating olfactory bulb cell function and olfactory perception. Although the authors use a comprehensive set of experiments for their analysis, the evidence is incomplete as many of the experiments are underpowered and the model for selective knockout of ErbB4 in olfactory parvalbumin cells is not validated.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Hu et al. investigated the role of olfactory ErbB4 in regulating olfactory information processing. The authors demonstrated that ErbB4 deletion impairs odor discrimination, sensitivity, habituation, and dishabituation by using an impressive combination of techniques from morphological to electrophysiology (both slice and in vivo) and from viral injection to cell-type-specific mutation to behavioral analysis. The findings underscore the crucial role of ErbB4 in olfactory PV neurons in modulating mitral cell function and odor perception.

      Strengths:

      This study contains a pretty comprehensive set of experiments.

      Major concerns:

      (1) Line 151 page 7, "PV-Erbb4+/+ mice (generated by crossing PV-Cre mice (Wen et al., 2010) with loxP flanked Erbb4 mice". Does this mean mice carrying PV-Cre and ErbB4 floxed allele? Or with the WT allele? This is confusing. Figures 2B and 2C, ErbB4 expression was evident in many cells that were not positive for PV. What are the identities of those cells? Are they important?

      (2) In Figure 4, the authors performed tetrode recordings in awake head-fixed animals. Although individual neuron spikes could be obtained by spike-sorting, this is not a "single-unit" experiment due to the nature of this approach.

      What is the odor used in Figure 4? How did the authors clean up the odor to limit the stimulation within 2 seconds? In what layer were the tetrodes placed? What is the putative cell type presented in Figure 4C? If Figure 4C is a representative neuron recorded, the odor-induced suppression of spike activity seems to be impaired in PV-ErbB4-/- animals. However, Figure 4D shows that suppressed neurons were similar between the two types of animals. Such comparisons among individual mice are difficult for in vivo electrophysiological experiments because the recorded cell type and placement of electrodes would be different. The authors should apply ErbB4 inhibitors to the same animals and compare the effects before and after. This would ensure the recording of the same population of neurons.

      (3) At a glance in the heatmap in Figure 4D, excited neurons were reduced in PV-ErbB4-/- mice, but not inhibited neurons. This was different from Figure 4L. The authors need to have a criterion or threshold to show how they categorized each population.

      (4) Figure 4D, 4F and 4J seemed to be inconsistent. In Figure 4D before odor, there was no clear increase in the spontaneous activity in PV-ErbB4-/- mice; in Figure 4F-4G and 4J-4K, clearly, there was a high spontaneous activity in PV-ErbB4-/- mice.

      (5) What are the neurons recorded in Figure 6E-6F? If they were MCs, loss of ErbB4 in PV neurons should not alter their intrinsic electrical properties. Rather GABAergic inputs could be altered. Indeed, the authors presented a reduction of GABAergic inputs from PV neurons to MCs.

      (6) Figure 8E-8H, a better experiment would be specifically expressing ErbB4 in PV neurons. In Figure 8F and Figure 8I, was it the excitability after the current injection? Why not perform the spontaneous activity recording?

    3. Reviewer #2 (Public review):

      Summary:

      Hu et al investigate the role of PV neurons and their expression of Erbb4 in olfactory performance through a series of behavioral tests, selective knockout experiments, and in vivo and in vitro electrophysiology. Knockout of Erbb4, either in PV cells or the whole OB, resulted in impairment of discriminating complex odors. The authors present data that inhibition is impaired in MCs, which is likely underlying the abnormal odor-evoked responses of MCs in vivo and the impaired behavioral responses.

      Strengths:

      Overall, a key strength of this manuscript is the breadth of experiments to test the role of PV Erbb4 expression on circuit dynamics and behavior. The behavioral experiments were clear and sufficiently powered.

      Weaknesses:

      The major drawback of this manuscript is the lack of depth and rigor in experiments. Some experiments are preliminary, underpowered, and not quantified. As a result, many conclusions of the manuscript are weakly supported in its current form and would require significant revisions to address these shortcomings. Major weaknesses that should be addressed are as follows:

      AAV-PV-Cre-GFP is not described or validated. Is this the S5E2 enhancer or something else? What is the specificity and efficacy of this approach in selectively knocking out Erbb4 in PV neurons? Reduced Erbb4 expression in the entire OB with PCR does not validate the selectivity of this approach. At a titer of 10^12, it is unlikely to be specific. Even a small amount of off-target Cre expression will knock out the gene in non-PV cells, so the authors should show whether the gene is knocked out at the single cell level from PV and non-PV cells. Without validation of this approach, this experiment is no different than the AAV-Cre-GFP experiments.

      Figure 1D - three mice per group is insufficient. There is no control group error (the same as Figure 9). Why is it a paired t-test when there is a control group? The authors should be comparing go/go vs. go/no-go. The methods for normalization are unclear and are likely to hide the fact that n=3 is insufficient to capture a difference without extra measures to normalize the data.

      The analysis of LFP is limited. During what period was this quantified? Are there any differences in task-related LFP changes? Also related to in vivo electrophysiology, the authors should show examples of isolated units, including their waveforms and how units were clustered and assigned to M/TCs.

      The authors use 80pA and 100pA to elicit equivalent AP spiking in MCs to determine if recurrent inhibition differs, but do not actually show that AP spiking is the same across groups. This should be quantified.

      There seems to be a prominent increase in the firing of MCs in PV-Erbb4+/+ mice before odor presentation, but not in PV-Erbb4-/- mice. What is the significance of this?

      There is a disconnect between the in vivo firing rates of MCs and ex vivo firing rates. In slice, the authors note that the spontaneous activity of MCs is elevated in the KO, but this is not observed in vivo, where conditions are physiological. Therefore, it is unclear whether the concept of signal-to-noise changes in slice (higher spontaneous, lower evoked) indeed translates to something in vivo. It would be important to know what the PV cells are doing in vivo. Perhaps they have low firing rates prior to odor onset, which may explain the lack of observed difference in baseline FRs in MCs. The authors should have this data in their tetrode recordings, which would offer insight into when inhibition is recruited.

      Since PV neurons are required for gamma oscillations, why is it that KOs have higher gamma oscillations? Is it indeed the case that PV cells have a hypofunctional phenotype in this model? Again, recording from PV cells in vivo would help make sense of this.

      A clearer picture of how PV cell inhibition changes with Erbb4 KO would be achieved with optogenetically evoked IPSPs, rather than changes in mini frequency.

    4. Reviewer #3 (Public review):

      Summary:

      The authors investigate the role of ErbB4 in parvalbumin (PV) interneurons within the olfactory bulb (OB) and its regulation of odor discrimination behavior in mice. They demonstrate that odor discrimination increases ErbB4 kinase activity and that the loss of ErbB4 in the OB impairs the dishabituation of odor response and discrimination of complex odors. The study also characterizes the expression of ErbB4 in the OB, showing it is enriched in PV neurons. Furthermore, the authors utilize a mouse model in which ErbB4 is knocked out in PV neurons and perform a variety of behavioral, electrophysiological, and local field potential (LFP) recording experiments to characterize alterations in olfactory bulb activity. They then use a model in which ErbB4 is specifically knocked out in PV neurons in the OB and show that this manipulation disrupts odor-related behaviors in mice.

      Strengths:

      The study's strengths lie in its use of a diverse range of techniques, including RNAscope, IHC, and Western blotting, to assess the presence of ErbB4 in PV neurons within the OB. Additionally, the authors employ various behavioral tests to evaluate the effects of ErbB4 manipulation in different mouse models, alongside comprehensive electrophysiological experiments and LFP recordings to examine the impact of these manipulations on OB physiology.

      Weaknesses:

      While the data presented in this paper are interesting, several major concerns reduce my enthusiasm for this study, as outlined below:

      (1) In reviewing Figure 1C/D, there are several concerns regarding the clarity and interpretation of the data:

      a) While the Western blot for ErbB4 in other figures (Figure 1F, 2I) of the manuscript shows a clear single band, the blot presented in Figure 1C (for both p-ErbB4 and total ErbB4) shows multiple bands, which is unexpected. This discrepancy raises concerns about the consistency of the results.

      b) The data presented in Figure 1D uses only 3 mice per group, and the reported p-value of 0.0492, while technically significant, is very close to the threshold. This raises concerns about the robustness of the finding, especially given the small sample size. Additionally, the p-ErbB4 band intensity in the Go/No-Go condition in Figure 1C does not appear to show a clear increase over the Go/Go condition, which is not congruent with the bar graph in Figure 1D showing a 50% increase in p-ErbB4/ErbB4 levels.

      c) It is a standard practice in many journals to include full, uncropped Western blot images as supplementary material. This transparency helps ensure that no bands are selectively shown or omitted and increases confidence in the presented data.

      (2) In Figure 2, the authors used the anti-ErbB4 antibody sc-283 from Santa Cruz to assess the expression of ErbB4 in PV neurons and the absence of its expression in PV-ErbB4 knock-out mice. However, this particular antibody has been shown to produce non-specific bands in Western blotting and also generate non-specific labeling in IHC. This non-specificity has been demonstrated in Vullhorst et al. (2009, J Neurosci), raising significant concerns about the reliability of the data generated using this antibody.

      (3) In reviewing the statistical analysis for the series of odor discrimination tests, there could be a potential issue with the clarity of the significance testing. Although the figure legend reports the F and p values from the two-way ANOVA, it is unclear whether these values represent the main effects or the results of a post hoc test. Additionally, it is not clear whether the asterisk in the figures reflects significance from a post hoc test or from the overall ANOVA. The methods section does not explicitly state whether a post hoc test was performed to assess differences between the knockout and control groups. Given that the tests were conducted across multiple days or conditions, a post hoc test that can adjust for multiple comparisons would be necessary to accurately identify where specific differences between the groups exist.

      (4) Throughout the manuscript, the authors use different mouse models, including ErbB4 knockout specifically in the OB (AAV-Cre-GFP), ErbB4 knockout in PV interneurons throughout the brain (PV-ErbB4-/-), and ErbB4 knockout in PV interneurons within the OB (AAV-PV-Cre-GFP). For Figures 4 and 5, the authors use the PV-ErbB4-/- model to examine odor-evoked activity and neural oscillations within the OB. Since the knockout affects PV interneurons across the entire brain, it is difficult to disentangle whether the observed changes in the OB are due to local effects or broader network alterations elsewhere in the brain.

      (5) While the electrophysiological experiments shown in Figures 6-8 provide valuable insights into the reduced inhibition to MCs in PV-ErbB4 knockout mice, it appears that the authors did not record from PV interneurons themselves. Since PV interneurons are central to the proposed mechanism, directly recording them would provide critical information on how the ErbB4 knockout affects their intrinsic properties, synaptic inputs, and firing behavior. Without these direct recordings, the conclusions about the specific role of PV neurons in regulating MC activity remain somewhat indirect. Prior studies have established that knockout of ErbB4 in PV interneurons reduces mEPSC frequency in PV neurons (Del Pino et al., 2013).

      (6) In Figure 9, the authors knock out ErbB4 in PV neurons in the OB with AAV-PV-Cre-GFP and show with western blotting that ErbB4 expression is reduced in the mouse injected with AAV-PV-Cre-GFP. However, it is not clear whether ErbB4 was selectively knocked out in PV neurons without the quantification from IHC assays.
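
      To make the multiple-comparison concern in point (3) concrete, the following is a minimal sketch of one common adjustment, the Holm step-down procedure, applied to a set of per-day post hoc p-values. The p-values are hypothetical and do not come from the manuscript, and the function name `holm_adjust` is an illustrative assumption; the authors may have used a different post hoc test.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four per-day group comparisons.
raw = [0.010, 0.040, 0.030, 0.200]
for day, (p, p_adj) in enumerate(zip(raw, holm_adjust(raw)), start=1):
    print(f"day {day}: raw p = {p:.3f}, Holm-adjusted p = {p_adj:.3f}")
```

      With these made-up inputs, two comparisons that look significant at 0.05 before adjustment (raw p = 0.030 and 0.040) rise to about 0.09 after adjustment, which is why the legend should state which test the asterisks refer to.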

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this study from Zhu and colleagues, a clear role for MED26 in mouse and human erythropoiesis is demonstrated that is also mapped to amino acids 88-480 of the human protein. The authors also show the unique expression of MED26 in later-stage erythropoiesis and propose transcriptional pausing and condensate formation mechanisms for MED26's role in promoting erythropoiesis. Despite the authors' introductory claim that many questions regarding Pol II pausing in mammalian development remain unanswered, the importance of transcriptional pausing in erythropoiesis has actually already been demonstrated (Martell-Smart, et al. 2023, PMID: 37586368, which the authors notably did not cite in this manuscript). Here, the novelty and strength of this study is MED26 and its unique expression kinetics during erythroid development.

      Strengths:

      The widespread characterization of kinetics of mediator complex component expression throughout the erythropoietic timeline is excellent and shows the interesting divergence of MED26 expression pattern from many other mediator complex components. The genetic evidence in conditional knockout mice for erythropoiesis requiring MED26 is outstanding. These are completely new models from the investigators and are an impressive amount of work to have both EpoR-driven deletion and inducible deletion. The effect on red cell number is strong in both. The genetic over-expression experiments are also quite impressive, especially the investigators' structure-function mapping in primary cells. Overall the data is quite convincing regarding the genetic requirement for MED26. The authors should be commended for demonstrating this in multiple rigorous ways.

      Thank you for your positive feedback.

      Weaknesses:

      (1) The authors state that MED26 was nominated for study based on RNA-seq analysis of a prior published dataset. They do not however display any of that RNA-seq analysis with regards to Mediator complex subunits. While they do a good job showing protein-level analysis during erythropoiesis for several subunits, the RNA-seq analysis would allow them to show the developmental expression dynamics of all subunit members.

      Thank you for this helpful suggestion. While we did not originally nominate MED26 based on RNA-seq analysis, we have analyzed the transcript levels of Mediator complex subunits in our RNA-seq data across different stages of erythroid differentiation (Author response image 1). The results indicate that most Mediator subunits, including MED26, display decreased RNA expression over the course of differentiation, with the exception of MED25, as reported previously (Pope et al., Mol Cell Biol 2013. PMID: 23459945).

      Notably, our study is based on initial observations at the protein level, where we found that, unlike most other Mediator subunits that are downregulated during erythropoiesis, MED26 remains relatively abundant. Protein expression levels more directly reflect the combined influences of transcription, translation and degradation processes within cells, and are likely more closely related to biological functions in this context. It is possible that post-transcriptional regulation (such as m6A-mediated improvement of translational efficiency) or post-translational modifications (like escape from ubiquitination) could contribute to the sustained levels of MED26 protein, and this will be an interesting direction for future investigation.

      Author response image 1.

      Relative RNA expression of Mediator complex subunits during erythropoiesis in human CD34+ erythroid cultures. Different differentiation stages from HSPCs to late erythroblasts were identified using CD71 and CD235a markers, progressing sequentially as CD71-CD235a-, CD71+CD235a-, CD71+CD235a+, and CD71-CD235a+. Expression levels were presented as TPM (transcripts per million).
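
      For readers unfamiliar with the unit used in this caption, TPM is conventionally obtained by dividing each gene's read count by its length and rescaling so that the values sum to one million. The sketch below illustrates that standard definition only; the counts, gene lengths, and the function name `tpm` are hypothetical, and it is not a description of the authors' actual quantification pipeline.

```python
def tpm(counts, lengths_kb):
    """Convert read counts to transcripts per million (TPM).

    counts      -- reads assigned to each gene
    lengths_kb  -- effective gene length in kilobases, same order as counts
    """
    # Length-normalize first: reads per kilobase for each gene.
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all length-normalized rates are zero")
    # Rescale so the values across all genes sum to one million.
    return [r / total * 1e6 for r in rates]


# Hypothetical counts and lengths for three genes in one sample.
example = tpm(counts=[100, 200, 300], lengths_kb=[1.0, 2.0, 3.0])
print([round(v, 1) for v in example])
```

      Because TPM values sum to the same total in every sample, a subunit's TPM can fall across differentiation simply because other transcripts come to dominate the pool, which is one reason transcript-level and protein-level trends need not agree.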

      (2) The authors use an EpoR Cre for red cell-specific MED26 deletion. However, other studies have now shown that the EpoR Cre can also lead to recombination in the macrophage lineage, which clouds some of the in vivo conclusions for erythroid specificity. That being said, the in vitro erythropoiesis experiments here are convincing that there is a major erythroid-intrinsic effect.

      Thank you for this insightful comment. We recognize that EpoR-Cre can drive recombination in both erythroid and macrophage lineages (Zhang et al., Blood 2021, PMID: 34098576). However, EpoR-Cre remains the most widely used Cre for studying erythroid lineage effects in the hematopoietic community. Numerous studies have employed EpoR-Cre for erythroid-specific gene knockout models (Pang et al., Mol Cell Biol 2021, PMID: 22566683; Santana-Codina et al., Haematologica 2019, PMID: 30630985; Xu et al., Science 2013, PMID: 21998251).

      While a GYPA (CD235a)-Cre model with erythroid specificity has recently been developed (https://www.sciencedirect.com/science/article/pii/S0006497121029074), it has not yet been officially published. We look forward to utilizing the GYPA-Cre model for future studies. As you noted, our in vivo mouse model and primary human CD34+ erythroid differentiation system both demonstrate that MED26 is essential for erythropoiesis, suggesting that the regulatory effects of MED26 in our study are predominantly erythroid-intrinsic.

      (3) The donor chimerism assessment of mice transplanted with MED26 knockout cells is a bit troubling. First, there are no staining controls shown and the full gating strategy is not shown. Furthermore, the authors use the CD45.1/CD45.2 system to differentiate between donor and recipient cells in erythroblasts. However, CD45 is not expressed from the CD235a+ stage of erythropoiesis onwards, so it is unclear how the authors are detecting essentially zero CD45-negative cells in the erythroblast compartment. This is quite odd and raises questions about the results. That being said, the red cell indices in the mice are the much more convincing data.

      Thank you for your careful and thorough feedback. We have now included negative staining controls (Author response image 2A, top). We agree that CD45 is typically not expressed in erythroid precursors in normal development. Prior studies have characterized BFU-E and CFU-E stages as c-Kit+CD45+Ter119−CD71low and c-Kit+CD45−Ter119−CD71high cells in fetal liver (Katiyar et al., Cells 2023, PMID: 37174702).

      However, our observations indicate that erythroid surface markers differ during hematopoietic reconstitution following bone marrow transplantation. We found that nearly all nucleated erythroid progenitors/precursors (Ter119+Hoechst+) express CD45 after hematopoietic reconstitution (Author response image 2A, bottom).

      To validate our assay, we performed next-generation sequencing by first mixing mouse CD45.1 and CD45.2 total bone marrow cells at a 1:2 ratio. We then isolated nucleated erythroid progenitors/precursors (Ter119+Hoechst+) by FACS and sequenced the CD45 gene locus by targeted sequencing. The resulting CD45 allele distribution matched our initial mixing ratio, confirming the accuracy of our approach (Author response image 2B).

      Moreover, a recent study supports that reconstituted erythroid progenitors can indeed be distinguished by CD45 expression following bone marrow transplantation (He et al., Nature Aging 2024, PMID: 38632351; Extended Data Fig. 8).

      In conclusion, our data indicate that newly formed erythroid progenitors/precursors post-transplant express CD45, enabling us to identify nucleated erythroid progenitors/precursors by Ter119+Hoechst+ and determine their origin using CD45.1 and CD45.2 markers.

      Author response image 2.

      Representative flow cytometry gating strategy of erythroid chimerism following mouse bone marrow transplantation. A. Gating strategy used in the erythroid chimerism assay. B. Targeted sequencing result of Ter119+Hoechst+ cells isolated by FACS. The cell sample was pre-mixed with 1/3 CD45.2 and 2/3 CD45.1 bone marrow cells. Ptprc is the gene locus for CD45.

      (4) The authors make heavy use of defining "erythroid gene" sets and "non-erythroid gene" sets, but it is unclear what those lists of genes actually are. This makes it hard to assess any claims made about erythroid and non-erythroid genes.

      Thank you for this helpful suggestion. We defined "erythroid genes" and "non-erythroid genes" based on RNA-seq data from Ludwig et al. (Cell Reports 2019. PMID: 31189107. Figure 2 and Table S1). Genes downregulated from stages k1 to k5 are classified as “non-erythroid genes,” while genes upregulated from stages k6 to k7 are classified as “erythroid genes.” We will add this description in the revised manuscript.

      (5) Overall the data regarding condensate formation is difficult to interpret and is the weakest part of this paper. It is also unclear how studies of in vitro condensate formation or studies in 293T or K562 cells can truly relate to highly specialized erythroid biology. This does not detract from the major findings regarding genetic requirements of MED26 in erythropoiesis.

      Thank you for the rigorous feedback. Assessing the condensate properties of MED26 protein in primary CD34+ erythroid cells or mouse models is indeed challenging. As is common in many condensate studies, we used in vitro assays and cellular assays in HEK293T and K562 cells to examine the biophysical properties (Figure S7), condensate formation capacity (Figure 5C and Figure S7C), key phase-separation regions of MED26 protein (Figure S6), and recruitment of pausing factors (Figure 6A-B) in live cells. We then conducted functional assays to demonstrate that the phase-separation region of MED26 can promote erythroid differentiation similarly to the full-length protein in the CD34+ system and K562 cells (Figure 5A). Specifically, overexpressing the MED26 phase-separation domain accelerates erythropoiesis in primary human erythroid culture, while deleting the Intrinsically Disordered Region (IDR) impairs MED26’s ability to form condensates and recruit PAF1 in K562 cells.

      In summary, we used HEK293T cells to study the biochemical and biophysical properties of MED26, and the primary CD34+ differentiation system to examine its developmental roles. Our findings support the conclusion that MED26-associated condensate formation promotes erythropoiesis.

      (6) For many figures, there are some panels where conclusions are drawn, but no statistical quantification of whether a difference is significant or not.

      Thank you for your thorough feedback. We have checked all figures for statistical quantification and added the relevant statistical analysis methods to the corresponding figure legends (Figure 2L and Figure S4C) to clarify the significance of the observed differences. The updated information will be incorporated into the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Zhu et al describes a novel role for MED26, a subunit of the Mediator complex, in erythroid development. The authors have discovered that MED26 promotes transcriptional pausing of RNA Pol II, by recruiting pausing-related factors.

      Strengths:

      This is a well-executed study. The authors have employed a range of cutting-edge and appropriate techniques to generate their data, including: CUT&Tag to profile chromatin changes and mediator complex distribution; nuclear run-on sequencing (PRO-seq) to study Pol II dynamics; knockout mice to determine the phenotype of MED26 perturbation in vivo; an ex vivo erythroid differentiation system to perform additional, important, biochemical and perturbation experiments; immunoprecipitation mass spectrometry (IP-MS); and the "optoDroplet" assay to study phase-separation and molecular condensates.

      This is a real highlight of the study. The authors have managed to generate a comprehensive picture by employing these multiple techniques. In doing so, they have also managed to provide greater molecular insight into the workings of the MEDIATOR complex, an important multi-protein complex that plays an important role in a range of biological contexts. The insights the authors have uncovered for different subunits in erythropoiesis will very likely have ramifications in many other settings, in both healthy biology and disease contexts.

      Thank you for your thoughtful summary and encouraging feedback.

      Weaknesses:

      There are almost no discernible weaknesses in the techniques used, nor the interpretation of the data. The IP-MS data was generated in HEK293 cells when it could have been performed in the human CD34+ HSPC system that they employed to generate a number of the other data. This would have been a more natural setting and would have enabled a more like-for-like comparison with the other data.

      Thank you for your positive feedback and insightful suggestions. We will perform validation of the immunoprecipitation results in CD34+ derived erythroid cells to further confirm our findings.

      Reviewer #3 (Public review):

      Summary:

      The authors aim to explore whether other subunits besides MED1 exert specific functions during the process of terminal erythropoiesis with global gene repression, and finally they demonstrated that MED26-enriched condensates drive erythropoiesis through modulating transcription pausing.

      Strengths:

      Through both in vitro and in vivo models, the authors showed that while MED1 and MED26 co-occupy a plethora of genes important for cell survival and proliferation at the HSPC stage, MED26 preferentially marks erythroid genes and recruits pausing-related factors for cell fate specification. Gradually, MED26 becomes the dominant factor in shaping the composition of transcription condensates and transforms the chromatin towards a repressive yet permissive state, achieving global transcription repression in erythropoiesis.

      Thank you for your positive summary and feedback.

      Weaknesses:

      In the in vitro model, the authors used only CD34+ cell-derived erythropoiesis for validation, which is a relatively simple system; additional in vitro erythropoiesis models are needed to strengthen the conclusion.

      Thank you for your thoughtful suggestions. We have shown that MED26 promotes erythropoiesis using the primary human CD34+ differentiation system (Figure 2K-M and Figure S4) and have demonstrated its essential role in erythropoiesis through multiple mouse models (Figure 2A-G and Figure S1-3). Together, these in vitro and in vivo results support our conclusion that MED26 regulates erythropoiesis. However, we are open to further validating our findings with additional in vitro erythropoiesis models, such as iPSC or HUDEP erythroid differentiation systems.

    2. eLife Assessment

      This important study shows the role of MED26 in red cell formation. Linking transcription pausing with erythropoiesis is a key discovery. The data are solid, although there is still room for improvement. The in vivo data are limited by concerns about the specificity of the Cre model. Adding RNA-seq, using more erythroid markers such as Band 3 and α4-integrin, and orthogonal validation with an iPSC erythropoiesis model would improve the study.

    3. Reviewer #1 (Public review):

      Summary:

      In this study from Zhu and colleagues, a clear role for MED26 in mouse and human erythropoiesis is demonstrated that is also mapped to amino acids 88-480 of the human protein. The authors also show the unique expression of MED26 in later-stage erythropoiesis and propose transcriptional pausing and condensate formation mechanisms for MED26's role in promoting erythropoiesis. Despite the authors' introductory claim that many questions regarding Pol II pausing in mammalian development remain unanswered, the importance of transcriptional pausing in erythropoiesis has actually already been demonstrated (Martell-Smart, et al. 2023, PMID: 37586368, which the authors notably did not cite in this manuscript). Here, the novelty and strength of this study is MED26 and its unique expression kinetics during erythroid development.

      Strengths:

      The widespread characterization of kinetics of mediator complex component expression throughout the erythropoietic timeline is excellent and shows the interesting divergence of MED26 expression pattern from many other mediator complex components. The genetic evidence in conditional knockout mice for erythropoiesis requiring MED26 is outstanding. These are completely new models from the investigators and are an impressive amount of work to have both EpoR-driven deletion and inducible deletion. The effect on red cell number is strong in both. The genetic over-expression experiments are also quite impressive, especially the investigators' structure-function mapping in primary cells. Overall the data is quite convincing regarding the genetic requirement for MED26. The authors should be commended for demonstrating this in multiple rigorous ways.

      Weaknesses:

      (1) The authors state that MED26 was nominated for study based on RNA-seq analysis of a prior published dataset. They do not however display any of that RNA-seq analysis with regards to Mediator complex subunits. While they do a good job showing protein-level analysis during erythropoiesis for several subunits, the RNA-seq analysis would allow them to show the developmental expression dynamics of all subunit members.

      (2) The authors use an EpoR Cre for red cell-specific MED26 deletion. However, other studies have now shown that the EpoR Cre can also lead to recombination in the macrophage lineage, which clouds some of the in vivo conclusions for erythroid specificity. That being said, the in vitro erythropoiesis experiments here are convincing that there is a major erythroid-intrinsic effect.

      (3) The donor chimerism assessment of mice transplanted with MED26 knockout cells is a bit troubling. First, there are no staining controls shown and the full gating strategy is not shown. Furthermore, the authors use the CD45.1/CD45.2 system to differentiate between donor and recipient cells in erythroblasts. However, CD45 is not expressed from the CD235a+ stage of erythropoiesis onwards, so it is unclear how the authors are detecting essentially zero CD45-negative cells in the erythroblast compartment. This is quite odd and raises questions about the results. That being said, the red cell indices in the mice are the much more convincing data.

      (4) The authors make heavy use of defining "erythroid gene" sets and "non-erythroid gene" sets, but it is unclear what those lists of genes actually are. This makes it hard to assess any claims made about erythroid and non-erythroid genes.

      (5) Overall the data regarding condensate formation is difficult to interpret and is the weakest part of this paper. It is also unclear how studies of in vitro condensate formation or studies in 293T or K562 cells can truly relate to highly specialized erythroid biology. This does not detract from the major findings regarding genetic requirements of MED26 in erythropoiesis.

      (6) For many figures, there are some panels where conclusions are drawn, but no statistical quantification of whether a difference is significant or not.

    4. Reviewer #2 (Public review):

      Summary:

      The manuscript by Zhu et al describes a novel role for MED26, a subunit of the Mediator complex, in erythroid development. The authors have discovered that MED26 promotes transcriptional pausing of RNA Pol II, by recruiting pausing-related factors.

      Strengths:

      This is a well-executed study. The authors have employed a range of cutting-edge and appropriate techniques to generate their data, including: CUT&Tag to profile chromatin changes and mediator complex distribution; nuclear run-on sequencing (PRO-seq) to study Pol II dynamics; knockout mice to determine the phenotype of MED26 perturbation in vivo; an ex vivo erythroid differentiation system to perform additional, important, biochemical and perturbation experiments; immunoprecipitation mass spectrometry (IP-MS); and the "optoDroplet" assay to study phase-separation and molecular condensates.

      This is a real highlight of the study. The authors have managed to generate a comprehensive picture by employing these multiple techniques. In doing so, they have also managed to provide greater molecular insight into the workings of the MEDIATOR complex, an important multi-protein complex that plays an important role in a range of biological contexts. The insights the authors have uncovered for different subunits in erythropoiesis will very likely have ramifications in many other settings, in both healthy biology and disease contexts.

      Weaknesses:

      There are almost no discernible weaknesses in the techniques used, nor the interpretation of the data. The IP-MS data was generated in HEK293 cells when it could have been performed in the human CD34+ HSPC system that they employed to generate a number of the other data. This would have been a more natural setting and would have enabled a more like-for-like comparison with the other data.

    5. Reviewer #3 (Public review):

      Summary:

      The authors aim to explore whether other subunits besides MED1 exert specific functions during the process of terminal erythropoiesis with global gene repression, and finally they demonstrated that MED26-enriched condensates drive erythropoiesis through modulating transcription pausing.

      Strengths:

      Through both in vitro and in vivo models, the authors showed that while MED1 and MED26 co-occupy a plethora of genes important for cell survival and proliferation at the HSPC stage, MED26 preferentially marks erythroid genes and recruits pausing-related factors for cell fate specification. Gradually, MED26 becomes the dominant factor in shaping the composition of transcription condensates and transforms the chromatin towards a repressive yet permissive state, achieving global transcription repression in erythropoiesis.

      Weaknesses:

      In the in vitro model, the authors used only CD34+ cell-derived erythropoiesis for validation, which is a relatively simple system; additional in vitro erythropoiesis models are needed to strengthen the conclusion.

    1. eLife Assessment

      This is a valuable work that convincingly reveals that place cells in the hippocampus that exhibit repeated firing fields incorporate information about non-positional variables in each firing field. They reveal that individual firing fields of a single place cell can exhibit tuning to different head orientations, suggesting hippocampal neurons are flexible in terms of how they incorporate non-positional inputs.

    2. Reviewer #1 (Public Review):

      The authors investigate whether during free exploration of an environment with an internal structure of corridors and occasionally fluid-rewarded alleys, rat CA1 place cells generate multiple firing fields in repeating patterns, allowing the investigators to analyze whether firing field positional properties like alley orientation, and non-positional properties like heading, field-rate modulation and other properties are similar or different within and across single place cell place fields. They adopt a standard cognitive map analysis framework, conceiving each cell as an individual map element and characterizing each cell's individual activity independently of the activity of other cells, such that the main unit of analysis is a place field averaged across recording times of many minutes. Despite framing the work as an investigation of a fundamentally-subjective episodic memory system sensitive to hidden cognitive and attentional variables, the experiment and analyses are conceived as if the cells respond to positional and non-positional features of experience as static "inputs" that the investigators infer. These "inputs" are conceptualized as effectively stationary and steady, and they are not manipulated. The authors find that there are many "repeated" firing fields, that they tend to have similar orientation more than expected by chance, and that each field's rate is modulated distinctly by heading direction and other factors, leading them to conclude that each field's nonpositional inputs are "individually addressable." 
      The authors do not consider alternative possibilities for which there are strong indications in the contemporary literature: 1) that CA1 activity could be internally generated; 2) that hidden cognitive variables could influence CA1 activity episodically and in non-stationary ways rather than consistently; 3) that CA1 cells exhibit mixed tuning to a variety of environmental and navigational variables; 4) that CA1 activity is better interpreted from the point of view of a neural ensemble or a neural manifold of conjoint neural activity that represents multiple information variables; or 5) that stable neural representations of information need not depend on stable stimulus-response properties of individual cells. In fact, the analyses provide evidence consistent with each of these alternatives, but they are not considered. There is a case to be made that the authors are allowed to ignore these alternatives because they properly engage the dogmatic point of view, in which case there is little to adjust in the manuscript, which is both well-conceived and well-executed in the classic (but not contemporary) norms of place cell investigations.

      My comments are focused on improving the manuscript without insisting that the authors adopt alternative (contemporary) points of view, but requiring them to clarify their point of view and explain that there are alternatives.

      (1) The authors define what they mean by "positional" and "non-positional" "inputs" later in the manuscript. Since the experimental apparatus and task have been designed to isolate these "inputs" the authors should in the initial description of the environment and task explain what the task does and does not allow them to analyze. Instead, they have repeatedly asserted that the environment is a hybrid of an open-field and a linear track environment. This may be the case, but so what? The authors need to better explain, up front, why that matters and what they will be able to investigate as a result. As written, this all seems to me rather vague and post hoc.

      (2) The abstract states "Previous work implies a distinction between positional inputs to the hippocampus that provide information about an animal's location and non-positional inputs which provide information about the content of experience." While I understand what the authors mean, I want to point out that it is not straightforward to identify the "positional inputs" and the "non-positional inputs." What are they, and how can they be measured? Is it not also possible that the hippocampus generates "positional" information rather than receiving it? That is in fact the longstanding view of the cognitive map framework that the authors have adopted, and yet they frame the essential issue as one of differential receipt of positional and non-positional inputs. This seems to me imprecise and hard to defend but demonstrates the authors' opinion in framing this work. In my view a more objective and accurate statement might be "Previous work implies a distinction between hippocampal (positional) activity representing information about an animal's location and (non-positional) activity which represents information about the content of experience." This opinion about "inputs" is found throughout the manuscript over 50 times, starting with the title. While in my view this is not an objective treatment of the experimental design or data (positional and non-positional inputs are never identified or manipulated, they are merely inferred), I accept that the authors can say whatever they want so long as they make it clear to the reader that theirs is an opinion or assumption rather than a measurement. The manuscript is written as if the different inputs are identified and valid, rather than inferred.

      (3) The abstract states "even though the animal's behavior was not constrained to 1-D trajectories" whereas page 13 states "but their trajectories were constrained to orthogonal directions by the city-maze architecture" and page 23 states "but their trajectories were constrained to a rectilinear grid." While I understand what the authors mean, the first statement appears to contradict the others. There are additional examples that I do not identify here. In any case, I would like to have seen examples of the animals' trajectories through the maze. A figure showing the raw trajectories and another after the unwanted behaviors have been filtered out should be given, allowing the reader to understand how much the animals tended to travel through the alleys, how much they turned and lingered within them, etc.

      (4) The abstract ends with "These results demonstrate that the positional inputs that drive a cell to fire in similar locations across the maze can be behaviorally and temporally dissociated from the nonpositional inputs that alter the firing rates of the cell within its place fields, thereby increasing the flexibility of the system to encode episodic variables within a spatiotemporal framework provided by place cells." I don't see the evidence for the "thereby ..." claim. The authors are free to speculate and discuss but they should say they are speculating and/or discussing a possibility, rather than assert as if they have demonstrated a fact.

      (5) The Introduction begins with "All behavior is embedded within a spatial and temporal framework." By this statement, I believe the authors mean to assert, or at least they cause a reader to understand that there is a spatial and temporal framework that is separate from the behaving subject. They will use this point of view to design their experiment around the utility of a city-maze. Since the authors appeal to cognitive map theory so much, I point out that O'Keefe and Nadel write in The Hippocampus as a Cognitive Map that "Space was a way of perceiving, not a thing to be perceived." Sentence number 2 of the book states "We shall argue that the hippocampus is the core of a neural memory system providing an objective spatial framework within which the items and events of an organism's experience are located and interrelated." Consistent with Kant and O'Keefe and Nadel, the present authors might more accurately state "All behavior is embedded within a subjective spatial and temporal framework." but then they will have to explain why they conceive of there being "positional inputs" to which they are measuring CA1 responses. This framing seems to me problematic and not logically self-consistent.

      (6) On page 2 the authors assert "Neurons within the hippocampus respond to a wide array of sensory and otherwise nonspatial cues..." then they go on to list sensory features and "non-positional" features of experience to which CA1 cells respond. It seems to me they leave out a class of features of experience that might be considered "subjective spatial frames" that have been investigated by Gothard and Redish when they were in the McNaughton and Barnes lab, as well as the Fenton and Muller labs, amongst others. All of these papers describe non-stationary, multi-stable place cell phenomena that are tied to subjective variables, which have the potential to undermine the premise of the present work's analyses and so they should be considered. I list a sample but certainly not all the work that might be considered.

      Gothard KM, Skaggs WE, Moore KM, McNaughton BL (1996) Binding of hippocampal CA1 neural activity to multiple reference frames in a landmark-based navigation task. J Neurosci 16:823-835.

      Gothard KM, Skaggs WE, McNaughton BL (1996) Dynamics of mismatch correction in the hippocampal ensemble code for space: interaction between path integration and environmental cues. J Neurosci 16:8027-8040.

      Gothard KM, Hoffman KL, Battaglia FP, McNaughton BL (2001) Dentate gyrus and CA1 ensemble activity during spatial reference frame shifts in the presence and absence of visual input. J Neurosci 21:7284-7292.

      Redish AD, Rosenzweig ES, Bohanick JD, McNaughton BL, Barnes CA (2000) Dynamics of hippocampal ensemble activity realignment: time versus space. J Neurosci 20:9298-9309.

      Rosenzweig ES, Redish AD, McNaughton BL, Barnes CA (2003) Hippocampal map realignment and spatial learning. Nat Neurosci 6:609-615.

      Jackson J, Redish AD (2007) Network dynamics of hippocampal cell-assemblies resemble multiple spatial maps within single tasks. Hippocampus 17:1209-1229.

      Lenck-Santini PP, Fenton AA, Muller RU (2008) Discharge properties of hippocampal neurons during performance of a jump avoidance task. J Neurosci 28:6773-6786.

      Fenton AA, Lytton WW, Barry JM, Lenck-Santini PP, Zinyuk LE, Kubik S, Bures J, Poucet B, Muller RU, Olypher AV (2010) Attention-like modulation of hippocampus place cell discharge. J Neurosci 30:4613-4625.

      Kelemen E, Fenton AA (2013) Key features of human episodic recollection in the cross-episode retrieval of rat hippocampus representations of space. PLoS Biol 11:e1001607.

      (7) The Introduction asserts that "rate remapping" is a hypothesis. Rate remapping is a phenomenon, something that is observed. The interpretation of the observation as being the substrate of episodic memory is certainly a hypothesis that in my opinion has not been tested and is not being tested in the present work. After making the above statement, the authors go on to describe that firing rates differ across "repeated" firing fields, which seems to be a form of rate remapping, and predicted by the relevant hypothesis that different episodes of experience at the same locations are represented by different firing rates. This is very speculative and there are many other explanations.

      (8) The Introduction ends with the statement "Here, we show that repeating fields of the same neuron do not always display the same nonpositional rate modulation, demonstrating that nonpositional cues are dissociable from, and more flexible than, the positional inputs onto place cells in a given environment." Apart from my concern about using the "input" terminology, I wish to point out that there is very little novel in this statement. It has been described many times before that on linear tracks CA1 firing fields are directionally modulated such that the field rates for traversals in one direction are different compared to field traversals in the opposite direction. Jackson and Redish (2007) cited above show this to be due to reference frame or map switching. That and other work allow one to state that "Others show that repeating fields of the same neuron do not always display the same nonpositional rate modulation, demonstrating that nonpositional cues are dissociable from, and more flexible than, the positional inputs onto place cells in a given environment." Either the present authors should acknowledge that they are demonstrating what others have already demonstrated, or they should more precisely describe what about their contribution is unique.

      (9) Page 6 Methods - Data Filtering and Pre-processing. How did the authors handle theta cells and others that fired more or less everywhere but with spatial modulation?

      (10) Page 9 Methods - Why was the session-wide activity used to normalize the firing rates for the activity vector input to the random forest classifier? The authors state "The normalized firing rate was computed as discussed above with the change that the session-wide activity in the alley was used." It seems to me better to have used the session-averaged firing rate map because the activity would be normalized by the expected positional firing. I imagine "The classifier used the population vector of firing rates as the input." is incorrect and the authors mean to state "The classifier used the population vector of normalized firing rates as the input."
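
      To make the distinction in comment (10) concrete, the sketch below contrasts the two normalizations on a single pass through an alley. It is only an illustration under stated assumptions: the function names, the three-bin rate map, the occupancy times, and the spike count are all hypothetical and are not taken from the manuscript's Methods.

```python
def alley_mean_normalized(pass_spikes, pass_duration_s, session_alley_rate_hz):
    """Rate on one pass divided by the cell's session-wide mean rate in the alley."""
    return (pass_spikes / pass_duration_s) / session_alley_rate_hz


def rate_map_normalized(pass_spikes, occupancy_s, rate_map_hz):
    """Observed spikes on one pass divided by the count expected from the
    session-averaged rate map, given the time spent in each spatial bin."""
    expected_spikes = sum(r * t for r, t in zip(rate_map_hz, occupancy_s))
    return pass_spikes / expected_spikes


# Hypothetical pass in which the animal lingers in a low-rate bin.
rate_map_hz = [2.0, 10.0, 2.0]    # assumed session-averaged rate per spatial bin
occupancy_s = [1.5, 0.25, 0.25]   # assumed seconds spent in each bin on this pass
pass_spikes = 6                   # assumed spike count on this pass
session_alley_rate_hz = 6.0       # assumed session-wide mean rate in the alley

by_alley_mean = alley_mean_normalized(
    pass_spikes, sum(occupancy_s), session_alley_rate_hz
)
by_rate_map = rate_map_normalized(pass_spikes, occupancy_s, rate_map_hz)
```

      With these made-up numbers the alley-wide normalization reports half the usual activity, whereas the rate-map normalization reports exactly the firing expected from where the animal spent its time, which is the confound the comment raises.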

      (11) What does "spatially-gated" mean? The use of such jargon should be explained, or better avoided.

      (12) Page 12: Since fields tend to have similar orientations, but not repeat at all geometrically similar locations, did they tend to be clustered? Was there a proximity feature to their distribution?

      (13) Page 18 states "Thus, although there was a slight trend for repeating field ..." The authors are reporting a significant effect not a "slight trend." They do something similar in reporting Figure 5's result. Despite significant effects, they seem to think the findings are not large enough so state that repeating-field directionality is not conserved. It is fine to explain that a significant effect was small (for example give the effect size, which would have been welcome throughout) but as in these cases and others, the authors should be more objective in their reporting of the outcomes. Either a statistical test was or was not significant. It is not "a little" or "a lot" significant.
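
      As an illustration of reporting an effect size alongside a significance test, the sketch below computes Cohen's d for two groups. This is one common choice for a two-group comparison and is an assumption here; the manuscript's tests may call for a different measure, and the sample values are invented.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Invented values standing in for a per-field measure in two conditions.
same_corridor = [1.0, 2.0, 3.0, 4.0, 5.0]
different_corridor = [2.0, 3.0, 4.0, 5.0, 6.0]
d = cohens_d(same_corridor, different_corridor)
```

      Reporting d (or another effect size) next to the p-value lets a reader separate whether an effect exists from how large it is, without describing significance itself as slight or strong.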

      (14) Page 18: What do the authors mean by "topology?" Might they mean "topography?"

      (15) Figure 6 shows field instability and multi-stability (termed temporal dynamics) as described on page 22. The recording sessions were 60 min. Is this impression simply due to long recording sessions? If 10 or 15 minutes of data were analyzed (which is more the norm), would similar instability be observed/detectable?

      (16) I found the Discussion very confusing. On the one hand, there is an assertion that because the location of firing fields is stable there is a "positional code." How would that actually work? Any neural system has to signal by firing rates or firing coincidences across groups of cells (that are affected by changes in rate) so if there is firing field firing rate instability the authors should explain how position can be accurately decoded on a behaviorally-meaningful time scale. In fact, they should demonstrate such decoding explicitly. Just because there is modulation and instability, it is a rather long leap to assert that this is how episodic experience/memory is encoded (as stated at the end of the abstract and elsewhere, for example on page 24: "The present data utilize repeating fields to suggest that, within an environment, the positional inputs are relatively rigid, whereas the nonpositional inputs are more flexible, allowing different repeating fields to show different directional preferences. In other words, fields are individually addressable with respect to the nonpositional inputs they receive; they do not inherit their nonpositional tuning as a global property of the cell."). What does it mean that a field is "individually addressable?" How is that achieved by neurons? If the authors want to make such assertions they should explain and demonstrate how their assertions can be valid, given the data and findings. At least they should explain what they are assuming.

      The main findings seem related to the published finding that in large environments place cells have multiple firing fields, with distinct rates in each field, quite similar to what is here described in the city maze.

      In my opinion, positional representations can only plausibly work in such cases by using the conjoint population activity moment to moment, which necessarily marginalizes the value of individual firing fields, yet the present work focuses the discussion (and analyses) on interpretations of single firing fields (which they assert are individually addressable multiple times). I don't know what that means exactly and the authors should explain why maintaining the standard single-field perspective is appropriate and how position can be represented in such a system, given the data. In fact, I would have thought that the present findings would cause the authors to reject as invalid the framework they have adopted.

      (17) This is a further example, on page 25 which asserts that "Directionality is affected by an animal's experience through the field (Navratilova et al., 2012), so it is possible the difference in experience between sampling fields on the same versus different corridors affects the directional tuning properties between them." I do not understand how "the difference in experience between sampling fields on the same versus different corridors affects the directional tuning properties between them." If I follow the logic then the so-called directionality would depend on experience and so only emerge after a certain time for experience, or else the firing during one traversal would need to be modulated by information about future traversals, which I suppose the authors would agree does not make sense.

      (18) I found it at times confusing to follow the arguments because the terms "route" and "trajectory" and also "direction" and "heading" were used sometimes interchangeably and sometimes in ways that appear distinct.

      (19) Page 25 states "One explanation for these data is that fields sampled along contiguous routes, without interruptions from heading change or reward delivery, are more likely to share their directionality." The authors should consider alternative explanations like reference frame shifts as mentioned in comment 6 above. These alternatives can be rejected based on data, but they should be considered because they seem to offer more parsimonious explanations for the observations than what the authors have offered. For example, what can explain the bimodality reported in Fig. 5G?

      (20) The authors assert on page 15 that "In the present study, turns at the ends of corridors, along with reward deliveries, may be salient task boundaries at which point theta sequences are terminated. Fields active within the same theta sequence (typically same corridor fields) may be functionally coupled, while fields active on opposite sides of a theta sequence termination (different corridor fields) may be uncoupled and their tuning uncorrelated." The authors should check this. They recorded the LFPs. Why speculate when they can evaluate the speculation?

      (21) The authors assert on page 26 "It is important to note that because a Pearson correlation was used, it is possible the fields are related in time with a phase shift, and we did not have the statistical power to test this possibility adequately." I either do not understand this statement or it is untrue. Please clarify.
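
      One possible reading of the statement quoted in comment (21) is that a zero-lag Pearson correlation can miss a relationship that is shifted in time. The sketch below illustrates only that general statistical point with synthetic sinusoids. The signals, the period (`PERIOD`), the lag (`LAG`), and the helper names are illustrative assumptions and are not taken from the manuscript's data or analysis.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def lagged_pearson(x, y, lag):
    """Correlate x[i] with y[i + lag] for a non-negative integer lag."""
    return pearson(x[:len(x) - lag], y[lag:])


# Hypothetical signals: y is x delayed by a quarter of a period.
PERIOD = 40   # samples per cycle (assumed)
N = 400       # ten full cycles
LAG = 10      # quarter-period delay (assumed)

x = [math.sin(2 * math.pi * i / PERIOD) for i in range(N)]
y = [math.sin(2 * math.pi * (i - LAG) / PERIOD) for i in range(N)]

r_zero_lag = lagged_pearson(x, y, 0)    # close to 0 despite the exact relationship
r_at_shift = lagged_pearson(x, y, LAG)  # close to 1 once the delay is accounted for
print(r_zero_lag, r_at_shift)
```

      Under these assumptions, a near-zero correlation at zero lag does not by itself rule out a time-shifted coupling; a lagged (cross-correlation) analysis would be needed to test for one.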

      (22) The authors continue on page 26, asserting "Thus, although it is clear that the place fields of repeating cells do not change their firing rates in synchrony, as if the cell had a global excitability change that made all its fields wax and wane together, it nonetheless remains an open question as to whether the subfields of repeating cells engage in certain types of competitive interactions or other network dynamics that couple changes in their firing rates in more complex ways." This statement implies that it might even be possible for firing fields in distinct and distant locations to be modulated together. Could the authors please explain how that is possible? A firing field is an observation that requires averaging over minutes and behavioral sampling across minutes. How might one cell be modulated to fire at a low rate during one minute and then at another minute later be modulated to fire at a high rate everywhere in the environment? Perhaps I am again not understanding the assertion - please clarify.

    3. Reviewer #2 (Public Review):

      The authors present evidence that free-foraging behavior within an environment having structural regularity in its distribution of obstacles (an internal "city block" configuration) yields multiple place-specific firing fields for CA1 neurons. These fields tend to be aligned to analogous locations within the environment. Aligned fields tend to share direction-biased tuning of place-specific activity. The distribution of in-field firing rates across repeating fields of individual neurons varies and in a reliable enough fashion, that reconstruction of the animal's location in the environment can still be achieved. These results are interpreted as reflecting a combined mapping of environmental position as well as repeating structural features of the environment. The results have strong implications for understanding how navigation and spatial awareness might be represented within environments having such regularities (e.g., a city such as Manhattan). Further, the results suggest that repeating firing fields for CA1 neurons can develop in the absence of regularized path-running behavior. Finally, the authors consider drift in the character of the representation across time to represent the position in time across the foraging session. This last claim lacks evidence for reproducibility and is unnecessarily speculative. Altogether, the work is original and, for the most part, well-evidenced.

    1. eLife Assessment

      This paper reports the synthesis of covalent inhibitors bearing a unique fragment as a protected covalent warhead for irreversible binding to histidine in carbonic anhydrase (CA) enzymes. These findings are important due to the broad utility of the approach for covalent drug discovery applications and could have long-term impacts on related covalent targeting approaches. The data convincingly support the main conclusions of the paper.

    2. Reviewer #1 (Public review):

      Summary:

      This paper describes the covalent interactions of small molecule inhibitors of carbonic anhydrase IX, utilizing a pre-cursor molecule capable of undergoing beta-elimination to form the vinyl sulfone and covalent warhead.

      Strengths:

      The use of a novel covalent pre-cursor molecule that undergoes beta-elimination to form the vinyl sulfone in situ. Sufficient structure-activity relationships across a number of leaving groups, as well as binding moieties that impact binding and dissociation constants.

      Weaknesses:

      No major weaknesses noted. Suggested corrections were addressed.

    3. Reviewer #2 (Public review):

      Summary:

      The authors utilized a "ligand-first" targeted covalent inhibition approach to design potent inhibitors of carbonic anhydrase IX (CAIX) based on a known non-covalent primary sulfonamide scaffold. The novelty of their approach lies in their use of a protected pre-vinylsulfone as a precursor to the common vinylsulfone covalent warhead to target a nonstandard His residue in the active site of CAIX. In addition to biochemical assessment of their inhibitors, they showed that their compounds compete with a known probe on the surface of HeLa cells.

      Strengths:

      The authors use a protected warhead for what would typically be considered an "especially hot" or even "undevelopable" vinylsulfone electrophile. This would be the first report of doing so making it a novel targeted covalent inhibition approach specifically with vinylsulfones.

      The authors used a number of orthogonal biochemical and biophysical methods including intact MS, 2D NMR, x-ray crystallography, and an enzymatic stopped-flow setup to confirm the covalency of their compounds and even demonstrate that this novel pre-vinylsulfone is activated in the presence of CAIX. In addition, they included a number of compelling analogs of their inhibitors as negative controls that address hypotheses specific to the mechanism of activation and inhibition.

      The authors employed an assay that allows them to assess target engagement of their compounds with the target on the surface of cells and a fluorescent probe which is generally a critical tool to be used in tandem with phenotypic cellular assays.

      Weaknesses:

      This reviewer does not find any major weaknesses beyond those noted in the first round of review.

      I understand that some of the previously suggested experiments are cumbersome and I look forward to seeing this manuscript published as well as follow-up on this work in the future.

    4. Reviewer #3 (Public review):

      Summary:

      Targeted covalent inhibition of therapeutically relevant proteins is an attractive approach in drug development. This manuscript now reports a series of covalent inhibitors for human carbonic anhydrase (CA) isozymes (CAI, CAII, CAIX, and CAXIII) for irreversible binding to a critical histidine amino acid in the active site pocket. To support their findings, they included co-crystal structures of CAI, CAII, and CAIX in the presence of three such inhibitors. Mass spectrometry and enzymatic recovery assays validate these findings, and the results and cellular activity data are convincing.

      Strengths:

      The authors designed a series of covalent inhibitors and carefully selected non-covalent counterparts to make their findings about the selectivity of covalent inhibitors for CA isozymes quite convincing. The supportive X-ray crystallography and MS data are significant strengths. Their approach of targeted binding of the covalent inhibitors to histidine in CA isozyme may have broad utility for developing covalent inhibitors.

      Weaknesses:

      This reviewer did not find any significant weaknesses. The authors have incorporated most of my suggestions from the first round of review.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper describes the covalent interactions of small molecule inhibitors of carbonic anhydrase IX, utilizing a pre-cursor molecule capable of undergoing beta-elimination to form the vinyl sulfone and covalent warhead.

      Strengths:

      The use of a novel covalent pre-cursor molecule that undergoes beta-elimination to form the vinyl sulfone in situ. Sufficient structure-activity relationships across a number of leaving groups, as well as binding moieties that impact binding and dissociation constants.

      Overall, the paper is clearly written and provides sufficient data to support the hypothesis and observations. The findings and outcomes are significant for covalent drug discovery applications and could have long-term impacts on related covalent targeting approaches.

      Weaknesses:

      No major weaknesses were noted by this reviewer.

      Reviewer #2 (Public review):

      Summary:

      The authors utilized a "ligand-first" targeted covalent inhibition approach to design potent inhibitors of carbonic anhydrase IX (CAIX) based on a known non-covalent primary sulfonamide scaffold. The novelty of their approach lies in their use of a protected pre(pro?)-vinylsulfone as a precursor to the common vinylsulfone covalent warhead to target a nonstandard His residue in the active site of CAIX. In addition to a biochemical assessment of their inhibitors, they showed that their compounds compete with a known probe on the surface of HeLa cells.

      Strengths:

      The authors use a protected warhead for what would typically be considered an "especially hot" or even "undevelopable" vinylsulfone electrophile. This would be the first report of doing so making it a novel targeted covalent inhibition approach specifically with vinylsulfones.

      The authors used a number of orthogonal biochemical and biophysical methods including intact MS, 2D NMR, x-ray crystallography, and an enzymatic stopped-flow setup to confirm the covalency of their compounds and even demonstrate that this novel pre-vinylsulfone is activated in the presence of CAIX. In addition, they included a number of compelling analogs of their inhibitors as negative controls that address hypotheses specific to the mechanism of activation and inhibition.

      The authors employed an assay that allows them to assess target engagement of their compounds with the target on the surface of cells and a fluorescent probe which is generally a critical tool to be used in tandem with phenotypic cellular assays.

      Weaknesses:

      While the authors show that the pre-vinyl moiety is shown biochemically to be transformed into the vinylsulfone, they do not show what the fate of this -SO2CH2CH2OCOR group is in a cellular context. Does the pre-vinylsulfone in fact need to be in the active site of CAIX on the surface of the cell to be activated or is the vinylsulfone revealed prior to target engagement?

      I appreciate the authors acknowledging the limitations of using an assay such as thermal shift to derive an apparent binding affinity, however, it is not entirely convincing and leaves a gap in our understanding of what is happening biochemically with these inhibitors, especially given the two-step inhibitory mechanism. It is very difficult to properly understand the activity of these inhibitors without a more comprehensive evaluation of kinact and Ki parameters. This can then bring into question how selective these compounds actually are for CAIX over other carbonic anhydrases.

      The authors did not provide any cellular data beyond target engagement with a previously characterized competitive fluorescent probe. It would be critical to know the cytotoxicity profile of these compounds or even how they affect the biology of interest regarding CAIX activity if the intention is to use these compounds in the future as chemical probes to assess CAIX activity in the context of tumor metastasis.

      Reviewer #3 (Public review):

      Summary:

      Targeted covalent inhibition of therapeutically relevant proteins is an attractive approach in drug development. This manuscript now reports a series of covalent inhibitors for human carbonic anhydrase (CA) isozymes (CAI, CAII, CAIX, and CAXIII) for irreversible binding to a critical histidine amino acid in the active site pocket. To support their findings, they included co-crystal structures of CAI, CAII, and CAIX in the presence of three such inhibitors. Mass spectrometry and enzymatic recovery assays validate these findings, and the results and cellular activity data are convincing.

      Strengths:

      The authors designed a series of covalent inhibitors and carefully selected non-covalent counterparts to make their findings about the selectivity of covalent inhibitors for CA isozymes quite convincing. The supportive X-ray crystallography and MS data are significant strengths. Their approach of targeted binding of the covalent inhibitors to histidine in CA isozyme may have broad utility for developing covalent inhibitors.

      Weaknesses:

      This reviewer did not find any significant weaknesses. However, I suggest several points in the recommendation for the authors' section for authors to consider.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers have made excellent suggestions. We believe a revised version addressing those points can improve the assessment and quality of your work.

      Reviewer #1 (Recommendations for the authors):

      (1) The beta-elimination process is referred to as a "rearrangement" in both the text and the Figure 2 legend. Based on the proposed mechanism the authors provided, it is a simple beta-elimination and conjugate addition mechanism, and is not a rearrangement mechanism. This change should be reflected in the text and Figure 2 legend.

      We have made the requested change from rearrangement to elimination reaction.

      (2) From a structure-based design perspective, it is not obvious why only large cyclo-alkyl groups were used to target the lipophilic pocket, with the exception of the phenyl carbamates. Perhaps this is background literature on CAIX that describes this? It seems like this is a flexible functional moiety that could be used to impact drug properties. Why were other lipophilic and especially more aromatic or heteroaromatic moieties not studied?

      The structure-affinity relationship of the lipophilic ring versus other moieties has been studied and reported previously in manuscripts: Dudutiene 2014, Zubriene 2017, Linkuviene 2018, chapter 16 by Zubriene (https://doi.org/10.1007/978-3-030-12780-0_16). The lipophilic ring served better than a flexible tail or an aromatic ring.

      (3) The color-coded "correlation map" in Figure 8 is difficult to follow. Perhaps a standard SAR table with selectivity and affinity values would be easier to read and follow.

      We are trying to promote “correlation maps” because in our opinion they are easier to follow than tables.

      (4) Although there is a statement for this in line 254 of the SI, the compound numbering in the SI, vs. the numbering used in the manuscript is confusing. The standard format for these is to consecutively number all compounds and have identical compound numbers in both the SI and manuscript. The synthetic intermediates included in the SI can be identified by IUPAC names.

      An additional numbering system had to be made because the synthesis was described in the supplementary materials. We would prefer to leave the numbering as in the current manuscript. There are quite a few intermediate compounds that we assigned intermediate numbers such as 20x in order to make it simpler to distinguish intermediate synthesis compounds from compounds that were studied for binding affinity.

      (5) Ranges of isolated yields for the synthetic steps in SI schemes S1, S2, and S3 need to be included.

      We have remade the SI schemes S1, S2, and S3 to include the yields of each compound.

      (6) Presumably, the AcOH/H2O2 reaction forms the sulfones and not sulfoxides when heat is used. In the SI, the structures of 9x and 10x are shown to be sulfoxides and not sulfones. Initially, this is thought to be a simple structural mistake, however, this is concerning, since the HRMS data (for compound 9x) reported is for the sulfoxide (HRMS for C8H7F4NO4S2 [(M+H)+]: calc. 321.9825, found 321.9824. 482) and not the sulfone? In the synthesis scheme S1, condition "C" is used for both the sulfoxide and sulfone synthesis (i.e. 3ax to 9x vs. 12x to 13x). It appears the sulfoxide is prepared using a room temperature procedure, vs. the sulfone requiring 75 degrees centigrade heat. These two similar conditions need to be designated as different synthetic steps in the schemes with the specific conditions noted since the products formed are different.

      We have made requested corrections/adjustments and added separate reaction conditions for sulfoxide synthesis in SI scheme S1.
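
      The mass argument in comment (6) can be checked with a short monoisotopic-mass calculation. The sketch below is not part of the review or the manuscript: it assumes standard monoisotopic atomic masses, and the helper names (`parse_formula`, `mh_plus`) are hypothetical. The second formula, which carries one additional oxygen, is an illustrative assumption used only to show the size of the shift that one oxygen atom produces; the sketch does not assert which structure corresponds to which formula.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "F": 18.99840322,
    "S": 31.97207100,
}
ELECTRON = 0.000548579909  # electron mass (u)


def parse_formula(formula):
    """Turn a simple formula such as 'C8H7F4NO4S2' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mh_plus(formula):
    """Monoisotopic m/z of the singly protonated ion [M+H]+."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())
    # Add a hydrogen atom and remove one electron to obtain the cation.
    return neutral + MONOISOTOPIC["H"] - ELECTRON


reported = mh_plus("C8H7F4NO4S2")         # formula quoted in comment (6)
one_more_oxygen = mh_plus("C8H7F4NO5S2")  # hypothetical formula with one extra O

print(f"{reported:.4f}")
print(f"{one_more_oxygen - reported:.4f}")
```

      With these masses the quoted formula reproduces the calculated value cited in the comment (321.9825), and one extra oxygen shifts the expected [M+H]+ by about 15.995, far more than the reported difference between calculated and found values.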

      Reviewer #2 (Recommendations for the authors):

      I appreciate that it's difficult to determine parameters such as kinact or Ki of such potent inhibitors and ones that work by a two-step mechanism. I might suggest characterizing the steps separately to determine the detailed parameters. Maybe something like NMR for the activation step and SPR for the kinact and Ki of the unmasked vinylsulfone?

      We agree that such information would be helpful. However, it requires significant effort and equipment and will be performed in a separate study.

      I always advocate for at least a global proteomics analysis using a pulldown probe to get an idea of the specificity profile, especially for the so-far untried and untested pre-vinylsulfone moiety.

      We fully agree that the pull-down assay is a good idea. However, this major task will be performed in a separate study.

      This might be picky but wouldn't this be considered a pro-vinylsulfone rather than pre-vinylsulfone? Just as the term "prodrug" is used?

      We agree that both the pre-vinylsulfone and pro-vinylsulfone are suitable names. However, in pharmacology, the prodrug is common, but in organic synthesis, the precursor is commonly used. Therefore, we prefer to keep the pre-vinylsulfone.

      I would also be curious to know what species is responsible for activating the compound to the vinylsulfone. Maybe make some key point mutations of nearby basic residues?

      The His64 formed the covalent bond, thus His64 was the likely activating base. Preparing a mutation could be a good path for future studies.

      Reviewer #3 (Recommendations for the authors):

      (1) The authors presented only a close-up view of the active site with a 2Fo-Fc map mesh in three panels of Figure 4. For readers unfamiliar with the carbonic anhydrase field, adding a complete illustration of each protein-inhibitor complex (protein in cartoon mode and ligand in stick) will be helpful. Also, an image of the 180º rotation of the close-up view presented in each panel should be added. Depicting h-bonds between critical residues (Asn62, Gln 92, etc.) with dashed lines and marking the distances will be helpful for readers.

      We have prepared the requested picture for CAIX. Panels on the left show a view of the entire protein molecule with the bound ligand for each isozyme, and there are two close-up views of each structure rotated by 180 degrees.

      (2) Line 198 should be revised to refer to the correct complexes. 20, 21, and 23 should be 21, 20, 23.

      We appreciate that the reviewer noticed this error. We corrected the mistake.

      (3) Omit electron density maps around each ligand in Figure 4 should be included for compounds 20, 21, and 23, perhaps as a supplementary figure.

      Detailed electron density map information is provided in the mtz files that have been submitted to the PDB. We think the omit maps are not necessary in the supplementary materials.

      (4) The cyclooctyl group is stabilized by hydrophobic active site residues, L131, A135, L141, and L198. However, only L131 is shown in Figure 4. All residues that stabilize the ligands should be shown.

      For clarity purposes of the figure, we have omitted some of the residues that make contact with the ligand molecule. We think that the structure provided to the PDB could be analyzed in detail to see all contacts between the ligand and protein molecule.

      (5) The supplementary table S1 lacks the crystallographic data on the CAIX-23 complex.

      We have added a new version of the supplementary materials that contains the crystallographic data on the CAIX-23 complex.

      (6) A minor peak (30213 Da) with a 638 Dalton shift compared to the unmodified enzyme is for Figure 5A, not Figure 5B, as mentioned in line 235. This sentence in line 235 should be corrected.

      We corrected this mistake.

      (7) As the authors stated in the text, a minor peak (30213 Da) represents a potential second binding site. Can they revisit their electron density maps and show any residual density if it is present around a second histidine residue? The MS data in Figure S17C indicates the presence of additional sites for compound 12. Thus, additional electron density around the secondary and tertiary sites is possible.

      CAII contains His3 and His4 that are at the N-end of the protein and not visible in the crystal structure. The NMR data indicate that the additional modification may occur at one of these His residues.

      (8) MS data were presented for compounds 12 and 22 in Figure 5A, B, but the co-crystal structures were generated with compounds 21, 20, and 23. Why was no MS data included for compounds 20, 21, and 23? Would these compounds show the presence of a secondary binding site? Can authors include the MS data?

      In the main body of the manuscript in Figure 5A we only present MS data on CAXIII with compound 12. It is only an example that confirms covalent interaction. In the supplementary materials we have MS data for compound 12 with all carbonic anhydrase isozymes and compound 20 with almost all (except CAVI) CA isozymes. There are also MS data provided with numerous compounds (3, 9, 13, and others) and CA isozymes that serve as a control or confirmation of covalent bond formation.

      (9) The coordination between the zinc ion and NH of the ligand is mentioned in the enzyme schematic in Figure 3. Can the distances and coordination with Zinc be illustrated in ligand-bound structures in Figure 4?

      We considered this and decided that a picture showing the numerous distances between ligand atoms and protein residues would be difficult to follow. The structures provided to the PDB can be analyzed for every aspect of the complex structure.

      (10) A key difference between covalent (compound 12) and its non-covalent counterpart, compound 5, is the two oxygens attached to sulfur in compound 12. Do protein side chains or water interact with these oxygens? Are these oxygen atoms exposed to solvent? Can authors show the interactions or clarify if there is no interaction?

      The two oxygens in the ligand molecule serve several purposes. First, they pull out electrons and diminish the pKa of the sulfonamide, thus making interaction stronger. Second, the oxygen atoms may make contacts, hydrogen bonds with the protein molecule and may also be important for covalent bond formation. Exact energy contributions cannot be determined from the structure directly. Thus, we decided to not yet explore and delve into this area.

      (11) Fix the font size of the text in lines 355-356.

      The font has been corrected.

    1. eLife Assessment

      This important study by Liu and colleagues uses lineage tracing of hematopoietic stem and progenitor cells in situ to infer the clonal dynamics of adult hematopoiesis. The authors apply a new mathematical analysis framework enabling a wider range of clonal estimation and the revised study 1) provides evidence of polyclonal adult hematopoiesis, 2) provides insights on clonal dynamics during fetal liver hematopoiesis, and 3) reveals unexpectedly high polyclonality in a mouse model of bone marrow failure (Fanconi anemia), arguing against the prevalent views of clonal attrition in this context. The evidence in this extensively revised and improved study is compelling, with methods, data and analyses more rigorous than the current state-of-the-art, which will be of broad interest not only to stem cell and developmental biologists working on hematopoiesis, but also to researchers working on other systems.

    2. Reviewer #1 (Public review):

      Previous studies have used a randomly induced label to estimate the number of hematopoietic precursors that contribute to hematopoiesis. In particular, the McKinney-Freeman lab established a measurable range of precursors of 50-2500 cells using random induction of one of the 4 fluorescent proteins (FPs) of a Confetti reporter in the fetal liver to show that hundreds of precursors establish lifelong hematopoiesis. In the presented work, Liu and colleagues aim to extend the measurable range of precursor numbers previously established and enable measurement in a variety of contexts beyond embryonic development. To this end, the authors investigated whether the random induction of a given Confetti FP follows the principles of binomial distribution such that the variance inversely correlates with the precursor number. The authors validated their hypothesis and identified sampling conditions to minimize experimental error using a simplified in vitro system. They use tamoxifen-inducible Scl-CreER, active in hematopoietic stem and progenitor cells (HSPCs), to induce Confetti labeling and investigate whether they could extend their model to cell numbers below 50 with in vivo transplantation of high versus low numbers of Confetti total bone marrow (BM) cells. The data generated are generally robust. While the lower and upper limits of the model may show some small error or have not yet been completely validated experimentally, it extends the measurable range of precursor from 15 - 10^5 cells. The authors then apply their model to estimate the number of hematopoietic precursors that contribute to hematopoiesis in a variety of contexts including adult steady state, fetal liver, following myeloablation, and a genetic model of Fanconi anemia.
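
      The inverse relation between variance and precursor number described above follows from the binomial distribution: if each of n precursors is labeled independently with probability p, the labeled fraction has variance p(1 - p)/n across replicates, so n can be estimated as p(1 - p) divided by the observed variance. The sketch below simulates this under stated assumptions. It is a minimal illustration of the principle, not the authors' actual estimator; the function names, the labeling probability, and the replicate count are hypothetical.

```python
import random
import statistics


def simulate_labeled_fraction(n_precursors, p_label, rng):
    """Fraction of precursors carrying the label in one simulated animal."""
    labeled = sum(1 for _ in range(n_precursors) if rng.random() < p_label)
    return labeled / n_precursors


def estimate_precursors(fractions):
    """Estimate n from replicate labeled fractions using Var = p(1 - p) / n."""
    p_hat = statistics.mean(fractions)
    var = statistics.variance(fractions)  # sample variance across replicates
    return p_hat * (1.0 - p_hat) / var


# Assumed values for illustration only.
TRUE_PRECURSORS = 200
P_LABEL = 0.3
REPLICATES = 2000

rng = random.Random(0)  # fixed seed so the sketch is reproducible
fractions = [
    simulate_labeled_fraction(TRUE_PRECURSORS, P_LABEL, rng)
    for _ in range(REPLICATES)
]
estimate = estimate_precursors(fractions)
print(round(estimate))
```

      The sketch assumes that every replicate has the same number of precursors and that each precursor contributes equally to the measured fraction; the reviews note that departures from these assumptions (variable seeding number, non-uniform expansion) affect the accuracy of the estimate.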

      Their data highlight the importance of estimating precursor numbers and not just donor frequency in transplantation settings and show that native hematopoiesis is highly polyclonal. Their data also confirm previous findings from Ganuza et al, 2022 that demonstrate no major expansion of precursors between E11.5 - E14.5. Finally, their work reveals intact Fancc-/- precursor numbers following transplantation, suggesting that the observed reduced chimerism is due to defects in cell proliferation.

      The conclusions are generally sound and based on high-quality data. As the authors note, future studies should validate the model using alternative Cre-drivers to exclude any potential functional difference between labelled and non-labelled cells. Although this system does not permit tracing of individual clones, the modeling presented allows measurements of clonal activity covering nearly the entire HSPC population (as recently estimated by Cosgrove et al, 2021) and can be applied to a wide range of in vivo contexts with relative ease.

    3. Reviewer #2 (Public review):

      The manuscript is well written, with beautiful and clear figures, and both methods and mathematical models are clear and easy to understand. Since 2017, Mikel Ganuza, Shannon McKinney-Freeman et al have been using these Confetti approaches that rely on calculating the variance across independent biological replicates as a way to infer clonal dynamics. This is a powerful tool and it is a pleasure to see it being implemented in more labs around the world. One of the cool novelties of the current manuscript is using a mathematical model (based on a binomial distribution) to avoid directly regressing the Confetti labeling variance with the number of clones (which only has linearity for a small range of clone numbers). As a result, this current manuscript of Liu et al. methodologically extends the usability of the Confetti approach, allowing them more precise and robust quantification.

      They then use this model to revisit some questions from various Ganuza et al. papers, validating most of their conclusions. The application to the clonal dynamics of hematopoiesis in a model of Fanconi anemia (Fancc mice) is very much another novel aspect, and shows the surprising result that clonal dynamics are remarkably similar to the wild-type (in spite of the defect that these Fancc HSCs have during engraftment).

      Overall, the manuscript succeeds at what it proposes to do, stretching out the possibilities of this Confetti model, which I believe will be useful for the entire community of stem cell biologists, and possibly make these assays available to other stem cell regenerating systems.

      The revised version has incorporated the reviewer suggestions, strengthening the solidity of the arguments and statements, and highlighting alternative interpretations. My comments were addressed in full.

    4. Reviewer #3 (Public review):

      The paper presents a solid method for quantifying hematopoietic precursors using statistical variance as a proxy, providing valuable insights into hematopoietic dynamics across different physiological and pathological scenarios. The findings are pivotal for understanding hematopoietic dynamics. The strength of the evidence is convincing and acknowledges limitations such as the binomial assumption and the need for tools to measure clonality.

      Liu et al. focus on a mathematical method to quantify active hematopoietic precursors in mice using Confetti reporter mice combined with Cre-lox technology. The paper explores the hematopoietic dynamics in various scenarios, including homeostasis, myeloablation with 5-fluorouracil, Fanconi anemia (FA), and post-transplant environments. The key findings and strengths of the paper include (1) precursor quantification: The study develops a method based on the binomial distribution of fluorescent protein expression to estimate precursor numbers. This method is validated across a wide dynamic range, proving more reliable than previous approaches that suffered from limited range and high variance outside this range; (2) dynamic response analysis: The paper examines how hematopoietic precursors respond to myeloablation and transplantation; (3) application in disease models: The method is applied to the FA mouse model, revealing that these mice maintain normal precursor numbers under steady-state conditions and post-transplantation, which challenges some assumptions about FA pathology. Despite the normal precursor count, a diminished repopulation capability suggests other factors at play, possibly related to cell proliferation or other cellular dysfunctions. In addition, the FA mouse model showed a reduction in active lymphoid precursors post-transplantation, contributing to decreased repopulation capacity as the mice aged. The authors are aware of the limitation of the assumption of uniform expansion. The paper assumes a uniform expansion from active precursor to progenies for quantifying precursor numbers. This assumption may not hold in all biological scenarios, especially in disease states where hematopoietic dynamics can be significantly altered. If non-uniformity is high, this could affect the accuracy of the quantification. 
Overall, the study underscores the importance of precise quantification of hematopoietic precursors in understanding both normal and pathological states in hematopoiesis, presenting a robust tool that could significantly enhance research in hematopoietic disorders and therapy development. This manuscript would be interesting to the readers of eLife.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Previous studies have used a randomly induced label to estimate the number of hematopoietic precursors that contribute to hematopoiesis. In particular, the McKinney-Freeman lab established a measurable range of precursors of 50-2500 cells using random induction of one of the 4 fluorescent proteins (FPs) of a Confetti reporter in the fetal liver to show that hundreds of precursors establish lifelong hematopoiesis. In the presented work, Liu and colleagues aim to extend the measurable range of precursor numbers previously established and enable measurement in a variety of contexts beyond embryonic development. To this end, the authors investigated whether the random induction of a given Confetti FP follows the principles of binomial distribution such that the variance inversely correlates with the precursor number. They tested their hypothesis using a simplified 2-color in vitro system, paying particular attention to minimizing sources of experimental error (elimination of outliers, sample size, events recorded, etc.) that may obscure the measurement of variance. As a result, the data generated are robust and show that the measurable range of precursors can be extended up to 10^5 cells. They use tamoxifen-inducible Scl-CreER, which is active in hematopoietic stem and progenitor cells (HSPCs) to induce Confetti labeling, and investigated whether they could extend their model to cell numbers below 50 with in vivo transplantation of high versus low numbers of Confetti total bone marrow (BM) cells. The premise of binomial distribution requires that the number of precursors remains constant within a group of mice. The rare frequency of HSPCs in the BM means that the experimentally generated "low" number recipient animals showed some small variability of seeding number, which does not follow the requirement for binomial distribution. 
While variance due to differences in precursor numbers still dominates, it is unclear how accurate estimated numbers are when precursor numbers are low (<10).

      According to our simulation, the differences between the estimated numbers and the corresponding expected numbers are more pronounced below 10, but they are still relatively small. Since Figure S4A is in log scale, it might be difficult for readers to appreciate the magnitude of the difference from the graph. We plan to add a linear-scale panel to Figure S4A for better visualization of the absolute differences (left). We also plan to provide an additional graph quantifying the differences between estimated and expected values for numbers below 15 (right). In both graphs, the maximum difference between estimated n and expected n occurs at a precursor number of 10 (estimated as 7.6). We admit that these numbers are not numerically the same, and some minor correction of the formula may be needed if a very accurate absolute number is warranted. However, we also want to emphasize that (1) most estimated n values are within 25% of the expected n; and (2) despite the minor discrepancy, the estimated n is still highly correlated with the expected n, so the comparison between different precursor numbers was not affected.

      Author response image 1.
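As an illustration of the binomial relation discussed in this exchange, the following minimal Python sketch simulates replicate groups in which each of n precursors is labeled independently with probability p, then recovers n from the variance of the labeled fraction using the textbook identity Var(f) = p(1 - p)/n. This is a sketch under stated assumptions, not the authors' exact formula or pipeline: the function names, the labeling probability of 0.3, and the replicate count of 200 are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_labeled_fractions(n_precursors, p_label, n_replicates, rng):
    """Return, for each replicate, the fraction of precursors that carry the label.

    Assumption: every precursor is labeled independently with probability
    p_label, so the labeled count per replicate is Binomial(n_precursors, p_label).
    """
    fractions = []
    for _ in range(n_replicates):
        labeled = sum(1 for _ in range(n_precursors) if rng.random() < p_label)
        fractions.append(labeled / n_precursors)
    return fractions


def estimate_precursors(fractions):
    """Estimate n from the textbook binomial relation Var(f) = p(1 - p) / n."""
    p_hat = statistics.mean(fractions)
    var_hat = statistics.variance(fractions)  # sample variance across replicates
    return p_hat * (1.0 - p_hat) / var_hat


# Hypothetical settings for illustration only (not taken from the manuscript).
rng = random.Random(0)
for true_n in (10, 100, 1000):
    fractions = simulate_labeled_fractions(true_n, 0.3, 200, rng)
    print(true_n, round(estimate_precursors(fractions), 1))
```

Because the estimate is a ratio with a sample variance in the denominator, it fluctuates from run to run; more replicates tighten it, which is consistent with the attention paid to sample size and outliers described in the review above.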

      The authors then apply their model to estimate the number of hematopoietic precursors that contribute to hematopoiesis in a variety of contexts including adult steady state, fetal liver, following myeloablation, and a genetic model of Fanconi anemia. Their modeling shows:

      - thousands of precursors (~2400-2600) contribute to adult myelopoiesis, which is in line with results from a previous study (Sun et al, 2014).

      - myeloablation (single dose 5-FU), while reducing precursor numbers of myeloid progenitors and HSPCs, was not associated with a reduction in precursor numbers of LT-HSCs.

      - no major expansion of precursor number in the fetal liver derived from labeling at E11.5 versus E14.5, consistent with recent findings from Ganuza et al, 2022.

      - normal precursor numbers in Fancc-/- mice at steady state and from competitive transplantation of young Fancc-/- BM cells, suggesting that reduced Fancc-/- cell proliferation may underlie the reduced chimerism upon transplantation.

      - reduced number of lymphoid precursors following transplantation of BM cells from 9-month-old Fancc-/- animals (beyond this age animals have decreased survival).

      Although this system does not permit the tracing of individual clones, the modeling presented allows measurements of clonal activity covering nearly the entire HSPC population (as recently estimated by Cosgrove et al, 2021) and can be applied to a wide range of in vivo contexts with relative ease. The conclusions are generally sound and based on high-quality data. Nevertheless, some results could benefit from further explanation or discussion:

      - The estimated number of LT-HSCs that contribute to myelopoiesis is not specifically provided, but from the text, it would be calculated to be 1958/5 = ~391. Data from Busch et al, 2015 suggest that the number of differentiation-active HSCs is 5.2 × 10^3, which is considered the maximum limit. There is nevertheless a more than 10-fold difference between these two estimates, and it is unclear how this discrepancy arises.

      First, we would like to clarify a sentence in the manuscript. 

      “The average myeloid precursor number at the time of BM analysis (1958) matched the average precursor number calculated from BM myeloid progenitors (MP, Lin-Sca-1-cKit+) and HSPCs (1773 and 1917), but it was five-fold higher than that of LT-HSC (Figure 3E).”

      In this sentence, we compared the number of precursors calculated from peripheral blood myeloid cells to those calculated from BM myeloid progenitors, HSPC and LT-HSC. However, we did not intend to imply that the precursor numbers calculated from HSPC and LT-HSC specifically contribute to myelopoiesis. To avoid misunderstanding, we propose to change this sentence to read:

      “The average precursor number calculated from PB myeloid cells at the time of BM analysis (1958) matched those calculated from BM myeloid progenitors (MP, Lin-Sca-1-cKit+) and HSPCs (1773 and 1917), but it was fivefold higher than that of LT-HSC (Figure 3E).”

      Nonetheless, we appreciate the reviewers’ comment on the gap between the precursor numbers of LT-HSC and the number of differentiation-active HSCs reported in Busch et al, 2015. We propose the following explanation: 

      First of all, precursor numbers reflect LT-HSC self-renewal by symmetric division and maintenance by asymmetric division but not differentiation. For comparison with the number of differentiation-active LT-HSC, precursor numbers measured from differentiated progeny (progenitors) are a better choice. As our system does not distinguish the origin of a precursor, measuring the precursor number of differentiation-active LT-HSC is difficult, since progenitors may also derive from other long-lived MPPs. However, if we assume that most divisions of LT-HSC are asymmetric, generating one LT-HSC and one progenitor, then we can approximate the number of differentiation-active HSCs with the precursor numbers of LT-HSC.

      Second, when Busch et al, 2015 calculated the number of differentiation-active HSC, they measured the cumulative activity of stem cells by following the mice up to 36 weeks post-induction. Our method measured the recent but not cumulative activity of HSC, thus the number of differentiation-active HSC in Busch et al 2015 is predicted to be higher. 

      Third, Busch et al, 2015 used Tie2MCM Cre to trace HSC. It has been shown that Tie2+ HSC have a higher reconstitution capacity (Ito et al 2016, Science), but no one has compared the in situ activity of Tie2+ and Tie2- HSC in a native environment. Since the behavior of HSCs in situ may be very different from their behavior in a transplantation setting, it is possible that Tie2+ HSC are more prone to differentiation than Tie2- HSC in a native environment, leading to an overestimation of differentiation-active HSC in the HSC pool. 

      - Similarly, in Figure 3E, the estimated number of precursors is highest in MPP4, a population typically associated with lymphoid potential and transient myeloid potential, whereas the numbers of MPP3, traditionally associated with myeloid potential, tend to be higher but are not significantly different than those found in HSCs.

      We believe this question results from a similar confusion over the nomenclature of myeloid precursors as in the previous question. As explained previously, the precursors quantified reflect a variety of possible differentiation routes, not just myelopoiesis. Thus, Figure 3E did not suggest that the lymphoid-biased MPP4 has more myeloid precursors than LT-HSC. Instead, it simply means that more precursors contribute to the MPP4 population than to the LT-HSC pool. We apologize for the confusion.

      - The requirement for estimating precursor numbers at stable levels of Confetti labeling is not well explained. As a result, it is unclear how accurate the estimates of B cell precursors upon transplantation of Fancc-/- cells are. In previous experiments on normal Confetti mice (Figure 3B), the authors do not estimate precursors of lymphopoiesis because Confetti labeling of B cells is not saturated, and this appears to be the case in Fancc-/- animals as well (Fig. 5B).

      We appreciate the request for clarification. Our approach required the labeling level to be stable in peripheral blood because we calculate the total number of precursors by normalizing precursor numbers in the Confetti+ population with the labeling level (precursor numbers in the Confetti+ population divided by labeling efficiency). If the labeling level is not saturated, then the total number of precursors will be overestimated. This requirement is more important in native hematopoiesis, since it takes a long time for the mature population, especially the lymphoid population, to be fully replaced by the progeny of the labeled HSPC population (as suggested by Busch et al 2015 and Säwen et al 2018). In transplantation, since lethal irradiation was performed, mature blood cells were rapidly generated by HSPCs, thus saturation of the labeling level is not a major concern for precursor quantification. We plan to add Author response image 2 as evidence that the Confetti labeling level was stable in mice transplanted with Fancc-/- cells.  

      Author response image 2.
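The normalization described in the response above (precursor number in the Confetti+ population divided by labeling efficiency) can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic: the function name and the numbers are hypothetical and are not taken from the manuscript.

```python
def total_precursors(n_in_confetti_pos, labeling_efficiency):
    """Scale the precursor number measured in Confetti+ cells to the whole population.

    labeling_efficiency is the fraction of cells that are Confetti+ (0 < x <= 1).
    """
    if not 0.0 < labeling_efficiency <= 1.0:
        raise ValueError("labeling_efficiency must be in (0, 1]")
    return n_in_confetti_pos / labeling_efficiency


# Hypothetical numbers for illustration only (not taken from the manuscript).
n_confetti_pos = 400
print(total_precursors(n_confetti_pos, 0.20))  # labeling read at its stable level
print(total_precursors(n_confetti_pos, 0.10))  # labeling read before it stabilized
```

With the same Confetti+ precursor number, a labeling level read before it has reached its plateau gives a smaller denominator and therefore a larger total, which is the overestimation the authors describe.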

      - Do 9-month-old Fancc-/- animals have reduced lymphoid precursors as well?

      Because of the non-saturated labeling in peripheral blood B cells and extra-HSPC induction of Confetti in T cells, we cannot accurately measure lymphoid precursor numbers in 9-month-old Fancc-/- animals. As an alternative, we note that the precursor numbers of the lymphoid-biased MPP4 population were comparable between Fancc+/+ and Fancc-/- animals (Figure 5D). We plan to add a supplementary figure showing the frequency of common lymphoid progenitors (CLP, defined as Lin-IL-7Ra+Sca-1midcKitmid) in these two genotypes.

      Author response image 3.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript by Liu et al. uses Confetti labeling of hematopoietic stem and progenitor cells in situ to infer the clonal dynamics of adult hematopoiesis. The authors apply a new mathematical framework to analyze the data, allowing them to increase the range of applicability of this tool up to tens of thousands of precursors. With this tool, they (1) provide evidence for the large polyclonality of adult hematopoiesis, (2) offer insights on the expansion dynamics in the fetal liver stage, (3) assess the clonal dynamics in a Fanconi anemia model (Fancc), which has engraftment defects during transplantation.

      Strengths:

      The manuscript is well written, with beautiful and clear figures, and both methods and mathematical models are clear and easy to understand.

      Since 2017, Mikel Ganuza and Shannon McKinney-Freeman have been using these Confetti approaches that rely on calculating the variance across independent biological replicates as a way to infer clonal dynamics. This is a powerful tool and it is a pleasure to see it being implemented in more labs around the world. One of the cool novelties of the current manuscript is using a mathematical model (based on a binomial distribution) to avoid directly regressing the Confetti labeling variance with the number of clones (which only has linearity for a small range of clone numbers). As a result, this current manuscript of Liu et al. methodologically extends the usability of the Confetti approach, allowing them more precise and robust quantification.

      They then use this model to revisit some questions from various Ganuza et al. papers, validating most of their conclusions. The application to the clonal dynamics of hematopoiesis in a model of Fanconi anemia (Fancc mice) is very much another novel aspect, and shows the surprising result that clonal dynamics are remarkably similar to the wild-type (in spite of the defect that these Fancc HSCs have during engraftment).

      Overall, the manuscript succeeds at what it proposes to do, stretching out the possibilities of this Confetti model, which I believe will be useful for the entire community of stem cell biologists, and possibly make these assays available to other stem cell regenerating systems.

      Weaknesses:

      My main concern with this work is the choice of CreER driver line, which then relates to some of the conclusions made. Scl-CreER succeeds at being as homogenous as possible in labeling HSC/MPPs... however it is clear that it also labels a subcompartment of HSC clones that become dominant with time... This is seen as the percentage of Confetti-recombined cells never ceases to increase during the 9-month chase of labeled cells, suggesting that non-labeled cells are being replaced by labeled cells. The reason why this is important is that then one cannot really make conclusions about the clonal dynamics of the unlabeled cells (e.g. for estimating the total number of clones, etc.).

      We appreciate the reviewer’s comments. We agree that this is especially a concern for measuring B cell precursors in native hematopoiesis. For myeloid cells, the increase was much less pronounced (0.5% per month) after month four post-induction. One way to investigate the dynamics of unlabeled cells is to induce different groups of mice with different doses of tamoxifen so that labeling efficiency varies among groups. With 14 days of tamoxifen treatment, a maximum of 60% of HSPC can be labeled (RFP+CFP+YFP). If the unlabeled cells behave similarly to labeled cells, then varying the labeling efficiency shouldn’t affect the total number of precursors calculated (excluding the potential effect of longer tamoxifen treatment on HSC). While we haven’t extensively performed such lengthy experiments, we have performed one measurement (5 mice) with 14 days of tamoxifen treatment and showed that peripheral blood myeloid precursor numbers calculated from this experiment were comparable to the ones from Figure 3 (2-day tamoxifen).

      Author response image 4.

      It's possible that those HSPC that are never labeled with Confetti even during longer tamoxifen treatment could behave differently. In this case, a different Cre driver may provide insight into the total precursor numbers.

      I am not sure about the claims that the data shows little precursor expansion from E11 to E14. First, these experiments are done with fewer than 5 replicates, and thus they have much higher error, which is particularly concerning for distinguishing differences of such a small number of clones. Second, the authors do see a ~0.5-1 log difference between E11 and E14 (when looking at months 2-3). When looking at months 5+, there is already a clear decline in the total number of clones in both adult-labeled and embryonic-labeled, so these time points are not as good for estimating the embryonic expansion. In any case, the number of precursors at E11 (which in the end defines the degree of expansion) is always overestimated (and thus, the expansion underestimated) due to the effects of lingering tamoxifen after injection (which continues to cause Confetti allele recombination as stem cell divide). Thus, I think these results are still compatible with expansion in the fetal liver (the degree of which still remains uncertain to me).

      We agree that adding replicates would reduce error and boost confidence in our conclusions. The dilemma in comparing fetal- and adult-labeled cohorts is that HSPC activities cannot be synchronized among different developmental stages. At the fetal to neonatal stage, HSPC proliferate faster to generate new blood cells and support developmental need, while at the adult stage HSPC proliferate much more slowly. Thus, it takes a long time for the mature myeloid cells in the adult-labeled cohort to reach stable Confetti labeling and provide an accurate quantification of precursors. While we agree that it might be better to compare precursor numbers in earlier months, we preferred to compare precursor numbers at later time points for the aforementioned reasons. The other option is to compare the number of HSPC precursors in the BM at earlier time points, as no equilibration of labeling level is required in HSPC, but this requires earlier sacrifice, compromising long-term assessment.

      We did not revisit questions about the lingering effect of tamoxifen, as this has been studied by Ganuza et al 2017. They showed that tamoxifen was not able to induce additional Confetti recombination if given one day ahead, suggesting the effective window for tamoxifen is less than 24h.

      Based on our data, the expansion of lifelong precursors ranges anywhere from 1.4 to 7.0 (Figure 4G). It’s possible that we might observe a higher level of expansion if the comparison were done at earlier time points. Nonetheless, the assertion that the expansion of lifelong HSPC is not as profound as evidenced by transplantation emphasizes the value of HSPC activity analysis in situ.

      Reviewer #3 (Public Review):

      Summary:  

      Liu et al. focus on a mathematical method to quantify active hematopoietic precursors in mice using Confetti reporter mice combined with Cre-lox technology. The paper explores the hematopoietic dynamics in various scenarios, including homeostasis, myeloablation with 5-fluorouracil, Fanconi anemia (FA), and post-transplant environments. The key findings and strengths of the paper include (1) precursor quantification: The study develops a method based on the binomial distribution of fluorescent protein expression to estimate precursor numbers. This method is validated across a wide dynamic range, proving more reliable than previous approaches that suffered from limited range and high variance outside this range; (2) dynamic response analysis: The paper examines how hematopoietic precursors respond to myeloablation and transplantation; (3) application in disease models: The method is applied to the FA mouse model, revealing that these mice maintain normal precursor numbers under steady-state conditions and post-transplantation, which challenges some assumptions about FA pathology. Despite the normal precursor count, a diminished repopulation capability suggests other factors at play, possibly related to cell proliferation or other cellular dysfunctions. In addition, the FA mouse model showed a reduction in active lymphoid precursors post-transplantation, contributing to decreased repopulation capacity as the mice aged. The authors are aware of the limitation of the assumption of uniform expansion. The paper assumes a uniform expansion from active precursor to progenies for quantifying precursor numbers. This assumption may not hold in all biological scenarios, especially in disease states where hematopoietic dynamics can be significantly altered. If non-uniformity is high, this could affect the accuracy of the quantification. 
Overall, the study underscores the importance of precise quantification of hematopoietic precursors in understanding both normal and pathological states in hematopoiesis, presenting a robust tool that could significantly enhance research in hematopoietic disorders and therapy development. The following concerns should be addressed.

      Major Points:

      • The authors have shown a wide range of seeded cells (1 to 1e5) (Figure 1D) that follow the linear binomial rule. As the standard deviation converges eventually with more seeded cells, the authors need to address this limitation by seeding the number of cells at which the assumption fails.

      While a number range above 10^5 is not required for our measurement of hematopoietic precursors in mice, we agree that it will be valuable to understand the upper limit of experimental measurement. We plan to seed 10^6-10^7 cells per replicate to address the reviewer’s comments. 

      • Line 276: This suggests myelopoiesis is preferred when very few precursors are available after irradiation-mediated injury. Did the authors see more myeloid progenitors at 1 month post-transplantation with low precursor number? The authors need to show this data in a supplement.

      While we appreciate the concern, we did not generate this dataset because it would require sacrificing a substantial number of animals at one month post-transplantation. 

      Minor Points:

      • Please cite a reference for line 40: a rare case where a single HSPC clone supports hematopoiesis.

      • Line 262-263: "This discrepancy may reflect uneven seeding of precursors to the BM throughout the body after transplantation and the fact that we only sampled a part of the BM (femur, tibia, and pelvis)." Consider citing this paper (https://doi.org/10.1016/j.cell.2023.09.019) that explores the HSPCs migration across different bones.

      • Lines 299 and 304. Misspellings of RFP.

      We appreciate the reviewer’s suggestions and will modify the manuscript as suggested. 

      • The title is misleading as the paper's main focus is the precursor number estimator using the binomial nature of fluorescent tagging. Using a single-copy cassette of Confetti mice cannot be used to measure clonality.

      We appreciate the reviewer’s suggestion and plan to modify the title of the manuscript to read: “Dynamic Tracking of Native Precursors in Adult Mice”.

    1. eLife Assessment

      This useful study holds importance within the focused scope of endometriosis treatment, providing initial evidence of a potential new therapeutic target. The strength of the evidence is solid, as the methods, data, and analyses support the authors' conclusions regarding the specific aims. The study provides promising preliminary evidence of KMO implication in endometriosis, but it falls short of establishing a strong rationale for proposing KNS898 as a treatment for endometriosis given the limitations in evidence and mechanistic insights.

    2. Reviewer #1 (Public review):

      Summary:

      This study serves as a proof of concept for KMO inhibition as a new non-hormonal treatment for endometriosis. The authors investigated KMO expression in human endometrial and endometriosis lesion tissues, confirmed that KNS898 effectively inhibits KMO and alleviates manifestations of endometriosis in mice - reduced endometriosis lesions and improved hyperalgesia and cage behaviour.

      Strengths:

      (1) Inhibition of KMO may present as a promising first-in-class non-hormonal therapeutic agent for patients suffering from endometriosis and the side-effects of hormonal treatments.

      (2) The expression of KMO in endometrial tissues was demonstrated in both human (multiple patients per AFS stage of disease) and mice tissues.

      (3) Measurement of multiple substrates/analytes of the KMO regulatory pathway was performed and demonstrated strong correlation to each other in response to KMO inhibition.

      (4) The aims of study (as proof-of-concept) were achieved in the study and the results support their conclusions.

      Weaknesses:

      If any dysregulation in the KMO/tryptophan metabolic activity, expression and/or pathway in endometriosis can be shown, this will strengthen the rationale for the use of KMO inhibitor in the disease.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aim to address the clinical challenge of treating endometriosis, a debilitating condition with limited and often ineffective treatment options. They propose that inhibiting KMO could be a novel non-hormonal therapeutic approach. Their study focuses on:

      • Obtaining proof-of-concept for KMO inhibition as a novel therapy for endometriosis.

      • Characterising KMO expression in human and mouse endometriosis tissues.

      • Demonstrating the efficacy of KMO inhibition in improving histological and symptomatic features of endometriosis.

      Strengths:

      • Novelty and Relevance: The study addresses a significant clinical need for better endometriosis treatments and explores a novel therapeutic target.

      Weaknesses:

      • Limited Mechanistic Insight: The study lacks a comprehensive investigation of the mechanistic pathways through which KNS898 affects endometriosis. The dysregulation of KMO activity and the kynurenine pathway in endometriosis remains poorly characterized, both in the human condition and the experimental model. While the authors present preliminary evidence that kynurenine metabolites (KYN, 3HK, and KYNA) are not dysregulated in the experimental model of endometriosis, they show that KMO inhibition modulates these metabolite levels and leads to some improvement in disease features. However, these findings do not significantly close the existing knowledge gap or provide a strong rationale for targeting KMO as a therapeutic approach for endometriosis. Further mechanistic insights are necessary to justify the potential of KMO inhibition in this context.

      Achievement of Aims:

      • The authors demonstrated that KMO is expressed in endometriosis lesions and that KNS898 can induce KMO inhibition, leading to biochemical changes and improvements in a few endometriosis features in a mouse model. Therefore, the authors addressed the proposed specific aims. However, they fail to provide a clear rationale for proposing KMO inhibition as a novel therapy for endometriosis.

      Support of Conclusions:

      • The conclusions are somewhat overextended given the limitations in mechanistic insights to explain how KMO inhibition results in improvement of histological and symptomatic features of experimental endometriosis. The study provides promising initial evidence but requires further exploration to firmly establish the efficacy of KNS898 for endometriosis treatment.

      Impact on the Field:

      • The study introduces a novel therapeutic target to be explored for endometriosis, potentially leading to non-hormonal treatment options.

      Utility of Methods and Data:

      • The methods used provide a foundation for further research, although they require refinement. The data, while promising, need more rigorous investigation and deeper mechanistic exploration to be fully convincing and useful to the community.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This study explores the therapeutic potential of KMO inhibition in endometriosis, a condition with limited treatment options. 

      Strengths: 

      KNS898 is a novel specific KMO inhibitor and is orally bioavailable, providing a convenient and non-hormonal treatment option for endometriosis. The promising efficacy of KNS898 was demonstrated in a relevant preclinical mouse model of endometriosis with pathological and behavioural assessments performed. 

      Weaknesses: 

      (1) The expression of KMO in human normal endometrium and endometrial lesions was not quantified. Western blot or quantification of IHC images will provide valuable insight.

      Given the differential expression of KMO in luminal epithelial cells lining the endometrial glands compared to the other parts of the endometrium, a general endometrial Western blot prep is not going to be additionally helpful or accurate in addressing this question without, e.g., laser capture microdissection or single-cell quantitative proteomics. Furthermore, KMO is a flavin-dependent monooxygenase, and its activity, especially in generating the oxidative stressor product 3-hydroxykynurenine, is far more dependent on kynurenine substrate availability than on actual enzyme abundance, although it is important to show (as we have done) that KMO is present in the human endometrial glands and in human distended endometrial gland-like structures (DEGLS).

      If KMO is not overexpressed in diseased tissues, i.e. it may have homeostatic roles, inhibition of KMO may have consequences on general human health and wellbeing.

      KMO certainly does have important homeostatic roles, for example as a key step in the repletion of NAD+ through de novo synthesis, although with good nutrition and sufficient NAD+ precursors in the diet (e.g. niacin), that specific role may be partially redundant. KMO knockout mice exhibit normal fertility and fecundity and do not show a survival deficit compared to littermate wildtype controls (e.g. Mole et al Nature Medicine 2016). To further develop KNS898 towards clinical use, preclinical GLP safety and toxicology studies and human Phase 1 clinical trials will of course need to be completed, but that is standard for the development of any new drug.

      In addition, KMO expression in control mice was not shown or quantified.

      Control mice that were not inoculated intraperitoneally with endometrial fragments did not develop DEGLS and therefore there is nothing to show or quantify.

      Images of KMO expression in endometriosis mice with treatments should be shown in Figure 4.

      We have now included a representative KMO immunohistochemistry image from each endometriosis group and included all KMO immunohistochemistry images in Supplementary Information.

      The images showing quantification analysis (Figure 4A-F) can be moved to supplementary material.

      This recommendation contradicts the emphasis placed by the same reviewer earlier regarding quantification, so we have elected to keep it where it is.

      (2) Figure 1 only showed representative images from a few patients. A description of whether KMO expression varies between patients and whether it correlates with AFS stages/disease severity will be helpful. Images from additional patients can be provided in supplementary material. 

      We have added extra information to the Figure legend to clarify the disease stage of the superficial peritoneal lesions which were illustrated (Stage I/II) and to link them to the information in supplementary Table S1. In total we examined 11 peritoneal lesions and 5 ovarian lesions (stage III/IV) – in every sample examined immunopositive staining was most intense in epithelial cells lining gland-like structures. Sections illustrated were chosen to illustrate this key finding.

      (3) For Home Cage Analysis, different measurements were performed as stated in methods including total moving distance, total moving time, moving speed, isolation/separation distance, isolated time, peripheral time, peripheral distance, in centre zones time, in centre zones distance, climbing time, and body temperature. However, only the finding for peripheral distance was reported in the manuscript. 

      This was indeed a large amount of output, which we rationalised for the benefit of a concise paper. The paper now includes a description of which parameters showed a difference with drug treatment.

      (4) The rationale for choosing the different dose levels of KNS898 - 0.01-25mg/kg was not provided. What is the IC50 of a drug? 

      KNS898 dosing has been extensively characterised by us in multiple species, and the pIC50 has already been published (e.g. Hayes et al Cell Reports 2023 and elsewhere). We now include the pIC50 in the present manuscript to save the reader from having to search through another reference.
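
      As a reading aid for the pIC50 mentioned above, the relationship between pIC50 and IC50 can be sketched as follows. This is a minimal illustration, not taken from the manuscript: the function name and the example value are hypothetical, and the published pIC50 of KNS898 is not reproduced here.

```python
import math


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 to an IC50 in nanomolar.

    By definition pIC50 = -log10(IC50 in mol/L), so IC50 = 10 ** (-pIC50) mol/L.
    """
    ic50_molar = 10 ** (-pic50)
    return ic50_molar * 1e9  # mol/L -> nmol/L


# Hypothetical example value, chosen only to illustrate the conversion.
example_pic50 = 9.0
print(f"pIC50 {example_pic50} corresponds to an IC50 of about "
      f"{pic50_to_ic50_nm(example_pic50):.3g} nM")
```

      A higher pIC50 therefore means a more potent inhibitor: each unit increase corresponds to a tenfold lower IC50.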

      (5) Statistical significance: 

      (a) Were stats performed for Fig 3B-E?

      Now included, thank you.

      (b) Line 141 - 'P = 0.004 for DEGLS per group' 

      However, statistics were not shown in the figure. 

      Thanks, now displayed on figure.

      (c) Line 166 - 'the mechanical allodynia threshold in the hind paw was statistically significantly lower compared to baseline for the group' 

      However, statistics were not shown in the figure. 

      (d) Line 170 - 'Two-way ANOVA, Group effect P = 0.003, time effect P < 0.0001' The stats need to be annotated appropriately in Figure 5A as two separate symbols. 

      Arguably the far more important comparison in this figure is whether there is any effect of treatment, and to mark multiple statistical comparisons on the figure would make it difficult to understand. Instead, the figure legend and results text have been clarified on this point.

      (e) Figure 5B - multiple comparisons of two-way ANOVA are needed. G4 does not look different to G3 at D42. 

      Multiple comparison testing (Dunnett’s T3) was done and the results have been clarified in the text and figure legends.

      (f) Line 565 - 'non-significant improvement in KNS898 treated groups'. However, ** was annotated in Figure 5A. 

      Thank you. This is an error that has been checked and corrected.

      (6) Discussion is very light. No reference to previous publications was made in the discussion. Discussion on potential mechanistic pathways of KYR/KMO in the pathogenesis of endometriosis will be helpful, as the expression and function of KMO and/or other metabolites in endometrial-related conditions. 

      The discussion is deliberately concise and focussed. The paper has 21 references to previous publications. A speculative discussion is generally not favoured by us.

      The findings in this study generally support the conclusion although some key data which strengthen the conclusion eg quantification of KMO in normal and diseased tissue is lacking.

      We differ from the reviewer here and do not think that those data would materially affect the likelihood of KMO inhibition being efficacious in human endometriosis in Phase 2/3 clinical trials.

      Before KMO inhibitors can be used for endometriosis, the function of KMO in the context of endometriosis should be explored eg KMO knockout mice should be studied. 

      We take the view that before KMO inhibitors can be used for endometriosis in patients there are multiple other regulatory and clinical development steps that are required that would be a priority. While using a KMO knockout mouse might be an interesting scientific experiment, it would not impact on the critical path in a material way.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors aim to address the clinical challenge of treating endometriosis, a debilitating condition with limited and often ineffective treatment options. They propose that inhibiting KMO could be a novel non-hormonal therapeutic approach. Their study focuses on: 

      • Characterising KMO expression in human and mouse endometriosis tissues. 

      • Investigating the effects of KMO inhibitor KNS898 on inflammation, lesion volume, and pain in a mouse model of endometriosis. 

      • Demonstrating the efficacy of KMO blockade in improving histological and symptomatic features of endometriosis. 

      Strengths: 

      • Novelty and Relevance: The study addresses a significant clinical need for better endometriosis treatments and explores a novel therapeutic target. 

      • Comprehensive Approach: The authors use both human biobanked tissues and a mouse model to study KMO expression and the effects of its inhibition. 

      • Clear Biochemical Outcomes: The administration of KNS898 reliably induced KMO blockade, leading to measurable biochemical changes (increased kynurenine, increased kynurenic acid, reduced 3-hydroxykynurenine). 

      Weaknesses: 

      • Limited Mechanistic Insight: The study does not thoroughly investigate the mechanistic pathways through which KNS898 affects endometriosis. Specifically, the local vs. systemic effects of KMO inhibition are not well differentiated. 

      While we agree that this is not a comprehensive mechanistic analysis, given that the ultimate therapy would be almost certainly a once daily oral dosing i.e. systemic administration, we do not consider differentiating local vs systemic effects of KMO inhibition to be critical to therapeutic development in this scenario.

      • Statistical Analysis Issues: The choice of statistical tests (e.g., two-way ANOVA instead of repeated measures ANOVA for behavioral data) may not be the most appropriate, potentially impacting the validity of the results. 

      The selection of two-way ANOVA (time and group) is sufficient and correct for this experimental analysis and its use does not invalidate the results. We agree that repeated measures ANOVA could be a valid alternative.

      • Quantification and Comparisons: There is insufficient quantitative comparison of KMO expression levels between normal endometrium and endometriosis lesions,

      Please see response above to quantification question raised by Reviewer 1.

      and the systemic effects of KNS898 are not fully explored or quantified in various tissues. 

      Please see earlier responses. KNS898 has been thoroughly explored in multiple tissues, species and experimental models, but those data do not need to be rehearsed here.

      • Potential Side Effects: The systemic accumulation of kynurenine pathway metabolites raises concerns about potential side effects, which are not addressed in the study. 

      As discussed above (response to Reviewer 1), KMO knockout mice exhibit normal fertility and fecundity and do not show a survival deficit compared to littermate wildtype controls (e.g. Mole et al Nature Medicine 2016). To further develop KNS898 towards clinical use, preclinical GLP safety and toxicology studies and human Phase 1 clinical trials will naturally need to be completed, but this is standard for the development of any new drug.

      Achievement of Aims: 

      • The authors successfully demonstrated that KMO is expressed in endometriosis lesions and that KNS898 can induce KMO blockade, leading to biochemical changes and improvements in endometriosis symptoms in a mouse model. 

      Support of Conclusions: 

      • While the data supports the potential of KMO inhibition as a therapeutic strategy, the conclusions are somewhat overextended given the limitations in mechanistic insights and statistical analysis. The study provides promising initial evidence but requires further exploration to firmly establish the efficacy and safety of KNS898 for endometriosis treatment. 

      We do not agree that the conclusions are overextended based on the data presented, as expanded in the reply to the eLife editorial assessment at the beginning of this response. It is clear that additional preclinical, regulatory and clinical development work, and human clinical trials will be required to firmly establish the efficacy and safety of KNS898 for endometriosis treatment.

      Impact on the Field: 

      • The study introduces a novel therapeutic target for endometriosis, potentially leading to non-hormonal treatment options. If validated, KMO inhibition could significantly impact the management of endometriosis. 

      Utility of Methods and Data: 

      • The methods used provide a foundation for further research, although they require refinement. The data, while promising, need more rigorous statistical analysis and deeper mechanistic exploration to be fully convincing and useful to the community. 

      We believe that the data are a) convincing, and b) useful to the community. To be advanced effectively towards patients, KNS898 needs to follow the critical development path outlined above.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) Change 'hyperalgia' to hyperalgesia throughout the manuscript including the title. 

      Done

      (2) Line 69 - write '3-HK' in full. 

      Done

      (3) Line 85 - the findings of the study include 'define the preclinical efficacy of KNS898 in reducing inflammation'. The inflammatory profile was not studied. 

      Changed to “disease”

      (4) Line 259 - write 'EPHect' in full. 

      Done

      (5) Line 260 - write 'AFS' in full. Also, abbreviate 'AFS' in the caption of Table S1. 

      Done

      (6) 20 patients were listed in Table S1 but only 19 were accounted for in the methods section. 

      Apologies, there was an error in the methods section, where one of the endometrial samples had not been included; this has now been corrected. Table S1 has also been changed to make it clear which samples were eutopic endometrium to differentiate them from the lesions.

      (7) The location from which the endometrial lesion tissues were obtained should be provided in Table S1. 

      Table S1 has been changed to make it clear that the subtypes of lesions examined were classified as Stage I/II – superficial peritoneal subtype and Stage III/IV – endometrioma. The methods section has also been updated to reflect these subtypes (lines 272-277).

      (8) Table S2 - G5 should be given compound 'A' not 'B'. 

      Thank you. Corrected.

      (9) Figure 2E was not referenced in the text and no figure legend was provided. 

      Now referenced and the figure legend updated.

      (10) Figure 3A - font needs to be enlarged. HCA baseline recording was annotated as performed twice in the protocol. When is the baseline taken and on what day was the Week 12 measurement taken (refer to Figures 5C and D)? 

      Font has been enlarged as requested. The second HCA baseline annotation in Fig 3A is a cut-and-paste error, now rectified and the time of second measurement annotated.

      (11) Line 133 - 'In KNS898-treated group G4 (endometriosis + treatment from Day 19), DEGLS formed in 4 of 15 mice (26.7%) and in G5 (Endo + treatment start on Day 26) in 6 of 15 mice (40%) (Fig. 3f).'. The aforementioned data is not reflected in Figure 3F. 

      Thank you. This has been rectified.

      (12) Line 137 - 'Mice with endometriosis receiving KNS898 from the time of inoculation (G4) had an average of 2.0 DEGLS per animal with DEGLS (total = 8 DEGLS in 4 mice in G4) and those receiving KNS898 1 week after inoculation (G5) had an average of 1.8 DEGLS per animal (total = 11 DEGLS in 6 mice in G5) (Figs. 3g and 3h).' 

      The aforementioned data is not reflected in Figure 3G. There is no Figure 3H shown. 

      Rectified as above.
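
      The counts quoted in points (11) and (12) can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted from the manuscript text (mice with DEGLS, group size, total DEGLS); the variable and function names are illustrative assumptions.

```python
# Counts quoted in points (11) and (12): group -> (mice with DEGLS, mice in group, total DEGLS)
groups = {
    "G4": (4, 15, 8),
    "G5": (6, 15, 11),
}


def summarise(mice_with_degls: int, group_size: int, total_degls: int) -> tuple[float, float]:
    """Return (percentage of mice with DEGLS, mean DEGLS per affected mouse)."""
    incidence_pct = 100 * mice_with_degls / group_size
    mean_per_affected = total_degls / mice_with_degls
    return round(incidence_pct, 1), round(mean_per_affected, 1)


for name, counts in groups.items():
    pct, mean = summarise(*counts)
    print(f"{name}: {pct}% of mice with DEGLS, {mean} DEGLS per affected mouse")
```

      The rounded values agree with the 26.7% and 40% incidence and the 2.0 and 1.8 DEGLS per animal stated in the quoted text.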

      (13) Provide a discussion of why KA levels were significantly lower in Figure 3E compared to Figure 2C. 

      (14) Figure legend for Figure 3 - G1 and G2 were noted as n=8. However, Figure S1 and Table S2 noted both groups as n=10. 

      Thank you. This is a typographical error. The legend for Fig 3 should indeed read n=10 for G1 and G2 and has been corrected.

      (15) Line 181 - 'compared to non-operated and sham-operated control groups'. Only the sham group was shown in Figures 5C and D. 

      This text has been clarified to refer only to the data shown.

      (16) Figure 1 images need scalebars. Same for Figure 4. 

      Now added

      (17) Figure 3B - y-axis is fold change? 

      Relative concentration. Legend has been clarified.

      (18) Figures 5A and B - are the last Von Frey measurements taken on Day 40 (as per Figure 3A) or 42?

      Taken on Day 42. Fig 3A (the prospective protocol figure) has been clarified to reflect what actually happened (D42) as opposed to what was planned (D40) to pre-empt any further confusion.

      (19) Symbols in Figure S1 need to be explained in the Figure legend. 

      Done

      (20) Figures 2A and 2D should not be plotted in log scale to match the description of results in Line 106 and Line 118. 

      These particular results are plotted on a log scale to allow the reader to visualise that detectable levels of drug are measurable at very low doses and that there is no significant pharmacodynamic effect at that low dose. We choose to retain the present format.

      Reviewer #2 (Recommendations For The Authors): 

      Comments and queries 

      Introduction/aims section: 

      Line 82 - 87: Clarify in the proposal aims what is being accessed and analysed in humans and/or in animal models (mice). Specifically state clearly the correlations with KMO expression. Were the correlations between KMO expression with features of inflammation performed only in mice or also in humans? 

      Thank you for this comment. The aims have been clarified in the Introduction.

      Section - KMO is expressed in human eutopic endometrium and human endometriosis tissue lesions: 

      Was any quantitative or semi-quantitative method used to quantify the KMO expression in human tissues? Although the authors claimed that "KMO was strongly immunopositive in human peritoneal endometriosis lesions" by the representative figures it is not clear if KMO expression is similar, higher or lower between normal endometrium and peritoneal endometriosis lesions. 

      We have added extra information to the legend of Figure 1 to identify the PIN number of the superficial lesions illustrated. The key finding from the immunostaining with the antibody which had been previously validated as specific for KMO was that the most intense immunopositive response was in glandular epithelial cells and the samples illustrate this result.

      Section - Oral KNS898 inhibits KMO in mice: 

      The authors clearly confirmed the target engagement of KNS898 in inhibiting KMO activity and, therefore, affecting upstream and downstream metabolites systemically in (peripheral fluid/ plasma) mice. Whether KNS898 effect is broad and targets systemic immune cells and whole body cells and tissue was not explored. It was also not explored if KNS898 is able to specifically inhibit KMO locally at the endometrium tissue by targeting epithelial and/or infiltrated immune cells, for example. 

      That is correct.

      It would be interesting to measure (or if it was measured to report in this section and also in Figure 2) the levels of KYN, KA and 3HK in naïve animals that did not receive KNS898. It would help to understand the net effect of KNS898 on the levels of kynurenine pathway metabolites and, therefore, justify the dose chosen.

      These data are already presented in Fig 3B-E, control group.

      Perhaps then the chosen dose could be lower considering the possible substantial changes in kynurenine pathway metabolites levels, which are reported to exert an effect in many cells, tissues and systems and could, therefore, precipitate side effects. Even more considering that the values for these metabolites are expressed as ng/ml, which hinders the comparison of the metabolite levels with the one reported for naïve animals in the literature. I would also suggest expressing the metabolite levels as nM/L. 

      This is not a relevant method of determining dose-limiting toxicity or safety pharmacology/toxicology, either non-GLP or GLP. There are international guidelines on the proper conduct of those studies. This is also why it is important not to make claims about the safety or otherwise of an experimental compound in an in vivo setting that has not explicitly complied with those regulatory standards. With regard to the units recommendation, accepted units are ng/mL or nM, not usually nM/L.
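
      For readers who wish to compare the reported ng/mL values with molar concentrations in the literature, the conversion can be sketched as below. The molar masses are approximate standard values for the free compounds and are an assumption of this sketch, not figures from the manuscript; they should be verified before use, and the function name is hypothetical.

```python
# Approximate molar masses in g/mol (assumed standard values, to be verified).
MOLAR_MASS_G_PER_MOL = {
    "KYN": 208.21,  # kynurenine
    "KA": 189.17,   # kynurenic acid
    "3HK": 224.21,  # 3-hydroxykynurenine
}


def ng_per_ml_to_nm(conc_ng_per_ml: float, metabolite: str) -> float:
    """Convert a mass concentration in ng/mL to a molar concentration in nM.

    1 ng/mL equals 1e-6 g/L; dividing by the molar mass gives mol/L,
    and multiplying by 1e9 gives nmol/L, hence the overall factor of 1000.
    """
    return conc_ng_per_ml * 1000.0 / MOLAR_MASS_G_PER_MOL[metabolite]


# Illustrative input only, not a measured value from the study.
print(f"{ng_per_ml_to_nm(100.0, 'KYN'):.1f} nM")
```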

      Section - KMO blockade reduces endometrial gland-like lesion burden in experimental endometriosis in mice: 

      Line 130: It would be better to replace "blockade of 3HK production" with "reduction of 3HK production" to better reflect the results. 

      Changed to “inhibition of 3HK production”.

      Line 140: In G5 (treatment starting at Day 26/ 1 week after inoculation), is the experimental model of endometriosis already established with all pathological and phenotypic features? 

      This was not specifically tested in this experiment.

      Lines 146 - 148: It would be better to specify that "Overall, there was no significant difference IN BODY WEIGHT between G3 and the KNS898 treatment groups G4 and G5 (endometriosis + treatment from Day 26)". Otherwise, this last sentence might be interpreted as the overall conclusion of this result sub-section. 

      Thank you, this is a good point and the text has been corrected.

      The authors demonstrated with an experimental approach that KMO blockade reduces a pathological measure of endometriosis, i.e. endometrial gland-like lesion burden, in experimental endometriosis in mice when administered both concomitantly with and after disease development. However, mechanistic insights into how reduced KMO activity can reduce the developed distended endometrial gland-like structures were not explored. Therefore, it remains to be investigated which kynurenine pathway metabolites are directly linked to the beneficial effects of KMO blockade in the experimental model of endometriosis, and how.

      We agree.

      Although the beneficial effects on the pathological measures are evident, Figure 3 shows an exorbitant accumulation of KYN and KA and also a substantial reduction in 3HK after the treatment with KNS898, which then raises concerns about tolerability and side effects. Would this effective KNS898 dose be viable and translational as a therapeutic approach? 

      Please refer to comments above at multiple junctures about safety pharmacology and the clinical development critical path.

      Section - KMO is expressed in experimental endometriosis in mice: 

      By histological examination, the authors confirm that the treatment with KNS898 specifically reduced the KMO expression intensity in the DEGLS from mice. Therefore, the effect exerted by KNS898 locally on the KMO expression at the DEGLS could be at least partially responsible for the beneficial effects observed in Figure 3, i.e. the reduction of pathological measures. However, it remains to be explored whether the effect of KNS898 in other cells or tissues could also account for the beneficial effects exerted by KNS898 on the animal model of endometriosis. 

      This is correct.

      From a logical experimental point of view, I would suggest switching the order of the result subsection "KMO blockade reduces endometrial gland-like lesion burden in experimental endometriosis in mice" and "KMO is expressed in experimental endometriosis in mice" as well as the respective Figures 3 and 4. 

      We do not agree. Fig 3 (and section) is the macroscopic enumeration of DEGLS, Fig 4 (and section) is the microscopic and immunohistochemical evaluation of the lesions introduced in Fig 3. The sequence as originally presented is the more logical.

      Sections - KMO inhibition reduces mechanical allodynia in experimental endometriosis - and - KMO inhibition reduces mechanical allodynia in experimental endometriosis: 

      The authors suggested that KMO inhibition with KNS898 exerts beneficial effects on behavioural paradigms related to the experimental model of endometriosis. Based on the statistical analysis performed by the authors, KMO inhibition with KNS898 reduces mechanical allodynia and rescues impaired cage exploration behaviour and mobility in mice with endometriosis. However, I believe that the most appropriate statistical tests for Von Frey (allodynia behaviour) and Home cage (illness behaviour) analyses over time would be repeated measures ANOVA and paired t-test, respectively (and not two-way ANOVA as performed). Therefore, for a more trustworthy analysis and interpretation of this data set, I would suggest the authors modify the statistical analysis and report the corresponding interpretation of these tests. 

      The selection of two-way ANOVA (time and group) is suitable for this experimental analysis and its use does not invalidate the results. We agree that repeated measures ANOVA could be a valid alternative.

      Overall, the authors present a solid and useful case for KMO inhibition as a potential therapeutic strategy for endometriosis. However, the study would benefit from more detailed mechanistic insights, appropriate statistical analyses, and an evaluation of potential side effects. With these improvements, the research could have a significant impact on the field and pave the way for new treatment modalities for endometriosis. 

      We thank the reviewer for the positive comments and we have responded to the criticisms above.

      Specific recommendations for improvement: 

      • Mechanistic Studies: Conduct detailed studies to understand the local vs. systemic effects of KMO inhibition and its specific impacts on different cell types and tissues. If not feasible here, the authors could include in the discussion section a detailed overview of the possible mechanisms implicated. 

      While we agree that this is not a comprehensive mechanistic analysis, given that the ultimate therapy would be almost certainly a once daily oral dosing i.e. systemic administration, we do not consider differentiating local vs systemic effects of KMO inhibition to be critical to therapeutic development in this scenario. We do not think speculation about possible mechanisms that is not supported by experimental data should be included. Furthermore, that notion (of statements not supported by data) has been given as a criticism by the reviewers, and therefore consistency on this point must be preferable.

      • Quantitative Analysis: Include more robust quantitative methods to compare KMO expression levels in different tissues and assess the correlation between KMO expression and pathological and behavioural changes. 

      As discussed above, the pathophysiological importance of KMO is in its enzymatic activity, not in its abundance as a protein, and 3HK production is far more dependent on kynurenine substrate availability rather than KMO protein abundance.

      • Appropriate Statistics: Use the most suitable statistical tests for behavioural and other repeated measures data to ensure accurate interpretation. 

      As discussed above

      • Side Effect Evaluation: Investigate potential side effects of systemic KMO inhibition, particularly focusing on the long-term implications of altered kynurenine pathway metabolites. If not feasible here, the authors could include in the discussion section a detailed overview of the possible side effects associated as well as inform if KNS898 can cross the BBB and its implications. 

      For a novel small molecule therapeutic compound in preclinical/clinical development, there are strictly regulated preclinical and clinical development standards that need to be met. It would not be responsible to publish or make claims about safety and potential adverse effect profiles without conducting the proper panel of tests within a suitable regulatory framework.

    1. eLife Assessment

      This study highlights an important discovery: a bacterial pathogen's effector influences plant responses that in turn affect how the leafhopper insect vector for the bacteria is attracted to the plants in a sex-dependent manner. The research is backed by convincing physiological and transcriptome analyses. This study unveils a complex interdependence between the pathogen effector, male leafhoppers, and a plant transcription factor in modulating female attraction to the plant, shedding light on previously unexplored aspects of plant-bacteria-insect interactions.

    2. Reviewer #1 (Public review):

      Summary:

      Orlovskis and his colleagues revealed an interesting phenomenon that SAP54-overexpressing leaf exposure to leafhopper males is required for the attraction of followed females. By transcriptomic analysis, they demonstrated that SAP54 effectively suppresses biotic stress response pathways in leaves exposed to the males. Furthermore, they clarified how SAP54, by targeting SVP, heightens leaf vulnerability to leafhopper males, thus facilitating female attraction and subsequent plant colonization by the insects.

      Strengths:

      The phenomenon of this study is interesting and exciting.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors show that leaf exposure to leafhopper males is required for female attraction in the SAP54-expressing plant. They clarify how SAP54, by degrading SVP, suppresses biotic stress response pathways in leaves exposed to the males, thus facilitating female attraction and plant colonization.

      Strengths:

      This study suggests the possibility that the attraction of insect vectors to leaves is the major function of SAP54, and the induction of the leaf-like flowers may be a side-effect of the degradation of MTFs and SVP. It is a very surprising discovery that only male insect vectors can effectively suppress the plant's biotic stress response pathway. Although there has been interest in the phyllody symptoms induced by SAP54, the purpose and advantage of secreting SAP54 were unknown. The results of this study shed light on the significance of secreted proteins in the phytoplasma life cycle and should be highly evaluated.

      Weaknesses:

      There are no major weaknesses. The mechanism behind why only male leafhoppers reduce plant defense responses in the presence of SAP54 remains somewhat unclear, but clarifying this is beyond the scope of this study and is for future work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Orlovskis and his colleagues revealed an interesting phenomenon that SAP54-overexpressing leaf exposure to leafhopper males is required for the attraction of followed females. By transcriptomic analysis, they demonstrated that SAP54 effectively suppresses biotic stress response pathways in leaves exposed to the males. Furthermore, they clarified how SAP54, by targeting SVP, heightens leaf vulnerability to leafhopper males, thus facilitating female attraction and subsequent plant colonization by the insects.

      Strengths:

      The phenomenon of this study is interesting and exciting.

      Weaknesses:

      The underlying mechanisms of this phenomenon are not convincing.

      We thank the reviewer for finding our study interesting and exciting. However, we respectfully disagree with the reviewer's assertion that the mechanisms we uncovered are unconvincing.

      We have uncovered a significant portion of the mechanisms by which SAP54 induces the leafhopper attraction phenotype.

      First, we discovered that the SAP54-mediated attraction of leafhoppers requires the presence of male leafhoppers on the leaves. Female leafhoppers were only attracted and laid more eggs on leaves when both SAP54 and male leafhoppers were present. In the absence of either males or SAP54, female leafhoppers did not exhibit this behaviour.

      Second, we found that biotic stress responses in leaves were significantly downregulated when exposed to SAP54 and male leafhoppers, with a much lesser effect observed in the presence of females.

      Third, we identified that the presence of the MADS-box transcription factor SHORT VEGETATIVE PHASE (SVP) in leaves is crucial for the leafhopper attraction phenotype, and that SAP54 facilitates the degradation of SVP.

      Our research corroborates previous findings that SAP54-mediated degradation of MADS-box transcription factors depends on the 26S proteasome shuttle factor RAD23, which we found previously to also be necessary for the leafhopper attraction phenotype (MacLean et al., 2014. PMID: 24714165). This finding has been replicated by other research groups. Previous research has also revealed that leafhoppers are specifically attracted to leaves, not to the leaf-like flowers (Orlovskis & Hogenhout, 2016. PMID: 27446117).

      Collectively, these results suggest that SAP54 acts as a "matchmaker", helping male leafhoppers locate mates more easily by degrading SVP-containing complexes in leaves. We have updated the model in Fig. 7 to better illustrate our findings.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors show that leaf exposure to leafhopper males is required for female attraction in the SAP54-expressing plant. They clarify how SAP54, by degrading SVP, suppresses biotic stress response pathways in leaves exposed to the males, thus facilitating female attraction and plant colonization.

      Strengths:

      This study suggests the possibility that the attraction of insect vectors to leaves is the major function of SAP54, and the induction of the leaf-like flowers may be a side-effect of the degradation of MTFs and SVP. It is a very surprising discovery that only male insect vectors can effectively suppress the plant's biotic stress response pathway. Although there has been interest in the phyllody symptoms induced by SAP54, the purpose and advantage of secreting SAP54 were unknown. The results of this study shed light on the significance of secreted proteins in the phytoplasma life cycle and should be highly evaluated.

      Weaknesses:

      One weakness of this study is that the mechanisms by which male and female leafhoppers differentially affect plant defense responses remain unclear, although I understand that this is a future study.

      The authors show that female feeding suppresses female colonization on SAP54-expressing plants. This is also an intriguing phenomenon but this study doesn't explain its molecular mechanism (Figure 7).

      Strengths:

      We appreciate the reviewer's assessment of the strengths of our study. We do indeed discuss the possibility that the induction of leaf-like flowers could be a side effect of the SAP54 effector function. However, it is not uncommon for effectors to have multiple functions, as has been frequently demonstrated for viral proteins (e.g., PMID: 34618877). Furthermore, it is increasingly evident that developmental and immune processes in organisms often overlap and are mediated by the same proteins. A notable example is the Toll-like receptors, which are widely recognized for their role in innate immunity but were initially discovered for their involvement in various developmental processes (e.g., PMID: 29695493).

      MADS-box transcription factors are known to regulate various developmental pathways in plants, and their diversification has been a key driver of evolutionary innovations in plant development. These factors are comparable to HOX genes, which are essential for the development of bilateral animals. While the role of MADS-box transcription factors in orchestrating flowering has been well-documented, recent evidence has emerged showing that they also play a role in regulating immune processes in plants. Our findings contribute to this emerging understanding, presenting novel insights into the multifunctional roles of these transcription factors.

      Specifically, the MADS-box transcription factor SVP has vital roles in both plant immunity and flowering. The SAP54-mediated targeting of this transcription factor may therefore confer multiple advantages to phytoplasmas that, as obligate colonisers, depend on plants and transmission by insects for survival. Firstly, the inhibition of flowering could delay plant senescence and death, which is particularly relevant in annual plants, the primary hosts of AY-WB phytoplasma studied here. Secondly, the downregulation of plant defence responses, particularly against males, facilitates the attraction of females, which are more likely to reproduce and thus increase the number of vectors for phytoplasma transmission. Given that phytoplasmas are obligate organisms with highly reduced genomes, it is plausible that they rely on ‘efficient proteins’ capable of targeting multiple key pathways in their hosts.

      Weaknesses:

      As explained above, we have uncovered a substantial portion of the mechanisms through which SAP54 induces the leafhopper attraction phenotypes that includes the identification of MADS-box transcription factor SVP as an important contributor. We have updated the model in Fig. 7 to better illustrate our findings.

      It is known that SVP forms quaternary structures with other (MADS-box) transcription factors, and it seems likely that the degradation of specific SVP complexes present in fully developed leaves plays a significant role in the downregulation of immune genes in the presence of SAP54 and males. These specific complexes also do not form in svp mutants, which could explain why females are attracted to these mutant plants in the presence of males. However, transcription profiles are different in male-exposed SAP54 vs male-exposed svp plants. This may be explained by SVP having multiple functions, including those that are not targeted by SAP54.

      Identifying which SVP complexes contribute to the male-mediated downregulation of immunity in the presence of SAP54 would require the development of a broad range of tools to investigate plant immunity without the confounding effects of developmental changes. This line of inquiry extends beyond the findings presented in this study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Orlovskis and colleagues revealed an interesting phenomenon that SAP54-overexpressing leaf exposure to leafhopper males is required for the attraction of followed females. By transcriptomic analysis, they demonstrated that SAP54 effectively suppresses biotic stress response pathways in leaves exposed to the males. Furthermore, they clarified how SAP54, by targeting SVP, heightens leaf vulnerability to leafhopper males, thus facilitating female attraction and subsequent plant colonization by the insects. The discovery of this study is interesting and exciting. However, I have a few concerns that require authors to address.

      (1) The author demonstrated that SAP54-overexpressing leaf exposure to leafhopper males is more attractive to females. However, I was confused that the author did not analyse the choice preference of males. This is important, as the author demonstrated later that "SAP54 plants exposed to males display significant downregulation of biotic stress responses". It is very possible that the female is attracted by a mating signal, but not by reduced biotic stress responses. Also, it is important to address whether the female used in this study is virgin.

      We have analysed male preference in feeding choice tests (Figure 1, treatment 3) and described our findings in the text (p7; lines 214-216). For added clarity, we have revised the text on p7 (lines 214-216) to specify that males alone do not show any feeding preference for SAP54 plants.

      Additionally, we investigated whether females could be attracted to male-exposed SAP54 plants prior to landing and feeding using choice experiments, as depicted in Supplemental Figure 3 and discussed in the text (p9; lines 265-271). These findings suggest that long-distance cues alone do not fully account for the female attraction phenotype observed in Figure 1. We acknowledge that mating calls or volatiles may complement or enhance the transcriptional changes in male-exposed SAP54 leaves. This interpretation is further supported by comparing Figure 1, treatments 4 and 5, which shows that removing males from SAP54 leaves before female choice does not increase female colonisation. To enhance clarity and precision, we have added the term "solely" to the results (p9; line 265) and discussion (p25; line 719), and included a new sentence on p26 (lines 726-730): "However, given that the removal of males from SAP54 leaves prior to female choice does not enhance female colonisation (comparison of Figure 1, treatment 4 with treatment 5), we cannot exclude the possibility that male-produced volatiles or mating calls could enhance or supplement SAP54-dependent changes in biotic stress responses to males, thereby enhancing female attraction."

      We have also updated the methods section to clarify that a mixture of virgin and pre-mated females was used in all experiments (p28; lines 798-799), consistent with our previously published work (Orlovskis & Hogenhout, 2016. PMID: 27446117; MacLean et al., 2014. PMID: 24714165).

      (2) I was confused by the rationale of the section "Female leafhopper preference for male-exposed SAP54 plants unlikely involves long-distance cues". Can the volatile cues or mating calls from males only be perceived from a distance?

      As mentioned in our response to comment 1, for clarity, we have added new text to both the results (p9; line 265) and discussion sections (p25; lines 719 and 726-730). In the results section highlighted by the reviewer (p8-9), we aimed to explicitly test whether cues produced by males (such as mating calls or pheromones) or SAP54 plants (such as plant volatiles) could account for female attraction from a distance, independent of, and prior to, physical contact with the plants or male insects.

      To address the possibility that volatiles or mating calls might be perceived simultaneously with downregulated biotic stress responses, we have included an additional sentence in the discussion, which addresses comments 1 and 2 from the reviewers. Furthermore, it is important to note that Figure 1, treatment 4, mirrors the results of Figure 1, treatment 1, suggesting that direct physical contact between males and females is not necessary for the observed female attraction. This conclusion, derived from our experiments, was already emphasised in the main text (p7; lines 218-222).

      (3) Line 271-273. How did the author conclude "immediate access"? A time course experiment (detecting the number of insects on each plant at different time points) for the host-choice experiment is necessary.

      We have corrected and rephrased the sentence as follows:

      “Therefore, these results indicate that female reproductive preference for the male-exposed SAP54 versus GFP plants is dependent on direct female access to the leaves of SAP54 plants and presence of males on these leaves.” (p9; lines 267-271).

      (4) I appreciate the transcriptome analysis. However, the figures are poorly organized. i.e. the heatmap in Figure 2 was poorly understood. The author should clearly address what is upregulated or downregulated. It is meaningless to exhibit the heatmap without explaining what gene represented. Also, it is hard for readers to distinguish the difference between the 4 maps in Figure 2, similar to the two figures in Figure 3.

      We thank the reviewer for the recommendation. To make Figures 2 and 3 easier to read and understand as stand-alone figures, we have revised the corresponding legends, highlighting the colouring of up- and down-regulated DEGs and explaining the content of the related supplementary files. For brevity and clarity, we have removed the references to figure supplements 4, 5 and 6 from these legends: they are already explained and referred to in the main text, and they relate to data processing prior to the analysis in Figure 2 rather than directly to Figure 2 or 3.

      We hope that the improved figure legends will make Figures 2 and 3 easier and quicker to understand.

      (5) For transcriptomic analysis, three out of four replicates were well clustered, and the author excluded the outliers in subsequent analysis. Is this treatment commonly used in transcriptomic analysis? If yes, please provide corresponding references.

      Removing outliers from transcriptomic data is not unusual, as it enhances the classification of treatment groups and increases the efficiency of detecting biologically relevant differentially expressed genes (DEGs) (PMID: 36833313; PMID: 32600248). For large datasets, especially in clinical studies, automated procedures and algorithms have been developed for this purpose (PMID: 32600248; doi.org/10.1101/144519). Given our relatively small sample size of 4, we opted for a PCA-based manual outlier evaluation, followed by repeated PCA without the identified outliers. This approach demonstrated improved group discrimination (Figure Supplement 4), which can enhance downstream characterization of DEGs and pathways that explain female preference for male-exposed SAP54 plants. We have detailed this procedure on pages 9-10. It is worth noting that other automated outlier removal methods, which are also based on PCA, have been shown to be as effective as manual outlier removal (PMID: 32600248).
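
      The PCA-based outlier evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the expression matrix is synthetic, the group labels `group_A`/`group_B`, the function names and the distance cutoff `k` are all assumptions made for illustration, and the authors evaluated outliers manually rather than with a fixed rule. The sketch computes sample scores on the first two principal components, flags samples lying far from their group's median position, and then repeats the PCA without them.

```python
import math
import random
from statistics import median


def center_columns(rows):
    """Subtract each gene's (column's) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def top_eigenpair(mat, iters=500):
    """Dominant eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    n = len(mat)
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        new = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    value = sum(vec[i] * mat[i][j] * vec[j] for i in range(n) for j in range(n))
    return value, vec


def pca_scores(rows, n_components=2):
    """Sample scores on the leading PCs, via the sample-by-sample Gram matrix."""
    xc = center_columns(rows)
    n = len(xc)
    gram = [[sum(a * b for a, b in zip(xc[i], xc[j])) for j in range(n)]
            for i in range(n)]
    scores = [[] for _ in range(n)]
    for _ in range(n_components):
        value, vec = top_eigenpair(gram)
        for i in range(n):
            scores[i].append(vec[i] * math.sqrt(max(value, 0.0)))
        # Deflate so that the next pass finds the following component.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return scores


def flag_outliers(scores, groups, k=10.0):
    """Flag samples far from their group's median position in PC space.

    The cutoff (k times the median distance) is an arbitrary illustrative choice.
    """
    dists = []
    for i, g in enumerate(groups):
        members = [scores[j] for j in range(len(groups)) if groups[j] == g]
        centre = [median(col) for col in zip(*members)]
        dists.append(math.dist(scores[i], centre))
    cutoff = k * median(dists)
    return [i for i, d in enumerate(dists) if d > cutoff]


# Synthetic data: 2 hypothetical groups x 4 replicates, 50 genes.
rng = random.Random(0)
groups = ["group_A"] * 4 + ["group_B"] * 4
expr = []
for i, g in enumerate(groups):
    row = [rng.gauss(0.0, 0.3) for _ in range(50)]
    if g == "group_B":
        for j in range(10):          # treatment effect on 10 genes
            row[j] += 2.0
    if i == 3:
        for j in range(10, 30):      # sample 3 is a deliberately aberrant replicate
            row[j] += 5.0
    expr.append(row)

flagged = flag_outliers(pca_scores(expr), groups)
kept = [row for i, row in enumerate(expr) if i not in flagged]
scores_clean = pca_scores(kept)      # repeated PCA without the flagged samples
```

      In practice the flagged samples would be inspected on the PCA plot before exclusion, and a dedicated statistics package would replace the hand-written power iteration used here to keep the sketch dependency-free.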

      (6) Figure 5A. How was the experiment done? Were HA-SVP and the other HA-tagged genes stably or transiently expressed in GFP and GFP-SAP54 plants? How many replicates were conducted? The band intensity from different biological replicates should be provided. In this manuscript, no information is provided even in the method section.

      We thank the reviewer for noticing this and have updated the methods section, providing more details on transient protoplast expression assays (p39; line 835). We have performed two independent degradation assays for all 5 MTF proteins and indicated this in the legend of Figure 5. Western blot results from both experiments are provided as a new figure supplement 10 (p53). The degradation/destabilisation efficiency was calculated as the HA intensity divided by the RuBisCo large subunit (rbcL) intensity from the same sample, normalised to the intensity of the sample with the highest ratio from the same leaf (Rel HA/rbcL) using ImageJ. Relative pixel intensities are provided above each treatment in new figure supplement 10, as requested by the reviewer.
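
      The Rel HA/rbcL calculation described above amounts to the following arithmetic. This is a minimal sketch only: the function name and the band intensities are made-up placeholders, not values from the blots, and the intensities themselves would come from ImageJ densitometry.

```python
def relative_ha_rbcl(ha, rbcl):
    """Return HA/rbcL ratios scaled so the highest ratio on the leaf equals 1."""
    if len(ha) != len(rbcl):
        raise ValueError("each HA band needs a matching rbcL band")
    ratios = [h / r for h, r in zip(ha, rbcl)]   # loading-normalised HA signal
    top = max(ratios)                            # sample with the highest ratio
    return [ratio / top for ratio in ratios]


# Hypothetical pixel intensities for two samples from the same leaf.
ha_intensity = [1000.0, 250.0]
rbcl_intensity = [2000.0, 1000.0]
rel = relative_ha_rbcl(ha_intensity, rbcl_intensity)
```

      With these placeholder numbers the first sample has the highest HA/rbcL ratio and is set to 1, and the second is expressed relative to it.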

      (7) For the interaction assay, only Y2H was conducted. Generally, at least two methods are needed to confirm protein interaction. This is also applicable to degradation assays.

      There is substantial prior evidence that SAP54 interacts with MADS-box transcription factors and facilitates their degradation in plants, a process that also involves the 26S proteasome shuttle factor RAD23 (MacLean et al., 2014; PMID: 24714165). This interaction has been independently confirmed by other research groups using various methods, including split-YFP assays (e.g., PMID: 24597566, PMID: 26179462). Given the extensive data already available on this topic, it would be redundant to replicate all of these findings in our manuscript. Instead, we have focused on a few validated assays that effectively demonstrate the specific interactions between SAP54 and MADS-box transcription factors.

      (8) Lines 528-530. No direct evidence was provided in this study for SAP54-mediated degradation of SVP. The author should tone down the claim.

      Our findings demonstrate that SVP is degraded in plant cells in the presence of SAP54. Additionally, through yeast two-hybrid assays, we show that SAP54 does not directly bind to SVP but does directly interact with several MADS-box transcription factors known to associate with SVP. We also provide evidence herein that they interact with SVP. Furthermore, previous studies have shown that SAP54 facilitates the degradation of MADS-box transcription factor complexes of Arabidopsis and several other eudicot species (PMID: 24597566, PMID: 26179462, PMID: 28505304, PMID: 35234248; PMID: 38105442). We have described our observations and those of others (see main text pages 4-5, pages 19-20), and believe that we have presented them accurately without overstating our conclusions.

      (9) Overall, the phenomenon of this study is interesting, but the underlying mechanisms are not solidified. Additional work is still needed in future studies.

      We respectfully disagree—we have identified a significant portion of the mechanisms by which SAP54 induces these phenotypes. As with any research, new data often leads to further questions that may be addressed by follow-up studies. Please refer to our previous responses for additional context.

      Reviewer #2 (Recommendations For The Authors):

      Major comment

      It will be interesting to see how long male feeding affects changes in gene expression in plants. No feeding choice of females was observed on the SAP54 plants when males were removed from the clip-cages prior to the choice test with females alone (Figure 1, Treatment 5; Figure Supplement 1, Treatment 5). This indicates that SAP54 plants lose their ability to attract females as soon as males are removed. On the other hand, if the suppression of the plant's stress response pathway by male feeding continues for some time even after males are removed, I think that we cannot exclude the possibility that volatiles emitted by males may partially promote female feeding and colonization.

      As described above, our findings suggest that long-distance cues alone do not fully account for the female attraction phenotype observed in Figure 1. We acknowledge that mating calls or volatiles may complement or enhance the transcriptional changes in male-exposed SAP54 leaves. This interpretation is further supported by comparing Figure 1, treatments 4 and 5, which shows that removing males from SAP54 leaves before female choice does not increase female colonisation. To enhance clarity and precision, we have added the term "solely" to the results (p9; line 265) and discussion (p25; line 719), and included a new sentence on p26 (lines 726-730): "However, given that the removal of males from SAP54 leaves prior to female choice does not enhance female colonisation (comparison of Figure 1, treatment 4 with treatment 5), we cannot exclude the possibility that male-produced volatiles or mating calls could enhance or supplement SAP54-dependent changes in biotic stress responses to males, thereby enhancing female attraction."

      Minor comments

      The legend of Figure 1 is missing an explanation for panel C.

      Thank you for noticing this. We have added the missing information.

      Although from a different perspective from this study, a relationship between phytoplasma infection and SVP has been previously reported (Yang et al., Plant Physiology, 2015). Shouldn't this paper be cited somewhere?

      We thank the reviewer for identifying this oversight. We have added the missing reference (PMID: 26103992) and clarified that, as seen in Figure 5E (p20; lines 555-558), our findings show a similar upregulation of SVP in male-exposed SAP54 plants as reported by Yang et al. This suggests that SAP54 and its homologs, such as PHYL1, may indeed operate through similar mechanisms by targeting MTFs that are crucial for their function. While Yang et al. described the role of SVP in the development of abnormal flower phenotypes in Catharanthus, our study reveals a completely novel role for SVP in plant-insect interactions. Although SAP54 destabilises the SVP protein, its transcript is upregulated in the presence of SAP54, indicating a potential disruption of MTF autoregulation and the MTF network as a whole.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Response to reviewer 1:

      We thank the reviewer for their positive comments and note that we made many attempts to genetically alter endothelial cells to express mutants of SEC61A1 that are resistant to the effects of mycolactone. However, these cells were not capable of supporting expression of this transgene. Instead, we used an approach where we tested other translocation inhibitors, with a different chemical structure but the same mechanism of action at the Sec61 translocon, and found that these phenocopied the effects.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors have investigated the effect of the toxin mycolactone produced by Mycobacterium ulcerans on the endothelium. Mycobacterium ulcerans is involved in Buruli ulcer, which is classified as a neglected disease by the WHO. This disease has dramatic consequences on the microcirculation, causing important cutaneous lesions. The authors have previously demonstrated that endothelial cells are especially sensitive to mycolactone. The present study brings more insight into the mechanism involved in mycolactone-induced endothelial cell defects and thus in microcirculatory dysfunction. The authors showed that mycolactone directly affected the synthesis of proteoglycans at the level of the Golgi, with a major consequence on the quality of the glycocalyx and thus on endothelial function and structure. Importantly, the authors show that blockade of the enzyme involved in this synthesis (galactosyltransferase II) phenocopied the effects of mycolactone. The effect of mycolactone on the endothelium was confirmed in vivo. Finally, the authors showed that exogenous laminin-511 reversed the effects of mycolactone, thus opening an important therapeutic perspective for the treatment of wound healing in patients suffering from Buruli ulcer and presenting lesions.

      Reviewer #2 (Public Review):  

      The authors dissected the effects of mycolactone on endothelial cell biology and vessel integrity. The study follows up on previous work by the same group, which highlighted alterations in vascular permeability and coagulation in patients with Buruli ulcer. It provides a mechanistic explanation for these clinical observations, and suggests that blockade of Sec61 in endothelial cells contributes to tissue necrosis and slow wound healing.

      Overall, the generated data support their conclusions and I only have two major criticisms:  

      - Replicating the effects of mycolactone on endothelial parameters with Ipomoeassin F (or its derivative ZIF-80) does not demonstrate that these effects are due to Sec61 blockade. This would require genetic proof, using for example endothelial cells expressing Sec61A mutants that confer resistance to mycolactone blockade. The authors claimed in the Discussion that they could not express such mutants in primary endothelial cells, but did they try expressing mutants in HUVEC cell lines? Without such genetic evidence all statements claiming a causative link between the observed effects on endothelial parameters and Sec61 blockade should be removed or rephrased. The same applies to speculations on the role of Sec61 in epithelial migration defects in discussion. Data corresponding to Ipomoeassin F and ZIF-80 do not add important information, and may be removed or shown as supplemental information.  

      - While statistical analysis is done and P values are provided, no information is given on the statistical tests used, neither in methods nor results. This must be corrected, to evaluate the repeatability and reproducibility of their data.  

      We respectfully but fundamentally disagree with the comments regarding the Sec61 dependence of the effects that we observed. We showed that loss of glycocalyx and basement membrane components underpinned the phenotypic changes in endothelial cells (morphological changes, loss of adhesion, increased permeability, and reduced ability to repair scratch wounds). We demonstrated that we could phenocopy permeability increases and elongation phenotype by knocking down the type II membrane protein B3Galt6, and reverse the adhesion defect by exogenous provision of the secreted laminin-511 heterotrimer.

      Our conclusion that mycolactone mediates these effects via Sec61 inhibition is not based solely on the use of alternative inhibitors but is built on several pillars of evidence:  

      First, the proteomics data conform entirely to predictions based on the topology of affected vs. non-affected proteins, and agree with independently published proteomic datasets from T lymphocytes, dendritic cells and sensory neurons (ref.12), as well as biochemical studies performed using in vitro translocation assays (ref.11,34). Furthermore, the pattern of membrane protein downregulation observed in our experiments fits perfectly with established models of protein translocation mechanisms, particularly with respect to the lack of effect on specific topologies of multipass membrane proteins, tail-anchored and type III membrane proteins (ref.34-36).

      Second, since Sec61 is very highly conserved amongst mammals and is found in all nucleated cells, it is hard to conceptualise a framework in which mycolactone targets Sec61 in some cells and not others, as this reviewer suggests might be the case for epithelial cells [noting that the work being referred to (ref.29) predates our 2014 work showing that mycolactone is a Sec61 inhibitor (ref.7)]. Indeed, mycolactone has been shown to target Sec61 in multiple independent approaches including forward genetic screens involving random mutagenesis and CRISPR/Cas9 (ref.10, PMID: 35939511). Genetic evidence has previously been provided for the Sec61 dependence of mycolactone effects in epithelial cells (ref.10,17). We have unpublished genetic evidence that the rounding and detachment of epithelial cells due to mycolactone is reduced when resistance mutations are overexpressed, and will consider including this in the next version of the manuscript.

      Third, given this weight of evidence, one would be hard-pressed to provide an alternative explanation for the specific down-regulation of glycosaminoglycan-synthesising enzymes and adhesion/basement membrane molecules while most cytosolic and non-Sec61 dependent membrane proteins are unchanged or upregulated. However, seeking to be as rigorous as possible we have here shown that a completely independent Sec61 inhibitor produces the same phenotype at the gross and molecular level. Ipomoeassin F (Ipom-F) is a glycolipid, not a polyketide lactone, yet they both compete for binding with cotransin in Sec61α (ref.6). There is significant overlap in the cellular responses to mycolactone and Ipom-F, including the induction of the integrated stress response (ref.17, PMID: 34079010), which we observed again in the current data, providing further evidence that this approach is useful when genetic approaches are technically unattainable.  

      Therefore, we are confident the effects seen on endothelial cells are Sec61-dependent. We are happy to provide more detail on our lengthy attempts at overexpressing mycolactone-resistant SEC61A1 genes in HUVECs, which are primary endothelial cells derived from the umbilical vein. We are highly experienced in this area, and have previously stably expressed these proteins in epithelial cell lines, reproducing the resistance profile (ref.10,17). Notably though, these cells do not have normal ‘fitness’ in the absence of challenge. Since endothelial cells (and endothelial cell lines; PMID: 12560236) are extremely hard to transfect with plasmids, with efficiency routinely 5-10% (including in our hands), we developed a lentivirus system. We were eventually (after multiple attempts using different protocols) able to transduce primary HUVECs with constructs expressing GFP (at an efficiency of about 10-20%) and select/expand these under puromycin selection. Nevertheless, we never recovered any cells that expressed the flag-tagged SEC61A1 wild type or SEC61A1 carrying the resistance mutation D60G. We also attempted to select D60G-transduced cells with mycolactone epimers, an approach that can help the cells compete against non-transduced cells in culture flasks (ref.10). We concluded that primary endothelial cells are unable to tolerate the expression of additional Sec61α, and that this was incompatible with survival.

      It’s also important to note that most endothelial cell specialists would agree that endothelial cell lines are not good models of endothelial behaviour. We tested the HMEC-1 cell line, but found it did not express prototypical endothelial marker vWF in the expected way. Therefore we focussed our efforts on primary endothelial cells. Should we be able to overcome the dual challenge of the necessity to work in primary cells, and the difficulty of over-expressing Sec61, we will update this paper at a later date with this data, and will also expand the above arguments.  

      We apologise for the embarrassing oversight of not including information about the statistical analyses we used, which of course we will correct in full in the revised version. However, we would like to provide this information to readers of the current version of the manuscript. All data were analysed using GraphPad Prism Version 9.4.1:

      Figure 1: one-way ANOVA with Dunnett’s (panel A) or Tukey’s (panel B) correction for multiple comparisons

      Figure 2 supplement: one-way ANOVA with Tukey’s correction for multiple comparisons (analysed panel)

      Figure 3: one-way ANOVA with Tukey’s (panel B) or Dunnett’s (panel E&F) correction for multiple comparisons

      Figure 4:  one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 5 and supplement:  one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 6:  one-way ANOVA with Dunnett’s correction for multiple comparisons (analysed panel)

      Figure 6 supplement: one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 7: two-way ANOVA with Tukey’s correction for multiple comparisons (all analysed panels; panels B&C also included the Geisser Greenhouse correction for sphericity)

      Figure 7 supplement: Panels A&D used a repeated measures one-way ANOVA with Dunnett’s correction for multiple comparisons (panel D also included the Geisser Greenhouse correction for sphericity). Panels B,C&E used a two-way ANOVA with Tukey’s correction for multiple comparisons (panels B&C also included the Geisser Greenhouse correction for sphericity)
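
      For readers less familiar with the tests listed above, the following minimal sketch shows what the one-way ANOVA underlying most of these comparisons computes: the ratio of between-group to within-group variance. The measurements are invented toy values and the function name is hypothetical; the Dunnett, Tukey and Geisser-Greenhouse adjustments applied in GraphPad Prism are not reproduced here and would require a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual measurements around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data: one control and two treatment groups, three replicates each.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy)
```

      The post hoc corrections then determine which specific pairs of groups differ: Dunnett's compares each treatment with a single control, whereas Tukey's compares all pairs.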

      Reviewer #3 (Public Review):

      Buruli ulcer is a severe skin infection in humans that is caused by a bacterium, Mycobacterium ulcerans. The main clinical sign is a massive tissue necrosis subsequent to an edema stage. The main virulence factor called mycolactone is a polyketide with a lactone core and a long alkyl chain that is released within vesicles by the bacterium. Mycolactone was already shown to account for several disease phenotypes characteristic of Buruli ulcer, for instance tissue necrosis, host immune response modulation and local analgesia. A large number of cellular pathways in various cell types was reported to be impacted by mycolactone. Among those, the Sec61 translocon involved in the transport of certain proteins to the endoplasmic reticulum was first identified by the authors of the study and is currently the most consensual target. Mycolactone disruption of Sec61 function was then shown to directly impact on cell apoptosis in macrophages, limited immune responses by T-cells and increased autophagy in dermal endothelial cells and fibroblasts. In their manuscript, Tzung-Harn Hsieh and their collaborators investigated the Sec61-dependent role of mycolactone on morphology, adhesion and migration of primary human dermal microvascular endothelial cells (HDMEC). They used a combination of sugar and proteomic studies on a live image-based phenotypic assay on HDMEC to characterize the effect of mycolactone. First, they showed that upon incubation of monolayer of HDMEC with mycolactone at low dose (10 ng/mL) for 24h, the cells become elongated before rounding and eventually detached from the culture dish at 48h. Next, mycolactone was probed on a scratch assay and migration of the cells ceased upon a 24h incubation. The same effect as mycolactone on these two assays was observed for two other Sec61 inhibitors Ipomoeassin F and ZIF-80. Then, the authors resorted to the widely established mouse footpad model of M. ulcerans infection to evidence fibrinogen accumulation outside the blood vessel within the endothelium at 28 days postinfection, correlating with severe endothelial cell morphology changes.

      To dissect the molecular pathways involved in these phenotypes, the authors performed an HDMEC membrane protein analysis and showed a decrease in the numbers of proteins involved in glycosylation and adhesion. As protein glycosylation mainly occurs in the Golgi apparatus, a deeper analysis revealed that enzymes involved in glycosaminoglycan (GAG) synthesis were lost in mycolactone treated HDMEC. A combination of immunofluorescence and flow cytometry approaches confirmed the impact of mycolactone on the ability of endothelial cells to synthesize GAG chains. The mycolactone effect on cell elongation was phenocopied by knock-down of galactosyltransferase II (B3Galt6) involved in GAG biosynthesis. A second extensive analysis of the endothelial basement membrane component and their ligands identified multiple laminins affected by mycolactone. Using similar functional studies as for GAG, the impact of mycolactone on cell rounding and migration could be reversed by the addition of laminin α5.  

      The major strengths of the study rely on a combination of cleverly designed phenotypic assays, in-depth membrane proteomic studies and follow-up analysis.

      The results really support the conclusions. Congratulations!  

      The discussion takes into account the current state of the art, which has mostly been established by the authors of the present manuscript.  

      Recommendations for the authors:

      In preparing this revised version we have made a number of general improvements:

      • We added the missing information on statistical analysis that was mentioned in the public review of reviewer #2

      • We have changed all gene names to the HUGO nomenclature

      • We have changed our abbreviation of mycolactone from “MYC” to “Myco” in all figures to avoid any potential confusion with other protein factors

      • We have moved the fibrin(ogen) staining of the mouse footpads to its own figure (now Fig 2), partly due to the inclusion of additional data in Fig 1. This has changed the numbering of subsequent figures, but has also made the supplementary figures easier to track.

      Reviewer #1 (Recommendations For The Authors):  

      (1) Figure 1I. When mice are injected with M. ulcerans, a measurement of local blood flow would be very informative in addition to the data shown. Cutaneous blood flow at the level of the feet can be measured using laser Doppler or laser speckle imaging. With these measurements the authors would have a functional quantification of the effect of the glycosaminoglycan- and Sec61α-associated damage on the microcirculatory blood flow. The same measurement could also better validate the therapeutic effect of laminin.

      We thank the reviewer for this great suggestion, and respectfully remind the reviewer that these experiments take place in CL3 containment. This often completely precludes certain procedures due to the availability of equipment inside the containment, and our ability to sterilise it. Where we are able to perform procedures, it greatly increases their complexity since any procedures on live animals must take place inside of a cabinet. Therefore, we can only use equipment that we have at our animal facility. It is not trivial to set up the regulatory permissions to perform these experiments at other facilities where more specialist equipment is located due to the containment restrictions. 

      Nevertheless, we have attempted to perform ultrasound imaging of mouse feet using the VivoF and have set up a collaboration with other researchers at Surrey who have developed a novel imaging instrument to measure microvascular circulation called optical coherence tomography (OCT; https://pubmed.ncbi.nlm.nih.gov/34882760/), and we are working with them to develop a protocol that can be used in small rodents.

      However, while we have dedicated considerable time to trying to perform the suggested experiment, we have not been successful within a reasonable time frame. Consequently, if we are able to establish this technique in the M. ulcerans infection model, and/or OCT in small rodents, this will likely be beyond the scope of the current manuscript and will be a publication in its own right. We note that we have been able to perform almost all of the other requested experiments (see below), and have also been able to undertake transmission electron microscopy of M. ulcerans infected mouse footpads, which confirms the loss of the basement membrane at high resolution (Fig 7E).

      (2) Figure 1 -D. Endothelial cells were exposed to mycolactone, Ipomoeassin F or ZIF-80. The effect on the cells is clear and impressive. Nevertheless, endothelial cells in no-flow conditions are considered "diseased" cells, as areas of low or no flow are prone to atherosclerosis in vivo. Would the authors expect similar effects in cells subjected to flow? In these conditions, cells would already be elongated in the direction of flow.

      We agree that flow is usually experienced by endothelial cells in vivo, and have repeated a selection of our experiments under conditions that mimic flow and produce uniaxial shear stress. All showed a similar pattern of response to mycolactone, including the phenotypic changes (Fig 1I-K), loss of perlecan (Fig S6C) and laminin α4 (Fig S7B). It is true that the elongation phenotype is not as striking in a cell monolayer that already contains many elongated cells, but qualitatively the cells become disorganised and, at 48 hours, their length/width ratio had increased. These results provide reassurance that our findings are physiologically relevant.

      (3) Discuss the possible consequences of your findings on vascular reactivity and especially on flow-mediated dilation and/or flow-mediated remodeling, both of which are important in tissue repair and wound healing.

      We agree with this reviewer that there are likely to be broad consequences to endothelial and vascular function as a result of our findings here. Vascular reactivity is not something we directly considered in this manuscript, and is probably better linked to our planned future work, laid out above, regarding vascular flow in the infected animals. While a key mediator of vascular tone, endothelin 1, is a Sec61-dependent secreted peptide mediator (and is likely to also be affected by mycolactone’s actions), this was not one of the >6500 proteins we identified in our proteomic study. On the other hand, it has been shown by others that mycolactone can induce NO production in other types of cells.

      Reviewer #2 (Recommendations For The Authors):  

      - The authors use a mouse model of M. ulcerans infection of footpads to assess the in vivo relevance of their results. It would be useful to comment on any differences between human and mouse with regard to endothelial cell biology and vessel wall architecture. Since the authors have access to patient samples, parallel stainings in human lesions would have strengthened the study.

      This is an important issue, and is one we have already addressed in our two previous articles https://pubmed.ncbi.nlm.nih.gov/35100311/ https://pubmed.ncbi.nlm.nih.gov/26181660/ . Indeed, this latter work already included a detailed analysis of fibrin staining in these Buruli ulcer patient biopsies and underpinned the hypothesis that we have now tested in the current manuscript. 

      It is worth noting that our data supports that the critical step is at an early (pre-clinical) stage, for which patient samples are not available. The proposed human challenge model (https://pubmed.ncbi.nlm.nih.gov/37384606/ ) may well provide a suitable platform for such studies in the future.

      - The authors should provide in the Discussion some explanation for the differential effects of Laminin-11, -411 and -511 in Fig. 7 

      This is an interesting point, and probably related to the expression of laminin binding proteins by mycolactone-exposed endothelial cells. We pursued several candidates based on the proteomic data but could not identify a unique gene that explained this observation. Most likely, the differential effects are explained by partial (be it low or high) loss of a combination of integrin binding proteins. Since this was rather inconclusive, we preferred not to present these data, and had already stated (p34-35): “We have not been able to ascribe this to the retention of a specific adhesion molecule, and instead postulate that rescue could be via residual expression of a wide variety of laminin α5 receptors”.

      - The word "catastrophic" in the title is very dramatic given the limited impact on the vital prognosis of patients 

      This word has been changed to “destructive”.

      Reviewer #3 (Recommendations For The Authors):  

      Several points could be further discussed:  

      -In the mouse model of M. ulcerans infection, animals heal spontaneously in 5% of cases. How could the authors' results help to generate hypotheses that explain this phenomenon?

      Others have shown that the ability of some mice to control M. ulcerans infection is related to loss of mycolactone production by an unknown mechanism. It is not something we have ever observed in the infection experiments we have performed, although this may be due to the humane endpoints of our licence. However, this seems somewhat outside the main focus of the paper and we have not discussed this further.  

      -Mycolactone was also reported to induce analgesia in the mouse model. There is still controversy about the precise mechanisms involved in this mycolactone mediated painless effect. Could the data obtained here help to resolve the controversy? 

      We agree that analgesia in M. ulcerans infection (both in mouse models and in clinical infections) is an extremely interesting area. However, we cannot mechanistically link loss of vascular integrity with the analgesia based on the data generated in the current manuscript. Therefore we prefer not to speculate on this.

      The quantification of the microscopy images and videos should be provided as well as the script used to quantify them. 

      The reviewer is not specific about which microscopy images are being referred to in this comment, but the reference to videos leads us to assume this is related to the ZenCell OWL images/videos presented in Figure 1 and Figure S1. We had already provided quantification of these in the graphs provided, and the algorithms used for % coverage and % detached cells were provided in the instrument software used to gather the data, the ZenCell OWL (which are proprietary). Other counts were made manually, and the length:width ratio is simple arithmetic as already described in the methodology.

      The authors performed their work using chemically synthesized mycolactone obtained from the very generous Professor Kishi (Harvard University). Would the same phenotype and proteomics analysis be obtained with biologically purified mycolactone? 

      Our lab has extensive experience of both biologically purified and synthetic mycolactone, and the phenotypes observed have always been identical when using the chemically synthesised form. Therefore we did not repeat the proteomics experiments as we do not believe it would provide any greater insight into the disease mechanism. However, we have now replicated a range of findings using mycolactone biologically purified from M. ulcerans. In particular, we confirmed that the cytotoxic activities of synthetic and biological mycolactone are inseparable (Figure S1A), and confirmed the main phenotypic changes induced by mycolactone in endothelial cells (phenotypes: Figures S1D-F; B3GALT6/perlecan/laminin α5 loss: Figures S5A, S6B, S7A).

      Although already very comprehensive, a kinetic study of their proteomic analysis over time could strengthen the analysis (from 2H to 48H). 

      We agree that more data is always better, but since we validated our proteomic data set over multiple timepoints between 2 and 48 hrs, we do not believe this would alter the main conclusions of our work.   

      The siRNA transfection protocol could be better described. A Table listing all the reagents would help the reader.  

      A more detailed siRNA transfection protocol has been added to the methods section, and we now include a Key Resources Table at the start of the Materials & Methods section.

    2. eLife Assessment

      The toxin mycolactone is produced by Mycobacterium ulcerans which is responsible for the Buruli ulcer lesions. The authors performed a valuable study showing the effects of mycolactone on blood vessel integrity. This convincing data provides new therapeutic targets to accelerate the healing of Buruli ulcer lesions.

    3. Reviewer #1 (Public review):

      Summary:

      By employing human primary microvascular endothelial cells, along with live confocal imaging, proteomics, and chemical validation studies, the authors reveal a novel cellular mechanism underlying mycolactone's effects in Buruli ulcer lesions. This finding provides important insights into the specific mechanisms of skin pathogenesis.

      Strengths:

      The techniques employed are state-of-the-art.

      Weaknesses:

      The study lacks genetic validation of the findings.

    4. Reviewer #2 (Public review):

      The authors have investigated the effect of the toxin mycolactone produced by Mycobacterium ulcerans on the endothelium. Mycobacterium ulcerans is involved in Buruli ulcer lesions classified as a neglected disease by WHO. This disease has dramatic consequences on the microcirculation causing important cutaneous lesions. The authors have previously demonstrated that endothelial cells are especially sensitive to mycolactone. The present study brings more insight into the mechanism involved in mycolactone-induced endothelial cell defects and thus in microcirculatory dysfunction. The authors showed that mycolactone directly affected the synthesis of proteoglycans at the level of the Golgi with a major consequence on the quality of the glycocalyx and thus on the endothelial function and structure. Importantly, the authors show that blockade of the enzyme involved in this synthesis (galactosyltransferase II) phenocopied the effects of mycolactone. The effect of mycolactone on the endothelium was confirmed in vivo. Finally, the authors showed that exogenous laminin-511 reversed the effects of mycolactone, thus opening an important therapeutic perspective for the treatment of wound healing in patients suffering from Buruli ulcer lesions.

    1. eLife Assessment

      The investigators studied the membrane-targeting sequence (MTS) of bactofilin A (BacA) in Caulobacter crescentus to explore its role in membrane binding and polymerization. They used various techniques, including microscopy, liposome binding assays, and simulations, to show that membrane targeting may be crucial for BacA polymerization. While their findings on membrane association are valuable, the absence of direct polymerization assays and lack of proper controls in some experiments make the study incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      The investigators undertook detailed characterization of a previously proposed membrane targeting sequence (MTS), a short N-terminal peptide, of the bactofilin BacA in Caulobacter crescentus. Using light microscopy, single molecule tracking, liposome binding assays, and molecular dynamics simulations, they provide data to suggest that this sequence indeed does function in membrane targeting and further conclude that membrane targeting is required for polymerization. While the membrane association data are reasonably convincing, there are no direct assays to assess polymerization and some assays used lack proper controls as detailed below. Since the MTS isn't required for bactofilin polymerization in other bacterial homologues, showing that membrane binding facilitates polymerization would be a significant advance for the field.

      Major concerns

      (1) This work claims that the N-terminal MTS domain of BacA is required for polymerization, but they do not provide sufficient evidence that the ∆2-8 mutant or any of the other MTS variants actually do not polymerize (or form higher order structures). Bactofilins are known to form filaments, bundles of filaments, and lattice sheets in vitro and bundles of filaments have been observed in cells. Whether puncta or diffuse labeling represents different polymerized states or filaments vs. monomers has not been established. Microscopy shows mis-localization away from the stalk, but resolution is limited. Further experiments using higher resolution microscopy and TEM of purified protein would prove that the MTS is required for polymerization.

      (2) Liposome binding data would be strengthened with TEM images to show BacA binding to liposomes. From this experiment, gross polymerization structures of MTS variants could also be characterized.

      (3) The use of the BacA F130R mutant throughout the study to probe the effect of polymerization on membrane binding is concerning as there is no evidence showing that this variant cannot polymerize. Looking through the papers the authors referenced, there was no evidence of an identical mutation in BacA that was shown to be depolymerized or any discussion in this study of how the F130R mutation might be analogous to polymerization-deficient variants in other bactofilins mentioned in these references.

      (4) Microscopy shows that a BacA variant lacking the native MTS regains the ability to form puncta, albeit mis-localized, in the cell when fused to a heterologous MTS from MreB. While this swap suggests a link between puncta formation and membrane binding the relationship between puncta and polymerization has not been established (see comment 1).

      (5) The authors provide no primary data for single molecule tracking. There is no tracking mapped onto microscopy images to show membrane localization or lack of localization in MTS deletion/variants. A known soluble protein (e.g. unfused mVenus) and a known membrane bound protein would serve as valuable controls to interpret the data presented. It also is unclear why the authors chose to report molecular dynamics as mean squared displacement rather than mean squared displacement per unit time, and the number of localizations is not indicated. Extrapolating from the graph in figure 4 D for example, it looks like WT BacA-mVenus would have a mobility of 0.5 (0.02/0.04) micrometers squared per second which is approaching diffusive behavior. Further justification/details of their analysis method is needed. It's also not clear how one should interpret the finding that several of the double point mutants show higher displacement than deleting the entire MTS. These experiments as they stand don't account for any other cause of molecular behavior change and assume that a decrease in movement is synonymous with membrane binding.

      (6) The experiments that map the interaction surface between the N-terminal unstructured region of PbpC and a specific part of the BacA bactofilin domain seem distinct from the main focus of the paper and the data somewhat preliminary. While the PbpC side has been probed by orthogonal approaches (mutation with localization in cells and affinity in vitro), the BacA region side has only been suggested by the deuterium exchange experiment and needs some kind of validation.
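
      The extrapolation in point (5) can be made explicit with a short calculation. The following is only a minimal sketch: it assumes the values read off Figure 4D in the comment above (a mean squared displacement of about 0.02 µm² at a lag of about 0.04 s) and, for the second quantity, assumes unconfined two-dimensional Brownian motion, for which MSD = 4·D·τ. The function names are hypothetical and are not taken from the authors' analysis pipeline.

```python
def msd_per_unit_time(msd_um2: float, lag_s: float) -> float:
    """Mean squared displacement divided by the lag time, in µm²/s."""
    if lag_s <= 0:
        raise ValueError("lag time must be positive")
    return msd_um2 / lag_s


def apparent_diffusion_coefficient(
    msd_um2: float, lag_s: float, dimensions: int = 2
) -> float:
    """Apparent D in µm²/s from MSD = 2 * d * D * tau.

    Assumption: free Brownian motion in `dimensions` dimensions.
    Confined or membrane-bound molecules will deviate from this relation.
    """
    if lag_s <= 0:
        raise ValueError("lag time must be positive")
    return msd_um2 / (2 * dimensions * lag_s)


# Values assumed from the reviewer's reading of Figure 4D (not primary data).
msd = 0.02  # µm²
lag = 0.04  # s

print(f"MSD per unit time: {msd_per_unit_time(msd, lag):.3f} µm²/s")
print(f"Apparent 2D diffusion coefficient: {apparent_diffusion_coefficient(msd, lag):.3f} µm²/s")
```

      The first function reproduces the ratio quoted in the comment; the second shows how the same reading translates into a diffusion coefficient under the stated free-diffusion assumption, which is one reason why reporting the lag time alongside the MSD matters for interpretation.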

    3. Reviewer #2 (Public review):

      Summary:

      The authors of this study investigated the membrane-binding properties of bactofilin A from Caulobacter crescentus, a classic model organism for bacterial cell biology. BacA was the progenitor of a family of cytoskeletal proteins that have been identified as ubiquitous structural components in bacteria, performing a range of cell biological functions. Association with the cell membrane is a common property of the bactofilins studied and is thought to be important for functionality. However, almost all bactofilins lack a transmembrane domain. While membrane association has been attributed to the unstructured N-terminus, experimental evidence had yet to be provided. As a result, the mode of membrane association and the underlying molecular mechanics remained elusive.

      Liu et al. analyze the membrane binding properties of BacA in detail and scrutinize molecular interactions using in-vivo, in-vitro and in-silico techniques. They show that a few N-terminal amino acids are important for membrane association or proper localization and suggest that membrane association promotes polymerization. Bioinformatic analyses revealed conserved lineage-specific N-terminal motifs indicating a conserved role in protein localization. Using HDX analysis they also identify a potential interaction site with PbpC, a morphogenic cell wall synthase implicated in Caulobacter stalk synthesis. In a complementary approach, they pinpoint the bactofilin-interacting region within the PbpC C-terminus, known to interact with bactofilin. They further show that BacA localization is independent of PbpC.

      Strengths

      These data significantly advance the understanding of the membrane binding determinants of bactofilins and thus their function at the molecular level. The major strength of the comprehensive study is the combination of complementary in vivo, in vitro and bioinformatic/simulation approaches, the results of which are consistent.

      Weaknesses:

      The results are limited to protein localization and interaction, as there is no data on phenotypic effects. Therefore, the cell biological significance remains somewhat underrepresented.

    4. Author response:

      Reviewer #1:

      Summary:

      The investigators undertook detailed characterization of a previously proposed membrane targeting sequence (MTS), a short N-terminal peptide, of the bactofilin BacA in Caulobacter crescentus. Using light microscopy, single molecule tracking, liposome binding assays, and molecular dynamics simulations, they provide data to suggest that this sequence indeed does function in membrane targeting and further conclude that membrane targeting is required for polymerization. While the membrane association data are reasonably convincing, there are no direct assays to assess polymerization and some assays used lack proper controls as detailed below. Since the MTS isn't required for bactofilin polymerization in other bacterial homologues, showing that membrane binding facilitates polymerization would be a significant advance for the field.

      We thank Reviewer #1 for the constructive criticism and will address the points detailed below in a revised version of the manuscript.

      Major concerns

      (1) This work claims that the N-terminal MTS domain of BacA is required for polymerization, but they do not provide sufficient evidence that the ∆2-8 mutant or any of the other MTS variants actually do not polymerize (or form higher order structures). Bactofilins are known to form filaments, bundles of filaments, and lattice sheets in vitro and bundles of filaments have been observed in cells. Whether puncta or diffuse labeling represents different polymerized states or filaments vs. monomers has not been established. Microscopy shows mis-localization away from the stalk, but resolution is limited. Further experiments using higher resolution microscopy and TEM of purified protein would prove that the MTS is required for polymerization.

      We do not propose that the MTS is directly involved in the polymerization process, and preliminary transmission electron microscopy (TEM) data show that variants lacking the MTS or carrying amino acid exchanges in the MTS still form polymers when highly overproduced in E. coli and then purified from cell lysates by affinity chromatography. This finding is consistent with the results of previous studies and in line with the finding that bactofilin polymerization is exclusively mediated by the conserved bactofilin domain (Deng et al, Nat Microbiol, 2019). However, under native expression conditions, bactofilin levels are often relatively low, with only a few hundred molecules of BacA measured per cell in C. crescentus (Kühn et al, EMBO J, 2006). Our data indicate that, under this condition, the concentration of BacA on the 2D surface of the cytoplasmic membrane and, potentially, steric constraints induced by membrane curvature, may be required to facilitate its efficient assembly into functional polymeric complexes. We will provide TEM images of purified proteins in a revised version of our manuscript and explain this model in more detail in the Discussion.

      In the case of polymer-forming proteins, defined localized signals are typically interpreted as polymeric complexes. An even distribution of the fluorescence signals, by contrast, indicates that the proteins form monomers or, at most, small oligomers that diffuse rapidly within the cell and are thus no longer detected as a stationary focus by widefield microscopy. Our single-molecule data also indicate that proteins that are no longer able to interact with the membrane (as verified by cell fractionation studies and in vitro liposome binding assays) show a high diffusion rate, similar to that measured for the non-polymerizing and non-membrane-bound F130R variant. These results indicate that a loss of membrane binding strongly reduces the ability of BacA to form polymeric assemblies. To support this hypothesis, we will perform additional single-molecule tracking analyses of freely diffusible and membrane-bound monomeric fluorescent proteins for comparison.

      (2) Liposome binding data would be strengthened with TEM images to show BacA binding to liposomes. From this experiment, gross polymerization structures of MTS variants could also be characterized.

      We do not have the possibility to perform cryo-electron microscopy studies of liposomes bound to BacA. However, the results of the cell fractionation and liposome sedimentation assays clearly support a critical role of the MTS in membrane binding.

      (3) The use of the BacA F130R mutant throughout the study to probe the effect of polymerization on membrane binding is concerning as there is no evidence showing that this variant cannot polymerize. Looking through the papers the authors referenced, there was no evidence of an identical mutation in BacA that was shown to be depolymerized or any discussion in this study of how the F130R mutation might be analogous to polymerization-deficient variants in other bactofilins mentioned in these references.

      Residue F130 in the C-terminal polymerization interface of BacA is highly conserved among bactofilin homologs, although its absolute position in the protein sequence may vary, depending on the length of the N-terminal unstructured tail. The papers cited in our manuscript show that an exchange of this conserved phenylalanine residue abolishes polymer formation. We will make this fact clearer in the revised version of the manuscript. Moreover, we will provide gel filtration and transmission electron microscopy data showing that the BacA-F130R variant no longer forms polymers.

      (4) Microscopy shows that a BacA variant lacking the native MTS regains the ability to form puncta, albeit mis-localized, in the cell when fused to a heterologous MTS from MreB. While this swap suggests a link between puncta formation and membrane binding the relationship between puncta and polymerization has not been established (see comment 1).

      We show that a BacA variant lacking the MTS regains the ability to form membrane-associated foci when fused to the MTS of MreB. In contrast, a similar variant that additionally carries the F130R exchange (preventing its polymerization) shows a diffuse cytoplasmic localization. In addition, we show that the F130R exchange leads to a loss of membrane binding and to a considerable increase in the mobility of the variants carrying the MreB MTS. Together, these results strongly support the hypothesis that membrane binding and polymerization act synergistically to establish localized bactofilin assemblies.

      (5) The authors provide no primary data for single molecule tracking. There is no tracking mapped onto microscopy images to show membrane localization or lack of localization in MTS deletion/ variants. A known soluble protein (e.g. unfused mVenus) and a known membrane bound protein would serve as valuable controls to interpret the data presented. It also is unclear why the authors chose to report molecular dynamics as mean squared displacement rather than mean squared displacement per unit time, and the number of localizations is not indicated. Extrapolating from the graph in figure 4 D for example, it looks like WT BacA-mVenus would have a mobility of 0.5 (0.02/0.04) micrometers squared per second which is approaching diffusive behavior. Further justification/details of their analysis method is needed. It's also not clear how one should interpret the finding that several of the double point mutants show higher displacement than deleting the entire MTS. These experiments as they stand don't account for any other cause of molecular behavior change and assume that a decrease in movement is synonymous with membrane binding.

      We agree that a more in-depth analysis of the single-molecule-tracking data would be helpful to support our conclusions.  We will map the reads on the cells, although the loss of membrane localization of BacA variants with a defective MTS is already obvious in the widefield fluorescence images. Moreover, we will perform additional measurements on soluble mVenus and a membrane-associated variant of mVenus for comparison and address the other issues raised here.

      The single-molecule tracking data alone are certainly not sufficient to draw firm conclusions on the relationship between membrane binding and protein mobility. However, our other in vivo and in vitro analyses indicate a very clear correlation between the mobility of BacA and its ability to interact with the membrane and polymerize (processes that synergistically promote each other).

      (6) The experiments that map the interaction surface between the N-terminal unstructured region of PbpC and a specific part of the BacA bactofilin domain seem distinct from the main focus of the paper and the data somewhat preliminary. While the PbpC side has been probed by orthogonal approaches (mutation with localization in cells and affinity in vitro), the BacA region side has only been suggested by the deuterium exchange experiment and needs some kind of validation.

      The results of the HDX analysis per se are not preliminary and clearly indicate a change in the accessibility of surface-exposed residues in the central bactofilin domain. We agree, however, that additional experiments would be required to verify the binding site suggested by these data. This aspect is indeed not the main focus of the paper. We included the analysis of the interaction between PbpC and BacA because we see effects of membrane binding/polymerization on the BacA-PbpC interaction and thus on the physiological function of BacA in C. crescentus.

      Reviewer #2:

      Summary:

      The authors of this study investigated the membrane-binding properties of bactofilin A from Caulobacter crescentus, a classic model organism for bacterial cell biology. BacA was the progenitor of a family of cytoskeletal proteins that have been identified as ubiquitous structural components in bacteria, performing a range of cell biological functions. Association with the cell membrane is a common property of the bactofilins studied and is thought to be important for functionality. However, almost all bactofilins lack a transmembrane domain. While membrane association has been attributed to the unstructured N-terminus, experimental evidence had yet to be provided. As a result, the mode of membrane association and the underlying molecular mechanics remained elusive.

      Liu et al. analyze the membrane binding properties of BacA in detail and scrutinize molecular interactions using in-vivo, in-vitro and in-silico techniques. They show that a few N-terminal amino acids are important for membrane association or proper localization and suggest that membrane association promotes polymerization. Bioinformatic analyses revealed conserved lineage-specific N-terminal motifs indicating a conserved role in protein localization. Using HDX analysis they also identify a potential interaction site with PbpC, a morphogenic cell wall synthase implicated in Caulobacter stalk synthesis. In a complementary approach, they pinpoint the bactofilin-interacting region within the PbpC C-terminus, known to interact with bactofilin. They further show that BacA localization is independent of PbpC.

      Strengths

      These data significantly advance the understanding of the membrane binding determinants of bactofilins and thus their function at the molecular level. The major strength of the comprehensive study is the combination of complementary in vivo, in vitro and bioinformatic/simulation approaches, the results of which are consistent.

      We thank Reviewer #2 for the positive evaluation of our paper and for the constructive criticism sent to us in the non-public review. We will address the points raised in a revised version of the manuscript.

      Weaknesses:

      The results are limited to protein localization and interaction, as there is no data on phenotypic effects. Therefore, the cell biological significance remains somewhat underrepresented.

      We agree that it would be interesting to investigate the phenotypic effects caused by a defect of BacA in membrane binding. We will investigate PbpC localization and stalk length in phosphate-limited medium for mutants producing MTS-deficient BacA variants and include these data in the revised version of the manuscript. However, we would like to point out that the relevance of our findings goes beyond the C. crescentus system, because the MTS and its role for bactofilin function is likely to be conserved in many other species.

    1. eLife Assessment

      This study presents an important finding regarding a significant, understudied question: How does adaptation affect spatial frequency processing in the human visual cortex? Using both psychophysics and neuroimaging, the authors conclude that adaptation induces changes in perceived spatial frequency and population receptive field (pRF) size, depending on the adaptation state. Specifically, adapting to a low spatial frequency increases perceived spatial frequency and results in smaller pRFs, whereas adapting to a high spatial frequency decreases perceived spatial frequency and leads to broader pRFs. These results offer an explanation for previous seemingly conflicting findings regarding the effects of adaptation on size illusions and the evidence is solid; however, including a clear, direct comparison between pRF sizes in the high-adapted and low-adapted conditions would further strengthen the argument.

    2. Reviewer #1 (Public review):

      Summary:

      This paper tests the hypothesis that neuronal adaptation to spatial frequency affects the estimation of spatial population receptive field sizes as commonly measured using the pRF paradigm in fMRI. To this end, the authors modify a standard pRF setup by presenting either low or high SF (near full field) adaptation stimuli prior to the start of each run and interleaved between each pRF bar stimulus. The hypothesis states that adaptation to a specific spatial frequency (SF) should affect only a specific subset of neurons in a population (measured with an fMRI voxel), leaving the other neurons in the population intact, resulting in a shift in the tuning of the voxel in the opposite direction of the adapted stimulus (so high SF adaptation > larger pRF size and vice versa). The paper shows that this 'repelling' effect is robustly detectable psychophysically and is evident in pRF size estimates after adaptation in line with the hypothesized direction, thereby demonstrating a link between SF tuning and pRF size measurements in the human visual cortex.

      Strengths:

      The paper introduces a new experimental design to study the effect of adaptation on spatial tuning in the cortex, nicely combining the neuroimaging analysis with a separate psychophysical assessment.

      The paper includes careful analyses and transparent reporting of single-subject effects, and several important control analyses that exclude alternative explanations based on perceived contrast or signal-to-noise differences in fMRI.

      The paper contains very clear explanations and visualizations, and a carefully worded Discussion that helpfully contextualizes the results, elucidating prior findings on the effect of spatial frequency adaptation on size illusion perception.

      Weaknesses:

      The fMRI experiments use a relatively small sample (n=8), and not all participants consistently show the predicted pattern in all ROIs. For example, one subject shows a strong effect in the pRF size estimates in the opposite direction in V1. It's not clear if this subject is also in the psychophysical experiment and if there is perhaps a behavioral correlate of this deviant pattern. The addition of a behavioral task in the scanner testing the effect of adaptation could perhaps have helped clarify this (although arguably it's difficult to do psychophysics in the scanner). Although the effects are clearly robust at the group level here, a larger sample size could clarify how common such deviant patterns are, and potentially allow for the assessment of individual differences in adaptation effects on spatial tuning as measured with fMRI, and their perceptual implications.

      The psychophysical experiment in which the perceptual effects are shown included a neutral condition, which allowed for establishing a baseline for each subject and the discovery of an asymmetry in the effects, with stronger perceptual effects after high SF adaptation compared to low SF. This neutral condition was lacking in fMRI, and thus - as acknowledged - this asymmetry could not be tested at the neural level. Its absence also precludes comparing the obtained pRF estimates to the typical ranges found using standard pRF mapping procedures (without adaptation), or comparing the SNR in the adaptation pRF paradigm with that of a regular paradigm, etc.

      The results indicate quite some variability in the magnitude of the shift in pRF size across eccentricities and ROIs (Figure 5B). It would be interesting to know more about the sources of this variability, and if there are other effects of adaptation on the estimated retinotopic maps other than on pRF size (there is one short supplementary section on the effects on eccentricity tuning, but not polar angle).

    3. Reviewer #2 (Public review):

      The manuscript "Spatial frequency adaptation modulates population receptive field sizes" is a heroic attempt to untangle a number of visual phenomena related to spatial frequency using a combination of psychophysical experiments and functional MRI. While the paper clearly offers an interesting and clever set of measurements supporting the authors' hypothesis, my enthusiasm for its findings is somewhat dampened by the small number of subjects, high noise, and lack of transparency in the report. Despite several of the methods being somewhat heuristic and/or difficult to understand, the authors do not appear to have released the data or source code nor to have committed to doing so, and the particular figures in the paper and supplements give a view of the data that I am not confident is a complete one. If either data or source code for the analyses and figures were provided, this concern could be largely mitigated, but the explanation of the methods is not sufficient for me to be anywhere near confident that an expert could reproduce these results, even starting from the authors' data files.

      Major Concerns:

      I feel that the authors did a nice job with the writing overall and that their explanation of the topic of spatial frequency (SF) preferences and pRFs in the Introduction was quite nice. One relatively small critique is that there is not enough explanation as to how SF adaptation would lead to changes in pRF size theoretically. In a population RF, my assumption is that neurons with both small and large RFs are approximately uniformly distributed around the center of the population. (This distribution is obviously not uniform globally, but at least locally, within a population like a voxel, we wouldn't expect the small RFs to be on average nearer the voxel's center than the voxel's edges.) Why then would adaptation to a low SF (which the authors hypothesize results in higher relative responses from the neurons with smaller RFs) lead to a smaller pRF? The pRF size will not be a function of the mean of the neural RF sizes in the population (at least not the neural RF sizes alone). A signal driven by smaller RFs is not the same as a signal driven by RFs closer to the center of the population, which would more clearly result in a reduction of pRF size. The illustration in Figure 1A implies that this is because there won't be as many small RFs close to the edge of the population, but there is clearly space in the illustration for more small RFs further from the population center that the authors did not draw. On the other hand, if the point of the illustration is that some neurons will have large RFs that fall outside of the population center, then this ignores the fact that such RFs will have low responses when the stimulus partially overlaps them. This is not at all to say that I think the authors are wrong (I don't) - just that I think the text of the manuscript presents a bit of visual intuition in place of a clear model for one of the central motivations of the paper.

      The fMRI methods are clear enough to follow, but I find it frustrating that throughout the paper, the authors report only normalized R2 values. The fMRI stimulus is a very interesting one, and it is thus interesting to know how well pRF models capture it. This is entirely invisible due to the normalization. This normalization choice likely leads to additional confusion, such as why it appears that the R2 in V1 is nearly 0 while the confidence in areas like V3A is nearly 1 (Figure S2). I deduced from the identical underlying curvature maps in Figures 4 and S2 that the subject in Figure 4 is in fact Participant 002 of Figure S2, and, assuming this deduction is correct, I'm wondering why the only high R2 in that participant's V1 (per Figure S2) seems to correspond to what looks like noise and/or signal dropout to me in Figure 4. If anything, the most surprising finding of this whole fMRI experiment is that SF adaptation seems to result in a very poor fit of the pRF model in V1 but a good fit elsewhere; this observation is the complete opposite of my expectations for a typical pRF stimulus (which, in fairness, this manuscript's stimulus is not). Given how surprising this is, it should be explained/discussed. It would be very helpful if the authors showed a map of average R2 on the fsaverage surface somewhere along with a map of average normalized R2 (or maps of each individual subject).

      On page 11, the authors assert that "Figure 4c clearly shows a difference between the two conditions, which is evident in all regions." To be honest, I did not find this to be clear or evident in any of the highlighted regions in that figure, though close inspection leads me to believe it could be true. This is a very central point, though, and an unclear figure of one subject is not enough to support it. The plots in Figure 5 are better, but there are many details missing. What thresholding was used? Could the results in V1 be due to the apparently small number of data points that survive thresholding (per Figure S2)? I would very much like to see a kernel density plot of the high-adapted (x-axis) versus low-adapted (y-axis) pRF sizes for each visual area. This seems like the most natural way to evaluate the central hypothesis, but it's notably missing.

      Regarding Figure 4, I was curious why the authors didn't provide a plot of the difference between the PRF size maps for the high-adapted and low-adapted conditions in order to highlight these apparent differences for readers. So I cut the image in half (top from bottom), aligned the top and bottom halves of the figure, and examined their subtraction. (This was easy to do because the boundary lines on the figure disappear in the difference figure when they are aligned correctly.) While this is hardly a scientific analysis (the difference in pixel colors is not the difference in the data) what I noticed was surprising: There are differences in the top and bottom PRF size maps, but they appear to correlate spatially with two things: (1) blobs in the PRF size maps that appear to be noise and (2) shifts in the eccentricity maps between conditions. In fact, I suspect that the difference in PRF size across voxels correlates very strongly with the difference in eccentricity across voxels. Could the results of this paper in fact be due not to shifts in PRF size but shifts in eccentricity? Without a better analysis of the changes in eccentricity and a more thorough discussion of how the data were thresholded and compared, this is hard to say.

      While I don't consider myself an expert on psychophysics methods, I found the sections on both psychophysical experiments easy to follow and the figures easy to understand. The one major exception to this is the last paragraph of section 4.1.2, which I am having trouble following. I do not think I could reproduce this particular analysis based on the text, and I'm having a hard time imagining what kind of data would result in a particular PSE. This needs to be clearer, ideally by providing the data and analysis code.

      Overall, I think the paper has good bones and provides interesting and possibly important data for the field to consider. However, I'm not convinced that this study will replicate in larger datasets - in part because it is a small study that appears to contain substantially noisy data but also because the methods are not clear enough. If the authors can rewrite this paper to include clearer depictions of the data, such as low- and high-adapted pRF size maps for each subject, per visual-area 2D kernel density estimates of low- versus high-adapted pRF sizes for each voxel/vertex, clear R2 and normalized-R2 maps, this could be much more convincing.

    4. Reviewer #3 (Public review):

      This is a well-designed study examining an important, surprisingly understudied question: how does adaptation affect spatial frequency processing in the human visual cortex? Using a combination of psychophysics and neuroimaging, the authors test the hypothesis that spatial frequency tuning is shifted to higher or lower frequencies, depending on the preadapted state (low or high s.f. adaptation). They do so by first validating the phenomenon psychophysically, showing that adapting to 0.5 cpd stimuli causes an increase in perceived s.f., and 3.5 cpd causes a relative decrease in perceived s.f. They then port the same stimuli to a neuroimaging study, in which population receptive fields are measured under high and low spatial frequency adaptation states. They find that adaptation changes pRF size, depending on adaptation state: adapting to high s.f. led to broader overall pRF sizes across the early visual cortex, whereas adapting to low s.f. led to smaller overall pRF sizes. Finally, the authors carry out a control experiment to psychophysically rule out the possibility that the perceived contrast change with adaptation may have given rise to these imaging results (this doesn't appear to be the case). All in all, I found this to be a good manuscript: the writing is taut, and the study is well designed. There are a few points of clarification that I think would help, though, including a little more detail about the pRF analyses carried out in this study. Moreover, one weakness is that the sample size is relatively small, given the variability in the effects.

      (1) The pRF mapping stimuli and paradigm are slightly unconventional. This is, of course, fairly necessary to assess the question at hand. But, unless I missed it, there is a potentially critical piece of the analyses that I couldn't find in the results or methods: is the top-up adapter incorporated into the inputs for the pRF analyses, or did the analysis simply estimate pRF size in response to the pRF mapping bar? Ignoring the large, full field-ish top-up seems like it might be dismissing an important nonlinearity in RF response to that aspect of the display (including that it had different s.f. content from the mapping stimulus), especially because it occurred 50% of the time during the pRF mapping procedure. While the bar/top-up were events sub-TR, you could still model the pRF probe + top-up response, then downsample to TR level afterwards. In any case, to fully understand this, some more detail is needed here regarding the pRF fitting procedure.

      (2) I appreciate the eccentricity-dependent breakdown in Figure 5b. However, it would be informative to have included the actual plots of pRF size as a function of eccentricity, for the two conditions individually, in addition to the difference effects depicted in 5b.

      (3) I know the N is small for this, but did the authors take a look at whether there was any relationship between the magnitude of the psychophysical effect and the change in pRF size, per individual? This is probably underpowered but could be worth a peek.

    5. Author response:

      We thank the reviewers for their valuable comments. Our revision will address their recommendations and clarify any misconceptions. The main points we plan to amend are as follows:

      Direct comparison of pRF sizes

      We may have misunderstood this comment in the eLife assessment. We believe our original analyses and the figures already provided a “direct comparison between pRF sizes in the high-adapted and low-adapted conditions”. Specifically, we included a figure showing the histograms of pRF sizes in both conditions, and also reported statistical tests to compare conditions both within each participant and across the group. However, we now realize these comparisons might not be as clear to readers as we intended, which would explain Reviewer #2’s interpretations. To clarify, in our revised version we will instead show 2D plots comparing pRF sizes between conditions as suggested by Reviewer #2, and also show the pRF size plotted against eccentricity (rather than only the difference) as suggested by Reviewer #3.

      Data sharing 

      The behavioral data, fMRI data (where ethically permissible), stimulus-generation code, statistical analyses, and fMRI stimulus video are already publicly available at the link: https://osf.io/9kfgx/. However, we unfortunately failed to include the link in the preprint. We apologize for this oversight. It will be included in the revision. The repository now also contains a script for simulated adaptation effects on pRF size used in our response to Reviewer #2. Moreover, for transparency, we will include plots of all the pRF parameter maps for all participants, including pRF size, polar angle, eccentricity, normalized R2, and raw R2.

      Sample size

      The reviewers shared concerns about the sample size of our study. We disagree that this is a weakness of our study. It is important to note that large sample sizes are not necessary to obtain conclusive results, especially when the research aims to test whether an effect exists, rather than finding out how strong the effect is on average in a population (Schwarzkopf & Huang, 2024, currently out as preprint, but in press at Psychological Methods). Our results showed robust within-subject effects, consistent across multiple visual regions in most individual participants. A larger sample size would not necessarily improve the reliability of our findings. Treating each individual as an independent replication, our results suggest a high probability that they would replicate in each additional participant we could scan. 

      Reviewer #1:

      We thank the reviewer for their careful evaluation and positive comments. We will include a more detailed discussion about the issues pointed out, and an additional plot showing the polar angle for both adapter conditions. In line with previous work on the reliability of pRF estimates (van Dijk, de Haas, Moutsiana, & Schwarzkopf, 2016; Senden, Reithler, Gijsen, & Goebel, 2014), both polar angle and eccentricity maps are very stable between the two adaptation conditions.

      Reviewer #2:

      We thank the reviewer for their comments - we will improve how we report key findings which we hope will clarify matters raised by the reviewer.

      RF positions in a voxel

      The reviewer’s comments suggest that they may have misunderstood the diagram (Figure 1A) illustrating the theoretical basis of the adaptation effect, likely due to us inadvertently putting the small RFs in the middle of the illustration. We will change this figure to avoid such confusion.

      Theoretical explanation of adaptation effect

      The reviewer’s explanation for how adaptation should affect the size of pRF averaging across individual RFs is incorrect. When selecting RFs from a fixed range of semi-uniformly distributed positions (as in an fMRI voxel), the average position of RFs (corresponding to pRF position) is naturally near the center of this range. The average size (corresponding to pRF size) reflects the visual field coverage of these individual RFs. This aggregate visual field coverage thus also reflects the individual sizes. When large RFs have been adapted out, this means the visual field coverage at the boundaries is sparser, and the aggregate pRF is therefore smaller. The opposite happens when adapting out the contribution of small RFs. We demonstrate this with a simple simulation at this OSF link: https://osf.io/ebnky/.
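      The coverage argument above can be illustrated with a minimal sketch. This is not the script at the OSF link, only a toy version written under stated assumptions: visual space is one-dimensional, each RF is a unit-peak Gaussian, RF centres are uniform within a fixed range, RF sizes are uniform between 0.2 and 2 deg, RFs above the median size are treated as the ones suppressed by a low-SF adapter, and adaptation is modelled as a gain of 0.3. All names and numbers are hypothetical.

```python
import numpy as np

def aggregate_sigma(centers, sigmas, gains, x):
    """Spread (standard deviation) of the summed response profile of 1D Gaussian RFs."""
    # One row per RF: gain-scaled, unit-peak Gaussian sampled on the grid x.
    profiles = gains[:, None] * np.exp(
        -0.5 * ((x[None, :] - centers[:, None]) / sigmas[:, None]) ** 2
    )
    total = profiles.sum(axis=0)
    weights = total / total.sum()  # normalise so the aggregate profile sums to 1
    mean = np.sum(x * weights)
    return float(np.sqrt(np.sum(weights * (x - mean) ** 2)))

rng = np.random.default_rng(0)
n_rf = 1000
centers = rng.uniform(-1.0, 1.0, n_rf)  # RF centres within the voxel's range (deg, assumed)
sigmas = rng.uniform(0.2, 2.0, n_rf)    # individual RF sizes (deg, assumed)
x = np.linspace(-12.0, 12.0, 2401)      # grid wide enough to contain every RF

large = sigmas > np.median(sigmas)
gain_none = np.ones(n_rf)
gain_low_sf = np.where(large, 0.3, 1.0)   # low-SF adapter: large RFs suppressed (assumption)
gain_high_sf = np.where(large, 1.0, 0.3)  # high-SF adapter: small RFs suppressed (assumption)

sigma_none = aggregate_sigma(centers, sigmas, gain_none, x)
sigma_low_sf = aggregate_sigma(centers, sigmas, gain_low_sf, x)
sigma_high_sf = aggregate_sigma(centers, sigmas, gain_high_sf, x)

print(f"no adaptation:   {sigma_none:.3f}")
print(f"low-SF adapter:  {sigma_low_sf:.3f}")
print(f"high-SF adapter: {sigma_high_sf:.3f}")
```

      In this toy model the RF centres are identical in all three cases, so any change in the aggregate spread comes only from which RF sizes dominate the summed profile.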

      Figure S2 

      It is not actually possible to compare R2 between regions by looking at Figure S2 because it shows the pRF size change, not R2. Therefore, the arguments Reviewer #2 made based on their interpretation of the figure are not valid. Just as the reviewer expected, V1 is one of the brain regions with good pRF model fits. In our revision, we will include normalized and raw R2 maps to make this more obvious to the readers and provide additional explanations.

      V1 appeared essentially empty in that plot primarily due to the sigma threshold we selected, which was unintentionally more conservative than those applied in our analyses and other figures. We apologize for this mistake and will correct it in the revised version by including a plot with the appropriate sigma threshold.

      Thresholding details 

      Thresholding information was included in our original manuscript; however, we will include more information in the figure captions to make it more obvious.

      2D plots will replace histograms

      We thank the reviewer for this suggestion. The manuscript contained histograms showing the distribution of pRF size for both adaptation conditions for each participant and visual area (Figure S1). However, we agree that 2D plots better communicate the difference in pRF parameters between conditions, so we will replace this figure. We will consider 2D kernel density plots as suggested by the reviewer; however, such plots can obscure distributional anomalies so they may not be the optimal choice and we may opt to show transparent scatter plots of individual pRFs instead.

      (proportional) pRF size-change map 

      The reviewer requests pRF size difference maps. Figure S2 in fact demonstrates the proportional difference between the pRF sizes of the two adaptation conditions. Instead of simply taking the difference, we believe showing the proportional change map is more sensible because overall pRF size varies considerably between visual regions. We will explain this more clearly in our revision. 
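      A short sketch of why a proportional map can be easier to read than a raw difference map. The normalization used here (difference divided by the low-adapted size) is an assumption made for illustration; the definition used for Figure S2 may differ, and the example values are invented.

```python
import numpy as np

def proportional_change(sigma_high, sigma_low):
    """Change in pRF size relative to the low-adapted size (assumed normalisation)."""
    sigma_high = np.asarray(sigma_high, dtype=float)
    sigma_low = np.asarray(sigma_low, dtype=float)
    return (sigma_high - sigma_low) / sigma_low

# Invented pRF sizes (deg) for a small-pRF region and a large-pRF region.
sigma_low = np.array([1.0, 4.0])
sigma_high = np.array([1.1, 4.4])

raw_diff = sigma_high - sigma_low
prop_diff = proportional_change(sigma_high, sigma_low)

print("raw difference:     ", raw_diff)
print("proportional change:", prop_diff)
```

      Both invented regions change by the same fraction, yet the raw differences are not equal, which is the situation a proportional map is meant to handle.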

      pRF eccentricity plot 

      “I suspect that the difference in PRF size across voxels correlates very strongly with the difference in eccentricity across voxels.”

      Our manuscript already contains a supplementary plot (Figure S4 B) comparing the eccentricity between adapter conditions, showing no notable shift in eccentricities except in V3A - but that is a small region and the results are generally more variable. We will comment more on this finding in the main text and explain this figure in more detail. 

      To the reviewer’s point, even if there were an appreciable shift in eccentricity between conditions (as they suggest may have happened for the example participant we showed), this does not mean that the pRF size effect is “due [...] to shifts in eccentricity.” Parameters in a complex multi-dimensional model like the pRF are not independent. There is no way of knowing whether a change in one parameter is causally linked with a change in another. We can only report the parameter estimates the model produces. 

      In fact, it is conceivable that adaptation causes both: changes in pRF size and eccentricity. If more central or peripheral RFs tend to have smaller or larger RFs, respectively, then adapting out one part of the distribution will shift the average accordingly. However, as we already established, we find no compelling evidence that pRF eccentricity changes dramatically due to adaptation, while pRF size does. We will illustrate this using the 2D plots in our revision.

      Reviewer #3:

      We thank the reviewer for their comments.

      pRF model

      Top-up adapters were not modelled in our analyses because they are shared events in all TRs, critically also including the “blank” periods, providing a constant source of signal. Therefore modelling them separately cannot meaningfully change the results. However, the reviewer makes a good suggestion that it would be useful to mention this in the manuscript, so we will add a discussion of this point.

      pRF size vs eccentricity

      We will add a plot showing pRF size in the two adaptation conditions (in addition to the pRF size difference) as a function of eccentricity.

      Correlation with behavioral effect

      In the original manuscript, we pointed out why the correlation between the magnitude of the behavioral effect and the pRF size change is not an appropriate test for our data. First, the reviewer is right that a larger sample size would be needed to reliably detect such a between-subject correlation. More importantly, as per our recruitment criteria for the fMRI experiment, we did not scan participants showing weak perceptual effects. This limits the variability in the perceptual effect and makes correlation inapplicable.

      References

      van Dijk, J. A., de Haas, B., Moutsiana, C., & Schwarzkopf, D. S. (2016). Intersession reliability of population receptive field estimates. NeuroImage, 143, 293–303. https://doi.org/10.1016/J.NEUROIMAGE.2016.09.013

      Schwarzkopf, D. S., & Huang, Z. (2024). A simple statistical framework for small sample studies. BioRxiv, 2023.09.19.558509. https://doi.org/10.1101/2023.09.19.558509

      Senden, M., Reithler, J., Gijsen, S., & Goebel, R. (2014). Evaluating population receptive field estimation frameworks in terms of robustness and reproducibility. PloS One, 9(12). https://doi.org/10.1371/JOURNAL.PONE.0114054

    1. Author response:

      eLife Assessment 

      This valuable study investigates how the neural representation of individual finger movements changes during the early period of sequence learning. By combining a new method for extracting features from human magnetoencephalography data and decoding analyses, the authors provide incomplete evidence of an early, swift change in the brain regions correlated with sequence learning, including a set of previously unreported frontal cortical regions. The addition of more control analyses to rule out that head movement artefacts influence the findings, and to further explain the proposal of offline contextualization during short rest periods as the basis for performance improvement would strengthen the manuscript.

      We appreciate the Editorial assessment on our paper’s strengths and novelty.  We have implemented additional control analyses to show that neither task-related eye movements nor increasing overlap of finger movements during learning account for our findings, which are that contextualized neural representations in a network of bilateral frontoparietal brain regions actively contribute to skill learning.  Importantly, we carried out additional analyses showing that contextualization develops predominantly during rest intervals.

      Public Reviews:

      We thank the Reviewers for their comments and suggestions, prompting new analyses and additions that strengthened our report.

      Reviewer #1 (Public review): 

      Summary: 

      This study addresses the issue of rapid skill learning and whether individual sequence elements (here: finger presses) are differentially represented in human MEG data. The authors use a decoding approach to classify individual finger elements and accomplish an accuracy of around 94%. A relevant finding is that the neural representations of individual finger elements dynamically change over the course of learning. This would be highly relevant for any attempts to develop better brain machine interfaces - one now can decode individual elements within a sequence with high precision, but these representations are not static but develop over the course of learning. 

      Strengths: The work follows a large body of work from the same group on the behavioural and neural foundations of sequence learning. The behavioural task is well established and neatly designed to allow for tracking learning and how individual sequence elements contribute. The inclusion of short offline rest periods between learning epochs has been influential because it has revealed that a lot, if not most of the gains in behaviour (ie speed of finger movements) occur in these so-called micro-offline rest periods. The authors use a range of new decoding techniques, and exhaustively interrogate their data in different ways, using different decoding approaches. Regardless of the approach, impressively high decoding accuracies are observed, but when using a hybrid approach that combines the MEG data in different ways, the authors observe decoding accuracies of individual sequence elements from the MEG data of up to 94%. 

      We have previously shown that neural replay of MEG activity representing the practiced skill correlated with micro-offline gains during rest intervals of early learning,1 consistent with the recent report that hippocampal ripples during these offline periods predict human motor sequence learning2. However, decoding accuracy in our earlier work1 needed improvement. Here, we reported a strategy to improve decoding accuracy that could benefit future studies of neural replay or BCI using MEG.

      Weaknesses: 

      There are a few concerns which the authors may well be able to resolve. These are not weaknesses as such, but factors that would be helpful to address as these concern potential contributions to the results that one would like to rule out. Regarding the decoding results shown in Figure 2 etc, a concern is that within individual frequency bands, the highest accuracy seems to be within frequencies that match the rate of keypresses. This is a general concern when relating movement to brain activity, so is not specific to decoding as done here. As far as reported, there was no specific restraint to the arm or shoulder, and even then it is conceivable that small head movements would correlate highly with the vigor of individual finger movements. This concern is supported by the highest contribution in decoding accuracy being in middle frontal regions - midline structures that would be specifically sensitive to movement artefacts and don't seem to come to mind as key structures for very simple sequential keypress tasks such as this - and the overall pattern is remarkably symmetrical (despite being a unimanual finger task) and spatially broad. This issue may well be matching the time course of learning, as the vigor and speed of finger presses will also influence the degree to which the arm/shoulder and head move. This is not to say that no useful information is contained within either of the frequencies or broadband data. But it raises the question of whether a lot is dominated by movement "artefacts" and one may get a more specific answer if removing any such contributions.

      Reviewer #1 expresses concern that the combination of the low-frequency narrow-band decoder results, and the bilateral middle frontal regions displaying the highest average intra-parcel decoding performance across subjects is suggestive that the decoding results could be driven by head movement or other artefacts.

      Head movement artefacts are highly unlikely to contribute meaningfully to our results for the following reasons. First, in addition to ICA denoising, all “recordings were visually inspected and marked to denoise segments containing other large amplitude artifacts due to movements” (see Methods). Second, the response pad was positioned in a manner that minimized wrist, arm or more proximal body movements during the task. Third, while head position was not monitored online for this study, the head was restrained using an inflatable air bladder, and head position was assessed at the beginning and at the end of each recording. Head movement did not exceed 5mm between the beginning and end of each scan for all participants included in the study. Fourth, we agree that despite the steps taken above, it is possible that minor head movements could still contribute to some remaining variance in the MEG data in our study. The Reviewer states a concern that “it is conceivable that small head movements would correlate highly with the vigor of individual finger movements”. However, in order for any such correlations to meaningfully impact decoding performance, such head movements would need to: (A) be consistent and pervasive throughout the recording (which might not be the case if the head movements were related to movement vigor and vigor changed over time); and (B) systematically vary between different finger movements, and also between the same finger movement performed at different sequence locations (see 5-class decoding performance in Figure 4B). The possibility of any head movement artefacts meeting all these conditions is extremely unlikely.

      Given the task design, a much more likely confound in our estimation would be the contribution of eye movement artefacts to the decoder performance (an issue appropriately raised by Reviewer #3 in the comments below). Remember from Figure 1A in the manuscript that an asterisk marks the current position in the sequence and is updated at each keypress. Since participants make very few performance errors, the position of the asterisk on the display is highly correlated with the keypress being made in the sequence. Thus, it is possible that if participants are attending to the visual feedback provided on the display, they may move their eyes in a way that is systematically related to the task.  Since we did record eye movements simultaneously with the MEG recordings (EyeLink 1000 Plus; Fs = 600 Hz), we were able to perform a control analysis to address this question. For each keypress event during trials in which no errors occurred (which is the same time-point that the asterisk position is updated), we extracted three features related to eye movements: 1) the gaze position at the time of asterisk position update (or keyDown event), 2) the gaze position 150ms later, and 3) the peak velocity of the eye movement between the two positions. We then constructed a classifier from these features with the aim of predicting the location of the asterisk (ordinal positions 1-5) on the display. As shown in the confusion matrix below (Author response image 1), the classifier failed to perform above chance levels (Overall cross-validated accuracy = 0.21817):

      Author response image 1.

      Confusion matrix showing that three eye movement features fail to predict asterisk position on the task display above chance levels (Fold 1 test accuracy = 0.21718; Fold 2 test accuracy = 0.22023; Fold 3 test accuracy = 0.21859; Fold 4 test accuracy = 0.22113; Fold 5 test accuracy = 0.21373; Overall cross-validated accuracy = 0.21817). Since the ordinal position of the asterisk on the display is highly correlated with the ordinal position of individual keypresses in the sequence, this analysis provides strong evidence that keypress decoding performance from MEG features is not explained by systematic relationships between finger movement behavior and eye movements (i.e. – behavioral artefacts).
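
For readers who want to see the logic of this control analysis, the following is a minimal sketch only, not the code used in the study. It assumes a nearest-centroid classifier (the classifier type used for the eye-movement analysis is not specified here) and synthetic stand-in features that carry no information about the five ordinal positions, so the cross-validated accuracy should sit near the chance level of 1/5. All function and variable names are hypothetical.

```python
import random


def nearest_centroid_cv(features, labels, n_folds=5, seed=0):
    """Return per-fold test accuracies of a nearest-centroid classifier."""
    rng = random.Random(seed)
    order = list(range(len(labels)))
    rng.shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        # Class centroids are estimated from training samples only.
        sums, counts = {}, {}
        for i in train_idx:
            acc = sums.setdefault(labels[i], [0.0] * len(features[i]))
            for d, value in enumerate(features[i]):
                acc[d] += value
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        correct = 0
        for i in test_idx:
            # Predict the class whose centroid is closest (squared Euclidean).
            predicted = min(
                centroids,
                key=lambda c: sum(
                    (a - b) ** 2 for a, b in zip(features[i], centroids[c])
                ),
            )
            correct += predicted == labels[i]
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic stand-in data (assumption): three features unrelated to position.
data_rng = random.Random(1)
labels = [data_rng.randrange(1, 6) for _ in range(2000)]
features = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in labels]

fold_acc = nearest_centroid_cv(features, labels)
overall = sum(fold_acc) / len(fold_acc)
chance = 1 / 5
print(fold_acc, overall, chance)
```

With uninformative features, the per-fold accuracies scatter around chance, which is the pattern reported in the caption above.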

      In fact, inspection of the eye position data revealed that a majority of participants on most trials displayed random walk gaze patterns around a center fixation point, indicating that participants did not attend to the asterisk position on the display. This is consistent with intrinsic generation of the action sequence, and congruent with the fact that the display does not provide explicit feedback related to performance. A similar real-world example would be manually inputting a long password into a secure online application. In this case, one intrinsically generates the sequence from memory and receives similar feedback about the password sequence position (also provided as asterisks), which is typically ignored by the user. The minimal participant engagement with the visual task display observed in this study highlights another important point – that the behavior in explicit sequence learning motor tasks is highly generative in nature rather than reactive to stimulus cues as in the serial reaction time task (SRTT).  This is a crucial difference that must be carefully considered when designing investigations and comparing findings across studies.

      We observed that keypress decoding accuracy was predominantly driven by contralateral primary sensorimotor cortex in the initial practice trials before transitioning to bilateral frontoparietal regions by trials 11 or 12 as performance gains plateaued. The contribution of contralateral primary sensorimotor areas to early skill learning has been extensively reported in humans and non-human animals1,3-5. Similarly, the increased involvement of bilateral frontal and parietal regions during early skill learning in the non-dominant hand is well known. Enhanced bilateral activation in both frontal and parietal cortex during skill learning has been extensively reported6-11, and appears to be even more prominent during early fine motor skill learning in the non-dominant hand12,13. The frontal regions identified in these studies are known to play crucial roles in executive control14, motor planning15, and working memory6,8,16-18 processes, while the same parietal regions are known to integrate multimodal sensory feedback and support visuomotor transformations6,8,16-18, in addition to working memory19. Thus, it is not surprising that these regions increasingly contribute to decoding as subjects internalize the sequential task. We now include a statement reflecting these considerations in the revised Discussion.

      A somewhat related point is this: when combining voxel and parcel space, a concern is whether a degree of circularity may have contributed to the improved accuracy of the combined data, because it seems to use the same MEG signals twice - the voxels most contributing are also those contributing most to a parcel being identified as relevant, as parcels reflect the average of voxels within a boundary. In this context, I struggled to understand the explanation given, ie that the improved accuracy of the hybrid model may be due to "lower spatially resolved whole-brain and higher spatially resolved regional activity patterns".

      We strongly disagree with the Reviewer’s assertion that the construction of the hybrid-space decoder is circular. To clarify, the base feature set for the hybrid-space decoder constructed for all participants includes whole-brain spatial patterns of MEG source activity averaged within parcels. As stated in the manuscript, these 148 inter-parcel features reflect “lower spatially resolved whole-brain activity patterns” or global brain dynamics. We then independently test how well spatial patterns of MEG source activity for all voxels distributed within individual parcels can decode keypress actions. Again, these intra-parcel spatial patterns, intended to capture “higher spatially resolved regional brain activity patterns”, are each tested independently of one another and independently of the weighting of individual inter-parcel features. These intra-parcel features could, for example, provide additional information about muscle activation patterns or the task environment. These approximately 1150 intra-parcel voxels (on average, with the total number varying between subjects) are then combined with the 148 inter-parcel features to construct the final hybrid-space decoder. In fact, this varied spatial filter approach shares some similarities with the construction of convolutional neural networks (CNNs) used to perform object recognition in image classification applications. One could also view this hybrid-space decoding approach as a spatial analogue to common time-frequency based analyses such as theta-gamma phase amplitude coupling (PAC), which combine information from two or more narrow-band spectral features derived from the same time-series data.

      We directly tested this hypothesis – that spatially overlapping intra- and inter-parcel features portray different information – by constructing an alternative hybrid-space decoder (HybridAlt) that excluded average inter-parcel features which spatially overlapped with intra-parcel voxel features, and comparing the performance to the decoder used in the manuscript (HybridOrig). The prediction was that if the overlapping parcel contained similar information to the more spatially resolved voxel patterns, then removing the parcel features (n=8) from the decoding analysis should not impact performance. In fact, despite making up less than 1% of the overall input feature space, removing those parcels resulted in a significant drop in overall performance greater than 2% (78.15% ± SD 7.03% for HybridOrig vs. 75.49% ± SD 7.17% for HybridAlt; Wilcoxon signed rank test, z = 3.7410, p = 1.8326e-04) (Author response image 2).

      Author response image 2.

      Comparison of decoding performances with two different hybrid approaches. HybridAlt: Intra-parcel voxel-space features of top ranked parcels and inter-parcel features of remaining parcels. HybridOrig: Voxel-space features of top ranked parcels and whole-brain parcel-space features (i.e. – the version used in the manuscript). Dots represent decoding accuracy for individual subjects. Dashed lines indicate the trend in performance change across participants. Note that HybridOrig (the approach used in our manuscript) significantly outperforms the HybridAlt approach, indicating that the excluded parcel features provide unique information compared to the spatially overlapping intra-parcel voxel patterns.
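
As an illustration of the paired comparison described above, here is a minimal sketch of a Wilcoxon signed rank test using the normal approximation. It is not the analysis code used for the manuscript: the function name is hypothetical, zero differences are simply dropped, and no tie or continuity correction is applied (all assumptions). The last line converts the reported z statistic into a two-sided p-value using the same formula.

```python
import math


def wilcoxon_signed_rank_z(x, y):
    """Return (z, two-sided p) for paired samples via the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while (end + 1 < n and
               abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]])):
            end += 1
        mean_rank = (start + end) / 2 + 1  # average rank for tied magnitudes
        for k in range(start, end + 1):
            ranks[ordered[k]] = mean_rank
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction
    z = (w_plus - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical per-subject accuracies (%), for illustration only.
hybrid_orig = [78.0, 81.5, 70.2, 85.1, 76.4, 90.3, 68.8, 79.9]
hybrid_alt = [75.1, 79.0, 69.0, 81.2, 74.9, 86.0, 67.1, 76.3]
z_stat, p_value = wilcoxon_signed_rank_z(hybrid_orig, hybrid_alt)

# Two-sided p-value implied by the z statistic reported in the response.
p_from_reported_z = math.erfc(3.7410 / math.sqrt(2))
print(z_stat, p_value, p_from_reported_z)
```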

      Firstly, there will be a relatively high degree of spatial contiguity among voxels because of the nature of the signal measured, i.e. nearby individual voxels are unlikely to be independent. Secondly, the voxel data gives a somewhat misleading sense of precision; the inversion can be set up to give an estimate for each voxel, but there will not just be dependence among adjacent voxels, but also substantial variation in the sensitivity and confidence with which activity can be projected to different parts of the brain. Midline and deeper structures come to mind, where the inversion will be more problematic than for regions along the dorsal convexity of the brain, and a concern is that in those midline structures, the highest decoding accuracy is seen. 

      We definitely agree with the Reviewer that some intra-parcel features representing neighboring (or spatially contiguous) voxels are likely to be correlated. This has been well documented in the MEG literature20,21 and is a particularly important confound to address in functional or effective connectivity analyses (not performed in the present study). In the present analysis, any correlation between adjacent voxels presents a multi-collinearity problem, which effectively reduces the dimensionality of the input feature space. However, as long as there are multiple groups of correlated voxels within each parcel (i.e. - the effective dimensionality is still greater than 1), the intra-parcel spatial patterns could still meaningfully contribute to the decoder performance. Two specific results support this assertion.

      First, we obtained higher decoding accuracy with voxel-space features [74.51% (± SD 7.34%)] compared to parcel-space features [68.77% (± SD 7.6%)] (Figure 3B), indicating that individual voxels carry more information for decoding the keypresses than the averaged voxel-space (i.e. – parcel-space) features. Second, individual voxels within a parcel showed varying feature importance scores in decoding keypresses (Author response image 3). This finding supports the Reviewer’s assertion that neighboring voxels express similar information, but also shows that the correlated voxels form mini subclusters that are much smaller spatially than the parcel they reside in.

      Author response image 3.

      Feature importance score of individual voxels in decoding keypresses: MRMR was used to rank the individual voxel-space features in decoding keypresses and the min-max normalized MRMR score was mapped to a structural brain surface. Note that individual voxels within a parcel showed different contributions to decoding.
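
The min-max normalization mentioned in this caption can be sketched as follows. This is an illustrative helper only; the function name, the example scores, and the handling of the constant-score case are assumptions rather than details taken from the manuscript.

```python
def min_max_normalize(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no spread to rescale (assumed convention).
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical feature importance scores for four voxels in one parcel.
normalized = min_max_normalize([2.0, 4.0, 6.0, 3.0])
print(normalized)
```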

       

      Some of these concerns could be addressed by recording head movement (with enough precision) to regress out these contributions. The authors state that head movement was monitored with 3 fiducials, and their time courses ought to provide a way to deal with this issue. The ICA procedure may not have sufficiently dealt with removing movement-related problems, but one could eg relate individual components that were identified to the keypresses as another means for checking. An alternative could be to focus on frequency ranges above the movement frequencies. The accuracy for those still seems impressive and may provide a slightly more biologically plausible assessment. 

      We have already addressed the issue of movement-related artefacts in the first response above. With respect to a focus on frequency ranges above movement frequencies, the Reviewer states the “accuracy for those still seems impressive and may provide a slightly more biologically plausible assessment”. First, it is important to note that cortical delta-band oscillations measured with local field potentials (LFPs) in macaques are known to contain important information related to end-effector kinematics22,23, muscle activation patterns24 and temporal sequencing25 during skilled reaching and grasping actions. Thus, there is a substantial body of evidence that low-frequency neural oscillatory activity in this range contains important information about the skill learning behavior investigated in the present study. Second, our own data show (as the Reviewer also points out) that significant information related to the skill learning behavior is also present in higher frequency bands (see Figure 2A and Figure 3—figure supplement 1). As we pointed out in our earlier response to questions about the hybrid space decoder architecture (see above), it is likely that different, yet complementary, information is encoded across different temporal frequencies (just as it is encoded across different spatial frequencies). Again, this interpretation is supported by our data as the highest performing classifiers in all cases (when holding all parameters constant) were always constructed from broadband input MEG data (Figure 2A and Figure 3—figure supplement 1).

      One question concerns the interpretation of the results shown in Figure 4. They imply that during the course of learning, entirely different brain networks underpin the behaviour. Not only that, but they also include regions that would seem rather unexpected to be key nodes for learning and expressing relatively simple finger sequences, such as here. What then is the biological plausibility of these results? The authors seem to circumnavigate this issue by moving into a distance metric that captures the (neural network) changes over the course of learning, but the discussion seems detached from which regions are actually involved; or they offer a rather broad discussion of the anatomical regions identified here, eg in the context of LFOs, where they merely refer to "frontoparietal regions". 

      The Reviewer notes the shift in brain networks driving keypress decoding performance between trials 1, 11 and 36 as shown in Figure 4A. The Reviewer questions whether these substantial shifts in brain network states underpinning the skill are biologically plausible, as well as the likelihood that bilateral superior and middle frontal and parietal cortex are important nodes within these networks.

      First, previous fMRI work in humans performing a similar sequence learning task showed that flexibility in brain network composition (i.e. – changes in brain region members displaying coordinated activity) is up-regulated in novel learning environments and explains differences in learning rates across individuals26.  This work supports our interpretation of the present study data, that brain networks engaged in sequential motor skills rapidly reconfigure during early learning.

      Second, frontoparietal network activity is known to support motor memory encoding during early learning27,28. For example, reactivation events in the posterior parietal29 and medial prefrontal30,31 cortex (MPFC) have been temporally linked to hippocampal replay, and are posited to support memory consolidation across several memory domains32, including motor sequence learning1,33,34. Further, synchronized interactions between MPFC and hippocampus are more prominent during early learning as opposed to later stages27,35,36, perhaps reflecting “redistribution of hippocampal memories to MPFC”27. MPFC contributes to very early memory formation by learning associations between contexts, locations, events and adaptive responses during rapid learning37. Consistently, coupling between hippocampus and MPFC has been shown both during initial memory encoding and, importantly, during the rest immediately following it38,39. Moreover, MPFC activity during initial memory encoding predicts subsequent recall40. Thus, the spatial map required to encode a motor sequence memory may be “built under the supervision of the prefrontal cortex”28, which is also engaged in the development of an abstract representation of the sequence41. In more abstract terms, the prefrontal, premotor and parietal cortices support novice performance “by deploying attentional and control processes” required during early learning42-44. The dorsolateral prefrontal cortex (DLPFC) specifically is thought to engage in goal selection and sequence monitoring during early skill practice45, all consistent with the schema model of declarative memory in which prefrontal cortices play an important role in encoding46,47. Thus, several prefrontal and frontoparietal regions contributing to long-term learning48 are also engaged in early stages of encoding. Altogether, there is strong biological support for the contribution of bilateral prefrontal and frontoparietal regions to decoding during early skill learning.
We now address this issue in the revised manuscript.

      If I understand correctly, the offline neural representation analysis is in essence the comparison of the last keypress vs the first keypress of the next sequence. In that sense, the activity during offline rest periods is actually not considered. This makes the nomenclature somewhat confusing. While it matches the behavioural analysis, having only key presses one can't do it in any other way, but here the authors actually do have recordings of brain activity during offline rest. So at the very least calling it offline neural representation is misleading to this reviewer because what is compared is activity during the last and during the next keypress, not activity during offline periods. But it also seems a missed opportunity - the authors argue that most of the relevant learning occurs during offline rest periods, yet there is no attempt to actually test whether activity during this period can be useful for the questions at hand here. 

      We agree with the Reviewer that our previous “offline neural representation” nomenclature could be misinterpreted. In the revised manuscript we refer to this difference as the “offline neural representational change”. Please note that our previous work did link offline neural activity (i.e. – 16-22 Hz beta power and neural replay density during inter-practice rest periods) to observed micro-offline gains49.

      Reviewer #2 (Public review): 

      Summary 

      Dash et al. asked whether and how the neural representation of individual finger movements is "contextualized" within a trained sequence during the very early period of sequential skill learning by using decoding of MEG signal. Specifically, they assessed whether/how the same finger presses (pressing index finger) embedded in the different ordinal positions of a practiced sequence (4-1-3-2-4; here, the numbers 1 through 4 correspond to the little through the index fingers of the non-dominant left hand) change their representation (MEG feature). They did this by computing either the decoding accuracy of the index finger at the ordinal positions 1 vs. 5 (index_OP1 vs index_OP5) or pattern distance between index_OP1 vs. index_OP5 at each training trial and found that both the decoding accuracy and the pattern distance progressively increase over the course of learning trials. More interestingly, they also computed the pattern distance for index_OP5 for the last execution of a practice trial vs. index_OP1 for the first execution in the next practice trial (i.e., across the rest period). This "off-line" distance was significantly larger than the "on-line" distance, which was computed within practice trials and predicted micro-offline skill gain. Based on these results, the authors conclude that the differentiation of representation for the identical movement embedded in different positions of a sequential skill ("contextualization") primarily occurs during early skill learning, especially during rest, consistent with the recent theory of the "micro-offline learning" proposed by the authors' group. I think this is an important and timely topic for the field of motor learning and beyond.

      Strengths 

      The specific strengths of the current work are as follows. First, the use of temporally rich neural information (MEG signal) has a large advantage over previous studies testing sequential representations using fMRI. This allowed the authors to examine the earliest period (= the first few minutes of training) of skill learning with finer temporal resolution. Second, through the optimization of MEG feature extraction, the current study achieved extremely high decoding accuracy (approx. 94%) compared to previous works. As claimed by the authors, this is one of the strengths of the paper (but see my comments). Third, although some potential refinement might be needed, comparing "online" and "offline" pattern distance is a neat idea. 

      Weaknesses 

      Along with the strengths I raised above, the paper has some weaknesses. First, the pursuit of high decoding accuracy, especially the choice of time points and window length (i.e., 200 msec window starting from 0 msec from key press onset), casts a shadow on the interpretation of the main result. Currently, it is unclear whether the decoding results simply reflect behavioral change or true underlying neural change. As shown in the behavioral data, the key press speed reached 3~4 presses per second already at around the end of the early learning period (11th trial), which means inter-press intervals become as short as 250-330 msec. Thus, in more than 60% of the training period data, the time window for MEG feature extraction (200 msec) spans around 60% or more of the inter-press interval. Considering that the preparation/cueing of subsequent presses starts ahead of the actual press (e.g., Kornysheva et al., 2019) and/or potential online planning (e.g., Ariani and Diedrichsen, 2019), the decoder has likely captured this future press information as well as the signal related to the current key press, independent of the formation of genuine sequential representation (e.g., "contextualization" of individual press). This may also explain the gradual increase in decoding accuracy or pattern distance between index_OP1 vs. index_OP5 (Figure 4C and 5A), which co-occurred with performance improvement, as shorter inter-press intervals are more favorable for dissociating the two index finger presses followed by different finger presses. The compromised decoding accuracies for the control sequences can be explained by similar logic. Therefore, more careful consideration and elaborated discussion seem necessary when trying to both achieve high-performance decoding and assess early skill learning, as it can impact all the subsequent analyses.
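
The arithmetic behind this concern can be made explicit with a short sketch. The 200 msec window and the 3 to 4 presses per second are taken from the comment above; the function name is hypothetical, and evenly spaced presses are assumed for simplicity.

```python
def window_fraction(presses_per_second, window_ms=200.0):
    """Fraction of the mean inter-press interval covered by the decoding window."""
    inter_press_interval_ms = 1000.0 / presses_per_second
    return window_ms / inter_press_interval_ms


for rate in (3, 4):
    interval = 1000.0 / rate
    print(f"{rate} presses/s: interval {interval:.0f} ms, "
          f"window covers {window_fraction(rate):.0%}")
```

Under these assumptions the window covers roughly 60% of the interval at 3 presses per second and 80% at 4 presses per second, so activity related to the upcoming press can fall inside the window used to decode the current one.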

      The Reviewer raises the possibility that (given the windowing parameters used in the present study) an increase in “contextualization” with learning could simply reflect faster typing speeds as opposed to an actual change in the underlying neural representation. The issue can essentially be framed as a mixing problem. As correct sequences are generated at higher and higher speeds over training, MEG activity patterns related to the planning, execution, evaluation and memory of individual keypresses overlap more in time. Thus, increased overlap between the “4” and “1” keypresses (at the start of the sequence) and “2” and “4” keypresses (at the end of the sequence) could artefactually increase contextualization distances even if the underlying neural representations for the individual keypresses remain unchanged (assuming this mixing of representations is used by the classifier to differentially tag each index finger press). If this were the case, it follows that such mixing effects reflecting the ordinal sequence structure would also be observable in the distribution of decoder misclassifications. For example, “4” keypresses would be more likely to be misclassified as “1” or “2” keypresses (or vice versa) than as “3” keypresses. The confusion matrices presented in Figures 3C and 4B and Figure 3—figure supplement 3A in the previously submitted manuscript do not show this trend in the distribution of misclassifications across the four fingers.

      Moreover, if the representation distance is largely driven by this mixing effect, it’s also possible that the increased overlap between consecutive index finger keypresses during the 4-4 transition marking the end of one sequence and the beginning of the next one could actually mask contextualization-related changes to the underlying neural representations and make them harder to detect. In this case, a decoder tasked with separating individual index finger keypresses into two distinct classes based upon sequence position might show decreased performance with learning as adjacent keypresses overlapped in time with each other to an increasing extent. However, Figure 4C in our previously submitted manuscript does not support this possibility, as the 2-class hybrid classifier displays improved classification performance over early practice trials despite greater temporal overlap.

      We also conducted a new multivariate regression analysis to directly assess whether the neural representation distance score could be predicted by the 4-1, 2-4 and 4-4 keypress transition times observed for each complete correct sequence (both predictor and response variables were z-score normalized within-subject). The results of this analysis affirmed that the possible alternative explanation put forward by the Reviewer is not supported by our data (Adjusted R² = 0.00431; F = 5.62). We now include this new negative control analysis result in the revised manuscript.
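
Two ingredients of this control analysis, within-subject z-scoring and the adjusted R² statistic, can be sketched as below. This is a minimal illustration, not the analysis code: the function names and example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
import statistics


def zscore(values):
    """Z-score one subject's values (sample SD, n - 1 denominator; an assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjust R^2 for the number of predictors in an ordinary least squares fit."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical transition times (ms) for one subject, normalized within-subject.
transition_times = [310.0, 280.0, 265.0, 240.0, 255.0]
normalized_times = zscore(transition_times)

# Three predictors, as in the analysis above (4-1, 2-4 and 4-4 transition times).
example = adjusted_r2(r2=0.5, n_obs=11, n_predictors=3)
print(normalized_times, example)
```

An adjusted R² close to zero means that the predictors explain almost none of the variance in the response once the number of predictors is taken into account.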

      Overall, we do strongly agree with the Reviewer that the naturalistic, self-paced, generative task employed in the present study results in overlapping brain processes related to planning, execution, evaluation and memory of the action sequence. We also agree that there are several tradeoffs to consider in the construction of the classifiers depending on the study aim. Given our aim of optimizing keypress decoder accuracy in the present study, the set of trade-offs resulted in representations reflecting more the latter three processes, and less so the planning component. Whether separate decoders can be constructed to tease apart the representations or networks supporting these overlapping processes is an important future direction of research in this area. For example, work presently underway in our lab constrains the selection of windowing parameters in a manner that allows individual classifiers to be temporally linked to specific planning, execution, evaluation or memory-related processes to discern which brain networks are involved and how they adaptively reorganize with learning. Results from the present study (Figure 4—figure supplement 2) showing hybrid-space decoder prediction accuracies exceeding 74% for temporal windows spanning as little as 25ms and located up to 100ms prior to the keyDown event strongly support the feasibility of such an approach.

      Related to the above point, testing only one particular sequence (4-1-3-2-4), aside from the control ones, limits the generalizability of the finding. This also may have contributed to the extremely high decoding accuracy reported in the current study. 

      The Reviewer raises a question about the generalizability of the decoder accuracy reported in our study. Fortunately, a comparison between decoder performances on Day 1 and Day 2 datasets does provide some insight into this issue. As the Reviewer points out, the classifiers in this study were trained and tested on keypresses performed while practicing a specific sequence (4-1-3-2-4). The study was designed this way so as to avoid the impact of interference effects on learning dynamics. The cross-validated performance of classifiers on MEG data collected within the same session was 90.47% overall accuracy (4-class; Figure 3C). We then tested classifier performance on data collected during a separate MEG session conducted approximately 24 hours later (Day 2; see Figure 3—figure supplement 3). We observed a reduction in overall accuracy to 87.11% when tested on MEG data recorded while participants performed the same learned sequence, and to 79.44% when they performed several previously unpracticed sequences. Both changes in accuracy are important with regard to the generalizability of our findings. First, 87.11% performance accuracy for the trained sequence data on Day 2 (a reduction of only 3.36%) indicates that the hybrid-space decoder performance is robust over multiple MEG sessions, and thus, robust to variations in SNR across the MEG sensor array caused by small differences in head position between scans. This indicates a substantial advantage over sensor-space decoding approaches. Furthermore, when tested on data from unpracticed sequences, overall performance dropped an additional 7.67%. This difference reflects the performance bias of the classifier for the trained sequence, possibly caused by high-order sequence structure being incorporated into the feature weights. In the future, it will be important to understand in more detail how random or repeated keypress sequence training data impacts overall decoder performance and generalization.
We strongly agree with the Reviewer that the issue of generalizability is extremely important and have added a new paragraph to the Discussion in the revised manuscript highlighting the strengths and weaknesses of our study with respect to this issue.

      In terms of clinical BCI, one of the potential areas of relevance of the study as claimed by the authors, it is not clear that the specific time window chosen in the current study (up to 200 msec since key press onset) is really useful. In most cases, clinical BCI would target neural signals with no overt movement execution due to patients' inability to move (e.g., Hochberg et al., 2012). Given the time window, the surprisingly high performance of the current decoder may result from sensory feedback and/or planning of subsequent movement, which may not always be available in the clinical BCI context. Of course, the decoding accuracy is still much higher than chance even when using signal before the key press (as shown in Figure 4 Supplement 2), but it is not immediately clear to me how the authors relate their high decoding accuracy based on post-movement signal to clinical BCI settings.

      The Reviewer questions the relevance of the specific window parameters used in the present study for clinical BCI applications, particularly for paretic patients who are unable to produce finger movements or for whom afferent sensory feedback is no longer intact. We strongly agree with the Reviewer that any intended clinical application must carefully consider these specific input feature constraints dictated by the clinical cohort, and in turn impose appropriate and complementary constraints on classifier parameters that may differ from the ones used in the present study. We now highlight this issue in the Discussion of the revised manuscript and relate our present findings to published clinical BCI work within this context.

      One of the important and fascinating claims of the current study is that the "contextualization" of individual finger movements in a trained sequence specifically occurs during short rest periods in very early skill learning, echoing the recent theory of micro-offline learning proposed by the authors' group. Here, I think two points need to be clarified. First, the concept of "contextualization" is kept somewhat blurry throughout the text. It is only at the later part of the Discussion (around line #330 on page 13) that some potential mechanism for the "contextualization" is provided as "what-and-where" binding. Still, it is unclear what "contextualization" actually is in the current data, as the MEG signal analyzed is extracted from 0-200 msec after the keypress. If one thinks something is contextualizing an action, that contextualization should come earlier than the action itself. 

      The Reviewer requests that we: 1) more clearly define our use of the term “contextualization” and 2) provide the rationale for assessing it over a 200ms window aligned to the keyDown event. This choice of window parameters means that the MEG activity used in our analysis was coincident with, rather than preceding, the actual keypresses.  We define contextualization as the differentiation of representation for the identical movement embedded in different positions of a sequential skill. That is, representations of individual action elements progressively incorporate information about their relationship to the overall sequence structure as the skill is learned. We agree with the Reviewer that this can be appropriately interpreted as “what-and-where” binding. We now incorporate this definition in the Introduction of the revised manuscript as requested.

      The window parameters for optimizing the decoding accuracy of individual finger movements were determined using a grid search of the parameter space (a sliding window of variable width between 25 and 350 ms in 25 ms increments, variably aligned from 0 to +100 ms in 10 ms increments relative to the keyDown event). This approach generated 140 different temporal windows for each keypress for each participant, with the final parameter selection determined through comparison of the resulting performance between each decoder. Importantly, the decision to optimize for decoding accuracy placed an emphasis on keypress representations characterized by the most consistent and robust features shared across subjects, which in turn maximizes statistical power in detecting common learning-related changes. In this case, the optimal window encompassed a 200ms epoch aligned to the keyDown event (t0 = 0 ms). We then asked if the representations (i.e. – spatial patterns of combined parcel- and voxel-space activity) of the same digit at two different sequence positions changed with practice within this optimal decoding window. Of course, our findings do not rule out the possibility that contextualization can also be found before or even after this time window, as we did not directly address this issue in the present study. Ongoing work in our lab, as pointed out above, is investigating contextualization within different time windows tailored specifically for assessing sequence skill action planning, execution, evaluation and memory processes.
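
A grid of this kind can be enumerated as in the sketch below, which is illustrative only and not the code used in the study. The function names are hypothetical, and the endpoint convention for the alignment offsets is an assumption: with widths of 25 to 350 ms in 25 ms steps, including both 0 and +100 ms gives 11 offsets (154 windows), whereas ten offsets give the 140 windows stated above.

```python
def window_grid(widths_ms, offsets_ms):
    """Return (start, end) times in ms relative to keyDown for each width/offset pair."""
    return [(offset, offset + width)
            for width in widths_ms
            for offset in offsets_ms]


def select_best_window(grid, score_fn):
    """Pick the window with the highest score; score_fn is a hypothetical
    stand-in for cross-validated decoder accuracy."""
    return max(grid, key=score_fn)


widths_ms = range(25, 351, 25)          # 25, 50, ..., 350 (14 widths)
offsets_inclusive = range(0, 101, 10)   # 0, 10, ..., 100 (11 offsets)
offsets_ten = range(0, 100, 10)         # 0, 10, ..., 90 (10 offsets)

grid_inclusive = window_grid(widths_ms, offsets_inclusive)
grid_ten = window_grid(widths_ms, offsets_ten)
print(len(grid_inclusive), len(grid_ten))  # 154 140
```

In practice each candidate window would be scored by training and cross-validating a decoder on features extracted from that window, and the highest-scoring window would be retained.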

      The second point is that the result provided by the authors is not yet convincing enough to support the claim that "contextualization" occurs during rest. In the original analysis, the authors presented the statistical significance regarding the correlation between the "offline" pattern differentiation and micro-offline skill gain (Figure 5. Supplement 1), as well as the larger "offline" distance than "online" distance (Figure 5B). However, this analysis looks like regressing two variables (monotonically) increasing as a function of the trial. Although some information in this analysis, such as what the independent/dependent variables were or how individual subjects were treated, was missing in the Methods, getting a statistically significant slope seems unsurprising in such a situation. Also, curiously, the same quantitative evidence was not provided for its "online" counterpart, and the authors only briefly mentioned in the text that there was no significant correlation between them. It may be true looking at the data in Figure 5A as the online representation distance looks less monotonically changing, but the classification accuracy presented in Figure 4C, which should reflect similar representational distance, shows a more monotonic increase up to the 11th trial. Further, the ways the "online" and "offline" representation distance was estimated seem to make them not directly comparable. While the "online" distance was computed using all the correct press data within each 10 sec of execution, the "offline" distance is basically computed by only two presses (i.e., the last index_OP5 vs. the first index_OP1 separated by 10 sec of rest). Theoretically, the distance between the neural activity patterns for temporally closer events tends to be closer than that between the patterns for temporally far-apart events. It would be fairer to use the distance between the first index_OP1 vs. the last index_OP5 within an execution period for "online" distance, as well. 

      The Reviewer suggests that the current data is not convincing enough to show that contextualization occurs during rest and raises two important concerns: 1) the relationship between online contextualization and micro-online gains is not shown, and 2) the online distance was calculated differently from its offline counterpart (i.e. - instead of calculating the distance between last IndexOP5 and first IndexOP1 from a single trial, the distance was calculated for each sequence within a trial and then averaged).

      We addressed the first concern by performing individual subject correlations between 1) contextualization changes during rest intervals and micro-offline gains; 2) contextualization changes during practice trials and micro-online gains, and 3) contextualization changes during practice trials and micro-offline gains (Author response image 4). We then statistically compared the resulting correlation coefficient distributions and found that within-subject correlations for contextualization changes during rest intervals and micro-offline gains were significantly higher than online contextualization and micro-online gains (t = 3.2827, p = 0.0015) and online contextualization and micro-offline gains (t = 3.7021, p = 5.3013e-04). These results are consistent with our interpretation that micro-offline gains are supported by contextualization changes during the inter-practice rest period.
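
A minimal sketch of this two-step procedure on synthetic data: compute one correlation coefficient per subject for each pairing, then compare the resulting distributions. The response does not name the statistical test, so the paired t statistic below is an assumption, and all variable names and simulated values are illustrative only.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(a, b):
    """Paired t statistic for the per-subject differences a - b."""
    d = [u - v for u, v in zip(a, b)]
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

rng = random.Random(0)
n_subjects, n_trials = 20, 30
r_rest, r_practice = [], []
for _ in range(n_subjects):
    # Synthetic subject: rest-period change tracks micro-offline gains,
    # practice-period change is unrelated to micro-online gains.
    ctx_rest = [rng.gauss(0, 1) for _ in range(n_trials)]
    offline_gain = [c + rng.gauss(0, 0.3) for c in ctx_rest]
    ctx_practice = [rng.gauss(0, 1) for _ in range(n_trials)]
    online_gain = [rng.gauss(0, 1) for _ in range(n_trials)]
    r_rest.append(pearson(ctx_rest, offline_gain))
    r_practice.append(pearson(ctx_practice, online_gain))

# One coefficient per subject and pairing; compare the two distributions.
t_stat = paired_t(r_rest, r_practice)
```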

      Author response image 4.

      Distribution of individual subject correlation coefficients between contextualization changes occurring during practice or rest with micro-online and micro-offline performance gains. Note that the correlation distributions were significantly higher for the relationship between contextualization changes during rest and micro-offline gains than for contextualization changes during practice and either micro-online or micro-offline gains.

      With respect to the second concern highlighted above, we agree with the Reviewer that one limitation of the analysis comparing online versus offline changes in contextualization, as presented in the reviewed manuscript, is that it does not eliminate the possibility that any differences could simply be explained by the passage of time (which is smaller for the online analysis compared to the offline analysis). The Reviewer suggests an approach that addresses this issue, which we have now carried out.   When quantifying online changes in contextualization from the first IndexOP1 to the last IndexOP5 keypress in the same trial, we observed no learning-related trend (Author response image 5, right panel). Importantly, offline distances were significantly larger than online distances regardless of the measurement approach, and neither online measure predicted online learning (Author response image 6).

      Author response image 5.

      Trial by trial trend of offline (left panel) and online (middle and right panels) changes in contextualization. Offline changes in contextualization were assessed by calculating the distance between neural representations for the last IndexOP5 keypress in the previous trial and the first IndexOP1 keypress in the present trial. Two different approaches were used to characterize online contextualization changes. The analysis included in the reviewed manuscript (middle panel) calculated the distance between IndexOP1 and IndexOP5 for each correct sequence, which was then averaged across the trial. This approach is limited by the lack of control for the passage of time when making online versus offline comparisons. Thus, the second approach controlled for the passage of time by calculating the distance between the representations associated with the first IndexOP1 keypress and the last IndexOP5 keypress within the same trial. Note that while the first approach showed an increasing online contextualization trend with practice, the second approach did not.
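
The three distance measures described in this caption can be sketched as below. This is an illustration under stated assumptions: the distance metric is not specified here, so Euclidean distance is assumed, and the data layout (each trial as a list of `(index_op1, index_op5)` feature-vector pairs, one pair per correct sequence) and function names are hypothetical.

```python
import math

def online_within_sequence(trial):
    """Mean IndexOP1-to-IndexOP5 distance over the correct sequences of a trial."""
    dists = [math.dist(op1, op5) for op1, op5 in trial]
    return sum(dists) / len(dists)

def online_across_trial(trial):
    """Distance from the first IndexOP1 to the last IndexOP5 within one trial."""
    first_op1 = trial[0][0]
    last_op5 = trial[-1][1]
    return math.dist(first_op1, last_op5)

def offline(prev_trial, next_trial):
    """Distance from the last IndexOP5 of one trial to the first IndexOP1 of the next."""
    last_op5 = prev_trial[-1][1]
    first_op1 = next_trial[0][0]
    return math.dist(last_op5, first_op1)

# Toy 2-D feature vectors (illustration only).
trial_a = [([0.0, 0.0], [3.0, 4.0]), ([1.0, 0.0], [1.0, 2.0])]
trial_b = [([0.0, 0.0], [0.0, 1.0])]

within = online_within_sequence(trial_a)   # averages 5.0 and 2.0
across = online_across_trial(trial_a)      # [0, 0] vs [1, 2]
rest = offline(trial_a, trial_b)           # [1, 2] vs [0, 0]
```

The across-trial online measure and the offline measure both span one first and one last keypress, which is what makes them comparable with respect to elapsed time.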

      Author response image 6.

      Relationship between online contextualization and online learning is shown for both within-sequence (left; note that this is the online contextualization measure used in the reviewed manuscript) and across-sequence (right) distance calculation. There was no significant relationship between online learning and online contextualization regardless of the measurement approach.

      A related concern regarding the control analysis, where individual values for max speed and the degree of online contextualization were compared (Figure 5 Supplement 3), is whether the individual difference is meaningful. If I understood correctly, the optimization of the decoding process (temporal window, feature inclusion/reduction, decoder, etc.) was performed for individual participants, and the same feature extraction was also employed for the analysis of representation distance (i.e., contextualization). If this is the case, the distances are individually differently calculated and they may need to be normalized relative to some stable reference (e.g., 1 vs. 4 or average distance within the control sequence presses) before comparison across the individuals. 

      The Reviewer makes a good point here. We have now implemented the suggested normalization procedure in the analysis provided in the revised manuscript.

      Reviewer #3 (Public review): 

      Summary: 

      One goal of this paper is to introduce a new approach for highly accurate decoding of finger movements from human magnetoencephalography data via dimension reduction of a "multi-scale, hybrid" feature space. Following this decoding approach, the authors aim to show that early skill learning involves "contextualization" of the neural coding of individual movements, relative to their position in a sequence of consecutive movements. Furthermore, they aim to show that this "contextualization" develops primarily during short rest periods interspersed with skill training and correlates with a performance metric which the authors interpret as an indicator of offline learning. 

      Strengths: 

      A clear strength of the paper is the innovative decoding approach, which achieves impressive decoding accuracies via dimension reduction of a "multi-scale, hybrid space". This hybrid-space approach follows the neurobiologically plausible idea of the concurrent distribution of neural coding across local circuits as well as large-scale networks. A further strength of the study is the large number of tested dimension reduction techniques and classifiers (though the manuscript reveals little about the comparison of the latter). 

      We appreciate the Reviewer’s comments regarding the paper’s strengths.

      A simple control analysis based on shuffled class labels could lend further support to this complex decoding approach. As a control analysis that completely rules out any source of overfitting, the authors could test the decoder after shuffling class labels. Following such shuffling, decoding accuracies should drop to chance level for all decoding approaches, including the optimized decoder. This would also provide an estimate of actual chance-level performance (which is informative over and beyond the theoretical chance level). Furthermore, currently, the manuscript does not explain the huge drop in decoding accuracies for the voxel-space decoding (Figure 3B). Finally, the authors' approach to cortical parcellation raises questions regarding the information carried by varying dipole orientations within a parcel (which currently seems to be ignored?) and the implementation of the mean-flipping method (given that there are two dimensions - space and time - what do the authors refer to when they talk about the sign of the "average source", line 477?). 

      The Reviewer recommends that we: 1) conduct an additional control analysis on classifier performance using shuffled class labels, 2) provide a more detailed explanation regarding the drop in decoding accuracies for the voxel-space decoding following LDA dimensionality reduction (see Fig 3B), and 3) provide additional details on how problems related to dipole solution orientations were addressed in the present study.  

      In relation to the first point, we have now implemented a random shuffling approach as a control for the classification analyses. The results of this analysis indicated that the chance level accuracy was 22.12% (± SD 9.1%) for individual keypress decoding (4-class classification), and 18.41% (± SD 7.4%) for individual sequence item decoding (5-class classification), irrespective of the input feature set or the type of decoder used. Thus, the decoding accuracy observed with the final model was substantially higher than these chance levels.  
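
A label-shuffling control of this kind can be sketched as follows. This is a minimal sketch on synthetic data: the nearest-centroid classifier stands in for the decoders used in the study, and the cluster layout, sample sizes and number of permutations are illustrative assumptions.

```python
import random

def centroids(X, y):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {k: [v / counts[k] for v in s] for k, s in sums.items()}

def predict(cents, row):
    """Label of the centroid closest to `row` (squared Euclidean distance)."""
    return min(cents, key=lambda k: sum((a - b) ** 2 for a, b in zip(cents[k], row)))

def accuracy(X_tr, y_tr, X_te, y_te):
    cents = centroids(X_tr, y_tr)
    return sum(predict(cents, r) == t for r, t in zip(X_te, y_te)) / len(y_te)

rng = random.Random(0)
means = {0: (0, 0), 1: (6, 0), 2: (0, 6), 3: (6, 6)}  # four well-separated classes

def sample(n_per_class):
    X, y = [], []
    for label, (mx, my) in means.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(mx, 1), rng.gauss(my, 1)])
            y.append(label)
    return X, y

X_tr, y_tr = sample(40)
X_te, y_te = sample(20)
true_acc = accuracy(X_tr, y_tr, X_te, y_te)

# Empirical chance level: retrain on shuffled training labels many times.
shuffled_accs = []
for _ in range(200):
    y_perm = y_tr[:]
    rng.shuffle(y_perm)
    shuffled_accs.append(accuracy(X_tr, y_perm, X_te, y_te))
chance_acc = sum(shuffled_accs) / len(shuffled_accs)
```

Shuffling only the training labels breaks the feature-label relationship while preserving class counts, so the mean shuffled accuracy estimates the empirical chance level for that decoder and dataset.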

      Second, please note that the dimensionality of the voxel-space feature set is very high (i.e. – 15684). LDA attempts to map the input features onto a much smaller dimensional space (number of classes-1; e.g. –  3 dimensions, for 4-class keypress decoding). Given the very high dimension of the voxel-space input features in this case, the resulting mapping exhibits reduced accuracy. Despite this general consideration, please refer to Figure 3—figure supplement 3, where we observe improvement in voxel-space decoder performance when utilizing alternative dimensionality reduction techniques.

      The decoders constructed in the present study assess the average spatial patterns across time (as defined by the windowing procedure) in the input feature space.  We now provide additional details in the Methods of the revised manuscript pertaining to the parcellation procedure and how the sign ambiguity problem was addressed in our analysis.

      Weaknesses: 

      A clear weakness of the paper lies in the authors' conclusions regarding "contextualization". Several potential confounds, described below, question the neurobiological implications proposed by the authors and provide a simpler explanation of the results. Furthermore, the paper follows the assumption that short breaks result in offline skill learning, while recent evidence, described below, casts doubt on this assumption. 

      We thank the Reviewer for giving us the opportunity to address these issues in detail (see below).

      The authors interpret the ordinal position information captured by their decoding approach as a reflection of neural coding dedicated to the local context of a movement (Figure 4). One way to dissociate ordinal position information from information about the moving effectors is to train a classifier on one sequence and test the classifier on other sequences that require the same movements, but in different positions50. In the present study, however, participants trained to repeat a single sequence (4-1-3-2-4). As a result, ordinal position information is potentially confounded by the fixed finger transitions around each of the two critical positions (first and fifth press). Across consecutive correct sequences, the first keypress in a given sequence was always preceded by a movement of the index finger (=last movement of the preceding sequence), and followed by a little finger movement. The last keypress, on the other hand, was always preceded by a ring finger movement, and followed by an index finger movement (=first movement of the next sequence). Figure 4 - Supplement 2 shows that finger identity can be decoded with high accuracy (>70%) across a large time window around the time of the key press, up to at least +/-100 ms (and likely beyond, given that decoding accuracy is still high at the boundaries of the window depicted in that figure). This time window approaches the keypress transition times in this study. Given that distinct finger transitions characterized the first and fifth keypress, the classifier could thus rely on persistent (or "lingering") information from the preceding finger movement, and/or "preparatory" information about the subsequent finger movement, in order to dissociate the first and fifth keypress. 
Currently, the manuscript provides no evidence that the context information captured by the decoding approach is more than a by-product of temporally extended, and therefore overlapping, but independent neural representations of consecutive keypresses that are executed in close temporal proximity - rather than a neural representation dedicated to context. 

      Such temporal overlap of consecutive, independent finger representations may also account for the dynamics of "ordinal coding"/"contextualization", i.e., the increase in 2-class decoding accuracy, across Day 1 (Figure 4C). As learning progresses, both tapping speed and the consistency of keypress transition times increase (Figure 1), i.e., consecutive keypresses are closer in time, and more consistently so. As a result, information related to a given keypress is increasingly overlapping in time with information related to the preceding and subsequent keypresses. The authors seem to argue that their regression analysis in Figure 5 - Figure Supplement 3 speaks against any influence of tapping speed on "ordinal coding" (even though that argument is not made explicitly in the manuscript). However, Figure 5 - Figure Supplement 3 shows inter-individual differences in a between-subject analysis (across trials, as in panel A, or separately for each trial, as in panel B), and, therefore, says little about the within-subject dynamics of "ordinal coding" across the experiment. A regression of trial-by-trial "ordinal coding" on trial-by-trial tapping speed (either within-subject or at a group-level, after averaging across subjects) could address this issue. Given the highly similar dynamics of "ordinal coding" on the one hand (Figure 4C), and tapping speed on the other hand (Figure 1B), I would expect a strong relationship between the two in the suggested within-subject (or group-level) regression. Furthermore, learning should increase the number of (consecutively) correct sequences, and, thus, the consistency of finger transitions. 
Therefore, the increase in 2-class decoding accuracy may simply reflect an increasing overlap in time of increasingly consistent information from consecutive keypresses, which allows the classifier to dissociate the first and fifth keypress more reliably as learning progresses, simply based on the characteristic finger transitions associated with each. In other words, given that the physical context of a given keypress changes as learning progresses - keypresses move closer together in time and are more consistently correct - it seems problematic to conclude that the mental representation of that context changes. To draw that conclusion, the physical context should remain stable (or any changes to the physical context should be controlled for). 

      The issues raised by Reviewer #3 here are similar to two issues raised by Reviewer #2 above, and we agree they must both be carefully considered in any evaluation of our findings.

      As both Reviewers pointed out, the classifiers in this study were trained and tested on keypresses performed while practicing a specific sequence (4-1-3-2-4). The study was designed this way so as to avoid the impact of interference effects on learning dynamics. The cross-validated performance of classifiers on MEG data collected within the same session was 90.47% overall accuracy (4-class; Figure 3C). We then tested classifier performance on data collected during a separate MEG session conducted approximately 24 hours later (Day 2; see Figure 3—supplement 3). We observed a reduction in overall accuracy rate to 87.11% when tested on MEG data recorded while participants performed the same learned sequence, and 79.44% when they performed several previously unpracticed sequences. This classification performance difference of 7.67% when tested on the Day 2 data could reflect the performance bias of the classifier for the trained sequence, possibly caused by mixed information from temporally close keypresses being incorporated into the feature weights.

      Along these same lines, both Reviewers also raise the possibility that an increase in “ordinal coding/contextualization” with learning could simply reflect an increase in this mixing effect caused by faster typing speeds as opposed to an actual change in the underlying neural representation. The basic idea is that as correct sequences are generated at higher and higher speeds over training, MEG activity patterns related to the planning, execution, evaluation and memory of individual keypresses overlap more in time. Thus, increased overlap between the “4” and “1” keypresses (at the start of the sequence) and “2” and “4” keypresses (at the end of the sequence) could artefactually increase contextualization distances even if the underlying neural representations for the individual keypresses remain unchanged (assuming this mixing of representations is used by the classifier to differentially tag each index finger press). If this were the case, it follows that such mixing effects reflecting the ordinal sequence structure would also be observable in the distribution of decoder misclassifications. For example, “4” keypresses would be more likely to be misclassified as “1” or “2” keypresses (or vice versa) than as “3” keypresses. The confusion matrices presented in Figures 3C and 4B and Figure 3—figure supplement 3A in the previously submitted manuscript do not show this trend in the distribution of misclassifications across the four fingers.

      Following this logic, it’s also possible that if the ordinal coding is largely driven by this mixing effect, the increased overlap between consecutive index finger keypresses during the 4-4 transition marking the end of one sequence and the beginning of the next one could actually mask contextualization-related changes to the underlying neural representations and make them harder to detect. In this case, a decoder tasked with separating individual index finger keypresses into two distinct classes based upon sequence position might show decreased performance with learning as adjacent keypresses overlapped in time with each other to an increasing extent. However, Figure 4C in our previously submitted manuscript does not support this possibility, as the 2-class hybrid classifier displays improved classification performance over early practice trials despite greater temporal overlap.

      As noted in the above reply to Reviewer #2, we also conducted a new multivariate regression analysis to directly assess whether the neural representation distance score could be predicted by the 4-1, 2-4 and 4-4 keypress transition times observed for each complete correct sequence (both predictor and response variables were z-score normalized within-subject). The results of this analysis affirmed that the possible alternative explanation put forward by the Reviewer is not supported by our data (Adjusted R² = 0.00431; F = 5.62). We now include this new negative control analysis result in the revised manuscript.
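
For reference, the quantities involved in this control analysis can be written out as follows. The notation is an assumption introduced for illustration (it does not appear in the manuscript): $v_{s,i}$ is any variable for subject $s$ and sequence $i$, $d_i$ is the z-scored representation distance, and $t_i^{(\cdot)}$ are the z-scored transition times.

```latex
% Within-subject z-scoring (subject mean \mu_s, subject SD \sigma_s)
z_{s,i} = \frac{v_{s,i} - \mu_{s}}{\sigma_{s}}

% Linear model with the three transition times as predictors
d_{i} = \beta_{0} + \beta_{1}\, t^{(4\text{-}1)}_{i} + \beta_{2}\, t^{(2\text{-}4)}_{i}
        + \beta_{3}\, t^{(4\text{-}4)}_{i} + \varepsilon_{i}

% Adjusted R^2 for n observations and p = 3 predictors
\bar{R}^{2} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - p - 1}
```

An adjusted R² close to zero means the transition times account for almost none of the variance in the distance score, even if the F statistic is nominally significant with a large number of observations.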

      Finally, the Reviewer hints that one way to address this issue would be to compare MEG responses before and after learning for sequences typed at a fixed speed. However, given that the speed-accuracy trade-off should improve with learning, a comparison between unlearned and learned skill states would dictate that the skill be evaluated at a very low fixed speed. Essentially, such a design presents the problem that the post-training test is evaluating the representation in the unlearned behavioral state that is not representative of the acquired skill. Thus, this approach would not address our experimental question: “do neural representations of the same action performed at different locations within a skill sequence contextually differentiate or remain stable as learning evolves”.

      A similar difference in physical context may explain why neural representation distances ("differentiation") differ between rest and practice (Figure 5). The authors define "offline differentiation" by comparing the hybrid space features of the last index finger movement of a trial (ordinal position 5) and the first index finger movement of the next trial (ordinal position 1). However, the latter is not only the first movement in the sequence but also the very first movement in that trial (at least in trials that started with a correct sequence), i.e., not preceded by any recent movement. In contrast, the last index finger of the last correct sequence in the preceding trial includes the characteristic finger transition from the fourth to the fifth movement. Thus, there is more overlapping information arising from the consistent, neighbouring keypresses for the last index finger movement, compared to the first index finger movement of the next trial. A strong difference (larger neural representation distance) between these two movements is, therefore, not surprising, given the task design, and this difference is also expected to increase with learning, given the increase in tapping speed, and the consequent stronger overlap in representations for consecutive keypresses. Furthermore, initiating a new sequence involves pre-planning, while ongoing practice relies on online planning (Ariani et al., eNeuro 2021), i.e., two mental operations that are dissociable at the level of neural representation (Ariani et al., bioRxiv 2023). 

      The Reviewer argues that the last finger movement of a trial and the first in the next trial are performed in different circumstances and contexts. This is an important point and one we tend to agree with. For this task, the first sequence in a practice trial (which is pre-planned offline) is performed in a somewhat different context from the sequence iterations that follow, which involve temporally overlapping planning, execution and evaluation processes.  The Reviewer is particularly concerned about a difference in the temporal mixing effect issue raised above between the first and last keypresses performed in a trial. However, in contrast to the Reviewer’s stated argument above, findings from Kornysheva et al. (2019) showed that neural representations of individual actions are competitively queued during the pre-planning period in a manner that reflects the ordinal structure of the learned sequence.  Thus, mixing effects are likely still present for the first keypress in a trial. Also note that we now present new control analyses in multiple responses above confirming that hypothetical mixing effects between adjacent keypresses do not explain our reported contextualization finding. A statement addressing these possibilities raised by the Reviewer has been added to the Discussion in the revised manuscript.

      In relation to pre-planning, ongoing MEG work in our lab is investigating contextualization within different time windows tailored specifically for assessing how sequence skill action planning evolves with learning.

      Given these differences in the physical context and associated mental processes, it is not surprising that "offline differentiation", as defined here, is more pronounced than "online differentiation". For the latter, the authors compared movements that were better matched regarding the presence of consistent preceding and subsequent keypresses (online differentiation was defined as the mean difference between all first vs. last index finger movements during practice).  It is unclear why the authors did not follow a similar definition for "online differentiation" as for "micro-online gains" (and, indeed, a definition that is more consistent with their definition of "offline differentiation"), i.e., the difference between the first index finger movement of the first correct sequence during practice, and the last index finger of the last correct sequence. While these two movements are, again, not matched for the presence of neighbouring keypresses (see the argument above), this mismatch would at least be the same across "offline differentiation" and "online differentiation", so they would be more comparable. 

      This is the same point made earlier by Reviewer #2, and we agree with this assessment. As stated in the response to Reviewer #2 above, we have now carried out quantification of online contextualization using this approach and included it in the revised manuscript. We thank the Reviewer for this suggestion.

      A further complication in interpreting the results regarding "contextualization" stems from the visual feedback that participants received during the task. Each keypress generated an asterisk shown above the string on the screen, irrespective of whether the keypress was correct or incorrect. As a result, incorrect (e.g., additional, or missing) keypresses could shift the phase of the visual feedback string (of asterisks) relative to the ordinal position of the current movement in the sequence (e.g., the fifth movement in the sequence could coincide with the presentation of any asterisk in the string, from the first to the fifth). Given that more incorrect keypresses are expected at the start of the experiment, compared to later stages, the consistency in visual feedback position, relative to the ordinal position of the movement in the sequence, increased across the experiment. A better differentiation between the first and the fifth movement with learning could, therefore, simply reflect better decoding of the more consistent visual feedback, based either on the feedback-induced brain response, or feedback-induced eye movements (the study did not include eye tracking). It is not clear why the authors introduced this complicated visual feedback in their task, besides consistency with their previous studies.

      We strongly agree with the Reviewer that eye movements related to task engagement are important to rule out as a potential driver of the decoding accuracy or contextualization effect. We address this issue above in response to a question raised by Reviewer #1 about the impact of movement related artefacts in general on our findings.

      First, the assumption the Reviewer makes here about the distribution of errors in this task is incorrect. On average across subjects, 2.32% ± 1.48% (mean ± SD) of all keypresses performed were errors, which were evenly distributed across the four possible keypress responses. While errors increased progressively over practice trials, they did so in proportion to the increase in correct keypresses, so that the overall ratio of correct-to-incorrect keypresses remained stable over the training session. Thus, the Reviewer’s assumptions that there is a higher relative frequency of errors in early trials, and a resulting systematic trend in phase-shift differences between the visual display updates (i.e. – a change in asterisk position above the displayed sequence) and the keypress performed, are not substantiated by the data. To the contrary, the asterisk position on the display and the keypress being executed remained highly correlated over the entire training session. We now include a statement about the frequency and distribution of errors in the revised manuscript.

      Given this high correlation, we firmly agree with the Reviewer that the issue of eye movement-related artefacts is still an important one to address. Fortunately, we did collect eye movement data during the MEG recordings, so we were able to investigate this. As detailed in the response to Reviewer #1 above, we found that gaze positions and eye-movement velocity time-locked to visual display updates (i.e. – a change in asterisk position above the displayed sequence) did not reflect the asterisk location above chance levels (Overall cross-validated accuracy = 0.21817; see Author response image 1). Furthermore, an inspection of the eye position data revealed that a majority of participants on most trials displayed random walk gaze patterns around a center fixation point, indicating that participants did not attend to the asterisk position on the display. This is consistent with intrinsic generation of the action sequence, and congruent with the fact that the display does not provide explicit feedback related to performance. As pointed out above, a similar real-world example would be manually inputting a long password into a secure online application. In this case, one intrinsically generates the sequence from memory and receives similar feedback about the password sequence position (also provided as asterisks), which is typically ignored by the user. Notably, the minimal participant engagement with the visual task display observed in this study highlights an important difference between behavior observed during explicit sequence learning motor tasks (which is highly generative in nature) and reactive responses to stimulus cues in a serial reaction time task (SRTT).  This is a crucial difference that must be carefully considered when comparing findings across studies. All elements pertaining to this new control analysis are now included in the revised manuscript.

      The authors report a significant correlation between "offline differentiation" and cumulative micro-offline gains. However, it would be more informative to correlate trial-by-trial changes in each of the two variables. This would address the question of whether there is a trial-by-trial relation between the degree of "contextualization" and the amount of micro-offline gains - are performance changes (micro-offline gains) less pronounced across rest periods for which the change in "contextualization" is relatively low? Furthermore, is the relationship between micro-offline gains and "offline differentiation" significantly stronger than the relationship between micro-offline gains and "online differentiation"? 

      In response to a similar issue raised above by Reviewer #2, we now include new analyses comparing correlation magnitudes between (1) “online differentiation” vs micro-online gains, (2) “online differentiation” vs micro-offline gains and (3) “offline differentiation” and micro-offline gains (see Author response images 4, 5 and 6 above). These new analyses and results have been added to the revised manuscript. Once again, we thank both Reviewers for this suggestion.

      The authors follow the assumption that micro-offline gains reflect offline learning.

      This statement is incorrect. The original Bönstrup et al. (2019) 49 paper clearly states that micro-offline gains must be carefully interpreted based upon the behavioral context within which they are observed, and lays out the conditions under which one can have confidence that micro-offline gains reflect offline learning. In fact, the excellent meta-analysis of Pan & Rickard (2015) 51, which re-interprets the benefits of sleep in overnight skill consolidation from a “reactive inhibition” perspective, was a crucial resource in the experimental design of our initial study49, as well as in all our subsequent work. Pan & Rickard stated:

      “Empirically, reactive inhibition refers to performance worsening that can accumulate during a period of continuous training (Hull, 1943). It tends to dissipate, at least in part, when brief breaks are inserted between blocks of training. If there are multiple performance-break cycles over a training session, as in the motor sequence literature, performance can exhibit a scalloped effect, worsening during each uninterrupted performance block but improving across blocks52,53. Rickard, Cai, Rieth, Jones, and Ard (2008) and Brawn, Fenn, Nusbaum, and Margoliash (2010) 52,53 demonstrated highly robust scalloped reactive inhibition effects using the commonly employed 30 s–30 s performance break cycle, as shown for Rickard et al.’s (2008) massed practice sleep group in Figure 2. The scalloped effect is evident for that group after the first few 30 s blocks of each session. The absence of the scalloped effect during the first few blocks of training in the massed group suggests that rapid learning during that period masks any reactive inhibition effect.”

      Crucially, Pan & Rickard51 made several concrete recommendations for reducing the impact of the reactive inhibition confound on offline learning studies. One of these recommendations was to reduce practice times to 10s (most prior sequence learning studies up until that point had employed 30s long practice trials). They stated:

      “The traditional design involving 30 s-30 s performance break cycles should be abandoned given the evidence that it results in a reactive inhibition confound, and alternative designs with reduced performance duration per block used instead 51. One promising possibility is to switch to 10 s performance durations for each performance-break cycle Instead 51. That design appears sufficient to eliminate at least the majority of the reactive inhibition effect 52,53.”

      We mindfully incorporated recommendations from Pan and Rickard51  into our own study designs including 1) utilizing 10s practice trials and 2) constraining our analysis of micro-offline gains to early learning trials (where performance monotonically increases and 95% of overall performance gains occur), which are prior to the emergence of the “scalloped” performance dynamics that are strongly linked to reactive inhibition effects. 

      However, there is no direct evidence in the literature that micro-offline gains really result from offline learning, i.e., an improvement in skill level.

      We strongly disagree with the Reviewer’s assertion that “there is no direct evidence in the literature that micro-offline gains really result from offline learning, i.e., an improvement in skill level.”  The initial Bönstrup et al. (2019) 49 report was followed up by a large online crowd-sourcing study (Bönstrup et al., 2020) 54. This second (and much larger) study provided several additional important findings supporting our interpretation of micro-offline gains in cases where the important behavioral conditions clarified above were met (see Author response image 7 below for further details on these conditions).

      Author response image 7.

      Micro-offline gains observed in learning and non-learning contexts are attributed to different underlying causes. (A) Micro-offline and online changes relative to overall trial-by-trial learning. This figure is based on data from Bönstrup et al. (2019) 49. During early learning, micro-offline gains (red bars) closely track trial-by-trial performance gains (green line with open circle markers), with minimal contribution from micro-online gains (blue bars). The stated conclusion in Bönstrup et al. (2019) is that micro-offline gains only during this Early Learning stage reflect rapid memory consolidation (see also 54). After early learning, about practice trial 11, skill plateaus. This plateau skill period is characterized by a striking emergence of coupled (and relatively stable) micro-online drops and micro-offline increases. Bönstrup et al. (2019) as well as others in the literature 55-57, argue that micro-offline gains during the plateau period likely reflect recovery from inhibitory performance factors such as reactive inhibition or fatigue, and thus must be excluded from analyses relating micro-offline gains to skill learning.  The Non-repeating groups in Experiments 3 and 4 from Das et al. (2024) suffer from a lack of consideration of these known confounds.
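      For readers unfamiliar with these measures, the sketch below shows one common way to compute them: micro-online gains as the change in speed from the start to the end of a practice trial, and micro-offline gains as the change from the end of one trial to the start of the next. The function name and the toy speed values are hypothetical, and the exact speed measure used in any given study may differ.

```python
def micro_gains(trials):
    """Split trial-by-trial learning into micro-online and micro-offline parts.

    `trials` is an ordered list of practice trials; each trial is a list of
    within-trial speed samples (e.g. keypresses/s per correct sequence).
    """
    # Change within each practice trial (end minus start).
    micro_online = [t[-1] - t[0] for t in trials]
    # Change across each rest period (start of next trial minus end of current).
    micro_offline = [nxt[0] - cur[-1] for cur, nxt in zip(trials, trials[1:])]
    return micro_online, micro_offline

# Toy data (assumption): speed rises across rests and dips slightly within trials.
trials = [
    [1.0, 1.1],
    [1.6, 1.5],
    [2.0, 1.9],
]
online, offline = micro_gains(trials)
total_learning = trials[-1][-1] - trials[0][0]

print("micro-online gains: ", [round(g, 2) for g in online])
print("micro-offline gains:", [round(g, 2) for g in offline])
print("sum of both:", round(sum(online) + sum(offline), 2),
      "| total learning:", round(total_learning, 2))
```

      The two components sum to the overall change in performance by construction, so restricting the analysis to early learning amounts to passing only the early trials (for example `trials[:11]`) to the function.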

      Evidence documented in that paper54 showed that micro-offline gains during early skill learning were: 1) replicable and generalized to subjects learning the task in their daily living environment (n=389); 2) equivalent when significantly shortening practice period duration, thus confirming that they are not a result of recovery from performance fatigue (n=118);  3) reduced (along with learning rates) by retroactive interference applied immediately after each practice period relative to interference applied after passage of time (n=373), indicating stabilization of the motor memory at a microscale of several seconds consistent with rapid consolidation; and 4) not modified by random termination of the practice periods, ruling out a contribution of predictive motor slowing (N = 71) 54.  Altogether, our findings were strongly consistent with the interpretation that micro-offline gains reflect memory consolidation supporting early skill learning. This is precisely the portion of the learning curve Pan and Rickard51 refer to when they state “…rapid learning during that period masks any reactive inhibition effect”.

      This interpretation is further supported by brain imaging evidence linking known memory-related networks and consolidation mechanisms to micro-offline gains. First, we reported that the density of fast hippocampo-neocortical skill memory replay events increases approximately three-fold during early learning inter-practice rest periods with the density explaining differences in the magnitude of micro-offline gains across subjects1. Second, Jacobacci et al. (2020) independently reproduced our original behavioral findings and reported BOLD fMRI changes in the hippocampus and precuneus (regions also identified in our MEG study1) linked to micro-offline gains during early skill learning. 33 These functional changes were coupled with rapid alterations in brain microstructure in the order of minutes, suggesting that the same network that operates during rest periods of early learning undergoes structural plasticity over several minutes following practice58. Third, even more recently, Chen et al. (2024) provided direct evidence from intracranial EEG in humans linking sharp-wave ripple events (which are known markers for neural replay59) in the hippocampus (80-120 Hz in humans) with micro-offline gains during early skill learning. The authors report that the strong increase in ripple rates tracked learning behavior, both across blocks and across participants. The authors conclude that hippocampal ripples during resting offline periods contribute to motor sequence learning. 2

      Thus, there is actually now substantial evidence in the literature directly supporting the assertion “that micro-offline gains really result from offline learning”. On the contrary, according to Gupta & Rickard (2024) “…the mechanism underlying RI [reactive inhibition] is not well established” after over 80 years of investigation60, possibly due to the fact that “reactive inhibition” is a categorical description of behavioral effects that likely result from several heterogeneous processes with very different underlying mechanisms.

      On the contrary, recent evidence questions this interpretation (Gupta & Rickard, npj Sci Learn 2022; Gupta & Rickard, Sci Rep 2024; Das et al., bioRxiv 2024). Instead, there is evidence that micro-offline gains are transient performance benefits that emerge when participants train with breaks, compared to participants who train without breaks, however, these benefits vanish within seconds after training if both groups of participants perform under comparable conditions (Das et al., bioRxiv 2024). 

      It is important to point out that the recent work of Gupta & Rickard (2022, 2024) 55 does not present any data that directly opposes our finding that early skill learning49 is expressed as micro-offline gains during rest breaks. These studies are essentially an extension of the Rickard et al. (2008) paper, which employed a massed (30s practice followed by 30s breaks) vs spaced (10s practice followed by 10s breaks) design to assess if recovery from reactive inhibition effects could account for performance gains measured after several minutes or hours. Gupta & Rickard (2022) added two additional groups (30s practice/10s break and 10s practice/10s break as used in the work from our group). The primary aim of the study was to assess whether it was more likely that changes in performance when retested 5 minutes after skill training (consisting of 12 practice trials for the massed groups and 36 practice trials for the spaced groups) had ended reflected memory consolidation effects or recovery from reactive inhibition effects. The Gupta & Rickard (2024) follow-up paper employed a similar design with the primary difference being that participants performed a fixed number of sequences on each trial as opposed to trials lasting a fixed duration. This was done to facilitate the fitting of a quantitative statistical model to the data. To reiterate, neither study included any analysis of micro-online or micro-offline gains, nor did either include any comparison focused on skill gains during early learning. Instead, Gupta & Rickard (2022) reported evidence for reactive inhibition effects for all groups over much longer training periods. Again, we reported the same finding for trials following the early learning period in our original Bönstrup et al. (2019) paper49 (Author response image 7).
Also, please note that we reported in this paper that cumulative micro-offline gains over early learning did not correlate with overnight offline consolidation measured 24 hours later49 (see the Results section and further elaboration in the Discussion). Thus, while the composition of our data is supportive of a short-term memory consolidation process operating over several seconds during early learning, it likely differs from those involved over longer training times and offline periods, as assessed by Gupta & Rickard (2022).

      In the recent preprint from Das et al (2024) 61,  the authors make the strong claim that “micro-offline gains during early learning do not reflect offline learning” which is not supported by their own data.   The authors hypothesize that if “micro-offline gains represent offline learning, participants should reach higher skill levels when training with breaks, compared to training without breaks”.  The study utilizes a spaced vs. massed practice group between-subjects design inspired by the reactive inhibition work from Rickard and others to test this hypothesis. Crucially, the design incorporates only a small fraction of the training used in other investigations to evaluate early skill learning1,33,49,54,57,58,62.  A direct comparison between the practice schedule designs for the spaced and massed groups in Das et al., and the training schedule all participants experienced in the original Bönstrup et al. (2019) paper highlights this issue as well as several others (Author response image 8):

      Author response image 8.

      (A) Comparison of Das et al. Spaced & Massed group training session designs, and the training session design from the original Bönstrup et al. (2019) 49 paper. Similar to the approach taken by Das et al., all practice is visualized as 10-second practice trials with a variable number (either 0, 1 or 30) of 10-second-long inter-practice rest intervals to allow for direct comparisons between designs. The two key takeaways from this comparison are that (1) the intervention differences (i.e. – practice schedules) between the Massed and Spaced groups from the Das et al. report are extremely small (less than 12% of the overall session schedule) and (2) the overall amount of practice is much less than compared to the design from the original Bönstrup report 49  (which has been utilized in several subsequent studies). (B) Group-level learning curve data from Bönstrup et al. (2019) 49 is used to estimate the performance range accounted for by the equivalent periods covering Test 1, Training 1 and Test 2 from Das et al (2024). Note that the intervention in the Das et al. study is limited to a period covering less than 50% of the overall learning range.

      First, participants in the original Bönstrup et al. study 49 experienced 157.14% more practice time and 46.97% less inter-practice rest time than the Spaced group in the Das et al. study (Author response image 8).  Thus, the overall amount of practice and rest differ substantially between studies, with much more limited training occurring for participants in Das et al.  

      Second, and perhaps most importantly, the actual intervention (i.e. – the difference in practice schedule between the Spaced and Massed groups) employed by Das et al. covers a very small fraction of the overall training session. Identical practice schedule segments for both the Spaced & Massed groups are indicated by the red shaded area in Author response image 8. Please note that these identical segments cover 94.84% of the Massed group training schedule and 88.01% of the Spaced group training schedule (since it has 60 seconds of additional rest). This means that the actual interventions cover only about 5% (for Massed) and 12% (for Spaced) of the total training session, which minimizes any chance of observing a difference between groups.

      Also note that the very beginning of the practice schedule (during which Figure R9 shows substantial learning is known to occur) is labeled in the Das et al. study as Test 1.  Test 1 encompasses the first 20 seconds of practice (alternatively viewed as the first two 10-second-long practice trials with no inter-practice rest). This is immediately followed by the Training 1 intervention, which is composed of only three 10-second-long practice trials (with 10-second inter-practice rest for the Spaced group and no inter-practice rest for the Massed group). Author response image 8 also shows that since there is no inter-practice rest after the third Training practice trial for the Spaced group, this third trial (for both Training 1 and 2) is actually a part of an identical practice schedule segment shared by both groups (Massed and Spaced), reducing the magnitude of the intervention even further.

      Moreover, we know from the original Bönstrup et al. (2019) paper49 that 46.57% of all overall group-level performance gains occurred between trials 2 and 5 for that study. Thus, Das et al. are limiting their designed intervention to a period covering less than half of the early learning range discussed in the literature, which again, minimizes any chance of observing an effect.

      This issue is amplified even further at Training 2 since skill learning prior to the long 5-minute break is retained, further constraining the performance range over these three trials. A related issue pertains to the trials labeled as Test 1 (trials 1-2) and Test 2 (trials 6-7) by Das et al. Again, we know from the original Bönstrup et al. paper 49 that 18.06% and 14.43% (32.49% total) of all overall group-level performance gains occurred during trials corresponding to Das et al.'s Test 1 and Test 2, respectively. In other words, Das et al. averaged skill performance over 20 seconds of practice at two time-points where dramatic skill improvements occur. Pan & Rickard (2015) previously showed that such averaging is known to inject artefacts into analyses of performance gains.
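      The percentages quoted in the preceding paragraphs can be cross-checked with simple arithmetic. The short sketch below uses only the figures stated in the text; the variable names are illustrative. It derives the share of each schedule taken up by the intervention as the complement of the identical segments, and sums the shares of performance gains falling within the two test periods.

```python
# Share of each group's schedule that is identical across groups (from the text).
identical_share = {"Massed": 94.84, "Spaced": 88.01}

# The intervention occupies whatever remains of the schedule.
intervention_share = {
    group: round(100 - share, 2) for group, share in identical_share.items()
}

# Share of overall group-level gains during the periods matching Test 1 and Test 2.
test_gain_share = {"Test 1": 18.06, "Test 2": 14.43}
combined_test_share = round(sum(test_gain_share.values()), 2)

# Share of overall gains occurring between trials 2 and 5 (from the text).
trials_2_to_5_share = 46.57

print("intervention share (%):", intervention_share)
print("combined Test 1 + Test 2 share (%):", combined_test_share)
print("trials 2-5 cover less than half of the gains:", trials_2_to_5_share < 50)
```

      The complements come to 5.16% for the Massed group and 11.99% for the Spaced group, that is, roughly 5% and 12% of the respective schedules, and the two test periods together account for 32.49% of the gains.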

      Furthermore, the structure of the Test in the Das et al. study appears to have an interference effect on the Spaced group performance after the training intervention. This makes sense if you consider that the Spaced group is required to now perform the task in a Massed practice environment (i.e., two 10-second-long practice trials merged into one long trial), further blurring the true intervention effects. This effect is observable in Figure 1C,E of their pre-print. Specifically, while the Massed group continues to show an increase in performance during test relative to the last 10 seconds of practice during training, the Spaced group displays a marked decrease. This decrease is in stark contrast to the monotonic increases observed for both groups at all other time-points.

      Interestingly, when statistical comparisons between the groups are made at the time-points when the intervention is present (as opposed to after it has been removed) then the stated hypothesis, “If micro-offline gains represent offline learning, participants should reach higher skill levels when training with breaks, compared to training without breaks”, is confirmed.

      The data presented by Gupta and Rickard (2022, 2024) and Das et al. (2024) is in many ways more confirmatory of the constraints employed by our group and others with respect to experimental design, analysis and interpretation of study findings, rather than contradictory. Still, it does highlight a limitation of the current micro-online/offline framework, which was originally only intended to be applied to early skill learning over spaced practice schedules when reactive inhibition effects are minimized49. Extrapolation of this current framework to post-plateau performance periods, longer timespans, or non-learning situations (e.g. – the Non-repeating groups from Experiments 3 & 4 in Das et al. (2024)), when reactive inhibition plays a more substantive role, is not warranted. Ultimately, it will be important to develop new paradigms allowing one to independently estimate the different coincident or antagonistic features (e.g. - memory consolidation, planning, working memory and reactive inhibition) contributing to micro-online and micro-offline gains during and after early skill learning within a unifying framework.

      References

      (1) Buch, E. R., Claudino, L., Quentin, R., Bonstrup, M. & Cohen, L. G. Consolidation of human skill linked to waking hippocampo-neocortical replay. Cell Rep 35, 109193 (2021). https://doi.org:10.1016/j.celrep.2021.109193

      (2) Chen, P.-C., Stritzelberger, J., Walther, K., Hamer, H. & Staresina, B. P. Hippocampal ripples during offline periods predict human motor sequence learning. bioRxiv, 2024.10.06.614680 (2024). https://doi.org:10.1101/2024.10.06.614680

      (3) Classen, J., Liepert, J., Wise, S. P., Hallett, M. & Cohen, L. G. Rapid plasticity of human cortical movement representation induced by practice. J Neurophysiol 79, 1117-1123 (1998).

      (4) Karni, A. et al. Functional MRI evidence for adult motor cortex plasticity during motor skill learning. Nature 377, 155-158 (1995). https://doi.org:10.1038/377155a0

      (5) Kleim, J. A., Barbay, S. & Nudo, R. J. Functional reorganization of the rat motor cortex following motor skill learning. J Neurophysiol 80, 3321-3325 (1998).

      (6) Shadmehr, R. & Holcomb, H. H. Neural correlates of motor memory consolidation. Science 277, 821-824 (1997).

      (7) Doyon, J. et al. Experience-dependent changes in cerebellar contributions to motor sequence learning. Proc Natl Acad Sci U S A 99, 1017-1022 (2002).

      (8) Toni, I., Ramnani, N., Josephs, O., Ashburner, J. & Passingham, R. E. Learning arbitrary visuomotor associations: temporal dynamic of brain activity. Neuroimage 14, 1048-1057 (2001).

      (9) Grafton, S. T. et al. Functional anatomy of human procedural learning determined with regional cerebral blood flow and PET. J Neurosci 12, 2542-2548 (1992).

      (10) Kennerley, S. W., Sakai, K. & Rushworth, M. F. Organization of action sequences and the role of the pre-SMA. J Neurophysiol 91, 978-993 (2004). https://doi.org:10.1152/jn.00651.2003

      (11) Hardwick, R. M., Rottschy, C., Miall, R. C. & Eickhoff, S. B. A quantitative meta-analysis and review of motor learning in the human brain. Neuroimage 67, 283-297 (2013). https://doi.org:10.1016/j.neuroimage.2012.11.020

      (12) Sawamura, D. et al. Acquisition of chopstick-operation skills with the non-dominant hand and concomitant changes in brain activity. Sci Rep 9, 20397 (2019). https://doi.org:10.1038/s41598-019-56956-0

      (13) Lee, S. H., Jin, S. H. & An, J. The difference in cortical activation pattern for complex motor skills: A functional near- infrared spectroscopy study. Sci Rep 9, 14066 (2019). https://doi.org:10.1038/s41598-019-50644-9

      (14) Battaglia-Mayer, A. & Caminiti, R. Corticocortical Systems Underlying High-Order Motor Control. J Neurosci 39, 4404-4421 (2019). https://doi.org:10.1523/JNEUROSCI.2094-18.2019

      (15) Toni, I., Thoenissen, D. & Zilles, K. Movement preparation and motor intention. Neuroimage 14, S110-117 (2001). https://doi.org:10.1006/nimg.2001.0841

      (16) Wolpert, D. M., Goodbody, S. J. & Husain, M. Maintaining internal representations: the role of the human superior parietal lobe. Nat Neurosci 1, 529-533 (1998). https://doi.org:10.1038/2245

      (17) Andersen, R. A. & Buneo, C. A. Intentional maps in posterior parietal cortex. Annu Rev Neurosci 25, 189-220 (2002). https://doi.org:10.1146/annurev.neuro.25.112701.142922

      (18) Buneo, C. A. & Andersen, R. A. The posterior parietal cortex: sensorimotor interface for the planning and online control of visually guided movements. Neuropsychologia 44, 2594-2606 (2006). https://doi.org:10.1016/j.neuropsychologia.2005.10.011

      (19) Grover, S., Wen, W., Viswanathan, V., Gill, C. T. & Reinhart, R. M. G. Long-lasting, dissociable improvements in working memory and long-term memory in older adults with repetitive neuromodulation. Nat Neurosci 25, 1237-1246 (2022). https://doi.org:10.1038/s41593-022-01132-3

      (20) Colclough, G. L. et al. How reliable are MEG resting-state connectivity metrics? Neuroimage 138, 284-293 (2016). https://doi.org:10.1016/j.neuroimage.2016.05.070

      (21) Colclough, G. L., Brookes, M. J., Smith, S. M. & Woolrich, M. W. A symmetric multivariate leakage correction for MEG connectomes. NeuroImage 117, 439-448 (2015). https://doi.org:10.1016/j.neuroimage.2015.03.071

      (22) Mollazadeh, M. et al. Spatiotemporal variation of multiple neurophysiological signals in the primary motor cortex during dexterous reach-to-grasp movements. J Neurosci 31, 15531-15543 (2011). https://doi.org:10.1523/JNEUROSCI.2999-11.2011

      (23) Bansal, A. K., Vargas-Irwin, C. E., Truccolo, W. & Donoghue, J. P. Relationships among low-frequency local field potentials, spiking activity, and three-dimensional reach and grasp kinematics in primary motor and ventral premotor cortices. J Neurophysiol 105, 1603-1619 (2011). https://doi.org:10.1152/jn.00532.2010

      (24) Flint, R. D., Ethier, C., Oby, E. R., Miller, L. E. & Slutzky, M. W. Local field potentials allow accurate decoding of muscle activity. J Neurophysiol 108, 18-24 (2012). https://doi.org:10.1152/jn.00832.2011

      (25) Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487, 51-56 (2012). https://doi.org:10.1038/nature11129

      (26) Bassett, D. S. et al. Dynamic reconfiguration of human brain networks during learning. Proc Natl Acad Sci U S A 108, 7641-7646 (2011). https://doi.org:10.1073/pnas.1018985108

      (27) Albouy, G., King, B. R., Maquet, P. & Doyon, J. Hippocampus and striatum: dynamics and interaction during acquisition and sleep-related motor sequence memory consolidation. Hippocampus 23, 985-1004 (2013). https://doi.org:10.1002/hipo.22183

      (28) Albouy, G. et al. Neural correlates of performance variability during motor sequence acquisition. Neuroimage 60, 324-331 (2012). https://doi.org:10.1016/j.neuroimage.2011.12.049

      (29) Qin, Y. L., McNaughton, B. L., Skaggs, W. E. & Barnes, C. A. Memory reprocessing in corticocortical and hippocampocortical neuronal ensembles. Philos Trans R Soc Lond B Biol Sci 352, 1525-1533 (1997). https://doi.org:10.1098/rstb.1997.0139

      (30) Euston, D. R., Tatsuno, M. & McNaughton, B. L. Fast-forward playback of recent memory sequences in prefrontal cortex during sleep. Science 318, 1147-1150 (2007). https://doi.org:10.1126/science.1148979

      (31) Molle, M. & Born, J. Hippocampus whispering in deep sleep to prefrontal cortex--for good memories? Neuron 61, 496-498 (2009). https://doi.org:10.1016/j.neuron.2009.02.002

      (32) Frankland, P. W. & Bontempi, B. The organization of recent and remote memories. Nat Rev Neurosci 6, 119-130 (2005). https://doi.org:10.1038/nrn1607

      (33) Jacobacci, F. et al. Rapid hippocampal plasticity supports motor sequence learning. Proc Natl Acad Sci U S A 117, 23898-23903 (2020). https://doi.org:10.1073/pnas.2009576117

      (34) Albouy, G. et al. Maintaining vs. enhancing motor sequence memories: respective roles of striatal and hippocampal systems. Neuroimage 108, 423-434 (2015). https://doi.org:10.1016/j.neuroimage.2014.12.049

      (35) Gais, S. et al. Sleep transforms the cerebral trace of declarative memories. Proc Natl Acad Sci U S A 104, 18778-18783 (2007). https://doi.org:10.1073/pnas.0705454104

      (36) Sterpenich, V. et al. Sleep promotes the neural reorganization of remote emotional memory. J Neurosci 29, 5143-5152 (2009). https://doi.org:10.1523/JNEUROSCI.0561-09.2009

      (37) Euston, D. R., Gruber, A. J. & McNaughton, B. L. The role of medial prefrontal cortex in memory and decision making. Neuron 76, 1057-1070 (2012). https://doi.org:10.1016/j.neuron.2012.12.002

      (38) van Kesteren, M. T., Fernandez, G., Norris, D. G. & Hermans, E. J. Persistent schema-dependent hippocampal-neocortical connectivity during memory encoding and postencoding rest in humans. Proc Natl Acad Sci U S A 107, 7550-7555 (2010). https://doi.org:10.1073/pnas.0914892107

      (39) van Kesteren, M. T., Ruiter, D. J., Fernandez, G. & Henson, R. N. How schema and novelty augment memory formation. Trends Neurosci 35, 211-219 (2012). https://doi.org:10.1016/j.tins.2012.02.001

      (40) Wagner, A. D. et al. Building memories: remembering and forgetting of verbal experiences as predicted by brain activity. Science (New York, N.Y.) 281, 1188-1191 (1998).

      (41) Ashe, J., Lungu, O. V., Basford, A. T. & Lu, X. Cortical control of motor sequences. Curr Opin Neurobiol 16, 213-221 (2006).

      (42) Hikosaka, O., Nakamura, K., Sakai, K. & Nakahara, H. Central mechanisms of motor skill learning. Curr Opin Neurobiol 12, 217-222 (2002).

      (43) Penhune, V. B. & Steele, C. J. Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behav. Brain Res. 226, 579-591 (2012). https://doi.org:10.1016/j.bbr.2011.09.044

      (44) Doyon, J. et al. Contributions of the basal ganglia and functionally related brain structures to motor learning. Behavioural brain research 199, 61-75 (2009). https://doi.org:10.1016/j.bbr.2008.11.012

      (45) Schendan, H. E., Searl, M. M., Melrose, R. J. & Stern, C. E. An FMRI study of the role of the medial temporal lobe in implicit and explicit sequence learning. Neuron 37, 1013-1025 (2003). https://doi.org:10.1016/s0896-6273(03)00123-5

      (46) Morris, R. G. M. Elements of a neurobiological theory of hippocampal function: the role of synaptic plasticity, synaptic tagging and schemas. The European journal of neuroscience 23, 2829-2846 (2006). https://doi.org:10.1111/j.1460-9568.2006.04888.x

      (47) Tse, D. et al. Schemas and memory consolidation. Science 316, 76-82 (2007). https://doi.org:10.1126/science.1135935

      (48) Berlot, E., Popp, N. J. & Diedrichsen, J. A critical re-evaluation of fMRI signatures of motor sequence learning. Elife 9 (2020). https://doi.org:10.7554/eLife.55241

      (49) Bonstrup, M. et al. A Rapid Form of Offline Consolidation in Skill Learning. Curr Biol 29, 1346-1351 e1344 (2019). https://doi.org:10.1016/j.cub.2019.02.049

      (50) Kornysheva, K. et al. Neural Competitive Queuing of Ordinal Structure Underlies Skilled Sequential Action. Neuron 101, 1166-1180 e1163 (2019). https://doi.org:10.1016/j.neuron.2019.01.018

      (51) Pan, S. C. & Rickard, T. C. Sleep and motor learning: Is there room for consolidation? Psychol Bull 141, 812-834 (2015). https://doi.org:10.1037/bul0000009

      (52) Rickard, T. C., Cai, D. J., Rieth, C. A., Jones, J. & Ard, M. C. Sleep does not enhance motor sequence learning. J Exp Psychol Learn Mem Cogn 34, 834-842 (2008). https://doi.org:10.1037/0278-7393.34.4.834

      (53) Brawn, T. P., Fenn, K. M., Nusbaum, H. C. & Margoliash, D. Consolidating the effects of waking and sleep on motor-sequence learning. J Neurosci 30, 13977-13982 (2010). https://doi.org:10.1523/JNEUROSCI.3295-10.2010

      (54) Bonstrup, M., Iturrate, I., Hebart, M. N., Censor, N. & Cohen, L. G. Mechanisms of offline motor learning at a microscale of seconds in large-scale crowdsourced data. NPJ Sci Learn 5, 7 (2020). https://doi.org:10.1038/s41539-020-0066-9

      (55) Gupta, M. W. & Rickard, T. C. Dissipation of reactive inhibition is sufficient to explain post-rest improvements in motor sequence learning. NPJ Sci Learn 7, 25 (2022). https://doi.org:10.1038/s41539-022-00140-z

      (56) Jacobacci, F. et al. Rapid hippocampal plasticity supports motor sequence learning. Proceedings of the National Academy of Sciences 117, 23898-23903 (2020).

      (57) Brooks, E., Wallis, S., Hendrikse, J. & Coxon, J. Micro-consolidation occurs when learning an implicit motor sequence, but is not influenced by HIIT exercise. NPJ Sci Learn 9, 23 (2024). https://doi.org:10.1038/s41539-024-00238-6

      (58) Deleglise, A. et al. Human motor sequence learning drives transient changes in network topology and hippocampal connectivity early during memory consolidation. Cereb Cortex 33, 6120-6131 (2023). https://doi.org:10.1093/cercor/bhac489

      (59) Buzsaki, G. Hippocampal sharp wave-ripple: A cognitive biomarker for episodic memory and planning. Hippocampus 25, 1073-1188 (2015). https://doi.org:10.1002/hipo.22488

      (60) Gupta, M. W. & Rickard, T. C. Comparison of online, offline, and hybrid hypotheses of motor sequence learning using a quantitative model that incorporate reactive inhibition. Sci Rep 14, 4661 (2024). https://doi.org:10.1038/s41598-024-52726-9

      (61) Das, A., Karagiorgis, A., Diedrichsen, J., Stenner, M.-P. & Azanon, E. “Micro-offline gains” convey no benefit for motor skill learning. bioRxiv, 2024.07.11.602795 (2024). https://doi.org:10.1101/2024.07.11.602795

      (62) Mylonas, D. et al. Maintenance of Procedural Motor Memory across Brief Rest Periods Requires the Hippocampus. J Neurosci 44 (2024). https://doi.org:10.1523/JNEUROSCI.1839-23.2024

    2. eLife Assessment

      This valuable study investigates how the neural representation of individual finger movements changes during the early period of sequence learning. By combining a new method for extracting features from human magnetoencephalography data and decoding analyses, the authors provide incomplete evidence of an early, swift change in the brain regions correlated with sequence learning, including a set of previously unreported frontal cortical regions. The addition of more control analyses to rule out that head movement artefacts influence the findings, and to further explain the proposal of offline contextualization during short rest periods as the basis for improved performance, would strengthen the manuscript.

    3. Reviewer #1 (Public review):

      Summary:

      This study addresses the issue of rapid skill learning and whether individual sequence elements (here: finger presses) are differentially represented in human MEG data. The authors use a decoding approach to classify individual finger elements, and achieve an accuracy of around 94%. A relevant finding is that the neural representations of individual finger elements dynamically change over the course of learning. This would be highly relevant for any attempts to develop better brain-machine interfaces - one now can decode individual elements within a sequence with high precision, but these representations are not static but develop over the course of learning.

      Strengths:

      The work follows a large body of work from the same group on the behavioural and neural foundations of sequence learning. The behavioural task is well established and neatly designed to allow for tracking learning and how individual sequence elements contribute. The inclusion of short offline rest periods between learning epochs has been influential because it has revealed that a lot, if not most of the gains in behaviour (ie speed of finger movements) occur in these so-called micro-offline rest periods.

      The authors use a range of new decoding techniques, and exhaustively interrogate their data in different ways, using different decoding approaches. Regardless of the approach, impressively high decoding accuracies are observed, but when using a hybrid approach that combines the MEG data in different ways, the authors observe decoding accuracies of individual sequence elements from the MEG data of up to 94%.

      Weaknesses:

      There are a few concerns which the authors may well be able to resolve. These are not weaknesses as such, but factors that would be helpful to address as these concern potential contributions to the results that one would like to rule out.

      Regarding the decoding results shown in Figure 2 etc, a concern is that within individual frequency bands, the highest accuracy seems to be within frequencies that match the rate of keypresses. This is a general concern when relating movement to brain activity, so is not specific to decoding as done here. As far as reported, there was no specific restraint to the arm or shoulder, and even then it is conceivable that small head movements would correlate highly with the vigor of individual finger movements. This concern is supported by the highest contribution in decoding accuracy being in middle frontal regions - midline structures that would be specifically sensitive to movement artefacts and don't seem to come to mind as key structures for very simple sequential keypress tasks such as this - and the overall pattern is remarkably symmetrical (despite being a unimanual finger task) and spatially broad. This issue may well be matching the time course of learning, as the vigor and speed of finger presses will also influence the degree to which the arm/shoulder and head move.

      This is not to say that no useful information is contained within either the individual frequency bands or the broadband data. But it raises the question of whether a lot is dominated by movement "artefacts", and one may get a more specific answer by removing any such contributions.

      A somewhat related point is this: when combining voxel and parcel space, a concern is whether a degree of circularity may have contributed to the improved accuracy of the combined data, because it seems to use the same MEG signals twice - the voxels most contributing are also those contributing most to a parcel being identified as relevant, as parcels reflect the average of voxels within a boundary. In this context, I struggled to understand the explanation given, ie that the improved accuracy of the hybrid model may be due to "lower spatially resolved whole-brain and higher spatially resolved regional activity patterns". Firstly, there will be a relatively high degree of spatial contiguity among voxels because of the nature of the signal measured, ie nearby individual voxels are unlikely to be independent. Secondly, the voxel data gives a somewhat misleading sense of precision; the inversion can be set up to give an estimate for each voxel, but there will not just be dependence among adjacent voxels, but also substantial variation in the sensitivity and confidence with which activity can be projected to different parts of the brain. Midline and deeper structures come to mind, where the inversion will be more problematic than for regions along the dorsal convexity of the brain, and a concern is that in those midline structures, the highest decoding accuracy is seen.

      Some of these concerns could be addressed by recording head movement (with enough precision) to regress out these contributions. The authors state that head movement was monitored with 3 fiducials, and their timecourses ought to provide a way to deal with this issue. The ICA procedure may not have sufficiently dealt with removing movement-related problems, but one could eg relate individual components that were identified to the keypresses as another means for checking. An alternative could be to focus on frequency ranges above the movement frequencies. The accuracy for those still seems impressive, and may provide a slightly more biologically plausible assessment.
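
To make the suggestion of regressing out head movement concrete, here is a minimal Python sketch, assuming a single head-position trace as the nuisance regressor. The function name `regress_out` and the synthetic sine and cosine traces are illustrative assumptions, not the authors' pipeline or recorded data; in practice the coordinates of all three fiducials would presumably be entered jointly.

```python
import math

def regress_out(signal, nuisance):
    """Return the residual of `signal` after removing a least-squares fit to `nuisance`."""
    n = len(signal)
    mean_signal = sum(signal) / n
    mean_nuisance = sum(nuisance) / n
    centered_nuisance = [x - mean_nuisance for x in nuisance]
    covariance = sum((s - mean_signal) * x for s, x in zip(signal, centered_nuisance))
    variance = sum(x * x for x in centered_nuisance)
    slope = covariance / variance
    # Residual = mean-centred signal minus the part explained by the nuisance trace.
    return [(s - mean_signal) - slope * x for s, x in zip(signal, centered_nuisance)]

# Synthetic stand-ins (assumptions): a head-position trace and a sensor time course
# that mixes a "neural" component with a scaled copy of the head movement.
head_position = [math.sin(0.3 * t) for t in range(100)]
neural = [math.cos(1.7 * t) for t in range(100)]
sensor = [n_val + 3.0 * h for n_val, h in zip(neural, head_position)]

cleaned = regress_out(sensor, head_position)

# By construction the residual is orthogonal to the centred head-position trace.
mean_head = sum(head_position) / len(head_position)
leftover = sum(c * (h - mean_head) for c, h in zip(cleaned, head_position))
print(f"residual covariance with head movement: {leftover:.2e}")
```

Decoding could then be rerun on the cleaned time courses to see how much of the accuracy survives once the movement-related variance is removed.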

      One question concerns the interpretation of the results shown in Figure 4. They imply that during the course of learning, entirely different brain networks underpin the behaviour. Not only that, but they also include regions that would seem rather unexpected to be key nodes for learning and expressing relatively simple finger sequences, such as here. What then is the biological plausibility of these results? The authors seem to circumnavigate this issue by moving into a distance metric that captures the (neural network) changes over the course of learning, but the discussion seems detached from which regions are actually involved; or they offer a rather broad discussion of the anatomical regions identified here, eg in the context of LFOs, where they merely refer to "frontoparietal regions".

      If I understand correctly, the offline neural representation analysis is in essence the comparison of the last keypress vs the first keypress of the next sequence. In that sense, the activity during offline rest periods is actually not considered. This makes the nomenclature somewhat confusing. While it matches the behavioural analysis, having only key presses one can't do it in any other way, but here the authors actually do have recordings of brain activity during offline rest. So at the very least calling it offline neural representation is misleading to this reviewer because what is compared is activity during the last and during the next keypress, not activity during offline periods. But it also seems a missed opportunity - the authors argue that most of the relevant learning occurs during offline rest periods, yet there is no attempt to actually test whether activity during this period can be useful for the questions at hand here.

    4. Reviewer #2 (Public review):

      Summary

      Dash et al. asked whether and how the neural representation of individual finger movements is "contextualized" within a trained sequence during the very early period of sequential skill learning by using decoding of MEG signal. Specifically, they assessed whether/how the same finger presses (pressing index finger) embedded in the different ordinal positions of a practiced sequence (4-1-3-2-4; here, the numbers 1 through 4 correspond to the little through the index fingers of the non-dominant left hand) change their representation (MEG feature). They did this by computing either the decoding accuracy of the index finger at the ordinal positions 1 vs. 5 (index_OP1 vs index_OP5) or pattern distance between index_OP1 vs. index_OP5 at each training trial and found that both the decoding accuracy and the pattern distance progressively increase over the course of learning trials. More interestingly, they also computed the pattern distance for index_OP5 for the last execution of a practice trial vs. index_OP1 for the first execution in the next practice trial (i.e., across the rest period). This "off-line" distance was significantly larger than the "on-line" distance, which was computed within practice trials and predicted micro-offline skill gain. Based on these results, the authors conclude that the differentiation of representation for the identical movement embedded in different positions of a sequential skill ("contextualization") primarily occurs during early skill learning, especially during rest, consistent with the recent theory of the "micro-offline learning" proposed by the authors' group. I think this is an important and timely topic for the field of motor learning and beyond.

      Strengths

      The specific strengths of the current work are as follows. First, the use of temporally rich neural information (MEG signal) has a large advantage over previous studies testing sequential representations using fMRI. This allowed the authors to examine the earliest period (= the first few minutes of training) of skill learning with finer temporal resolution. Second, through the optimization of MEG feature extraction, the current study achieved extremely high decoding accuracy (approx. 94%) compared to previous works. As claimed by the authors, this is one of the strengths of the paper (but see my comments). Third, although some potential refinement might be needed, comparing "online" and "offline" pattern distance is a neat idea.

      Weaknesses

      Along with the strengths I raised above, the paper has some weaknesses. First, the pursuit of high decoding accuracy, especially the choice of time points and window length (i.e., 200 msec window starting from 0 msec from key press onset), casts a shadow on the interpretation of the main result. Currently, it is unclear whether the decoding results simply reflect behavioral change or true underlying neural change. As shown in the behavioral data, the key press speed reached 3~4 presses per second already at around the end of the early learning period (11th trial), which means inter-press intervals become as short as 250-330 msec. Thus, in more than 60% of the training period data, the time window for MEG feature extraction (200 msec) spans around 60% or more of the inter-press interval. Considering that the preparation/cueing of subsequent presses starts ahead of the actual press (e.g., Kornysheva et al., 2019) and/or potential online planning (e.g., Ariani and Diedrichsen, 2019), the decoder likely has captured this future-press information as well as the signal related to the current key press, independent of the formation of a genuine sequential representation (e.g., "contextualization" of individual presses). This may also explain the gradual increase in decoding accuracy or pattern distance between index_OP1 vs. index_OP5 (Figure 4C and 5A), which co-occurred with performance improvement, as shorter inter-press intervals are more favorable for dissociating the two index finger presses followed by different finger presses. The compromised decoding accuracies for the control sequences can be explained by similar logic. Therefore, more careful consideration and elaborated discussion seem necessary when trying to both achieve high-performance decoding and assess early skill learning, as it can impact all the subsequent analyses.
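
The timing argument above reduces to simple arithmetic, sketched here in Python. The function name `window_fraction` is a hypothetical helper introduced for illustration; the 200 msec window and the 3~4 presses per second are the figures quoted in the paragraph.

```python
def window_fraction(presses_per_second, window_ms=200.0):
    """Fraction of one inter-press interval covered by the feature window."""
    inter_press_interval_ms = 1000.0 / presses_per_second
    return window_ms / inter_press_interval_ms

for rate in (3.0, 4.0):
    interval_ms = 1000.0 / rate
    print(f"{rate:.0f} presses/s: interval {interval_ms:.0f} ms, "
          f"window covers {window_fraction(rate):.0%}")
```

At these rates the 200 msec window covers roughly 60% to 80% of the interval between consecutive presses, which is why signal related to neighbouring presses could plausibly leak into the features.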

      Related to the above point, testing only one particular sequence (4-1-3-2-4), aside from the control ones, limits the generalizability of the finding. This also may have contributed to the extremely high decoding accuracy reported in the current study.

      In terms of clinical BCI, one potential area of relevance of the study claimed by the authors, it is not clear that the specific time window chosen in the current study (up to 200 msec since key press onset) is really useful. In most cases, clinical BCI would target neural signals with no overt movement execution due to patients' inability to move (e.g., Hochberg et al., 2012). Given the time window, the surprisingly high performance of the current decoder may result from sensory feedback and/or planning of subsequent movement, which may not always be available in the clinical BCI context. Of course, the decoding accuracy is still much higher than chance even when using signal before the key press (as shown in Figure 4 Supplement 2), but it is not immediately clear to me how the authors relate their high decoding accuracy based on post-movement signal to clinical BCI settings.

      One of the important and fascinating claims of the current study is that the "contextualization" of individual finger movements in a trained sequence specifically occurs during short rest periods in very early skill learning, echoing the recent theory of micro-offline learning proposed by the authors' group. Here, I think two points need to be clarified. First, the concept of "contextualization" is kept somewhat blurry throughout the text. It is only at the later part of the Discussion (around line #330 on page 13) that some potential mechanism for the "contextualization" is provided as "what-and-where" binding. Still, it is unclear what "contextualization" actually is in the current data, as the MEG signal analyzed is extracted from 0-200 msec after the keypress. If one thinks something is contextualizing an action, that contextualization should come earlier than the action itself.

      The second point is that the result provided by the authors is not yet convincing enough to support the claim that "contextualization" occurs during rest. In the original analysis, the authors presented the statistical significance regarding the correlation between the "offline" pattern differentiation and micro-offline skill gain (Figure 5. Supplement 1), as well as the larger "offline" distance than "online" distance (Figure 5B). However, this analysis looks like regressing two variables (monotonically) increasing as a function of the trial. Although some information in this analysis, such as what the independent/dependent variables were or how individual subjects were treated, was missing in the Methods, getting a statistically significant slope seems unsurprising in such a situation. Also, curiously, the same quantitative evidence was not provided for its "online" counterpart, and the authors only briefly mentioned in the text that there was no significant correlation between them. It may be true looking at the data in Figure 5A as the online representation distance looks less monotonically changing, but the classification accuracy presented in Figure 4C, which should reflect similar representational distance, shows a more monotonic increase up to the 11th trial. Further, the ways the "online" and "offline" representation distance was estimated seem to make them not directly comparable. While the "online" distance was computed using all the correct press data within each 10 sec of execution, the "offline" distance is basically computed by only two presses (i.e., the last index_OP5 vs. the first index_OP1 separated by 10 sec of rest). Theoretically, the distance between the neural activity patterns for temporally closer events tends to be closer than that between the patterns for temporally far-apart events. It would be fairer to use the distance between the first index_OP1 vs. the last index_OP5 within an execution period for "online" distance, as well.
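
The concern about regressing two variables that both increase across trials can be illustrated with a small Python sketch. The two series below are entirely synthetic assumptions (a shared upward trend plus unrelated fluctuations) and do not represent the study's data; the helper names are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def first_differences(series):
    """Trial-by-trial changes: value on trial t+1 minus value on trial t."""
    return [b - a for a, b in zip(series, series[1:])]

trials = range(36)
# Both series share only the linear trend; their fluctuations are unrelated.
series_a = [t + 2.0 * (1 if t % 2 == 0 else -1) for t in trials]
series_b = [t + 2.0 * (1 if t % 4 < 2 else -1) for t in trials]

r_raw = pearson_r(series_a, series_b)
r_changes = pearson_r(first_differences(series_a), first_differences(series_b))
print(f"raw series: r = {r_raw:.2f}; trial-by-trial changes: r = {r_changes:.2f}")
```

In this toy case the raw series correlate strongly purely because of the shared trend, whereas their trial-by-trial changes are close to uncorrelated, so a significant slope between two trending measures is weak evidence on its own.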

      A related concern regarding the control analysis, where individual values for max speed and the degree of online contextualization were compared (Figure 5 Supplement 3), is whether the individual difference is meaningful. If I understood correctly, the optimization of the decoding process (temporal window, feature inclusion/reduction, decoder, etc.) was performed for individual participants, and the same feature extraction was also employed for the analysis of representation distance (i.e., contextualization). If this is the case, the distances are individually differently calculated and they may need to be normalized relative to some stable reference (e.g., 1 vs. 4 or average distance within the control sequence presses) before comparison across the individuals.

    5. Reviewer #3 (Public review):

      Summary:

      One goal of this paper is to introduce a new approach for highly accurate decoding of finger movements from human magnetoencephalography data via dimension reduction of a "multi-scale, hybrid" feature space. Following this decoding approach, the authors aim to show that early skill learning involves "contextualization" of the neural coding of individual movements, relative to their position in a sequence of consecutive movements. Furthermore, they aim to show that this "contextualization" develops primarily during short rest periods interspersed with skill training, and correlates with a performance metric which the authors interpret as an indicator of offline learning.

      Strengths:

      A clear strength of the paper is the innovative decoding approach, which achieves impressive decoding accuracies via dimension reduction of a "multi-scale, hybrid space". This hybrid-space approach follows the neurobiologically plausible idea of the concurrent distribution of neural coding across local circuits as well as large-scale networks. A further strength of the study is the large number of tested dimension reduction techniques and classifiers (though the manuscript reveals little about the comparison of the latter).

      A simple control analysis based on shuffled class labels could lend further support to this complex decoding approach. As a control analysis that completely rules out any source of overfitting, the authors could test the decoder after shuffling class labels. Following such shuffling, decoding accuracies should drop to chance level for all decoding approaches, including the optimized decoder. This would also provide an estimate of actual chance-level performance (which is informative over and beyond the theoretical chance level). Furthermore, currently, the manuscript does not explain the huge drop in decoding accuracies for the voxel-space decoding (Figure 3B). Finally, the authors' approach to cortical parcellation raises questions regarding the information carried by varying dipole orientations within a parcel (which currently seems to be ignored?) and the implementation of the mean-flipping method (given that there are two dimensions - space and time - what do the authors refer to when they talk about the sign of the "average source", line 477?).
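
The shuffled-label control could look roughly like the following Python sketch. Everything here is an assumption for illustration: the nearest-centroid classifier stands in for the authors' optimized decoder, the two-dimensional Gaussian features are synthetic, and the function and variable names are hypothetical.

```python
import random

def nearest_centroid_accuracy(features, labels, train_idx, test_idx):
    """Fit one centroid per class on training trials; return held-out accuracy."""
    sums, counts = {}, {}
    for i in train_idx:
        total = sums.setdefault(labels[i], [0.0] * len(features[i]))
        for d, value in enumerate(features[i]):
            total[d] += value
        counts[labels[i]] = counts.get(labels[i], 0) + 1
    centroids = {c: [v / counts[c] for v in sums[c]] for c in sums}

    def predict(x):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    hits = sum(predict(features[i]) == labels[i] for i in test_idx)
    return hits / len(test_idx)

rng = random.Random(0)
# Four synthetic "finger" classes with well-separated means (assumption).
class_means = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (5.0, 5.0)]
features, labels, train_idx, test_idx = [], [], [], []
for label, (mx, my) in enumerate(class_means):
    for trial in range(30):
        # First 20 trials per class train the decoder, the last 10 test it.
        (train_idx if trial < 20 else test_idx).append(len(features))
        features.append([rng.gauss(mx, 0.5), rng.gauss(my, 0.5)])
        labels.append(label)

observed = nearest_centroid_accuracy(features, labels, train_idx, test_idx)

# Null distribution: rerun the whole fit-and-test procedure on shuffled labels.
null_accuracies = []
for _ in range(200):
    shuffled = labels[:]
    rng.shuffle(shuffled)
    null_accuracies.append(
        nearest_centroid_accuracy(features, shuffled, train_idx, test_idx))

empirical_chance = sum(null_accuracies) / len(null_accuracies)
p_value = (1 + sum(a >= observed for a in null_accuracies)) / (1 + len(null_accuracies))
print(f"observed {observed:.2f}, empirical chance {empirical_chance:.2f}, p = {p_value:.3f}")
```

The key design point is that the shuffle is applied before the entire pipeline (feature selection, dimension reduction, and classifier fitting in the real analysis), so that any leakage or overfitting would show up as above-chance accuracy in the null distribution.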

      Weaknesses:

      A clear weakness of the paper lies in the authors' conclusions regarding "contextualization". Several potential confounds, described below, question the neurobiological implications proposed by the authors and provide a simpler explanation of the results. Furthermore, the paper follows the assumption that short breaks result in offline skill learning, while recent evidence, described below, casts doubt on this assumption.

      The authors interpret the ordinal position information captured by their decoding approach as a reflection of neural coding dedicated to the local context of a movement (Figure 4). One way to dissociate ordinal position information from information about the moving effectors is to train a classifier on one sequence and test the classifier on other sequences that require the same movements, but in different positions (Kornysheva et al., Neuron 2019). In the present study, however, participants trained to repeat a single sequence (4-1-3-2-4). As a result, ordinal position information is potentially confounded by the fixed finger transitions around each of the two critical positions (first and fifth press). Across consecutive correct sequences, the first keypress in a given sequence was always preceded by a movement of the index finger (=last movement of the preceding sequence), and followed by a little finger movement. The last keypress, on the other hand, was always preceded by a ring finger movement, and followed by an index finger movement (=first movement of the next sequence). Figure 4 - Supplement 2 shows that finger identity can be decoded with high accuracy (>70%) across a large time window around the time of the key press, up to at least ±100 ms (and likely beyond, given that decoding accuracy is still high at the boundaries of the window depicted in that figure). This time window approaches the keypress transition times in this study. Given that distinct finger transitions characterized the first and fifth keypress, the classifier could thus rely on persistent (or "lingering") information from the preceding finger movement, and/or "preparatory" information about the subsequent finger movement, in order to dissociate the first and fifth keypress.
      Currently, the manuscript provides no evidence that the context information captured by the decoding approach is more than a by-product of temporally extended, and therefore overlapping, but independent neural representations of consecutive keypresses that are executed in close temporal proximity - rather than a neural representation dedicated to context.

      Such temporal overlap of consecutive, independent finger representations may also account for the dynamics of "ordinal coding"/"contextualization", i.e., the increase in 2-class decoding accuracy, across Day 1 (Figure 4C). As learning progresses, both tapping speed and the consistency of keypress transition times increase (Figure 1), i.e., consecutive keypresses are closer in time, and more consistently so. As a result, information related to a given keypress is increasingly overlapping in time with information related to the preceding and subsequent keypresses. The authors seem to argue that their regression analysis in Figure 5 - Figure Supplement 3 speaks against any influence of tapping speed on "ordinal coding" (even though that argument is not made explicitly in the manuscript). However, Figure 5 - Figure Supplement 3 shows inter-individual differences in a between-subject analysis (across trials, as in panel A, or separately for each trial, as in panel B), and, therefore, says little about the within-subject dynamics of "ordinal coding" across the experiment. A regression of trial-by-trial "ordinal coding" on trial-by-trial tapping speed (either within-subject or at a group-level, after averaging across subjects) could address this issue. Given the highly similar dynamics of "ordinal coding" on the one hand (Figure 4C), and tapping speed on the other hand (Figure 1B), I would expect a strong relationship between the two in the suggested within-subject (or group-level) regression. Furthermore, learning should increase the number of (consecutively) correct sequences, and, thus, the consistency of finger transitions. 
      Therefore, the increase in 2-class decoding accuracy may simply reflect an increasing overlap in time of increasingly consistent information from consecutive keypresses, which allows the classifier to dissociate the first and fifth keypress more reliably as learning progresses, simply based on the characteristic finger transitions associated with each. In other words, given that the physical context of a given keypress changes as learning progresses - keypresses move closer together in time and are more consistently correct - it seems problematic to conclude that the mental representation of that context changes. To draw that conclusion, the physical context should remain stable (or any changes to the physical context should be controlled for).

      A similar difference in physical context may explain why neural representation distances ("differentiation") differ between rest and practice (Figure 5). The authors define "offline differentiation" by comparing the hybrid space features of the last index finger movement of a trial (ordinal position 5) and the first index finger movement of the next trial (ordinal position 1). However, the latter is not only the first movement in the sequence but also the very first movement in that trial (at least in trials that started with a correct sequence), i.e., not preceded by any recent movement. In contrast, the last index finger of the last correct sequence in the preceding trial includes the characteristic finger transition from the fourth to the fifth movement. Thus, there is more overlapping information arising from the consistent, neighbouring keypresses for the last index finger movement, compared to the first index finger movement of the next trial. A strong difference (larger neural representation distance) between these two movements is, therefore, not surprising, given the task design, and this difference is also expected to increase with learning, given the increase in tapping speed, and the consequent stronger overlap in representations for consecutive keypresses. Furthermore, initiating a new sequence involves pre-planning, while ongoing practice relies on online planning (Ariani et al., eNeuro 2021), i.e., two mental operations that are dissociable at the level of neural representation (Ariani et al., bioRxiv 2023).

      Given these differences in the physical context and associated mental processes, it is not surprising that "offline differentiation", as defined here, is more pronounced than "online differentiation". For the latter, the authors compared movements that were better matched regarding the presence of consistent preceding and subsequent keypresses (online differentiation was defined as the mean difference between all first vs. last index finger movements during practice). It is unclear why the authors did not follow a similar definition for "online differentiation" as for "micro-online gains" (and, indeed, a definition that is more consistent with their definition of "offline differentiation"), i.e., the difference between the first index finger movement of the first correct sequence during practice, and the last index finger of the last correct sequence. While these two movements are, again, not matched for the presence of neighbouring keypresses (see the argument above), this mismatch would at least be the same across "offline differentiation" and "online differentiation", so they would be more comparable.

      A further complication in interpreting the results regarding "contextualization" stems from the visual feedback that participants received during the task. Each keypress generated an asterisk shown above the string on the screen, irrespective of whether the keypress was correct or incorrect. As a result, incorrect (e.g., additional, or missing) keypresses could shift the phase of the visual feedback string (of asterisks) relative to the ordinal position of the current movement in the sequence (e.g., the fifth movement in the sequence could coincide with the presentation of any asterisk in the string, from the first to the fifth). Given that more incorrect keypresses are expected at the start of the experiment, compared to later stages, the consistency in visual feedback position, relative to the ordinal position of the movement in the sequence, increased across the experiment. A better differentiation between the first and the fifth movement with learning could, therefore, simply reflect better decoding of the more consistent visual feedback, based either on the feedback-induced brain response, or feedback-induced eye movements (the study did not include eye tracking). It is not clear why the authors introduced this complicated visual feedback in their task, besides consistency with their previous studies.

      The authors report a significant correlation between "offline differentiation" and cumulative micro-offline gains. However, it would be more informative to correlate trial-by-trial changes in each of the two variables. This would address the question of whether there is a trial-by-trial relation between the degree of "contextualization" and the amount of micro-offline gains - are performance changes (micro-offline gains) less pronounced across rest periods for which the change in "contextualization" is relatively low? Furthermore, is the relationship between micro-offline gains and "offline differentiation" significantly stronger than the relationship between micro-offline gains and "online differentiation"?

      The authors follow the assumption that micro-offline gains reflect offline learning. However, there is no direct evidence in the literature that micro-offline gains really result from offline learning, i.e., an improvement in skill level. On the contrary, recent evidence questions this interpretation (Gupta & Rickard, npj Sci Learn 2022; Gupta & Rickard, Sci Rep 2024; Das et al., bioRxiv 2024). Instead, there is evidence that micro-offline gains are transient performance benefits that emerge when participants train with breaks, compared to participants who train without breaks; however, these benefits vanish within seconds after training if both groups of participants perform under comparable conditions (Das et al., bioRxiv 2024).

    1. eLife Assessment

      This study presents valuable insights into the organization of second-order circuits for gustatory neurons, particularly how they integrate opposing taste inputs and the metabolic states that regulate feeding behavior. An elegant, compelling combination of multiple techniques discovered the target neurons for gustatory integration. However, the functional and behavioral evidence for the function of these neurons is incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      Mollá-Albaladejo et al. investigate the neurons downstream of Gr64f and Gr66a, called G2Ns. They identify downstream neurons using trans-Tango labeling with RFP and then perform bulk RNA-seq on the RFP-sorted cells. Gene expression is up- or downregulated between the cell populations and between fed and starved states. They specifically identify Leucokinin as a neuropeptide that is upregulated in starved Gr66a cells. Leucokinin cells, identified by a GAL4 line, indeed show higher expression when starved, especially in the SEZ. Furthermore, Leucokinin cells colocalize with the trans-Tango signal from downstream neurons of both GRs. This connection is confirmed with GRASP. According to EM data, Leucokinin cells in the SEZ receive a lot of input and connect to many downstream neurons. In behavior experiments performed with flies lacking Leucokinin neurons, flies show reduced responsiveness to sugar and bitter mixtures when starved. The authors suggest that Leucokinin neurons integrate bitter and sugar tastes and that their output is modified by a hunger state.

      Strengths:

      The authors use a multitude of tools to identify SELK neurons downstream of taste sensory neurons and as starvation-sensitive cells. This study provides an example of how genetic labeling, RNA-seq, and EM analysis can be combined to investigate neural circuits.

      Weaknesses:

      The authors do not show a functional connection between sensory neurons and SELK neurons. Additionally, data from RNA-seq, anatomical studies, and EM analysis are sometimes contradictory in terms of connectivity. A GRASP signal is not foolproof evidence that cells are synaptically connected.

      The authors describe a behavioral phenotype when flies are starved; however, they do not use a specific driver for the described cell type, and thus they should tone down their claims.

      Generally, the authors do not provide a big advancement to the field and some of the results are contradictory with previous publications.

    3. Reviewer #2 (Public review):

      Summary:

      A core task of the brain is processing sensory cues from the environment. The neural mechanisms of how sensory information is transmitted from peripheral sense organs to subsequent processing in defined brain centers remain an important topic in neuroscience. The taste system assesses the palatability of food by evaluating the chemical composition and nutrient content while integrating the current need for energy by assessing the satiation level of the organism. The current manuscript provides insights into the early circuits of gustatory coding using the fruit fly as a model. By combining trans-Tango and FACS-based bulk RNAseq to assess the target neurons of sweet sensing (using Gr64f-Gal4) and bitter sensing (using Gr66a-Gal4) in a first set of experiments, the authors investigate genes that are differentially expressed or co-expressed in normal and starved conditions. With a focus on neuropeptides and neurotransmitters, expression differences across the conditions were assessed, resulting in the identification of Leucokinin as a potentially interesting gene. The notion is further supported by RNAseq of Lk-Gal4>mCD8:GFP sorted cells and immunostainings. GRASP and BacTrace experiments further support that the two Lk-expressing cells in the SEZ should indeed be postsynaptic to both types of sensory neurons. Using EM-based connectomics data (based on a previous publication by Engert et al.), the authors also look for downstream targets of the bitter versus sweet gustatory neurons to identify the Lk-neurons. Based on the morphology they identify candidates and further depict the potential downstream neurons in the connectome, which appears largely in agreement with GRASP experiments. Finally, silencing the Lk-neurons shows an increased PER response in starved flies (when combined with bitter compounds) as well as increased feeding in a FlyPad assay.

      Strengths:

      Overall this is an intriguing manuscript, which provides insight into the organization of 2nd order gustatory neurons. It specifically provides strong evidence for the Lk-neurons as a target of sweet and bitter GRNs and provides evidence for their role in regulating sweet vs bitter-based behavioral responses. Particularly the integration of different techniques and datasets in an elegant fashion is a strong side of the manuscript. Moreover, putting the known Lk-neurons into the context of 2nd order gustatory signalling strengthens the knowledge about this pathway.

      Weaknesses:

      I do not see any major weakness in the current manuscript. Novelty is to some degree lessened by the fact that the RNAseq approach did not identify new neurons but rather put the known Lk-neurons as major findings. Similarly, the final behavioral section is not very deep and to some degree corroborates the previous publication by the Keene and Nässel labs - that said, the model they propose is indeed novel (but lacks depth in analyses; e.g. there is no physiology that would support the modulation of Lk neurons by either type of GRN). The connectomic section appears a bit out of place and after reading it, it's not really clear what one should make of the potential downstream neurons (particularly since the Lk-receptor expression has been previously analyzed); here it might have been interesting to address if/how Lk-neurons may signal directly via a classical neurotransmitter (information that might be found easily in the adult brain single-cell data).

    4. Reviewer #3 (Public review):

      Summary:

      To make feeding decisions, animals need to process three types of information: positive cues like sweetness, negative cues like bitterness, and internal states such as hunger or satiety. This study aims to identify where the information is integrated in the fruit fly brain. The authors applied RNA sequencing on second-order gustatory neurons responsible for sweet and bitter processing, under fed and starved conditions. The sequencing data reveal significant changes in gene expression across sweet vs. bitter pathways and fed vs. starved states. The authors focus on the neuropeptide Leucokinin (Lk), whose expression is dependent on the starvation state. They identify a pair of neurons, named SELK neurons, which express Lk and receive direct input from both sweet and bitter gustatory neurons. These SELK neurons are ideal candidates to integrate gustatory and internal state information. Behavioral experiments show that blocking these neurons in starved flies alters their tolerance to bitter substances during feeding.

      Strengths:

      (1) The study employs a well-designed approach, targeting specific neuronal populations, which is more efficient and precise compared to traditional large-scale genetic screening methods.

      (2) The RNAseq results provide valuable data that can be utilized in future studies to explore other molecules beyond Lk.

      (3) The identification of SELK neurons offers a promising avenue for future research into how these neurons integrate conflicting gustatory signals and internal state information.

      Weaknesses:

      (1) Unfortunately, due to technical challenges, the authors were unable to directly image the functional activity of SELK neurons.

      (2) In the behavioral experiments, tetanus toxin was used to block SELK neurons. Since these neurons may release multiple neurotransmitters or neuropeptides, the results do not specifically demonstrate that Leucokinin (Lk) is the critical factor, as suggested in Figure 8. To address this, I recommend using RNAi to inhibit Lk expression in SELK neurons and comparing the outcomes to wild-type controls via the PER assay.

    1. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    1. eLife Assessment

      This work describes a new software platform for machine-learning-based segmentation of and particle-picking in cryo-electron tomograms. The program and its corresponding online database of trained models will allow experimentalists to conveniently test different models and share their results with others. The paper provides convincing evidence that the software will be valuable to the community.

    2. Reviewer #1 (Public review):

      This paper describes "Ais", a new software tool for machine-learning based segmentation and particle picking of electron tomograms. The software can visualise tomograms as slices and allows manual annotation for the training of a provided set of various types of neural networks. New networks can be added, provided they adhere to a python file with an (undescribed) format. Once networks have been trained on manually annotated tomograms, they can be used to segment new tomograms within the same software. The authors also set up an online repository to which users can upload their models, so they might be re-used by others with similar needs. By logically combining the results from different types of segmentations, they further improve the detection of distinct features. The authors demonstrate the usefulness of their software on various data sets. Thus, the software appears to be a valuable tool for the cryo-ET community that will lower the boundaries of using a variety of machine-learning methods to help interpret tomograms.

    3. Reviewer #2 (Public review):

      Summary:

      Last et al. present Ais, a new deep learning based software package for segmentation of cryo electron tomography data sets. The distinguishing factor of this package is its orientation to the joint use of different models, rather than the implementation of a given approach: Notably, the software is supported by an online repository of segmentation models, open to contributions from the community.

      The usefulness of handling different models in one single environment is showcased with a comparative study on how different models perform on a given data set; then with an explanation on how the results of several models can be manually merged by the interactive tools inside Ais.

      The manuscript presents two applications of Ais on real data sets; one oriented to showcase its particle picking capacities on a study previously completed by the authors; a second one refers to a complex segmentation problem on two different data sets (representing different geometries as bacterial cilia and mitochondria in a mouse neuron), both from public databases.

      The software described in the paper is compactly documented in its website, additionally providing links to some YouTube videos (less than an hour in total) where the authors videocapture and comment major workflows.

      In short, the manuscript describes a valuable resource for the community of tomography practitioners.

      Strengths:

      Public repository of segmentation models; easiness of working with several models and comparing/merging the results.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Last and colleagues describe Ais, an open-source software package for the semi-automated segmentation of cryo-electron tomography (cryo-ET) maps. Specifically, Ais provides a graphical user interface (GUI) for the manual segmentation and annotation of specific features of interest. These manual annotations are then used as input ground-truth data for training a convolutional neural network (CNN) model, which can then be used for automatic segmentation. Ais provides the option of several CNNs so that users can compare their performance on their structures of interest in order to determine the CNN that best suits their needs. Additionally, pretrained models can be uploaded and shared to an online database.

      Algorithms are also provided to characterize "model interactions" which allows users to define heuristic rules on how the different segmentations interact. For instance, a membrane adjacent protein can have rules where it must colocalize a certain distance away from a membrane segmentation. Such rules can help reduce false positives; as in the case above, false negatives predicted away from membranes are eliminated.

      The authors then show how Ais can be used for particle picking and subsequent subtomogram averaging and for segmentation of cellular tomograms for visual analysis. For subtomogram averaging, they used a previously published dataset and compared the averages of their automated picking with the published manual picking. Analysis of cellular tomogram segmentations were primarily visual.

      Strengths:

      CNN-based segmentation of cryo-ET data is a rapidly developing area of research, as it promises substantially faster results than manual segmentation as well as the possibility for higher accuracy. However, this field is still very much in development and the overall performance of these approaches, even across different algorithms, still leaves much to be desired. In this context, I think Ais is an interesting package, as it aims to provide both new and experienced users streamlined approaches for manual annotation, access to a number of CNNs, and methods to refine the outputs of CNN models against each other. I think this can be quite useful for users, particularly as these methods develop.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for helping us improve our article and software. The feedback that we received was very helpful and constructive, and we hope that the changes that we have made are indeed effective at making the software more accessible, the manuscript clearer, and the online documentation more insightful as well. A number of comments related to shared concerns, such as:

      • the need to describe various processing steps more clearly (e.g. particle picking, or the nature of ‘dust’ in segmentations)

      • describing the features of Ais more clearly, and explaining how it can interface with existing tools that are commonly used in cryoET

      • a degree of subjectivity in the discussion of results (e.g. about Pix2pix performing better than other networks in some cases.)

      We have now addressed these important points, with a focus on streamlining not only the workflow within Ais but also making interfacing between Ais and other tools easier. For instance, we explain more clearly which file types Ais uses and we have added the option to export .star files for use in, e.g., Relion, or meshes instead of coordinate lists. We also include information in the manuscript about how the particle picking process is implemented, and how false positives (‘dust’) can be avoided. Finally, all reviewers commented on our notion that Pix2pix can work ‘better’ despite reaching a higher loss after training. As suggested, we included a brief discussion about this idea in the supplementary information (Fig. S6) and used it to illustrate how Ais enables iteratively improving segmentation results. 

      Since receiving the reviews we have also made a number of other changes to the software that are not discussed below but that we nonetheless hope have made the software more reliable and easier to use. These include expanding the available settings, slight changes to the image processing that can help speed it up or avoid artefacts in some cases, improving the GUI-free usability of Ais, and incorporating various tools that should help make it easier to use Ais with remote data (e.g. doing annotation on an office PC, but model training on a more powerful remote PC). We have also been in contact with a number of users of the software, who reported issues or suggested various other miscellaneous improvements, and many of whom had found the software via the reviewed preprint.

      Reviewer 1 (Public Review):

      This paper describes "Ais", a new software tool for machine-learning-based segmentation and particle picking of electron tomograms. The software can visualise tomograms as slices and allows manual annotation for the training of a provided set of various types of neural networks. New networks can be added, provided they adhere to a Python file with an (undescribed) format. Once networks have been trained on manually annotated tomograms, they can be used to segment new tomograms within the same software. The authors also set up an online repository to which users can upload their models, so they might be re-used by others with similar needs. By logically combining the results from different types of segmentations, they further improve the detection of distinct features. The authors demonstrate the usefulness of their software on various data sets. Thus, the software appears to be a valuable tool for the cryo-ET community that will lower the boundaries of using a variety of machine-learning methods to help interpret tomograms. 

      We thank the reviewer for their kind feedback and for taking the time to review our article. On the basis of their comments, we have made a number of changes to the software, article, and documentation, that we think have helped improve the project and render it more accessible (especially for interfacing with different tools, e.g. the suggestions to describe the file formats in more detail). We respond to all individual comments one-by-one below.

      Recommendations:

      I would consider raising the level of evidence that this program is useful to *convincing* if the authors would adequately address the suggestions for improvement below.

      (1) It would be helpful to describe the format of the Python files that are used to import networks, possibly in a supplement to the paper. 

      We have now included this information in both the online documentation and as a supplementary note (Supplementary Note 1). 

      (2) Likewise, it would be helpful to describe the format in which particle coordinates are produced. How can they be used in subsequent sub-tomogram averaging pipelines? Are segmentations saved as MRC volumes? Or could they be saved as triangulations as well? More implementation details like this would be good to have in the paper, so readers don't have to go into the code to investigate. 

      Coordinates: previously, we only exported arrays of coordinates as tab-separated .txt files, compatible with e.g. EMAN2. We now added a selection menu where users can specify whether to export either .star files or tsv .txt files, which together we think should cover most software suites for subtomogram averaging. 

      Triangulations: We have now improved the functionality for exporting triangulations. In the particle picking menu, there is now the option to output either coordinates or meshes (as .obj files). This was previously possible in the Rendering tab, but with the inclusion in the picking menu exporting triangulations can now be done for all tomograms at once rather than manually one by one.

      Edits in the text: the output formats were previously not clear in the text. We have now included this information in the introduction:

      “[…] To ensure compatibility with other popular cryoET data processing suites, Ais employs file formats that are common in the field, using .mrc files for volumes, tab-separated .txt or .star files for particle datasets, and the .obj file format for exporting 3D meshes.”
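
      To make the two coordinate formats concrete, here is a minimal sketch of how a list of picked coordinates could be written as tab-separated text or as a STAR loop block. It is an illustration, not the Ais exporter: the function names, the `data_particles` block name, and the choice of the RELION column labels `_rlnCoordinateX/Y/Z` are assumptions, and the columns Ais actually writes may differ.

```python
def coordinates_to_tsv(coords):
    """Format (x, y, z) tuples as tab-separated text, one particle per line."""
    return "".join(f"{x:.2f}\t{y:.2f}\t{z:.2f}\n" for x, y, z in coords)


def coordinates_to_star(coords):
    """Format (x, y, z) tuples as a minimal STAR loop block (assumed column set)."""
    header = [
        "data_particles",      # hypothetical data block name
        "",
        "loop_",
        "_rlnCoordinateX #1",
        "_rlnCoordinateY #2",
        "_rlnCoordinateZ #3",
    ]
    rows = [f"{x:.2f} {y:.2f} {z:.2f}" for x, y, z in coords]
    return "\n".join(header + rows) + "\n"


if __name__ == "__main__":
    picks = [(10.0, 20.5, 3.0), (42.0, 7.25, 11.0)]
    print(coordinates_to_tsv(picks))
    print(coordinates_to_star(picks))
```

      Both functions return strings rather than writing files, so the same coordinates can be checked in memory before being saved under a .txt or .star name.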

      (3) In Table 2, pix2pix has much higher losses than alternatives, yet the text states it achieves fewer false negatives and fewer false positives. An explanation is needed as to why that is. Also, it is mentioned that a higher number of epochs may have improved the results. Then why wasn't this attempted? 

      The architecture of Pix2pix is quite different from that of the other networks included in the test. Whereas all others are trained to minimize a binary cross entropy (BCE) loss, Pix2pix uses a composite loss function that is a weighted combination of the generator loss and a discriminator penalty, neither of which employs BCE. However, to be able to compare loss values, we do compute a BCE loss value for the Pix2pix generator after every training epoch. This is the value reported in the manuscript and in the software. Although Pix2pix’s BCE loss does indeed diminish during training, the model is not actually optimized to minimize this particular value and a comparison by BCE loss is therefore not entirely fair to Pix2pix. This is pointed out (in brief) in the legend to the table: 

      “Unlike the other architectures, Pix2pix is not trained to minimize the bce loss but uses a different loss function instead. The bce loss values shown here were computed after training and may not be entirely comparable.”
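
      For reference, the BCE value used for such a comparison can be computed from any model's output, whatever loss the model was trained with. The sketch below is a generic NumPy version, not the code used in Ais; the function name and the clipping constant `eps` are assumptions.

```python
import numpy as np


def bce_loss(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy between ground-truth labels and predictions in [0, 1]."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) never occurs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    per_pixel = y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    return float(-np.mean(per_pixel))


if __name__ == "__main__":
    annotation = np.array([[1, 0], [0, 1]])
    prediction = np.array([[0.8, 0.1], [0.3, 0.6]])
    print(bce_loss(annotation, prediction))
```

      Because the function only needs predictions and annotations, it can score a generator trained with an adversarial objective on the same scale as networks trained directly on BCE, which is why the resulting numbers are comparable in form but not necessarily in meaning.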

      Regarding the extra number of epochs for Pix2pix: here, we initially ran into the problem that the number of samples in the training data was low for the number of parameters in Pix2pix, leading to divergence later during training. This problem did not occur for most other models, so we decided to keep the data for the discussion around Table 1 and Figure 2 limited to that initial training dataset. After that, we increased the sample size (from 58 to 170 positive samples) and trained the model for longer. The resulting model was used in the subsequent analyses. This was previously implicit in the text but is now mentioned explicitly and in a new supplementary figure. 

      “For the antibody platform, the model that would be expected to be one of the worst based on the loss values, Pix2pix, actually generates segmentations that seem well-suited for the downstream processing tasks. It also outputs fewer false positive segmentations for sections of membranes than many other models, including the lowest-loss model UNet. Moreover, since Pix2pix is a relatively large network, it might also be improved further by increasing the number of training epochs. We thus decided to use Pix2pix for the segmentation of antibody platforms, and increased the size of the antibody platform training dataset (from 58 to 170 positive samples) to train a much improved second iteration of the network for use in the following analyses (Fig. S6).”

      (4) It is not so clear what absorb and emit mean in the text about model interactions. A few explanatory sentences would be useful here. 

      We have expanded this paragraph to include some more detail.

      “Besides these specific interactions between two models, the software also enables pitching multiple models against one another in what we call ‘model competition’. Models can be set to ‘emit’ and/or ‘absorb’ competition from other models. Here, to emit competition means that a model’s prediction value is included in a list of competing models. To absorb competition means that a model’s prediction value will be compared to all values in that list, and that this model’s prediction value for any pixel will be set to zero if any of the competing models’ prediction value is higher. On a pixel-by-pixel basis, all models that absorb competition are thus suppressed whenever their prediction value for a pixel is lower than that of any of the emitting models.”
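
      The emit/absorb rule quoted above can be written in a few lines. The following is a minimal sketch under stated assumptions, not the Ais source: the function name, the dictionary-based interface, and the choice to exclude a model from its own list of rivals are all illustrative.

```python
import numpy as np


def apply_competition(predictions, emitters, absorbers):
    """Suppress absorbing models wherever an emitting model predicts a higher value.

    predictions: dict mapping model name -> array of per-pixel prediction values.
    emitters:    names of models whose values are added to the competition list.
    absorbers:   names of models that are compared against that list.
    """
    result = {name: values.copy() for name, values in predictions.items()}
    for name in absorbers:
        rivals = [predictions[other] for other in emitters if other != name]
        if not rivals:
            continue
        strongest_rival = np.max(rivals, axis=0)  # per-pixel maximum over rivals
        # Zero the pixel if any competing model's prediction is higher.
        result[name][strongest_rival > predictions[name]] = 0.0
    return result


if __name__ == "__main__":
    preds = {
        "membrane": np.array([[0.9, 0.2]]),
        "carbon": np.array([[0.5, 0.6]]),
    }
    print(apply_competition(preds, ["membrane", "carbon"], ["membrane", "carbon"]))
```

      All comparisons use the original prediction values rather than the partially suppressed ones, so the outcome does not depend on the order in which models are processed.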

      (5) Under Figure 4, the main text states "the model interactions described above", but because multiple interactions were described it is not clear which ones they were. Better to just specify again. 

      Changed as follows:

      “The antibody platform and antibody-C1 complex models were then applied to the respective datasets, in combination with the membrane and carbon models and the model interactions described above (Fig. 4b): the membrane avoiding carbon, and the antibody platforms colocalizing with the resulting membranes”.

      (6) The next paragraph mentions a "batch particle picking process to determine lists of particle coordinates", but the algorithm for how coordinates are obtained from segmented volumes is not described. 

      We have added a paragraph to the main text to describe the picking process:

      “This picking step comprises a number of processing steps (Fig. S7). First, the segmented (.mrc) volumes are thresholded at a user-specified level. Second, a distance transform of the resulting binary volume is computed, in which every nonzero pixel in the binary volume is assigned a new value, equal to the distance of that pixel to the nearest zero-valued pixel in the mask. Third, a watershed transform is applied to the resulting volume, so that the sets of pixels closest to any local maximum in the distance transformed volume are assigned to one group. Fourth, groups that are smaller than a user-specified minimum volume are discarded. Fifth, groups are assigned a weight value, equal to the sum of the prediction value (i.e. the corresponding pixel value in the input .mrc volume) of the pixels in the group. For every group found within close proximity to another group (using a user-specified value for the minimum particle spacing), the group with the lower weight value is discarded. Finally, the centroid coordinate of the grouped pixels is considered the final particle coordinate, and the list of all coordinates is saved in a tab-separated text file.

      “As an alternative output format, segmentations can also be converted to and saved as triangulated meshes, which can then be used for, e.g., membrane-guided particle picking. After picking particles, the resulting coordinates are immediately available for inspection in the Ais 3D renderer (Fig. S8).”
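
      The picking steps quoted above can be sketched with NumPy and SciPy as follows. This is an approximation under stated assumptions rather than the Ais implementation: the function and parameter names are hypothetical, and the watershed transform is replaced by a simpler stand-in that assigns every foreground voxel to its nearest local maximum of the distance map.

```python
import numpy as np
from scipy import ndimage


def pick_particles(volume, threshold=0.5, min_volume=5, min_spacing=4.0):
    """Return an (N, ndim) array of particle centroids from a segmentation volume."""
    # 1. Threshold the segmented volume into a binary mask.
    mask = volume >= threshold
    if not mask.any():
        return np.empty((0, volume.ndim))

    # 2. Distance of every foreground voxel to the nearest background voxel.
    dist = ndimage.distance_transform_edt(mask)

    # 3. Seeds are local maxima of the distance map; each foreground voxel
    #    joins its nearest seed (stand-in for the watershed transform).
    maxima = mask & (dist == ndimage.maximum_filter(dist, size=3))
    connectivity = ndimage.generate_binary_structure(volume.ndim, volume.ndim)
    seeds, n_seeds = ndimage.label(maxima, structure=connectivity)
    nearest = ndimage.distance_transform_edt(
        ~maxima, return_distances=False, return_indices=True
    )
    groups = np.where(mask, seeds[tuple(nearest)], 0)

    # 4. Discard groups below the minimum volume.
    # 5. Weight of a group = sum of prediction values of its voxels.
    flat = groups.ravel()
    sizes = np.bincount(flat, minlength=n_seeds + 1)
    weights = np.bincount(flat, weights=volume.ravel(), minlength=n_seeds + 1)
    kept = [g for g in range(1, n_seeds + 1) if sizes[g] >= min_volume]
    centroids = {g: np.argwhere(groups == g).mean(axis=0) for g in kept}

    # 6. Visit groups from heaviest to lightest and drop any group whose
    #    centroid lies within min_spacing of an already accepted one.
    accepted = []
    for g in sorted(kept, key=lambda g: weights[g], reverse=True):
        if all(np.linalg.norm(centroids[g] - c) >= min_spacing for c in accepted):
            accepted.append(centroids[g])
    return np.array(accepted).reshape(-1, volume.ndim)


if __name__ == "__main__":
    vol = np.zeros((20, 20, 40))
    vol[8:12, 8:12, 5:9] = 0.9     # first particle
    vol[8:12, 8:12, 25:29] = 0.8   # second particle
    vol[2, 2, 2] = 0.9             # single-voxel 'dust'
    print(pick_particles(vol))
```

      Coordinates are returned in array index order (z, y, x for a 3D volume), so they would need to be reordered before being written to a coordinate file that expects x, y, z.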

      The two supplementary figures are pasted below for convenience. Fig. S7 is new, while Fig. S8 was previously Fig. S10; the reference to this figure was originally missing in the main text, but is now included.

      (7) In the Methods section, it is stated that no validation splits are used "in order to make full use of an input set". This sounds like an odd decision, given the importance of validation sets in the training of many neural networks. Then how is overfitting monitored or prevented? This sounds like a major limitation of the method. 

      In our experience, the best way of preparing a suitable model is to (iteratively) annotate a set of training images and visually inspect the result. Since the manual annotation step is the bottleneck in this process, we decided not to use a validation split in order to make full use of an annotated training dataset (i.e. a validation split of 20% would mean that 20% of the manually annotated training data is not used for training).

      We do recognize the importance of using separate data for validation, or at least offering the possibility of doing so. We have now added a parameter to the settings (and made a Settings menu item available in the top menu bar) where users can specify what fraction (0, 10, 20, or 50%) of training datasets should be set aside for validation. If the chosen value is not 0%, the software reports the validation loss as well as the size of the split during training, rather than (as was done previously) the training loss. We have, however, set the default value for the validation split to 0%, for the same reason as before. We also added a section to the online documentation about using validation splits, and edited the corresponding paragraph in the methods section:

      “The reported loss is that calculated on the training dataset itself, i.e., no validation split was applied. During regular use of the software, users can specify whether to use a validation split or not. By default, a validation split is not applied, in order to make full use of an input set of ground truth annotations. Depending on the chosen split size, the software reports either the overall training loss or the validation loss during training.”
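
      As an illustration of the optional validation split, the sketch below holds out a fraction of annotated samples before training. It is not taken from Ais: the function name, the random shuffling, and the fixed seed are assumptions; only the allowed fractions (0, 10, 20, or 50%) and the default of 0% come from the response above.

```python
import numpy as np

ALLOWED_FRACTIONS = (0.0, 0.1, 0.2, 0.5)


def split_for_validation(images, labels, fraction=0.0, seed=0):
    """Split annotated samples into (training, validation) pairs of arrays."""
    if fraction not in ALLOWED_FRACTIONS:
        raise ValueError(f"fraction must be one of {ALLOWED_FRACTIONS}")
    images, labels = np.asarray(images), np.asarray(labels)
    # Shuffle sample indices so the held-out set is not biased by annotation order.
    order = np.random.default_rng(seed).permutation(len(images))
    n_val = int(round(fraction * len(images)))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])


if __name__ == "__main__":
    x = np.arange(10).reshape(10, 1)
    y = np.arange(10)
    (train_x, train_y), (val_x, val_y) = split_for_validation(x, y, fraction=0.2)
    print(len(train_x), "training samples,", len(val_x), "validation samples")
```

      With the default fraction of 0 the validation arrays are empty and every annotated sample is used for training, which matches the default behaviour described above.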

      (8) Related to this point: how is the training of the models in the software modelled? It might be helpful to add a paragraph to the paper in which this process is described, together with indicators of what to look out for when training a model, e.g. when should one stop training? 

      We have expanded the paragraph where we write about the utility of comparing different network architectures to also include a note on how Ais facilitates monitoring the output of a model during training:

      “When taking the training and processing speeds into account as well as the segmentation results, there is no overall best architecture. We therefore included multiple well-performing model architectures in the final library, in order to allow users to select from these models to find one that works well for their specific datasets. Although it is not necessary to screen different network architectures and users may simply opt to use the default (VGGNet), these results thus show that it can be useful to test different networks in order to identify one that is best. Moreover, these results also highlight the utility of preparing well-performing models by iteratively improving training datasets and re-training models in a streamlined interface. To aid in this process, the software displays the loss value of a network during training and allows for the application of models to datasets during training. Thus, users can inspect how a model’s output changes during training and decide whether to interrupt training and improve the training data or choose a different architecture.”

      (9) Figure 1 legend: define the colours of the different segmentations. 

      Done

      (10) It may be better to colour Figure 2B with the same colours as Figure 2A. 

      We tried this, but the effect is that the underlying density is much harder to see. We think the current grayscale image paired with the various segmentations underneath is better for visually identifying which density corresponds to membranes, carbon film, or antibody platforms.

      Reviewer 2 (Public Review):

      Summary: 

      Last et al. present Ais, a new deep learning-based software package for the segmentation of cryo-electron tomography data sets. The distinguishing factor of this package is its orientation to the joint use of different models, rather than the implementation of a given approach. Notably, the software is supported by an online repository of segmentation models, open to contributions from the community. 

      The usefulness of handling different models in one single environment is showcased with a comparative study on how different models perform on a given data set; then with an explanation of how the results of several models can be manually merged by the interactive tools inside Ais. 

      The manuscript presents two applications of Ais on real data sets; one is oriented to showcase its particle-picking capacities on a study previously completed by the authors; the second one refers to a complex segmentation problem on two different data sets (representing different geometries as bacterial cilia and mitochondria in a mouse neuron), both from public databases. 

      The software described in the paper is compactly documented on its website, additionally providing links to some YouTube videos (less than an hour in total) where the authors videocapture and comment on major workflows. 

      In short, the manuscript describes a valuable resource for the community of tomography practitioners. 

      Strengths: 

      A public repository of segmentation models; easiness of working with several models and comparing/merging the results. 

      Weaknesses: 

      A certain lack of concretion when describing the overall features of the software that differentiate it from others. 

      We thank the reviewer for their kind and constructive feedback. Following the suggestion to use the Pix2pix results to illustrate the utility of Ais for analyzing results, we have added a new supplementary figure (Fig. S6) and brief discussion, showing the use of Ais in iteratively improving segmentation results. We have also expanded the online documentation and included a note in the supplementary information about how models are saved/loaded (Supplementary Note 1). 

      Recommendations:

      I would like to ask the authors about some concerns about the Ais project as a whole: 

      (1) The website that accompanies the paper (aiscryoet.org), albeit functional, seems to be in its first steps. Is it planned to extend it? In particular, one of the major contributions of the paper (the maintenance of an open repository of models) could use better documentation describing the expected formats to submit models. This could even be discussed in the supplementary material of the manuscript, as this feature is possibly the most distinctive one of the paper. Engaging third-party users would require giving them an easier entry point, and the superficial mention of this aspect in the online documentation could be much more generous.

      We have added a new page to the online documentation, titled ‘Sharing models’, where we include an explanation of the structure of model files and demonstrate the upload page. We also added a note to the Supplementary Information that explains the file format for models, and how they are loaded/saved (i.e., that these are standard Keras model objects). 

      To make it easier to interface Ais with other tools, we have now also made some of the core functionality available (e.g. training models, batch segmentation) via the command line interface. Information on how to use this is included in the online documentation. All file formats are common formats used in cryoET, so that using Ais in a workflow with, e.g. AreTomo -> Ais -> Relion should now be more straightforward.

      (2) A different major line advanced by the authors to underpin the novelty of the software, is its claimed flexibility and modularity. In particular, the restrictions of other packages in terms of visualization and user interaction are mentioned. Although in the manuscript it is also mentioned that most of the functionalities in Ais are already available in major established packages, as a reader I am left confused about what exactly makes the offer of Ais different from others in terms of operation and interaction: is it just the two aspects developed in the manuscript (possibility of using different models and tools to operate model interaction)? If so, it should probably be stated; but if the authors want to pinpoint other aspects of the capacity of Ais to drive smoothly the interactions, they should be listed and described, instead of leaving it as an unspecific comment. As a potential user of Ais, I would suggest the authors add (maybe in the supplementary material) a listing of such features. Figure 1 does indeed carry the name "overview of (...) functionalities", but it is not clear to me which functionalities I can expect to be absent or differently solved on the other tools they mention.

      We have rewritten the part of the introduction where we previously listed the features as below. We think it should now be clearer for the reader to know what features to expect, as well as how Ais can interface with other software (i.e. what the inputs and outputs are). We have also edited the caption for Figure 1 to make it explicit that panels A to C represent the annotation, model preparation, and rendering steps of the Ais workflow and that the images are screenshots from the software.

      “In this report we present Ais, an open-source tool that is designed to enable any cryoET user – whether experienced with software and segmentation or a novice – to quickly and accurately segment their cryoET data in a streamlined and largely automated fashion. Ais comprises a comprehensive and accessible user interface within which all steps of segmentation can be performed, including: the annotation of tomograms and compiling datasets for the training of convolutional neural networks (CNNs), training and monitoring performance of CNNs for automated segmentation, 3D visualization of segmentations, and exporting particle coordinates or meshes for use in downstream processes. To help generate accurate segmentations, the software contains a library of various neural network architectures and implements a system of configurable interactions between different models. Overall, the software thus aims to enable a streamlined workflow where users can interactively test, improve, and employ CNNs for automated segmentation. To ensure compatibility with other popular cryoET data processing suites, Ais employs file formats that are common in the field, using .mrc files for volumes, tab-separated .txt or .star files for particle datasets, and the .obj file format for exporting 3D meshes.”

      “Figure 1 – an overview of the user interface and functionalities. The various panels represent sequential stages in the Ais processing workflow, including annotation (a), testing CNNs (b), visualizing segmentation (c). These images (a-c) are unedited screenshots of the software. a) […]”

      (3) Table 1 could have the names of the three last columns. The table has enough empty space in the other columns to accommodate this. 

      Done.

      (4) The comment about Pix2pix needing a larger number of training epochs (being a larger model than the other ones considered) is interesting. It also lends itself for the authors to illustrate the ability of their software to precisely do this: allow the users to flexibly analyze results and test hypotheses.

      Please see the response to Reviewer 1 comment #3. We agree that this is a useful example of the ability to iterate between annotation and training, and have added an explicit mention of this in the text:

      “Moreover, since Pix2pix is a relatively large network, it might also be improved further by increasing the number of training epochs. In a second iteration of annotation and training, we thus increased the size of the antibody platform training dataset (from 58 to 170 positive samples) and generated an improved Pix2pix model for use in the following analyses.”

      Reviewer 3 (Public Review):

      We appreciate the reviewer’s extensive and very helpful feedback and are glad to read that they consider Ais potentially quite useful for the users. To address the reviewer’s comments, we have made various edits to the text, figures, and documentation, that we think have helped improve the clarity of our work. We list all edits below. 

      Summary

      In this manuscript, Last and colleagues describe Ais, an open-source software package for the semi-automated segmentation of cryo-electron tomography (cryo-ET) maps. Specifically, Ais provides a graphical user interface (GUI) for the manual segmentation and annotation of specific features of interest. These manual annotations are then used as input ground-truth data for training a convolutional neural network (CNN) model, which can then be used for automatic segmentation. Ais provides the option of several CNNs so that users can compare their performance on their structures of interest in order to determine the CNN that best suits their needs. Additionally, pre-trained models can be uploaded and shared to an online database. 

      Algorithms are also provided to characterize "model interactions" which allows users to define heuristic rules on how the different segmentations interact. For instance, a membrane-adjacent protein can have rules where it must colocalize a certain distance away from a membrane segmentation. Such rules can help reduce false positives; as in the case above, false negatives predicted away from membranes are eliminated. 

      The authors then show how Ais can be used for particle picking and subsequent subtomogram averaging and for the segmentation of cellular tomograms for visual analysis. For subtomogram averaging, they used a previously published dataset and compared the averages of their automated picking with the published manual picking. Analysis of cellular tomogram segmentation was primarily visual. 

      Strengths:

      CNN-based segmentation of cryo-ET data is a rapidly developing area of research, as it promises substantially faster results than manual segmentation as well as the possibility for higher accuracy. However, this field is still very much in the development and the overall performance of these approaches, even across different algorithms, still leaves much to be desired. In this context, I think Ais is an interesting package, as it aims to provide both new and experienced users with streamlined approaches for manual annotation, access to a number of CNNs, and methods to refine the outputs of CNN models against each other. I think this can be quite useful for users, particularly as these methods develop. 

      Weaknesses: 

      Whilst overall I am enthusiastic about this manuscript, I still have a number of comments: 

      (1) On page 5, paragraph 1, there is a discussion on human judgement of these results. I think a more detailed discussion is required here, as from looking at the figures, I don't know that I agree with the authors' statement that Pix2pix is better. I acknowledge that this is extremely subjective, which is the problem. I think that a manual segmentation should also be shown in a figure so that the reader has a better way to gauge the performance of the automated segmentation.

      Please see the answer to Reviewer 1’s comment #3.

      (2) On page 7, the authors mention terms such as "emit" and "absorb" but never properly define them, such that I feel like I'm guessing at their meaning. Precise definitions of these terms should be provided. 

      We have expanded this paragraph to include some more detail:

      “Besides these specific interactions between two models, the software also enables pitching multiple models against one another in what we call ‘model competition’. Models can be set to ‘emit’ and/or ‘absorb’ competition from other models. Here, to emit competition means that a model’s prediction value is included in a list of competing models. To absorb competition means that a model’s prediction value will be compared to all values in that list, and that this model’s prediction value for any pixel will be set to zero if any of the competing models’ prediction value is higher. On a pixel-by-pixel basis, all models that absorb competition are thus suppressed whenever their prediction value for a pixel is lower than that of any of the emitting models.” 
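
      The competition rule described above can be illustrated with a minimal sketch. This is not the Ais implementation; the function name `apply_competition`, the dictionary-based inputs, and the choice to exclude a model from its own list of competitors are assumptions made for illustration only.

```python
import numpy as np


def apply_competition(predictions, emits, absorbs):
    """Hypothetical sketch of pixel-wise model competition.

    predictions: dict mapping a model name to a 2D array of prediction values.
    emits:       set of model names whose values enter the competition list.
    absorbs:     set of model names that are suppressed by stronger competitors.
    """
    result = {}
    for name, pred in predictions.items():
        # Competitors are all emitting models other than the model itself
        # (an assumption; comparing a model to itself would change nothing).
        rivals = [predictions[n] for n in emits if n != name]
        if name not in absorbs or not rivals:
            result[name] = pred.copy()
            continue
        # Highest competing value at every pixel.
        strongest = np.max(np.stack(rivals), axis=0)
        # Set the value to zero wherever any competitor is strictly higher.
        result[name] = np.where(strongest > pred, 0.0, pred)
    return result
```

      In this sketch, a model that emits but does not absorb competition keeps its own output unchanged while still suppressing the models that absorb.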

      (3) For Figure 3, it's unclear if the parent models shown (particularly the carbon model) are binary or not.

      The figure looks to be grey values, which would imply that it's the visualization of some prediction score. If so, how is this thresholded? This can also be made clearer in the text. 

      The figures show the grayscale output of the parent model, but this grayscale output is thresholded to produce a binary mask that is used in an interaction. We have edited the text to include a mention of thresholding at a user-specified threshold value:

      “These interactions are implemented as follows: first, a binary mask is generated by thresholding the parent model’s predictions using a user-specified threshold value. Next, the mask is then dilated using a circular kernel with a radius 𝑅, a parameter that we call the interaction radius. Finally, the child model’s prediction values are multiplied with this mask.”

      To avoid confusion, we have also edited the figure to show the binary masks rather than the grayscale segmentations. 
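
      The three steps quoted above (thresholding, dilation, multiplication) can be sketched as follows. This is an illustrative NumPy-only version, not the code used in Ais; the function names, the use of `>=` for thresholding, and the zero padding at the image border are assumptions.

```python
import numpy as np


def dilate_with_disk(mask, radius):
    """Dilate a 2D boolean mask with a circular kernel of the given radius."""
    height, width = mask.shape
    r = int(radius)
    # Pad with False so that shifted views never wrap around the border.
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    dilated = np.zeros_like(mask, dtype=bool)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy * dy + dx * dx <= radius * radius:
                # OR together every shift that lies inside the disk.
                dilated |= padded[r + dy : r + dy + height, r + dx : r + dx + width]
    return dilated


def apply_interaction(child, parent, threshold, radius):
    """Hypothetical sketch: keep child predictions only near the parent mask."""
    mask = parent >= threshold              # step 1: binary mask from the parent model
    mask = dilate_with_disk(mask, radius)   # step 2: dilate by the interaction radius R
    return child * mask                     # step 3: multiply child predictions by the mask
```

      Multiplying by the mask keeps child predictions within the interaction radius of the parent segmentation; multiplying by its inverse would instead suppress them there.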

      (4) Figure 3D was produced in ChimeraX using the hide dust function. I think some discussion on the nature of this "dust" is in order, e.g. how much is there and how large does it need to be to be considered dust? Given that these segmentations can be used for particle picking, this seems like it may be a major contributor to false positives. 

      ‘Dust’ in segmentations is essentially unavoidable; it would require a perfect model that does not produce any false positives. However, when models are sufficiently accurate, the volume of false positives is typically smaller than that of the structures that were intended to be segmented. In these cases, discarding particles based on size is a practical way of filtering the segmentation results. Since it is difficult to generalize when to consider something ‘dust’, we decided to include this additional text in the Methods section rather than in the main text:

      “… with the use of the ‘hide dust’ function (the same settings were used for each panel, different settings used for each feature).

      This ‘dust’ corresponds to small (in comparison to the segmented structures of interest) volumes of false positive segmentations, which are present in the data due to imperfections in the used models. The rate and volume of false positives can be reduced either by improving the models (typically by including more examples of the images of what would be false negatives or positives in the training data) or, if the dust particles are indeed smaller than the structures of interest, they can simply be discarded by filtering particles based on their volume, as applied here. In particle picking a ‘minimum particle volume’ is specified – particles with a smaller volume are considered ‘dust’.”

      This text is complemented by the newly included description of the method of converting volumes into lists of coordinates (see Reviewer 1’s comment #6):

      “Third, a watershed transform is applied to the resulting volume, so that the sets of pixels closest to any local maximum in the distance transformed volume are assigned to one group. Fourth, groups that are smaller than a user-specified minimum volume are discarded…”

      We think it should now be clearer that (some form of) discarding ‘dust’ is a step that is typically included in the particle picking process.
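
      As a simplified illustration of the size-based filtering, the sketch below groups voxels by plain face-connectivity and discards groups below a minimum volume. Note that this swaps in connected-component grouping for the watershed transform described in the quoted text, and that the function name `remove_dust` and its arguments are hypothetical.

```python
from collections import deque

import numpy as np


def remove_dust(binary, min_volume):
    """Discard connected groups of voxels smaller than min_volume (in voxels)."""
    binary = np.asarray(binary, dtype=bool)
    visited = np.zeros_like(binary)
    cleaned = np.zeros_like(binary)
    # Face-connected neighbour steps: +/-1 along each axis.
    steps = []
    for axis in range(binary.ndim):
        for sign in (-1, 1):
            step = [0] * binary.ndim
            step[axis] = sign
            steps.append(tuple(step))
    for start in zip(*np.nonzero(binary)):
        if visited[start]:
            continue
        # Breadth-first flood fill to collect one connected group.
        visited[start] = True
        group = [start]
        queue = deque([start])
        while queue:
            voxel = queue.popleft()
            for step in steps:
                nb = tuple(v + s for v, s in zip(voxel, step))
                if any(c < 0 or c >= n for c, n in zip(nb, binary.shape)):
                    continue
                if binary[nb] and not visited[nb]:
                    visited[nb] = True
                    group.append(nb)
                    queue.append(nb)
        # Keep the group only if it reaches the minimum volume.
        if len(group) >= min_volume:
            for voxel in group:
                cleaned[voxel] = True
    return cleaned
```

      The minimum volume plays the same role as the ‘minimum particle volume’ mentioned above: raising it removes more false positives, at the risk of also discarding small true particles.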

      (5) Page 9 contains the following sentence: "After selecting these values, we then launched a batch particle picking process to determine lists of particle coordinates based on the segmented volumes." Given how important this is, I feel like this requires significant description, e.g. how are densities thresholded, how are centers determined, and what if there are overlapping segmentations? 

      Please see the response to Reviewer 1’s comment #6.

      (6) The FSC shown in Figure S6 for the auto-picked maps is concerning. First, a horizontal line at FSC = 0 should be added. It seems that starting at a frequency of ~0.045, the FSC of the autopicked map increases above zero and stays there. Since this is not present in the FSC of the manually picked averages, this suggests the automatic approach is also finding some sort of consistent features. This needs to be discussed. 

      Thank you for pointing this out. Awkwardly, this was due to a mistake made while formatting the figure. In the two separate original plots, the Y axes had slightly different ranges, but this was missed when they were combined to prepare the joint supplementary figure. As a result, the FSC values for the autopicked half maps are displayed incorrectly. The original separate plots are shown below to illustrate the discrepancy:

      Author response image 1.

      The corrected figure is Figure S9 in the manuscript. The values of 44 Å and 46 Å were not determined from the graph and remain unchanged.

      (7) Page 11 contains the statement "the segmented volumes found no immediately apparent false positive predictions of these pores". This is quite subjective and I don't know that I agree with this assessment. Unless the authors decide to quantify this through subtomogram classification, I don't think this statement is appropriate. 

      We originally included this statement and the supplementary figure because we wanted to show another example of automated picking, this time in the more crowded environment of the cell. We do agree that it requires better substantiation, but also think that the demonstration of automated picking of the antibody platforms and IgG3-C1 complexes for subtomogram averaging suffices to demonstrate Ais’ picking capabilities. Since the supplementary information includes an example of picked coordinates rendered in the Ais 3D viewer (Figure S7) that also used the pore dataset, we still include the supplementary figure (S10) but have edited the statement to read:

      “Moreover, we could identify the molecular pores within the DMV, and pick sets of particles that might be suitable for use in subtomogram averaging (see Fig. S11).”

      We have also expanded the text that accompanies the supplementary figure to emphasize that results from automated picking are likely to require further curation, e.g. by classification in subtomogram averaging, and that the selection of particles is highly dependent on the thresholds used in the conversion from volumes to lists of coordinates.

      (8) In the methods, the authors note that particle picking is explained in detail in the online documentation. Given that this is a key feature of this software, such an explanation should be in the manuscript. 

      Please see the response to Reviewer 1’s comment #6. 

      Recommendations:

      (9) The word "model" seems to be used quite ambiguously. Sometimes it seems to refer to the manual segmentations, the CNN architectures, the trained models, or the output predictions. More precision in this language would greatly improve the readability of the manuscript.

      This was indeed quite ambiguous, especially in the introduction. We have edited the text to be clearer on these differences. The word ‘model’ is now only used to refer to trained CNNs that segment a particular feature (as in ‘membrane model’ or ‘model interactions’). Where we used terms such as ‘3D models’ to describe scenes rendered in 3D, we now use ‘3D visualizations’ or similar terms. Where we previously used the term ‘models’ to refer to CNN architectures, we now use terms such as ‘neural network architectures’ or ‘architecture’. Some examples:

      … with which one can automatically segment the same or any other dataset …

      Moreover, since Pix2pix is a relatively large network, …       

      … to generate a 3D visualization of ten distinct cellular …

      … with the use of the same training datasets for all network architectures …

      In Figure 1, the text in panels D and E is illegible. 

      We have edited the figure to show the text more clearly (the previous images were unedited screenshots of the website).

      (10) Prior to the section on model interactions, I was under the impression that all annotations were performed simultaneously. I think it could be clarified that models are generated per annotation type. 

      Multiple different features can be annotated (i.e. drawn by hand by the user) at the same time, but each trained CNN only segments one feature. CNNs that output segmentations for multiple features can be implemented straightforwardly, but this introduces the need to provide training data where for every grayscale image, every feature is annotated. This can make preparing the training data much more cumbersome. Reusability of the models is also hampered. We now mention the separateness of the networks explicitly in the introduction:

      “Multiple features, such as membranes, microtubules, ribosomes, and phosphate crystals, can be segmented and edited at the same time across multiple datasets (even hundreds). These annotations are then extracted and used as ground truth labels upon which to condition multiple separate neural networks, …”

      (11) On page 6, there is the text "some features are assigned a high segmentation value by multiple of the networks, leading to ambiguity in the results". Do they mean some false features? 

      To avoid ambiguity of the word ‘features’, we have edited the sentence to read:

      “… some parts of the image are assigned a high segmentation value by multiple of the networks, leading to false classifications and ambiguity in the results.”

      (12) Figures 2 and 3 would be easier to follow if they had consistent coloring. 

      We have changed the colouring in Figure 2 to match that of Figure 3 better.

      (13) For Figure 3D, I'm confused as to why the authors showed results from the tomogram in Figure 2B. It seems like the tomogram in Figure 3C would be a more obvious choice, as we would be able to see how the 2D slices look in 3D. This would also make it easier to see the effect of interactions on false negatives. Also, since the orientation of the tomogram in 2B is quite different than that shown in 3D, it's a bit difficult to relate the two.

      We chose to show this dataset because it exemplifies the effects of both model competition and model interactions better than the tomogram in Figure 3C. See Figure 3D and Author response image 2 for a comparison:

      Author response image 2.

      (14) I'm confused as to why the tomographic data shown in Figures 4D, E, and F are black on white while all other cryo-ET data is shown as white on black. 

      The images in Figure 4DEF are now inverted.

      (15) For Figure 5, there needs to be better visual cueing to emphasize which tomographic slices are related to the segmentations in Panels A and B. 

      We have edited the figure to show more clearly which grayscale image corresponds to which segmentation.

      (16) I don't understand what I should be taking away from Figures S1 and S2. There are a lot of boxes around membrane areas and I don't know what these boxes mean. 

      We have added a more descriptive text to these figures. The boxes are placed by the user to select areas of the image that will be sampled when saving training datasets.

    1. eLife Assessment

      The authors report that a secreted ubiquitin ligase of Shigella, called IpaH1.4, mediates the degradation of a host defense factor, RNF213. The data are solid and represent an important contribution to our understanding of cell-autonomous immunity and bacterial pathogenesis, as they provide new mechanistic insight into how the cytosolic bacterial pathogen Shigella flexneri evades IFN-induced host immunity.

    2. Reviewer #1 (Public review):

      Shigella flexneri is a bacterial pathogen that is a globally significant cause of diarrhea. Shigella pathogenesis remains poorly understood. In their manuscript, Saavedra-Sanchez et al. report their discovery that a secreted E3 ligase effector of Shigella, called IpaH1.4, mediates the degradation of a host E3 ligase called RNF213. RNF213 was previously described to mediate ubiquitylation of intracellular bacteria, an initial step in their targeting to xenophagosomes. Thus, Shigella IpaH1.4 appears to be an important factor in permitting evasion of RNF213-mediated host defense.

      Strengths:

      The work is focused, convincing, well-performed, and important. The manuscript is well-written.

    3. Reviewer #2 (Public review):

      Summary:

      The authors find that the bacterial pathogen Shigella flexneri uses the T3SS effector IpaH1.4 to induce degradation of the IFNg-induced protein RNF213. They show that in the absence of IpaH1.4, cytosolic Shigella is bound by RNF213. Furthermore, RNF213 conjugates linear and lysine-linked ubiquitin to Shigella independently of LUBAC. Intriguingly, they find that Shigella lacking ipaH1.4 or mxiE, which regulates the expression of some T3SS effectors, are not killed even when ubiquitylated by RNF213 and that these mutants are still able to replicate within the cytosol, suggesting that Shigella encodes additional effectors to escape from host defenses mediated by RNF213-driven ubiquitylation.

      Strengths:

      The authors take a variety of approaches, including host and bacterial genetics, gain-of-function and loss-of-function assays, cell biology, and biochemistry. Overall, the experiments are elegantly designed, rigorous, and convincing.

      Weaknesses:

      The authors find that ipaH1.4 mutant S. flexneri no longer degrades RNF213 and recruits RNF213 to the bacterial surface. The authors should perform genetic complementation of this mutant with WT ipaH1.4 and the catalytically inactive ipaH1.4 to confirm that ipaH1.4 catalytic activity is indeed responsible for the observed phenotype.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to investigate whether and how Shigella avoids cell-autonomous immunity initiated through M1-linked ubiquitin and the immune sensor and E3 ligase RNF213. The key findings are that the Shigella flexneri T3SS effector, IpaH1.4 induces degradation of RNF213. Without IpaH1.4, the bacteria are marked with RNF213 and ubiquitin following stimulation with IFNg. Interestingly, this is not sufficient to initiate the destruction of the bacteria, leading the authors to conclude that Shigella deploys additional virulence factors to avoid this host immune response. The second key finding of this paper is the suggestion that M1 chains decorate the mxiE/ipaH Shigella mutant independent of LUBAC, which is, by and large, considered the only enzyme capable of generating M1-linked ubiquitin chains.

      Strengths:

      The data is for the most part well controlled and clearly presented with appropriate methodology. The authors convincingly demonstrate that IpaH1.4 is the effector responsible for the degradation of RNF213 via the proteasome, although the site of modification is not identified.

      Weaknesses:

      The work builds on prior work from the same laboratory that suggests that M1 ubiquitin chains can be formed independently of LUBAC (in the prior publication this related to Chlamydia inclusions). In this study, two pieces of evidence support this statement: fluorescence microscopy-based images and accompanying quantification in Hoip and Hoil knockout cells for association of M1-ub, using an antibody, to Shigella mutants, and the use of an internally tagged Ub-K7R mutant, which is unable to be incorporated into ubiquitin chains via its lysine residues. Given that clones of the M1-specific antibody are not always specific for M1 chains, and because it remains formally possible that the Int-K7R Ub can be added to the end of the chain as a chain terminator or as mono-ub, the authors should strengthen these findings relating to the claim that another E3 ligase can generate M1 chains de novo.

      The main weakness relating to the infection work is that no bacterial protein loading control is assayed in the western blots of infected cells, leaving the reader unable to determine if changes in RNF213 protein levels are the result of the absent bacterial protein (e.g. IpaH1.4) or altered infection levels.

      The importance of IFNgamma priming for RNF213 association to the mxiE or ipaH1.4 strain could have been investigated further as it is unclear if RNF213 coating is enhanced due to increased protein expression of RNF213 or another factor. This is of interest as IFNgamma priming does not seem to be needed for RNF213 to detect and coat cytosolic Salmonella.

      Overall, the findings are important for the host-pathogen field, cell-autonomous/innate immune signaling fields, and microbial pathogenesis fields. If further evidence for LUBAC independent M1 ubiquitylation is achieved this would represent a significant finding.

    1. eLife assessment

      This fundamental work describes an understudied bird migration pattern using data from an Arctic raptor. With an extensive dataset and comprehensive analyses, the observed pattern is convincing. This study will be of interest to researchers exploring the ecological drivers of bird migration.

    2. Reviewer #4 (Public review):

      Summary:

      This study describes an understudied migration pattern of dynamic non-breeding range using data from an Arctic raptor. Using data from GPS tags, the study describes the known pattern of fast migration during autumn and spring, and a previously undescribed pattern of migration at a much slower pace throughout the over-wintering season.

      Strengths:

      The study presents a comprehensive analysis of the annual cycle of an interesting and undescribed migration system. The conceptual advancement is original and the data is rich and persuasive. The Discussion part of the manuscript is well written.

      Weaknesses:

      Other sections of the manuscript need some more polish in terms of the terminology, the language, and the logic of the presentation of the subject. The title is not good. Throughout most of the text, the authors do not use consistent terminology for migration, over-wintering, and non-breeding range, and this is very confusing. So, consistency of the text is warranted. A bigger issue is the selection of latitudes (or the actual reason for movement) during the over-wintering period. The study claims that this relates to snow cover but fails to properly demonstrate it. It is likely that the birds move because of changes in snow cover rather than because of the level of snow cover. This is a testable prediction. A possible explanation is that there is a cost for moving further south and thus the birds are reluctant to move unless they are forced to do so by high snow cover. Another, similar and testable prediction is that the birds aim at selecting latitudes where snow cover is partial and, as the winter progresses, move slowly to areas that are only partially covered by snow. A modified, non-linear, snow cover analysis using GAMM could uncover such patterns.

    3. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #4

      We sincerely appreciate the time and effort you have taken to review our manuscript. We followed your recommendations to polish the text and make it easier to understand.

      Regarding terms and terminology, we changed “non-breeding” everywhere in the text to “over-wintering.”

      Regarding the title, since the previous one was recommended by reviewer #1, we sought a compromise: we made the changes you suggested but retained part of reviewer #1’s suggestion. The title is now “Foxtrot migration and dynamic over-wintering range of an arctic raptor”.

      Thank you for highlighting the importance of snow cover and changes in snow cover as a possible factor of over-wintering movements. We appreciate your feedback and have explored several approaches to address this issue. Specifically, we examined how both snow cover extent and changes in snow cover influenced movement distance. However, we found no effect of either factor on movement distance.

      Our data show that birds leave their sites in October and move southwest, even though snow cover is minimal at that time. They also leave their sites in November and in subsequent months, regardless of the snow cover levels. Thus, we observed no pattern of birds leaving sites when snow cover reaches a specific threshold (e.g., 75-80%). Similarly, we found no evidence of birds staying in areas with a certain snow cover extent (e.g., 30%), nor did they leave sites when snow cover increased by a specific amount (e.g., by 10 or 20%).

      It is possible that more experienced birds anticipate that October plots will become inaccessible later in the winter and, therefore, leave early without waiting for significant snow accumulation. Alternatively, other factors, such as brief heavy snowfalls, may trigger movement, even if these do not lead to sustained increases in snow cover. Multiple factors, possibly acting asynchronously, could also play a role. This complexity adds an interesting dimension to the study of ecological patterns. However, in this study, we chose to focus on describing the migration pattern itself and its impact on aspects like over-winter range determination and population dynamics. While we have prioritized this approach, we remain committed to further analyzing the data to uncover additional details about this behavior.

      In response to your suggestion, we have expanded the Methods section (Lines 241-246) and the Results section (Lines 348-349) to clarify that we tested the effects of snow cover and changes in snow cover on distance. We have also included the relevant plots in the Supplementary Materials. In the Discussion, we noted that this approach did not reveal any significant dependence and acknowledged that this issue requires further investigation (Lines 422-459).

      ---------

      The following is the authors’ response to the previous reviews.

      Reviewer #2:

      We sincerely appreciate the time and effort you have taken to review our manuscript. 

      First of all, we apologize for publishing the preprint without incorporating certain adjustments outlined in our earlier response, particularly in the Methods section. This was due to an oversight regarding the different versions of the manuscript. We have corrected this mistake. Our response to the feedback on this section (Methods), with line numbers of the changes made, is immediately below this response. In addition, we have included the units of measurement (mean and standard deviation) in both the results and figure captions for clarity.

      To focus on the main point regarding wintering strategies, we acknowledge that in the previous versions, this aspect was inadequately addressed and caused some confusion. In the revised edition, both the Introduction and the Discussion have been thoroughly reworked.

      As you suggested, we have removed the long introductory paragraph and all references to foxtrot migrations from the Introduction. As a result, the Introduction is now short and to the point. In the second paragraph, we explain why we propose the wintering strategies outlined (L74-81).

      In the Discussion, we've added a substantial new section at the beginning that discusses different wintering strategies. We have also updated Figure 4 accordingly. Previously, we erroneously suggested that Montagu's harrier and other African-Palaearctic migrants might adopt wintering strategies similar to those we describe. Upon further investigation, however, we found that almost all African-Palaearctic migrants exhibit an itinerant wintering strategy. Conversely, the strategy we describe is primarily observed in mid-latitude wintering species.

      We have shown that, unlike itinerancy, the birds in our study don't pause for 1-2 months at multiple non-breeding sites, but instead migrate significant distances, up to 1000 km, throughout the winter. Furthermore, unlike itinerancy, the sites they reach are consistently snow-free throughout the year. Following the logic of publications on Montagu's harriers (Schlaich et al. 2023), our birds do not wait for favorable conditions at the next site, as is typical of itinerancy. Moreover, this behavior is influenced by external factors such as snow cover dynamics and occurs primarily in mid-latitudes. Researchers studying a species similar to our subject, the Common buzzard, observed a similar pattern and termed it "prolonged autumn migration" rather than itinerancy. Although their transmitters stopped working in mid-winter, precluding a full observation of the annual cycle, they captured the essence of continued migration at a slower pace, distinct from itinerancy. We've detailed all of these findings in a new section.

      In addition, we acknowledge the mischaracterization of the implications of our research as ‘Conservation implications’ and have corrected this to ‘Mapping ranges and assessing population trends’, as you suggested.

      Finally, we've rewritten the Conclusion, removing overly grandiose statements and simply summarizing the main findings.

      We appreciate your time and effort in reviewing our manuscript. With your invaluable input, it has become clearer, more concise, and easier to understand.

      Dataset: unclear what is the frequency of GPS transmissions. Furthermore, information on relative tag mass for the tracked individuals should be reported.

      We have included this information in our manuscript (L 115-122). We also refer to the study in which this dataset was first used and described in detail (L 123).

      Data pre-processing: more details are needed here. What data have been removed if the bird died? The entire track of the individual? Only the data classified in the last section of the track? The section also reports on an 'iterative procedure' for annotating tracks, which is only vaguely described. A piecewise regression is mentioned, but no details are provided, not even on what is the dependent variable (I assume it should be latitude?).

      Regarding the deaths, we only removed the data when the bird was already dead. We estimated the date of death and excluded tracking data corresponding to the period after the bird's death. We have corrected the text to make this clear (L 130-131).

      Regarding the piecewise regression, we have added a detailed description on lines 136-148.

      Data analysis: several potential issues here:

      (1) Unclear why sex was not included in all mixed models. I think it should be included.

      Our dataset contains 35 females and eight males (L116). This ratio does not allow us to include sex in all models and adequately assess the influence of this factor. At the same time, because adult females disperse farther than males in some raptor species, we conducted a separate analysis of the dependence of migration distance on sex (Table S8) and found no evidence for this in our species. We describe this in the Methods (L177-181) and report the result in the Results (L277-278).

      (2) Unclear what is the rationale of describing habitat use during migration; is it only to show that it is a largely unsuitable habitat for the species? But is a formal analysis required then? Wouldn't be enough to simply describe this?

      Habitat use and snow cover determine the two main phases (quick and slow) of the pattern we describe. We believe that habitat analysis is appropriate in this case, and a simple description would be uninformative and not support our conclusions.

      (3) Analysis of snow cover: such a 'what if' analysis is fine but it seems to be a rather indirect assessment of the effect of snow cover on movement patterns. Can a more direct test be envisaged relating e.g. daily movement patterns to concomitant snow cover? This should be rather straightforward. The effectiveness of this method rests on among-year differences in snow cover and timing of snowfall. A further possibility would be to demonstrate habitat selection within the entire non-breeding home range of an individual in relation to snow cover. Such an analysis would imply associating presence/absence of snow to every location within the non-breeding range and testing whether the proportion of locations with snow is lower than the proportion of snow of random locations within the entire non-breeding home range (95% KDE) for every individual (e.g. by setting a 1/10 ratio presence to random locations).
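
      The use-versus-availability comparison proposed in this comment can be sketched as follows. This is only a minimal illustration under stated assumptions, not code from the manuscript or the review: the function names, the boolean snow flags, and the toy data are hypothetical, and a simple permutation test stands in for whatever model would actually be used. The 1:10 ratio of used to random locations follows the reviewer's suggestion.

```python
import random

def snow_proportion(flags):
    """Fraction of locations flagged as snow-covered (True)."""
    return sum(flags) / len(flags)

def use_vs_availability(used, available, n_perm=2000, seed=0):
    """One-sided permutation test: is snow less frequent at used locations
    than at random (available) locations?

    used, available: lists of booleans (True = snow present).
    Returns (observed difference used - available, p-value).
    """
    rng = random.Random(seed)
    observed = snow_proportion(used) - snow_proportion(available)
    pooled = list(used) + list(available)
    n_used = len(used)
    at_least_as_low = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = snow_proportion(pooled[:n_used]) - snow_proportion(pooled[n_used:])
        if diff <= observed:
            at_least_as_low += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (at_least_as_low + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical toy data for one individual: 50 GPS fixes and 500 random
# points inside its non-breeding home range (1:10 ratio).
used = [True] * 10 + [False] * 40          # snow at 20% of used locations
available = [True] * 300 + [False] * 200   # snow at 60% of random locations
diff, p = use_vs_availability(used, available)
```

      In a real analysis the comparison would be repeated per individual, and probably per period, because pooling locations across the whole season discards the temporal dynamics of snow cover.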

      The proposed analysis will provide an opportunity to assess whether the Rough-legged buzzard selects areas with the lowest snow cover, but will not provide an opportunity to follow the dynamics and will therefore give a misleading overall picture. This is especially true in the spring months. In March-April, Rough-legged buzzards move northeast and are in an area that is not the most open to snow. At this time, areas to the southwest are more open to snow (this can be seen in Figure 3b). If we perform the proposed analysis, the control points for this period would be both to the north (where there is more snow) and to the south (where there is less snow) from the real locations, and the result would be that there is no difference in snow cover. 

      A step-selection analysis could be used, as we did in our previous work (Curk et al 2020 Sci Rep) with the same Rough-legged buzzards (but during migration, not winter). But this would only give us a qualitative idea, not a quantitative one - that Rough-legged Buzzards move from snow (in the fall) and follow snowmelt progression (in the spring). 

      At the same time, our analysis gives a complete picture of snow cover dynamics in different parts of the non-breeding range. This allows us to see that if Rough-legged buzzards remained at their fall migration endpoint without moving southwest, they would encounter 14.4% more snow cover (99.5% vs. 85.1%). Although this difference may seem small (14.4%), it holds significance for rodent-hunting birds, distinguishing between complete and patchy snow cover.

      Simultaneously, if Rough-legged buzzards immediately flew to the southwest and stayed there throughout winter, they would experience 25.7% less snow cover (57.3% vs. 31.6%). Despite a greater difference than in the first case, it doesn't compel them to adopt this strategy, as it represents the difference between various degrees of landscape openness from snow cover.
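
      The two differences quoted above are differences between percentages, that is, percentage points. The short sketch below reproduces them and shows how a relative difference would be computed instead; the function names are hypothetical and the input values are the snow cover percentages given in the text.

```python
def percentage_point_gap(higher, lower):
    """Absolute difference between two percentages, in percentage points."""
    return round(higher - lower, 1)

def relative_reduction(higher, lower):
    """Relative reduction from `higher` to `lower`, as a percentage of `higher`."""
    return round(100 * (higher - lower) / higher, 1)

# Staying at the fall migration endpoint vs. the observed movement.
stay_at_endpoint = percentage_point_gap(99.5, 85.1)

# Observed movement vs. wintering in the southwest from the start.
winter_in_southwest = percentage_point_gap(57.3, 31.6)

# The same second comparison expressed as a relative reduction.
winter_in_southwest_relative = relative_reduction(57.3, 31.6)
```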

    1. Reviewer #1 (Public review):

      Summary:

      In an era of increasing antibiotic resistance, there is a pressing need for the development of novel sustainable therapies to tackle problematic pathogens. In this study, the authors hypothesize that pyoverdines - metal-chelating compounds produced by fluorescent pseudomonads - can act as antibacterials by locking away iron, thereby arresting pathogen growth. Using biochemical, growth and virulence assays on 12 opportunistic pathogen strains, the authors demonstrate that pyoverdines induce iron starvation, but this effect was highly context dependent. This same effect has been demonstrated for plant pathogens, but not for human opportunistic pathogens exposed to natural siderophores. Only those pathogens lacking (1) a matching receptor to take up pyoverdine-bound iron and/or (2) the ability to produce strong iron chelators themselves experienced strong growth arrest. This would suggest that pyoverdines might not be effective against all pathogens, thereby potentially limiting the utility of pyoverdines as global antibacterials.

      Strengths:

      The work addresses an important and timely question - can pyoverdines be used as an alternative strategy to deal with opportunistic pathogens? In general, the work is well conducted with rigorous biochemical, growth and virulence assays. The work is clearly written, and the findings are supported by high-quality figures.

      Weaknesses:

      I do not think there are any 'weaknesses' as such. The authors have taken all suggestions on board and this has greatly improved the quality and robustness of the work.

    2. Reviewer #2 (Public review):

      In this work, Vollenweider et al. examine the effectiveness of using natural products, specifically molecules that chelate iron, to treat infectious agents. Through the purification of 320 environmental isolates, 25 potential candidates were identified based on inhibition assays and further screened. The structural information and chemical composition of these candidates were determined. Using a series of well-described and standard assays, the authors show that three compounds have some effect in reducing mortality in a simple in vivo model.

      The paper is well-structured and thorough; targeting virulence factors in this manner is an excellent approach. However, my enthusiasm is dampened by the mediocre effects of the compounds. A reduction in the hazard ratio is reported, indicating that the compounds are having an effect, but without comparison to other iron-chelating molecules or current standards of care, it is difficult to contextualize the significance of these reductions.

      I am less convinced by a claim from the abstract: "Furthermore, experimental evolution combined with whole-genome sequencing revealed reduced potentials for resistance evolution compared to an antibiotic." Perhaps this is a semantic issue, but what is meant by "potential for resistance evolution"? My understanding is that this refers to mutations or sets of mutations that would be favored under selective pressure, allowing the bacteria to more easily climb a fitness landscape peak. However, the authors present a different result: the bacteria did not grow better after selection in different conditions (except for the positive control using ciprofloxacin). They correctly suggest that there may be individuals in the populations that have developed resistance and recommend isolating 8 from each treatment for testing. However, they then use the mean value of these individuals to conclude that there is no difference from the ancestor. This seems incorrect: surely the point of using individuals is not to compare them as a group but to determine if any one has a growth rate outside the expected distribution. In short, Figure S10 does not seem to support the findings reported in line 417.
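
      The distinction drawn here, between comparing the group mean and checking each individual against the ancestral distribution, can be illustrated with a small sketch. Everything in it is hypothetical: the growth values are invented, the function name is illustrative, and the threshold of three standard deviations is an arbitrary assumption rather than a recommendation from the review.

```python
from statistics import mean, stdev

def flag_outlier_clones(ancestor_growth, clone_growth, z_threshold=3.0):
    """Return indices of clones whose growth exceeds the ancestor mean
    by more than z_threshold ancestor standard deviations."""
    mu = mean(ancestor_growth)
    sd = stdev(ancestor_growth)
    return [i for i, g in enumerate(clone_growth) if (g - mu) / sd > z_threshold]

# Hypothetical growth under treatment (arbitrary units), ancestor replicates.
ancestor = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47]

# Eight hypothetical evolved clones: seven grow slightly worse than the
# ancestor, one grows much better.
clones = [0.46, 0.47, 0.45, 0.48, 0.46, 0.47, 0.45, 0.80]

# Group-level view: the mean barely moves because the seven weaker clones
# offset the single strongly growing one.
group_mean_shift = mean(clones) - mean(ancestor)

# Individual-level view: one clone lies far outside the ancestor distribution.
resistant_candidates = flag_outlier_clones(ancestor, clones)
```

      With these invented numbers the mean of the clones is almost identical to the ancestor mean, while the individual-level check still singles out the last clone.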

      A final consideration for the evolution experiment is the choice of a bactericidal antibiotic. It might have been more appropriate to use a bacteriostatic drug as a control. However, I feel that additional work on this topic is beyond the scope of the current paper.

      Similarly, it would be interesting to consider how evolving the isolates in iron-limited media would affect resistance levels. Currently, I think the difference in growth rate is attributed to the iron-scavenging nature of the siderophores. In future work, this could be tested, and an evolution experiment in which iron availability is measured could provide valuable insights. To clarify, I believe this work is not necessary for the current paper, but it would be an interesting avenue for future research.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In an era of increasing antibiotic resistance, there is a pressing need for the development of novel sustainable therapies to tackle problematic pathogens. In this study, the authors hypothesize that pyoverdines - metal-chelating compounds produced by fluorescent pseudomonads - can act as antibacterials by locking away iron, thereby arresting pathogen growth. Using biochemical, growth, and virulence assays on 12 opportunistic pathogen strains, the authors demonstrate that pyoverdines induce iron starvation, but this effect was highly context-dependent. This same effect has been demonstrated for plant pathogens, but not for human opportunistic pathogens exposed to natural siderophores. Only those pathogens lacking (1) a matching receptor to take up pyoverdine-bound iron and/or (2) the ability to produce strong iron chelators themselves experienced strong growth arrest. This would suggest that pyoverdines might not be effective against all pathogens, thereby potentially limiting the utility of pyoverdines as global antibacterials.

      Strengths:

      The work addresses an important and timely question - can pyoverdines be used as an alternative strategy to deal with opportunistic pathogens? In general, the work is well conducted with rigorous biochemical, growth, and virulence assays. The work is clearly written and the findings are supported by high-quality figures.

      Weaknesses:

      I do not think there are any 'weaknesses' as such. However, it is well known that siderophore production is highly plastic, typically being upregulated in response to metal limitation (as well as toxic metal stress). Did the authors quantify whether pyoverdine supplementation altered siderophore production in the focal pathogens (either through phenotypic assays / transcriptomics)? Could such a phenotypic plastic response result in an increased capacity to scavenge iron from the environment? Importantly, increased expression of siderophores has been shown to enhance pathogen virulence (e.g. Lear et al 2023: increased pyoverdine production is linked with increased virulence in Pseudomonas aeruginosa). I really appreciate the amount of work the authors have put into this study, but I would suggest expanding the discussion a bit to include a few sentences on

      (1) unintentional consequences of pyoverdine treatment (e.g. changes in gene expression and non-siderophore-related mutations, such as those affecting biofilm formation) on disease dynamics/pathogen virulence;

      (2) the efficacy of siderophore treatment under more natural conditions, i.e. when the pathogens have to compete with other species in the resident community (i.e. any other effects than resistance evolution through HGT of pyoverdine receptors as mentioned).

      Response 1: We would like to thank reviewer # 1 for the positive and constructive assessment. We agree that discussing the above points is important. We have added new paragraphs in the discussion, in which we elaborate on unintentional consequences (lines 532-551) and HGT of receptors (lines 599-607).

      Reviewer #1 (Recommendations For The Authors):

      I only have minor comments/suggestions for the authors, all listed below:

      • The authors' findings show that the antibacterial activity of pyoverdine is highly context-dependent. As such, I would suggest somewhat toning down the quite general statement in the Abstract: 'Thus, pyoverdines from environmental strain could become new sustainable antibacterials against human pathogens'

      Response 2: We agree that the pyoverdine treatment is especially potent against Acinetobacter baumannii and Staphylococcus aureus, but less so against Klebsiella pneumoniae. The treatment success is pathogen-dependent, and we have thus modified the phrase in the abstract (lines 32-34). The new sentence now reads: 'Thus, pyoverdines from environmental strains have the potential to become a new class of sustainable antibacterials against specific human pathogens.' Also in other parts of the manuscript (Results and Discussion), we emphasize that the pyoverdine treatment will likely be effective against specific pathogens (e.g., those with lower-iron affinity siderophores).

      • Bacteria often produce more than one type of siderophore. Do you know whether the 320 natural isolates used in this study produce any non-pyoverdine siderophores? Previous work has shown that pyochelin production is suppressed in PAO1 under a wide range of lab conditions. Do you know whether this is the case for the natural isolates used here, and can you rule out a potential role of non-pyoverdines in the iron starvation observed in Figure 1?

      Response 3: This is a valid question. Our own bioinformatic and phenotypic assays reveal that a certain fraction of strains (~ 40%) can produce secondary siderophores (unpublished data). We now mention the existence of secondary siderophores on lines 97-100 and 123. However, we do not think that their contribution to the supernatant assay results is large since the expression of pyoverdine typically suppresses the expression of the secondary siderophores (Cornelis 2010 Appl Microbiol Biotechnol; Dumas et al. 2013 Proc B) under stringent iron limitation. Furthermore, secondary siderophores have lower iron-binding affinities than pyoverdine. Finally, both the semi-pure and ultra-pure pyoverdine extracts showed strong pathogen inhibition (Fig. 3), and we are thus confident that pyoverdine is responsible for the observed growth inhibition.

      • Upon first mentioning the 'mock control' in the Results section in the main text, please state what the actual treatment is.

      Response 4: Thank you for noticing this. We now explain in more detail the actual treatment conditions used on lines 103-107 and in the caption of Figure 1. We have further removed the term 'mock' as it is confusing in this context and simply refer to the 'control treatment' in the text.

      • Please mention what the different colours mean in the legend of growth recovery in Figure 1B

      Response 5: We have clarified the colour scheme in the legend of Figure 1B.

      • Please clarify whether you used 12 or 14 strains of human pathogens (the latter number is mentioned in the results section)?

      Response 6: In the methods (lines 647-650), we now clearly specify that we used 12 strains of human pathogens in the initial supernatant screen (Figure 1). For all subsequent analyses (dose-response curves and infection experiments), we included the ESKAPE pathogens K. pneumoniae and A. baumannii.

      • Please explain whether ferribactin can be used in any other way than iron chelation (e.g. can this precursor be recycled to form pyoverdine)?

      Response 7: We apologize for not having properly explained the role of ferribactin. Under natural conditions, ferribactin is not secreted. It is kept in the periplasmic space, where it matures to pyoverdine. We most likely recovered ferribactin in the supernatant because of the vigorous shaking and centrifugation involved in the pyoverdine purification protocol. We now explain this on lines 216-218. Thus, there is no ferribactin secretion and recycling.

      • Have the authors looked at whether there is a relationship between the degree of growth arrest and phylogenetic distance? Would you expect there to be one?

      Response 8: This is an interesting question. We have now constructed a phylogenetic tree to explore this relationship (new Figure S2). We found that strains with inhibitory supernatants were scattered across the phylogenetic tree (described on lines 129-135). However, we also found two branches on the tree on which strains with inhibitory supernatant effects were overrepresented. This matches our previous analysis well, which showed that closely related species can produce similar pyoverdine types, but that the same pyoverdine can also be produced by completely different species (Gu et al. 2024 eLife).

      • In the Methods section, please mention you used pyoverdine-only controls in the infection assay.

      Response 9: We now mention the use of pyoverdine-only controls in the Methods section (lines 788-790). Overall, we have improved the infection procedure section (starting on line 770). Thank you for pointing this out.

      • Did you confirm whether the addition of pyoverdine resulted in lower bacterial loads in Galleria? In other words, were the observed changes in mortality solely related to changes in bacterial density?

      Response 10: Thank you for this valid question. No, we did not test whether pyoverdine treatment reduces the bacterial load. However, we did this in the past in two studies with a similar set of pathogens (Weigert et al. 2017 Evol Appl; Schmitz et al. 2023 Proc B) and found strong correlations between G. mellonella survival and bacterial loads. We agree that it is important to understand how pyoverdine affects pathogen load in the host and we will address this point in future studies.

      • In your infection assay, were Galleria (n=10) for each treatment housed in the same environment/container? If so, can you treat these as independent observations or should you use some sort of grouping variable in your survival analysis?

      Response 11: Thank you for pointing this out. We forgot to clarify this in the Methods section and now do so on lines 777-779. All larvae were individually housed in separate wells of a 24-well plate. There was no physical contact between larvae and no opportunity for pathogen exchange. As such, we treat each individual larva as an independent observation.

      Reviewer #2 (Public Review):

      In this work, Vollenweider et al. examine the effectiveness of using natural products, specifically molecules that chelate iron, to treat infectious agents. Through the purification of 320 environmental isolates, 25 potential candidates were identified from natural products based on inhibition assays and were further screened. The structural information and chemical composition were determined.

      The paper is well-structured and thorough; targeting virulence factors in this manner is a great idea. My enthusiasm is dampened by the mediocre effects of the compounds. The lack of a dose-response curve in the survivability assays suggests a limited scope for these molecules. While it is encouraging that the best survivability occurred at the lowest toxicity level, it opens questions as to how effective such molecules can be. Either the reduction in mortality was offset by using higher concentrations, which was not observed in the compound-alone test, or there is no dose-response curve. The latter would suggest to me that the variation in survivability is not due to the addition of siderophores.

      Response 12: Thank you very much for the overall positive assessment. We understand your concern regarding the effectiveness of pyoverdines in the host. However, we wish to emphasize that hazard risks were reduced by more than 50% when treating A. baumannii and K. pneumoniae. Moreover, it was not so surprising to us that the treatment worked best at intermediate pyoverdine concentrations. We anticipated that pyoverdines could have negative effects for the host at relatively high concentrations because siderophores can interfere with host iron stocks (see discussion starting on line 552). Finally, dose-response curves do not necessarily need to be linear or sigmoid; they can also be hump-shaped. To better illustrate this aspect, we have now plotted the time to death for all the deceased larvae against the pyoverdine concentration gradient and fitted polynomial regressions (new Fig. S6). For the above two pathogens, we found hump-shaped dose-response curves in four out of the six comparisons. We present this new analysis on lines 351-362.
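
      A hump-shaped dose-response relationship of the kind described in this response can be detected with a second-order polynomial fit. The sketch below is a minimal illustration, assuming a quadratic model (the response does not state the polynomial degree); the dose levels, time-to-death values, and function name are hypothetical and are not the authors' data or code.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Sums of powers of x and cross-products with y.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented 3x3 system, solved by Gauss-Jordan elimination with pivoting.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c

# Hypothetical data: five dose levels (arbitrary scale) and the mean
# time to death (hours) of deceased larvae at each level.
dose = [0, 1, 2, 3, 4]
time_to_death = [20, 27, 28, 25, 20]

a, b, c = fit_quadratic(dose, time_to_death)

# A negative quadratic coefficient with the vertex inside the tested
# dose range indicates a hump-shaped curve.
peak_dose = -b / (2 * c)
is_hump_shaped = c < 0 and min(dose) < peak_dose < max(dose)
```

      In practice one would also compare the quadratic fit against a linear one, since a negative quadratic coefficient alone does not show that the curvature is statistically supported.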

      I would also like to see how these molecules compare to other iron-chelating molecules. Deferoxamine is a bacteria-derived siderophore that is FDA-approved. However, it is not used to treat infections. Would the authors consider comparing their candidate molecules to well-studied molecules? This also raises questions about the novelty of this work; I think the authors could rephrase the discussion to better reflect that bioprospecting for iron-chelating molecules has previously occurred and been successful.

      Response 13: Thank you for the comment. The initial version of our manuscript already featured a brief discussion on other iron-chelation therapies. We have now changed the narrative to better reflect the differences of our approach to already existing iron-chelating molecules such as deferoxamine (lines 608-632).

      Finally, I am concerned about the few mutations reported in the resistance study. Looking at the SI, it appears that very few mutations were seen. It is unclear what filtering the authors used to arrive at such a low number of mutations. Even filtering against mutations that were selected by adaptation to the media, it seems low that only a handful of clones had distinct mutations.

      Response 14: We apologise for the unclear explanations and data analysis. When reanalysing the data we indeed detected a mistake: we originally treated all genomes as being of clonal origin, despite the fact that we sequenced entire populations for the control treatments. We have now completely re-done the mutational analysis using the breseq pipeline as newly described in the Methods (lines 861-866) and presented in the Results (lines 421-451). We have improved the filtering process and indeed found many more mutations, including the loss of mobile genetic elements. However, it is important to note that it is not uncommon to only find a few beneficial mutations. Especially in cases where there are selective sweeps, often only a few mutations fix.

      This paper has a lot of strengths. The workflow is logical and well-executed; the only significant weakness is the effect of the molecules and the lack of an explanation for a dose-response curve in the survivability assay, especially when compared to the data reported in Figure 3, as the authors describe in lines 214-217.

      Response 15: Thank you for this overall positive assessment. As discussed in our response 12, the effect of the molecule in the host was not weak as it decreased hazard risks by more than 50% for A. baumannii and K. pneumoniae. Moreover, we explain that the benefit of the pyoverdine treatment (in terms of treating the infection) can be offset by adverse effects on the host, especially at high pyoverdine concentrations.

      Reviewer #2 (Recommendations For The Authors):

      • Compare these compounds to well-studied iron chelating molecules.

      Response 16: We have addressed this comment in our response 13.

      • Consider adding time of death to the analysis for the survivability. While the reduction in mortality was not large, perhaps the time to death increased.

      Response 17: This is an excellent suggestion. We have now analysed the time-to-death as a function of pyoverdine concentration (new Figure S6). Time-to-death was highly variable and sample size was fairly low for A. baumannii and K. pneumoniae as many larvae survived. Nonetheless, we found hump-shaped dose-response curves in four out of six comparisons and a linear dose-response curve in one case. We now report the new analyses on lines 351-362. Finally, we would like to stress once more that the reduction in mortality was considerable (hazard risk reduction by more than 50%).

      • I would also like to see the actual growth curves of the pathogens in the SI to accompany Fig 6.

      Response 18: This is a good point. We have now included the actual growth curves of the pathogens in the Supporting Information to accompany Figure 6 (new Figures S9 and S10).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be a key to developing a new strategy to treat PA infection, as PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed a fluorescently labeled antigen hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model as potent as that of the V2L2MD mAb (equivalent to gremubamab). However, the therapeutic potency of these isolated mAbs is uncertain, as gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether the PcrV-specific mAbs isolated in this study have better therapeutic potential than gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice with potency equal to, but not superior to, that of the gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

      The authors thank the reviewer for their thoughtful positive assessment. As noted by the reviewer, the studies described here, which were performed in mice, show that our MBC-derived mAbs are as effective as V2L2MD, a mAb that is one component of the gremubamab bi-specific. However, key theoretical strengths of MBC-derived mAbs (reduced immunogenicity, full participation in effector functions) are not easily tested in mice. We have clarified and expanded our discussion of these points in our revised manuscript, particularly in the Discussion paragraph 4.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Page 8. Using improved methods that enhanced the efficiency and depth of sequencing (manuscript in preparation...). This method is not provided in detail. The authors should provide a detailed method (as a preprint on a public database or described in the method section).

      We thank the reviewers for their interest in the details of the specific methods for single cell B cell receptor sequencing. We regret that the manuscript is still in preparation. In fact, our current methods section provides much more detail about sequencing methods than is customarily supplied by authors of mAb development papers. However, we understand the frustration and will remove the citation of our manuscript in preparation from our revised manuscript.

    2. eLife Assessment

      Treatment of Pseudomonas aeruginosa (PA) infections is challenging because of intrinsic and acquired antibiotic resistance to most antibiotic drug classes. Therefore, by using donor B cells in subjects with cystic fibrosis who undergo intermittent or chronic airway PA infections, the authors aimed to isolate B-cell receptors against PA virulence factors and examined their biological activities. The data are solid and the protective antibodies identified in this study could be useful for protection against PA.

    3. Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be a key to developing a new strategy to treat PA infection, as PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed a fluorescently labeled antigen hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model as potent as that of the V2L2MD mAb (equivalent to gremubamab). However, the therapeutic potency of these isolated mAbs is uncertain, as gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether the PcrV-specific mAbs isolated in this study have better therapeutic potential than gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice with potency equal to, but not superior to, that of the gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

    1. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    2. Reviewer #1 (Public review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

    3. Reviewer #2 (Public review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and a stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that are not dependent on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient. 

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The authors responded to all my comments and I have nothing to add. The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

      We thank the reviewer for critically evaluating our work and giving kind suggestions. We are glad that the reviewer found our work to be of potential interest to the broad scientific community.

      Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and a stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

      Weaknesses:

      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      We thank the reviewer for this suggestion. We have investigated the compensation of myosin in NMIIA and NMIIB KD HL-60 cells using Western blot and added this result in our updated manuscript (Fig. S4B, C). The results showed that the level of NMIIB protein in NMIIA KD cells doubled while there was no compensatory upregulation of NMIIA in NMIIB KD cells. This is consistent with our conclusion that NMIIA rather than NMIIB is responsible for amoeboid durotaxis since in NMIIA KD cells, compensatory upregulation of NMIIB did not rescue the durotaxis-deficient phenotype. 

      (2) The expansion microscopy assay is not clearly described and some details are missed such as how the assay is performed on cells under confinement.

      We thank the reviewer for this comment. We have updated the details of the expansion microscopy assay in our revised manuscript (lines 481-485), including how the assay is performed on cells under confinement:

      Briefly, CD4+ naïve T cells were seeded on a gradient PA gel with another upper gel providing confinement. 4% PFA was used to fix cells for 15 min at room temperature. After fixation, the upper gradient PA gel was carefully removed and the bottom gradient PA gel with seeded cells was immersed in an anchoring solution containing 1% acrylamide and 0.7% formaldehyde (Sigma, F8775) for 5 h at 37 °C.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      We thank the reviewer for this suggestion. Active nematic models have been employed to recapitulate many phenomena during cell migration (Nat Commun., 2018, doi: 10.1038/s41467-018-05666-8). The active nematic model describes the motion of cells using the orientation field, Q, and the velocity field, u. The director field n (with n = −n) is employed to represent the nematic state, which has head-tail symmetry. However, in our experiments, actin filaments are clearly polarized: they polymerize and flow towards the direction of cell migration. Therefore, we chose the active gel model, which describes the polarized actin field during cell migration. In the Discussion, we have provided a comparison between the active gel model and the motor-clutch model. We have also added a short comparison between the present model and the active nematic model in the main text at lines 345-347:

      The active nematic model employs active extensile or contractile agents to push or pull the fluid along their elongation axis to simulate cells flowing (61). 
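
      The head-tail symmetry argument in the response above can be illustrated with a minimal sketch. It assumes a two-dimensional director and the standard nematic order tensor Q_ij = S(n_i n_j − δ_ij/2); the function names `q_tensor` and `polarization` are hypothetical and are not taken from the authors' model. The sketch only shows that Q is unchanged when n is replaced by −n, whereas a polarization vector reverses, which is why a polar (active gel) description can distinguish front from back.

```python
import math

def q_tensor(theta, order=1.0):
    """2D nematic order tensor Q_ij = S * (n_i * n_j - delta_ij / 2).

    theta is the director angle; order is the scalar order parameter S.
    (Illustrative helper, not the authors' implementation.)
    """
    n = (math.cos(theta), math.sin(theta))
    return [
        [order * (n[i] * n[j] - (0.5 if i == j else 0.0)) for j in range(2)]
        for i in range(2)
    ]

def polarization(theta, magnitude=1.0):
    """Polar vector p = |p| * (cos(theta), sin(theta)); it has a head and a tail."""
    return (magnitude * math.cos(theta), magnitude * math.sin(theta))

theta = math.pi / 6
flipped = theta + math.pi  # corresponds to n -> -n

q_a, q_b = q_tensor(theta), q_tensor(flipped)
p_a, p_b = polarization(theta), polarization(flipped)

# Q is quadratic in n, so flipping the director leaves it unchanged.
q_invariant = all(
    math.isclose(q_a[i][j], q_b[i][j], abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
# p is linear in the direction, so flipping reverses its sign.
p_reversed = all(math.isclose(p_a[k], -p_b[k], abs_tol=1e-12) for k in range(2))

print(q_invariant, p_reversed)
```

      Under these assumptions, both checks evaluate to true: the nematic description cannot encode which end of a filament points forward, while the polar description can.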

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

      We thank the reviewer for this question. In our model, the polarization field is employed to couple actin and myosin together. Actin clearly accumulates at the front while myosin diffuses in the opposite direction. Therefore, we propose that actin and myosin flow in opposite directions, which is captured in the convection terms of the actin and myosin density fields.

    1. eLife Assessment

      This manuscript reports important findings on the impact of maternal obesity on offspring metabolism. It presents solid evidence that maternal obesity induces genomic methylation alterations in oocytes, which can be partly transmitted to F2 in females, and that melatonin is involved in regulating the hyper-methylation of high fat diet oocytes by increasing the expression of DNMTs via the cAMP/PKA/CREB pathway. This study would be of interest to biologists in the fields of epigenetics and metabolism.

    2. Joint Public review:

      Summary

      This manuscript offers significant insights into the impact of maternal obesity on oocyte methylation and its transgenerational effects. Chao and colleagues demonstrated the potential mechanisms behind the DNA methylation changes. The major observations of the work include transgenerational DNA methylation changes in offspring of maternal obesity and metabolites such as methionine and melatonin that correlated with the epigenetic changes. Exogenous melatonin treatment could reverse the effects of obesity. The authors further hypothesized that the linkage may be mediated by the cAMP/PKA/CREB pathway regulating the expression of DNMTs. This work involved extensive breeding and DNA methylation analysis across multiple generations, which provides solid data for future research. The work may benefit from deeper data analysis to support causal claims and make the conclusions more concrete.

      Strengths

      The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, and provides convincing data.

      Weaknesses

      The results of this work are correlational, which may require further analysis to establish more concrete conclusions on causal relationships.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      With socioeconomic development, more and more people are obese, which is an important cause of sub-fertility and infertility. Maternal obesity reduces oocyte quality, which may be a reason for the high risk of metabolic diseases in offspring in adulthood. Yet the underlying mechanisms are not well elucidated. Here the authors examined the effects of maternal obesity on oocyte methylation. Hyper-methylation in oocytes was reported by the authors, and the altered methylation in oocytes may be partially transmitted to F2. The authors further explored the association between the serum metabolome and the altered methylation in oocytes. The authors identified decreased melatonin. Melatonin is involved in regulating the hyper-methylation of high-fat diet (HFD) oocytes by increasing the expression of DNMTs, which is mediated by the cAMP/PKA/CREB pathway.

      Strengths:

      This study is interesting and should have significant implications for the understanding of the transgenerational inheritance of GDM in humans.

      Thank you for your positive comments on our manuscript.

      Weaknesses:

      The link between altered DNA methylation and offspring metabolic disorders is not well elucidated; how the altered DNA methylation in oocytes escapes reprogramming in transgenerational inheritance is also unclear.

      Thanks. These are very good questions. There is a long way to go to completely elucidate the relationship between methylation and offspring metabolic disorders, and the mechanisms by which acquired methylation escapes reprogramming during development. We would like to explore these in the future.

      Reviewer #2 (Public Review):

      This manuscript offers significant insights into the impact of maternal obesity on oocyte methylation and its transgenerational effects. The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, to explore how high-fat diet (HFD)-induced obesity alters genomic methylation in oocytes and how these changes are inherited by subsequent generations. The findings suggest that maternal obesity induces hyper-methylation in oocytes, which is partly transmitted to F1 and F2 oocytes and livers, potentially contributing to metabolic disorders in offspring. Notably, the study identifies melatonin as a key regulator of this hyper-methylation process, mediated through the cAMP/PKA/CREB pathway.

      Strengths:

      The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, and provides convincing data.

      Thank you for your positive comments on our manuscript.

      Weaknesses:

      The description in the results section is somewhat verbose. This section (lines 126~227) utilized transgenerational breeding experiments and methylation analysis to demonstrate that maternal obesity-induced alterations in oocyte methylation (including hyper-DMRs and hypo-DMRs) can be partially transmitted to F1 and F2 oocytes and livers. The authors should consider condensing and revising this section for clarity and brevity.

      Thanks for your suggestions. We have rewritten this part in the revised manuscript.

      There is a contradiction with Reference 3, but the discrepancy is not discussed. In this study, the authors observed an increase in global methylation in oocytes from HFD mice, whereas Reference 3 indicates Stella insufficiency in oocytes from HFD mice. This Stella insufficiency should lead to decreased methylation (Reference 33). There should be a discussion of how this discrepancy can be reconciled with the authors' findings.

      Thanks for your suggestions. As reported by Reference 33, STELLA prevents hypermethylation in oocytes by sequestering UHRF1, which recruits DNMT1 into nuclei, away from the nuclei. Han et al. reported that obesity induced by a high-fat diet reduces the STELLA level in oocytes. Together, these findings indicate that STELLA insufficiency might induce hypermethylation in oocytes, although Han et al. did not report significant hypermethylation in obese oocytes using immunofluorescence. This contradiction may be caused by the limited sample size (n=14) used by Han et al. We have added a brief discussion in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Maternal obesity is a health problem for both pregnant women and their offspring. Previous works, including work from this group, have shown significant DNA methylation changes in offspring of obese pregnancies in mice. In this manuscript, Chao et al. dissected the potential mechanisms behind the DNA methylation changes. The major observations of the work include transgenerational DNA methylation changes in offspring of maternal obesity, and metabolites such as methionine and melatonin that correlated with the above epigenetic changes. Exogenous melatonin treatment could reverse the effects of obesity. The authors further hypothesized that the linkage may be mediated by the cAMP/PKA/CREB pathway regulating the expression of DNMTs.

      Strengths:

      The transgenerational change of DNA methylation following HFD is of great interest for future research to follow. The metabolic treatment that could change the DNA methylation in oocytes is also interesting and has potential relevance to future clinical practice.

      Thank you for your positive comments on our manuscript.

      Weaknesses:

      The HFD oocytes have more 5mC signal based on staining and sequencing (Fig 1A-1F). However, the authors also identified almost equal numbers of hyper- and hypo-DMRs, which raises questions regarding where these hypo-DMRs were located and how to interpret their behaviors and functions. These questions are also critical to address in the following mechanistic dissections as the metabolic treatments may also induce bi-directional changes of DNA methylation. The authors should carefully assess these conflicts to make the conclusions solid.

      Thanks for the helpful comments and suggestions. As presented in Fig. 1F, the methylation level increases in promoter and exon regions and decreases in intron, utr3, and repeat regions. Following the suggestions, we further analyzed the distribution of DMRs and found that, compared with hyper-DMRs, hypo-DMRs were mainly distributed in utr3, intron, repeat, and tes regions (Fig. S3). These results suggest that the distribution of DMRs in the genome is not random.

      The transgenerational epigenetic modifications are controversial. Even for F0 offspring under maternal obesity, there were different observations compared to this work (Hou, YJ., et al. Sci Rep, 2016). The authors should discuss the inconsistencies with previous works.

      Thanks for the suggestions. There are contradictory reports on whole-genome DNA methylation of oocytes in obese mice. Hou YJ et al. reported in 2016, using immunofluorescence, that obesity reduces the whole-genome DNA methylation of NSN GV oocytes. In 2018, Han LS et al. reported, also using immunofluorescence, that the whole-genome 5mC level of oocytes is not significantly influenced by obesity, but they found that obesity reduces the Stella level in oocytes. Stella localizes to the cytoplasm and nuclei of oocytes and sequesters Uhrf1 from the nuclei. Stella knockout in oocytes results in an approximately twofold increase in global methylation in MII oocytes by recruiting more DNMT1 into nuclei. These findings suggest that the global methylation of oocytes in obese mice should be increased, yet Han LS et al. reported similar methylation in oocytes from obese and non-obese mice. Thus, the contradiction may be caused by the different sample sizes in our manuscript and previous studies, and Hou YJ and colleagues examined the methylation of NSN GV oocytes only. In Stella+/- oocytes, global methylation is normal, which suggests that Stella insufficiency may not be the main reason for the increased methylation of oocytes in obese mice. We have added a brief discussion in the revised manuscript.

      In addition to the above inconsistencies, the DNA methylation analysis in this work was not carefully evaluated. Several previous studies evaluated DNA methylation in mouse oocytes and showed global methylation levels of around 50% (Shirane K, et al. PLoS Genet, 2013; Wang L., et al, Cell, 2014). In Figure 1E, the overall methylation level is about 23% in control, which is significantly different from previous works. The authors should provide more details regarding the WGBS procedure, including but not limited to sequencing coverage, bisulfite conversion rate, etc.

      Thanks for the good questions. Smallwood et al. reported, using single-cell genome-wide bisulfite sequencing, that the CG methylation of MII oocytes is about 33.1% (Smallwood et al. Nature Methods, 2014). Shirane K et al. reported that the average methylation level of GV oocytes is 37.9%. Kobayashi H et al. reported that the CG methylation in GV oocytes is about 40% (Kobayashi H et al. PLoS Genet. 2012). CG methylation in fully grown oocytes is about 38.7% (Maenohara S et al. PLoS Genet. 2017). The variation in reported oocyte methylation is associated with sequencing methods, sequencing depth, and mapping rates. In the present study, whole genome bisulfite sequencing (WGBS) for small samples and the methylation analysis were performed by NovoGene. Read counts ranged from 31613641 to 37359643, the unique mapping rate was ≥32.88%, the conversion rate was >99.44%, and the sequencing depth was 2.45 to 2.75. The relevant information is presented in Table S1. The sequencing depth might be a reason for the inconsistency. However, we further confirmed our sequencing results using bisulfite sequencing (BS), and the BS and WGBS results were similar. These findings suggest that our results are reliable.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Since the results show that melatonin may play a role in hyper-methylation, the authors need to give some basic information in the Introduction section.

      Thanks. We added more information to the Introduction section.

      (2) There are many differential metabolites identified. Besides melatonin, are other differential metabolites involved in the altered methylation in oocytes?

      This is a good question. We first filtered the differential metabolites that may be involved in methylation, and then further filtered these metabolites according to the relevant DNA methylation pathways and published papers. After that, we confirmed the concentrations of the relevant metabolites in the serum using ELISA. Certainly, we cannot completely exclude other metabolites that might be involved in regulating DNA methylation.

      (3) The altered methylation would be found in the F1 tissues. Did the authors examine the other parts besides the liver?

      Thank you. In the present study, we did not examine DNA methylation in tissues other than the liver. We agree that the altered methylation should be observed in other tissues.

      (4) Did the authors try or guess how many generations the maternal obesity-induced genomic methylation alterations can be transmitted?

      Thanks. This is a good question. Takahashi Y and colleagues reported, using DNA methylation-edited mice, that acquired DNA methylation at CpG islands can be transmitted across multiple generations (Takahashi Y et al. Cell, 2023). Similar inheritance is also reported by other studies using different models.

      (5) The F2 is indirectly affected by maternal obesity, so the evidence is not enough to prove the transgenerational inheritance of the altered methylation.

      Thanks. We find that the altered DNA methylation in F2 tissue and oocytes is similar to that in F1 oocytes. This suggests that the altered DNA methylation in F2 oocytes should be at least partly transmitted to F3. A previous paper (Takahashi Y et al. Cell, 2023) confirms that acquired DNA methylation at CpG islands can be transmitted across several generations through the paternal and maternal germ lines. Certainly, it would be better to examine this in F3 tissues.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure Font Size: The font sizes in the figures are quite inconsistent. Please try to make the font size uniform for similar types of text.

      Thanks for your suggestions. We re-edited the relevant figures in the revised manuscript.

      (2) Figure Clarity: Ensure that all critical information in the figures is clearly visible, such as in Figure 3C.

      Thank you. We revised this figure.

      (3) Figure 1B, C: The position of the asterisks ("**") is not centered in the corresponding columns, and the font size is too small. Please correct this and address similar issues in other figures.

      Thank you for your suggestions. We re-edited these in the revised figures.

      (4) Line 126: The current expression is confusing. It may be revised to: "Both the oocyte quality and the uterine environment can contribute to adult diseases, which may be mediated by epigenetic modifications."

      Thanks. We revised this sentence in the revised manuscript.

      (5) Missing Panel in Figure 3: Figure 3 is missing panel 3N.

      Thank you so much. We corrected it in the revised manuscript.

      (6) Figure Panel Order: Please adjust the order of the panels in the figures to follow a logical reading sequence.

      Thank you. We changed the order in the revised manuscript.

      (7) Line 493: Correct "inthe" to "in the".

      Thank you. We revised it.

      (8) Lines 102-106: Polish the wording and expression, for example as follows: "We analyzed the differentially methylated regions (DMRs) in oocytes from both HFD and CD groups and identified 4,340 DMRs. These DMRs were defined by the criteria: number of CG sites ≥ 4 and absolute methylation difference ≥ 0.2. Among these, 2,013 were hyper-DMRs (46.38%) and 2,327 were hypo-DMRs (53.62%) (Fig. 1G). These DMRs were distributed across all chromosomes (Fig. 1H)."

      Thank you! We re-wrote these parts in the revised manuscript.
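
      The DMR criteria and percentages quoted in comment (8) can be checked with a minimal sketch. It assumes that the methylation difference is computed as HFD minus CD, so that a positive value marks a hyper-DMR; the function names `classify_dmr` and `percent` are hypothetical and do not come from the authors' pipeline, which is not described here.

```python
def classify_dmr(n_cg_sites, meth_diff, min_sites=4, min_abs_diff=0.2):
    """Classify a candidate region using the thresholds quoted in the review.

    n_cg_sites: number of CG sites in the region.
    meth_diff: methylation difference, assumed here to be HFD minus CD.
    Returns "hyper", "hypo", or None when the region fails the criteria.
    """
    if n_cg_sites < min_sites or abs(meth_diff) < min_abs_diff:
        return None
    return "hyper" if meth_diff > 0 else "hypo"

def percent(part, total):
    """Percentage of total, rounded to two decimal places."""
    return round(100.0 * part / total, 2)

# Counts quoted in the reviewer's suggested wording.
hyper, hypo = 2013, 2327
total = hyper + hypo

print(total, percent(hyper, total), percent(hypo, total))
print(classify_dmr(5, 0.25), classify_dmr(4, -0.3), classify_dmr(3, 0.5))
```

      With the quoted counts, the total is 4,340 and the two shares come out at 46.38% and 53.62%, matching the figures in the suggested text.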

      Reviewer #3 (Recommendations For The Authors):

      The sample numbers should be annotated in the figure legend for all the bar plots using Image J. The lines in Figures 2B and 2C were without error bars. How many mice were used for these plots?

      Thanks for your suggestions. We added the sample size in the revised manuscript. We made a mistake when we prepared the pictures for figure 2B and figure 2C, which resulted in missing the error bars. We have corrected these pictures. Thanks again!

      The authors should revise the panel arrangement of the figures (Figure 2, Figure 5, etc) to make them more clear and readable.

      Thank you! We have revised these in the revised manuscript.

      The writing should be improved since there were multiple typos and unclear expressions. AI tools like Grammarly or ChatGPT may help.

      Thank you! We have re-edited the language in the revised manuscript using AI tools.

      Please recheck the immunofluorescence images for clear interpretability. For example, in Figure 5F (H89 treated), the GV is all the way at the edge of the oocyte, and the oocyte in the DIC image appears like it is partially lysed. The DIC images and the DAPI images are not clear enough.

      Thanks for your suggestions. We have re-edited these pictures in the revised manuscript.

      Another concern is that the Methods describes the immunofluorescence preparation for 5mC and 5hmC staining as a simple fixation in 4% paraformaldehyde followed by permeabilization with 0.5% Triton X-100, but there is no antigen exposure step described, a step that is normally required for visualizing these DNA modifications (e.g., 4N HCl).

      Thanks. We apologize for not describing the methods clearly. We have added more information about the methods in the revised manuscript.

      The metabolomic analysis revealed a highly significant increase in dibutylphthalate, genistein, and daidzein in the control mice. The presence of these exogenous metabolites suggests that the diets differed in many aspects, not just fat content, so it would be very difficult to interpret the results as related to a high-fat diet alone. Both daidzein and genistein are phytoestrogens and dibutylphthalate is a plasticizer, suggesting differences in the diet and/or in the materials used to collect the samples for analysis from the mice. The Methods define the high-fat diet adequately, as the formulation can be found online using the catalog number. However, the control diet is just listed as "normal diet", so one has no idea what is in it.

      Thank you for your good questions. The daidzein and genistein may come from the diets and the dibutylphthalate may come from the materials used to collect samples. If so, these should be similar between groups. Thus, we added the formulation of the normal diet in the revised manuscript. The raw materials of the normal diet include corn, bean pulp, fish meal, flour, yeast powder, plant oil, salt, vitamins, and mineral elements. Following the suggestions, we re-checked the data on these metabolites and found that their abundance was low. The result for these metabolites was also at a low confidence level because the ions of these metabolites were only mapped to ChemSpider (HMDB, KEGG, LIPID MAPS). To further confirm these results, we examined these metabolites in serum using ELISA, and the results revealed that the concentrations of genistein and dibutylphthalate were similar between groups. These results suggest that these metabolites may not be involved in the altered methylation of oocytes induced by obesity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work by Wang et al., the authors use single-molecule super-resolution microscopy together with biochemical assays to quantify the organization of Nipah virus fusion protein F (NiV-F) on cell and viral membranes. They find that these proteins form nanoscale clusters that favor membrane fusion activation, and that the physical parameters of these clusters are unaffected by protein expression level and endosomal cleavage. Furthermore, they find that the cluster organization is affected by mutations in the trimer interface on the NiV-F ectodomain and the putative oligomerization motif on the transmembrane domain, and that the clusters are stabilized by interactions among NiV-F, the AP2-complex, and the clathrin coat assembly. This work improves our understanding of the NiV fusion machinery, which may have implications also for our understanding of the function of other viruses.

      Strengths:

      The conclusions of this paper are well-supported by the presented data. This study sheds light on the activation mechanisms underlying the NiV fusion machinery.

      Weaknesses:

      The authors provide limited details of the convolutional neural network they developed in this work. Even though custom-codes are made available, a description of the network and specifications of how it was used in this work would aid the readers in assessing its performance and applicability. The same holds for the custom-written OPTICS algorithm. Furthermore, limited details are provided for the imaging setup, oxygen scavenging buffer, and analysis for the single-molecule data, which limits reproducibility in other laboratories. The claim of 10 nm resolution is not backed up by data and seems low given the imaging conditions and fluorophores used. Fourier Ring Correlation analysis would have validated this claim. If the authors refer to localization precision rather than resolution, then this should be specified and appropriate data provided to support this claim.

      We thank reviewer 1 for these suggestions. We described key steps in the imaging setup, single-molecule data reconstruction, the OPTICS algorithm for cluster identification, and the 1D CNN for classification of the OPTICS data in the Materials and Methods section. We also provided a recipe for the imaging buffer. We refer to 10 nm localization precision rather than resolution. The localization precision achieved by our SMLM system is shown in Author response image 1.

      Author response image 1.

      The localization precision of the custom-built SMLM. The image shows the distribution of localization error in the x (dX), y (dY), and z (dZ) directions, in nanometers, for blinks generated from Alexa Fluor 647 labeling NiV-F expressed on the plasma membrane of PK13 cells. The lateral precision is <10 nm and the axial precision is <20 nm.

      Reviewer #2 (Public Review): 

      Summary:

      In this manuscript, Wang and co-workers employ single-molecule localization microscopy (SMLM) to detect NiV fusion protein (NiV-F) on the surface of cells. They corroborate that these glycoproteins form microclusters (previously seen and characterized together with NiV-G and the Nipah matrix protein by Liu and co-workers (2018), also with super-resolution light microscopy). As also seen by Liu and co-workers, the authors show that neither the level of expression of NiV-F nor endosomal cleavage alters the identity of these microclusters. Moreover, mutations in the transmembrane domain or the hexamer-of-trimer interface seem to have a mild effect on the size of the clusters that the authors quantified.

      Importantly, it has also been shown that these particles tend to cluster in Nipah VLPs.

      We thank reviewer #2 for the comments and suggestions. This paper is built on Liu et al 1 to further characterize the nanoclusters formed by NiV-F and their role in membrane fusion activation. While Liu et al. studied the NiV glycoprotein distribution at the NiV assembly sites to inform mechanisms in NiV assembly and release, Wang et al. analyzed the nanoorganization and distribution of NiV-F at the prefusion conformation, providing insights into the membrane fusion activation mechanisms.  

      Strengths:

      The authors have tried to perform SMLM in single VLPs and have shown partially the importance of NiV-F clustering.

      Weaknesses:

      The labelling strategy for the NiV-F is not sufficiently explained. The use of a FLAG tag in the extracellular domain should be validated and compared with the unlabelled WT NiV-F when expressed in functional pseudoviruses (for example HIV-1 based particles decorated with NiV-F). This experiment should also be carried out for both infection and fusion (including BlaM-Vpr as a readout for fusion). I would also suggest running a time-of-addition BlaM experiment to understand how this particular labelling strategy affects single virion fusion as compared to the WT.

      We thank reviewer #2 for this suggestion. We have made various efforts to validate the expression and function of FLAG-tagged NiV-F. NiV-F-FLAG shows cell surface expression levels and induces cell-cell fusion levels in 293T cells comparable to those of untagged NiV-F 1. NiV-F-FLAG also showed levels of virus entry similar to those of untagged NiV-F when both were pseudotyped on a recombinant Vesicular Stomatitis Virus (VSV) with the VSV glycoprotein replaced by a Renilla luciferase reporter gene (VSV-ΔG-rLuc; Fig. S1D). We also performed a virus entry kinetics assay using NiV VLPs expressing NiV-M-β-lactamase (NiV-M-Bla), NiV-G-HA, and NiV-F-FLAG, NiV-F-AU1, or untagged NiV-F. The intracellular AU1 tag is located at the C-terminus of NiV-F (GenBank accession no. AY816748.1). However, we detected different levels of NiV-M-Bla in equal volumes of VLPs, suggesting that the tags in NiV-F affect the budding of the VLPs (Author response image 2A). Therefore, we performed the fusion kinetics assay using VLPs expressing the same levels of NiV-M-Bla. Among them, NiV-F-FLAG on VLPs shows the most efficient fusion between VLP and HEK293T cell membranes (Author response image 2B), significantly more efficient than that of untagged NiV-F and NiV-F-AU1. However, we cannot attribute the enhanced fusion activity to the FLAG tag, because the readout of this assay relies on both the levels of β-lactamase (introduced by NiV-M-Bla in VLPs) and the NiV-F constructs. The tags in NiV-F could affect both the budding of VLPs and the stoichiometry of F and M in individual VLPs. We did not use the HIV-based pseudovirus system because the incorporation of NiV-F into HIV pseudoviruses requires a C-terminal deletion 2,3.

      In summary, the FLAG tag does not affect cell-cell fusion 1 or virus entry when pseudotyped on the recombinant VSV-ΔG-rLuc viruses (Fig. S1D). Given that we do not observe any difference in clustering between HA- and FLAG-tagged NiV-F constructs on the PK13 cell surface (Fig. S1A-C), we conclude that the FLAG tag has minimal effect on both the fusion activity and the nanoscale distribution of NiV-F.

      Author response image 2.

      Viral entry is not affected by labeling of NiV-F. A) Western blot analysis of NiV-M-Bla in NiV-VLPs generated by HEK293T cells expressing NiV-M-Bla, NiV-G-HA, and NiV-F-FLAG, untagged NiV-F, or NiV-F-AU1. Equal volumes of VLPs were separated by denaturing 10% SDS–PAGE and probed against β-lactamase (Santa Cruz, sc-66062). B) NiV-VLPs bearing NiV-M-Bla, NiV-G-HA, and NiV-F-FLAG, untagged NiV-F, or NiV-F-AU1 were bound to target HEK293T cells loaded with CCF2-AM dye at 4°C. The Blue/Green (B/G) ratio was measured at 37°C for 4 hrs at 3-min intervals. Results were normalized to the maximal B/G ratio of NiV-F-FLAG NiV-VLPs. Results from one representative experiment out of three independent experiments are shown.

      It would also be very important to compare the FLAG labelling approach with recent advances in the field (for instance incorporating noncanonical amino acids (ncAAs) into NiV-F by amber stop-codon suppression, followed by click chemistry).

      We are very thankful for this comment from reviewer #2. Labeling noncanonical amino acids (ncAAs) with bioorthogonal click chemistry is indeed a more precise labeling strategy than the traditional epitope labeling approach used in this paper. We will explore the applications of ncAA labeling in single-molecule localization imaging and virus-host interactions in future projects.

      In this paper, the FLAG tag inserted in NiV-F protein seems to have minimal effect on the NiV-F-induced virus entry and cell-cell fusion 1 (Fig. S1). Although the FLAG tag labeling approach may increase the detectable size of NiV-F nanoclusters due to the use of the antibody complex, it should not affect our conclusions drawn from the relative comparisons between wt and mutant NiV-F or control and drug-treated cells. 

      The correlation between the existence of microclusters of a particular size and their functionality is missing. Only cell-cell fusion assays are shown in supplementary figures and clearly, single virus entry and fusion cannot be compared with the biophysics of cell-cell fusion. Not only is the environment completely different, but membrane curvature and the number of NiV-F molecules also vary drastically. Therefore, specific fusion assays (either single virus tracking and/or time-of-addition BlaM kinetics with functional pseudoviruses) are needed to substantiate this claim.

      We thank Reviewer 2 for the suggestion. To support the link between F clustering and virus-cell membrane fusion, we conducted pseudotyped virus entry and VLP fusion kinetics assays, as shown in revised Figure S4. The viral entry results (Fig. S4E and F) corroborate those of the cell-cell fusion assay (Fig. S4A and B) and previously published data 4. The fusion kinetics assay confirmed that real-time fusion kinetics were affected by mutations at the hexameric interface, with the hypo-fusogenic mutants L53D and V108D exhibiting reduced entry efficiency and the hyper-fusogenic mutant Q393L showing increased efficiency (Fig. S4G and H). The results are described in detail in the revised manuscript.

      Additionally, we performed a pseudotyped virus entry assay on the LI4A (Fig. S6F and G) and YA (Fig. S7F and G) mutants to verify the function of these mutants on viruses in the revised Supplemental Figures. Neither LI4A nor YA was incorporated into the VSV/NiV pseudotyped viruses, as shown by the Western blot analyses of the pseudovirions (Fig. S6F and S7F), and thus neither induced virus entry, consistent with the cell-cell fusion results (Fig. S6C, D and Fig. S7C, D). We did not perform the entry kinetics assay on these two mutants as they do not incorporate into VLPs or pseudovirions.

      The authors also claim they could not characterize the number of NiV-F particles per cluster. Another technique such as number and brightness (Digman et al., 2008) could support current SMLM data and identify the number of single molecules per cluster. Also, this technology does not require complex microscopy apparatus. I suggest they perform either confocal fluorescence fluctuation spectroscopy or TIRF-based number and brightness analysis to validate the clusters and identify how many molecules are present in these clusters.

      We thank reviewer 2 for this suggestion. Determining the true copy number of NiV-F in individual clusters could verify whether the F clusters on the plasma membrane are hexamer-of-trimer assemblies. Regardless, it does not affect our conclusion that the organization of NiV-F into nanoclusters affects the membrane fusion triggering ability. The confocal fluorescence fluctuation spectroscopy (FFS) and TIRF-based analyses are accessible tools for quantifying fluorophore copy numbers and/or stoichiometry based on fluorescence fluctuation or photobleaching. However, these methods are unable to quantify the number of proteins in individual clusters because they analyze fluorophores either in the entire cell (as in wide-field epifluorescence microscopy coupled with FFS and TIRF-coupled photobleaching) 5–7 or within a large excitation volume (confocal laser scanning microscopy-coupled FFS) 8. Both of these volumes are significantly larger than a single NiV-F cluster, which has an average diameter of 24-26 nm (Fig. 1F).

      The current SMLM setup is useful for characterizing the protein distribution and organization. However, quantifying the true protein copy number within a nanocluster is challenging because of the stochasticity of fluorophore blinking and the unknown labeling stoichiometry 9–11. To address the challenge of fluorophore blinking, quantitative DNA-PAINT (qDNA-PAINT) may be used because the on-off frequency of the fluorophores is tied to the well-defined kinetic constants of DNA binding and the influx rate of the imager strands, rather than the stochasticity of fluorophore blinking. Thus, the frequency of blinks can be translated to protein counting 12. To address the challenge of unknown labeling stoichiometry, DNA origami can be used as a calibration standard 11. DNA origami supports handles at regular spacings of several to tens of nanometers, and the handles can be conjugated with a defined number of proteins of interest. The copy number of the protein of interest in the experimental group can be determined by comparing the SMLM localization distribution of the sample to that of the DNA origami calibration standard. Given the requirement of a more sophisticated SMLM setup and a high-precision calibration tool, we will explore the quantification of NiV-F copy numbers in nanoclusters in a future project.
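
      The dark-time counting idea described above can be sketched as follows. This is a minimal illustration, not an analysis performed in this study: it assumes that the mean dark time between binding events at a cluster is inversely proportional to the number of docking sites, and that the calibration standard (for example a DNA origami with a known number of handles) was imaged at the same imager-strand concentration as the sample. The function name, the dark-time values, and the 12-site standard are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def estimate_copy_number(dark_times_sample, dark_times_calib, sites_calib):
    """Estimate the number of binding sites in a sample cluster.

    Assumption: mean dark time = 1 / (n * k_on * c), so for a standard
    with a known number of sites imaged under the same conditions,
    n_sample = sites_calib * mean_dark_calib / mean_dark_sample.
    """
    return sites_calib * mean(dark_times_calib) / mean(dark_times_sample)


# Hypothetical dark times in seconds (not measured data).
calib_dark_times = [8.0, 10.0, 12.0]    # standard with 12 known sites
sample_dark_times = [18.0, 20.0, 22.0]  # cluster of unknown copy number

n_estimate = estimate_copy_number(sample_dark_times, calib_dark_times, sites_calib=12)
print(f"Estimated binding sites in the sample cluster: {n_estimate:.1f}")
```

      In this sketch only the ratio of mean dark times matters, so the association rate and the imager concentration cancel out as long as they are identical for the sample and the standard.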

      Also, it is not clear how many cells the authors employ for their statistics (at least 30-50 cells should be employed, rather than considering the number of blinking events). I hope the authors are not considering only a single cell to run their stats... The differences between the mutants and the NiV-F are minor even if their statistical analyses give a difference (they should average the number and size of the clusters per cell for a total of 30-50 cells with experiments performed at least in three different cells following the same protocol). Overall, it seems that the authors have only evaluated a very low number of cells.

      We disagree with this comment from Reviewer #2. The sample size for cluster analysis in SMLM images was chosen by considering the target of the study (cells and VLPs) and the data acquisition and analysis standards in the SMLM imaging field. We also noted the sample size (# of ROI and cells) in the figure legend. 

      Below, we compare the sample sizes in our study to those in similar studies that used comparable imaging and cluster analysis methods from 2015 to 2024. The classical clustering analysis methods are categorized into global clustering (e.g. nearest neighbor analysis, Ripley’s K function, and pair correlation function) and complete clustering, such as density-based analysis (e.g. DBSCAN, Superstructure, FOCAL, ToMATo) and tessellation-based analysis (e.g. Delaunay triangulation, Voronoi tessellation). The global clustering analysis method provides spatial statistics for global protein clustering or organization (e.g. clustering extent), while the complete clustering approach extracts information at the single-cluster level, such as the morphology and localization density of individual clusters. We used the density-based analyses, DBSCAN and OPTICS, for cluster analysis on cell plasma membranes and VLP membranes.
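
      To illustrate what a density-based, single-cluster-level analysis extracts, the sketch below implements a bare-bones DBSCAN on 2D localization coordinates. It is a didactic example under stated assumptions, not the pipeline used in this study: the coordinates are synthetic, and the `eps` and `min_pts` values are arbitrary placeholders rather than the parameters applied to the SMLM data.

```python
import math
from collections import Counter

NOISE = -1  # label for localizations that belong to no cluster


def dbscan(points, eps, min_pts):
    """Return one label per 2D point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force range query; the result includes point i itself.
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
        cluster_id += 1
    return labels


def grid(x0, y0, step=2.0, side=5):
    """Synthetic dense patch of localizations on a square grid (nm)."""
    return [(x0 + i * step, y0 + j * step)
            for i in range(side) for j in range(side)]


# Two dense patches plus three isolated background localizations (all synthetic).
points = grid(0.0, 0.0) + grid(200.0, 200.0) + [(100.0, 0.0), (0.0, 100.0), (300.0, 50.0)]
labels = dbscan(points, eps=5.0, min_pts=4)

# Single-cluster-level readout: number of localizations per cluster.
cluster_sizes = Counter(label for label in labels if label != NOISE)
print(dict(cluster_sizes), "noise:", labels.count(NOISE))
```

      The per-cluster counts are the kind of single-cluster readout that global statistics such as Ripley's K function do not provide; OPTICS follows the same density logic but orders points by reachability instead of fixing a single `eps`.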

      Author response table 1.

      The comparison of imaging methods, analysis methods, and sample size in the current study to other studies conducted from 2015 to 2024.

      They should also compare the level of expression (with the number of molecules per cell provided by number and brightness) with the total number of clusters. 

      We thank reviewer 2 for this suggestion. We compared the level of expression with the total number of clusters for F-WT in Figure 1I in the main text.  

      The same applies to the VLP assay. I assume the authors have only taken VLPs expressing both NiV-M and NiV-F (and NiV-G). But even if this is not clearly stated, I would urge the authors to show how many viruses were compared per condition (normally I would expect 300 particles per condition coming from three independent experiments). As a negative control to evaluate the cluster effect I would mix the different conditions. Clearly you have clusters with all conditions and the differences in clustering depending on each condition are minimal. Therefore you need to increase the n for all experiments.

      We thank reviewer 2 for this comment. We acquired and analyzed more images of NiV VLPs bearing F-WT, Q393L, L53D, and V108D. Results are shown in the revised Figure 4 and the number of VLPs (>300) used for analysis is specified in the figure legend. An increased number of VLP images does not affect the classification result in Figure 4C. 

      As for the suggestion on “evaluating the cluster effect at different mixed conditions”, I assume that reviewer 2 would like to see how the presence of different viral structural proteins (F, M, and G) on VLPs could affect F clustering. We showed that the organization of NiV envelope proteins on the VLP membrane is similar in the presence or absence of NiV-M by direct visualization 27, suggesting that the effect of NiV-M on F-WT clustering on VLPs is minimal. We also show comparable incorporation of NiV-F among the NiV-F hexamer-of-trimer mutants (Fig. 4A). Therefore, we did not test the F clustering at different F, M, and G combinations in this paper. However, this could be an interesting question to pursue in a paper focusing on NiV VLP production.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Wang and colleagues describes single molecule localization microscopy to quantify the distribution and organization of Nipah virus F expressed on cells and on virus-like particles. Notably the crystal structure of F indicated hexameric assemblies of F trimers. The authors propose that F clustering favors membrane fusion.

      Strengths:

      The manuscript provides solid data on imaging of F clustering with the main findings of:

      -  F clusters are independent of expression levels

      -  Proteolytic cleavage does not affect F clustering

      -  Mutations that have been reported to affect the hexamer interface reduce clustering on cells and its distribution on VLPs

      -  F nanoclusters are stabilized by AP

      Weaknesses:

      The relationship between F clustering and fusion is per se interesting, but looking at F clusters on the plasma membrane does not exclude that F clustering occurs for budding. Many viral glycoproteins cluster at the plasma membrane to generate micro domains for budding. 

      This does not exclude that these clusters include hexamer assemblies or clustering requires hexamer assemblies. 

      We thank reviewer #3 for this question. We did not focus on the role of NiV-F clusters in budding in the current manuscript, although this is an interesting topic to pursue. In this manuscript, we observed that NiV VLP budding is decreased for some cluster-disrupting mutants, such as F-YA and F-LI4A; however, F-V108D showed increased budding compared to F-WT (Fig. 4A). We also observed that VLPs and VSV/NiV pseudoviruses expressing L53D have little NiV-G (Fig. 4A, Fig. S4F and S4H), although the incorporation level of L53D is comparable to that of wt F in both VLPs and pseudovirions (Fig. 4A and Fig. S4F). L53D is a hypofusogenic mutant with decreased clustering ability. Therefore, our current data do not show a clear link between F clustering and NiV VLP budding or glycoprotein incorporation.

      We reported that both NiV-F and -M form clusters at the plasma membrane, although NiV-F clusters are not enriched at the NiV-M positive membrane domains 1. This result indicates that NiV-M is the major driving force for assembly and budding, while NiV-F is passively incorporated into the assembly sites. The central role of NiV-M in budding is also supported by a recent study showing that NiV-M induces membrane curvature by binding to PI(4,5)P2 in the inner leaflet of the plasma membrane 28. However, the expression of NiV-F alone induces the production of vesicles bearing NiV-F 29, and NiV-F recruits vesicular trafficking and actin cytoskeleton factors to VLPs either alone or in combination with NiV-G and -M, indicating a potential autonomous role in budding 30. Additionally, several electron microscopy studies show that the paramyxovirus F forms a 2D lattice interspersed above the M lattice, suggesting the participation of F in virus assembly and budding. Thus, although the evidence above suggests that NiV-F may play a role in budding, our data cannot correlate NiV-F clustering to budding.

      Assuming that the clusters are important for entry, hexameric clusters are not unique to Nipah virus F. Similar hexameric clusters have been described for the HEF on influenza virus C particles (Halldorsson et al 2021) and env organization on Foamy virus particles (Effantin et al 2016), both with specific interactions between trimers. What is the organization of F on Nipah virus particles? If F requires to be hexameric for entry, this should be easily imaged by EM on infectious or inactivated virus particles. 

      We thank reviewer #3 for this suggestion. The hexamer-of-trimer NiV-F is observed on the VLP surface by electron tomography 4. The NiV-F hexamer-of-trimers are arranged into a soccer ball-like structure, with one trimer being part of multiple hexamer-of-trimers. The implication of NiV-F clusters in virus entry and the potential mechanism for NiV-F higher-order structure formation are discussed in the revised manuscript.

      AP stabilization of the F clusters is curious if the clusters are solely required for entry? Virus entry does not recruit the clathrin machinery. Is it possible that F clusters are endocytosed in the absence of budding? 

      We thank reviewer #3 for this question. The evidence from the current study does not exclude a role for NiV-F clustering in virus budding. NiV-F is known to be endocytosed in virus-producing cells for cleavage by Cathepsin B or L in endocytic compartments in a pH-dependent manner 31–33 in the absence of budding. However, given that both cleaved and uncleaved NiV-F have an endocytosis signal sequence at the cytoplasmic tail and are able to interact with AP-2 for endosome assembly, and that cleaved and uncleaved F may have similar clustering patterns (Fig. 2), we do not think NiV-F clustering is specifically regulated for the cleavage of NiV-F. A plausible hypothesis is that NiV-F clusters are stabilized by multiple intrinsic factors (e.g. trimer interface) and host factors (e.g. AP-2) on the cell membrane for cell-cell fusion and virus budding. We linked the clustering to the fusion ability of NiV-F in this study, but NiV-F clustering may also be important in facilitating virus budding. Once in the viruses, the higher-order assembly of the clusters (e.g. lattice) may form due to protein enrichment, and the cell factors may not be the major maintenance force.

      Clusters are required for budding. 

      Other points:

      Fig. 3: Some of the V108D and L53D clusters look similar in size to wt clusters. It seems that the interaction is important but not absolutely essential. Would a double mutant abrogate clustering completely?

      We thank Reviewer #3 for the suggestion. We generated a double mutant of NiV-F with L53D and V108D (NiV-F-LV) and assessed its expression and processing. Although the mutant retained processing capability, it exhibited minimal surface expression, making it unfeasible to analyze its nano-organization on the cell or viral membrane.

      Author response image 4.

      The expression and fusion activity of FLAG-tagged NiV-F and NiV-F L53D-V108D (LV). (A) Representative western blot analysis of NiV-F-WT and LV in the cell lysate of 293T cells. 293T cells were transfected with NiV-F-WT or the LV mutant. The empty vector was used as a negative control. The cell lysates were analyzed by SDS-PAGE followed by western blotting at 28 hrs post-transfection. F0 and F2 were probed with the M2 monoclonal mouse anti-FLAG antibody. GAPDH was probed with monoclonal mouse anti-GAPDH. (B) Representative images of 293T cell-cell fusion induced by NiV-G and NiV-F-WT or NiV-F-LV. 293T cells were co-transfected with plasmids coding for NiV-G and empty vector (NC) or NiV-F constructs. Cells were fixed at 18 hrs post-transfection. Arrows point to syncytia. Scale bar: 10 μm. (C) Relative cell-cell fusion levels in 293T cells in (B). Five fields per experiment were counted from three independent experiments. Data are presented as mean ± SEM. (D) The cell surface expression levels of NiV-F-WT and NiV-F-LV in 293T cells measured by flow cytometry. Mean fluorescence intensity (MFI) values were calculated by FlowJo and normalized to that of F-WT. Data are presented as mean ± SEM of three independent experiments. Statistical significance was determined by the unpaired t-test with Welch’s correction (*P<0.05, **P<0.01, ***P<0.001, ****P<0.0001). Values were compared to that of the NiV-F-WT.

      Fig. 4: The distribution of F on VLPs should be confirmed by cryoEM analyses. This would also confirm the symmetry of the clusters. The manuscript by Chernomordik et al. JBC 2004 showed that influenza HA outside the direct contact zone affects fusion, which could be further elaborated in the context of F clusters and the fusion mechanism.

      We thank reviewer 3 for this suggestion. The distribution of F on VLPs was resolved by electron tomography, which showed that the NiV-F hexamer-of-trimers are arranged into a soccer ball-like structure 4. The role of influenza HA outside of the contact zone in fusion activation is an interesting phenomenon. It may address the energy transmission within and among clusters. We will pursue this topic in a future project.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      •  Please define all used abbreviations throughout the manuscript and in the SI.

      We defined the abbreviations at their first usage. 

      •  The sentence starting with "Additionally, ..." on line 155 appears to be incomplete.

      We corrected this sentence.  

      •  The statement starting with "As reported, ..." on line 181 should be supported by a reference.

      We added a reference. 

      •  In Fig. 4C, it is unclear what the x and y axes represent.  

      Fig. 4C is a t-SNE plot for visualizing high-dimensional data in a low-dimensional space. It maintains the local data structure but does not represent exact quantitative relationships. In other words, points that are close together in Fig. 4C are also close in the high-dimensional space, meaning the OPTICS plots, which reflect the clustering patterns, are similar for two points that are positioned near each other in Fig. 4C. Therefore, the x and y axes do not represent the original, quantitative data, and thus the axis titles are meaningless.  

      •  The reference on line 306 appears to be unformatted.

      We reformatted the reference.  

      Reviewer #2 (Recommendations For The Authors):

      The authors need to include the overall statistics for each experiment (at least 30 to 50 cells with three independent experiments are needed). 

      We highlighted the sample size (number of ROI and number of cells) used for analysis in the figure legend. The determination of the sample size is justified in Table 1 in the response letter. 

      The authors need to generate a functional pseudovirus system (for example HIVpp/NiV F) to run both infectivity and fusion experiments (including Vpr-BlaM assay).

      We tested viral entry using a VSV/NiV pseudovirus system and the viral entry kinetics using VLPs expressing NiV-M-β-lactamase. The results are presented in Fig. S1, S4, S6, and S7.  

      Reviewer #3 (Recommendations For The Authors):

      Even low resolution EM data on VLPs or viruses would strengthen the conclusions.

      We thank this reviewer for the suggestion. We cited the NiV VLP images acquired by electron tomography 4, but we currently have limited resources to perform cryoEM on NiV VLPs.  

      References.

      (1) Liu, Q., Chen, L., Aguilar, H. C. & Chou, K. C. A stochastic assembly model for Nipah virus revealed by super-resolution microscopy. Nature Communications 9, 3050 (2018).

      (2) Khetawat, D. & Broder, C. C. A Functional Henipavirus Envelope Glycoprotein Pseudotyped Lentivirus Assay System. Virology Journal 7, 312 (2010).

      (3) Palomares, K. et al. Nipah Virus Envelope-Pseudotyped Lentiviruses Efficiently Target ephrinB2-Positive Stem Cell Populations In Vitro and Bypass the Liver Sink When Administered In Vivo. J Virol 87, 2094–2108 (2013).

      (4) Xu, K. et al. Crystal Structure of the Pre-fusion Nipah Virus Fusion Glycoprotein Reveals a Novel Hexamer-of-Trimers Assembly. PLoS Pathog 11, e1005322 (2015).

      (5) Bakker, E. & Swain, P. S. Estimating numbers of intracellular molecules through analysing fluctuations in photobleaching. Sci Rep 9, 15238 (2019).

      (6) Nayak, C. R. & Rutenberg, A. D. Quantification of Fluorophore Copy Number from Intrinsic Fluctuations during Fluorescence Photobleaching. Biophys J 101, 2284–2293 (2011).

      (7) Salavessa, L. & Sauvonnet, N. Stoichiometry of Receptors at the Plasma Membrane During Their Endocytosis Using Total Internal Reflection Fluorescent (TIRF) Microscopy Live Imaging and Single-Molecule Tracking. in Exocytosis and Endocytosis: Methods and Protocols (eds. Niedergang, F., Vitale, N. & Gasman, S.) 3–17 (Springer US, New York, NY, 2021). doi:10.1007/978-1-0716-1044-2_1.

      (8) Slenders, E. et al. Confocal-based fluorescence fluctuation spectroscopy with a SPAD array detector. Light Sci Appl 10, 31 (2021).

      (9) Annibale, P., Vanni, S., Scarselli, M., Rothlisberger, U. & Radenovic, A. Identification of clustering artifacts in photoactivated localization microscopy. Nat Methods 8, 527–528 (2011).

      (10) Baumgart, F. et al. Varying label density allows artifact-free analysis of membrane-protein nanoclusters. Nat Methods 13, 661–664 (2016).

      (11) Zanacchi, F. C. et al. A DNA origami platform for quantifying protein copy number in super-resolution. Nat Methods 14, 789–792 (2017).

      (12) Jungmann, R. et al. Multiplexed 3D cellular super-resolution imaging with DNA-PAINT and Exchange-PAINT. Nature Methods 11, 313–318 (2014).

      (13) Rubin-Delanchy, P. et al. Bayesian cluster identification in single-molecule localization microscopy data. Nat Methods 12, 1072–1076 (2015).

      (14) Griffié, J. et al. 3D Bayesian cluster analysis of super-resolution data reveals LAT recruitment to the T cell synapse. Sci Rep 7, 4077 (2017).

      (15) Griffié, J. et al. Dynamic Bayesian Cluster Analysis of Live-Cell Single Molecule Localization Microscopy Datasets. Small Methods (2018). https://onlinelibrary.wiley.com/doi/full/10.1002/smtd.201800008.

      (16) Caetano, F. A. et al. MIiSR: Molecular Interactions in Super-Resolution Imaging Enables the Analysis of Protein Interactions, Dynamics and Formation of Multi-protein Structures. PLOS Computational Biology 11, e1004634 (2015).

      (17) Malkusch, S. & Heilemann, M. Extracting quantitative information from single-molecule super-resolution imaging data with LAMA – LocAlization Microscopy Analyzer. Sci Rep 6, 34486 (2016).

      (18) Zhang, Y., Lara-Tejero, M., Bewersdorf, J. & Galán, J. E. Visualization and characterization of individual type III protein secretion machines in live bacteria. Proceedings of the National Academy of Sciences 114, 6098–6103 (2017).

      (19) Tobin, S. J. et al. Single molecule localization microscopy coupled with touch preparation for the quantification of trastuzumab-bound HER2. Sci Rep 8, 15154 (2018).

      (20) Levet, F. et al. SR-Tesseler: a method to segment and quantify localization-based super-resolution microscopy data. Nature Methods 12, 1065–1071 (2015).

      (21) Peters, R., Griffié, J., Burn, G. L., Williamson, D. J. & Owen, D. M. Quantitative fibre analysis of single-molecule localization microscopy data. Sci Rep 8, 10418 (2018).

      (22) Levet, F. et al. A tessellation-based colocalization analysis approach for single-molecule localization microscopy. Nat Commun 10, (2019).

      (23) Banerjee, C. et al. ULK1 forms distinct oligomeric states and nanoscopic structures during autophagy initiation. Science Advances 9, eadh4094 (2023).

      (24) Pageon, S. V. et al. Functional role of T-cell receptor nanoclusters in signal initiation and antigen discrimination. Proceedings of the National Academy of Sciences 113, E5454–E5463 (2016).

      (25) Cresens, C. et al. Flat clathrin lattices are linked to metastatic potential in colorectal cancer. iScience 26, 107327 (2023).

      (26) Seeling, M. et al. Immunoglobulin G-dependent inhibition of inflammatory bone remodeling requires pattern recognition receptor Dectin-1. Immunity 56, 1046-1063.e7 (2023).

      (27) Liu, Q. T. et al. The nanoscale organization of Nipah virus matrix protein revealed by super-resolution microscopy. Biophysical Journal 121, 2290–2296 (2022).

      (28) Norris, M. J. et al. Measles and Nipah virus assembly: Specific lipid binding drives matrix polymerization. Science Advances 8, eabn1440 (2022).

      (29) Patch, J. R. et al. The YPLGVG sequence of the Nipah virus matrix protein is required for budding. Virol. J. 5, 137 (2008).

      (30) Johnston, G. P. et al. Nipah Virus-Like Particle Egress Is Modulated by Cytoskeletal and Vesicular Trafficking Pathways: a Validated Particle Proteomics Analysis. mSystems 4, e00194-19 (2019).

      (31) Diederich, S. et al. Activation of the Nipah Virus Fusion Protein in MDCK Cells Is Mediated by Cathepsin B within the Endosome-Recycling Compartment. J Virol 86, 3736–3745 (2012).

      (32) Diederich, S., Thiel, L. & Maisner, A. Role of endocytosis and cathepsin-mediated activation in Nipah virus entry. Virology 375, 391–400 (2008).

      (33) Pager, C. T., Craft, W. W., Patch, J. & Dutch, R. E. A mature and fusogenic form of the Nipah virus fusion protein requires proteolytic processing by cathepsin L. Virology 346, 251–257 (2006).

    2. eLife Assessment

      This valuable study advances our understanding of how Nipah virus fusion protein F (NiV-F) organizes into nanoclusters on cell and viral membranes using biochemical and super-resolution microscopy methods. The conclusions are supported by solid evidence and the revision has addressed most of the reviewers' concerns. The relationship between clustering and fusion is of high interest and an interesting hypothesis to continue investigating in future studies.

    3. Reviewer #1 (Public review):

      Summary:

      In this work by Wang et al., the authors use single-molecule super-resolution microscopy together with biochemical assays to quantify the organization of Nipah virus fusion protein F (NiV-F) on cell and viral membranes. They find that these proteins form nanoscale clusters that favor membrane fusion activation, and that the physical parameters of these clusters are unaffected by protein expression level and endosomal cleavage. Furthermore, they find that the cluster organization is affected by mutations in the trimer interface on the NiV-F ectodomain and the putative oligomerization motif on the transmembrane domain, and that the clusters are stabilized by interactions among NiV-F, the AP2-complex, and the clathrin coat assembly. This work improves our understanding of the NiV fusion machinery, which may also have implications for our understanding of the function of other viruses.

      Strengths:

      The conclusions of this paper are well-supported by the presented data. This study sheds light on the activation mechanisms underlying the NiV fusion machinery.

    4. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Wang and co-workers employ single-molecule localization microscopy (SMLM) to detect Nipah virus fusion protein (NiV-F) on the surface of cells. They corroborate that these glycoproteins form microclusters (previously seen and characterized together with NiV-G and the Nipah matrix protein by Liu and co-workers (2018), also with super-resolution light microscopy). As also seen by Liu and co-workers, the authors show that neither the expression level of NiV-F nor endosomal cleavage alters the identity of these microclusters. Moreover, mutations in the transmembrane domain or the hexamer-of-trimer interface seem to have a mild effect on the size of the clusters that the authors quantified. Importantly, it has also been shown that these particles tend to cluster in Nipah VLPs.

      Strengths:

      The authors have tried to perform SMLM in single VLPs and have shown partially the importance of NiV-F clustering.

      Comments on the revised version:

      I am happy with the answers the authors have provided to my questions.

    5. Reviewer #3 (Public review):

      Summary:

      The manuscript by Wang and colleagues describes single molecule localization microscopy to quantify the distribution and organization of Nipah virus F expressed on cells and on virus-like particles. Notably the crystal structure of F indicated hexameric assemblies of F trimers. The authors propose that F clustering favors membrane fusion.

      Strengths:

      The manuscript provides solid data on imaging of F clustering with the main findings of:

      - F clusters are independent of expression levels
      - Proteolytic cleavage does not affect F clustering
      - Mutations that have been reported to affect the hexamer interface reduce clustering on cells and its distribution on VLPs
      - F nanoclusters are stabilized by AP

      Comments on the revised version:

      The authors addressed most of my previous concerns.

    1. eLife Assessment

      This important work presents a novel approach to infer causal relations in non-stationary time series data. To do so, the authors introduce a novel machine-learning model of Temporal Autoencoders for Causal Inference to identify and measure the direction and strength of time-varying causal interactions. The authors provide solid evidence for their claims through thorough numerical validation and comprehensive exploration of the method on both synthetic and real-world datasets. This is a timely contribution that may have theoretical and practical implications for diverse real-life applications.

    2. Reviewer #1 (Public review):

      Summary:

      The authors make a new contribution with careful computational validation/exploration of their method on synthetic and real-world datasets. Overall, I find their results significant and their presentation compelling.

      Strengths:

      The authors provide extensive computational validation of their approach to synthetic and real-world datasets of increasing complexity.

      Weaknesses:

      The authors should provide a comparison of their approach to other state-of-the-art neural network-based methods. Without this, it is difficult to tell which aspects of their approach (novel coupling metric, or network architecture) are most important for their results.

    3. Reviewer #2 (Public review):

      Summary:

      This paper introduces a new methodology for probing time-varying causal interactions in complex dynamical systems using a novel machine-learning architecture of Temporal Autoencoders for Causal Inference (TACI) combined with a novel metric (CSGI) for assessing causal interactions using surrogate data. This is a timely contribution in the field of causal inference from temporal data which has been largely restricted to stationary time series so far. However, the benchmarking of the proposed methods could be improved.

      Strength:

      The method's capacity to uncover piecewise time-varying non-linear dynamic systems is demonstrated on synthetic datasets as well as on two real-world applications on climate and brain activity data. A particular advantage of the approach is to train a single model capturing the dynamics of the whole time series, thereby allowing for time-varying interactions to be found without retraining over different time periods.

      Weaknesses:

      (1) It is not clear why the new metric Comparative Surrogate Granger Index CSGI (Eq.6) should be better than the Extended Granger Causality Index EGCI (Eq.5), which can also be used to compare the information about y(t) contained in the actual data x(t) versus in a randomized surrogate x^s(t), as implemented in the proposed metric (Eq.6).

      (2) The benchmarking of the new approach TACI against earlier metrics (i.e. Surrogate Linear Granger, Convergent Cross Mapping, and Transfer Entropy) should be revised:

      (a) The details of the computation should be provided to clarify how the different metrics are estimated notably between multidimensional variables [for instance to estimate Ty->x for x=(x_1,x_2,x_3) and y=(y_1,y_2,y_3)].

      (b) Reliable implementations of the different metrics should be used, as some of the reported results do not seem right. In particular, the unidirectional examples, Eq.9 (Figure 2) and Eq.12 (Figure 5), are expected to lead to vanishing transfer entropies from Y to X, i.e. Ty->x = 0, for all values of the coupling parameter below the synchronization threshold. This can be verified by computing transfer entropies as conditional mutual information using the MIIC R package, i.e. Ty->x = I(x(t);y(t-1)|x(t-1)).

      (c) Besides, some reported benchmarks focus on peculiar non-linear systems displaying somewhat "pathological" behaviors. For instance, the two Hénon maps with unidirectional coupling Eq.12 (Figure 5) lead to an equality between the two variables, i.e. y(t)=x(t) for all t, above the synchronization threshold C>0.7. This leads mathematically to zero transfer entropy upon synchronization, as I(x(t);y(

      (d) By contrast, Eq.9 (Fig.2) leads to strongly coupled, yet non-identical variables above the synchronization threshold. This strong coupling can be shown to yield non-vanishing transfer entropies in both directions, as observed in Figure 2c, and does not correspond to "incorrect prediction of non-existent interactions", as stated in the "Summary of Results on Artificial Test Systems". Clearly synchronized variables do interact and their bidirectional transfer entropies are actually consistent with a non-causal (or bidirectional) relationship. Only a vanishing transfer entropy in one direction implies a causal relation (in the opposite direction). Likewise, vanishing transfer entropies in both directions imply either independent variables or a spurious dependency between them due to an unobserved common cause L, i.e. X<--(L)-->Y. This is usually represented with a bidirected edge (X<-->Y), which is different from a bidirectional relation corresponding to two opposite unidirectional edges (i.e. X-->Y and X<--Y). It is therefore surprising that TACI metric vanishes in both directions upon synchronization in this case (Eq.9, Figure 2), as one would expect to learn variable y(t) more reliably using the actual data x(

      (e) In order to assess TACI performance on non-stationary time series, it might be more informative to benchmark it on datasets displaying intermittency rather than synchrony. In particular, the change of causal directions over time, presented as one of the motivations for the new approach, should be more thoroughly benchmarked in the paper. For instance, it would be nice to demonstrate the tracking of the spontaneous reversal of causal relation in a simple 'toggle switch' regulatory network between two mutually repressing genes + expression noise. This is something that causal inference methods assuming stationarity cannot do.

      (3) Concerning the real-world applications, the analysis of the electrocorticography (ECoG) data does not seem to be in strong disagreement with the general trends of the original more detailed study by Tajima et al 2015. Could the authors better delineate what are the common versus conflicting findings between the two approaches? The main difference appears to be the near loss of interaction in the anesthetized state, which might be linked to TACI's tendency to report no interaction between synchronized variables as discussed in d) above. Does the anesthetized state correspond to a global synchrony of the brain regions? This could be easily validated by a more direct analysis of synchrony.
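
      As an illustration of the check suggested in (2b), transfer entropy can be written as a conditional mutual information and estimated with a simple plug-in estimator. The sketch below is a minimal Python illustration under stated assumptions: lag 1, scalar series, and quantile binning into `n_bins` symbols. The function names are hypothetical, and it reproduces neither the MIIC implementation nor the authors' pipeline.

```python
import numpy as np

def _entropy(*cols):
    """Plug-in Shannon entropy (nats) of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def _discretize(values, n_bins):
    """Map a real-valued series to integer symbols 0..n_bins-1 using quantile edges."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    return np.digitize(values, edges)

def transfer_entropy(source, target, n_bins=4):
    """Estimate T(source -> target) = I(target(t); source(t-1) | target(t-1))."""
    s = _discretize(source, n_bins)
    x = _discretize(target, n_bins)
    x_now, x_past, s_past = x[1:], x[:-1], s[:-1]
    # I(A; B | C) = H(A, C) + H(B, C) - H(A, B, C) - H(C)
    return (_entropy(x_now, x_past) + _entropy(s_past, x_past)
            - _entropy(x_now, s_past, x_past) - _entropy(x_past))
```

      For a unidirectionally coupled pair in which y drives x, one would expect `transfer_entropy(y, x)` to be clearly positive and `transfer_entropy(x, y)` to be close to zero, up to the small positive bias of the plug-in estimator.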

    1. eLife Assessment

      This study describes the application of machine learning and Markov state models to characterize the binding mechanism of alpha-Synuclein to the small molecule Fasudil. The results suggest that entropic expansion can explain such binding. However, the simulations and analyses in their present form are inadequate.

    2. Reviewer #2 (Public Review):

      The manuscript by Menon et al describes a set of simulations of alpha-Synuclein (aSYN) and analyses of these and previous simulations in the presence of a small molecule.

      Comments on latest version:

      I have read the authors' response to my comments as well as to the other reviewers. Summarizing briefly, I don't think they provide substantial answer to the questions/comments by me or reviewer 3, and generally do not quantify the results/effects data. I still remain unconvinced about the analyses and conclusions. Rather than rewriting another set of comments, I think it will be more useful for all (authors and readers) simply to be able to see the entire set of reviews and responses together with the paper.

    3. Reviewer #3 (Public Review):

      In this manuscript Menon, Adhikari, and Mondal analyze explicit solvent molecular dynamics (MD) computer simulations of the intrinsically disordered protein (IDP) alpha-synuclein in the presence and absence of a small molecule ligand, Fasudil, previously demonstrated to bind alpha-synuclein by NMR spectroscopy without inducing folding into more ordered structures. In order to provide insight into the binding mechanism of Fasudil the authors analyze an unbiased 1500us MD simulation of alpha-synuclein in the presence of Fasudil previously reported by Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). The authors compare this simulation to a very different set of apo simulations: 23 separate 1-4us simulations of alpha-synuclein seeded from different apo conformations taken from another simulation previously reported by Robustelli et al. (PNAS, 115 (21), E4758-E4766), for a total of ~62us.

      To analyze the conformational space of alpha-synuclein - the authors employ a variational auto-encoder (VAE) to reduce the dimensionality of Ca-Ca pairwise distances to 2 dimensions, and use the latent space projection of the VAE to build Markov state Models. The authors utilize k-means clustering to cluster the sampled states of alpha-synuclein in each condition into 180 microstates on the VAE latent space. They then coarse grain these 180 microstates into a 3-macrostate model for apo alpha-synuclein and a 6-macrostate model for alpha-synuclein in the presence of fasudil using the PCCA+ coarse graining method. Few details are provided to explain the hyperparameters used for PCCA+ coarse graining and the rationale for selecting the final number of macrostates.
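
      The clustering and counting steps of such a pipeline can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: `kmeans_labels` is a plain Lloyd's-algorithm stand-in for the k-means step, `transition_matrix` is a simple row-normalized (non-reversible) estimator, and both names are hypothetical. The VAE and PCCA+ steps are not shown.

```python
import numpy as np

def kmeans_labels(points, k, n_iter=50, seed=0):
    """Assign each latent-space point to one of k microstates (Lloyd's algorithm)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    dist2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

def transition_matrix(dtrajs, n_states, lag):
    """Row-normalized transition counts at a given lag (in frames).

    Each trajectory is counted separately, so no transition is counted
    across the boundary between two independent simulations.
    """
    counts = np.zeros((n_states, n_states))
    for d in dtrajs:
        d = np.asarray(d, dtype=int)
        np.add.at(counts, (d[:-lag], d[lag:]), 1.0)
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)
```

      Inspecting the raw count matrix before normalization (how many times each pair of states is actually connected within a single trajectory) is one simple way to see how well transitions are sampled by a set of short simulations.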

      The authors analyze the properties of each of the alpha-synuclein macrostates from their final MSMs - examining intramolecular contacts, secondary structure propensities, and in the case of alpha-synuclein:Fasudil holo simulations - the contact probabilities between Fasudil and alpha-synuclein residues.

      The authors utilize an additional variational autoencoder (a denoising convolutional VAE) to compare denoised contact maps of each macrostate, and project onto an additional latent space. The authors conclude that their apo and holo simulations are sampling distinct regions of the conformational space of alpha-synuclein projected on the denoising convolutional VAE latent space.

      Finally, the authors calculate water entropy and protein conformational entropy for each microstate. To facilitate water entropy calculations - the authors take a single structure from each macrostate - and run a 20ps simulation at a finer timestep (4 femtoseconds) using a previously published method (DoSPT), which computes thermodynamic properties of water from MD simulations using autocorrelation functions of water velocities. The authors report that water entropy calculated from these individual 20ps simulations is very similar.

      For each macrostate the authors compute protein conformational entropy using a previously published Maximum Information Spanning tree approach based on torsion angle distributions - and observe that the estimated protein conformational entropy is substantially more negative for the macrostates of the holo ensemble.

      The authors calculate mean first passage times from their Markov state models and report a strong correlation between the protein conformational entropy of each state and the mean first passage time from each state to the highest populated state.
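
      For reference, the mean first passage time (MFPT) to a target state follows from a transition matrix by solving a small linear system. The sketch below is a minimal illustration, assuming a row-stochastic matrix `T` estimated at a lag time `lag_time`; the function name is hypothetical and this is not the authors' implementation.

```python
import numpy as np

def mfpt_to_target(T, target, lag_time=1.0):
    """MFPT from every state to `target`, in the units of `lag_time`.

    Solves m_i = 1 + sum_{k != target} T[i, k] * m_k for all i != target,
    with m_target = 0.
    """
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    others = [i for i in range(n) if i != target]
    A = np.eye(n - 1) - T[np.ix_(others, others)]
    m = np.zeros(n)
    m[others] = np.linalg.solve(A, np.ones(n - 1))
    return lag_time * m
```

      For a two-state model with `T = [[0.9, 0.1], [0.2, 0.8]]`, the MFPT from state 0 to state 1 is 1/0.1 = 10 lag times, and from state 1 to state 0 it is 1/0.2 = 5 lag times.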

      As the authors observe that the conformational entropy estimated from macrostates of the holo alpha-synuclein:Fasudil ensemble is greater than that estimated from macrostates of apo alpha-synuclein - they suggest that the driving force of Fasudil binding is an increase in the conformational entropy of alpha-synuclein. No consideration/quantification of the enthalpy of alpha-synuclein Fasudil binding is presented.

      Strengths:

      The authors utilize MD simulations run with an appropriate force field for IDPs (a99SB-disp and a99SB-disp water (Robustelli et al., PNAS, 115 (21), E4758-E4766)) - which has previously been used to perform MD simulations of alpha-synuclein that have been validated with extensive NMR data.

      The contact probability between Fasudil and each alpha-synuclein residue observed in the previously performed 1500us MD simulation of alpha-synuclein in the presence of Fasudil (Robustelli et. al., Journal of the American Chemical Society, 144(6), pp.2501-2510) was previously found to be in good agreement with experimental NMR chemical shift perturbations upon Fasudil binding - suggesting that this simulation is a reasonable choice for understanding IDP:small molecule interactions.

      Comments on the latest version:

      While the authors have provided additional information in the updated manuscript, none of the additional analyses address the fundamental flaws of the manuscript.

      The additional analyses do not convincingly demonstrate that these two extremely different simulation datasets (1500 microsecond unbiased MD for a-synuclein + fasudil, 23 separate 1-4 microsecond simulations of apo a-synuclein) are directly comparable for the purposes of building MSMs.

      The additional analyses do not demonstrate that there are sufficient conformational transitions among kinetically metastable states observed in 23 separate 1-4 microsecond simulations of apo a-synuclein to build a valid MSM, or that the latent space of the VAE is kinetically meaningful.

      If one is interested in modeling the kinetics and thermodynamics of transitions between a set of conformational states, and they run a small number of MD simulations that are too short to see conformational transitions between conformational states - any kinetics and thermodynamics modeled by an MSM will be inherently meaningless. This is likely to be the case with the apo a-synuclein dataset analyzed in this investigation.

      Simulations of 1-4 microseconds are almost certainly far too short to see a meaningful sampling of conformational transitions of a highly entangled 140-residue IDP beyond a very local relaxation of the starting structures, and the authors provide no analyses to suggest otherwise.

      Without convincingly demonstrating reasonable statistics of conformational changes from the very small apo simulation dataset analyzed here, it seems highly likely the apparent validity of the apo MSM results from learning a VAE latent space that groups structurally and kinetically distinct conformations into similar states, creating the spurious appearance of transitions between states. As such, the kinetics and thermodynamics of the resulting MSM are likely to be relatively meaningless, and comparisons with an MSM for a-synuclein in the presence of fasudil are likely to be meaningless.

      In its present form, this study provides an example of how the use of black-box machine learning methods to analyze molecular simulations can lead to obtaining misleading results (such as the appearance of a valid MSM) - when more basic analyses are omitted.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      I have read the authors' response to my comments as well as to the other reviewers. Summarizing briefly, I don't think they provide substantial answer to the questions/comments by me or reviewer 3, and generally do not quantify the results/effects data. I still remain unconvinced about the analyses and conclusions. Rather than rewriting another set of comments, I think it will be more useful for all (authors and readers) simply to be able to see the entire set of reviews and responses together with the paper.

      The authors disagree with the views of referees. The authors have provided point-wise precise responses to each of the previous comments. The authors find that the referee has not been able to engage with the responses and accompanying analysis that were provided while communicating the previous response.

      The authors performed the following extensive analyses for the round 2 revision to address the comments raised by reviewer 2 and reviewer 3 on the previous versions:

      (1) We calculated the distribution of multiple metrics for both the apo and holo simulations, including their secondary structure composition, and demonstrated the robustness of our findings.

      (2) We analyzed smaller 60 µs chunks from two parts of the 1.5 ms trajectory and showed how, in combination with the Markov state modeling (MSM) approach, these chunks effectively capture equilibrium properties.

      (3) We thoroughly investigated the choice of starting structures, examining parameters such as Rg, RMSD, secondary structure, and SASA, in response to Referee 3's concerns about the objectivity of our dimension reduction approach.

      (4) We conducted multiple analyses using VAMP-scores and justified the use of a Variational Autoencoder (VAE) over tICA.

      (5) We had extensively verified the choice of hyperparameters used in constructing the MSM.

      (6) To alleviate referee concerns, we had retrained a VAE with four latent dimensions and used it to build an MSM, ensuring the robustness of our approach.

      However, we find that the referee has not considered these additional analyses in response to his/her comments on the manuscript.

      Since referee 2 also draws comments from Referee 3, it is worth noting that some of the comments from Referee 2 and Referee 3 in Round 1 were mutually contradictory. In particular, Referee 3's suggestion in Round 1 to use the same initial configuration for simulations of intrinsically disordered proteins (IDPs) in both apo and ligand-bound forms contradicts the fundamental principle that IDPs should not possess structural bias. This recommendation also directly conflicts with Referee 2's request for greater diversity in starting structures. Our manuscript provided robust evidence that our initial configurations are indeed diverse, with one configuration coincidentally matching that used in the ligand-bound simulations. Despite this, we addressed both sets of concerns in our Round 2 revisions. Unfortunately, it seems that these efforts were overlooked in the subsequent round of review.

      Referee 2's suggestion in the previous round of review comments to mix both holo and apo simulation trajectories for MSM construction is conceptually wrong and indicates a lack of understanding of transition matrix building in this field. Nevertheless, we addressed these comments by performing additional analyses and demonstrating the robustness of our current MSM.

      Reviewer #3 (Public Review):

      Summary:

      While the authors have provided additional information in the updated manuscript, none of the additional analyses address the fundamental flaws of the manuscript.

      The additional analyses do not convincingly demonstrate that these two extremely different simulation datasets (1500 microsecond unbiased MD for a-synuclein + fasudil, 23 separate 1-4 microsecond simulations of apo a-synuclein) are directly comparable for the purposes of building MSMs.

      The 23 unbiased 1-4 microsecond simulations of apo αS total ~60 us.

      Author response image 1.

      Left figure : Distribution of the radius of gyration (Rg) of the 23 apo simulation (as shown in the colourbar) and holo simulation (black). Right figure : Mean and standard deviation (as error bar) of the Rg of the 23 apo (colourbar) and holo simulations (black).

      We have plotted the distribution of the radius of gyration (Rg) for the 23 apo simulations (colour bar) and the holo simulation (black) as shown in the left figure and also compared the mean and standard deviations of the Rg values (right figure). We find that our apo simulations span the entire space of Rg as is spanned by the holo simulation. We have also measured the mean and standard deviations (SD) (horizontal error bar) of the apo and holo simulations. The fact that the apo simulations have mean and SDs comparable to those of the holo ensemble suggests that the majority of the apo simulations are sampling similar conformational space as those observed in the ligand-bound holo form and hence can be used for building the MSM.
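
      For readers who wish to reproduce this kind of comparison, the radius of gyration of each frame can be computed as sketched below. This is a minimal NumPy illustration under stated assumptions: coordinates are supplied as an array of shape (n_frames, n_atoms, 3) or (n_atoms, 3), weights are uniform unless `masses` is given, and the function name is hypothetical.

```python
import numpy as np

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for one frame or a stack of frames."""
    coords = np.asarray(coords, dtype=float)
    n_atoms = coords.shape[-2]
    w = np.ones(n_atoms) if masses is None else np.asarray(masses, dtype=float)
    w = w / w.sum()
    # Weighted center of mass of each frame
    com = np.einsum('a,...ad->...d', w, coords)
    # Squared distance of every atom from the center of mass
    sq_dist = ((coords - com[..., None, :]) ** 2).sum(axis=-1)
    return np.sqrt((sq_dist * w).sum(axis=-1))
```

      Applying `np.mean` and `np.std` to the per-frame values of each trajectory gives the kind of summary shown in the right panel.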

      The additional analyses do not demonstrate that there are sufficient conformational transitions among kinetically metastable states observed in 23 separate 1-4 microsecond simulations of apo a-synuclein to build a valid MSM, or that the latent space of the VAE is kinetically meaningful.      

      We have performed the Chapman-Kolmogorov test to compare observed and predicted transition probabilities over increasing lag times and found good agreement between these probabilities, thereby suggesting that transitions between states are well-sampled for both the apo (Author response image 2) and holo simulation (Figure S9).

      Author response image 2.

      The Chapman-Kolmogorov test performed for the three state Markov State Model of the αS ensemble.
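
      The logic of a Chapman-Kolmogorov test can be sketched as follows: the transition matrix estimated at lag τ, raised to the k-th power, is compared with the matrix estimated directly at lag kτ. The code is a minimal NumPy illustration with hypothetical function names and a simple row-normalized estimator; it is not the implementation used to produce the figure.

```python
import numpy as np

def estimate_T(dtrajs, n_states, lag):
    """Row-normalized transition matrix counted within each trajectory."""
    counts = np.zeros((n_states, n_states))
    for d in dtrajs:
        d = np.asarray(d, dtype=int)
        np.add.at(counts, (d[:-lag], d[lag:]), 1.0)
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

def ck_errors(dtrajs, n_states, lag, k_max=5):
    """Largest absolute deviation between T(lag)^k and T(k * lag), for k = 1..k_max."""
    T1 = estimate_T(dtrajs, n_states, lag)
    errors = []
    for k in range(1, k_max + 1):
        predicted = np.linalg.matrix_power(T1, k)
        observed = estimate_T(dtrajs, n_states, k * lag)
        errors.append(float(np.abs(predicted - observed).max()))
    return errors
```

      Small deviations indicate that the model is self-consistent at the chosen lag time. Note that trajectories shorter than k × lag contribute no counts at that multiple.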

      As for the latent space of the VAE, we have compared its VAMP2 score with that of tICA. The VAE has a higher VAMP2 score than tICA, thereby indicating its efficacy in capturing slower modes for both the apo and holo simulations (Fig. S7 and S8).

      If one is interested in modeling the kinetics and thermodynamics of transitions between a set of conformational states, and they run a small number of MD simulations that are too short to see conformational transitions between conformational states - any kinetics and thermodynamics modeled by an MSM will be inherently meaningless. This is likely to be the case with the apo a-synuclein dataset analyzed in this investigation.

      We disagree with the referee’s view. The referee does not seem to understand the point of building Markov state models via short-time scale trajectories. The distribution of Rg of all the 23 apo simulations spans the entire Rg space sampled by the holo simulation, thereby suggesting that multiple short simulations can sample structures of varying sizes as sampled from the 1.5 ms holo simulation (see Author response image 1).

      Simulations of 1-4 microseconds are almost certainly far too short to see a meaningful sampling of conformational transitions of a highly entangled 140-residue IDP beyond a very local relaxation of the starting structures, and the authors provide no analyses to suggest otherwise.

      Author response image 3.

      Autocorrelation of the first principal component of the backbone dihedral for the apo (colourbar) and holo (black) simulation.

      Author response image 4.

      Autocorrelation of the second principal component of the backbone dihedral for the apo (colourbar) and holo (black) simulation.

      In order to assess the 23 short simulations in capturing meaningful kinetics and thermodynamics, we have computed the backbone dihedrals which were then reduced to two principal components for both the 23 apo and holo simulations. We then calculated the autocorrelation time for each of the components and for each of the apo and holo simulations which are plotted in Author response image 3 and Author response image 4 respectively.

      The autocorrelation for the holo and most of the apo simulations is similar, thereby suggesting that there is sufficient sampling of conformational transitions between conformational states in the apo simulations and that they are therefore able to represent the structural changes of the system similarly to the long simulation.
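
      The analysis described above (projection onto principal components followed by an autocorrelation function) can be sketched as below. This is a minimal NumPy illustration with hypothetical function names; it assumes the dihedral angles have already been converted to a feature matrix of shape (n_frames, n_features), for example via their sines and cosines, which is an assumption not stated in the response.

```python
import numpy as np

def principal_components(features, n_components=2):
    """Project mean-centered features onto their leading principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T

def autocorrelation(x, max_lag):
    """Normalized autocorrelation function of a 1D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    return np.array([np.dot(x[:n - k], x[k:]) / ((n - k) * var)
                     for k in range(max_lag + 1)])
```

      Comparing the decay time of such a curve with the length of each trajectory indicates how many statistically independent samples of that coordinate a trajectory contains.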

      Without convincingly demonstrating reasonable statistics of conformational changes from the very small apo simulation dataset analyzed here, it seems highly likely the apparent validity of the apo MSM results from learning a VAE latent space that groups structurally and kinetically distinct conformations into similar states, creating the spurious appearance of transitions between states. As such, the kinetics and thermodynamics of the resulting MSM are likely to be relatively meaningless, and comparisons with an MSM for a-synuclein in the presence of fasudil are likely to be meaningless.

      We have shown above that the short simulations are able to capture the structural changes in the long simulation. In addition we have compared the VAMP2 score of the apo and holo simulation with tICA and found out that VAE is superior in capturing long timescale dynamics, for both apo and holo simulation (Fig. S7 and S8).

      In its present form, this study provides an example of how the use of black-box machine learning methods to analyze molecular simulations can lead to obtaining misleading results (such as the appearance of a valid MSM) - when more basic analyses are omitted.

      The authors disagree with the referee’s viewpoint on our manuscript. We find that the majority of the contents of the referee’s comments are cursory and lack objectivity.

      The referee’s loose reference to machine learning as a black box reflects a lack of the basic knowledge needed to comprehend artificial deep neural networks’ long-proven ability to objectively deduce an optimal set of lower-dimensional representations of the conformational subspace of a complex biomacromolecule. The referee’s views on the manuscript ignore the extensive optimization of hyper-parameters that was carried out by the authors in developing a suitable beta-variational autoencoder framework for deducing an optimal latent space representation of the complex and fuzzy conformational landscape of an IDP such as alpha-synuclein. We had thoroughly investigated the choice of starting structures, examining parameters such as Rg, RMSD, secondary structure, and SASA, in response to Referee 3's concerns about the objectivity of our dimension reduction approach. However, we find that referee 3 has ignored the analysis provided to justify our choice.

      Referee 3's advocacy for linear dimensional reduction techniques overlooks the necessity and generality of non-linear approaches, as enabled by artificial deep neural network frameworks, demonstrated in the present manuscript. Nevertheless, our manuscript includes evidence demonstrating the optimality of our current reduced dimensions through varied dimensional analyses. Our extensive analysis, based on the VAMP-2 score, supports the sufficiency of the present dimensions compared to other linear reduction methods.

      The referee’s view that developing Markov state models (MSMs) of the apo form of alpha-synuclein from multiple 1-4 microsecond simulations is misleading suggests a lack of knowledge of the fundamental purpose and motivation for using MSMs, which is to derive long-time-scale equilibrium properties from significantly shorter, adaptively sampled trajectories. The referee has overlooked the extensive analysis that the authors provided to demonstrate that the Markov state models developed from short simulation trajectories of alpha-synuclein can statistically replicate the properties derived from very long trajectories.

      ---

      The following is the authors’ response to the original reviews.

      The following extensive analyses were performed to address the reviewer comments:

      (1) We have calculated the distribution of radius of gyration (Rg), end-to-end distance (Ree), and solvent accessible surface area (SASA) of the apo and holo simulations and also their secondary structure composition.

      (2) We have performed a similar analysis for the smaller 60 µs chunk from two parts of the 1.5 ms trajectory.

      (3) The choice of starting structures has been thoroughly investigated in terms of Rg, RMSD, secondary structure and SASA.

      (4) We have justified the use of VAE over tICA.

      (5) We have verified the choice of hyperparameters that were used to build the MSM.

      (6) We have retrained a VAE with four latent dimensions and used it to build an MSM.

      (7) As per the recommendation of referee 1, we have updated the title of the manuscript by introducing the phrase ‘expansion’.

      The manuscript has been accordingly revised by updating it with additional analysis.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a well-conducted study about the mechanism of binding of a small molecule (fasudil) to a disordered protein (alpha-synuclein). Since this type of interaction has puzzled researchers for the last two decades, the results presented are welcome as they offer relevant insight into the physical principles underlying this interaction.

      Strengths:

      The results show convincingly that the mechanism of entropic expansion can explain the previously reported binding of fasudil to alpha-synuclein. In this context, the analysis of the changes in the entropy of the protein and of water is highly relevant. The combined use of machine learning for dimensional reduction and of Markov State Models could become a general procedure for the analysis of other systems where a compound binds a disordered protein.

      Weaknesses:

      It would be important to underscore the computational nature of the results, since the experimental evidence that fasudil binds alpha-synuclein is not entirely clear, at least to my knowledge.

      The experimental evidence of binding of fasudil to α-synuclein and potentially preventing its aggregation is reported in the paper “Fasudil attenuates aggregation of α-synuclein in models of Parkinson’s disease. Tatenhorst et al. Acta Neuropathologica Communications (2016) 4:39 DOI 10.1186/s40478-016-0310-y ”. In this work, solution state 15N-1H HSQC NMR experiments were performed of α-synuclein in increasing amounts of fasudil which led to large chemical shift perturbation of Y133 and Y136 residues. Additionally single and double mutant  synT-Y133A and synT-Y136A (tyrosine is replaced with alanine), when treated with fasudil, had no significant effect as evident from immunochemistry, thereby indicating that α-synuclein aggregation can be inhibited by the interaction of C-terminal tyrosines with  fasudil. These two analyses point to binding specific binding sites of fasudil to α-synuclein.

      In our work, we have built an MSM on the latent space of a variational autoencoder (VAE), a deep learning method, to address how fasudil interacts with α-synuclein. An analysis of the macrostates obtained from the MSM gives insight into how fasudil interacts with α-synuclein in terms of the transition probabilities among the states, thereby predicting which states are most favorable for binding.

      Reviewer #2 (Public Review):

      The manuscript by Menon et al describes a set of simulations of alpha-Synuclein (aSYN) and analyses of these and previous simulations in the presence of a small molecule.

      While I agree with the authors that the questions addressed are interesting, I am not sure how much we learn from the present simulations and analyses. In parts, the manuscript reads more like an attempt to apply a whole range of tools rather than with a goal of answering any specific questions.

      In this manuscript, we have employed a variational Bayesian method, the VAE, which uses variational inference to approximate the distribution of the latent variables. Unlike conventional linear dimensionality reduction methods such as tICA (as provided in the SI), this method was found to be better (higher VAMP2 score) at capturing slow modes and thereby facilitates the study of long-time dynamics. A Markov State Model was built on this lower-dimensional space, which indicated the presence of three and six states for the apo and holo simulations, respectively. The exclusivity of the states was justified by determining the backbone contact maps and further mapping these states using a denoising CNN-VAE. The increase in the number of states in the presence of the small molecule was justified by calculating the entropy of the macrostates. The entropic contribution from water remained similar across all states, while for the protein in the holo ensemble, entropy was significantly modulated (either increased or decreased) compared to the apo state. In contrast, the entropy of the apo states showed much less modulation. This proves that an increase in the number of states is primarily an entropic effect caused by the small molecule. Finally, we have compared the mean first passage time (MFPT) from the other states to the most populated state, which reveals a strong correlation between transition time and the system's entropy for both the apo and holo ensembles. However, the transition times (to the most populated state) are much lower for the holo ensemble, suggesting that fasudil may potentially trap the protein conformations in the intermediate states, thereby slowing down αS in exploring the large conformational space and eventually slowing down aggregation.

      There's a lot going on in this paper, and I am not sure it is useful for the authors, readers or me to spell out all of my comments in detail. But here are at least some points that I found confusing/etc.

      Major concerns

      p. 5 and elsewhere:

      I lack a serious discussion of convergence and the statistics of the differences between the two sets of simulations. On p. 5 it is described how the authors ran multiple simulations of the ligand-free system for a total of 62 µs; that is about 25 times less than for the ligand system. I acknowledge that running 1.5 ms is unfeasible, but at a bare minimum the authors should discuss and analyse the consequences of the relatively small amount of sampling. Here it is important to say that while 62 µs may sound like a lot, it is probably not enough to sample the relevant properties of a 140-residue long disordered protein.

      As to referee 2’s original comment on ‘a lot going on in the manuscript’, we believe that the complexity of the project demanded extensive analysis and objective machine learning approaches, instead of routine collective variables or traditional linear dimensionality reduction techniques. This is what has been accomplished in this manuscript. For a reader to get the gist of the work, the last paragraph of the introduction and the first paragraph of the conclusion provide a summary of the overall findings and investigation in the manuscript. First, a VAE-based machine learning approach demonstrates the modulation of the free energy landscape of alpha-synuclein in the presence of fasudil. Next, a Markov State Model elucidates distinct binding-competent states of alpha-synuclein in the presence of the small-molecule drug. Then the MSM-derived metastable states of the alpha-synuclein monomer are structurally characterized in the presence of fasudil. Next, we mapped the macrostates in the apo and bound-state ensembles using a denoising convolutional variational autoencoder, to ensure that these are mutually distinct. Next, we show that fasudil exhibits conformation-dependent interactions with individual metastable states. Finally, the investigation quantitatively brings out entropic signatures of small-molecule binding.

      We thank the reviewer for the question. For the apo simulations, we performed 1-4 μs long simulations from 23 different starting structures, amounting to an ensemble of ~62 μs. In the Supplementary figures, we show how the starting structures used for the apo simulations compare with the structure used to run the holo simulations, as well as a comparison of the apo and holo ensembles in terms of structural features such as Rg, Ree, solvent accessible surface area (SASA) and secondary structure properties. This is updated in the manuscript on pages 3 and 31-33 and in Figures S1-S6 and S25-S30.

      Also, regarding the choice of starting structures, we chose multiple distinct conformations from a previous simulation of the alpha-synuclein monomer, reported in Robustelli et al., PNAS, 115 (21), E4758-E4766. The Rg values of the starting structures span the entire Rg distribution of the holo ensemble, from compact through intermediate to extended states. Importantly, the Rg distributions of the apo and holo ensembles are highly comparable and overlapping, indicating that the apo simulations, although of short timescale, have sampled the phase space locally around each starting conformation and thus covered the protein phase space as in the holo simulation. Similarly, other structural properties such as SASA, Ree and secondary structure are comparable for the two ensembles. These analyses show that the local sampling across a variety of starting conformations has ensured sufficient sampling of the IDP phase space. This is updated in the manuscript on pages 33-34 and in Figures S1 and S25-S30.
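
      As an illustration of the kind of Rg comparison described above, the following minimal Python sketch computes a per-frame radius of gyration and a simple histogram overlap between two ensembles. It assumes equal-mass atoms (for example Cα atoms only) and coordinates held in NumPy arrays; the function names and the overlap measure are illustrative assumptions, not the analysis code behind Figure S1.

```python
import numpy as np

def radius_of_gyration(coords):
    """Per-frame Rg for coords of shape (n_frames, n_atoms, 3), equal masses assumed."""
    coords = np.asarray(coords, dtype=float)
    # Subtract the geometric centre of each frame.
    centered = coords - coords.mean(axis=1, keepdims=True)
    # Root of the mean squared distance from the centre.
    return np.sqrt(np.mean(np.sum(centered ** 2, axis=2), axis=1))

def histogram_overlap(a, b, bins=50):
    """Overlap coefficient (0 = disjoint, 1 = identical) of two 1D samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    # Shared bin edges so the two densities are directly comparable.
    ha, edges = np.histogram(a, bins=bins, range=(lo, hi), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi), density=True)
    width = edges[1] - edges[0]
    return float(np.sum(np.minimum(ha, hb)) * width)
```

      An overlap close to 1 would support the statement that two Rg distributions are comparable; on its own it says nothing about whether transitions between conformations were sampled.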

      p. 7:

      The authors make it sound like a bad thing that some methods are deterministic. Why is that the case? What kind of uncertainty in the data do they mean? One can certainly have deterministic methods and still deal with uncertainty. Again, this seems like a somewhat ad hoc argument for the choice of the method used.

      We appreciate the reviewer’s comment. In this work, we have used a single VAE model to map the simulation of αS in its apo state and in the presence of fasudil, into two dimensions. If we had used an autoencoder, which is a deterministic model, we would have to train two independent models; one for the apo-state and one for fasudil. It would then be questionable to compare the two dimensions obtained from two different autoencoders as the model parameters are not shared. 

      The VAE gives us this flexibility by mapping each input not to a single point but to a distribution, which encourages it to learn a more generalizable representation. The uncertainty is not in the data; rather, mapping a conformation (from the fasudil simulation) to a distribution provides a new point for a similar structure (from the apo simulation).

      p. 8:

      The authors should make it clear (i) what the reconstruction loss and KL are calculated over and (ii) what the RMSD is calculated over.

      (i) The reconstruction loss is calculated between the reconstructed and original pairwise distances, whereas the KL loss is calculated between the approximated posterior distribution and the prior distribution (for a VAE, a standard normal distribution).

      (ii) The RMSE is the root mean square error between the original data and the reconstructed data. 

      (i) is updated on page 34 and (ii) is updated in the revised manuscript on page 8.
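
      To make the two loss terms concrete, here is a minimal NumPy sketch of a β-weighted VAE objective, assuming a diagonal Gaussian posterior, a standard normal prior, and a mean-squared-error reconstruction term. The function names and array shapes are illustrative assumptions; the authors' actual implementation and normalisation may differ.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta):
    """Return (total, reconstruction, KL) for one batch.

    x, x_recon : (n_frames, n_features) original and reconstructed pairwise distances
    mu, log_var: (n_frames, n_latent) mean and log-variance of the approximate posterior
    beta       : weighting factor for the KL term
    """
    # Reconstruction term: mean squared error over all frames and features.
    recon = np.mean((x - x_recon) ** 2)
    # Closed-form KL divergence between N(mu, sigma^2) and the prior N(0, 1),
    # summed over latent dimensions and averaged over frames.
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    return recon + beta * kl, recon, kl

def rmse(x, x_recon):
    """Root mean square error between original and reconstructed data."""
    return float(np.sqrt(np.mean((x - x_recon) ** 2)))
```

      Under these assumptions, scanning β and keeping the model with the lowest validation RMSE corresponds to the model selection described in the response to the next comment.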

      p. 9/figure 1:

      The authors select a beta value that may be the minimum, but then is just below a big jump in the cross-validation error. Why does the error jump so much, and isn't it slightly dangerous to pick a value close to such a large jump?

      In this work, RMSE has been chosen as the metric to select the best VAE model. To do so, the β parameter (the weighting factor for the KL loss) was varied, and the β value that gave the minimum RMSE was chosen.

      This is updated on page 8.

      p. 10:

      Why was a 2-dimensional representation used in the VAE? What evidence do the authors have that the representation is meaningful? The authors state "The free energy landscape represents a large number of spatially close local minima representative of energetically competitive conformations inherent in αS" but they do not say what they mean by "spatially close". In the original space? If so, where is the evidence?

      We thank the reviewer for the question. Even though an increase in the number of latent dimensions may make the model more accurate, it can also result in overfitting: the model can simply memorize the patterns in the data instead of generalizing from them. A higher-dimensional latent space is also more difficult to interpret; therefore, we chose two dimensions.

      The reconstruction loss (which is the mean squared error between the input and the reconstructed data) is of the order of 10⁻⁴. Also, the MSM built on the latent space of the VAE is able to identify states that are distinct for both apo and holo simulations, which ensures that the latent space representation is meaningful.

      We have also trained a model with 4 neurons in the latent space and built an MSM. The implied timescales indicate the presence of six states which is consistent with the model with two latent dimensions.

      This is updated in the manuscript on page 13 and figure S14-S15.

      No, not spatially close in the original space, but in the reduced two dimensional latent space.

      p. 10:

      It is not clear from the text whether the VAEs are the same for both aSYN and aSYN-Fasudil. I assume they are. Given that the Fasudil dataset is 25x larger, presumably the VAE is mostly driven by that system. Is the VAE an equally good representation of both systems?

      Yes, the same model is used for both aSYN and aSYN-Fasudil ensemble.

      The states obtained from the MSM of the aSyn ensemble are distinct when their Cα contact maps are analyzed. So we think it is a good representation for this system.

      p. 10/11:

      Do the authors have any evidence that the latent space representation preserves relevant kinetic properties? This is a key point because the entire analysis is built on this. The choice of using z1 and z2 to build the MSM seems somewhat ad hoc. What do the auto-correlation functions of z1 and z2 look like? Are they related to the dynamics of some key structural properties like Rg or transient helical structure?

      Author response image 5. Autocorrelation of z1 and z2 of the VAE latent space and of the radius of gyration for the aSyn-fasudil simulation.

      We find that z1 of the VAE decays much more slowly than Rg. This indicates that it is much better at capturing long-timescale dynamics than Rg.
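
      For readers who want to reproduce this kind of comparison, a normalised autocorrelation function of a scalar time series (a latent coordinate or Rg) can be sketched as follows. This is a generic estimator written for illustration; the function name is an assumption, and it is not claimed to be the code used for the image above.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Normalised autocorrelation C(lag) for lag = 0..max_lag, with C(0) = 1."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()              # remove the mean so C decays towards zero
    var = np.mean(x ** 2)         # lag-0 value used for normalisation
    acf = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Average product of the series with a copy of itself shifted by `lag` frames.
        acf[lag] = np.mean(x[: len(x) - lag] * x[lag:]) / var
    return acf
```

      Applying the same function to z1, z2 and Rg sampled at the same stride, and plotting the curves against lag in physical time, gives a direct view of which coordinate relaxes most slowly.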

      p. 11:

      What's the argument for not building an MSM with states shared for aSYN +- Fasudil?

      We have built two different Markov state models for the two aSYN simulations, one in the apo state and one in the presence of the ligand. Mixing the two latent spaces to build one MSM would give incorrect transition timescales among the states, as these are independent simulations.

      p. 12:

      Fig. 3b/c show quite clearly that the implied timescales are not converged at the chosen lag time (incidentally, it would have been useful to show the timescales in physical time). The CK test is stated to be validated with "reasonable accuracy", though it is unclear what that means.

      We have mentioned the physical timescales in the main manuscript (page 38), which are 36 and 32 ns for the apo and holo simulations, respectively. We used “reasonable accuracy” in the context of the Chapman-Kolmogorov test. We note that for the ligand simulations, the estimated and predicted models are in excellent agreement, more so than for some of the transitions in the apo state. This good agreement implies that the model has reached Markovianity and the timescales have converged.

      The CK test is updated in the manuscript on page 12.

      p. 12:

      In Fig. 3d, what are the authors bootstrapping over? What are the errors if the authors analyse sampling noise (e.g. bootstrap over simulation blocks)?

      For bootstrapping, we randomly deleted a part of the simulation (simulation block) and rebuilt the MSM with this reduced dataset. We repeated this 10 times and reported the average value of the population and the transition timescales over the 10 iterations.  
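
      The delete-a-block procedure described above can be sketched on discretised state trajectories as follows. This is a simplified illustration under stated assumptions: the transition matrix is a plain row-normalised count matrix (not a reversible maximum-likelihood estimate), the deleted fraction `frac` and all function names are hypothetical, and only the stationary populations are reported.

```python
import numpy as np

def transition_matrix(dtrajs, n_states, lag):
    """Row-normalised transition counts at the given lag, pooled over trajectories."""
    counts = np.zeros((n_states, n_states))
    for d in dtrajs:
        d = np.asarray(d)
        if len(d) > lag:
            np.add.at(counts, (d[:-lag], d[lag:]), 1.0)
    counts += 1e-8  # tiny pseudocount so that no row is empty
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalised to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

def delete_block_resample(dtrajs, n_states, lag, n_iter=10, frac=0.1, seed=0):
    """Repeatedly delete one random contiguous block per trajectory and re-estimate."""
    rng = np.random.default_rng(seed)
    pops = []
    for _ in range(n_iter):
        kept = []
        for d in dtrajs:
            d = np.asarray(d)
            block = max(1, int(frac * len(d)))
            start = rng.integers(0, len(d) - block + 1)
            # Keep the two remaining pieces separate so that no transition
            # is counted across the deleted gap.
            kept.extend([d[:start], d[start + block:]])
        pops.append(stationary_distribution(transition_matrix(kept, n_states, lag)))
    pops = np.array(pops)
    return pops.mean(axis=0), pops.std(axis=0)
```

      Reporting the standard deviation across iterations alongside the mean gives a rough sense of the sampling noise the reviewer asks about, although deleting a small block yields strongly overlapping datasets and therefore tends to underestimate the true uncertainty.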

      p. 13:

      I appreciate that the authors build an MSM using only a subset of the fasudil simulations. Here, it would be important that this analysis includes the entire workflow so that the VAE is also rebuilt from scratch. Is that the case?

      The VAE model was trained over data points of the ligand simulation sampled at every 9 ns starting from time t=0, for the entire 1.5 ms. We did not train it for the subset of the fasudil simulation, but rather used the trained VAE model to get the latent space of the 60 μs of the fasudil simulation to build the MSM. Additionally, we have compared the distributions of Rg for this simulation block with the apo ensemble and found good agreement among them. 

      The Rg distribution is updated in the manuscript on page 13; see Figures S10-S11.

      p. 18:

      I don't understand the goal of building the CVAE and DCVAE. Am I correct that the authors are building a complex ML model using only 3/6 input images? What is the goal of this analysis? As it stands, it reads a bit like simply wanting to apply some ML method to the data. Incidentally, the table in Fig. 6C is somewhat intransparent.

      We appreciate the reviewer’s valid question. The ensemble-averaged contact maps of the macrostates of aSyn in the apo state and in the presence of the ligand made it challenging to find contacts that are exclusive to each state. Since VAEs are excellent at finding patterns, we employed a convolutional VAE (typically used for images). However, owing to the small number of contact maps, the model overfitted; to prevent this, we added noise to the data. Visual inspection of ensemble-averaged contact maps, especially for IDPs, is difficult, and this lower-dimensional space gives us a preliminary idea of how each macrostate differs from every other. The table in Fig. 6C provides scores for the denoised contact maps (SSIM and PSNR scores). An SSIM score above 0.9 and a PSNR score between 20 and 48 indicate that the reconstruction of the contact map is of good quality.
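
      To clarify what the PSNR score in that table measures, a minimal NumPy sketch is given below. It assumes contact probabilities scaled to the range 0-1, hence `data_range=1.0`; the function name is an assumption. SSIM is not reproduced here; scikit-image offers `skimage.metrics.structural_similarity` and `skimage.metrics.peak_signal_noise_ratio` as reference implementations.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally shaped maps."""
    reference = np.asarray(reference, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical maps
    # Ratio of the squared dynamic range to the mean squared error, in decibels.
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

      With a data range of 1, a PSNR of 20 dB corresponds to a root-mean-square error of 0.1 per contact, and higher values correspond to smaller errors.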

      p. 22:

      "Our results indicate that the interaction of fasudil with αS residues governs the structural features of the protein."

      What results indicate this?

      By building Markov State Models and comparing them across the apo and holo ensembles, we showed that the interaction of fasudil with aSyn leads to the population of more states (than apo). In these states, we observe that fasudil interacts with aSyn in different regions, as shown by the protein-ligand contact maps in Figure 7. Also, the contact maps and the extent of secondary structure are distinct across the six states. The location and extent of the helix and sheet-like character in the ensembles of the six macrostates are shown in Figures S16-S17. Based on these observations, we state that the interaction of the small molecule favors the population of new aSyn states that are distinct in their structural features.

      p. 23:

      The authors should add some (realistic) errors to the entropy values quoted. Fig. 8 has some error bars, though they seem unrealistically small. Also, is the water value quoted from the same force field and conditions as for the simulations?

      The error values are the standard deviations provided by the PDB2ENTROPY package. Yes, the water value is from the same force field, and the simulation conditions are the same, as reported in the section “Entropy of water”.

      p. 23:

      Has PDB2ENTROPY been validated for use with disordered proteins?

      Yes, it has been used in the following paper studying liquid-liquid phase separation of an IDP. 

      This paper has also been cited in the manuscript (reference 66).

      “Thermodynamic forces from protein and water govern condensate formation of an intrinsically disordered protein domain” by Saumyak Mukherjee & Lars V. Schäfer, Nature Communications volume  14, Article number: 5892 (2023) https://doi.org/10.1038/s41467-023-41586-y

      p. 23/24:

      It would be useful to compare (i) the free energies of the states (from their populations), (ii) the entropies (as calculated) and (iii) the enthalpies (as calculated e.g. as the average force field energy). Do they match up?

      Our analysis stems from previous studies in which enthalpy-driven drug design has not led to significant advances, particularly for IDPs. In the presence of the drug/ligand, the protein may be able to explore a larger conformational space, and hence the number of states accessible to the protein increases, which we found by building a Markov State Model on the latent space of the VAE. The entropy of the protein is calculated based on the torsional degrees of freedom relative to the random distribution (the protein with the most random configuration).
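
      The comparison requested in this comment could start from the relation between stationary populations and relative free energies, ΔG_i = -RT ln(π_i/π_ref), followed by a check of how closely ΔG, ΔH and TΔS close. The sketch below is a hedged illustration only: the function names are hypothetical, energies are taken in kJ/mol and entropies in kJ/(mol K), and no values from the manuscript are used.

```python
import numpy as np

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)

def relative_free_energies(populations, temperature):
    """Free energy of each state relative to the most populated one, in kJ/mol."""
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()  # normalise in case the populations do not sum to 1
    return -R_KJ_PER_MOL_K * temperature * np.log(p / p.max())

def closure_residual(delta_g, delta_h, delta_s, temperature):
    """Residual of delta_G = delta_H - T * delta_S; zero if the three estimates agree."""
    delta_g = np.asarray(delta_g, dtype=float)
    delta_h = np.asarray(delta_h, dtype=float)
    delta_s = np.asarray(delta_s, dtype=float)
    return delta_g - (delta_h - temperature * delta_s)
```

      All three quantities must be expressed relative to the same reference state before the residual is meaningful; large residuals would point to inconsistent estimates or to missing contributions such as solvent terms.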

      p. 31:

      It is unclear which previous simulation the new aSYN simulations were launched from. What is the size of the box used?

      The starting conformations for the new aSYN simulations were randomly chosen from a previously reported 73 μs simulation in Robustelli et al. (PNAS, 115 (21), E4758-E4766).

      Box sizes for the 23 simulations have been added to the supplemental information in Table S1.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript Menon, Adhikari, and Mondal analyze explicit solvent molecular dynamics (MD) computer simulations of the intrinsically disordered protein (IDP) alpha-synuclein in the presence and absence of a small molecule ligand, Fasudil, previously demonstrated to bind alpha-synuclein by NMR spectroscopy without inducing folding into more ordered structures. In order to provide insight into the binding mechanism of Fasudil the authors analyze an unbiased 1500us MD simulation of alpha-synuclein in the presence of Fasudil previously reported by Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). The authors compare this simulation to a very different set of apo simulations: 23 separate 1-4us simulations of alpha-synuclein seeded from different apo conformations taken from another simulation previously reported by Robustelli et al. (PNAS, 115 (21), E4758-E4766), for a total of ~62us.

      To analyze the conformational space of alpha-synuclein, the authors employ a variational autoencoder (VAE) to reduce the dimensionality of Ca-Ca pairwise distances to 2 dimensions, and use the latent space projection of the VAE to build Markov state models. The authors utilize k-means clustering to cluster the sampled states of alpha-synuclein in each condition into 180 microstates on the VAE latent space. They then coarse-grain these 180 microstates into a 3-macrostate model for apo alpha-synuclein and a 6-macrostate model for alpha-synuclein in the presence of fasudil using the PCCA+ coarse-graining method. Few details are provided to explain the hyperparameters used for PCCA+ coarse-graining and the rationale for selecting the final number of macrostates.

      The authors analyze the properties of each of the alpha-synuclein macrostates from their final MSMs - examining intramolecular contacts, secondary structure propensities, and in the case of alpha-synuclein:Fasudil holo simulations - the contact probabilities between Fasudil and alpha-synuclein residues.

      The authors utilize an additional variational autoencoder (a denoising convolutional VAE) to compare denoised contact maps of each macrostate, and project onto an additional latent space. The authors conclude that their apo and holo simulations are sampling distinct regions of the conformational space of alpha-synuclein projected on the denoising convolutional VAE latent space.

      Finally, the authors calculate water entropy and protein conformational entropy for each microstate. To facilitate water entropy calculations - the authors take a single structure from each macrostate - and run a 20ps simulation at a finer timestep (4 femtoseconds) using a previously published method (DoSPT), which computes thermodynamic properties of water from MD simulations using autocorrelation functions of water velocities. The authors report that water entropy calculated from these individual 20ps simulations is very similar.

      For each macrostate the authors compute protein conformational entropy using a previously published Maximum Information Spanning tree approach based on torsion angle distributions - and observe that the estimated protein conformational entropy is substantially more negative for the macrostates of the holo ensemble.

      The authors calculate mean first passage times from their Markov state models and report a strong correlation between the protein conformational entropy of each state and the mean first passage time from each state to the highest populated state.

      As the authors observe that the conformational entropy estimated from macrostates of the holo alpha-synuclein:Fasudil simulation is greater than that estimated from the apo alpha-synuclein macrostates - they suggest that the driving force of Fasudil binding is an increase in the conformational entropy of alpha-synuclein. No consideration/quantification of the enthalpy of alpha-synuclein Fasudil binding is presented.

      Strengths:

      The authors utilize MD simulations run with an appropriate force field for IDPs (a99SB-disp and a99SB-disp water; Robustelli et al., PNAS, 115 (21), E4758-E4766) - which has previously been used to perform MD simulations of alpha-synuclein that have been validated with extensive NMR data.

      The contact probability between Fasudil and each alpha-synuclein residue observed in the previously performed 1500us MD simulation of alpha-synuclein in the presence of Fasudil (Robustelli et. al., Journal of the American Chemical Society, 144(6), pp.2501-2510) was previously found to be in good agreement with experimental NMR chemical shift perturbations upon Fasudil binding - suggesting that this simulation is a reasonable choice for understanding IDP:small molecule interactions.

      Weaknesses:

      Major Weakness 1: Simulations of apo alpha-synuclein and holo simulations of alpha-synuclein and fasudil are not comparable.

      The most robust way to determine how the presence of Fasudil affects the conformational ensemble of alpha-synuclein is to run apo and holo simulations of the same length from the same starting structures using the same simulation parameters.

      The 23 independent 1-4 us simulations of apo alpha-synuclein and the long unbiased 1500us simulation of alpha-synuclein in the presence of fasudil are not directly comparable. The starting structures of the simulations used to build a Markov state model to describe apo alpha-synuclein were taken from a previously reported 73us MD simulation of alpha-synuclein (run with the a99SB-disp force field and water model) with 100mM NaCl (Robustelli et al., PNAS, 115 (21), E4758-E4766). As the holo simulation of alpha-synuclein and Fasudil was run in 50mM NaCl, snapshots from the original apo alpha-synuclein simulation were resolvated with 50mM NaCl - and new simulations were run.

      No justification is offered for how starting structures were selected. We have no sense of the conformational variability of the starting structures selected and no sense of how these conformations compare to the alpha-synuclein conformations sampled in the holo simulation in terms of standard structural descriptors such as tertiary contacts, secondary structure, radius of gyration (Rg), solvent exposed surface area etc. (we only see a comparison of projections on an uninterpretable non-linear latent space and average contact maps). Additionally, 1-4 us is a relatively short timescale for a simulation of a 140 residue IDP - and one is unlikely to see substantial evolution for many structural properties of interest (i.e. secondary structure, radius of gyration, tertiary contacts) in simulations this short. Without any information about the conformational space sampled in the 23 apo simulations (aside from a projection on an uninterpretable latent space) - we have no way to determine if we observe transitions between distinct states in these short simulations, and therefore if it is possible to construct a meaningful MSM from these simulations.

      If the structures used for apo simulations are on average more compact or contain more tertiary contacts - then it is unsurprising that in short independent simulations they sample a smaller region of conformational space. Similarly, if the starting structures have similar dimensions - but we only observe extremely local sampling around starting structures in apo simulations in the short simulation times - it would also not be surprising that we sample a smaller amount of conformational space. By only presenting comparisons of conformational states on an uninformative VAE latent space - it is not possible for a reader to ask simple questions about how the conformational ensembles compare.

      It is noted that the authors attempt to address questions about sampling by building an MSM of a single contiguous 60us portion of the holo simulation of alpha-synuclein and Fasudil - noting that:

      "the MSM built using lesser data (and same amount of data as in water) also indicated the presence of six states of alphaS in presence of fasudil, as was observed in the MSM of the full trajectory. Together, this exercise invalidates the sampling argument and suggests that the increase in the number of metastable macrostates of alphaS in fasudil solution relative to that in water is a direct outcome of the interaction of alphaS with the small molecule."

      However, the authors present no data to support this assertion - and readers have no sense of how the conformational space sampled in this portion of the trajectory compares to the conformational space sampled in the independent apo simulations or the full holo simulation. As the analyzed 60us portion of the holo trajectory may have no overlap with conformational space sampled in the independent apo simulations - it is unclear if this control provides any information. There is no quantification of the conformational entropy of the 6 states obtained from this portion of the holo trajectory or the full conformational space sampled. No information is presented to determine if we observe similar states in the shorter portion of the holo trajectory. Furthermore - as the authors provide almost no justification for the criteria used to select the final number of macrostates for any of the MSMs reported in this work - and the number of macrostates is effectively a free parameter in the PCCA+ method, arriving at an MSM with 6 macrostates does not convey any information about the conformational entropy of alpha-synuclein in the presence or absence of ligands. Indeed - the implied timescale plot for the 60us holo MSM (Figure S2) - shows that at least 10 processes are resolved in the 120 microstate model - and there is no information provided explaining/justifying how a final 6-macrostate model was determined. The authors also do not project the conformations sampled in this sub-trajectory onto the latent space of the final VAE.

      One certainly expects that an MSM built with 1/20th of the simulation data should have substantial differences from an MSM built from the full trajectory - so failing additional information and hyperparameter justification - one wonders if the emergence of a 6-state model could be the direct result of hardcoded VAE and MSM construction hyperparameter choices.

      Required Controls For Supporting the Conclusions of the Study: The authors should initiate apo and holo simulations from the same starting structures - using the same simulation software and parameters. This could be done by adding a Fasudil ligand to the apo structures - or by removing the Fasudil ligand from a subset of holo structures. This would enable them to make apples-to-apples comparisons about the effect of Fasudil on alpha-synuclein conformational space.

      Failing to add direct apples-to-apples comparisons, which would be required to truly support the study's conclusions, the authors should at least compare the conformational space sampled in the independent apo simulations and holo simulations using standard interpretable IDP order parameters (i.e. Rg, end-to-end distance, secondary structure order parameters) and/or principal components from PCA or tICA obtained from the holo simulation. The authors should quantify the number of transitions observed between conformational states in their apo simulations. The authors could also perform more appropriate holo controls, without additional calculations, by taking batches of a similar number of short 1-4us segments of simulations used to compute the apo MSMs and examining how the parameters/macrostates of the holo MSMs vary with random selections of the input.

      In the case of IDPs, one should not bias the simulation by starting from identical structures, as an IDP does not have a defined structure and the starting configuration has little significance. It is the microenvironment that matters most. As for the choice of simulation software and parameters, we have used the same force field that was used in the holo simulation, at the same temperature and the same salt concentration. We have performed multiple independent simulations that have varying structural signatures such as Rg, SASA and secondary structure content. In fact, the starting structures for the apo simulations covered the entire span of the Rg distribution of the holo simulation, including the starting structure of the holo simulation. The simulations are unbiased with respect to the starting structure. Although the fasudil simulation was run for 1.5 ms, we should also understand that it is difficult to run a millisecond-range simulation in reasonable time from a single starting structure. It is exactly for this reason that we start with different structures, so that we do not bias ourselves and sample every possible conformation.

      We have updated the manuscript on page 33-34 and figure S1, S25-S30.

      Considering the computational expense of simulating a 1.5 ms timescale for a 140-residue IDP, we generated an ensemble from multiple short runs amounting to ~60 µs. The premise of this investigation is a widely popular method, Markov State Models (MSMs), which can be used to estimate long-timescale kinetics and stationary populations of metastable states from ensembles of short simulations. We have also demonstrated that, comparable to the apo data, when we build an MSM for asyn-fasudil (holo) using a 60 µs simulation block, the implied timescales (ITS) plot shows the same number of metastable states as for the 1.5 ms data.

      An intrinsically disordered protein (IDP) is not represented by a fixed structure. Therefore, it would be most appropriate to run multiple simulations starting from different initial structures and simulate the local environment around those structures, thus generating an ensemble that effectively samples the phase space. Accordingly, for initiating the apo simulations, instead of biasing the initial structure (using the starting structure used for simulations with fasudil), we randomly chose 23 different conformations from the 73 µs long simulation of 𝛼-synuclein monomer reported in Robustelli et al., PNAS, 115 (21), E4758-E4766. In response to the reviewer’s request for a justification of the choice of starting structures for the apo simulations, we provide a compilation of figures below comparing standard conformational properties of the chosen initial structures for the apo simulations with the starting structure of the long holo simulation; we have also provided comparative analyses of the apo (~60 µs) and holo (1.5 ms) ensemble properties.

      Figure S1 compares the Rg of the apo and holo ensembles of ~60 μs and 1.5 ms, respectively. The distributions largely overlap, indicating that the apo ensemble is comparable to the holo ensemble in terms of the extent of compaction of the conformations. In Figure 1, we have also marked the Rg values corresponding to the starting structures used to seed the apo simulations. It is evident that the 23 starting conformations chosen represent the whole range of the Rg space that is sampled in the holo ensemble. Therefore, while the apo simulations are relatively short (1-4 μs), the local sampling of these multiple starting conformations of variable compaction (Rg) ensures that the phase space is efficiently sampled and the resulting ensemble is comparable to the holo ensemble. Furthermore, the implementation of an MSM on such an ensemble can be efficiently used to identify metastable states and the long-timescale transitions happening between them.

      Another property that is proportional to Rg is the end-to-end distance of the protein conformations. Figure S2 shows that the distributions of this property in the apo and holo ensembles are highly similar.
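
      The Rg and end-to-end distance comparisons described above can be illustrated with a minimal NumPy sketch. This is not the analysis code used for the manuscript: the function names, the unweighted (equal-mass) Rg, the histogram-overlap measure and the synthetic random-walk "ensembles" are all assumptions made for illustration.

```python
import numpy as np

def radius_of_gyration(coords):
    """Unweighted Rg per frame; coords has shape (n_frames, n_atoms, 3)."""
    centered = coords - coords.mean(axis=1, keepdims=True)
    return np.sqrt((centered ** 2).sum(axis=2).mean(axis=1))

def end_to_end_distance(coords):
    """Distance between the first and last atom in each frame."""
    return np.linalg.norm(coords[:, -1] - coords[:, 0], axis=1)

def histogram_overlap(a, b, bins=50):
    """Overlap coefficient of two 1D samples (1 means identical histograms)."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    pa, _ = np.histogram(a, bins=bins, range=(lo, hi))
    pb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    return np.minimum(pa, pb).sum()

# Synthetic stand-ins for two ensembles of a 140-bead chain (hypothetical data).
rng = np.random.default_rng(0)
apo = np.cumsum(rng.normal(size=(500, 140, 3)), axis=1)
holo = np.cumsum(rng.normal(size=(500, 140, 3)), axis=1)
print("Rg overlap:", histogram_overlap(radius_of_gyration(apo), radius_of_gyration(holo)))
print("Ree overlap:", histogram_overlap(end_to_end_distance(apo), end_to_end_distance(holo)))
```

      With real trajectories, the coordinate arrays would come from the apo and holo ensembles, and a mass-weighted Rg may be preferable.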

      Figure S3 depicts another fundamental structural descriptor i.e. solvent accessible surface area (SASA) that indicates the extent of folding and the exposure of the residues. The apo ensemble only shows a minimal shift in the distribution towards higher SASA values. The distributions of the two ensembles largely overlap. 

      In Figure S25, we have provided the root mean square deviation (RMSD) of the starting structures used in the apo simulations with the structure used to start the long simulation with fasudil. The RMSD values range from 1.6 to 3 nm, indicating that the starting structures used are highly variable. This is justifiable for IDPs since they are not identified by a single, fixed structure, but rather by an array of different conformations.  
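
      As an illustration of this kind of pairwise structural comparison, the sketch below computes an RMSD after optimal rigid-body superposition (the Kabsch algorithm). Whether the reported values were computed with or without alignment, and over which atoms, is not stated in the response, so the alignment step, the function name and the random test coordinates are assumptions.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (n_atoms, 3) arrays after optimal rigid superposition."""
    P = P - P.mean(axis=0)          # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                     # 3x3 covariance of the two point sets
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])      # guard against improper rotations (reflections)
    R = Vt.T @ D @ U.T              # rotation that maps P onto Q
    P_rot = P @ R.T
    return np.sqrt(((P_rot - Q) ** 2).sum() / len(P))

# Hypothetical coordinates: a structure and a rotated, shifted copy of it.
rng = np.random.default_rng(0)
ref = rng.normal(size=(140, 3))
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
print(kabsch_rmsd(ref, ref @ rot_z.T + 5.0))
```

      The result is in the same length units as the input coordinates (nm in the values quoted above).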

      Figures S26-S28 show the fraction of the secondary structure elements i.e. helix, beta and coil in the starting structures of apo and holo simulations. All the conformations are mostly disordered in nature with the greatest extent of coil content. The helix content ranges from 3-10 % while sheet content varies from 3-15 % in the initial simulation structures. 

      Figures S4-S6 represent the residue-wise percentage of secondary structure elements (helix, beta and coil) in the apo and holo ensembles. It is evident that the extent of secondary structure is comparable in the two ensembles. 

      The above analyses comparing distributions of several structural features clearly indicate that the apo simulations we performed from different starting structures have effectively sampled the phase space as the single long simulation of the holo system.

      We have discussed the above in the manuscript: Computational Methods section, Page 33-34.

      The above VAMP score analyses (Figures S7 and S8) have now been presented in the manuscript: Results and Discussion (Page 8).

      Building the MSM

      While building the MSM, we iteratively varied the hyperparameters to build a reasonable model. In this process, we explored different values of the number of clusters, maximum number of iterations, tolerance, stride, metric, seed, chunk size and initialization methods. There is no possible way to perform an optimization on the choice of the above hyperparameters using gradient descent methods, as no convergence would be guaranteed. The parameters were tuned carefully so that we get the best possible implied timescales of the system. The quality of the MSM was further validated using the Chapman-Kolmogorov (CK) test on a state-by-state basis i.e by considering the transitions between each pair of the metastable states. In addition, we have built the contact maps to show that the states are mutually exclusive. This is also justified by the latent space of denoising convolutional variational autoencoders.
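
      The idea behind the Chapman-Kolmogorov test mentioned above can be sketched in a few lines: a transition matrix estimated at lag τ, raised to the k-th power, should agree with the matrix estimated directly at lag kτ. The sketch below uses a simple non-reversible, row-normalized count estimator on microstate labels; the estimator, the function names and the synthetic two-state chain are assumptions, not the procedure used in the manuscript, which applied the test between pairs of metastable states.

```python
import numpy as np

def transition_matrix(dtraj, n_states, lag):
    """Row-normalized sliding-window count matrix at the given lag (in frames)."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

def ck_test_error(dtraj, n_states, lag, k):
    """Largest absolute difference between T(lag)^k and T(k*lag)."""
    predicted = np.linalg.matrix_power(transition_matrix(dtraj, n_states, lag), k)
    estimated = transition_matrix(dtraj, n_states, k * lag)
    return np.abs(predicted - estimated).max()

# Hypothetical two-state Markov chain used only to exercise the functions.
rng = np.random.default_rng(0)
true_T = np.array([[0.95, 0.05], [0.10, 0.90]])
dtraj = np.zeros(20000, dtype=int)
for i in range(1, len(dtraj)):
    dtraj[i] = rng.choice(2, p=true_T[dtraj[i - 1]])
print(ck_test_error(dtraj, n_states=2, lag=1, k=5))
```

      A small error across several values of k is what one would look for; a systematic deviation suggests the lag time is too short for the chosen discretization.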

      We have compared the conformational space in the independent apo and holo simulations for Rg, Ree, SASA and secondary structure. As for PCA/tICA, we have computed the VAMP-2 score for tICA and found it to be low compared with that of the VAE. In fact, neural networks have previously been shown to be a better dimension reduction technique than linear methods such as PCA or tICA owing to their non-linearity.

      Author response image 6.

      Distribution of (a) Rg, (b) Ree and (c) SASA of the apo ensemble and a 60 μs slice of the holo simulation trajectory. (d) ITS plot of the 60 μs chunk.

      First, someone familiar with MSM should understand that the basic philosophy of MSM is not the requirement of long simulation trajectories, which would defeat the purpose of its usage. Rather as motivated by Noe and coworkers in seminal PNAS (vol. 106, page 9011, year 2009) paper, MSM plays an important role in inferring long-time scale equilibrium properties by using significantly short-length scale non-equilibrium trajectories. 
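
      To make this point concrete, the sketch below pools transition counts from several short discrete trajectories (never counting across trajectory boundaries) and then extracts stationary populations from the resulting transition matrix. It is a simplified, non-reversible estimator on toy data; the function names and the example trajectories are assumptions for illustration only.

```python
import numpy as np

def pooled_transition_matrix(dtrajs, n_states, lag):
    """Transition matrix from many short trajectories; counts never span two trajectories."""
    counts = np.zeros((n_states, n_states))
    for dtraj in dtrajs:
        dtraj = np.asarray(dtraj)
        if len(dtraj) > lag:
            np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

# Hypothetical short trajectories started in different states.
short_runs = [[0, 0, 0, 1, 1], [1, 1, 0, 0, 0], [1, 1, 1, 1, 0]]
T = pooled_transition_matrix(short_runs, n_states=2, lag=1)
print(T)
print(stationary_distribution(T))
```

      The stationary populations come from the eigenvector of the pooled matrix rather than from the raw frame counts, which is why the individual short runs need not be equilibrated, provided the state space is connected and adequately sampled.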

      Considering the difference in the size of the ensembles in the apo and holo simulations, we verified how different is the MSM built using 60 μs slice of the data from the 1.5 ms holo simulation in terms of the number of metastable states identified by the model. For this, we considered 60 μs data beginning from 966 μs - 1026 μs. First, we compared the gross structural properties of these datasets. Author response image 6a-c compares the distributions of Rg, Ree and SASA. The distributions show that the apo and holo simulations are very similar with respect to these standard properties of protein conformations. 

      We built the MSM for this 60 μs data of the holo ensemble from the reduced data obtained from the same VAE model. We would like to clarify that the hyperparameters of the model are not hardcoded but rather carefully fine-tuned to obtain a good model that performs good kinetic discretization of the underlying macrostates. The implied timescale plot of this new MSM shows distinct timescales corresponding to six macrostates. This led us to conclude that the six-state model is robust despite the differences in the ensemble size. The implied timescale is shown in Author response image 6d.

      The above analyses in Author response image 6 are presented in Results and Discussion, Page 13. 

      Major Weakness 2: There is little justification of how the hyperparameters MSMs were selected. It is unclear if the results of the study depend on arbitrary hyperparameter selections such as the final number of macrostates in each model.

      It is unclear what criteria were used to determine the appropriate number of microstates and macrostates for each MSM. Most importantly - as all analyses of water entropy and conformational entropy are restricted to the final macrostates - the criteria used to select the final number of macrostates with the PCCA+ are extremely important to the results of the conclusions of the study. From examining the ITS plots in Figure 3 - it seems both MSMs show the same number of resolved processes (at least 11) - suggesting that a 10-state model could be appropriate for both systems. If one were to simply select a large number of macrostates for the 20x longer holo simulation - do these states converge to the same conformational entropy as the states seen in the short apo simulations? Is there some MSM quality metric used to determine what number of macrostates is more appropriate?

      Required Controls For Supporting the Conclusions of the Study: The authors should specify the criteria used to determine the appropriate number of microstates and macrostates for their MSMs and present controls that demonstrate that the conformational entropies calculated for their final states are not simply a function of the ratio of the number macrostates chosen to represent very disparate amounts of conformational sampling.

      VAMP-2 score was used to determine the number of microstates. We have calculated the VAMP2 score by varying the number of microstates, ranging from 10 to 220. We find that the VAMP-2 score has saturated at a higher number of microstates for both apo and holo simulations.
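
      For readers unfamiliar with the metric, a VAMP-2 score for a given discretization can be sketched as the squared Frobenius norm of the whitened time-lagged covariance matrix built from one-hot state indicators. The function below is an illustrative NumPy version, not the implementation used in the manuscript; it scores the training data only (in practice the score is usually cross-validated), and the function name and toy trajectory are assumptions.

```python
import numpy as np

def vamp2_score(dtraj, n_states, lag, eps=1e-10):
    """VAMP-2 score of a discrete trajectory using one-hot state indicators.

    The trivial singular value of 1 (the stationary process) is included,
    so the score is at least 1 and at most the number of populated states.
    """
    X = np.eye(n_states)[dtraj[:-lag]]   # indicators at time t
    Y = np.eye(n_states)[dtraj[lag:]]    # indicators at time t + lag
    n = len(X)
    c00 = X.T @ X / n
    c0t = X.T @ Y / n
    ctt = Y.T @ Y / n

    def inv_sqrt(c):
        # Pseudo-inverse square root; unpopulated states are dropped.
        evals, evecs = np.linalg.eigh(c)
        keep = evals > eps
        return evecs[:, keep] @ np.diag(evals[keep] ** -0.5) @ evecs[:, keep].T

    k = inv_sqrt(c00) @ c0t @ inv_sqrt(ctt)
    return np.linalg.norm(k, "fro") ** 2

# Hypothetical scan: an uncorrelated trajectory scored at several state counts.
rng = np.random.default_rng(0)
for n_states in (2, 5, 10):
    dtraj = rng.integers(0, n_states, size=5000)
    print(n_states, vamp2_score(dtraj, n_states, lag=1))
```

      In a scan such as the 10 to 220 microstates described above, one would re-cluster the data at each state count and look for the point where the score levels off.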

      The number of macrostates was determined by the gap between the lines of the implied timescales plot, followed by a CK test (shown in figure S1). Since we plotted the 10 slowest timescales, the implied timescales plot shows 10 timescales; this is not an indicator of the number of macrostates. The macrostates are separated by distinct gaps in the timescales, which do not merge as seen beyond 5 timescales in the plot. The timescales, when leveled off and distinct, indicate that the system has well-defined metastable states and that the MSM is accurate in identifying the macrostates. We find the number of macrostates to be three and six for the apo and holo simulations, respectively, from the corresponding implied timescales.
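
      The relation between transition-matrix eigenvalues and implied timescales, t_i = -τ / ln|λ_i(τ)|, and the idea of reading the number of macrostates from the largest gap, can be sketched as follows. The gap heuristic, the function names and the example numbers are assumptions for illustration; in practice the choice is made by inspecting the ITS plot across lag times and confirming with a CK test.

```python
import numpy as np

def implied_timescales(T, lag, k=10):
    """Slowest k implied timescales t_i = -lag / ln|lambda_i|, skipping lambda_1 = 1."""
    evals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    nontrivial = evals[1:k + 1]
    nontrivial = nontrivial[(nontrivial > 0) & (nontrivial < 1)]
    return -lag / np.log(nontrivial)

def n_macrostates_from_gap(timescales):
    """Number of macrostates implied by the largest ratio between successive timescales.

    A gap after the m-th slowest process means m slow processes, i.e. m + 1 macrostates.
    """
    ratios = timescales[:-1] / timescales[1:]
    return int(np.argmax(ratios)) + 2

# Hypothetical two-state transition matrix and a made-up set of timescales.
T = np.array([[0.9, 0.1], [0.1, 0.9]])
print(implied_timescales(T, lag=1))
print(n_macrostates_from_gap(np.array([100.0, 90.0, 5.0, 4.0])))
```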

      The above is discussed in Computational Methods, Page 37-38.

      Major Weakness 3: The use of variational autoencoders (VAEs) obscures insights into the underlying conformational ensembles of apo and holo alpha-synuclein rather than providing new ones

      No rationale is offered for the selection of the VAE architecture or hyperparameters used to reduce the dimensionality of alpha-synuclein conformational space.

      It is not clear the VAEs employed in this study are providing any new insight into the conformational ensembles and binding mechanisms of Fasudil to alpha-synuclein, or if the underlying latent space of the VAEs are more informative or kinetically meaningful than standard linear dimensionality reduction techniques like PCA and tICA. The initial VAE is used to reduce the dimensionality of alpha-synuclein conformational ensembles to 2 degrees of freedom - but it is unclear if this projection is structurally or kinetically meaningful. It is not clear why the authors chose to use a 2-dimensional projection instead of a higher number of dimensions to build their MSMs. Can they produce a more kinetically and structurally meaningful model using a higher dimensional VAE latent space?

      Additionally - it is not clear what insights are provided by the Denoising Convolutional Variational Autoencoder. The authors appear to be noising-and-denoising the contact maps of each macrostate, and then projecting the denoised values onto a new latent space - and commenting that they are different. Does this provide additional insight that looking at the contact maps in Figures 4&5 does not? Is this more informative than examining the distribution of the Radii of gyration or the secondary structure propensities of each ensemble? It is not clear what insight this analysis adds to the manuscript.

      Suggested controls to improve the study: The authors should project interpretable IDP structural descriptors (i.e., secondary structure, radius of gyration, secondary structure content, # of intramolecular contacts, # of intermolecular contacts between alpha-synuclein and Fasudil) onto this latent space to illustrate if any of these properties are meaningfully separated by the VAE projection. The authors should compare these projections, and MSMs built from these projections, to projections and MSMs built from projections using standard linear dimensionality projection techniques like PCA and tICA.

      We have already pointed out the IDP structural parameters for the first question.

      In the case of the VAE, the latent space captures the underlying pattern of the higher dimensional data. A non-linear projection using a VAE has been shown to have a higher VAMP-2 score than linear dimension reduction methods such as tICA. The latent space of the VAE was then used to build the MSM, in order to obtain the macrostates and the transition timescales among them. We can project the data onto a higher dimension, but the goal is to reduce it to lower dimensions where it will be easier to interpret. A higher number of dimensions would also risk overfitting: instead of learning the pattern, the model may simply memorize the data. The training and validation loss curves from the VAE have reached the order of 10^-4, thereby indicating good reconstruction of the original data.

      As for dimension reduction using tICA, the VAMP-2 score confirms that our VAE model performs better than tICA. This manuscript uses deep neural networks to understand the structural and kinetic process of IDP and small molecule interaction. Dimension reduction using tICA would give different reaction coordinates, and an MSM built using the tICA-projected data would not be one-to-one comparable with that obtained from the VAE.

      We had to perform noising, as we had only 9 contact maps. This led to overfitting of the CVAE model. To overcome this problem, we have introduced white noise to our data, so as to prevent the model from overfitting. The objective of the DCVAE model was to see how distinct these contact maps are based on their locations on a lower dimensional space. A visual inspection of the ensemble averaged contact map, especially for IDPs is much more difficult as compared to folded proteins. So, even before computing the Rg, Ree, SASA or secondary structure, this lower dimensional space will give us a preliminary idea of how each macrostate is different from every other.
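
      The noising step described above can be pictured as a simple data augmentation: each of the few contact maps is copied several times with added Gaussian white noise, and the clean maps serve as reconstruction targets for a denoising model. The sketch below shows only this augmentation, not the DCVAE itself; the number of copies, the noise level, the clipping to [0, 1] and the random stand-in maps are assumptions.

```python
import numpy as np

def noisy_copies(contact_maps, n_copies, sigma, seed=0):
    """Return (noisy inputs, clean targets) with n_copies noised versions of each map."""
    rng = np.random.default_rng(seed)
    maps = np.asarray(contact_maps, dtype=float)
    clean = np.repeat(maps, n_copies, axis=0)              # targets for denoising
    noise = rng.normal(0.0, sigma, size=clean.shape)       # Gaussian white noise
    noisy = np.clip(clean + noise, 0.0, 1.0)               # keep contact probabilities valid
    return noisy, clean

# Hypothetical stand-ins for nine 140 x 140 ensemble-averaged contact maps.
maps = np.random.default_rng(1).random((9, 140, 140))
noisy, clean = noisy_copies(maps, n_copies=100, sigma=0.05)
print(noisy.shape, clean.shape)
```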

      As for the distribution of Rg, we have plotted it in Author response image 7. The residue-wise percentage secondary structure is plotted in figure S4-S6  for the holo and apo simulation respectively.

      Author response image 7.

      Distribution of radius of gyration for the three and six macrostates in the apo and holo simulation respectively.

      As for training a model with a higher number of latent dimensions, we have retrained a VAE model with four dimensions in the latent space. The loss was of the order of 10^-4. We built an MSM with the appropriate number of microstates and found the presence of six macrostates, as evident from the ITS plot shown in Figures S14 and S15.

      This data is presented in Results and Discussion, Page 13

      Major Weakness 4: The MSMs produced in this study have large discrepancies with MSMs previously produced on the same dataset by the same authors that are not discussed.

      Previously - two of the authors of this manuscript (Menon and Mondal) authored a preprint titled "Small molecule modulates α-synuclein conformation and its oligomerization via Entropy Expansion" (https://www.biorxiv.org/content/10.1101/2022.10.20.513005v1.full) that analyzed the same 1500us holo simulation of alpha-synuclein binding Fasudil. In this study - they utilized the variational approach to Markov processes (VAMP) to build an MSM using a 1D order parameter as input (the radius of gyration), first discretizing the conformational space into 300 microstates before similarly building a 6 macrostate model. From examining the contact maps and secondary structure propensities of the holo MSMs from the current study and the previous study- some of the macrostates appear similar, however there appear to be orders of magnitude differences in the timescales of conformational transitions between the two models. The timescales of conformational transitions in the previous MSM are on the order of 10s of microseconds, while the timescales of transitions in this manuscript are 100s-1000s microseconds. In the previous manuscript, a 3 state MSM is built from an apo α-synuclein obtained from a continuous 73ms unbiased MD simulation of alpha-synuclein run at a different salt concentration (100mM) and an additional 33 ms of shorter simulations. The apo MSM from the previous study similarly reports very fast timescales of transitions between apo states (on the order ~1ms) - while the MSM reported in the current study (Figure 9) are on the order of 10s-100s of microseconds).

      These discrepancies raise further concerns that the properties of the MSMs built on these systems are extremely sensitive to the chosen projection methods and MSM modeling choices and hyperparameters, and that neither model may be an accurate description of the true underlying dynamics

      Suggestions to improve the study: The authors should discuss the discrepancies with the MSMs reported in their previous studies.

      In the previous preprint, the radius of gyration was used as the collective variable to build the MSM. In this manuscript, we have used a much more general collective variable: pairwise distances reduced using a VAE. Firstly, the collective variables used to build the models in the two works are therefore different. Secondly, for the 73 μs apo simulation in the previous manuscript, the salt concentration was 100 mM, whereas in this work we have used a salt concentration of 50 mM, the same as in the holo simulations. Since the two simulation conditions differ in salt concentration, the conformational space sampled will be different, and this will be reflected in the features of the metastable states and the associated transition kinetics. Thirdly, the lag time at which the MSM was built was 3.6 ns in the previous manuscript, whereas in this work we have used 32 ns. The lag times thus differ by roughly a factor of 10, and the order of the timescales has changed accordingly. In short, both the collective variable and the lag time at which the system reaches Markovianity differ between the two studies, and hence the timescales of transition among the macrostates also differ. Because of these differences, it would not be correct to directly compare the results obtained from the two investigations.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      To highlight the role of the entropic expansion mechanism, I would suggest modifying the title to capture this result, for example: "An Integrated Machine Learning Approach Delineates an Entropic Expansion Mechanism for the Binding of a Small Molecule to α-Synuclein".

      We have changed the title as suggested by the reviewer.

      To my knowledge the binding of fasudil to alpha-synuclein has been shown in the simulations by Robustelli et al (JACS 2022), but the experimental evidence is less clear cut. If an experimental binding affinity and the effect on alpha-synuclein aggregation have been measured, they should be reported.

      Reviewer #2 (Recommendations For The Authors):

      We thank the reviewer for the careful evaluation of our manuscript and providing comments and questions that we have attempted to address and incorporate. 

      Minor

      Abstract:

      In "which is able to statistically distinguish fuzzy ensemble", what does the word "statistically" mean in this context? Do the authors present evidence that the two ensembles are statistically different, and if so in what ways?

      We have analyzed the apo and holo ensembles of aSyn using the framework of Markov State Models, which provides the stationary populations of the states that the model identifies. For this reason, we have used ‘which is able to statistically distinguish fuzzy ensemble’ as we compare and contrast the metastable states that we resolve using MSM. The MSM provides metastable states which are identified through statistical analysis of the transitions between states (transition probability matrix). We characterize their structural features to distinguish them which gives a meaningful interpretation of the fuzzy ensemble.

      Abstract:

      What does "entropic ordering" mean?

      We thank the reviewer for pointing this out. Here, we mean that the presence of the small molecule only affects the protein backbone entropy while the entropy of water is not affected in the simulations with fasudil. We will rewrite this more clearly in the abstract. 

      The changed sentence is as follows: 

      “A thermodynamic analysis indicates that small-molecule modulates the structural repertoire of αS by tuning protein backbone entropy, however the entropy of the water remains unperturbed.”

      Abstract:

      What does "offering insights into entropic modulation" mean?

      In this investigation, we first discretized the ensemble of a small-molecule binding/interacting with a disordered aSyn into the underlying metastable states, followed by characterisation of these identified states. As small molecule interactions can affect the overall entropy of the IDP, we estimated the said effect of fasudil binding on aSyn. We find that small molecule binding effect is manifested in the protein backbone entropy and the solvent entropy is not affected. Through this work, we highlight these insights into the modulatory effect that fasudil brings about in the entropy of the system (entropic modulation).

      p. 3/4:

      When the authors write "However, a routine comparison of monomeric αS ensemble... ensemble" it is unclear whether they are referring to previous work (they only cite a paper with simulations of "apo" aSYN), and if so which. Do they mean Ref 32? Also, the word "routine" sounds odd in this context.

      We thank the reviewer for pointing this out. We compared the ensemble properties (such as the distributions of the radius of gyration, end-to-end distance, solvent accessible surface area and secondary structure properties) of the ɑ-synuclein monomer ensemble that we generated in neat water with those of the ensemble of ɑ-synuclein in the presence of the small molecule fasudil reported in Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). We have now modified this sentence in the main manuscript as follows: (Page no 3)

      “However, comparison of the global and local structural features of the αS ensemble in neat water and that in the presence of fasudil [32] (see Figure S1-S6) did not indicate a significant difference that is a customary signature of the dynamic IDP ensemble.”

      p. 4:

      Regarding "Integrative approaches are therefore gaining importance in IDP studies", these kinds of integrative approaches have been used for 20 years for studies of IDPs (with increasing sophistication and success), so I think "gaining" is somewhat of a stretch.

      We thank the reviewer for this comment. We agree with the reviewer and have now changed this sentence  as follows:

      “Integrative approaches have been exploited in studying IDPs as well as small-molecule binding to IDPs.”

      p. 5:

      What does "large scale" mean in "This study showed no large-scale differences between the bound and unbound states of αS"? Do the authors mean substantially/significantly different, or differences on a large (length) scale?

      Here, we refer to the study of small molecule (fasudil) binding to α-synuclein reported in Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). In this study, the authors report no substantial (“large scale”) differences between the conformational ensembles of α-synuclein in the bound and unbound states, for example in the backbone conformation distributions.

      p. 6:

      The authors write "In a clear departure from the classical view of ligand binding to a folded globular protein, the visual change in αS ensemble due to the presence of small molecule is not so strikingly apparent." I don't understand this. Normally, there is very little difference between apo and holo protein structures for folded proteins, so I don't understand the "in a clear departure" part. This seems like a strawman. Of course, for folded proteins one can generally see the ligand bound, but here the authors are talking about the protein.

      In case of folded proteins, the overall tertiary structure of the protein remains mostly the same upon binding of the ligand. Structural changes are localized in nature and primarily around the binding site. However, in case of ⍺Syn, binding of fasudil is transient and not as strong as seen for folded proteins. “Clear departure” refers to the fact that for ⍺Syn, binding of fasudil is more subtle and dispersed across the ensemble of conformations rather than localized changes as in case of folded proteins.

      p. 6:

      I don't think the term "data-agnostic" makes sense since these methods are based on data and also make some assumptions about how the data can/should be used.

      We have replaced this term with “model-agnostic”.

      p. 16:

      How are contacts defined; please add to caption.

      A contact is considered if the Cα atoms of two residues are within a distance of 8 Å of each other. We have updated the caption with this information in Figures 4 and 5.  
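
      The contact definition above (Cα atoms of two residues within 8 Å) translates directly into an ensemble-averaged contact map. The sketch below is illustrative only: the function name is hypothetical, coordinates are assumed to be in Å, and the random-walk coordinates stand in for real Cα positions.

```python
import numpy as np

def contact_probability_map(ca_coords, cutoff=8.0):
    """Fraction of frames in which each pair of C-alpha atoms lies within `cutoff`.

    ca_coords has shape (n_frames, n_residues, 3), in the same units as cutoff.
    """
    diff = ca_coords[:, :, None, :] - ca_coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)        # (n_frames, n_res, n_res)
    return (dist <= cutoff).mean(axis=0)

# Hypothetical coordinates for 200 frames of a 140-residue chain (3.8 A steps).
rng = np.random.default_rng(0)
steps = rng.normal(size=(200, 140, 3))
steps *= 3.8 / np.linalg.norm(steps, axis=-1, keepdims=True)
ca = np.cumsum(steps, axis=1)
cmap = contact_probability_map(ca)
print(cmap.shape, cmap[0, 1], cmap[0, 139])
```

      For long trajectories the all-pairs distance array can become large, so frames would typically be processed in chunks and the per-chunk counts accumulated.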

      p. 20:

      What do the authors mean by "non-specific interactions" in this context?

      The interactions of fasudil are predominantly with the negatively charged residues in the C-terminal region of ⍺Syn via charge-charge and π-stacking interactions (Robustelli et.al. (Journal of the American Chemical Society, 144(6), pp.2501-2510)).

      In addition, in some metastable states that we identify, we also observe transient interactions with residues in the hydrophobic NAC region and N-terminal region. We refer to these transient interactions as “non-specific” interactions.

      p. 27:

      Are the axes of Fig. 9c/d z1 and z2?

      Yes. The axes are z1 and z2.

      Smaller than minor

      Abstract:

      Rephrase "In particular, the presence of fasudil in milieu"

      We have rephrased the sentence as follows: 

      “In particular, the presence of fasudil in the solvent…”

      p. 4:

      What does the word "potentially" do in "ensemble of conformations potentially sampled"?

      Here, by potentially, we mean the various conformations that the protein can adopt, subject to the environmental conditions. 

      p. 10:

      "we trained a large array of inter-residue pairwise distances"

      The distances were not trained; please reformulate

      We have corrected this sentence as follows:  

      “We trained a VAE model using a large array of inter-residue pairwise distances.”

      p. 13:

      N/C-terminal -> terminus (or in the C-terminal region)

      We have made the changes in the manuscript at the required places. 

      p. 20:

      Precedent -> previous (?)

      We have made the change in the manuscript. 

      p. 30:

      As far as I understand, Anton does not use GPUs and does not run Desmond.

      We thank the reviewer for providing this information. We referred to the original paper of the ⍺syn-fasudil simulations (Robustelli et.al. (Journal of the American Chemical Society, 144(6), pp.2501-2510)). The authors have performed equilibration with GPU/Desmond and used Anton for production runs. We have modified this sentence as:

      “A 1500 μs long all-atom MD simulation trajectory of αS monomer in aqueous fasudil solution was simulated by D. E. Shaw Research with the Anton supercomputer that is specially purposed for running long-time-scale simulations.” on page 31

      References : 

      (1) Schütte C, Fischer A, Huisinga W, Deuflhard P (1999) A direct approach to conformational dynamics based on hybrid Monte Carlo. J Comput Phys 151:146–168.

      (2) Chodera JD, Swope WC, Pitera JW, Dill KA (2006) Long-time protein folding dynamics from short-time molecular dynamics simulations. Multiscale Model Simul 5:1214–1226.

    1. eLife Assessment

      This important manuscript demonstrates that UGGT1 is involved in preventing the premature degradation of endoplasmic reticulum (ER) glycoproteins through the re-glucosylation of their N-linked glycans following release from the calnexin/calreticulin lectins. The authors include a wealth of convincing data in support of their findings, although extending these findings to other types of substrates, such as secreted proteins, could further demonstrate the global importance of this mechanism for protein trafficking through the secretory pathway. This work will be of interest to scientists interested in ER protein quality control, proteostasis, and protein trafficking.

    2. Reviewer #1 (Public review):

      Summary:

      Utilizing UGGT1-KO cells and a number of different ERAD substrates, the authors show that UGGTs are involved in preventing the premature degradation of misfolded glycoproteins. They proposed a concept by which the fate of glycoproteins can be determined by a tug-of-war between UGGTs and EDEMs.

      Strengths:

      The authors provided a wealth of data to indicate that UGGT1 competes with EDEMs, which promotes the glycoprotein degradation.

    3. Reviewer #2 (Public review):

      In this study, Ninagawa et al. shed light on UGGT's role in ER quality control of glycoproteins. By utilizing UGGT1/UGGT2 DKO, they demonstrate that several model misfolded glycoproteins undergo early degradation. One such substrate is ATF6alpha where its premature degradation hampers the cell's ability to mount an ER stress response.

      This study convincingly demonstrates that many unstable misfolded glycoproteins undergo accelerated degradation without UGGTs. Also, this study provides evidence of a "tug of war" model involving UGGTs (pulling glycoproteins to being refolded) and EDEMs (pulling glycoproteins to ERAD).

      The study explores the physiological role of UGGT, particularly examining the impact of ATF6α in UGGT knockout cells' stress response. The authors further investigate the physiological consequences of accelerated ATF6α degradation, convincingly demonstrating that cells are sensitive to ER stress in the absence of UGGTs and unable to mount an adequate ER stress response.

      These findings offer significant new insights into the ERAD field, highlighting UGGT1 as a crucial component in maintaining ER protein homeostasis. This represents a major advancement in our understanding of the field.

    4. Reviewer #3 (Public review):

      This valuable manuscript demonstrates the long-held prediction that the glycosyltransferase UGGT slows degradation of endoplasmic reticulum (ER)-associated degradation substrates through a mechanism involving re-glucosylation of asparagine-linked glycans following release from the calnexin/calreticulin lectins. The evidence supporting this conclusion is solid using genetically-deficient cell models and well established biochemical methods to monitor the degradation of trafficking-incompetent ER-associated degradation substrates, although this could be improved by better defining of the importance of UGGT in the secretion of trafficking competent substrates. This work will be of specific interest to those interested in mechanistic aspects of ER protein quality control and protein secretion.

      The authors have largely addressed my comments from the previous round of review. The only remaining comment is about defining the impact of UGGT1 in the regulation of secretion-competent proteins, which the authors indicate they will continue to pursue in subsequent work, which is fine, but remains a minor limitation of the study.

      As I mentioned in my previous review, I think that this work is interesting and addresses an important gap in experimental evidence supporting a previously asserted dogma in the field. I do think that the authors would be better suited for highlighting the limitations of the study, as discussed above. Ultimately, though, this is an important addition to the literature.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      The authors show that UGGTs are involved in preventing premature degradation of misfolded glycoproteins, using UGGT1-KO cells and a number of different ERAD substrates. They propose a concept by which the fate of glycoproteins can be determined by a tug-of-war between UGGTs and EDEMs. 

      Strengths: 

      The authors provided a wealth of data to indicate that UGGT1 competes with EDEMs, which promote glycoprotein degradation. 

      Weaknesses: 

      NA 

      We appreciate your comment.

      Reviewer #2 (Public review): 

      In this study, Ninagawa et al. shed light on UGGT's role in ER quality control of glycoproteins. By utilizing UGGT1/UGGT2 DKO cells, they demonstrate that several model misfolded glycoproteins undergo early degradation. One such substrate is ATF6alpha, whose premature degradation hampers the cell's ability to mount an ER stress response. 

      This study convincingly demonstrates that many unstable misfolded glycoproteins undergo accelerated degradation without UGGTs. Also, this study provides evidence of a "tug of war" model involving UGGTs (pulling glycoproteins to being refolded) and EDEMs (pulling glycoproteins to ERAD). 

      The study explores the physiological role of UGGT, particularly examining the impact of ATF6α in UGGT knockout cells' stress response. The authors further investigate the physiological consequences of accelerated ATF6α degradation, convincingly demonstrating that cells are sensitive to ER stress in the absence of UGGTs and unable to mount an adequate ER stress response. 

      These findings offer significant new insights into the ERAD field, highlighting UGGT1 as a crucial component in maintaining ER protein homeostasis. This represents a major advancement in our understanding of the field. 

      Thank you very much for your comment.

      Reviewer #3 (Public review): 

      This valuable manuscript demonstrates the long-held prediction that the glycosyltransferase UGGT slows degradation of endoplasmic reticulum (ER)-associated degradation substrates through a mechanism involving re-glucosylation of asparagine-linked glycans following release from the calnexin/calreticulin lectins. The evidence supporting this conclusion is solid using genetically-deficient cell models and well established biochemical methods to monitor the degradation of trafficking-incompetent ER-associated degradation substrates, although this could be improved by better defining of the importance of UGGT in the secretion of trafficking competent substrates. This work will be of specific interest to those interested in mechanistic aspects of ER protein quality control and protein secretion. 

      The authors have attempted to address my comments from the previous round of review, although some issues still remain. For example, the authors indicate that it is difficult to assess how UGGT1 influences degradation of secretion competent proteins, but this is not the case. This can be easily followed using metabolic labeling experiments, where you would get both the population of protein secreted and degraded under different conditions. Thus, I still feel that addressing the impact of UGGT1 depletion on the ER quality control for secretion competent protein remains an important point that could be better addressed in this work. 

      We mainly focused on the impact of UGGT1 depletion on ERAD in this paper and intend to determine the impact of UGGT1 depletion on the ER quality control for secretion competent protein in the near future.

      Further, in the previous submission, the authors showed that UGGT2 depletion demonstrates a similar reduction of ATF6 activation to that observed for UGGT1 depletion, although UGGT2 depletion does not reduce ATF6 protein levels like what is observed upon UGGT1 depletion. In the revised manuscript, they largely remove the UGGT2 data and only highlight the UGGT1 depletion data. While they are somewhat careful in their discussion, the implication is that UGGT1 regulates ATF6 activity by controlling its stability. The fact that UGGT2 has a similar effect on activity, but not stability, indicates that these enzymes may have other roles not directly linked to ATF6 stability. It is important to include the UGGT2 data and explicitly highlight this point in the discussion. It's fine to state that figuring out this other function is outside the scope of this work, but removing it does not seem appropriate.

      We have added the data of UGGT2-KO and UGGT-DKO cells to Figure 4 and discussed appropriately.

      As I mentioned in my previous review, I think that this work is interesting and addresses an important gap in experimental evidence supporting a previously asserted dogma in the field. I do think that the authors would be better suited for highlighting the limitations of the study, as discussed above. Ultimately, though, this is an important addition to the literature. 

      We appreciate your comments. Thank you very much.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations for the authors): 

      I have carefully gone through the revised manuscript and responses to the reviewers' comments; I believe that the authors did a great job on revisions, and I do think that now this manuscript has been much improved (far easier to read through). Now I have only minor comments as follows; 

      Page 9: Lines 8-9; Comparison between WT and EDEM-TKO cells indicates that ATF6alpha is still degraded via gpERAD requiring mannose trimming even in the presence of DNJ (Fig. 1D). (it would be better to indicate which figure to look) 

      We have fixed it.

      Page 10: Lines 9-11; as multiple higher molecular weight bands (representing a mixture of G3M9, G2M9, and GM9 etc.) in WT cells treated with CST -> I am NOT AT ALL convinced by this statement (Figure 1-figure supplement 6A). How can the subtle glycan structure difference cause the ladder of the band? And if it is indeed the case (which I frankly doubt by the way), will endo-alpha-mannosidase treatment end up with a single band for CST? And can PNGase F digestion cancel all size differences between samples (control, +DNJ and +CST)? 

      CD3d-DTM-HA is a small protein (~20 kDa) possessing three N-glycans. Clear increase in the level of GM9 in WT cells treated with DNJ (Figure 1-Figure supplement 5A) caused an upward band shift (Figure 1-Figure supplement 6A). Similarly, clear increase in the levels of GM9, G2M9, G3M9 in WT cells treated with CST (Figure 1-Figure supplement 6B) produced the ladder of the band (Figure 1-Figure supplement 6A).

      Crystal violet assay (new Fig 4G; Page 33); It said that, after treating cells with drug (Tg) for 4 hours, cells were spread on 24 well plates and cultured without Tg for 5 days. If incubated that long, I wonder that any compromised viability may have been canceled by growing cells (cells become confluent no matter what?). Am I missing something? Please clarify. 

      We employed a previously published method to determine ER stress sensitivity (Yamamoto et al., Dev. Cell, 2007). Although any compromised viability may have been canceled by growing cells, as suggested, we were able to detect the difference between WT and UGGT-KO cells.

      Figure 5D; why one of the three N-glycans is missing on the last protein?? 

      We have fixed it.

    1. eLife Assessment

      This work is an important contribution to understanding the role of FGF signaling in the induction of primitive streak-like cells in a 2D system of human gastrulation. The authors provide compelling evidence showing that endogenous FGF ligands, acting through FGF receptors localized basolaterally, are determinant in the acquisition of a primitive streak cell fate. These observations will be of broad relevance to the FGF field.

    2. Reviewer #1 (Public review):

      Summary:

      This is an interesting study on the role of FGF signaling in the induction of primitive streak-like cells (PS-LC) in human 2D-gastruloids. The authors use a previously characterized standard culture that generates a ring of PS-LCs (TBXT+) and correlate this with pERK staining. A requirement for FGF signaling in TBXT induction is demonstrated via pharmacological inhibition of MEK and FGFR activity. A second set of culture conditions (with no exogenous FGFs) suggests that endogenous FGFs are required for pERK and TBXT induction. The authors then characterize, via scRNA-seq, various components of the FGF pathway (genes for ligands, receptors, ERK regulators, and HSPG regulation). They go on to characterize the pFGFR1, receptor isoforms, and polarized localization of this receptor. Finally, they perform FGF4 inhibition and use a cell line with a limited FGF17 inactivation (heterozygous null) and show that loss of these FGFs reduces PS-LC and derivative cell types.

      Strengths:

      (1) As the authors point out, the role of FGF signaling in gastrulation is less well understood than other signaling pathways. Hence this is a valuable contribution to that field.

      (2) The FGF4 and FGF17 loss-of-function experiments in Figure 5 are very intriguing. This is especially so given the intriguing observation that these FGFs appear to be dominating in this model of human gastrulation, in contrast to what FGFs dominate in mice, chicks, and frogs.

      (3) In general this paper is valuable as a further development of the human gastruloid system and the role of FGF signaling in the induction of PS-LCs. The wide net that the authors cast in characterizing the FGF ligand gene, receptor isoforms, and downstream components provides a foundation for future work. As the authors write near the beginning of the Discussion "Many questions remain."

      Weaknesses:

      (1) FGFs are cell survival factors in various aspects of development. The authors fail to address cell death due to loss of FGF signaling in their experiments. For example, in Figure 1E (which requires statistical analysis) and 1G (the bottom FGFRi row), there appears to be a significant amount of cell loss. Is this due to cell death? The authors should address the question of whether the role of FGF/ERK signaling is to keep the cells alive.

      (2) Regarding the sparse cells in 1G, is there a reduction in cell number only with FGFRi and not MEKi? Is this reproducible? Gattiglio et al (Development, 2023, PMID: 37530863) present data supporting a "community effect" in the FGF-induced mesoderm differentiation of mouse embryonic stem cells. Could a community effect be at play in this human system (especially given the images in the bottom row of 1G)? If the authors don't address this experimentally they should at least address the ideas in Gattiglio et al.

      (3) Do the FGF4 and FGF17 LOF experiments in Figure 5 affect cell numbers like FGFRi in Figure 1? Why examine PS-LC induction only in FGF17 heterozygous cells and not homozygous FGF17 nulls?

      (4) The idea that FGF8 plays a dominant role during gastrulation of other species but not humans is so intriguing it warrants deeper testing. The authors dismiss FGF8 because its mRNA "...levels always remained low." (line 363) as well as the data published in Zhai et al (PMID: 36517595) and Tyser et al (PMID: 34789876). But there are cases in mouse development where a gene was expressed at levels so low, that it might be dismissed, and yet LOF experiments revealed it played a role or even was required in a developmental process. The authors should consider FGF8 inhibition or inactivation to explore its potential role, despite its low levels of expression.

      (5) Redundancy is a common feature in FGF genetics. What is the effect of inhibiting FGF4 in FGF17 LOF cells?

      (6) I suggest that the authors take more caution in describing FGF gradients. For example, in one Results heading they write "Endogenous FGF4 and FGF17 gradients underly the ERK activity pattern.", implying an FGF protein gradient. However, they only present data for FGF mRNA, not protein. This issue would be clarified if they used proper nomenclature for gene, mRNA (italics), and protein (no italics) throughout the paper.

    3. Reviewer #2 (Public review):

      Summary:

      The role of FGFs in embryonic development and stem cell differentiation has remained unclear due to its complexity. In this study, the authors utilized a 2D human stem cell-based gastrulation model to investigate the functions of FGFs. They discovered that FGF-dependent ERK activity is closely linked to the emergence of primitive streak cells. Importantly, this 2D model effectively illustrates the spatial distribution of key signaling effectors and receptors by correlating these markers with cell fate markers, such as T and ISL1. Through inhibition and loss-of-function studies, they further corroborated the needs of FGF ligands. Their data shows that FGFR1 is the primary receptor, and FGF2/4/17 are the key ligands for primitive streak development, which aligns with observations in primate embryos. Additional experiments revealed that the reduction of FGF4 and FGF17 decreases ERK activity.

      Strengths:

      This study provides comprehensive data and improves our understanding of the role of FGF signaling in primate primitive streak formation. The authors provide new insights related to the spatial localization of the key components of FGF signaling and attempt to reveal the temporal dynamics of the signal propagation and cell fate decision, which has been challenging.

      Weaknesses:

      Given the solid data, the work only partially clarifies the complex picture of FGF signaling, so details remain somewhat elusive. The findings lack a strong punchline, which may limit their broader impact.

    4. Reviewer #3 (Public review):

      Jo and colleagues set out to investigate the origins and functions of localized FGF/ERK signaling for the differentiation and spatial patterning of primitive streak fates of human embryonic stem cells in a well-established micropattern system. They demonstrate that endogenous FGF signaling is required for ERK activation in a ring-domain in the micropatterns, and that this localized signaling is directly required for differentiation and spatial patterning of specific cell types. Through high-resolution microscopy and transwell assays, they show that cells receive FGF signals through basally localized receptors. Finally, the authors find that there is a requirement for exogenous FGF2 to initiate primitive streak-like differentiation, but endogenous FGFs, especially FGF4 and FGF17, fully take over at later stages.

      Even though some of the authors' findings - such as the localized expression of FGF ligands during gastrulation and the importance of FGF/ERK signaling for cell differentiation in the primitive streak - have been reported in model organisms before, this is one of the first studies to investigate the role of FGF signaling during primitive streak-like differentiation of human cells. In doing so, the paper reports a number of interesting and valuable observations, namely the basal localization of FGF receptors which mirrors that of BMP and Nodal receptors, as well as the existence of a positive feedback loop centered on FGF signaling that drives primitive-streak differentiation. The authors also perform a comparison of the role of different FGFs across species and try to assign specific functions to individual FGFs. In the absence of clean genetic loss-of-function cell lines, this part of the work remains less strong.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      This is an interesting study on the role of FGF signaling in the induction of primitive streak-like cells (PS-LC) in human 2D-gastruloids. The authors use a previously characterized standard culture that generates a ring of PS-LCs (TBXT+) and correlate this with pERK staining. A requirement for FGF signaling in TBXT induction is demonstrated via pharmacological inhibition of MEK and FGFR activity. A second set of culture conditions (with no exogenous FGFs) suggests that endogenous FGFs are required for pERK and TBXT induction. The authors then characterize, via scRNA-seq, various components of the FGF pathway (genes for ligands, receptors, ERK regulators, and HSPG regulation). They go on to characterize the pFGFR1, receptor isoforms, and polarized localization of this receptor. Finally, they perform FGF4 inhibition and use a cell line with a limited FGF17 inactivation (heterozygous null) and show that loss of these FGFs reduces PS-LC and derivative cell types.

      Strengths:

      (1) As the authors point out, the role of FGF signaling in gastrulation is less well understood than other signaling pathways. Hence this is a valuable contribution to that field.

      (2) The FGF4 and FGF17 loss-of-function experiments in Figure 5 are very intriguing. This is especially so given the intriguing observation that these FGFs appear to be dominating in this model of human gastrulation, in contrast to what FGFs dominate in mice, chicks, and frogs.

      (3) In general this paper is valuable as a further development of the human gastruloid system and the role of FGF signaling in the induction of PS-LCs. The wide net that the authors cast in characterizing the FGF ligand gene, receptor isoforms, and downstream components provides a foundation for future work. As the authors write near the beginning of the Discussion "Many questions remain."

      We thank the reviewer for these positive comments.

      Weaknesses:

      (1) FGFs are cell survival factors in various aspects of development. The authors fail to address cell death due to loss of FGF signaling in their experiments. For example, in Figure 1E (which requires statistical analysis) and 1G (the bottom FGFRi row), there appears to be a significant amount of cell loss. Is this due to cell death? The authors should address the question of whether the role of FGF/ERK signaling is to keep the cells alive.

      Indeed, FGF also strongly affects cell number and it is an interesting question to what extent this depends on ERK. Our manuscript focuses instead on the role of FGF/ERK signaling in cell fate patterning. However, as mentioned in our discussion, Figure 1d,e show that doxycycline-induced pERK leads to more TBXT+ cells than the control without restoring cell number, suggesting the role of FGF in controlling cell number is independent of the requirement for FGF/ERK in PS-LC differentiation. Unpublished data below showing a MEK inhibitor dose response further support this: low doses of MEKi are sufficient to inhibit differentiation without affecting cell number. To address the reviewer’s question we will include these data in the revised manuscript and perform several additional experiments to determine in more detail how cell death and proliferation depend on FGF.

      Author response image 1.

      MEK affects differentiation and cell number at different doses. a-c) Control and MEKi (0.3 µM) treated colonies with similar cell number but different TBXT expression. d-f) Quantification of cell number per colony (d), percentage of TBXT-positive cells per colony (e), and the distribution of pERK intensities for different doses of MEK inhibitor (f). N>6 colonies per condition. MEKi = PD0325901. Scale bar = 50 µm.

      (2) Regarding the sparse cells in 1G, is there a reduction in cell number only with FGFRi and not MEKi? Is this reproducible? Gattiglio et al (Development, 2023, PMID: 37530863) present data supporting a "community effect" in the FGF-induced mesoderm differentiation of mouse embryonic stem cells. Could a community effect be at play in this human system (especially given the images in the bottom row of 1G)? If the authors don't address this experimentally they should at least address the ideas in Gattiglio et al.

      Indeed, FGFRi reproducibly affects cell number more than MEKi, in line with the fact that pathways downstream of FGF other than MAPK/ERK (e.g. PI3K) play important roles in cell survival and growth. We think the lack of differentiation in MEKi and FGFRi in Fig.1g cannot be attributed to a loss of cells combined with a community effect. This is because without FGFRi or MEKi cells also differentiate to primitive streak at much lower densities than those shown, consistent with the data we show above in response to (1), which argue against a primarily indirect effect of FGF on PS-LC differentiation through cell density. In the context of directed differentiation (rather than 2D gastruloids), we will show this in a controlled manner by repeating the experiment in Fig.1g while adjusting cell seeding densities to obtain similar final cell densities in all three conditions. We will also include Gattiglio et al. in our revised discussion.

      (3) Do the FGF4 and FGF17 LOF experiments in Figure 5 affect cell numbers like FGFRi in Figure 1?

      It seems the effect on cell number is small but we will analyze this carefully and include it in the revised manuscript. A small effect would be consistent with our unpublished data below showing a near uniform proliferation rate. This in turn suggests that low levels of pERK in the center are sufficient to maintain proliferation there while the much higher pERK levels in the PS-LC ring (that we think depend on FGF4 and FGF17) do not significantly increase the proliferation rate (see Fig.1 in the manuscript for the pERK pattern). Thus, loss of high pERK in the PS-LC ring while maintaining low pERK throughout would not be expected to have a major impact on cell number but would impact differentiation. In contrast, loss of all FGF signaling through FGFRi does dramatically affect cell number. This is again consistent with the data provided in response to (1) showing that ERK levels can be reduced to a point where PS-LC differentiation is lost without significantly affecting cell number. We will include the data below in the revised manuscript.

      Author response image 2.

      Why examine PS-LC induction only in FGF17 heterozygous cells and not homozygous FGF17 nulls?

      We were unable to obtain homozygous FGF17 nulls; it is not clear if there is a reason for this. We will try again and otherwise attempt to corroborate our findings with further knockdown data.

      (4) The idea that FGF8 plays a dominant role during gastrulation of other species but not humans is so intriguing it warrants deeper testing. The authors dismiss FGF8 because its mRNA "...levels always remained low." (line 363) as well as the data published in Zhai et al (PMID: 36517595) and Tyser et al (PMID: 34789876). But there are cases in mouse development where a gene was expressed at levels so low, that it might be dismissed, and yet LOF experiments revealed it played a role or even was required in a developmental process. The authors should consider FGF8 inhibition or inactivation to explore its potential role, despite its low levels of expression.

      We agree with the reviewer that FGF8 is worth investigating further and we will now pursue this.

      (5) Redundancy is a common feature in FGF genetics. What is the effect of inhibiting FGF4 in FGF17 LOF cells?

      We will attempt to do the experiment the reviewer suggests.

      (6) I suggest that the authors take more caution in describing FGF gradients. For example, in one Results heading they write "Endogenous FGF4 and FGF17 gradients underly the ERK activity pattern.", implying an FGF protein gradient. However, they only present data for FGF mRNA, not protein. This issue would be clarified if they used proper nomenclature for gene, mRNA (italics), and protein (no italics) throughout the paper.

      We will edit the paper to more clearly distinguish protein and mRNA.

      Reviewer #2 (Public review):

      Summary:

      The role of FGFs in embryonic development and stem cell differentiation has remained unclear due to its complexity. In this study, the authors utilized a 2D human stem cell-based gastrulation model to investigate the functions of FGFs. They discovered that FGF-dependent ERK activity is closely linked to the emergence of primitive streak cells. Importantly, this 2D model effectively illustrates the spatial distribution of key signaling effectors and receptors by correlating these markers with cell fate markers, such as T and ISL1. Through inhibition and loss-of-function studies, they further corroborated the needs of FGF ligands. Their data shows that FGFR1 is the primary receptor, and FGF2/4/17 are the key ligands for primitive streak development, which aligns with observations in primate embryos. Additional experiments revealed that the reduction of FGF4 and FGF17 decreases ERK activity.

      Strengths:

      This study provides comprehensive data and improves our understanding of the role of FGF signaling in primate primitive streak formation. The authors provide new insights related to the spatial localization of the key components of FGF signaling and attempt to reveal the temporal dynamics of the signal propagation and cell fate decision, which has been challenging.

      Weaknesses:

      Given the solid data, the work only partially clarifies the complex picture of FGF signaling, so details remain somewhat elusive. The findings lack a strong punchline, which may limit their broader impact.

      We thank this reviewer for their valuable feedback and the compliment on the solidity of our data. The punchline of our work is that FGF4- and FGF17-dependent ERK signaling plays a key role in human PS-LC differentiation, and that these are different FGFs than those thought to drive mouse gastrulation. A second key point is that like BMP and TGFβ signaling, FGF signaling is restricted to the basolateral sides of pluripotent stem cell colonies due to polarized receptor expression, which is crucial for understanding the response to exogenous ligands added to the cell medium. Indeed, many facets of FGF signaling remain to be investigated in the future, such as how FGF regulates and is regulated by other signals, to which we will dedicate a different manuscript.

      Reviewer #3 (Public review):

      Jo and colleagues set out to investigate the origins and functions of localized FGF/ERK signaling for the differentiation and spatial patterning of primitive streak fates of human embryonic stem cells in a well-established micropattern system. They demonstrate that endogenous FGF signaling is required for ERK activation in a ring-domain in the micropatterns, and that this localized signaling is directly required for differentiation and spatial patterning of specific cell types. Through high-resolution microscopy and transwell assays, they show that cells receive FGF signals through basally localized receptors. Finally, the authors find that there is a requirement for exogenous FGF2 to initiate primitive streak-like differentiation, but endogenous FGFs, especially FGF4 and FGF17, fully take over at later stages.

      Even though some of the authors' findings - such as the localized expression of FGF ligands during gastrulation and the importance of FGF/ERK signaling for cell differentiation in the primitive streak - have been reported in model organisms before, this is one of the first studies to investigate the role of FGF signaling during primitive streak-like differentiation of human cells. In doing so, the paper reports a number of interesting and valuable observations, namely the basal localization of FGF receptors which mirrors that of BMP and Nodal receptors, as well as the existence of a positive feedback loop centered on FGF signaling that drives primitive-streak differentiation. The authors also perform a comparison of the role of different FGFs across species and try to assign specific functions to individual FGFs. In the absence of clean genetic loss-of-function cell lines, this part of the work remains less strong.

      We thank the reviewer for emphasizing the value of our findings in a human model for gastrulation. We agree more loss-of-function experiments would provide further insight into the role of different FGFs, and we plan to provide additional data along these lines in the revised manuscript.

    1. eLife Assessment

      The study provides valuable findings regarding the identification of a new bacteriophage that uses the Pseudomonas aeruginosa exopolysaccharide Psl as a receptor, thus suggesting a novel approach to control biofilms. While much of the data presented is solid, additional work and clarifications are still required to fully support some of the main claims. This manuscript will interest those working on biofilms, specifically in Pseudomonas, on phage physiology and discovery, and on alternatives to controlling bacterial pathogens.

    2. Reviewer #1 (Public review):

      Summary:

      Walton et al. set out to isolate new phages targeting the opportunistic pathogen Pseudomonas aeruginosa. Using a double ∆fliF ∆pilA mutant strain, they were able to isolate 4 new phages, CLEW-1, -3, -6, and -10, which were unable to infect the parental PAO1F Wt strain. Further experiments showed that the 4 phages were only able to infect a ∆fliF strain, indicating a role of the MS-protein in the flagellum complex. Through further mutational analysis of the flagellum apparatus, the authors were able to identify the involvement of c-di-GMP in phage infection. Depletion of c-di-GMP levels by an inducible phosphodiesterase renders the bacteria resistant to phage infection, while elevation of c-di-GMP through the Wsp system made the cells sensitive to infection by CLEW-1. Using TnSeq, the authors were able not only to reaffirm the involvement of c-di-GMP in phage infection but also to identify the exopolysaccharide PSL as a downstream target for CLEW-1. C-di-GMP is a known regulator of PSL biosynthesis. The authors show that CLEW-1 binds directly to PSL on the cell surface and that deletion of the pslC gene resulted in complete phage resistance. The authors also provide evidence that the phage-PSL interaction happens during the biofilm mode of growth and that the addition of the CLEW-1 phage specifically resulted in a significant loss of biofilm biomass. Lastly, the authors set out to test if CLEW-1 could be used to resolve a biofilm infection using a mouse keratitis model. Unfortunately, while the authors noted a reduction in bacterial load assessed by GFP fluorescence, the keratitis did not resolve under the tested parameters.

      Strengths:

      The experiments carried out in this manuscript are thoughtful and rational, and sufficient explanation is provided for why the authors chose each specific set of experiments. The data presented strongly support their conclusions and they present compelling explanations for any deviation. The authors have not only developed a new technique for screening for phages targeting P. aeruginosa, but also highlight the importance of looking for phages during the biofilm mode of growth, as opposed to the more standard techniques involving planktonic cultures.

      Weaknesses:

      While the paper is strong, I do feel that further discussions could have gone into the decision to focus on CLEW-1 for the majority of the paper. The paper also doesn't provide any detailed information on the genetic composition of the phages. It is unclear if the phages isolated are temperate or virulent. Many temperate phages enter the lytic cycle in response to QS signalling, and while the data as it is doesn't suggest that is the case, perhaps the paper would be strengthened by further elimination of this possibility. At the very least it might be worth mentioning in the discussion section.

    3. Reviewer #2 (Public review):

      This manuscript by Walton et al. suggests that they have identified a new bacteriophage that uses the exopolysaccharide Psl from Pseudomonas aeruginosa (PA) as a receptor. As Psl is an important component in biofilms, the authors suggest that this phage (and others similarly isolated) may be able to specifically target biofilm-growing bacteria. While an interesting suggestion, the manner in which this paper is written makes it difficult to draw this conclusion. Also, some of the results do not directly follow from the data as presented and some relevant controls seem to be missing.

    4. Author response:

      Reviewer #1 (Public review): 

      Summary: 

      Walton et al. set out to isolate new phages targeting the opportunistic pathogen Pseudomonas aeruginosa. Using a double ∆fliF ∆pilA mutant strain, they were able to isolate 4 new phages, CLEW-1, -3, -6, and -10, which were unable to infect the parental PAO1F Wt strain. Further experiments showed that the 4 phages were only able to infect a ∆fliF strain, indicating a role of the MS-protein in the flagellum complex. Through further mutational analysis of the flagellum apparatus, the authors were able to identify the involvement of c-di-GMP in phage infection. Depletion of c-di-GMP levels by an inducible phosphodiesterase renders the bacteria resistant to phage infection, while elevation of c-di-GMP through the Wsp system made the cells sensitive to infection by CLEW-1. Using TnSeq, the authors were able not only to reaffirm the involvement of c-di-GMP in phage infection but also to identify the exopolysaccharide PSL as a downstream target for CLEW-1. C-di-GMP is a known regulator of PSL biosynthesis. The authors show that CLEW-1 binds directly to PSL on the cell surface and that deletion of the pslC gene resulted in complete phage resistance. The authors also provide evidence that the phage-PSL interaction happens during the biofilm mode of growth and that the addition of the CLEW-1 phage specifically resulted in a significant loss of biofilm biomass. Lastly, the authors set out to test if CLEW-1 could be used to resolve a biofilm infection using a mouse keratitis model. Unfortunately, while the authors noted a reduction in bacterial load assessed by GFP fluorescence, the keratitis did not resolve under the tested parameters.

      Strengths: 

      The experiments carried out in this manuscript are thoughtful and rational, and sufficient explanation is provided for why the authors chose each specific set of experiments. The data presented strongly support their conclusions and they present compelling explanations for any deviation. The authors have not only developed a new technique for screening for phages targeting P. aeruginosa, but also highlight the importance of looking for phages during the biofilm mode of growth, as opposed to the more standard techniques involving planktonic cultures.

      Weaknesses: 

      While the paper is strong, I do feel that further discussions could have gone into the decision to focus on CLEW-1 for the majority of the paper. The paper also doesn't provide any detailed information on the genetic composition of the phages. It is unclear if the phages isolated are temperate or virulent. Many temperate phages enter the lytic cycle in response to QS signalling, and while the data as it is doesn't suggest that is the case, perhaps the paper would be strengthened by further elimination of this possibility. At the very least it might be worth mentioning in the discussion section. 

      Thank you for your review. We will upload the genomes of all Clew phages and Ocp-2 before resubmission. It turns out that the Clew phage are highly related, which we wanted to express with the genomic comparison in the supplementary figure (rather unsuccessfully). It therefore made sense to focus our in-depth analysis on one of the phage. We will include a supplementary figure demonstrating that all Clew-1 phage require an intact psl locus for infection, to make that logic clearer. The phage are virulent (there is apparently a bit of a debate about this with regard to Bruynogheviruses, but we have not been able to isolate lysogens). This will be explained in the revised version of the manuscript as well.

      Reviewer #2 (Public review): 

      This manuscript by Walton et al. suggests that they have identified a new bacteriophage that uses the exopolysaccharide Psl from Pseudomonas aeruginosa (PA) as a receptor. As Psl is an important component in biofilms, the authors suggest that this phage (and others similarly isolated) may be able to specifically target biofilm-growing bacteria. While an interesting suggestion, the manner in which this paper is written makes it difficult to draw this conclusion. Also, some of the results do not directly follow from the data as presented and some relevant controls seem to be missing. 

      Thank you for your review. We would argue that the combination of demonstrating Psl-dependent binding of Clew-1 to P. aeruginosa, as well as demonstration of direct binding of Clew-1 to affinity-purified Psl, indicates that the phage binds directly to Psl and uses it as a receptor. In looking at the recommendations, it appears that the remark about controls refers to not using the ∆pslC mutant alone (as opposed to the ∆fliF2 ∆pslC double mutant) as a control for some of the binding experiments. However, since the ∆fliF2 mutant is more permissive for phage infection, analyzing the effect of deleting pslC in the context of the ∆fliF2 mutant background is the more stringent test.

    1. Author response:

      We sincerely thank all the reviewers for their enthusiasm and positive feedback, which has encouraged us to delve deeper into this research. As this is the first report of POLK in the brain using a longitudinal normative aging model, our primary aim was to establish the observational and phenomenological aspects. We agree with the reviewers that more detailed molecular, biochemical, and cellular studies are essential to elucidate underlying mechanisms. However, as noted by some reviewers, these investigations, while they will raise the impact, may fall outside the scope of the current report. Indeed, many of these lines of investigation are currently ongoing. Below, we provide our provisional responses to individual reviewer comments.

      Response to Reviewer #1:

      a) Concern over POLK antibody characterization in mice:

      We knocked down POLK by siRNA in mouse cortical primary neuronal culture (Fig S1C). In the revised version, we will provide a more detailed characterization of POLK antibodies in mouse cells.

      b) More mechanistic investigation is needed before POLK could be considered as a brain aging clock:

      We sincerely appreciate the valuable suggestion. In our ongoing work exploring the mechanisms of POLK in postmitotic neurons, preliminary findings using siPOLK indicate an upregulation of senescence markers along with a reduction in DNA repair synthesis (manuscript in preparation). We will reference this companion manuscript in the revised version and are pleased to share these data with the reviewers for their consideration.

      Response to Reviewer #2:

      a) Concern on more mechanistic understanding of the pathways regulating POLK dynamics between the nucleus and cytosol:

      We sincerely appreciate the reviewer’s enthusiasm and valuable guidance in helping us better understand the mechanism of nuclear-cytoplasmic POLK dynamics. Previously, we developed a modified aniPOND (accelerated native isolation of proteins on nascent DNA) protocol, which we termed iPoKD-MS (isolation of proteins on Pol kappa synthesized DNA  followed by mass spectrometry), to capture proteins bound to nascent DNA synthesized by POLK in human cell lines (bioRxiv https://www.biorxiv.org/content/10.1101/2022.10.27.513845v3). In this dataset, we identified potential candidates that may regulate nuclear/cytoplasmic POLK dynamics. These candidates are currently undergoing validation in human cell lines, and we are preparing a manuscript on these findings. Among these, some candidates, including previously identified proteins such as exportin and importin (Temprine et al., 2020, PMID: 32345725), are being explored further as potential POLK nuclear/cytoplasmic shuttles. We are also conducting tests on these candidates in mouse cortical primary neurons to assess their role in POLK dynamics. In the revised version of the manuscript, we will include a discussion of our current understanding and outline our planned studies.

      b) Question on “… what is POLK doing in the cytosol, and what is it interacting with …”:

      Our data so far indicate that POLK accumulates in stress granules and lysosomes. We are very grateful for the reviewer’s insightful suggestions and will make every effort to incorporate them in the revised manuscript. Currently, we are characterizing POLK accumulation in the cytoplasm using additional lysosomal markers, as recommended by the reviewer. If these experiments prove challenging in mouse brain tissues, we plan to investigate them in primary neuron cultures. We hope to include these findings in the revised version. Additionally, we have optimized the POLK antibody for immunoprecipitation from nuclear and cytoplasmic fractions of mouse brain tissue. These findings, which are beyond the scope of the current study, will be reported in a separate manuscript.

      Response to Reviewer #3:

      We highly appreciate the reviewer bringing up the context of biomolecular condensates. Our iPoKD-MS data referenced above suggest candidates from various biomolecular condensates. We are currently using subcellular fractionation to investigate the presence of POLK in these different condensates, and the results will be fully reported in future publications. We thank the reviewer for providing important literature; it will be cited, and potential biomolecular condensates will be discussed, in the revised version.

    2. eLife Assessment

      Abdelmageed et al. demonstrate POLK expression in neurons and report an important observation that POLK exhibits an age-dependent change in subcellular localization, from the nucleus in young tissue to the cytoplasm in old tissue. Despite potentially exciting and novel findings, many of the authors' claims are provided with incomplete support (e.g. lack of validation of the POLK antibody, characterization of the subcellular compartment, etc).

    3. Reviewer #1 (Public review):

      Summary:

      Abdelmageed et al. investigate age-related changes in the subcellular localization of DNA polymerase kappa (POLK) in the brains of mice. Although POLK has been actively investigated for its role in translesion DNA synthesis and involvement in other DNA repair pathways in proliferating cells, very little is known about POLK in a tissue-specific context, let alone in post-mitotic cells. The authors investigated POLK subcellular distribution in the brains of young, middle-aged, and old mice via immunoblotting of fractionated tissue extracts and immunofluorescence (IF). Immunoblotting revealed a progressive decrease in the abundance of nuclear POLK, while cytoplasmic POLK levels concomitantly increased. Similar findings were present when IF was performed on brain sections. Further, IF studies of the cingulate cortex (Cg1), the motor cortex (M1, M2), and the somatosensory (S1) cortical regions all showed an age-related decline in nuclear POLK. Nuclear speckles of POLK decrease in each region; meanwhile, the number of cytoplasmic POLK granules decreases in all four regions, but granule size increases. The authors report similar findings for REV1, another Y-family DNA polymerase.

      The authors then investigate the colocalization of POLK with other DNA damage response (DDR) proteins in either pyramidal neurons or inhibitory interneurons. At 18 months of age, DNA damage marker gH2AX demonstrated colocalization with nuclear POLK, while strong colocalization of POLK and 8-oxo-dG was present in geriatric mice. The authors find that cytoplasmic POLK granules colocalize with stress granule marker G3BP1, suggesting that the accumulated POLK ends up in the lysosome.

      Brain regions were further stained to identify POLK patterns in NeuN+ neurons, GABAergic neurons, and other non-neuronal cell types present in the cortex. Microglia associated with pyramidal neurons or inhibitory interneurons were found to have a higher abundance of cytoplasmic POLK. The authors also report that POLK localization can be regulated by neuronal activity induced by Kainic acid treatment. Lastly, the authors suggest that POLK could serve as an aging clock for brain tissue, but POLK deserves further characterization and correlation to functional changes before being considered as a biomarker.

      Strengths:

      Investigation of TLS polymerases in specific tissues and in post-mitotic cells is largely understudied. The potential changes in sub-cellular localization of POLK and potentially other TLS polymerases open up many questions about DNA repair and damage tolerance in the brain and how it can change with age.

      Weaknesses:

      The work is quite novel and interesting, and the authors do suggest some potentially interesting roles for POLK in the brain, but these are in and of themselves a bit speculative. The majority of the findings of this paper rely on the POLK antibody and its presumed specificity for POLK. However, this antibody has not been fully validated and needs further work. Further validation experiments using Polk-deficient or knocked-down cells to investigate antibody specificity for both immunoblotting and immunofluorescence should be performed. More mechanistic investigation is needed before POLK could be considered as a brain aging clock.

    4. Reviewer #2 (Public review):

      Summary:

      Abdelmageed et al. demonstrate POLK expression in nervous tissue and focus mainly on neurons. Here they describe an exciting age-dependent change in POLK subcellular localization, from the nucleus in young tissue to the cytoplasm in old tissue. They argue that the cytosolic POLK is associated with stress granules. They also investigate the cell-type specific expression of POLK, and quantitate expression changes induced by cell-autonomous (activity) and cell nonautonomous (microglia) factors.

      I think it is an interesting report, but it requires a few more experiments to support the findings in the latter half of the paper. Additionally, a more mechanistic understanding of the pathways regulating POLK dynamics between the nucleus and cytosol, what is POLK doing in the cytosol, and what is it interacting with, would greatly increase the impact of this report. However, additional mechanistic experiments are mostly not needed to support much of the currently presented results; again, they would simply increase the impact.

    5. Reviewer #3 (Public review):

      Summary:

      In this study, the authors show that DNA polymerase kappa POLK relocalizes in the cytoplasm as granules with age in mice. The reduction of nuclear POLK in old brains is congruent with an increase in DNA damage markers. The cytoplasmic granules colocalize with stress granules and endo-lysosome. The study proposes that protein localization of POLK could be used to determine the biological age of brain tissue sections.

      Strengths:

      Very few studies focus on the POLK protein in the peripheral nervous system (PNS). The microscopy approach used here is also very relevant: it allows the authors to highlight a radical change in POLK localization (nuclear versus cytoplasmic) depending on the age of the neurons.

      The conclusions of the study are strong. Several types of neurones are compared, the colocalization with several proteins from the NHEJ and BER repair pathways is tested, and microscopy images are systematically quantified.

      Weaknesses:

      The authors do not discuss the physical nature of POLK granules. There is a large field of research dedicated to the nature and function of condensates: in particular, numerous studies have shown that some condensates but not all exhibit liquid-like properties (https://www.nature.com/articles/nrm.2017.7, https://pubmed.ncbi.nlm.nih.gov/33510441/, https://www.mdpi.com/2073-4425/13/10/1846). The change of physical properties of condensates is particularly important in cells undergoing stress and during aging. The authors should discuss this literature.

    1. eLife Assessment

      This valuable study by Ganesh and colleagues examined how both the value and salience of sensory information can affect economic decision-making. The results provide insights into how different sources of uncertainty found in the real world, including those related to the perception of objects and those related to values associated with objects, can together influence decision-making behavior in systematic ways. The evidence is solid but overlaps with previous studies and could be improved by clarifying novelty and experimental details and considering additional models.

    2. Reviewer #1 (Public review):

      This study examined the effects of uncertainty over states (i.e., stimuli) and uncertainty over rewards (i.e., reward probability) on human learning and decision-making in a simple reinforcement learning task. The authors proposed two hypotheses: (1) high uncertainty over states reduces the learning rate, and (2) visual salience drives decision-making. A Bayesian learner is proposed to support the first hypothesis and several regression analyses confirm this finding. Furthermore, the analysis of salience bias also supports the second hypothesis.

      Strengths:

      (1) The experiment is simple and solid.

      (2) The experimental design is clever and consistent with several well-established paradigms.

      Weaknesses:

      (1) One of my main concerns is that the first conclusion "high uncertainty over states reduces learning rate" is not new and has been shown recently in Yoo et al. (2023). In that study, a slower learning rate was found when stimuli were perceptually similar. It seems to me that the only difference here is that simple Gabor patches are used instead of e.g., green vegetable images in that study. The conclusion is exactly the same.

      (2) The second hypothesis should be more explicit. Instead of claiming "A drives B", can you show specific predictions for the direction of this influence? For example, given the same expected value, do human learners prefer to choose a high-contrast stimulus, and if so, why?

      (3) The analyses of salience bias support the second hypothesis. However, if I understand it correctly, there is no salience parameter (i.e., absolute contrast of each stimulus) in the decision process, according to Eqs. 4, 5, and 6 in the Methods. In other words, the Bayesian learner should not exhibit a salience bias. The question then becomes: why do human learners have such a bias? What are the underlying mechanisms of the salience bias?

      (4) If high perceptual uncertainty reduces the learning rate, why does the normative agent, which takes perceptual uncertainty into account, learn faster than the categorical agent, which has no perceptual uncertainty at all? Did I miss something?

      (5) The learning algorithm is different from the standard Q-learning modeling approach. It would be better to include more explanation of why this type of learning algorithm is Bayes-optimal.

      (6) Similar to the above, Bayesian modeling here only confirms that high perceptual uncertainty reduces the learning rate in an optimal Bayesian learner. Two questions remain elusive: (a) whether human learners are close to the Bayesian learner (i.e., near optimal). It seems that (a) is unlikely given several suboptimal heuristics (e.g., confirmation bias) found in humans. Then the question is (b) how optimal learning and suboptimal heuristics are combined in the human learning process. One of the major disadvantages of this study is that no new model is proposed to fit trial-by-trial human choices. I believe that building formal process models is the key to improving this study.

      (7) The writing should be substantially improved. The main concern here is that the authors used several seemingly related but ambiguous words to represent the same concept. For example, "perceptual uncertainty" in Figures 1 & 2 indicates the contrast differences between two patches. But page 5 line 9 includes "belief-state uncertainty". Are they the same concept? Moreover, on page 18 line 17, if I understand it correctly, "perceptual uncertainty" indicates sensory noise, not contrast differences. Please carefully check all terminology and use a single, concrete term to represent each concept throughout the paper.

      (8) Similarly, is the "task state" on page 17 the same as the "perceptual state" in Figures 1 & 2?

      (9) The Methods section could also be improved. For example, I am not sure how Eq. 5 is derived. Also, page 18 line 16 states that "in our simulations, we manipulated...". I did not find any information about the simulation. How was the simulation performed? Did I miss something?
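
      To make the contrast raised in point (5) concrete, the following is a minimal sketch of how a fixed-learning-rate delta rule (as in standard Q-learning) differs from an exact Bayesian update. It is a generic illustration, not the model used in the manuscript: it assumes a stationary Bernoulli reward with a Beta prior and no perceptual uncertainty, and the function names, the prior, and the fixed rate of 0.2 are all hypothetical choices. Under these assumptions the Bayesian posterior mean follows the same delta-rule form, but with a learning rate of 1 / (a + b + 1) that shrinks as evidence accumulates.

```python
def delta_rule(estimate, outcome, learning_rate):
    """Fixed-learning-rate update of the kind used in standard Q-learning."""
    return estimate + learning_rate * (outcome - estimate)


def beta_bernoulli_update(a, b, outcome):
    """Exact Bayesian update of a Beta(a, b) belief after a 0/1 outcome."""
    return a + outcome, b + (1 - outcome)


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def run(outcomes, fixed_rate=0.2, prior=(1.0, 1.0)):
    """Track both learners over a sequence of 0/1 outcomes.

    Returns one tuple per trial:
    (delta-rule estimate, Bayesian posterior mean, implied Bayesian learning rate).
    The prior and fixed_rate defaults are illustrative assumptions.
    """
    a, b = prior
    q = posterior_mean(a, b)  # start both learners from the same estimate
    history = []
    for outcome in outcomes:
        # The posterior mean moves by (outcome - mean) / (a + b + 1),
        # so this is the Bayesian learner's effective learning rate.
        implied_rate = 1.0 / (a + b + 1.0)
        a, b = beta_bernoulli_update(a, b, outcome)
        q = delta_rule(q, outcome, fixed_rate)
        history.append((q, posterior_mean(a, b), implied_rate))
    return history
```

      Calling `run([1, 1, 0, 1])` returns the two trajectories side by side. A model that also carries perceptual (belief-state) uncertainty would need an additional term that this sketch deliberately leaves out.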

    3. Reviewer #2 (Public review):

      Summary:

      The authors addressed the question of how perceptual uncertainty and reward uncertainty jointly shape value-based decision-making. They sought to test two main hypotheses: (H1) perceptual uncertainty modulates learning rates, and (H2) perceptual salience is integrated in value computation. Through a series of analyses, including regression models and normative computational modeling, they showed that learning rates were modulated by perceptual uncertainty (reflected by differences in contrast), supporting H1, and the update was indeed biased toward high-contrast (ie, salient) stimuli, supporting H2.

      Strengths:

      This is a timely and interesting study, with a strong theory-driven focus, reflected by the sophisticated experimental design that systematically tests both perceptual and reward uncertainty. This paper is also well written, with relevant examples (bakery) that draw the analogy to explain the main research question. The main response by participants is reward probability estimation (on a slider), which goes beyond commonly used binary choices and offers richness of the data, that was eventually used in the regression analysis. This work may also open new directions to test the interaction between perceptual decision-making and value-based decision-making.

      Weaknesses:

      Despite the strengths, multiple points may need to be clarified, to make this paper stronger.

      (1) Experimental design:

      (1a) The authors stated (page 6) that "The systematic manipulation of uncertainty resulted in three experimental conditions." If this is truly systematic, wouldn't there be a low-low condition, in a factorial design fashion? Essentially, the current study has H(perceptual uncertainty)-H(reward uncertainty), L(perceptual uncertainty)-H(reward uncertainty), H(perceptual uncertainty)-L(reward uncertainty), but naturally, one would anticipate an L-L condition. It could be argued that the L-L condition may seem too easy, causing a ceiling effect, but it nonetheless provides a benchmark for baseline learning when everything is unambiguous. Unless the authors would love to, I am not asking the authors to run additional experiments to include all 4 conditions. But it would be helpful to justify their initial choice not to include an L-L condition.

      (1b) I feel there is a certain degree of imbalance regarding the levels of uncertainty. For reward uncertainty, {0.9, 0.1} is low uncertainty, and {0.7, 0.3} is high uncertainty, whereas for perceptual uncertainty, the number of levels of contrast difference between the Gabor stimuli is much higher. This means the design appears to be more sensitive to detect any effect that can be caused by perceptual uncertainty (as there is sufficient variation) than reward uncertainty. Again, I am not asking the authors to run additional experiments, but it would be very helpful if they could explain/justify the choice of experimental setup and specification.

      (2) Statistical Analysis:

      (2a) There is some inconsistency regarding the stats used. For all the comparisons across the three conditions, sometimes an F-test is used followed by a series of t-tests (eg, page 6), but in other places, only pair-wise t-tests were reported without an F-test (eg, page 12). It would be helpful, for all of them, to have an F-test first, and then three t-tests. And for the F-test, I assume it was a one-way ANOVA? This info was not explicit in the Methods. Also, which multiple-comparison correction was used, if any?

      (2b) Regarding normative modeling, I am aware that this is a pure simulation without model fitting, but it loses the close relationship between the data and model without model fitting. I wonder if model fitting can be done at all. As it stands, there is not even qualitative evidence regarding how well the model could explain the data (eg, by adding real data to Figure 3e). In other words, now that it is a normative model, it is no surprise that it works, but it is not known if it works to account for human data. As a side note, I appreciate that certain groups of researchers tend not to run model estimation; instead, model simulations are used to qualitatively compare the model and data. This is particularly true for "normative models". But at least in the current case, I believe model estimation can be implemented, and will provide more insights.

      (2c) Relatedly, regarding specific results shown in Figure 4b - the normative agent has a near-zero effect on the fixed learning rate. I do not find these results surprising: since the normative agent "knows" what is going to happen, and which state the agent is in, there is no need to update the prediction error in the classic Q-learning fashion. But humans, on the other hand, do NOT know the environment, hence, unlike the model, they do not know what they are supposed to do. In essence, the model knows more than the humans in the task know. We can leave this to debate, but I believe most cognitive modelers would agree that the model should not know more than humans know. I think it would be helpful if the authors could discuss the advantages and disadvantages of using normative models in this case.

      (2d) I find the results in Figure 5 interesting. But given the dependent variable is identical across the three correlations (ie, absolute estimation error), I would suggest the authors put all three predictors into a single multiple regression. This way, shared variance, if any, could also be taken into account by the model.

      (2e) I feel the focus on testing H2 is considerably weaker than that on H1. The authors did a series of analyses testing and supporting H1, but only briefly addressed H2. On first reading, I wondered why the normative model does not also test the effect of salience, but actually, salience is indeed included in the model (buried in the Methods). I am curious to know whether analyzing the salience-related parameter (beta_4) would also support H2.
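
      The suggestion in (2d) can be sketched as follows: instead of three separate correlations with the same dependent variable, all three predictors enter one ordinary least squares model, so that each coefficient reflects a predictor's contribution with the others held fixed. This is a minimal, dependency-free sketch under stated assumptions. The function names are hypothetical, the predictors are treated as generic columns x1, x2, x3 rather than the specific variables in Figure 5, and a real analysis would normally use a statistics package that also reports standard errors and p-values.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    # Work on an augmented copy so the inputs are not modified.
    aug = [list(row) + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; coefficients are not identifiable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_multiple_regression(predictors, response):
    """Ordinary least squares with an intercept.

    predictors: list of rows, one row of predictor values per observation.
    response:   list of dependent-variable values (e.g. absolute estimation error).
    Returns [intercept, b1, ..., bp] from the normal equations (X'X) b = X'y.
    """
    design = [[1.0] + list(row) for row in predictors]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)
```

      Passing one row of three predictor values per participant, together with that participant's absolute estimation error, returns the intercept and three partial slopes. If two predictors share variance, their slopes here will generally differ from the slopes of the corresponding separate simple regressions.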

    1. eLife Assessment

      This important study introduces an approach to discovering antibiotic resistance determinants by leveraging diverse susceptibility profiles among related mycobacterial species, with particular relevance to high-level resistance against natural product-derived antibiotics. The research provides convincing evidence for the role of ADP-ribosylation enzymes in rifamycin resistance among mycobacteria, whilst also demonstrating that antibiotic susceptibility is not correlated with growth rate or intracellular compound concentration. Although some broader claims require additional experimental support, this work lays a significant foundation for understanding the complexity of antibiotic resistance mechanisms in mycobacteria and opens new avenues for future antimicrobial research.

    2. Reviewer #1 (Public review):

      This work shows that resistance profiles to a variety of drugs are variable between different mycobacterial species and are not correlated with growth rate or intrabacterial compound concentration (at least for linezolid, bedaquiline, and Rifampicin). Note that intrabacterial compound concentration does not distinguish between cytosolic and periplasmic/cell wall-associated drugs. The susceptibility profiles for a wide range of mycobacteria tested under the same conditions against 15 commonly used antimycobacterial drugs provide the first recorded cross-species comparison, which will be a valuable resource for the scientific community. To understand the reasons for the high Rifampicin resistance seen in many mycobacteria, the authors confirm the presence of the arr gene, known to encode a Rif ribosyltransferase involved in Rif resistance in M. smegmatis, in the resistant mycobacteria after confirming the absence of on-target mutations in the RpoB RRDR. Metabolomic analyses confirm the presence of ribosylated Rif in some of the naturally resistant mycobacteria, which may not be entirely surprising but is an important confirmation. Presumably M. branderi is highly resistant despite lacking the arr homolog due to the rpoB S45N mutation. M. flavescens has an MIC similar to that of M. smegmatis, despite having both Arr-1 and Arr-X. Various Arr-1 and Arr-X proteins are expressed and characterized for catalytic activity, which shows that Arr-X is a faster enzyme, especially with respect to more hydrophobic rifamycins. M. flavescens has MIC values for Rifapentine and Rifabutin similar to those of M. smegmatis. Thus, the Arr-1 versus Arr-X comparison does not provide a complete explanation for the underlying reasons driving natural Rif resistance in mycobacteria. Downregulation of Arr-X expression in M. conceptionense confers increased sensitivity to Rifabutin, confirming its role as a rifamycin-inactivating enzyme.

      Overall, the comparison of cross-species susceptibility profiles is novel. The demonstration that MIC is not correlated with intracellular drug concentration is important but not sufficiently interrogated. The demonstration that Arr-X is also a Rif ADP-ribosyltransferase is a good confirmation, and the finding that it is more efficient than Arr-1 on hydrophobic rifamycins is interesting but maybe not entirely surprising. The manuscript seems to have two parts that are related, but the rifamycin modification aspect of the work is not strongly linked to the first part since it interrogates the modification of one drug but not the common cause of natural resistance for other drugs.

    3. Reviewer #2 (Public review):

      Summary:

      The authors use a variety of methods to investigate the mechanisms of innate drug resistance in mycobacteria. They end up focusing on two primary determinants - drug accumulation, which correlates rather poorly with resistance for many species, and, for the rifamycins, ADP-ribosyltransferases. The latter enzymes do appear to account for a good deal of resistance, though it is difficult to extrapolate quantitatively what their relative contributions are.

      Overall, they make excellent use of biochemical methods to support their conclusions. Though they set out to draw very broad lessons, much of the focus ends up being on rifamycins. This is still a very interesting set of conclusions.

      Strengths:

      (1) A very interesting approach and set of questions.

      (2) Outstanding technical approaches to measuring intracellular drug concentrations and chemical modification of rifamycins.

      (3) Excellent characterization of variant rifamycin ADP-ribosyltransferases

      Weaknesses:

      (1) Figure 3c/d: These panels show the same experiment done twice, yet they display substantially different results in certain cases. For instance, M. smegmatis appears to show an order of magnitude lower RIF accumulation in panel d compared to M. flavescens, despite them displaying equal accumulation in panel c. The authors should provide justification for this variation, particularly as quantitative intra-species comparisons are central to the conclusions of this figure.

      (2) There are several technical concerns with Figure 3 that affect how to interpret the work. First, according to the methods, the authors did not appear to normalize to an internal standard, only to an external antibiotic standard (which may account for some of the technical variation alluded to above). Second, the authors used different concentrations of drug for each species to try to match the species' MICs. I appreciate the authors' thinking on this, but I think for an uptake experiment it would be more appropriate to treat with the same concentration of drug since uptake is likely saturable at higher drug concentrations. In the current setup, the species with higher MIC have to take up substantially more antibiotic than the species with low MIC in order to end up with the same normalized uptake value in Figure 3d. It would be helpful to repeat this experiment with a single drug concentration in the media for all species and test whether that gives the same results seen here.

      (3) Figure 4f: This panel seems to argue against the idea that the efficacy of RIF ribosylation is what's driving drug susceptibility. M. flavescens is similarly resistant to RIF as M. smegmatis, yet M. flavescens has dramatically lower ribosylation of RIF. This is perhaps not surprising, as the authors appropriately highlight the number of different rif-modifying enzymes that have been identified that likely also contribute to drug resistance. However, I do think this means that the authors can't make the claim that the resistance they observe is caused by rifamycin modification, so those claims in the text and figure legend should be altered unless the authors can provide further evidence to support them. This experiment also has results that are inconsistent with what appears to be an identical experiment performed in Supplemental Figure 5b. The authors should provide context for why these results differ.

      (4) Fig 4f/5c: M. flavescens has both Arr-1 and Arr-X, yet it appears to not have ribosylated RIF. This result seems to undermine the authors' reliance on the enzyme assay shown in Fig 5c - in that assay, M. flavescens Arr-X is very capable of modifying rifampicin, yet that doesn't appear to translate to the in vivo setting. This is of importance because the authors use this enzyme assay to argue that Arr-X is a fundamentally more powerful RIF resistance mechanism than Arr-1 and that it has specificity for rifabutin. However, the result in Figure 4f would argue that the enzyme assay results cannot be directly translated to in vivo contexts. For the authors to claim that Arr-X is most potent at modifying rifabutin, they could test their CRISPRi knockdowns of Arr-X and Arr-1 under treatment with each of the rifamycins they use in the enzyme assay. The authors mentioned that they didn't do this because all the strains are resistant to those compounds; however, if Arr-X is important for drug resistance, it would be reasonable to expect to see sensitization of the bacteria to those compounds upon knockdown.

      (5) Figure 5d: The authors use this CRISPRi experiment to claim that Arr-X from M. conceptionense is more potent at inactivating rifabutin than Arr-1. This claim depends on there being equal degrees of knockdown of Arr-1 and Arr-X, so the authors should validate the degree of knockdown they get. This is particularly important because, to my knowledge, nobody has used this system in M. conceptionense before.

      (6) The authors' arguments about Arr-X and Arr-1 would be strengthened by showing by LC/MS that Arr-X knockdown in M. conceptionense results in more loss of ribosyl-rifabutin than knockdown of Arr-1.

    4. Reviewer #3 (Public review):

      This manuscript presents a macroevolutionary approach to the identification of novel high-level antibiotic resistance determinants that takes advantage of the natural genetic diversity within a genus (mycobacteria, in this case) by comparing antibiotic resistance profiles across related bacterial species and then using computational, molecular, and cellular approaches to identify and characterize the distinguishing mechanisms of resistance. The approach is contrasted with "microevolutionary" approaches based on comparing resistant and susceptible strains of the same species and approaches based on ecological sampling that may not include clinically relevant pathogens or related species. The potential for new discoveries with the macroevolution-inspired approach is evident in the diversity of drug susceptibility profiles revealed amongst the selected mycobacterial species and the identification and characterization of a new group of rifamycin-modifying ADP-ribosyltransferase (Arr) orthologs of previously described mycobacterial Arr enzymes. Additional findings that intra-bacterial antibiotic accumulation does not always predict potency within this genus, that M. marinum is a better proxy for M. tuberculosis drug susceptibility than the commonly used saprophyte M. smegmatis, and that susceptibility to semi-synthetic antibiotic classes is generally less variable than susceptibility to antibiotics more directly derived from natural products strengthen the claim that the macroevolutionary lens is valuable for elucidating general principles of susceptibility within a genus.

      There are some limitations to the work. The argument for the novelty of the approach could be better articulated. While the opportunities for new discoveries presented by the identification of discrepant susceptibility results between related species are evident, it is less clear how the macroevolutionary approach is further leveraged for the discovery of truly novel resistance determinants. The example of the discovery of Arr-X enzymes presented here relied upon foundational knowledge of previously characterized Arr orthologs. There is little clarity on what the pipeline for identifying more novel resistance determinants would look like. In other words, what does the macroevolutionary perspective contribute to discovery from the point of finding interspecies differences in susceptibility? Does the framework still remain distinct from other discovery frameworks and approaches? If so, how?

      While the experimentation and analyses performed appear well-designed and rigorous, there are a few instances in which broad claims are based on inferences from sample sets or data sets that are too limited to provide robust support. For example, the claim that rifampicin modification, and precisely ADP-ribosylation, is the dominant mechanism of resistance to rifampicin in mycobacteria may be a bit premature or an over-generalization, as other enzymatic modification mechanisms and other mechanisms such as helR-mediated dissociation of rifampicin-stalled RNA polymerases, efflux, etc., were not examined, nor were CRISPRi knockdown experiments conducted beyond an experiment to tease out the role of Arr-X and Arr-1 in one strain. The general claim that intra-bacterial antibiotic accumulation does not predict potency in mycobacteria may be another over-generalization based on the limited number of drugs and species studied, but perhaps the intended assertion was that antibiotic accumulation ALONE does not predict potency.

    1. eLife Assessment

      This important manuscript provides insights into the competition between Splicing Factor 1 (SF1) and Quaking (QKI) for binding at the ACUAA branch point sequence in a model intron, regulating exon inclusion. The study employs rigorous transcriptomic, proteomic, and reporter assays, with both mammalian cell culture and yeast models. Nevertheless, while the data are convincing, broadening the analysis to additional exons and narrowing the manuscript's title to better align with the experimental scope would strengthen the work.

    2. Reviewer #1 (Public review):

      In this manuscript, the authors aimed to show that SF1 and QKI compete for the intron branch point sequence ACUAA and provide evidence that QKI represses inclusion when bound to it.

      Major strengths of this manuscript include:<br /> (1) Identification of the ACUAA-like motif in exons regulated by QKI and SF1.<br /> (2) The use of the splicing reporter and mutant analysis to show that upstream and downstream ACUAAC elements in intron 10 of RAI14 are required for repressing splicing.<br /> (3) The use of proteomics to identify proteins in C2C12 nuclear extract that bind to the wild-type and mutant sequences.<br /> (4) The yeast studies showing ectopic lethality when Qki5 expression was induced, due to increased mis-splicing of transcripts that contain the ACUAA element.

      The authors conclusively show that the ACUAA sequence is bound by QKI and provide strong evidence that this leads to differences in exon inclusion and exclusion. In animal cells, and especially in humans, branchpoint sequences are degenerate but seem to be recognized by specific splicing factors. Although a subset of splicing factors shows tissue-specific expression patterns, most don't, suggesting that yet-to-be-identified mechanisms regulate splicing. This work suggests that an alternate mechanism could be related to the binding affinity of specific RNA binding factors for branchpoint sequences coupled with the level of these different splicing factors in a given cell.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Pereira de Castro and coworkers are studying potential competition between a more standard splicing factor SF1, and an alternative splicing factor called QK1. This is interesting because they bind to overlapping sequence motifs and could potentially have opposing effects on promoting the splicing reaction. To test this idea, the authors KD either SF1 or QK1 in mammalian cells and uncover several exons whose splicing regulation follows the predicted pattern of being promoted for splicing by SF1 and repressed by QK1. Importantly, these have introns enriched in SF1 and QK1 motifs. The authors then focus on one exon in particular with two tandem motifs to study the mechanism of this in greater detail and their results confirm the competition model. Mass spec analysis largely agrees with their proposal; however, it is complicated by the apparently quick transition of SF1-bound complexes to later splicing intermediates. An inspired experiment in yeast shows how QK1 competition could potentially have a detrimental impact on splicing in an orthogonal system. Overall, these results show how splicing regulation can be achieved by competition between a "core" and alternative splicing factor and provide additional insight into the complex process of branch site recognition. The manuscript is exceptionally clear and the figures and data are very logically presented. The work will be valuable to those in the splicing field who are interested in both mechanism and bioinformatics approaches to deconvolve any apparent "splicing code" being used by cells to regulate gene expression. Criticisms are minor and the most important of them stem from overemphasis on parts of the manuscript on the evolutionary angle when evolution itself wasn't analyzed per se.

      Strengths:

      (1) The main discovery of the manuscript involving evidence for SF1/QK1 competition is quite interesting and important for this field. This evidence has been missing and may change how people think about branch site recognition.

      (2) The experiments and the rationale behind them are exceptionally clearly and logically presented. This was wonderful!

      (3) The experiments are carried out to a high standard and well-designed controls are included.

      (4) The extrapolation of the result to yeast in order to show the potentially devastating consequences of the QK1 competition was very exciting and creative.

      Weaknesses:

      Overall the weaknesses are relatively minor and involve cases where clarification is necessary, some additional analysis could bolster the arguments, and suggestions for focusing the manuscript on its strengths.

      (1) The title (Ancient...evolutionary outcomes), abstract, and some parts of the discussion focus heavily on the evolutionary implications of this work. However, evolutionary analysis was not performed in these studies (e.g., when did QK1 and SF1 proteins arise and/or diverge? How does this line up with branch site motifs and evolution of U2? Any insight from recent work from Scott Roy et al?). I think this aspect either needs to be bolstered with experimental work/data or this should be tamped down in the manuscript. I suggest highlighting the idea expressed in the sentence "A nuanced implication of this model is that loss-of-function...". To me, this is better supported by the data and potentially by some analysis of mutations associated with human disease.

      (2) One paper that I didn't see cited was that by Tanackovic and Kramer (Mol Biol Cell 2005). This paper is relevant because they KD SF1 and found it nonessential for splicing in vivo. Do their results have implications for those here? How do the results of the KD compare? Could QK1 competition have influenced their findings (or does their work influence the "nuanced implication" model referenced above?)?

      (3) Can the authors please provide a citation for the statement "degeneracy is observed to a higher degree in organisms with more alternative splicing"? Does recent evolutionary analysis support this?

      (4) For the data in Figure 3, I was left wondering if NMD was confounding this analysis. Can the authors respond to this and address this concern directly?

      (5) To me, the idea that an engaged U2 snRNP was pulled down in Figure 4F would be stronger if the snRNA was detected. Was that able to be observed by northern or primer extension? Would SF1 be enriched if the U2 snRNA was degraded by RNaseH in the NE?

      (6) I'm wondering how additive the effects of QK1 and SF1 are... In Figure 2, if QK1 and SF1 are both knocked down, is the splicing of exon 11 restored to "wt" levels?

      (7) The first discussion section has two paragraphs that begin "How does competition between SF1..." and "Relatively little is known about how...". I found the discussion and speculation about localization, paraspeckles, and lncRNAs interesting but a bit detracting from the strengths of the manuscript. I would suggest shortening these two paragraphs into a single one.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors were trying to establish whether competition between the RNA-binding proteins SF1 and QKI controlled splicing outcomes. These two proteins have similar binding sites and protein sequences, but SF1 lacks a dimerization motif and seems to bind a single version of the binding sequence. Importantly, these binding sequences correspond to branchpoint consensus sequences, with SF1 binding leading to productive splicing, but QKI binding leading instead to association with paraspeckle proteins. They show that in human cells SF1 generally activates exons and QKI represses, and a large group of the jointly regulated exons (43% of joint targets) are reciprocally controlled by SF1 and QKI. They focus on one of these exons RAI14 that shows this reciprocal pattern of regulation, and has 2 repeats of the binding site that make it a candidate for joint regulation, and confirm regulation within a minigene context. The authors used the assembly of proteins within nuclear extracts to explain the effect of QKI versus SF1 binding. Finally, the authors show that the expression of QKI is lethal in yeast, and causes splicing defects.

      How this fits in the field. This study is interesting and provides a conceptual advance by providing a general rule on how SF1 and QKI interact in relation to binding sites, and the relative molecular fates followed, so is very useful. Most of the analysis seems to focus on one example, although the molecular analysis and global work significantly add to the picture from the previously published paper about NUMB joint regulation by QKI and SF1 (Zong et al, cited in text as reference 50, that looked at SF1 and QKI binding in relation to a duplicated binding site/branchpoint sequence in NUMB).

      Strengths:

      The data presented are strong and clear. The ideas discussed in this paper are of wide interest, and present a simple model where two binding sites generate a potentially repressive QKI response, whereas exons that have a single upstream sequence are just regulated by SF1. The assembly of splicing complexes on RNAs derived from RAI14 in nuclear extracts, followed by mass spec gave interesting mechanistic insight into what was occurring as a result of QKI versus SF1 binding.

      Weaknesses:

      I did not think the title best summarises the take-home message and could be perhaps a bit more modest. Although the authors investigated splicing patterns in yeast and human cells, yeast do not have QKI so there is no ancient competition in that case, and the study did not really investigate physiological or evolutionary outcomes in splicing, although it provides interesting speculation on them. Also as I understood it, the important issue was less conserved branchpoints in higher eukaryotes enabling alternative splicing, rather than competition for the conserved branchpoint sequence. So despite the data being strong and properly analysed and discussed in the paper, could the authors think whether they fit best with the take-home message provided in the title? Just as a suggestion (I am sure the authors can do a better job), maybe "molecular competition between variant branchpoint sequences predict physiological and evolutionary outcomes in splicing"?

      Although the authors do provide some global data, most of the detailed analysis is of RAI14. It would have been useful to examine members of the other quadrants in Figure 1C as well for potential binding sites to give a reason why these are not co-regulated in the same way as RAI14. How many of the RAI14 quadrants had single/double sites (the motif analysis seemed to pull out just one), and could one of the non-reciprocally regulated exons be moved into a different quadrant by addition or subtraction of a binding site or changing the branchpoint (using a minigene approach for example).

    1. eLife Assessment

      This study presents an important finding on the role of telomeres in modulating interleukin-1 signaling and tumor immunity in TNBC. The evidence supporting these findings is solid, presented through comprehensive analyses including TNBC clinical samples, tumor-derived organoids, cancer cells, and xenografts. The work will be of broad interest to cell and medical biologists focusing on TNBC.

    2. Reviewer #2 (Public review):

      This study highlights the role of telomeres in modulating IL-1 signaling and tumor immunity. The authors demonstrate a strong correlation between telomere length and IL-1 signaling by analyzing TNBC patient samples and tumor-derived organoids. Mechanistic insights revealed non-telomeric TRF2 binding at the IL-1R1. The observed effects on NF-kB signaling and subsequent alterations in cytokine expression contribute significantly to our understanding of the complex interplay between telomeres and the tumor microenvironment. Furthermore, the study reports that the length of telomeres and IL-1R1 expression is associated with TAM enrichment. However, the manuscript lacks in-depth mechanistic insights into how telomere length affects IL-1R1 expression. Overall, this work broadens our understanding of telomere biology.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, entitled "Telomere length sensitive regulation of Interleukin Receptor 1 type 1 (IL1R1) by the shelterin protein TRF2 modulates immune signalling in the tumour microenvironment", Dr Mukherjee and colleagues aimed to clarify the extra-telomeric role of TRF2 in regulating IL1R1 expression, with consequent impact on tumor infiltration by TAMs.

      Strengths:

      Upon careful evaluation of the manuscript, I conclude that the presented story is undoubtedly well conceived. At the technical level, the experiments have been properly performed and the results obtained support the authors' conclusions well.

      Weaknesses:

      Unfortunately, the covered topic is not particularly novel. In detail, the capability of TRF2 to bind extratelomeric foci in cells with short telomeres has been well demonstrated in a previous work published by the same research group. The capability of TRF2 to regulate gene expression is well known, the capability of TRF2 to interact with p300 has already been demonstrated and, finally, the capability of TRF2 to regulate TAM infiltration (which is the effective novelty of the manuscript) appears as an obvious consequence of IL1R1 modulation (this is probably due to the current manuscript organization).

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript from Mukherjee et al examines potential connections between telomere length and tumor immune responses. This examination is based on the premise that telomeres and tumor immunity have each been shown to play separate, but important, roles in cancer progression and prognosis as well as prior correlative findings between telomere length and immunity. In keeping with a potential connection between telomere length and tumor immunity, the authors find that long telomere length is associated with reduced expression of the cytokine receptor IL1R1. Long telomere length is also associated with reduced TRF2 occupancy at the putative IL1R1 promoter. These observations lead the authors towards a model in which reduced telomere occupancy of TRF2 - due to telomere shortening - promotes IL1R1 transcription via recruitment of the p300 histone acetyltransferase. This model is based on earlier studies from this group (i.e. Mukherjee et al., 2019) which first proposed that telomere length can influence gene expression by enabling TRF2 binding and gene transactivation at telomere-distal sites. Further mechanistic work suggests that G-quadruplexes are important for TRF2 binding to IL1R1 promoter and that TRF2 acetylation is necessary for p300 recruitment. Complementary studies in human triple-negative breast cancer cells add potential clinical relevance but do not possess a direct connection to the proposed model. Overall, the article presents several interesting observations, but disconnection across central elements of the model and the marginal degree of the data leave open significant uncertainty regarding the conclusions.

      Strengths:

      Many of the key results are examined across multiple cell models.

      The authors propose a highly innovative model to explain their results.

      Weaknesses:

      Although the authors attempt to replicate most key results across multiple models, the results are often marginal or appear to lack statistical significance. For example, the reduction in IL1R1 protein levels observed in HT1080 cells that possess long telomeres relative to HT1080 short telomere cells appears to be modest (Supplementary Figure 1I). Associated changes in IL1R1 mRNA levels are similarly modest.

      Related to the point above, a lack of strong functional studies leaves an open question as to whether observed changes in IL1R1 expression across telomere short/long cancer cells are biologically meaningful.

      Statistical significance is described sporadically throughout the paper. Most major trends hold, but the statistical significance of the results is often unclear. For example, Figure 1A uses a statistical test to show statistically significant increases in TRF2 occupancy at the IL1R1 promoter in short telomere HT1080 relative to long telomere HT1080. However, similar experiments (i.e. Figure 2B, Figure 4A - D) lack statistical tests.

      TRF2 overexpression resulted in a ~5-fold or greater change in IL1R1 expression. Compared to this, telomere length-dependent alterations in IL1R1 expression, although about 2-fold, appear modest (~50% reduction in cells with long telomeres across the different model systems used). Notably, this was consistent and significant across cell-based model systems and xenograft tumors (see Figure 1). Unlike TRF2 induction, telomere elongation or shortening varies within the permissible physiological limits of cells. This is likely to result in the observed variation in IL1R1 levels.

      For biological relevance, we have shown this using multiple models in which telomere length either differed (patient tissue, organoids) or was altered (cell lines, xenograft models). IL1 signalling in TNBC tissue, tumor organoids, and cells/xenografts was shown to impact M2 macrophage infiltration in a telomere length-sensitive fashion. We made use of the tumor organoids to test M2 macrophage infiltration using IL1RA and small molecule-based IL1R1 inhibition.

      We have now included statistical tests in all the relevant figures and incorporated the necessary details about the tests performed in the figure legends for the clarity of readers. Additionally, all data points, p values and details of statistical tests have been included in figure-wise Excel sheets for both main and supplementary figures.

      Reviewer #1 (Recommendations For The Authors):

      There are typos throughout the manuscript. The word 'expression' is incorrectly spelled on y-axis labels throughout the manuscript (for example see Figure 1B). The word 'telomere' is incorrectly spelled in Supplementary Figure 1 legend panel A. Most errors, such as these, do not interfere with my comprehension of the manuscript. However, others made the manuscript difficult to follow. For example, I think that MDAMB231, MDAMD231, and MDAM231 are frequently used interchangeably to refer to the same cell line. This makes it very difficult to understand certain experiments.

      I often found it difficult to understand which statistical test was used for a specific experiment. I suggest changing the style in the legends to more clearly connect statistical tests with specific data points.

      We thank the reviewer for pointing out the typographical errors. We have now made relevant corrections to both figures and text.

      As stated above, we have now provided details of the statistical tests performed in the figure legends for the clarity of readers. Additionally, all data points, p values and details of statistical tests have been included in figure-wise Excel sheets for both main and supplementary figures.

      Reviewer #2 (Public Review):

      This study highlights the role of telomeres in modulating IL-1 signaling and tumor immunity. The authors demonstrate a strong correlation between telomere length and IL-1 signaling by analyzing TNBC patient samples and tumor-derived organoids. Mechanistic insights revealed non-telomeric TRF2 binding at the IL-1R1. The observed effects on NF-kB signaling and subsequent alterations in cytokine expression contribute significantly to our understanding of the complex interplay between telomeres and the tumor microenvironment. Furthermore, the study reports that the length of telomeres and IL-1R1 expression is associated with TAM enrichment. However, the manuscript lacks in-depth mechanistic insights into how telomere length affects IL-1R1 expression. Overall, this work broadens our understanding of telomere biology.

      The mechanism of how telomere length affects IL1R1 expression involves sequestration and reallocation of TRF2 between telomeres and gene promoters (in this case, the IL1R1 promoter). We have previously shown this across multiple genomic sites (Mukherjee et al, 2018; reviewed in J. Biol. Chem. 2020, Trends in Genetics 2023). We have described this in the manuscript along with references citing the previous works. A scheme explaining the model was provided as Additional Supplementary Figure 1, along with a description of the mechanistic model.

      Figure 1-4 in main figures describe the molecular mechanism of telomere-dependent IL1R1 activation. This includes ChIP data for TRF2 on the IL1R1 promoter in long/short telomeres, as well as TRF2-mediated histone/p300 recruitment and IL1R1 gene expression. We further show how specific acetylation on TRF2 is crucial for TRF2-mediated IL1R1 regulation (Figure 5).

      Reviewer #2 (Recommendations For The Authors):

      The study primarily provides a snapshot of cytokine expression and telomere length at a single time point. Longitudinal studies or dynamic analyses could provide a more comprehensive understanding of the temporal relationship between telomere length and cytokine expression.

      Tumor heterogeneity is a significant problem for the various therapies. The study notes significant heterogeneity in telomere length but does not investigate the implications of this heterogeneity. Understanding the role of telomere length variation in different tumor cell populations is essential for a comprehensive interpretation of the results.

      The study only mentions a correlation between IL1R1 and relative telomere length but does not provide any potential clinical correlations with patient outcomes or survival. Addressing the clinical relevance of these molecular changes would improve the translational impact.

      The importance of IL1R1 in prognostic and clinical outcomes of TNBC has been studied by multiple groups. The overall consensus is that higher IL1R1 leads to poor prognosis – aiding both cancer progression and metastasis. Using publicly available TCGA data, we found that IL1R1 high samples had significantly lower survival in breast cancer (BRCA) datasets. The results have now been included in the manuscript as Supplementary Figure 7G.

      Addition in text:

      “We next used publicly available TCGA gene expression data of breast cancer samples (BRCA) (Supplementary file 4) to assess the effect of IL1R1 expression on cancer prognosis. We categorized samples based on IL1R1 expression: IL1R1 high (N=254) and IL1R1 low samples (N=709). It was seen that overall patient survival was significantly lower in IL1R1 high samples (Log-rank p value = 0.0149) (Supplementary Figure 7G). We also checked the frequency of occurrence of various breast cancer sub-types in IL1R1 high and low samples (Supplementary Figure 7H). While invasive mixed mucinous carcinoma (the most abundant sub-type) was predominantly seen in IL1R1 low samples, metaplastic breast cancer was only found within the IL1R1 high samples. Interestingly, metaplastic breast cancer has been frequently found to be ‘triple negative’, i.e., ER-, PR- and HER2- (Reddy et al., 2020).”

      However, we could not access a TNBC (or any breast cancer dataset) that has been characterized for telomere length. Unfortunately, the clinical TNBC samples that we had access to did not have any paired short-term/long-term survival datasets. We could, in principle, use TERT/TERC expression as a proxy for telomere length; however, in our experiments, we found that telomerase activity did not positively correlate with telomere length as expected (Supplementary Figure 7C, Supplementary Figure 8D). Therefore, transcriptional signature (of telomere-associated genes) may not be a reliable indicator of telomere length.

      The study lacks in-depth mechanistic insights into how telomere length affects IL1R1 expression and subsequently influences TAM infiltration. Further molecular studies or pathway analyses are necessary to elucidate the underlying mechanisms.

      The mechanism involves sequestration and reallocation of TRF2 between telomeres and gene promoters (in this case, IL1R1 promoter). We have previously shown this across multiple genomic sites (Mukherjee et al, 2018). We have appropriately discussed this in the manuscript.

      A schematic explaining the model has been provided as Additional Supplementary Figure 1.

      We have provided ChIP data for TRF2 on the IL1R1 promoter in cells with long/short telomeres, as well as histone/p300 ChIP and gene expression data (Figures 1-4 of the main figures deal exclusively with the molecular mechanism of telomere-dependent IL1R1 activation). We further show how specific acetylation of TRF2 might be crucial for TRF2-mediated IL1R1 regulation (Figure 5). One of the key findings herein is that TRF2 can directly regulate IL1R1 expression through promoter occupancy, tested in telomere-altered cell lines (HT1080, MDAMB231) and tumor xenografts (Figure 1 A, F, I for TRF2 promoter occupancy).

      Pathway analysis of the HT1080 (short vs long telomere) transcriptome shows that cytokine-cytokine receptor interaction is one of the key pathways among the upregulated genes.

      While we have focused on TRF2 mediated IL1R1 regulation, it is quite possible that there are other telomere sensitive pathways/mechanisms by which IL1R1 is regulated. This has been duly acknowledged in the discussion.

      The manuscript title suggests modulation of immune signaling in the tumor microenvironment, yet the authors exclusively focus on CD206+ TAMs, limiting the scope. It is recommended to investigate other immune cell types for a more comprehensive understanding of changes in the immune tumor microenvironment.

      As stated above, we approached the manuscript from the perspective of TRF2-mediated IL1R1 regulation. In our assessment of TCGA data for breast cancer, we found that CD206 (MRC1) had the highest enrichment in IL1R1 high samples among key TAM and TIL markers (now added as Figure 8A; details in Supplementary file 5). It also had the highest correlation with IL1R1 among the tested markers. Therefore, we proceeded to check CD206+ve TAMs.

      Now the following section has been added to text:

      “We further found that the total proportion of immune cells (% of CD45 +ve cells) did not vary significantly between short and long telomere TNBC samples (Supplementary Figure 8C). However, TNBC-ST samples had a higher percentage of myeloid cells (CD11B +ve) within the CD45 +ve immune cell population. We checked three TNBC-ST and three TNBC-LT samples and found that the percentage of M1 macrophages (CD86 high CD206 low) in the myeloid population was lower than that of the M2 macrophages (CD206 high CD86 low) and, unlike the latter, did not vary significantly between the TNBC-ST and TNBC-LT samples (Supplementary Figure 8C).”

      Unfortunately, due to sample limitations we are unable to test this on a larger cohort of samples.

      A single cell transcriptome experiment may have been a good way to have a more comprehensive immune profiling. However, with our TNBC samples, isolated nuclei for downstream processing had low viability as per 10X genomics specifications.

      Does IL1R1 influence TAM recruitment or polarization within the tumor microenvironment? To assess the impact, the authors should use a marker indicative of M1-like macrophages, such as CD80 or CD86.

      To address the issue of TAM recruitment vs polarization meaningfully, we need to characterize tissue-resident macrophages as well as macrophages in circulation. We did not have access to patient blood. A murine in vivo breast cancer model might be more appropriate to test this, but it would take considerable time for us to develop. It is something that we hope to address in a follow-up study.

      Did the authors analyze other breast cancer subtypes for telomere length?

      Unfortunately, other breast cancer sub-types besides TNBC were not available to us for experimentation.

      Figure legends are very briefly written and need to be elaborated. Scale bars are also missing in images.

      Add a gating strategy for flow cytometry results in Figure 8A.

      Figure legends have been expanded for clarity. More prominent scale bars have been added for better visibility and reference. A relevant gating strategy has been added as Supplementary Figure 8B.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, entitled "Telomere length sensitive regulation of Interleukin Receptor 1 type 1 (IL1R1) by the shelterin protein TRF2 modulates immune signalling in the tumour microenvironment", Dr. Mukherjee and colleagues aimed to clarify the extra-telomeric role of TRF2 in regulating IL1R1 expression, with consequent impact on TAM tumor infiltration.

      Strengths:

      Upon careful manuscript evaluation, I feel that the presented story is undoubtedly well conceived. At the technical level, experiments have been properly performed and the obtained results support the authors' conclusions.

      Weaknesses:

      Unfortunately, the covered topic is not particularly novel. In detail, the TRF2 capability of binding extratelomeric foci in cells with short telomeres has been well demonstrated in a previous work published by the same research group. The capability of TRF2 to regulate gene expression is well-known, the capability of TRF2 to interact with p300 has been already demonstrated and, finally, the capability of TRF2 to regulate TAMs infiltration (that is the effective novelty of the manuscript) appears as an obvious consequence of IL1R1 modulation (this is probably due to the current manuscript organization).

      Here we studied the TRF2-IL1R1 regulatory axis (not reported earlier by us or others) as a case of the telomere sequestration model that we described earlier (Mukherjee et al., 2018; reviewed in J. Biol. Chem. 2020, Trends in Genetics 2023). This manuscript demonstrates the effect of the TRF2-IL1R1 regulation on telomere-sensitive tumor macrophage recruitment. To the best of our knowledge, no previous study connects telomeres of tumor cells mechanistically to the tumor immune microenvironment. Here we focused on the IL1R1 promoter and provided mechanistic evidence for acetylated-TRF2 engaging the HAT p300 for epigenetically altering the promoter. This mechanism of TRF2 mediated activation has not been previously reported. Further, the function of a specific post translational modification (acetylation of the lysine residue 293K) of TRF2 in IL1R1 regulation is described for the first time. Additional experiments showed that TRF2-acetylation mutants, when targeted to the IL1R1 promoter, significantly alter the transcriptional state of the IL1R1 promoter. To our knowledge, the function of any TRF2 residue in transcriptional activation had not been previously described. Taken together, these demonstrate novel insights into the mechanism of TRF2-mediated gene regulation, that is telomere-sensitive, and affects the tumor-immune microenvironment.

      We considered the reviewer’s suggestion to reorganize the Results section. Reorganizing the manuscript to describe the TAM-related results first would, in our opinion, draw focus away from the new findings on non-telomeric TRF2-mediated IL1R1 regulation and the novelty of the underlying mechanisms (as described in the response above and in our responses to other reviewer comments). We have tried to bring out the novelty, implications and importance of the TAM-related observations in the Discussion.

      Reviewer #3 (Recommendations For The Authors):

      Based on the comments reported above, I would encourage the author to modify the manuscript by reorganizing the text. I would suggest starting from the capability of TRF2 to modulate macrophages infiltration. Data relative to IL1R1 expression may be used to explain the mechanism through which TRF2 exerts its immune-modulatory role. This, in my view, would dramatically strengthen the presented story.

      Concerning the text, "results" should be dramatically streamlined and background information should be just limited to the "introduction" section.

      The manuscript should be carefully revisited at grammar level. A number of incomplete sentences and some typos are present within the text.

      We thank the reviewer for the appreciation of our work for its technical strengths.

      At the outset, we agree that we have explored the TRF2-IL1R1 regulatory axis. This underscores the significance of the telomere sequestration model that we had proposed earlier (Mukherjee et al., 2018). Herein, however, we significantly extend our previous work (which was more general and intended to put forward the idea of telomere-dependent distal gene expression) by studying TRF2-mediated regulation of IL1 signalling (which was previously unreported). In addition, the mechanistic details of how telomeres are connected to IL1 signalling through non-telomeric TRF2 are entirely new and have not been reported before by us or others.

      We have removed some descriptive text from the Results section to streamline it.

    1. eLife Assessment

      This study presents a valuable finding on how the sensorimotor control system deals with redundancy within our body, based on a novel bimanual task. The evidence supporting the authors' claims is convincing, as demonstrated over four different experiments. The work will be of interest to researchers from the motor control community and related fields, and further investigation into the interpretation of the findings could increase the generalisation of the study to a broader audience.

    2. Reviewer #1 (Public Review):

      Summary/Strengths:

      This manuscript describes a stimulating contribution to the field of human motor control. The complexity of control and learning is studied with a new task offering a myriad of possible coordination patterns. Findings are original and exemplify how baseline relationships determine learning.

      Weaknesses:

      A new task is presented: it is a thoughtful one, but because it is a new one, the manuscript section is filled with relatively new terms and acronyms that are not necessarily easy to rapidly understand.

      First, some more thoughts may be devoted to the take-home message. In the title, I am not sure manipulating a stick with both hands is a key piece of information. Also, the authors appear to insist on the term 'implicit', and I wonder if it is a big deal in this manuscript and if all the necessary evidence appears in this study that control and adaptation are exclusively implicit. As there is no clear comparison between gradual and abrupt sessions, the authors may consider removing at least from the title and abstract the words 'implicit' and 'implicitly'. Most importantly, the authors may consider modifying the last sentence of the abstract to clearly provide the most substantial theoretical advance from this study.

      It seems that a substantial finding is the 'constraint' imposed by baseline control laws on sensorimotor adaptation. This seems to echo and extend previous work of Wu, Smith et al. (Nat Neurosci, 2014): their findings, which were not necessarily always replicated, suggested that the more participants were variable in baseline, the better they adapted to a systematic perturbation. The authors may study whether residual errors are smaller or adaptation is faster for individuals with larger motor variability in baseline. Unfortunately, the authors do not present the classic time course of sensorimotor adaptation in any experiment. The adaptation is not described as typically done: the authors should thus show the changes in tip movement direction and stick-tilt angle across trials, and highlight any significant difference between baseline, early adaptation, and late adaptation, for instance. I also wonder why the authors did not include a few no-perturbation trials after the exposure phase to study after-effects in the study design: it looks like a missed opportunity here. Overall, I think that showing the time course of adaptation is necessary for the present study to provide a more comprehensive understanding of that new task, and to re-explore the role of motor variability during baseline for sensorimotor adaptation.

      The distance between hands was fixed at 15 cm with the Kinarm instead of a mechanical constraint. I wonder how much this distance varied and more importantly whether from that analysis or a force analysis, the authors could determine whether one hand led the other one in the adaptation.

      I understand the distinction between task- and end-effector irrelevant perturbation, and at the same time results show that the nervous system reacts to both types of perturbation, indicating that they both seem relevant or important. In line 32, the errors mentioned at the end of the sentence suggest that adaptation is in fact maladaptive. I think the authors may extend the Discussion on why adaptation was found in the experiments with end-effector irrelevant and especially how an internal (forward) model or a pair of internal (forward) models may be used to predict both the visual and the somatosensory consequences of the motor commands.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have developed a novel bimanual task that allows them to study how the sensorimotor control system deals with redundancy within our body. Specifically, the two hands control two robot handles that control the position and orientation of a virtual stick, where the end of the stick is moved into a target. This task has infinite solutions to any movement, where the two hands influence both tip-movement direction and stick-tilt angle. When moving to different targets in the baseline phase, participants change the tilt angle of the stick in a specific pattern that produces close to minimum movement of the two hands to produce the task. In a series of experiments, the authors then apply perturbations to the stick angle and stick movement direction to examine how either tip-movement (task-relevant) or stick-angle (task-irrelevant) perturbations affect adaptation. Both types of perturbations affect adaptation, but this adaptation follows the baseline pattern of tip-movement and stick angle relation such that even task-irrelevant perturbations drive adaptation in a manner that results in task-relevant errors. Overall, the authors suggest that these baseline relations affect how we adapt to changes in our tasks. This work provides an important demonstration that underlying solutions/relations can affect the manner in which we adapt. I think one major contribution of this work will also be the task itself, which provides a very fruitful and important framework for studying more complex motor control tasks.

      Strengths:

      Overall, I find this a very interesting and well-written paper. Beyond providing a new motor task that could be influential in the field, I think it also contributes to studying a very important question - how we can solve redundancy in the sensorimotor control system, as there are many possible mechanisms or methods that could be used - each of which produces different solutions and might affect the manner in which we adapt.

      Weaknesses:

      The visual perturbations were only provided while reaching to one target, which limits the amount of exploration of the environment that the participants experience. Overall, I would find the results even more compelling if the same perturbations applied to movements to more (or all) of the targets produced similar adaptation profiles. The question is to what degree the results derive from only providing a small subset of the environment to explore.

    4. Reviewer #3 (Public review):

      Summary:

      This study investigated motor system adaptation to new environments through modifications in redundant body movements. Utilizing a novel bimanual stick-manipulation task, participants controlled a virtual stick to reach targets, focusing on how tip-movement direction perturbations affected tip movement and stick-tilt adaptation. The findings revealed a consistent strategy among participants who flexibly adjusted the tilt angle of the stick in response to errors. The adaptation patterns were influenced by physical space relationships, which guided the motor system's selection of movement patterns. This study underscores the motor system's adaptability through changes in redundant body movement patterns.

      Strengths:

      This study introduces an innovative bimanual stick manipulation task to explore motor system adaptation to novel environments through alterations in redundant body movement patterns. It also expands the use of endpoint robots in motor control studies.

      Weaknesses:

      The generalizability of the findings is limited. Future work may strengthen the present study's findings by examining whether the observed relationships hold for different stick lengths (i.e., varying hand positions along the virtual stick) or when reaching targets to the left and right of the starting position, not just at varying angles along one side. Additionally, a more comprehensive review of the existing literature on redundant systems, rather than primarily focusing on the lack of redundancy in endpoint-reaching tasks, would have strengthened this study. While the novel task expands the use of endpoint robots in motor control studies, its utility in exploring broader aspects of motor control and learning may be constrained.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank all the reviewers for their positive evaluation of our paper, as described in the Strengths section. We are also grateful for their helpful comments and suggestions, which we have addressed below. We believe that the manuscript has been significantly improved as a result of these suggestions. In addition to these changes, we also corrected some inconsistencies (statistical values in the last sentence of a Figure 5 caption) and sentences in the main text (lines 155, 452, 522) (these corrections did not affect the results).

      Fig. 5e: R=0.599, P<0.001 -> R=0.601, P=0.007

      L150: "the angle of stick tilt angle" -> "the angle of stick tilt"

      L437: "no such" -> "such"

      L522: "?" -> "."

      Reviewer #1 (Public Review):

      Summary/Strengths:

      This manuscript describes a stimulating contribution to the field of human motor control. The complexity of control and learning is studied with a new task offering a myriad of possible coordination patterns. Findings are original and exemplify how baseline relationships determine learning.

      Weaknesses:

      A new task is presented: it is a thoughtful one, but because it is a new one, the manuscript section is filled with relatively new terms and acronyms that are not necessarily easy to rapidly understand.

      First, some more thoughts may be devoted to the take-home message. In the title, I am not sure manipulating a stick with both hands is a key piece of information. Also, the authors appear to insist on the term ‘implicit’, and I wonder if it is a big deal in this manuscript and if all the necessary evidence appears in this study that control and adaptation are exclusively implicit. As there is no clear comparison between gradual and abrupt sessions, the authors may consider removing at least from the title and abstract the words ‘implicit’ and ‘implicitly’. Most importantly, the authors may consider modifying the last sentence of the abstract to clearly provide the most substantial theoretical advance from this study.

      Thank you for your positive comment on our paper. We agree with the reviewer that our paper used a lot of acronyms that might confuse the readers. As we have addressed below (in the rebuttal to the Results section), we have reduced the number of acronyms.

      Regarding the comment on the use of the word “implicit” in the title and the abstract, we believe that its use in this paper is very important and indispensable. One of our main findings was that the pattern of adaptation between the tip-movement direction and the stick-tilt angle largely followed that in the baseline condition when aiming at different target directions. This adaptation was largely implicit because participants were not aware of the presence of the perturbation as the amount of perturbation was gradually increased. This implicitness suggests that the adaptation pattern of how the movement should be corrected is embedded in the motor learning system. On the other hand, if this adaptation pattern was achieved on the basis of the explicit strategy of changing the direction of the tip-movement, the adaptation pattern that follows the baseline pattern is not at all surprising. For these reasons, we will continue to use the word "implicit".

      It seems that a substantial finding is the ‘constraint’ imposed by baseline control laws on sensorimotor adaptation. This seems to echo and extend previous work of Wu, Smith et al. (Nat Neurosci, 2014): their findings, which were not necessarily always replicated, suggested that the more participants were variable in baseline, the better they adapted to a systematic perturbation. The authors may study whether residual errors are smaller or adaptation is faster for individuals with larger motor variability in baseline. Unfortunately, the authors do not present the classic time course of sensorimotor adaptation in any experiment. The adaptation is not described as typically done: the authors should thus show the changes in tip movement direction and stick-tilt angle across trials, and highlight any significant difference between baseline, early adaptation, and late adaptation, for instance. I also wonder why the authors did not include a few no-perturbation trials after the exposure phase to study after-effects in the study design: it looks like a missed opportunity here. Overall, I think that showing the time course of adaptation is necessary for the present study to provide a more comprehensive understanding of that new task, and to re-explore the role of motor variability during baseline for sensorimotor adaptation.

      We appreciate the reviewer for raising these important issues.

      Regarding the learning curve, because the amount of perturbation was gradually increased except for Exp.1B, we were not able to obtain typical learning curves (i.e., the curve showing errors decaying exponentially with trials). However, it may still be useful to show how the movement changed with trials during adaptation. Therefore, following the reviewer's suggestion, we have added the figures of the time course of adaptation in the supplementary data (Figures S1, S2, S4, and S5).

      There are two reasons why our experiments did not include aftereffect quantification trials (i.e., probe trials). First, in the case of adaptation to a visual perturbation (e.g., visual rotation), probe trials are not necessary because the degree of adaptation can be easily quantified by the amount of compensation in the perturbation trials (however, in the case of dynamic perturbations such as force fields, the use of probe trials is necessary). Second, the inclusion of probe trials allows participants to be aware of the presence of the perturbation, which we would like to avoid.

      We also appreciate the interesting additional questions regarding the relevance of our work to the relationship between baseline motor variability and adaptation performance. As this topic, although interesting, is outside the scope of this paper, we concluded that we would not address it in the manuscript. In fact, the experiments were not ideal for quantifying motor variability in the baseline phase because participants had to aim at different targets, which could change the characteristics of motor variability. In addition, we gradually increased the size of the perturbation except for Exp.1B (see Author response image 1, upper panel), which could make it difficult to assess the speed of adaptation. Nevertheless, we think it is worth mentioning this point in this rebuttal. Specifically, we examined the correlation between baseline motor variability when aiming at the 0 deg target (tip-movement direction or stick-tilt angle) and adaptation speed in Exp 1A and Exp 1B (Author response image 1 and Author response image 2). To assess adaptation speed in Exp.1A, we quantified the slope of the tip-movement direction to a gradually increasing perturbation (Author response image 1, upper panel). The adaptation speed in Exp.1B was obtained by fitting the exponential function to the data (Author response image 2, upper panel). Although the statistical results were not completely consistent, we found that the participants with greater motor variability at baseline tended to show faster adaptation, as shown in a previous study (Wu et al., Nat Neurosci, 2014).

      Author response image 1.

      Correlation between the baseline variability and learning speed (Experiment 1A). In Exp 1A, the rotation of the tip-movement direction was gradually increased by 1 degree per trial up to 30 degrees. The learning speed was quantified by calculating how quickly the direction of movement followed the perturbation (upper panel). The lower left panel shows the variability of the tip-movement direction versus learning speed, while the lower right panel shows the variability of the stick-tilt angle versus learning speed. Baseline variability was calculated as a standard deviation across trials (trials in which a target appeared in a 0-degree direction).
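The quantities named in this caption can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' analysis code: the function names `ols_slope` and `pearson_r`, the use of the sample standard deviation, and all numbers are assumptions. Only the perturbation schedule (1 degree per trial up to 30 degrees) comes from the caption.

```python
import math
import statistics


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Ramp described for Exp 1A: rotation grows by 1 degree per trial up to 30.
perturbation = list(range(1, 31))

# Hypothetical participant who compensates for 80% of the rotation.
response = [0.8 * p for p in perturbation]

# Learning speed: how closely the movement direction tracks the ramp.
learning_speed = ols_slope(perturbation, response)

# Baseline variability: SD across baseline trials to the 0-degree target
# (made-up directions in degrees; sample SD is an assumption).
baseline_sd = statistics.stdev([-1.2, 0.4, 0.9, -0.3, 0.2])
```

Across participants, `pearson_r(baseline_sds, learning_speeds)` would then give the kind of correlation plotted in the lower panels.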

      Author response image 2.

      Correlation between the baseline variability and learning speed (Experiment 1B). In Exp 1B, the rotation of the tip-movement direction was abruptly applied from the first trial (30 degrees). The learning speed was calculated as a time constant obtained by exponential curve fitting. The lower left panel shows the variability of the tip-movement direction versus learning speed, while the lower right panel shows the variability of the stick-tilt angle versus learning speed. Baseline variability was calculated as a standard deviation across trials (trials in which a target appeared in a 0-degree direction).
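One way to obtain a time constant from an abrupt-perturbation learning curve is sketched below. The model form (a single exponential plus a constant offset), the grid search over candidate time constants, and the name `fit_exponential` are all assumptions made for illustration; the response does not specify the fitting procedure, and the data here are synthetic.

```python
import math


def fit_exponential(y, candidate_taus):
    """Fit y[t] ~ a * exp(-t / tau) + c for trial index t = 0, 1, 2, ...

    For each candidate tau, a and c are found by linear least squares;
    the tau with the smallest squared error is returned.
    Returns (tau, a, c).
    """
    n = len(y)
    best = None
    for tau in candidate_taus:
        basis = [math.exp(-t / tau) for t in range(n)]
        sb = sum(basis)
        sbb = sum(b * b for b in basis)
        sy = sum(y)
        sby = sum(b * v for b, v in zip(basis, y))
        det = n * sbb - sb * sb
        if abs(det) < 1e-12:
            continue  # basis is numerically constant; skip this tau
        a = (n * sby - sb * sy) / det
        c = (sy - a * sb) / n
        sse = sum((a * b + c - v) ** 2 for b, v in zip(basis, y))
        if best is None or sse < best[0]:
            best = (sse, tau, a, c)
    if best is None:
        raise ValueError("no usable candidate time constant")
    return best[1], best[2], best[3]


# Synthetic error curve: 30-degree initial error decaying with tau = 8 trials
# toward a 2-degree residual (values chosen only for illustration).
synthetic = [30.0 * math.exp(-t / 8.0) + 2.0 for t in range(60)]
taus = [0.5 * k for k in range(1, 101)]
tau, amplitude, offset = fit_exponential(synthetic, taus)
```

A smaller time constant means faster adaptation, so the sign of any correlation with baseline variability has to be read accordingly.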

      The distance between hands was fixed at 15 cm with the Kinarm instead of a mechanical constraint. I wonder how much this distance varied and more importantly whether from that analysis or a force analysis, the authors could determine whether one hand led the other one in the adaptation.

      Thank you very much for this important comment. Since the distance between the two hands was maintained by the stiff virtual spring (2000 N/m), it was kept almost constant throughout the experiments as shown in Author response image 3 (the averaged distance during a movement). The distance was also maintained during reaching movements (Author response image 4).

      We also thank the reviewer for the suggestion regarding the force analysis. As shown in Author response image 5, we did not find a role for a specific hand for motor adaptation from the handle force data. Specifically, Author response image 5 shows the force applied to each handle along and orthogonal to the stick. If one hand led the other in adaptation, we should have observed a phase shift as adaptation progressed. However, no such hand specific phase shift was observed. It should be noted, however, that it was theoretically difficult to know from the force sensors which hand produced the force first, because the force exerted by the right handle was transmitted to the left handle and vice versa due to the connection by the stiff spring. 

      Author response image 3.

      The distance between hands during the task. We show the average distance between hands for each trial. The shaded area indicates the standard deviation across participants.

      Author response image 4.

      Changes in the distance between hands over the course of the movement. The colors indicate the trial epochs shown in the legend on the right.

      Author response image 5.

      The force profile during the movement (Exp 1A). We decomposed the force on each handle into components along the stick (upper panels) and orthogonal to it (lower panels). Changes in the force profiles in the adaptation phase are shown (left: left hand force, right: right hand force). The colors (magenta to cyan) indicate the trial epochs shown in the legend on the right.
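The decomposition described in this caption amounts to projecting each handle force onto the stick axis and onto its normal. A minimal sketch follows, assuming a 2D workspace and a stick axis pointing from the left hand to the right hand; the function name `decompose_force`, the sign convention, and the example values are hypothetical.

```python
import math


def decompose_force(force, left_hand, right_hand):
    """Split a 2D handle force into components along and orthogonal to the stick.

    force, left_hand, right_hand are (x, y) tuples. The stick axis is taken
    to point from the left hand to the right hand (an assumed convention).
    Returns (along, orthogonal).
    """
    dx = right_hand[0] - left_hand[0]
    dy = right_hand[1] - left_hand[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("hands coincide; stick axis is undefined")
    ux, uy = dx / length, dy / length          # unit vector along the stick
    along = force[0] * ux + force[1] * uy      # projection onto the stick
    orthogonal = -force[0] * uy + force[1] * ux  # projection onto the normal
    return along, orthogonal


# Example: hands 15 cm apart on a horizontal stick, made-up force in newtons.
along, orthogonal = decompose_force((3.0, 4.0), (0.0, 0.0), (0.15, 0.0))
```

Applying this at every time sample of a trial yields the along-stick and orthogonal force profiles for each hand.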

      I understand the distinction between task- and end-effector irrelevant perturbation, and at the same time results show that the nervous system reacts to both types of perturbation, indicating that they both seem relevant or important. In line 32, the errors mentioned at the end of the sentence suggest that adaptation is in fact maladaptive. I think the authors may extend the Discussion on why adaptation was found in the experiments with end-effector irrelevant and especially how an internal (forward) model or a pair of internal (forward) models may be used to predict both the visual and the somatosensory consequences of the motor commands.

      Thank you very much for your comment. As we already described in the discussion of the original manuscript (Lines 519-538 in the revised manuscript), two potential explanations exist for the motor system’s response to the end-effector irrelevant perturbation (i.e., stick rotation). First, the motor system predicts the sensory information associated with the action and attempts to correct any discrepancies between the prediction and the actual sensory consequences, regardless of whether the error information is end-effector relevant or end-effector irrelevant. Second, given the close coupling between the tip-movement direction and stick-tilt angle, the motor system can estimate the presence of end-effector relevant error (i.e., tip-movement direction) by the presence of end-effector irrelevant error (i.e., stick-tilt angle). This estimation should lead to the change in the tip-movement direction. As the reviewer pointed out, the mismatch between visual and proprioceptive information is another possibility; we have added a description of this point to the Discussion (Lines 523-526).

      Reviewer #1 (Recommendations For The Authors):

      Minor

      Line 16: “it remains poorly understood” is quite subjective and I would suggest reformulating this statement.

      We have reformulated this statement as “This limitation prevents the study of how….”  (Line 16).

      Introduction

      Line 49: the authors may be more specific than just saying ‘this task’. In particular, they need to clarify that there is no redundancy in studies where the shoulder is fixed and all movement is limited to a plane ... which turns out to truly happen in a limited set of experimental setups (for example: Kinarm exoskeleton, but not endpoint; Kinereach system...).

      We have changed this to “such a planar arm-reaching task” (Line 49).

      Line 61: large, not infinite because of biomechanical constraints.

      We have changed “an infinite” to “a large” (Line 61) and “infinite” to “a large number of” (legend in Fig. 1f).

      Lines 67-69: consider clarifying.

      We have tried to clarify the sentence (Lines 67-69).

      Results

      TMD and STA, and TMD-STA plane, are new terms with new acronyms that are not easy to immediately understand. Consider avoiding acronyms.

      We have reduced the use of these acronyms as much as possible. 

      “visual TMD–STA plane” -> “plane representing visual movement patterns” (Lines 179-180)

      “TMD axis” -> “x-axis” (Line 181, Line 190)

      “physical TMD–STA plane” -> “plane representing physical movement patterns” (Lines 182-187)

      “physical TMD–STA plane” -> “physical plane” (Line 191, Line 201, Lines 216-217, Line 254, Line 301, Line 315, Line 422, Line 511, and captions of Figures 4-9, S3)

      “visual TMD–STA plane” -> “visual plane” (Line 193, Line 241, Line 248, Line 300, Lines 313-314, and captions of Figures 4-9, S3)

      “STA axis” -> “y-axis” (Line 241)

      Line 169: please clarify the mismatch(es) that are created when the tip-movement direction is visually rotated in the CCW direction around the starting position (tip perturbation), whereas the stick-tilt angle remains unchanged.

      Thank you for pointing this out. We have clarified that the stick-tilt angle remains identical to the tilt of both hands (Lines 171-172).

      Discussion

      I understand the physical constraint imposed between the 2 hands with the robotic device, but I am not sure I understand the physical constraint imposed by the TMD-STA relationship.

      The phrase “physical constraint” referred to the constraint on movement in physical space. However, as the reviewer pointed out, this phrase could be confused with the constraint between the two hands. Therefore, we have avoided using the phrase “physical constraint” throughout the manuscript.

      Some work looking at 3-D movements should be used for Discussion (e.g. Lacquaniti & Soechting 1982; work by d’Avella A or Jarrasse N).

      Thank you for sharing this important information. We have cited these studies in Discussion (Lines 380-382). 

      Reviewer #2 (Public Review):

      Summary:

      The authors have developed a novel bimanual task that allows them to study how the sensorimotor control system deals with redundancy within our body. Specifically, the two hands control two robot handles that control the position and orientation of a virtual stick, where the end of the stick is moved into a target. This task has infinite solutions to any movement, where the two hands influence both tip-movement direction and stick-tilt angle. When moving to different targets in the baseline phase, participants change the tilt angle of the stick in a specific pattern that produces close to the minimum movement of the two hands to produce the task. In a series of experiments, the authors then apply perturbations to the stick angle and stick movement direction to examine how either tip-movement (task-relevant) or stick-angle (task-irrelevant) perturbations affect adaptation. Both types of perturbations affect adaptation, but this adaptation follows the baseline pattern of tip-movement and stick angle relation such that even task-irrelevant perturbations drive adaptation in a manner that results in task-relevant errors. Overall, the authors suggest that these baseline relations affect how we adapt to changes in our tasks. This work provides an important demonstration that underlying solutions/relations can affect the manner in which we adapt. I think one major contribution of this work will also be the task itself, which provides a very fruitful and important framework for studying more complex motor control tasks.

      Strengths:

      Overall, I find this a very interesting and well-written paper. Beyond providing a new motor task that could be influential in the field, I think it also contributes to studying a very important question - how we can solve redundancy in the sensorimotor control system, as there are many possible mechanisms or methods that could be used - each of which produces different solutions and might affect the manner in which we adapt.

      Weaknesses:

      I would like to see further discussion of what the particular chosen solution implies in terms of optimality.

      The underlying baseline strategy used by the participants appears to match the path of minimum movement of the two hands. This suggests that participants are simultaneously optimizing accuracy and minimizing some metabolic cost or effort to solve the redundancy problem. However, once the perturbations are applied, participants still use this strategy for driving adaptation. I assume that this means that the solution that participants end up with after adaptation actually produces larger movements of the two hands than required. That is - they no longer fall onto the minimum hand movement strategy - which was used to solve the problem. Can the authors demonstrate that this is either the case or not clearly? These two possibilities produce very different implications in terms of the results.

      If my interpretation is correct, such a result (using a previously found solution that no longer is optimal) reminds me of the work of Selinger et al., 2015 (Current Biology), where participants continue to walk at a non-optimal speed after perturbations unless they get trained on multiple conditions to learn the new landscape of solutions. Perhaps the authors could discuss their work within this kind of interpretation. Do the authors predict that this relation would change with extensive practice either within the current conditions or with further exploration of the new task landscape? For example, if more than one target was used in the adaptation phase of the experiment?

      On the other hand, if the adaptation follows the solution of minimum hand movement and therefore potentially effort, this provides a completely different interpretation.

      Overall, I would find the results even more compelling if the same perturbations applied to movements to all of the targets and produced similar adaptation profiles. The question is to what degree the results derive from only providing a small subset of the environment to explore.

      Thank you very much for pointing out this significant issue. As the reviewer correctly interprets, the physical movement patterns deviated from the baseline relationship as exemplified in Exp.2. However, this deviation is not surprising for the following reason. Under the perturbation that creates the dissociation between the hands and the stick, the motor system cannot simultaneously return both the visual stick motion and physical hands motion to the original motions: When the motor system tries to return the visual stick motion to the original visual motion, then the physical hands motion inevitably deviates from the original physical hands motion, and vice versa.  

      Our interpretation of this result is that the motor system corrects the movement to reduce the visual dissociation of the visual stick motion from the baseline motion (i.e., sensory prediction error), but this movement correction is biased by the baseline physical hands motion. In other words, the motor system attempts to balance the minimization of sensory prediction error and the minimization of motor cost. Thus, our results do not indicate that the final adaptation pattern is non-optimal, but rather reflect the attempts for optimization.

      In the revised manuscript, we have added the description of this interpretation (Lines 515-517).

      Reviewer #2 (Recommendations For The Authors):

      The authors have suggested that the only study (line 472) that has also examined an end-effector irrelevant perturbation is the bimanual study of Omrani et al., 2013, which only examined reflex activity rather than adaptation. To clarify this issue - exactly what is considered end-effector irrelevant perturbations - I was wondering about the bimanual perturbations in Dimitriou et al., 2012 (J Neurophysiol) and the simultaneous equal perturbations in Franklin et al., 2016 (J Neurosci), as well as other recent papers studying task-irrelevant disturbances which aren’t discussed. I would consider these both to also be end-effector irrelevant perturbations, although again they only used these to study reflex activity and not adaptation as in the current paper. Regardless, further explanation of exactly what is the difference between task-irrelevant and end-effector irrelevant would be useful to clarify the exact difference between the current manuscript and previous work.

      Thank you for your helpful comments. We have included as references the studies by Dimitriou et al. (Line 490) and Franklin et al. (Lines 486-487), which use an end-effector irrelevant perturbation and the task-irrelevant perturbation condition, respectively. We have also added further explanation of the difference between task-irrelevant and end-effector irrelevant perturbations (Lines 344-352).

      Line 575: I assume that you mean peak movement speed

      We have added “peak”. (Line 597).

      Reviewer #3 (Public Review):

      Summary:

      This study explored how the motor system adapts to new environments by modifying redundant body movements. Using a novel bimanual stick manipulation task, participants manipulated a virtual stick to reach targets, focusing on how tip-movement direction perturbations affected both tip movement and stick-tilt adaptation. The findings indicated a consistent strategy among participants who flexibly adjusted the tilt angle of the stick in response to errors. The adaptation patterns are influenced by physical space relationships, guiding the motor system’s choice of movement patterns. Overall, this study highlights the adaptability of the motor system through changes in redundant body movement patterns.

      Strengths:

      This paper introduces a novel bimanual stick manipulation task to investigate how the motor system adapts to novel environments by altering the movement patterns of our redundant body.

      Weaknesses:

      The generalizability of the findings is quite limited. It would have been interesting to see if the same relationships were held for different stick lengths (i.e., the hands positioned at different start locations along the virtual stick) or when reaching targets to the left and right of a start position, not just at varying angles along one side. Alternatively, this study would have benefited from a more thorough investigation of the existing literature on redundant systems instead of primarily focusing on the lack of redundancy in endpoint-reaching tasks. Although the novel task expands the use of endpoint robots in motor control studies, the utility of this task for exploring motor control and learning may be limited.

      Thank you very much for the important comment. Given that there are many parameters (e.g., stick length, locations of hands, target position, etc.), one may wonder how the findings obtained from only one combination can be generalized to other configurations. In the revised manuscript, we have explicitly described this point (Lines 356-359).

      Thus, the generalizability needs to be investigated in future studies, but we believe that the main results also apply to other configurations. Regarding the baseline stick movement pattern, the control with tilting the stick was observed regardless of the stick-tip positions (Author response image 6). Regarding the finding that the adapted stick movement patterns follow the baseline movement patterns, we confirmed the same results even when the other targets were used as the target for the adaptation (Author response image 7). 

      Author response image 6.

      Stick-tip manipulation patterns when the length of the stick varied. Top: 10 naïve participants moved the stick with different lengths. A target appeared on one of five directions represented by a color of each tip position. Regardless of the length of the stick and laterality, a similar relationship between tip-movement direction and stick-tilt angle was observed. (middle: at peak velocity, bottom: at movement offset).

      Author response image 7.

      Patterns of adaptation when using the other targets. In the baseline phase, 40 naïve participants moved a stick tip to a peripheral target (24 directions). They showed a stereotypical relationship between the tip-movement direction and the stick-tilt angle (a bold gray curve). In the adaptation phase, participants were divided into four groups, each with a different target training direction (lower left, lower right, upper right, or upper left), and visual rotation was gradually imposed on the tip-movement direction. Irrespective of the target direction, the adaptation pattern of the tip-movement direction and stick-tilt angle followed the baseline relationship.

      We also thank you for your comment about studying the existing redundant systems. We understand the reviewer's concern about the usefulness of our task, but we believe that we have proposed a novel framework for motor adaptation in redundant systems. Future studies will be able to clarify how the knowledge gained from our task can be generally applied to understand the control and learning of redundant systems.

      Reviewer #3 (Recommendations For The Authors):

      Line 49: replace “uniquely” with primarily. A number of features of the task setup could affect the joint angles, from if/how the arm is supported, whether the wrist is fixed, alignment of the target in relation to the midline of the participant, duration of the task, and whether fatigue is an issue, etc. Your statement relates to fixed limb lengths of a participant, rather than standard reaching tasks as a whole. Not to mention the degree of inter- and intra-subject variability that does exist in point-to-point reaching tasks.

      Thank you for your helpful point. We have replaced “uniquely” with “primarily”. (Line 49).

      Line 72: the cursor is not an end-effector - it represents the end-effector.

      We have changed the expression to “the perturbation to the cursor representing the position of the end-effector” (Line 72).

      Lines 73 – 78: it would benefit the authors to consider the role of intersegmental dynamics.

      Thank you for your suggestion. We are not sure if we understand this suggestion correctly, but we interpret it to mean that the end-effector perturbation can be implemented by using a perturbation that considers the intersegmental dynamics. However, the implementation is not so straightforward, and the panels in Figure 1j,k are only conceptual for the end-effector irrelevant perturbation. Therefore, we have not described the contribution of intersegmental dynamics here.

      Lines 90 – 92: “cannot” should be “did not”, as the studies being referenced are already completed. This statement should be further unpacked to explain what they did do, and how that does not meet the requirement of redundancy in movement patterns.

      We have changed “cannot” to “did not” (Line 91). We have also added a description of what the previous studies had demonstrated (Lines 88-90).

      Figure text could be enlarged for easier viewing.

      We have enlarged texts in all figures. 

      Lines 41 - 47: Interesting selection of supporting references. For the introduction of a novel environment, I would recommend adding the support of Shadmehr and Mussa-Ivaldi 1994.

      Thank you for your suggestion. We have added Shadmehr and Mussa-Ivaldi 1994 as a reference (Line 45).

      Line 49: “this task” is vague - the above references relate to a number of different tasks. For example, the authors could replace it with a reaching task involving an end-point robot.

      Thank you very much for your suggestion. As per the suggestion by Reviewer #1, we have changed this to “such a planar arm-reaching task” (Line 49).

      Line 60: “hypothetical limb with three joints” - in Figure 1a, the human subject, holding the handle of a robotic manipulandum does have flexibility around the wrist.

      Previous studies using planar arm-reaching tasks have constrained the wrist joint (e.g., Flash & Hogan, 1985; Gordon et al., 1994; Nozaki et al., 2006). We tried to emphasize this point as “participants manipulate a visual cursor with their hands primarily by moving their shoulder and elbow joints” (Line 42). In the revised manuscript, we have also emphasized this point in the legend of Figure 1a.

      Lines 93-108: this paragraph could be cleaned up more clearly stating that while the use of task-irrelevant perturbations has been used in the domain of reaching tasks, the focus of these tasks has not been specifically to address “In our task, we aim to exploit this feature by doing”

      Thank you very much for your helpful comments. To make this paragraph clear, we have modified some sentences (Lines 100-104).

      Line 109: “coordinates to adapt” is redundant.

      We have changed this to “adapts” (Line 110).

      Lines 109-112: these sentences could be combined to have better flow.

      Thank you very much for your valuable suggestion. We have combined these two sentences for better flow (Lines 110-112).

      Line 113-114: consider rewording - “This is a redundant task because ...” to something like “Redundancy in the task is achieved by acknowledging that ....“.

      We have changed the expression according to the reviewer’s suggestion (Line 114).

      Line 118: Consider changing “changes” to “makes use of”.

      We have changed the expression (Line 119).

      Lines 346 - 348: grammar and clarity - “This redundant motor task enables the investigation of adaptation patterns in the redundant system following the introduction of perturbations that are either end-effector relevant, end-effector irrelevant, or both.“.

      Thank you very much again for your helpful suggestion on the English expression. We have adopted the sentence you suggested (Lines 354-356).

    1. eLife Assessment

      The contributions of ipsilateral cortical pathways to motor control are not yet fully understood. Here, the authors present important insights into their role in locomotion following unilateral spinal cord injury. Their data provide convincing evidence in rats that stimulation of ipsilateral motor cortex improves the injured side's ability to support weight and leads to improved locomotion, a result that may inspire new treatments for spinal or cerebral injuries.

    2. Reviewer #2 (Public review):

      Summary:

      The authors' long-term goals are to understand the utility of precisely phased cortex stimulation regimes on recovery of function after spinal cord injury (SCI). In prior work the authors explored effects of contralesion cortex stimulation. Here, they explore ipsilesion cortex stimulation in which the ipsilesion corticospinal fibers that cross at the pyramidal decussation are spared. The authors explore the effects of such stimulation in intact rats and rats with a hemisection lesion at thoracic level ipsilateral to the stimulated cortex. The appropriately phased microstimulation enhances contralateral flexion and ipsilateral extension, presumably through lumbar spinal cord crossed extension interneuron systems. This microstimulation improves weight bearing in the ipsilesion hindlimb soon after injury, before any normal recovery of function would be seen. The contralateral homologous cortex can be lesioned in intact rats without impacting the microstimulation effect on flexion and extension during gait. In two rats ipsilateral flexion responses are noted, but these are not clearly demonstrated to be independent of the contralateral homologous cortex remaining intact.

      Strengths:

      This paper adds to prior data on cortical microstimulation by the authors' laboratory in interesting ways. First, the strong effects of the spared crossed fibers from ipsi-lesional cortex in parts of the ipsi-lesion leg's step cycle and weight support function are solidly demonstrated. This raises the interesting possibility that stimulating contra-lesion cortex as reported previously may execute some of its effects through callosal coordination with the ipsi-lesion cortex tested here. This is also now discussed by the authors and may represent a significant aspect of these data. The authors demonstrate solidly that ablation of the contra-lesional cortex does not impede the effects reported here. I believe this has not been shown for the contra-lesional cortex microstimulation effects reported earlier, but I may be wrong.

      Effects and neuroprosthetic control of these effects are explored well in the ipsi-lesion cortex tests here.

      Weaknesses:

      Some data are based on only a few rats: for example, N=2 for the ipsilateral flexion effects of microstimulation and N=3 for homologous cortex ablation, where it seems only ipsilateral extension is tested. However, these data clearly point the way and replication is likely.

      Likely Impacts:

      This data adds in significant ways to prior work by the authors, and an understanding of how phased stimulation in cortical neuroprosthetics may aid in recovery of function after SCI, especially if a few ambiguities in writing and interpretation are fully resolved.

    3. Reviewer #3 (Public review):

      Summary:

      This article aims to investigate the impact of neuroprosthesis (intracortical microstimulation) implanted unilaterally on the lesion side in the context of locomotor recovery following thoracic spinal hemisection.

      Strength:

      The study reveals that stimulating the left motor cortex, on the same side as the lesion, not only activates the expected right (contralateral) muscle activity but also influences unexpected muscle activity on the left (ipsilateral) side. These muscle activities resulted in a substantial enhancement in lift during the swing phase of the contralateral limb and improved trunk-limb support for the ipsilateral limb. They used different experimental and stimulation conditions to show the ipsilateral limb control evoked by the stimulation. This outcome holds significance, shedding light on the engagement of the contralateral-projecting corticospinal tract (CST) in activating not only a contralateral but also an ipsilateral spinal network.

      The experimental design and findings align with the investigation of the stimulation effect of contralateral projecting CSTs. They carefully examined the recovery of ipsilateral limb control with motor maps, and they also tested the effective sites of cortical stimulation. The study successfully demonstrates the impact of electrical stimulation on the contralateral projecting neurons on ipsilateral limb control during locomotion, as well as identifying important stimulation spots for such effect. These results contribute to our understanding of how these neurons influence bilateral spinal circuitry. The study's findings contribute valuable insights to the broader neuroscience and rehabilitation communities.

      Weakness:

      The term "ipsilateral" lacks a clear definition in some cases, potentially causing confusion for the reader. Readers may link the ipsilateral cortical network to ipsilateral-projecting CSTs, which are less likely to play a role in ipsilateral limb control in this study since this tract is disrupted by the thoracic hemisection.

      Specific comments:

      Abstract: Line 1-4: Consider refining the initial sentences of the abstract to reduce ambiguity around the term 'ipsilateral lesion' and its potential conflation with ipsilateral projecting cortical neurons.

      The abstract begins with 'Control of voluntary limb movement is predominantly attributed to the contralateral motor cortex.' This is followed by, 'However, increasing evidence suggests the involvement of ipsilateral cortical networks in this process, especially in motor tasks requiring bilateral coordination, such as locomotion.'

      The phrase 'ipsilateral cortical networks' remains somewhat unclear. Readers may mistakenly interpret it as referring to the ipsilateral projecting corticospinal tract (CST), which is not the focus of this study.

      Shifting the focus away from 'ipsilateral cortical control' and instead highlighting ipsilateral limb control following a spinal hemisection would improve clarity. This adjustment would also align the title and abstract more closely with the study's primary focus.

      Introduction:

      It is suggested to revise the introduction to more closely align with the study's experimental design and outcomes, placing emphasis on the stimulation effects observed in contralateral projecting tracts rather than implying a primary focus on ipsilateral projecting CST neurons.

      Line 30-32: "Nevertheless, the function of the ipsilateral motor cortex is unclear and its role in the recovery of motor control after injury remains controversial." This still gives the impression that the ipsilateral projecting CST is the topic of the research here. Also, some of the cited references discuss ipsilateral projecting CSTs.

      Line 34-36: "While the most prominent feature of motor cortex pathways is their contralateral organization, unilateral or bilateral movements are well represented in the ipsilateral hemisphere." This sentence is unclear to me. It would be helpful to specify what 'ipsilateral hemisphere' refers to: ipsilateral to what? Clarifying whether it's ipsilateral to the lesion or another reference point would make the statement more precise.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This manuscript reveals important insights into the role of ipsilateral descending pathways in locomotion, especially following unilateral spinal cord injury. The study provides solid evidence that this method improves the injured side's ability to support weight, and as such the findings may lead to new treatments for stroke, spinal cord injuries, or unilateral cerebral injuries. However, the methods and results need to be better detailed, and some of the statistical analysis enhanced.

      Thank you for your assessment. We incorporated various text improvements in the final version of the manuscript to address the weaknesses you have pointed out. The specific improvements are outlined below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript provides potentially important new information about ipsilateral cortical impact on locomotion. A number of issues need to be addressed.

      Strengths:

      The primary appeal and contribution of this manuscript are that it provides a range of different measures of ipsilateral cortical impact on locomotion in the setting of impaired contralateral control. While the pathways and mechanisms underlying these various measures are not fully defined and their functional impacts remain uncertain, they comprise a rich body of results that can inform and guide future efforts to understand cortical control of locomotion and to develop more effective rehabilitation protocols.

      Weaknesses:

      (1) The authors state that they used a cortical stimulation location that produced the largest ankle flexion response (lines 102-104). Did other stimulation locations always produce similar, but smaller responses (aside from the two rats that showed ipsilateral neuromodulation)? Was there any site-specific difference in response to stimulation location?

      We derived motor maps in each rat, akin to the representation depicted in Fig 6. In each rat, alternative cortical sites did, indeed, produce distal or proximal contralateral leg flexion responses. Distal responses were more likely to be evoked in the rostral portion of the array, similarly to proximal responses early after injury. This distribution in responses across different cortical sites is reported in this study (Fig. 6) and is consistent with our prior work. The Results section has been revised to provide additional clarification of the passage you indicated and context for the data presented in Figure 6:

      On page 4, we have clarified: “Stimulation through these channels produced a strong whole-leg flexion movement, with an evident distal component. From visual inspection, all responding electrodes in the array produced contralateral leg flexion, although with different strength of contraction for a fixed stimulation intensity (100μA). Moreover, some sites did not present a distal movement component, failing in eliciting ankle flexion and resulting in a generally weaker proximal flexion.”

      On page 12, we have further noted: “By visually inspecting the responses elicited by stimulation delivered through each of the array electrodes, we categorized movements as proximal or distal. This classification was based on whether the ankle participated in the evoked response or if the movement was restricted to the proximal hindlimb. Each leg was scored independently.”

      (2) Figure 2: There does not appear to be a strong relationship between the percentage of spared tissue and the ladder score. For example, the animal with the mild injury (based on its ladder score) in the lower left corner of Figure 2A has less than 50% spared tissue, which is less spared tissue than in any animal other than the two severe injuries with the most tissue loss. Is it possible that the ladder test does not capture the deficits produced by this spinal cord injury? Have the authors looked for a region of the spinal cord that correlates better with the deficits that the ladder test produces? The extent of damage to the region at the base of the dorsal column containing the corticospinal tract would be an appropriate target area to quantify and compare with functional measures.

      In Fig. S6 of our 2021 publication "Bonizzato and Martinez, Science Translational Medicine", we investigated the predictive value of tissue sparing in specific sub-regions of the spinal cord for ladder performance. Among others, we examined the correlation between the accuracy of left leg ladder performance in the acute state and the preservation of the corticospinal tract (CST). Our results indicated that dorsal CST sparing serves as a mild predictor for ladder deficits, confirming the results obtained in this study.

      (3) Lines 219-221: The authors state that "phase-coherent stimulation reinstated the function of this muscle, leading to increased burst duration (90±18% of the deficit, p=0.004, t-test, Fig. 4B) and total activation (56±13% of the deficit, p=0.014, t-test, Fig. 3B)". This way of expressing the data is unclear. For example, the previous sentence states that after SCI, burst duration decreased by 72%. Does this mean that the burst duration after stimulation was 90% higher than the -72% level seen with SCI alone, i.e., 90% + -72% = +18%? Or does it mean that the stimulation recovered 90% of the portion of the burst duration that had been lost after SCI, i.e., -72% * (100%-90%)= -7%? The data in Figure 4 suggests the latter. It would be clearer to express both these SCI alone and SCI plus stimulation results in the text as a percent of the pre-SCI results, as done in Figure 4.

      Your assessment is correct; we intended to report that the stimulation recovered 90% of the portion of the burst duration that had been lost after SCI. This point has been clarified (see page 9):

      “…leading to increased burst duration (recovered 90±18% of the lost burst duration, p=0.004, t-test, Fig. 4B) and total activation (recovered 56±13% of the total activation, p=0.014, t-test, Fig. 3B)”
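
      The clarified reading can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the group means quoted above (a 72% loss of burst duration after SCI, of which 90% is recovered), not per-animal data.

```python
def burst_duration_after_stimulation(deficit_pct, recovered_pct):
    """Return burst duration as a percentage of the pre-SCI (intact) level.

    deficit_pct: loss after SCI, in percent of the intact value (72 in the text).
    recovered_pct: share of that loss recovered by stimulation (90 in the text).
    Hypothetical helper for illustration; not taken from the manuscript.
    """
    after_sci = 100.0 - deficit_pct               # level with SCI alone
    recovered = deficit_pct * recovered_pct / 100.0  # portion of the loss regained
    return after_sci + recovered


after_stim = burst_duration_after_stimulation(72.0, 90.0)
change_vs_intact = after_stim - 100.0  # negative means still below intact level
print(f"{after_stim:.1f}% of intact, i.e. {change_vs_intact:+.1f}%")
```

      Under these assumptions the burst duration with stimulation is about 93% of the intact value, i.e. roughly 7% below it, which matches the second reading proposed by the reviewer.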

      (4) Lines 227-229: The authors claim that the phase-dependent stimulation effects in SCI rats are immediate, but they don't say how long it takes for these effects to be expressed. Are these effects evident in the response to the first stimulus train, or does it take seconds or minutes for the effects to be expressed? After the initial expression of these effects, are there any gradual changes in the responses over time, e.g., habituation or potentiation?

      The effects are immediately expressed at the very first occurrence of stimulation. We never tested a rat completely naïve to stimuli, as each treadmill session involves prior cortical mapping to identify a suitable active site for involvement in locomotor experiments. Yet, as demonstrated in Supplementary Video 1 accompanying our 2021 publication on contralateral effects of cortical stimulation, "Bonizzato and Martinez, Science Translational Medicine," the impact of phase-dependent cortical stimulation on movement modulation is instantaneous and ceases promptly upon discontinuation of the stimulation. We did not quantify potential gradual changes in responsiveness over time, and we cannot exclude that for long stimulation sessions (e.g., 30 min or more), stimulus amplitude may need to be slightly increased over time to compensate for habituation.

      (5) Awake motor maps (lines 250-277): The analysis of the motor maps appears to be based on measurements of the percentage of channels in which a response can be detected. This analytic approach seems incomplete in that it only assesses the spatial aspect of the cortical drive to the musculature. One channel could have a just-above-threshold response, while another could have a large response; in either case, the two channels would be treated as the same positive result. An additional analysis that takes response intensity into account would add further insight into the data, and might even correlate with the measures of functional recovery. Also, a single stimulation intensity was used; the results may have been different at different stimulus intensities.

      We confirm that maps of cortical stimulation responsiveness may vary at different stimulus amplitudes. To establish an objective metric of excitability, we identified 100µA as a reliable stimulation amplitude across rats and used this value to build the ipsilateral motor representation results in Figure 6. This choice allows direct comparison with Figure 6 of our 2021 article, related to contralateral motor representation. The comparison reveals a lack of correlation with functional recovery metrics in the ipsilateral case, in contrast to the successful correlation achieved in the contralateral case.

      Regarding the incorporation of stimulation amplitudes into the analysis, as detailed in the Method section (lines 770-771), we systematically tested various stimulation amplitudes to determine the minimal threshold required for eliciting a muscle twitch, identified as the threshold value. This process was conducted for each electrode site.

      Upon reviewing these data, we considered the possibility of presenting an additional assessment of ipsilateral cortical motor representation based on stimulation thresholds. However, the representation depicted in the figure did not differ significantly from the data presented in Figure 6A. Furthermore, this representation introduced an additional weakness, as it was unclear how to represent the absence of a response in the threshold scale. We chose to arbitrarily designate it as zero on the inverse logarithmic scale, where, for reference, 100 µA is positioned at 0.2 and 50 µA at 0.5.

      In conclusion, we believe that the conclusions drawn from this analysis align substantially with those in the text. The addition of the threshold analysis, in our assessment, would not contribute significantly to improving the manuscript.

      Author response image 1.

      Threshold analysis

      Author response image 2.

      Occurrence probability analysis, for comparison.

      (6) Lines 858-860: The authors state that "All tests were one-sided because all hypotheses were strictly defined in the direction of motor improvement." By using the one-sided test, the authors are using a lower standard for assessing statistical significance than the overwhelming majority of studies in this field use. More importantly, ipsilateral stimulation of particular kinds or particular sites might conceivably impair function, and that is ignored if the analysis is confined to detecting improvement. Thus, a two-sided analysis or comparable method should be used. This appropriate change would not greatly modify the authors' current conclusions about improvements.

      Our original hypothesis, drawn from previous studies involving cortical stimulation in rats and cats, as well as other neurostimulation research for movement restoration, posited a favorable impact of neurostimulation on movement. Consistent with this hypothesis, we designed our experiments with a focus on enhancing movement, emphasizing a strict direction of improvement.

      It's important to note that a one-sided test is the appropriate match for a one-sided hypothesis, and it is not a lower standard in statistics. Each experiment we conducted was constructed around a strictly one-sided hypothesis: the inclusion of an extensor-inducing stimulus would enhance extension, and the inclusion of a flexion-inducing stimulus would enhance flexion. This rationale guided our choice of the appropriate statistical test.

      We acknowledge your concern regarding the potential for ipsilateral stimulation to have negative effects on locomotion, which might not be captured when designing experiments based on one-sided hypotheses. That is, when hypothesizing that an extensor stimulus would enhance extension (a one-sided hypothesis) in a functional task, and finding an opposite result (inhibition), statistical rigor would impose that we cannot present that result as significant. This concern is valid, and we explicitly mentioned our design choice in the Methods section, Quantification and statistical analyses:

      “All tests were one-sided, as our hypotheses were strictly defined to predict motor improvement. Specifically, we hypothesized that delivering an extension-inducing stimulus would enhance leg extension, and delivering a flexion-inducing stimulus would enhance leg flexion. Consequently, any potentially statistically significant result in the opposite direction (e.g., inhibition) would not be considered. However, no such occurrences were observed.”
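
      The consequence described above, that an effect in the unpredicted direction cannot reach significance under a one-sided test, can be illustrated with a small sketch. Note the assumptions: this uses an exact sign-flip permutation test on paired differences rather than the t-test used in the manuscript, the function name is hypothetical, and the numbers are made up for illustration and are not data from the study.

```python
from itertools import product


def sign_flip_p_values(diffs):
    """Exact sign-flip permutation test on paired differences.

    Returns (one_sided_p, two_sided_p), where the one-sided alternative is
    "the summed difference is greater than zero" (i.e. improvement).
    """
    observed = sum(diffs)
    n_total = n_greater = n_extreme = 0
    # Under the null, each difference is equally likely to be positive or negative.
    for signs in product((1, -1), repeat=len(diffs)):
        stat = sum(s * d for s, d in zip(signs, diffs))
        n_total += 1
        if stat >= observed - 1e-12:
            n_greater += 1
        if abs(stat) >= abs(observed) - 1e-12:
            n_extreme += 1
    return n_greater / n_total, n_extreme / n_total


# Hypothetical paired differences (stimulation minus no stimulation).
improvement = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
inhibition = [-d for d in improvement]

p_one_improve, p_two_improve = sign_flip_p_values(improvement)
p_one_inhibit, p_two_inhibit = sign_flip_p_values(inhibition)
print(p_one_improve, p_two_improve)  # predicted direction
print(p_one_inhibit, p_two_inhibit)  # opposite direction
```

      In this toy case the one-sided p-value is half the two-sided one when the effect goes in the predicted direction, and equals 1 when the same effect goes in the opposite direction, whereas the two-sided p-value is identical in both cases.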

      As a final note, even if such opposite observations were made, they could serve as the basis for triggering an ad-hoc follow-up study.

      Reviewer #1 also provided several detailed suggestions in the section “Recommendations for the authors”. We estimated that each of them was beneficial for the correctness or for the readability of the text, and thus all were incorporated into the final version.

      Reviewer #2 (Public Review):

      Summary:

      The authors' long-term goals are to understand the utility of precisely phased cortex stimulation regimes on recovery of function after spinal cord injury (SCI). In prior work, the authors explored the effects of contralesion cortex stimulation. Here, they explore ipsilesion cortex stimulation in which the corticospinal fibers that cross at the pyramidal decussation are spared. The authors explore the effects of such stimulation in intact rats and rats with a hemisection lesion at the thoracic level ipsilateral to the stimulated cortex. The appropriately phased microstimulation enhances contralateral flexion and ipsilateral extension, presumably through lumbar spinal cord crossed-extension interneuron systems. This microstimulation improves weight bearing in the ipsilesion hindlimb soon after injury, before any normal recovery of function would be seen. The contralateral homologous cortex can be lesioned in intact rats without impacting the microstimulation effect on flexion and extension during gait. In two rats ipsilateral flexion responses are noted, but these are not clearly demonstrated to be independent of the contralateral homologous cortex remaining intact.

      Strengths:

      This paper adds to prior data on cortical microstimulation by the laboratory in interesting ways. First, the strong effects of the spared crossed fibers from the ipsi-lesional cortex in parts of the ipsi-lesion leg's step cycle and weight support function are solidly demonstrated. This raises the interesting possibility that stimulating the contra-lesion cortex as reported previously may execute some of its effects through callosal coordination with the ipsi-lesion cortex tested here. This is not fully discussed by the authors but may represent a significant aspect of these data. The authors demonstrate solidly that ablation of the contra-lesional cortex does not impede the effects reported here. I believe this has not been shown for the contra-lesional cortex microstimulation effects reported earlier, but I may be wrong. Effects and neuroprosthetic control of these effects are explored well in the ipsi-lesion cortex tests here.

      In the revised version of the manuscript, we incorporated various text improvements to address the points you have highlighted in your review. Additionally, we have integrated the suggested discussion topic on callosal coordination related to contralateral cortical stimulation. The discussion section now incorporates:

      “Since bi-cortical interactions in sculpting descending commands are known (Brus-Ramer et al., 2009), and in light of the changes we report in ipsilesional motor cortex excitability, the role of the ipsilateral cortex in mediating or supporting functional descending commands from the contralateral cortex, particularly the immediate increase in flexion of the affected hindlimb and long-term recovery of functional control (Bonizzato & Martinez, 2021), could be further explored.”

      The localization of the specific channels closest to the interhemispheric fissure (Fig. 7D) may suggest the involvement of transcallosal interactions in mediating the transmission of the cortical command generated in the ipsilateral motor cortex (Brus-Ramer, Carmel, & Martin, 2009). “While ablation experiments (Fig. 8) refute this hypothesis for ipsilateral extension control, they do not conclusively determine whether a different efferent pathway is involved in ipsilateral flexion control in this specific case."

      Weaknesses:

      Some data is based on very few rats. For example (N=2) for ipsilateral flexion effects of microstimulation. N=3 for homologous cortex ablation, and only ipsi extension is tested it seems. There is no explicit demonstration that the ipsilateral flexion effects in only 2 rats reported can survive the contra-lateral cortex ablation.

      We agree with this assessment. The ipsilateral flexion representation is reported here as a rare but consistent phenomenon, which we believe we have robustly described with the Figure 7 experiments. We underlined in the text that the ablation experiment was not conclusive about the unilateral-cortical nature of the ipsilateral flexion effects, by replacing the sentence with the following:

      “While ablation experiments (Fig. 8) refute this hypothesis for ipsilateral extension control, they do not conclusively determine whether a different efferent pathway is involved in ipsilateral flexion control in this specific case."

      Some improvements in clarity and precision of descriptions are needed, as well as fuller definitions of terms and algorithms.

      Likely Impacts: This data adds in significant ways to prior work by the authors, and an understanding of how phased stimulation in cortical neuroprosthetics may aid in recovery of function after SCI, especially if a few ambiguities in writing and interpretation are fully resolved.

      The manuscript text has been revised in its final version, and we sought to eliminate all ambiguity in writing and data interpretation.

      In the section “Recommendations for the authors”, Reviewer #2 also suggested that we better define multiple terms throughout the manuscript. A clarification was added for each.

      The Reviewer pointed out that we might have overlooked a correlation between locomotor recovery and motor map increase in Figure 6. We revisited this evaluation and found that the reviewer is correct. We were led to think that there was no correlation by “horizontally” looking at whether motor map size across rats would predict locomotor scores (as it did in the case of contralateral cortex mapping, Bonizzato and Martinez, 2021). However, we have now found a strong correlation between the changes that happen over time for each rat and locomotor recovery, a result that was only hinted at, with no appropriate quantification, in the previous version of the manuscript. We have now reformulated the results of Figure 6 on page 12 to include this result, and we would like to thank the reviewer for having noticed this opportunity.

      Finally, we have expanded the discussion to include the following points:

      The possibility that hemi-cortex coordination of contralesional microstimulation inputs may explain the Sci Transl Med results for contralesional cortex ICMS, which warrants further investigation.

      The recognition that the ablation experiments do not provide conclusive evidence regarding ipsilateral flexion control and whether an alternative efferent pathway might be involved in this specific case.

      Reviewer #3 (Public Review):

      Summary:

      This article aims to investigate the impact of neuroprosthesis (intracortical microstimulation) implanted unilaterally on the lesion side in the context of locomotor recovery following unilateral thoracic spinal cord injury.

      Strength:

      The study reveals that stimulating the left motor cortex, on the same side as the lesion, not only activates the expected right (contralateral) muscle activity but also influences unexpected muscle activity on the left (ipsilateral) side. These muscle activities resulted in a substantial enhancement in lift during the swing phase of the contralateral limb and improved trunk-limb support for the ipsilateral limb. They used different experimental and stimulation conditions to show the ipsilateral limb control evoked by the stimulation. This outcome holds significance, shedding light on the engagement of the "contralateral projecting" corticospinal tract in activating not only the contralateral but also the ipsilateral spinal network.

      The experimental design and findings align with the investigation of the stimulation effect of contralateral projecting corticospinal tracts. They carefully examined the recovery of ipsilateral limb control with motor maps. They also tested the effective sites of cortical stimulation. The study successfully demonstrates the impact of electrical stimulation on the contralateral projecting neurons on ipsilateral limb control during locomotion, as well as identifying important stimulation spots for such an effect. These results contribute to our understanding of how these neurons influence bilateral spinal circuitry. The study's findings contribute valuable insights to the broader neuroscience and rehabilitation communities.

      Thank you for your assessment of this manuscript. The final version of the manuscript incorporates your suggestions for improving term clarity, and we enhanced the discussion on the mechanisms of spinal network engagement, as outlined below.

      Weakness:

      The term "ipsilateral" lacks a clear definition in the title, abstract, introduction, and discussion, potentially causing confusion for the reader.

      [and later] However, in my opinion, readers can easily link the ipsilateral cortical network to the ipsilateral-projecting corticospinal tract, which is less likely to play a role in ipsilateral limb control in this study since this tract is disrupted by the thoracic spinal injury.

      In order to mitigate the risk of having readers linking the effects of ipsilateral cortical stimulation with ipsilateral-projecting corticospinal tract, we specified:

      In the abstract, we specify that our goal was: “to investigate the functional role of the ipsilateral motor cortex in rat movement through spared contralesional pathways.”

      In the introduction: “In most cases, this lesion also disrupts all spinal tracts descending on the same side as the cortex under investigation at the thoracic level, meaning that the transmission of cortical commands to the ipsilesional hindlimb must depend on crossed descending tracts (Fig. S1).”

      The unexpected ipsilateral (left) muscle activity is most likely due to the left corticospinal neurons recruiting not only the right spinal network but also the left spinal network. This is probably due to the joint efforts of the neuroprosthesis and activation of spinal motor networks which work bilaterally at the spinal level.

      We agree with your assessment and the discussion section now emphasizes the effects of supraspinal drive onto spinal circuits.

      In the section “Recommendations for the authors”, Reviewer #3 suggested that we provide an early reminder to the reader that the focus is on exploring the control of the ipsilateral limb through the corticospinal tract of the same side, projecting contralaterally. We did so in the abstract and introduction, as presented above.

      The reviewer also suggested that the discussion could be shorter. While we recognize it covers diverse subjects that may appeal to different readers, we believe omitting some sections could limit its overall scope. The manuscript underwent three revisions and a thorough dialogue with reviewers from diverse backgrounds, and we are hesitant to undo some of these improvements.

      Moreover, the section falls short of fully exploring the involvement of contralateral projecting corticospinal neurons in spinal networks for diverse motor behaviors. It could potentially delve into aspects like the potential impact of corticospinal inputs on gating the cross-extensor reflex loop and elucidating the mechanisms underlying the recruitment of the ipsilateral spinal network for generating ipsilateral limb movements. Is it a direct control on motor neurons or via existing spinal circuits?

      The discussion section now includes the potential spinal circuits through which corticospinal neurons may affect motor control and reflexes.

      Reviewer #3 also provided several detailed suggestions in the sub-section “Minor points”. We judged all of them to be beneficial for the correctness or the readability of the text, and thus incorporated them into the final version. Some of the questions raised were answered directly in the text (defining “% of chronic map” and rephrasing the original Line 479). We answer the two remaining questions below:

      Fig. 3C I wonder what is the average latency between stimulation onset and onset of right ankle flexor activity. Is the latency fixed, or variable (which probably indicates that the Cortical activation signal is integrated with spinal CPG activity.)

      ICMS trains, unfortunately, do not allow for precise dissection of transmission timing. Single pulses at 100 µA are insufficient to generate motoneuron responses and require multiple pulses to build up cortical transmission. Alstermark et al. (Journal of Neurophysiology, 2004) used two to four stimuli with higher amplitudes to investigate forelimb transmission timing. In our 2021 Science Translational Medicine paper, we employed single pulses at 1 mA to establish transmission delays from the contralateral cortex to the ankle flexor. However, the circuits recruited at 1 mA are not directly comparable to those activated by shorter trains.

      In this study, we used cortical trains of approximately 14 pulses, typical of ICMS protocols. Each pulse could potentially be the first to generate a response volley in the ankle flexor, with delays measured at 30 to 60 ms from ICMS train onset. While we believe that cortical commands are necessarily integrated with spinal CPG activity—as indicated in Figures 1B and 3D, where timing is crucial and descending commands can be gated out if delivered off-phase—the variability in latency that we recorded could be attributed to any of the following factors: cortical activation build-up, integration within reticular relay networks, or CPG integration.

      Fig. 4A. Why is the activity of the contralateral ankle flexor later in the intact condition than in the stimulation condition?

      We timed the stimulation to coincide with the contralateral leg lift and did not adjust its onset relative to spontaneous walking in SCI rats. Although stimulation could induce leg lift, as shown in Fig. 4A, SCI rats exhibited a slightly earlier and stronger activation of the right (contralateral) ankle flexor muscle even during spontaneous walking. This phenomenon is attributed to the deficits observed on the left side. The stronger right leg bears the body weight, as illustrated in Fig. 3, and thus, during body advancement, the right leg is engaged sooner and more rapidly (with a shorter swing phase) to provide support (right foot forward).

    1. eLife Assessment

      This study provides convincing evidence for functional subpopulations of β-cells responsible for Ca2+ signal initiation and maintenance using novel three-dimensional light sheet microscopy imaging and analysis of pancreatic islets. The findings are important as they help decode the mechanistic underpinnings of islet calcium oscillations and the resulting pulsatile insulin secretion. The work will be of general interest to cell biologists and of particular interest to islet biologists.

    2. Reviewer #1 - Public Review

      Summary:

      Jin, Briggs, and colleagues use light sheet imaging to reconstruct the islet three-dimensional Ca2+ network. The authors find that early/late responding (leader) cells are dynamic over time, and located at the islet periphery. By contrast, highly connected or hub cells are stable and located toward the islet center. Suggesting that the two subpopulations are differentially regulated by fuel input, glucokinase activation only influences leader cell phenotype, whereas hubs remain stable.

      Strengths:

      The studies are novel in providing the first three-dimensional snapshot of the beta cell functional network, as well as determining the localization of some of the different subpopulations identified to date. The studies also provide some consensus as to the origin, stability, and role of such subpopulations in islet function.

      Weaknesses:

      Experiments with metabolic enzyme activators do not take into account the influence of cell viability on the observed Ca2+ network data. Limitations of the imaging approach used need to be recognized and evaluated/discussed.

    3. Reviewer #2 - Public Review

      The manuscript by Erli Jin, Jennifer Briggs et al. utilizes light sheet microscopy to image islet beta cell calcium oscillations in 3D and determine where beta cell populations are located that begin and coordinate glucose-stimulated calcium oscillations. The light sheet technique allowed clear 3D mapping of beta cell calcium responses to glucose, glucokinase activation, and pyruvate kinase activation. The manuscript finds that synchronized beta-cells are found at the islet center, that leader beta cells showing the first calcium responses are located on the islet periphery, that glucokinase activation helped maintain beta cells that lead calcium responses, and that pyruvate kinase activation primarily increases islet calcium oscillation frequency. The study is well-designed, contains a significant amount of high-quality data, and the conclusions are largely supported by the results.

      It has recently been shown that beta cells within islets containing intact vasculature (such as those in a pancreatic slice) show different calcium responses compared to isolated islets (such as that shown in PMID: 35559734). It would be important to include some discussion about the potential in vitro artifacts in calcium that arise following islet isolation (this could be included in the discussion about the limitations of the study).

    4. Reviewer #3 - Public Review

      Summary:

      Jin, Briggs et al. made use of light-sheet 3D imaging and data analysis to assess the collective network activity in isolated mouse islets. The major advantage of using whole islet imaging, despite compromising on the speed of acquisition, is that it provides a complete description of the network, while 2D networks are only an approximation of the islet network. In static-incubation conditions, excluding the effects of perfusion, they assessed two subpopulations of beta cells and their spatial consistency and metabolic dependence.

      Strengths:

      The authors confirmed that coordinated Ca2+ oscillations are important for glycemic control. In addition, they definitively disproved the role of individual privileged cells, which were suggested to lead or coordinate Ca²⁺ oscillations. They provided evidence for differential regional stability, confirming the previously described stochastic nature of the beta cells that act as strongly connected hubs as well as beta cells in initiating regions (doi.org/10.1103/PhysRevLett.127.168101).

      The fact that islet cores contain beta cells that are more active and more coordinated has also been readily observed in high-frequency 2D recordings (e.g. DOI: 10.2337/db22-0952), suggesting that the high-speed capture of fast activity can partially compensate for incomplete topological information.

      They also found an increased metabolic sensitivity of mantle regions of an islet with a subpopulation of beta cells with a high probability of leading the islet activity which can be entrained by fuel input. They discuss a potential role of alpha/delta cell interaction; however, the relative lack of beta cells in the islet border region could also be a factor contributing to less connectivity and higher excitability.

      The Methods section contains a useful series of direct instructions on how to approach fast 3D imaging with currently available hardware and software.

      The Discussion is clear and includes most of the issues regarding the interpretation of the presented results.

      Some issues concerning inconsistencies between data presented and statements made as well as statistical analysis need to be addressed.

      Taken together, it is a strong technical paper that demonstrates the stochasticity of the functions that subpopulations of beta cells in the islets may have, and how less well-resolved approaches (missing either spatial or temporal resolution) led us to jump to unjustified conclusions regarding the fixed roles of individual beta cells within an islet.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In Ryu et al., the authors use a cortical mouse astrocyte culture system to address the functional contribution of astrocytes to circadian rhythms in the brain. The authors' starting point is transcriptional output from serum-shocked culture, comparative informatics with existing tools and existing datasets. After fairly routine pathway analyses, they focus on the calcium homeostasis machinery and one gene, Herp, in particular. They argue that Herp is rhythmic at both mRNA and protein levels in astrocytes. They then use a calcium reporter targeted to the ER, mitochondria, or cytosol and show that Herp modulates calcium signaling as a function of circadian time. They argue that this occurs through the regulation of inositol receptors. They claim that the signaling pathway is clock-controlled by a limited examination of Bmal1 knockout astrocytes. Finally, they switch to calcium-mediated phosphorylation of the gap junction protein Connexin 43 but do not directly connect HERP-mediated circadian signaling to these observations. While these experiments address very important questions related to the critical role of astrocytes in regulating circadian signaling, the mechanistic arguments for HERP function, its role in circadian signaling through inositol receptors, the connection to gap junctions, and ultimately, the functional relevance of these findings is only partially substantiated by experimental evidence. 

      Strengths: 

      - The paper provides useful datasets of astrocyte gene expression in circadian time. 

      - Identifies HERP as a rhythmic output of the circadian clock. 

      - Demonstrates the circadian-specific sensitivity of ATP -> calcium signaling. 

      - Identifies possible rhythms in both Connexin 43 phosphorylation and rhythmic movement of calcium between cells. 

      Weaknesses: 

      - It is not immediately clear why the authors chose to focus on Ca2+ homeostasis or Herp from their initial screens as neither were the "most rhythmic" pathways in their primary analyses. 

      We appreciate the reviewer’s comment. We chose to focus on Ca2+ homeostasis processes because intracellular Ca2+ signaling plays a crucial role in numerous astrocyte functions and is notably associated with the sleep/wake status of animals, which is our primary interest (Bojarskaite et al., 2020; Ingiosi et al., 2020; Blum et al., 2021; Szabó et al., 2017). Among the genes involved in calcium ion homeostasis, Herp exhibited the most robust rhythmicity (Supplementary Table 1). The rationale for our focus on Ca2+ homeostasis and Herp is explained in the Results section (lines 143-150). We hope this provides a clear justification for our focus.

      - It would have been interesting (and potentially important) to know whether various methods of cellular synchronization would also render HERP rhythmic (e.g., temperature, forskolin, etc). If Herp is indeed relatively astrocyte-specific and rhythmic, it should be easy to assess its rhythmicity in vivo. 

      We thank the reviewer for this insightful comment. In response, we examined HERP expression in cultured astrocytes synchronized using either Dexamethasone or Forskolin treatment. We found that Herp exhibited rhythmic expression at both the mRNA and protein levels under these conditions. These results have been added to Figure S3 and are explained in the manuscript (lines 173-175).

      Additionally, we measured HERP levels in the prefrontal cortex of mice at CT58 and CT70 and found no rhythmicity, as shown in Author response image 1. Given that Herp is expressed in various brain cell types, including microglia, endothelial cells, neurons, oligodendrocytes, and astrocytes, with the highest expression in microglia (Cahoy et al., 2008), we reason that the potential rhythmic expression of HERP in astrocytes might be masked by its continuous expression in other cell types. Nonetheless, to assess HERP rhythmicity specifically in astrocytes in vivo, we attempted immunostaining using several anti-HERP antibodies, but none were successful. Consequently, we were unable to determine whether HERP exhibits rhythmic expression in astrocytes in vivo.

      Author response image 1.

      HERP levels were constant at CT58 and CT70. (A, B) Mice were entrained under 12h:12h LD cycle and maintained in constant dark. Prefrontal cortices were harvested at indicated time and processed for Western blot analysis. Representative image shows three independent samples. (B) Quantification of HERP levels normalized to VINCULIN. Values in graphs are mean ± SEM (*p < 0.05, **p < 0.005, ***p < 0.0005, and ****p < 0.00005; t-test)

      - The authors show that Herp suppression reduces ATP-mediated suppression of calcium whereas it initially increases Ca2+ in the cytosol and mitochondria and then suppresses it. The dynamics of the mitochondrial and cytosolic responses are not discussed in any detail and it is unclear what their direct relationship is to Herp-mediated ER signaling. What is the explanation for Herp (which is thought to be ER-specific) to calcium signaling in other organelles? 

      Our examination of cytosolic and mitochondrial Ca2+ responses was aimed at corroborating HERP’s effect on the ER Ca2+ response. Upon ATP stimulation, Ca2+ is released from the ER via IP3 receptors (IP3Rs) and subsequently transmitted to other organelles, including mitochondria (Carreras-Sureda et al., 2018; Giorgi et al., 2018). Ca2+ is directly transferred to the cytosol by IP3Rs located on the ER membrane, and to the mitochondria through a complex formed by IP3R and the voltage-dependent anion channel (VDAC) on the mitochondria (Giorgi et al., 2018). Consistent with previous reports, we observed an increase in cytosolic and mitochondrial Ca2+ levels accompanied by a decrease in ER Ca2+ levels following ATP treatment (see Fig. 3B, E, H, control siRNA). The ATP-stimulated ER Ca2+ release was enhanced by Herp knockdown. We reasoned that if Ca2+ release was enhanced, then cytosolic and mitochondrial Ca2+ uptake would also be enhanced. The results were consistent with our hypothesis (see Fig. 3B, E, H, Herp siRNA). These observations are described in the Results section (lines 202-208) and in the Discussion (lines 333-348). We hope this explanation clarifies the relationship between the Herp-mediated ER Ca2+ response and the Ca2+ response in other organelles. Thank you for your consideration.

      - What is the functional significance of promoting ATP-mediated suppression of calcium in ER? 

      In astrocytes, intracellular Ca2+ plays a crucial role in regulating several processes. In this study, among the various downstream effects of intracellular Ca2+, we examined gap junction channel (GJC) conductance, which affects astrocytic communication. As discussed in the manuscript (lines 357-381), circadian variation in HERP results in rhythmic Cx43 (S368) phosphorylation linked with GJC conductance. We propose that during the subjective night phase, heightened ATP-induced ER Ca2+ release reduces GJC conductance, uncoupling astrocytes from the syncytium and making them better equipped for localized responses. On the other hand, during the subjective day phase, increased GJC conductance may allow astrocytes to control a larger area for synchronous neuronal activity, which is a key feature of sleep.

      - The authors then nicely show that the effect of ATP is dependent on intrinsic circadian timing but do not explain why these effects are antiphase in cytosol or mitochondria.

      Moreover, the ∆F/F for calcium in mitochondria and cytosol both rise, cross the abscissa, and then diminish - strongly suggesting a biphasic signaling event. Therefore, one wonders whether measuring the area under the curve is the most functionally relevant measurement of the change. 

      We appreciate the reviewer’s insightful comments. As explained in our previous response, Ca2+ released from the ER is transferred to the cytosol and mitochondria. This transfer explains why the fluorescent intensities of cytosolic and mitochondrial Ca2+ indicators show anti-phasic responses to those of the ER.

      We agree that cytosolic and mitochondrial Ca2+ responses may be biphasic. The decrease below the abscissa in mitochondria and cytosol likely reflects Ca2+ extrusion from these organelles. However, our primary focus was on the initial uptake of Ca2+ following ER Ca2+ release. Thus, when calculating the area under the curve (AUC), we measured the area between the ∆F/F curve and the line y = 0 (the x-axis) for both mitochondria and cytosol. We reason that measuring the area under the curve (above the abscissa) fits our objective.

      While addressing your concerns, we noticed errors in the Y-axis labels of Fig. 3C, 4D, and 5C. For the ER Ca2+ dynamics, we measured the area above the curve. These mistakes have now been corrected.
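
      The two area measures described above can be illustrated with a minimal sketch. It assumes a simple trapezoid rule applied to a trace clipped at y = 0; the function names and the example traces are hypothetical and are not taken from the manuscript or its analysis pipeline. Clipping is exact only when a zero crossing falls on a sample point, so a real analysis might interpolate the crossing.

```python
def trapezoid_area(times, values):
    """Trapezoid-rule integral of values over times."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


def area_above_abscissa(times, dff):
    """Area between the dF/F trace and y = 0, positive part only
    (the measure described for cytosol and mitochondria)."""
    return trapezoid_area(times, [max(v, 0.0) for v in dff])


def area_above_curve(times, dff):
    """Area between y = 0 and a trace that dips below it
    (the measure described for the ER signal)."""
    return trapezoid_area(times, [max(-v, 0.0) for v in dff])


# Hypothetical traces, for illustration only.
times = [0, 1, 2, 3, 4, 5]
cyto_dff = [0.0, 0.4, 0.2, 0.0, -0.1, -0.1]  # rises, crosses zero, undershoots
er_dff = [0.0, -0.3, -0.3, -0.1, 0.0, 0.0]   # falls below zero, then recovers

cyto_auc = area_above_abscissa(times, cyto_dff)  # ignores the undershoot
er_aac = area_above_curve(times, er_dff)
```

      With this definition the late undershoot of a biphasic trace does not subtract from the initial uptake, which is the design choice argued for in the response.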

      - Why are mitochondrial and cytosolic calcium not also demonstrated for Bmal1 KO astrocytes? 

      In two sets of experiments (Fig. 3 and Fig. 4), we demonstrated that the increase in cytosolic and mitochondrial Ca2+ aligns with ER Ca2+ release. Since there were no circadian time differences in ER Ca2+ release in the Bmal1 KO cultures, we concluded that it was unnecessary to measure Ca2+ levels in the mitochondria and cytosol. Additionally, our primary focus is on the ER Ca2+ response rather than the Ca2+ dynamics in subcellular organelles. We hope this clarifies our rationale and maintains the focus of our study.

      - The authors claim that Herp acts by regulating the degradation of ITPRs but this hypothesis - rather central to the mechanisms proposed in this study - is not experimentally substantiated. 

      We appreciate the reviewer’s insightful comments regarding the role of HERP in the degradation of IP3Rs. In the original manuscript, we demonstrated that treating cells with Herp siRNA leads to an increase in the levels of ITPR1 and ITPR2, suggesting that HERP might be involved in the regulation of IP3R stability. This observation is consistent with previous studies, which showed that Herp siRNA treatment increases ITPR levels in HeLa and cardiac cells (Paredes et al., 2016; Torrealba et al., 2017). Torrealba et al. also showed that HERP regulates the polyubiquitination of IP3Rs. Based on our results and previous reports, we hypothesized that HERP similarly regulates ITPR degradation in cultured astrocytes.

      However, as the reviewer rightly pointed out, further evidence is needed to confirm that HERP specifically regulates ITPR degradation. To address this, we conducted new experiments examining the effect of XesC, an inhibitor of IP3Rs, on ER Ca2+ release. XesC treatment reduced ER Ca2+ release and abolished the enhancement of ER Ca2+ release by Herp KD. These results demonstrate that HERP influences the ER Ca2+ response through IP3Rs. These new findings have been added to Fig. 3N – 3P and explained in the Results section (lines 217-221).

      We believe these additional experiments and clarifications strengthen our hypothesis that HERP regulates IP3R degradation, thereby modulating ER Ca2+ responses.

      - There is no clear demonstration of the functional relevance of the circadian rhythms of ATP-mediated calcium signaling.

      As mentioned in the previous response, we examined Cx43 phosphorylation linked with GJC conductance in the context of ATP-mediated Ca2+ signaling. Our results demonstrated circadian variations in Cx43 Ser368 phosphorylation leading to variations in gap junction channel (GJC) conductance (Fig. 6C – F and Fig. 7D – I). We have discussed the significance of this circadian rhythm in ATP-driven ER Ca2+ signaling for astrocytic function during sleep/wake states in the manuscript (lines 357 – 382) as follows.

      “ATP-stimulated Cx43 (S368) phosphorylation is higher at 30hr (subjective night phase) than at 42hr (subjective day phase) (Fig. 6C and 6D), a finding further supported by in vivo experiments showing higher pCx43(S368) levels in the prefrontal cortex during the subjective night than during the day (Fig. 6E and 6F). What are the implications of this day/night variation in Cx43 (S368) phosphorylation? We reasoned that the circadian variation in Cx43 phosphorylation could significantly impact astrocyte functionality within the syncytium. Indeed, our cultured astrocytes exhibited circadian phase-dependent variation in gap junctional communication (Fig. 7D – 7F). Astrocytes influence synaptic activity through the release of gliotransmitters such as glutamate, GABA, D-serine, and ATP, triggered by increases in intracellular Ca2+ in response to the activity of adjacent neurons and astrocytes (Verkhratsky & Nedergaard, 2018). Importantly, this increase in Ca2+ spreads to adjacent astrocytes through GJCs (Fujii et al., 2017), influencing a large area of the neuronal network. Considering that Cx43 Ser368 phosphorylation occurs to uncouple specific pathways in the astrocytic syncytium to focus local responses (Enkvist & McCarthy, 1992), our findings suggest that astrocytes are better equipped for localized responses when presented with a stimulus during the active phase in mice. Conversely, during the rest period, characterized by more synchronous neuronal activity across broad brain areas (Vyazovskiy et al., 2009), higher GJC conductance might allow astrocytes to exert control over a larger area. In support of this idea, a recent study showed that synchronized astrocytic Ca2+ activity advances the slow wave activity (SWA) of the brain, a key feature of non-REM sleep (Szabó et al., 2017). Blocking GJC was found to reduce SWA, further supporting this interpretation. However, conflicting findings have also been reported. For instance, Ingiosi et al. (2020) found that astrocytic synchrony was higher during wakefulness than sleep in the mouse frontal cortex. Whether these differing results in astrocyte synchrony during resting and active periods are attributable to differences in experimental context (e.g., brain regions, sleep-inducing condition) remains unclear. Indeed, astrocyte Ca2+ dynamics during wakefulness/sleep vary according to brain regions (Tsunematsu et al., 2021). While the extent of astrocyte synchrony might differ depending on brain region and/or stimulus, our results suggest that the baseline state of astrocyte synchrony, which is affected by GJC conductance, varies with the day/night cycle.”

      Reviewer #2 (Public Review): 

      Summary: 

      The article entitled "Circadian regulation of endoplasmic reticulum calcium response in mouse cultured astrocytes" submitted by Ryu and colleagues describes the circadian control of astrocytic intracellular calcium levels in vitro. 

      Strengths: 

      The authors used a variety of technical approaches that are appropriate 

      We appreciate the reviewer’s acknowledgement of the strengths of our manuscript.

      Weaknesses: 

      Statistical analysis is poor and could lead to a misinterpretation of the data 

      Thank you for the comment. We have carefully reviewed our statistical analyses and applied appropriate methods where necessary. Please see below for the specific revisions and improvements made.

      For Fig. 2D-E, we initially used a t-test. However, after adding more replicates and conducting a normality test, we found that the data did not follow a normal distribution. Therefore, we switched to the Mann-Whitney U test. In Fig. 5D-E, we originally used a repeated measures two-way ANOVA, but we have now changed it to a standard two-way ANOVA. For Fig. 7C and I, we also observed non-normal distribution in the normality test and consequently replaced the t-test with the Mann-Whitney U test. For other analyses not specifically mentioned, normality tests confirmed normal distribution, allowing us to use t-tests or ANOVA as appropriate for statistical analysis.
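
      As an illustration of why a rank-based test is a reasonable fallback when normality cannot be assumed, the sketch below computes the Mann-Whitney U statistic and an exact two-sided p-value by enumerating every reassignment of group labels. The data are hypothetical and the functions are written for this example; they are not the authors' analysis code, and a statistics package would normally be used instead.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for sample x: count of (xi, yj) pairs with xi > yj,
    with ties counted as 0.5. Only the ordering of values matters."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def exact_two_sided_p(x, y):
    """Exact two-sided p-value obtained by enumerating all ways of
    splitting the pooled values into groups of the original sizes."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    mean_u = n * m / 2
    observed = abs(mann_whitney_u(x, y) - mean_u)
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(gx, gy) - mean_u) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical measurements with four biological replicates per group.
control = [0.3, 0.5, 0.8, 1.0]
treated = [1.2, 1.5, 1.9, 2.4]

u_treated = mann_whitney_u(treated, control)
p_value = exact_two_sided_p(treated, control)
```

      With four replicates per group and no ties, even completely separated groups give a two-sided exact p-value of 2/70 (about 0.029), which is worth keeping in mind when interpreting significance at small sample sizes.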

      Several conceptual issues have been identified. 

      We have addressed the reviewer’s concerns. Please see our detailed point-by-point responses below.

      Overinterpretation of the data should be avoided. This is a mechanistic paper done completely in vitro, all references to the in vivo situation are speculative and should be avoided. 

      We appreciate the reviewer’s insightful comment. Following the reviewer’s suggestion, we have removed the interpretations of GO pathways in the context of the in vivo situation.

      Reviewer #3 (Public Review): 

      Astrocyte biology is an active area of research and this study is timely and adds to a growing body of literature in the field. The RNA-seq, Herp expression, and Ca2+ release data across wild-type, Bmal1 knockout, and Herp knockdown cellular models are robust and lend considerable support to the study's conclusions, highlighting their importance. Despite these strengths, the manuscript presents a gap in elucidating the dynamics of HERP and the involvement of ITPR1/2 in modulating Ca2+ release patterns and their circadian variations, which remains insufficiently supported and characterized. While the Connexin data underscore the importance of rhythmic Ca2+ release triggered by ATP, the relationship here appears correlational and the role of HERP and ITPR in Cx function remains to be characterized. Moreover, enhancing the manuscript's clarity and readability could significantly benefit the presentation and comprehension of the findings. 

      We appreciate the reviewer’s acknowledgement of the strengths of our manuscript. Regarding the identified gaps, we have conducted several new experiments to clearly demonstrate the HERP-ITPR-Cx phosphorylation axis. Please see our detailed point-by-point responses below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      - While HERP appears to be a clock-controlled gene and its protein levels appear to demonstrate rhythmicity as well, the data quality of the western blotting in Bmal1 knockout raises some concern about the accuracy of HERP protein quantification. 

      We understand the reviewer’s concern regarding the proximity of the HERP band to a nonspecific band in the Western blotting for the Bmal1 knockout. However, we took great care to ensure the accuracy of our HERP band quantification. We selected only the specific HERP band and excluded the nonspecific band. Therefore, we are confident in the accuracy of our HERP protein measurements.

      - If HERP is rhythmic and ITPRs are not, if their model is correct, might we expect HERP suppression to result in 'unmasking' an ITPR rhythm? 

      Our model suggests that both HERP and ITPRs are rhythmic, with HERP regulating the degradation of ITPR proteins and driving their rhythms. Consistent with this, we observed day/night variations in ITPR2 levels (Fig. 4N and 4O). Therefore, we concluded that circadian variations in HERP are sufficient to drive ITPR2 rhythms. We have explained this in detail in the Results section (lines 236-241) and the Discussion section (lines 324-332).

      - The authors make a rather abrupt switch to examining gap junctions and connexin 43 phosphorylation. While the data demonstrating that the phosphorylation of S368 may indeed be rhythmic - the authors do not connect these data to the rest of the manuscript by showing a connection to HERP-mediated calcium signaling, limiting the coherence of the narrative. 

      We thank the reviewer for the insightful comments. To address the reviewer's concern regarding the connection between Herp and the phosphorylation of Cx43 at S368, we conducted new experiments to test whether Herp KD abolishes the rhythms of Cx43 phosphorylation at S368. We found that the phosphorylation of Cx43 at S368 is significantly enhanced at 30hrs post sync compared with 42hrs post sync in control siRNA-treated astrocytes, consistent with our previous results (Fig. 6C & 6D). On the other hand, this circadian phase-dependent difference in phosphorylation was abolished in Herp siRNA-treated astrocytes. These results indicate that circadian variations in Cx43 phosphorylation are driven by HERP. These new results are now included in Fig. 6G and 6H and explained in the Results section (lines 276-281).

      - Comment on data presentation: the authors repeatedly present histograms with attached lines between data points - from my understanding of the experiments, this is inappropriate unless these were repeated measures from the same cells. Otherwise, the lines connecting one data point to another between different conditions (e.g., Ctrl or Herp knockdown) are arbitrary and possibly misleading (i.e., Figure 3K, 3M, 4L, 6D). 

      Thank you for the reviewer’s comment. We have updated the figures by removing the lines connecting data points in the relevant figures (Fig. 3K, 3M, 4N, and 6D).

      Reviewer #2 (Recommendations For The Authors): 

      Most of the suggestions of this reviewer are related to the conceptual interpretation and presentation of the data and to the statistical analysis 

      In Figure 1 the authors analyzed the rhythmic transcriptome of cortical astrocytes synchronized with a serum shock in two different ways. The authors need to discuss what is the difference between the two methods used to detect rhythmic transcripts and make sense of them. 

      Following the reviewer’s suggestion, we have provided a more detailed explanation of MetaCycle and BioCycle, as well as the rationale for using both packages in our analysis, as follows: “Various methods have been used to identify periodicity in time-series data, such as Lomb-Scargle (Glynn et al., 2006), JTK_CYCLE (Hughes et al., 2010) and ARSER (Yang & Su, 2010), each with distinct advantages and limitations. MetaCycle integrates these three methods, facilitating the evaluation of periodicity in time-series data without requiring the selection of an optimal algorithm (Wu et al., 2016). Additionally, BioCycle has been developed using a deep neural network trained with extensive synthetic and biological time series datasets (Agostinelli et al., 2016). Because MetaCycle and BioCycle identify periodic signals based on different algorithms, we applied both packages to identify periodicity in our time-series transcriptome data. BioCycle and MetaCycle analyses detected 321 and 311 periodic transcripts, respectively (FDR corrected, q-value < 0.05) (Fig. 1B). Among these, 220 (53.4%) were detected by both methods, but many transcripts did not overlap. MetaCycle is known for its inability to detect asymmetric waveforms (Mei et al., 2020). In our analysis, genes with increasing waveforms like Adora1 and Mybph were identified as rhythmic only by BioCycle, while Plat and Il34 were identified as rhythmic only by MetaCycle (Fig. S1C). Despite these discrepancies, the clear circadian rhythmic expression profiles of these genes led us to conclude that using the union of the two lists compensates for the limitations of each algorithm.”

      Please refer to lines 105-117 in the Results section.
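
      The overlap figures quoted above can be checked with simple set arithmetic. In the sketch below, the counts (321, 311, 220) come from the quoted passage; the reading that 53.4% is the overlap expressed as a fraction of the union is inferred from those numbers. The small gene sets reuse names mentioned in the passage plus one hypothetical placeholder ("GeneX") for a transcript detected by both methods.

```python
def overlap_summary(n_a, n_b, n_both):
    """Return (size of the union, overlap as a fraction of the union)."""
    union = n_a + n_b - n_both
    return union, n_both / union


# Counts reported in the quoted passage: BioCycle, MetaCycle, both.
union_size, overlap_fraction = overlap_summary(321, 311, 220)
overlap_percent = round(overlap_fraction * 100, 1)

# Toy illustration of taking the union of two rhythmic-gene lists.
# "GeneX" is a hypothetical placeholder, not a gene from the study.
biocycle_hits = {"Adora1", "Mybph", "GeneX"}
metacycle_hits = {"Plat", "Il34", "GeneX"}

rhythmic_union = biocycle_hits | metacycle_hits
detected_by_both = biocycle_hits & metacycle_hits
biocycle_only = biocycle_hits - metacycle_hits
```

      Under this reading the union contains 412 transcripts, and 220 of 412 matches the reported 53.4%.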

      The reasoning for comparing CT0 with the phase of the clock 8 hs after SS needs to be explained. Circadian time (CT) conceptually refers to the clock phase in the absence of entrainment cues in vivo, the direct transformation of "time after synchronization" in vitro to CT is misleading. 

      We thank the reviewer for the insightful comments. Initially, we believed that transforming time after synchronization to CT, despite the data being obtained in vitro, might provide a more intuitive and physiologically relevant interpretation of our results. However, we agree that this approach might be misleading. Following the reviewer’s suggestion, we have revised our terminology by changing “CT” to “Time post sync (hr)”. Nonetheless, in the circular peak phase map in Fig. 1F, we set 8hrs post sync to ZT0 based on the phase comparison result in Fig. 1D to allow a physiologically relevant interpretation. We hope these revisions clarify our approach.

      Moreover, also by definition a CT cannot be defined in terms of "dark" or "light". Figure 6M needs to be changed. 

      Following the reviewer’s suggestion, we removed the labels CT22 and CT34. Instead, we have labeled the respective periods as “30hr post sync” and “42hr post sync”.
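
      The relationship between the old and new labels in the two responses above can be sketched as a small conversion. The 8 hr offset is the alignment stated in the response (8 hr post sync set to ZT0); the function names are hypothetical, and the 12 hr boundary used for the phase label is an assumption based on the usual 12h:12h convention.

```python
ZT0_OFFSET_HR = 8  # 8 hr post sync is aligned to ZT0, per the response


def post_sync_to_ct(hours_post_sync):
    """Former CT-style label: elapsed hours since the aligned ZT0."""
    return hours_post_sync - ZT0_OFFSET_HR


def post_sync_to_zt(hours_post_sync):
    """Equivalent time of day on a 24 hr clock."""
    return (hours_post_sync - ZT0_OFFSET_HR) % 24


def phase_label(hours_post_sync):
    """Assumed convention: 0-12 hr is subjective day, 12-24 hr is subjective night."""
    return "subjective day" if post_sync_to_zt(hours_post_sync) < 12 else "subjective night"


labels = {t: (post_sync_to_ct(t), post_sync_to_zt(t), phase_label(t)) for t in (30, 42)}
```

      Under these assumptions, 30 hr and 42 hr post sync map to the former labels CT22 and CT34, and fall in the subjective night and subjective day, respectively.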

      In Figure 1D, the authors present a gene ontology analysis that is certainly interesting, however, it should not be overinterpreted when trying to explain processes that take place only in vivo (e.g. wound repair). 

      Thank you for the insightful comment. Following the reviewer’s feedback, we have removed the paragraph interpreting the cell migration process in relation to wound repair and have focused instead on Ca2+ ion homeostasis.

      In Figure 2A the relative expression of clock genes and Herp is again misleading by a white/grey shading indicating subjective night and subjective day when the system under study is a cell culture. 

      We understand the reviewer’s concern that a cell culture system is not equivalent to a light/dark entrainment condition. However, we apply time-synchronizing stimuli to recapitulate in vivo entrainment. In addition, by comparing our data with CircaDB, we defined 8hrs post sync as corresponding to ZT0, thus aligning it with the beginning of the day. We have retained the shading to facilitate interpretation of our data in relation to in vivo situations. However, in response to the reviewer’s concern, we have revised the shading from white/grey to light grey/dark grey. We hope this adjustment addresses the reviewer’s concern; if the reviewer still believes it is inappropriate, please let us know and we will gladly update it.

      In the Figure 2A legend, it is indicated that rhythmicity is assessed using MetaCycle with mean values obtained from n=2. The authors need to make clear whether this n=2 means 2 biological replicates or 2 technical replicates. This difference is relevant because it would make the analysis statistically valid or invalid, respectively. 

      Thank you for your feedback. n=2 refers to 2 biological replicates. Therefore, the analysis is statistically valid.

      In Figures 2C and D the authors applied a t-test, a parametric statistical test for one-to-one comparison that requires the normality of the data to be tested first. To test normality, the authors need at least 4 biological replicates. The suggestion of this reviewer is that these experiments have to be repeated and proper statistics applied. 

      Thank you for your feedback. In response to the reviewer's suggestion, we conducted additional experiments to increase the number of biological replicates to 4. After testing the normality of the data, we applied a t-test for Figure 2C and a Mann-Whitney test for Figure 2D and 2E. These tests confirmed statistically significant differences between groups.

      As further evidence of Bmal1-dependent control of HERP circadian expression, the authors could check for the presence of E-Box elements in the Herp promoter. 

      We thank the reviewer for the insightful comment. In the Discussion section of the original manuscript, we mentioned the absence of a canonical E-Box upstream of the Herp gene. However, following the reviewer’s suggestion and considering the potential role of non-canonical E-Boxes, we conducted an additional analysis. This analysis identified several non-canonical E-Boxes within the 6 kb upstream region of the Herp gene (Table S2). Notably, we found one non-canonical E-Box, “CACGTT,” which is known to regulate circadian expression (Yoo et al., 2005), close to the transcription start site (chr8:94386194-94386543). Moreover, this element is evolutionarily conserved across various mammals, including humans, rats, mice, dogs, and opossums (See Author response image 2). Therefore, we reasoned that these non-canonical E-Boxes might drive the CLOCK/BMAL1-dependent expression of Herp. We have updated the Discussion to reflect these findings in lines 315-319.
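
      A motif search of the kind described above can be sketched in a few lines. The canonical E-box sequence CACGTG and the non-canonical CACGTT named in the response are the only motifs used; the example sequence is hypothetical and is not the Herp upstream region, and the scan shown is a plain exact-match search on both strands rather than the method actually used for Table S2.

```python
CANONICAL_EBOX = "CACGTG"
NONCANONICAL_EBOX = "CACGTT"  # the motif named in the response

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """0-based forward-strand start positions where the motif occurs
    on either strand (exact matches only)."""
    seq = seq.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in targets]


# Hypothetical sequence for illustration only.
example_upstream = "GGACACGTTCCAAACGTGTT"

noncanonical_sites = find_motif(example_upstream, NONCANONICAL_EBOX)
canonical_sites = find_motif(example_upstream, CANONICAL_EBOX)
```

      In this toy sequence the non-canonical motif is found once on each strand and the canonical motif is absent, mirroring the situation described in the response.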

      Author response image 2.

      The calcium experiments shown in Figures 3A-I, could be more convincing if the authors showed that the different Ca2+ sensors are compartment-specific by showing co-localization with a subcellular marker. In the pictures shown it is not even possible to recognize the cell dimensions. 

      Following the reviewer’s suggestion, we performed co-staining experiments with organelle-specific Ca2+ indicators and organelle markers. First, astrocytes were co-transfected with G-CEPIA1er, an ER-specific Ca2+ indicator, and ER-targeted DsRed2 (with the Calreticulin signal sequence). Live imaging analysis showed that the fluorescence intensities of G-CEPIA1er and DsRed2-ER-5 significantly overlapped in co-transfected cells. Second, astrocytes were transfected with Mito-R-GECO1, and MitoTracker, a cell-permeable mitochondrial dye, was applied. The fluorescence intensities of Mito-R-GECO1 and MitoTracker also significantly overlapped. These new data are included in Figure S4 and explained in the Results section (lines 194-195).

      Data analysis in Figure 3K and M is misleading. According to the explanations of the results, each of the experiments to assess ITPR1 or 2 is run independently. Then it is not clear why the relative levels obtained with control or Herp siRNA are plotted as pairs. Same comment as above for Figure 4L and Figure 6D. 

      We thank the reviewer for the insightful comments. Reviewer #1 raised similar issues. Following the reviewers’ suggestions, we have removed the lines connecting the data points in Fig. 3K, 3M, 4L, and 6D.

      In Figure 5E the authors need to explain why they consider that repeated measures 2-way ANOVA is the right statistical test to apply. According to the explained experimental design, cells were transfected, synchronized, and then harvested independently at the indicated time after synchronization. 

      Thank you for the reviewer’s insightful comment. Upon reviewing the statistical methods as suggested, we have revised our approach. Instead of using repeated measures 2-way ANOVA, we have now applied a standard 2-way ANOVA, which is more appropriate given the experimental procedures were independent, as the reviewer pointed out.

      The English language needs to be revised throughout the text. 

      We have thoroughly revised the English language throughout the text.

      Reviewer #3 (Recommendations For The Authors): 

      (1) Figure 3. Clarify the physiological importance of 100 µM ATP. Would the Herp rhythm warrant Ca2+ release rhythms under basal conditions? In 3J-K, the relatively weak effect of Herp knockdown on ITPR1/2 levels, albeit statistically significant, may not be physiologically significant. This calls into question the claimed Herp-ITPR axis that underlies the Ca2+ release phenotype. Further, the correlation certainly exists but further characterization of Herp KD cells would be required to address the mechanism. 

      As previously reported, a broad range of ATP concentrations can induce Ca2+ activity in astrocytes (Neary et al., 1988). Originally, we conducted an ATP dose-response analysis to observe ER Ca2+ release in our primary astrocyte culture. Our results show that ER Ca2+ release begins at 50 µM ATP and plateaus at 500 µM. Please refer to Author response image 3. We selected 100 µM ATP for our experiments because it induces a medium level of ER Ca2+ response. Importantly, although measuring ATP concentrations at the synapse in vivo is challenging (Tan et al., 2017), estimates suggest synaptic ATP concentrations range from 5-500 µM (Pankratov et al., 2006). Thus, 100 µM ATP is a physiologically relevant concentration that can affect nearby cells, including astrocytes, in the nervous system.

      Author response image 3.

      Cultured astrocytes were transfected with G-CEPIA1er and, at 48hrs post transfection, were treated with various concentrations of ATP and subjected to Ca2+ imaging analysis. (A) ΔF/F0 values over time following ATP application. (B) Area above the curve values. Values in graphs are mean ± SEM (*p < 0.05, **p < 0.005, ***p < 0.0005, and ****p < 0.00005; one-way ANOVA).

      Regarding the comment on Ca2+ release rhythms under basal conditions, we interpret this as referring to Ca2+ release in the absence of a stimulus. We typically observe Ca2+ release only upon stimulation, such as ATP treatment. However, we acknowledge that the modest effects of HERP knockdown on ITPR1/2 levels could call into question the role of the HERP-ITPR axis in ER Ca2+ release.

      To address this, we analyzed whether the increases in ER Ca2+ release induced by Herp KD were mediated through ITPRs by treating cells with Xestospongin C (XesC), an IP3R inhibitor. XesC treatment reduced ATP-induced ER Ca2+ release and eliminated the differences in ER Ca2+ release between control and Herp KD astrocytes (Fig. 3N – 3P). These results indicate that the HERP-ITPR axis plays a critical role in controlling ER Ca2+ release. These new experiments have been included in Fig. 3 and explained in the Results section (lines 217-221).

      Furthermore, following the reviewer’s suggestion, we examined whether HERP rhythms underlie the rhythms of the ER Ca2+ response by analyzing the ER Ca2+ response in Herp KD astrocytes at two different times following synchronization. In control astrocytes, ATP-induced ER Ca2+ responses varied depending on time, whereas these time-dependent variations were abolished in Herp KD astrocytes. These new experiments have been included in Fig. 4K – 4M and explained in the Results section (lines 232-235).

      Collectively, these results indicate that HERP rhythms lead to time-dependent differences in ER Ca2+ response through ITPRs.

      (2) Figure 4K-L. As data suggested the involvement of ITPR1 and ITPR2 (circadian effect), a reasonable next step is to determine their involvement, but the study did not pursue the hypothesis. 

      Thank you for your insightful comment. Our results indeed suggest that rhythms in ITPR2 levels may drive the time-dependent variations in ATP-induced ER Ca2+ release following synchronization. The newly conducted experiments demonstrated that treatment with the ITPR inhibitor XesC suppressed ATP-induced ER Ca2+ release under both control and Herp siRNA treatment conditions (Fig. 3). These findings further support the idea that rhythms of ITPR levels, specifically ITPR2, underlie the circadian variations in ER Ca2+ release. While examining the effect of ITPR2 siRNA would directly prove the involvement of ITPR2, we have decided to pursue this experiment in future studies.

      (3) Figure 5A-C. Data from WT cells should be included side by side with Bmal1-/- cells for comparison which is expected to be consistent with the HERP levels as in 5D-E. Again, the role of ITPR2 is suggested but not demonstrated. 

      Following the reviewer's suggestion, we conducted additional experiments including both WT and Bmal1-/- cultured astrocytes side by side. The results were consistent with our previous findings: WT astrocytes showed rhythms of ER Ca2+ release while Bmal1-/- astrocytes did not. We have updated Fig. 5A – 5C and the corresponding Results section (lines 242-245) accordingly.

      Regarding the second comment, as mentioned in our previous response, we plan to examine the role of ITPR2 in further studies.

      (4) Figure 6. The Connexin data seems an addon and is correlative with the Ca2+ release. The role of Herp and Itpr in Connexin function is not addressed. Figure 6E-F was not called out in the results section. Suggest providing additional data to support the role of the HERP-ITPR axis in regulating Ca2+ release and Connexin activity. 

      We agree that additional data are needed to support the role of HERP in regulating Cx43 phosphorylation. Therefore, we have conducted further experiments to determine whether rhythms of Cx43 phosphorylation are regulated by HERP. In the control astrocytes, ATP treatment induced time-dependent variations in Cx43 phosphorylation. However, these rhythms were abolished in Herp KD astrocytes. These results indicate that rhythms in HERP levels contribute to the time-dependent variations in Cx43 phosphorylation. These new experiments have been included in Fig. 6G and 6H and explained in the Results section (lines 276-281).

      Regarding the second comment, we have corrected our oversight by properly referencing Fig. 6E-F in the Results section. Please refer to lines 357-359 for clarification.

      (5) Discussion. This section should focus on noteworthy points to discuss, not repeating the results. 

      Based on the reviewer's valuable suggestions, we have revised the Discussion section to minimize repetition of the results. Thank you for your guidance.

      (6) The manuscript exhibits numerous grammatical and textual inaccuracies that necessitate careful revision by the authors. My observations here are confined to the title and the abstract alone. I recommend altering the title from "mouse cultured astrocytes" to "cultured mouse astrocytes" for clarity and grammatical correctness. The abstract, meanwhile, needs enhancements both in terms of its content and language. It should incorporate the results of the partitioning among the ER, cytoplasm, and mitochondria, and provide clear definitions for some of the critical terms used. It's worth noting that the abstract's second sentence contains a grammatical error. 

      Thank you for the reviewer’s valuable feedback. We have carefully revised the title, abstract, and main text to address the grammatical and textual issues. The title has been changed to “cultured mouse astrocytes”. Additionally, the abstract now includes results related to cytoplasmic Ca2+ dynamics and has been revised in several places. We appreciate your insights and have worked to enhance the content and language accordingly.

      References

      Agostinelli, F., Ceglia, N., Shahbaba, B., Sassone-Corsi, P., & Baldi, P. (2016). What time is it? Deep learning approaches for circadian rhythms. Bioinformatics, 32(12), i8-i17. https://doi.org/10.1093/bioinformatics/btw243

      Cahoy, J. D., Emery, B., Kaushal, A., Foo, L. C., Zamanian, J. L., Christopherson, K. S., Xing, Y., Lubischer, J. L., Krieg, P. A., Krupenko, S. A., Thompson, W. J., & Barres, B. A. (2008). A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J Neurosci, 28(1), 264-278. https://doi.org/10.1523/JNEUROSCI.4178-07.2008

      Carreras-Sureda, A., Pihán, P., & Hetz, C. (2018). Calcium signaling at the endoplasmic reticulum: fine-tuning stress responses. Cell Calcium, 70, 24-31. https://doi.org/10.1016/j.ceca.2017.08.004

      Enkvist, M. O., & McCarthy, K. D. (1992). Activation of protein kinase C blocks astroglial gap junction communication and inhibits the spread of calcium waves. J Neurochem, 59(2), 519-526. https://doi.org/10.1111/j.1471-4159.1992.tb09401.x

      Fujii, Y., Maekawa, S., & Morita, M. (2017). Astrocyte calcium waves propagate proximally by gap junction and distally by extracellular diffusion of ATP released from volume-regulated anion channels. Scientific Reports, 7(1), 13115. https://doi.org/10.1038/s41598-017-13243-0

      Giorgi, C., Marchi, S., & Pinton, P. (2018). The machineries, regulation and cellular functions of mitochondrial calcium. Nature Reviews Molecular Cell Biology, 19(11), 713-730. https://doi.org/10.1038/s41580-018-0052-8

      Glynn, E. F., Chen, J., & Mushegian, A. R. (2006). Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics, 22(3), 310-316. https://doi.org/10.1093/bioinformatics/bti789

      Hughes, M. E., Hogenesch, J. B., & Kornacker, K. (2010). JTK_CYCLE: an efficient nonparametric algorithm for detecting rhythmic components in genome-scale data sets. J Biol Rhythms, 25(5), 372-380. https://doi.org/10.1177/0748730410379711

      Ingiosi, A. M., Hayworth, C. R., Harvey, D. O., Singletary, K. G., Rempe, M. J., Wisor, J. P., & Frank, M. G. (2020). A Role for Astroglial Calcium in Mammalian Sleep and Sleep Regulation. Curr Biol, 30(22), 4373-4383.e4377. https://doi.org/10.1016/j.cub.2020.08.052

      Mei, W., Jiang, Z., Chen, Y., Chen, L., Sancar, A., & Jiang, Y. (2020). Genome-wide circadian rhythm detection methods: systematic evaluations and practical guidelines. Briefings in Bioinformatics, 22(3). https://doi.org/10.1093/bib/bbaa135

      Neary, J. T., van Breemen, C., Forster, E., Norenberg, L. O., & Norenberg, M. D. (1988). ATP stimulates calcium influx in primary astrocyte cultures. Biochem Biophys Res Commun, 157(3), 1410-1416. https://doi.org/10.1016/s0006-291x(88)81032-5

      Pankratov, Y., Lalo, U., Verkhratsky, A., & North, R. A. (2006). Vesicular release of ATP at central synapses. Pflugers Arch, 452(5), 589-597. https://doi.org/10.1007/s00424-006-0061-x

      Paredes, F., Parra, V., Torrealba, N., Navarro-Marquez, M., Gatica, D., Bravo-Sagua, R., Troncoso, R., Pennanen, C., Quiroga, C., Chiong, M., Caesar, C., Taylor, W. R., Molgó, J., San Martin, A., Jaimovich, E., & Lavandero, S. (2016). HERPUD1 protects against oxidative stress-induced apoptosis through downregulation of the inositol 1,4,5-trisphosphate receptor. Free Radic Biol Med, 90, 206-218. https://doi.org/10.1016/j.freeradbiomed.2015.11.024

      Szabó, Z., Héja, L., Szalay, G., Kékesi, O., Füredi, A., Szebényi, K., Dobolyi, Á., Orbán, T. I., Kolacsek, O., Tompa, T., Miskolczy, Z., Biczók, L., Rózsa, B., Sarkadi, B., & Kardos, J. (2017). Extensive astrocyte synchronization advances neuronal coupling in slow wave activity in vivo. Scientific Reports, 7(1), 6018. https://doi.org/10.1038/s41598-017-06073-7

      Tan, Z., Liu, Y., Xi, W., Lou, H. F., Zhu, L., Guo, Z., Mei, L., & Duan, S. (2017). Glia-derived ATP inversely regulates excitability of pyramidal and CCK-positive neurons. Nat Commun, 8, 13772. https://doi.org/10.1038/ncomms13772

      Torrealba, N., Navarro-Marquez, M., Garrido, V., Pedrozo, Z., Romero, D., Eura, Y., Villalobos, E., Roa, J. C., Chiong, M., Kokame, K., & Lavandero, S. (2017). Herpud1 negatively regulates pathological cardiac hypertrophy by inducing IP3 receptor degradation. Sci Rep, 7(1), 13402. https://doi.org/10.1038/s41598-017-13797-z

      Tsunematsu, T., Sakata, S., Sanagi, T., Tanaka, K. F., & Matsui, K. (2021). Region-specific and state-dependent astrocyte Ca2+ dynamics during the sleep-wake cycle in mice. The Journal of Neuroscience, JN-RM-2912-2920. https://doi.org/10.1523/jneurosci.2912-20.2021

      Verkhratsky, A., & Nedergaard, M. (2018). Physiology of Astroglia. Physiol Rev, 98(1), 239-389. https://doi.org/10.1152/physrev.00042.2016

      Vyazovskiy, V. V., Olcese, U., Lazimy, Y. M., Faraguna, U., Esser, S. K., Williams, J. C., Cirelli, C., & Tononi, G. (2009). Cortical firing and sleep homeostasis. Neuron, 63(6), 865-878. https://doi.org/10.1016/j.neuron.2009.08.024

      Wu, G., Anafi, R. C., Hughes, M. E., Kornacker, K., & Hogenesch, J. B. (2016). MetaCycle: an integrated R package to evaluate periodicity in large scale data. Bioinformatics, 32(21), 3351-3353. https://doi.org/10.1093/bioinformatics/btw405

      Yang, R., & Su, Z. (2010). Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation. Bioinformatics, 26(12), i168-174. https://doi.org/10.1093/bioinformatics/btq189

      Yoo, S. H., Ko, C. H., Lowrey, P. L., Buhr, E. D., Song, E. J., Chang, S., Yoo, O. J., Yamazaki, S., Lee, C., & Takahashi, J. S. (2005). A noncanonical E-box enhancer drives mouse Period2 circadian oscillations in vivo. Proc Natl Acad Sci U S A, 102(7), 2608-2613. https://doi.org/10.1073/pnas.0409763102

    2. eLife Assessment

      This work describes circadian regulation of the expression of HERP, a regulator of endoplasmic reticulum calcium, in primary astrocytic cultures. This work is important because it highlights the potential importance of circadian rhythms in astrocytes, even though making a direct comparison between these rhythms in vitro and in vivo remains challenging. The technical approaches used in this work (RNA-seq, siRNA, Ca2+ imaging) provide solid support for data interpretation.

    3. Reviewer #2 (Public review):

      Summary:

      The article by Ryu and colleagues describes the circadian control of astrocytic intracellular calcium levels in vitro.

      Strengths:

      The authors used a variety of appropriate technical approaches and considerably improved the manuscript with additional experiments and more solid data analysis compared to the first version.

      Weaknesses:

      Some conceptual issues are still present. This is a mechanistic paper done completely in vitro; all references to the in vivo situation are speculative and should be absolutely avoided unless the authors are citing in vivo work.

    4. Reviewer #3 (Public review):

      This study provides significant insights into how the circadian clock influences astrocytic Ca2+ homeostasis. Astrocyte biology is an active area of research and this study is timely and adds to a growing body of literature in the field. This research highlights the potential importance of circadian rhythms in astrocytes, offering a new perspective on their role in central nervous system regulation.

    1. eLife Assessment

      The manuscript by Hills, et al. presents data that support multiple conclusions regarding the gene expression patterns of cells, especially chemosensory neurons. The evidence is largely solid, with transcriptomic analysis combined and validated by spatially resolved expression in tissue sections, but is incomplete in other ways with some claims not fully supported. This large-scale single-cell transcriptomics dataset is an important resource, alongside a thorough exploration of the molecular features of the different cell types within the mouse vomeronasal organ, including expression of chemosensory receptors.

    2. Reviewer #1 (Public review):

      Summary:

      The authors comprehensively present data from single cell RNA sequencing and spatial transcriptomics experiments of the juvenile male and female mouse vomeronasal organ, with a particular emphasis on the neuronal populations found in this sensory tissue. The use of these two methods effectively maps the locations of relevant cell types in the vomeronasal organ at a level of depth beyond what is currently known. Targeted analysis of the neurons in the vomeronasal organ produced several important findings, notably the common co-expression of multiple vomeronasal type 1 receptors (V1Rs), vomeronasal type 2 receptors (V2Rs), and both V1R+V2Rs by individual neurons, as well as the presence of a small but noteworthy population of neurons expressing olfactory receptors (ORs) and associated signal transduction molecules. Additionally, the authors identify transcriptional patterns associated with neuronal development/maturation, producing lists of genes that can be used and/or further investigated by the field. Finally, the authors report the presence of coordinated combinatorial expression of transcription factors and axon guidance molecules associated with multiple neuronal types, providing the framework for future studies aimed at understanding how these patterns relate to the complex glomerular organization in the accessory olfactory bulb. Several of these conclusions have been reached by previous studies, partially limiting the overall impact of the current work. However, when combined, these results provide important insights into the cellular diversity in the vomeronasal organ that are likely to support multiple future studies of the vomeronasal system.

      Strengths:

      The comprehensive analysis of the data provides a wealth of information for future research into vomeronasal organ function. The targeted analysis of neuronal gene transcription demonstrates the co-expression of multiple receptors by individual neurons, and confirms the presence of a population of OR-expressing neurons in the vomeronasal organ. Although many of these findings have been noted by others, the depth of analysis here validates and extends prior findings in an effective manner. The use of spatial transcriptomics to identify the locations of specific cell types is especially useful and produces a template for the field's continued research into the various cell types present in this complex sensory tissue. Overall, the manuscript's biggest strength is found in the richness of the data presented, which will not only support future work in the broader field of vomeronasal system function but also provide insights for others studying complex sensory tissues.

      Weaknesses:

      The inherent weaknesses of single cell RNA sequencing studies based on the 10x Genomics platforms (need to dissociate tissues, limited depth of sequencing, etc.) are acknowledged. However, the authors document their extensive attempts to avoid making false positive conclusions through the use of software tools designed for this purpose. Because of its complexity, there are some portions of the manuscript where the data are difficult to interpret as presented, but this is a relatively minor weakness. The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret because the methods are proprietary and presented in an opaque manner. That said, the resulting data provide useful links between transcriptional identities and cellular locations, which is not possible without the use of such tools.

    3. Reviewer #2 (Public review):

      In their paper entitled "Molecular, Cellular, and Developmental Organization of the Mouse Vomeronasal Organ at Single Cell Resolution", Hills Jr. et al. perform single-cell transcriptomic profiling and analyze tissue distribution of a large number of transcripts in the mouse vomeronasal organ (VNO). The use of these complementary tools provides a robust approach to investigating many aspects of vomeronasal sensory neuron (VSN) biology based on transcriptomics. Harnessing the power of these techniques, the authors present the discovery of previously unidentified sensory neuron types in the mouse VNO. Furthermore, they report co-expression of chemosensory receptors from different clades on individual neurons, including the co-expression of VR and OR. Finally, they evaluated the correlation between transcription factor expression and putative surface axon guidance molecules during the development of different neuronal lineages. Based on such correlation analysis, the authors further propose a putative cascade of events that could give rise to different neuronal lineages and morphological organization.

      We appreciate the authors' efforts to add context and citations that relate to recent single cell RNA sequencing studies in the VNO as well as to studies on vomeronasal receptors co-expression and V1R/V2R lineage determination. We also appreciate the new details on the marker genes used for cell annotation as well as clarifications about the differences between juvenile versus adult or male versus female samples.

      A concern still remaining is that two major claims/interpretations - i.e., identification of canonical OSNs and a novel type sVSNs in the mouse VNO - either require experimental substantiation or the authors' claims should be toned down. In their response, Hills Jr. et al. acknowledge that their "paper is primarily intended as a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset and discoveries based on the transcriptomic data that can support and inspire ongoing and future experiments in the field." The authors also write that given "the limited number of genes that we can probe using Molecular Cartography, the number of genes associated with sVSNs may be present in the non-sensory epithelium. This could lead to the identification of cells that may or may not be identical to the sVSNs in the non-neuronal epithelium. Indeed, further studies will need to be conducted to determine the specificity of these cells." Moreover, Hills Jr. et al. acknowledge that as "any transcriptomic study will only be correlative, additional studies will be needed to unequivocally determine the mechanistic link between the transcription factors with receptor choice. Our model provides a basis for these studies." We agree with all these points. Importantly, in the revised manuscript, the authors do not acknowledge that their primary intention is to present "a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset", nor do they acknowledge any of the other caveats/limitations mentioned above. We believe that the authors should not only mention these aspects in their response to the reviews, but they should also make these intentions/caveats/limitations very clear in the manuscript text.

    4. Reviewer #3 (Public review):

      This study presents a detailed examination of the molecular and cellular organization of the mouse VNO, unveiling new cell types, receptor co-expression patterns, lineage specification regulation, and potential associations between transcription factors, guidance molecules, and receptor types crucial for vomeronasal circuitry wiring specificity. The study identifies a novel type of VSN molecularly different from classic VSNs, which may serve as accessory to other VSNs by secreting olfactory binding proteins and mucins in response to VNO activation. They also describe a previously undetected co-expression of multiple VRs in individual VSNs, providing an interesting view to the ongoing discussion on how receptor choice occurs in VSNs, either stochastic or deterministic. Finally, the study correlates the expression of axon guidance molecules associated with individual VRs, providing a putative molecular mechanism that specifies VSN axon projections and their connection with postsynaptic cells in the accessory olfactory bulb.

      The conclusions of this paper are well supported by data, but some aspects of data analysis and acquisition need to be clarified and extended.

      (1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Fig 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.

      (2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs.

      (4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, e.g., sustentacular cells, HBCs, or other cells?

      Review update:

      I believe the novel discovery of two classes of sensory neurons within the VNO, canonical olfactory sensory neurons (OSNs) and secretory vomeronasal sensory neurons (sVSNs), should be interpreted with caution. Firstly, these cell types are relatively rare, constituting less than 2% of total cells and only 2-6% of the neuronal population (according to Fig. S3). While the OSNs exhibit gene expression profiles consistent with canonical olfactory signal transduction and cilia-related gene ontology, key aspects such as their cell morphology (including the presence of cilia) and functional evidence for chemosignal detection have yet to be demonstrated. The neuronal lineage of sVSNs remains unclear to me. It is uncertain what developmental trajectories these cells follow: do they arise as a specialized subtype of V1R or V2R lineages, or do they have an independent lineage determination, similar to OSNs? At what stage does the commitment to the sVSN lineage begin: during the INP stage or the immature sensory neuron stage? A pseudotime inference analysis of sVSNs could help clarify these questions.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      …several previous studies have identified co-expression of vomeronasal receptors by vomeronasal sensory neurons, and the expression of non-vomeronasal receptors, and this was not adequately addressed in the manuscript as presented.

      We’ve added context and citations to the Introduction and Results sections relating to recent studies on the co-expression of vomeronasal receptors and the expression of non-vomeronasal receptors in VSNs.

      The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret, and the methods are somewhat opaque.

      The Molecular Cartography platform relies on multiplex imaging of fluorescent probes that bind specifically to individual gene transcripts to determine their spatial location. Unfortunately, the detailed protocols remain proprietary at Resolve Biosciences and were not disclosed. We have clarified this in the revised manuscript. Our role in the acquisition and processing of data for this experiment is included in the current Methods section. Additional analyses of the Molecular Cartography data have been added to the supplemental materials (see response to Reviewer #2, below) to help clarify interpretation of the results.

      Reviewer #2:

      …the authors present a biased report of previously published work, largely including only those results that do not overlap with their own findings, but ignoring results that would question the novelty of the data presented here.

      We had no intention of misleading the readers. In fact, we have discussed discrepancies between our results and those of other studies. However, we inadvertently left out a critical publication in preparing the manuscript. We have added context and citations relating to recent studies that use single cell RNA sequencing in the vomeronasal organ, studies relating to the co-expression of vomeronasal receptors, and studies discussing V1R/V2R lineage determination. In the Discussion, we also compared our model with a previous model of genetic determination of VNO neuronal fate.

      Did the authors perform any cell selectivity, or any directed dissection, to obtain mainly neuronal cells? Previous studies reported a greater proportion of non-neuronal cells. For example, while Katreddi and co-workers (ref 89) found that the most populated clusters are identified as basal cells, macrophages, pericytes, and vascular smooth muscle, Hills Jr. et al. in this work did not report such types of cells. Did the authors check for the expression of marker genes listed in Ref 89 for such cell types?

      For VNO dissections, we removed bones and blood vessels from VNO tissue and only kept the sensory epithelium. This procedure removed vascular smooth muscle cells, pericytes, and other non-neuronal cell types, which explains differences in cell proportions between our study and previous studies. We used a DAPI/Draq5 assay to sort live/nucleated cells for sequencing and no specific markers were used for cell selection. All cells in the experiment were successfully annotated using the cell-type markers shown in Fig. 1B, save for cells from the sVSN cluster, which were novel, and required further analysis to characterize.

      The authors should report the marker genes used for cell annotation.

      Marker genes used for cell annotation are shown in figure 1B. A full list of all marker genes used in the cell annotation process has been added to the Methods section.

      The authors reported no differences between juvenile and adult samples, and between male and female samples. It is not clear how they evaluate statistically significant differences, which statistical test was used, or what parameters were evaluated.

      The claims made about male/female mice and P14/P56 mice directly pertain to the distribution of clusters and cells in UMAP space as seen in Figure 1 C & D. We have performed differential gene expression analysis for male/female and P14/P56 comparisons using the FindMarkers function from the Seurat R package. Although we have found significant differential expression between male and female, and between P14 and P56 animals, the genes in this list do not appear to be influential for the neuronal lineage and cell type specification or related to cell adhesion molecules, which are the main focuses of this study. Nevertheless, we have added these results to the supplemental materials.
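      For readers unfamiliar with the kind of comparison described above, the following is a minimal Python sketch of a two-sided Wilcoxon rank-sum test applied to one gene across two groups of cells. It is not the authors' code: they used FindMarkers from the Seurat R package, whose default test is a Wilcoxon rank-sum test. The function names and the expression values are hypothetical, the p-value uses a plain normal approximation without tie or continuity correction, and no multiple-testing correction is shown.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic for group_a, p-value).
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return u_a, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical normalized expression of one gene in male and female cells.
male_cells = [0.0, 1.2, 0.8, 2.1, 0.0, 1.5]
female_cells = [2.4, 3.1, 1.9, 2.8, 3.5, 2.2]
u_stat, p_value = rank_sum_test(male_cells, female_cells)
```

      In practice the test would be repeated for every gene and the p-values adjusted for multiple comparisons.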

      ‘Based on our transcriptomic analysis, we conclude that neurogenic activity is restricted to the marginal zone.’ This conclusion is quite a strong statement, given that this study was not directed to carefully study neurogenesis distribution, and when neurogenesis in the basal zone has been proposed by other works, as stated by the authors.

      We have used fourteen slides from whole VNO sections in our Molecular Cartography analysis to quantify the number of GBCs, INPs, and iVSNs predicted in the marginal zone, the intermediate zone, and the main/medial zone. We have performed a Wilcoxon signed-rank test to check for the significant presence of GBCs, INPs, and iVSNs in the marginal zone over their presence in the main/medial zone. The results are included in new Figure S3. The results of this analysis justify our claim that neurogenesis is restricted to the MZ. This claim is also supported by the 2021 study by Katreddi & Forni.
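      As an illustration of the paired comparison described above, here is a minimal Python sketch of an exact one-sided Wilcoxon signed-rank test. It is not the authors' code. The per-slide counts are invented for illustration, the function name is hypothetical, and the one-sided alternative (marginal zone greater than main/medial zone) is an assumption, since the response does not state which alternative was tested.

```python
from itertools import product

def signed_rank_test(x, y):
    """Exact one-sided Wilcoxon signed-rank test of H1: x tends to exceed y.

    Zero differences are dropped; tied absolute differences share the mean rank.
    Returns (W_plus, p_value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)
    # Mean 1-based rank for each distinct absolute difference.
    rank_of = {}
    for value in set(abs_sorted):
        first = abs_sorted.index(value)
        count = abs_sorted.count(value)
        rank_of[value] = first + (count + 1) / 2
    ranks = [rank_of[abs(d)] for d in diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 each rank is equally likely to carry a plus or minus sign,
    # so enumerate all 2**n sign assignments (cheap for n around 14).
    n_extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus:
            n_extreme += 1
    return w_plus, n_extreme / 2 ** len(ranks)

# Hypothetical counts of predicted progenitor cells on fourteen slides.
marginal_zone = [12, 9, 15, 11, 8, 14, 10, 13, 9, 12, 16, 7, 11, 10]
main_zone = [2, 1, 3, 0, 2, 1, 4, 2, 1, 3, 2, 1, 0, 2]
w_plus, p_value = signed_rank_test(marginal_zone, main_zone)
```

      With fourteen pairs, the smallest attainable one-sided p-value is 1/2^14, which occurs when every slide has more cells in the marginal zone than in the main/medial zone.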

      The authors report at least two new types of sensory neurons in the mouse VNO, a finding of huge importance that could have a substantial impact on the field of sensory physiology. However, the evidence for such new cell types is based solely on this transcriptomic dataset and, as such, is quite weak, since many crucial morphological and physiological aspects would be missing to clearly identify them as novel cell types. As stated before, many control and confirmatory experiments, and a careful evaluation of the results presented in this work must be performed to confirm such a novel and interesting discovery. The reported "novel classes of sensory neurons" in this work could represent previously undescribed types of sensory neurons, but also previously reported cells (see below) or simply possible single-cell sequencing artefacts.

      The reviewer is correct that detailed morphological and physiological studies are needed to further understand these cells. This is an opinion we share. Our paper is primarily intended as a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset and discoveries based on the transcriptomic data that can support and inspire ongoing and future experiments in the field. Nonetheless, we are confident that neither of the novel cell clusters is the result of sequencing artefacts. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package, SoupX, multiplet cell detection and removal with the Python module, Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are physically connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent gene expression (new Figure S4H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. We have performed pseudotime analysis of sVSNs, and differential gene expression and gene ontology analyses of mOSNs. The results are shown in the new Figure S6.
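      The mitochondrial cut-off mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' pipeline: the function names and counts are hypothetical, mitochondrial genes are assumed to be identifiable by the mouse "mt-" symbol prefix, and the threshold is assumed to be inclusive (cells at exactly 5% are kept). The SoupX and Scrublet steps are not reproduced here.

```python
def mito_fraction(counts, gene_names, prefix="mt-"):
    """Fraction of a cell's counts that come from mitochondrial genes."""
    total = sum(counts)
    if total == 0:
        return 0.0
    mito = sum(c for c, g in zip(counts, gene_names)
               if g.lower().startswith(prefix))
    return mito / total

def filter_cells(count_matrix, gene_names, max_mito=0.05):
    """Return indices of non-empty cells with mitochondrial fraction <= max_mito."""
    return [i for i, row in enumerate(count_matrix)
            if sum(row) > 0 and mito_fraction(row, gene_names) <= max_mito]

# Hypothetical counts: rows are cells, columns follow gene_names.
gene_names = ["mt-Co1", "mt-Nd1", "Omp", "Gnal"]
count_matrix = [
    [1, 1, 50, 48],   # 2% mitochondrial: kept
    [10, 5, 40, 45],  # 15% mitochondrial: removed
    [0, 0, 0, 0],     # empty barcode: removed
]
kept = filter_cells(count_matrix, gene_names)
```

      A high mitochondrial fraction is commonly taken as a sign of damaged or dying cells, which is why such a cut-off is relevant to the reviewer's concern about low-quality cells.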

      The authors report the co-expression of V2R and Gnai2 transcripts based on sequencing data. That could dramatically change classical classifications of basal and apical VSNs. However, did the authors find support for this co-expression in spatial molecular imaging experiments?

      Genes with extremely high expression levels overwhelm signals from other genes, and therefore had to be removed from the experiment. This is a limitation of the Molecular Cartography platform. Unfortunately, Gnai2 was determined to be one of these genes and was not evaluated for this purpose.

      Canonical OSNs: The authors report a cluster of cells expressing neuronal markers and ORs and call them canonical OSN. However, VSNs expressing ORs have already been reported in a detailed study showing their morphology and location inside the sensory epithelium (References 82, 83). Such cells are not canonical OSNs since they do not show ciliary processes, they express TRPC2 channels and do not express Golf. Are the "canonical OSNs" reported in this study and the OR-expressing VSNs (ref 82, 83) different? Which parameters, other than Gnal and Cnga2 expression, support the authors' bold claim that these are "canonical OSNs"? What is the morphology of these neurons? In addition, the mapping of these "canonical OSNs" shown in Figure 2D paints a picture of the negligible expression/role of these cells (see their prediction confidence).

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell-type in the data. After performing differential gene expression analysis on the putative mOSN cluster, comparing it with V1R and V2R VSNs independently, GO analysis returned ‘cilium’ as the top significantly enriched GO cellular component. This new piece of data is presented in the updated Figure S6. Because we were limited to a list of 100 genes in the Molecular Cartography probe panel, we prioritized the detection of canonical VNO cell-types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs.

      Secretory VSN: The authors report another novel type of sensory neurons in the VNO and call them "secretory VSNs". Here, the authors performed an analysis of differentially expressed genes for neuronal cells (dataset 2) and found several differentially expressed genes in the sVSN cluster. However, it would be interesting to perform a gene expression analysis using the whole dataset including neuronal and non-neuronal cells. Could the authors find any marker gene that unequivocally identifies this new cell type?

      We did not find unequivocal marker genes for sVSNs. We did perform differential analysis of the sVSN cluster with whole VNO data and with the neuronal subset, as well as against specific cell-types. We could not find a single gene that was perfectly exclusive to sVSNs. We used a combinatorial marker-gene approach to predicting sVSNs in the Molecular Cartography data. This required a larger subset of our 100 gene panel to be dedicated to genes for detecting sVSNs.

      When the authors evaluated the distribution of sVSN using the Molecular Cartography technique, they found expression of sVSN in both sensory and non-sensory epithelia. How do the authors explain such unexpected expression of sensory neurons in the non-sensory epithelium?

      In our scRNA-Seq experiment, blood vessels were removed, limiting the power to distinguish between certain cell types. Because of the limited number of genes that we can probe using Molecular Cartography, the number of genes associated with sVSNs may be present in the non-sensory epithelium. This could lead to the identification of cells that may or may not be identical to the sVSNs in the non-neuronal epithelium. Indeed, further studies will need to be conducted to determine the specificity of these cells.

      The low total gene count and low total read count, combined with an "expression of marker genes for several cell types" could indicate low-quality beads (contamination) that were not excluded with the initial parameter setting. It looks like cells in this cluster express a bit of everything: V1R, V2R, OR, secretory proteins.

      We are confident that the putative sVSN cell cluster is not the result of low-quality cells. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package, SoupX, multiplet cell detection and removal with the Python module, Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent expressions of genes (Fig. S1H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. Moreover, while some genes were expressed at a lower level when compared to the canonical VSNs, others were expressed at higher levels, precluding the cause of discrepancy as resulting from an overall loss of gene counts.
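      The mitochondrial cut-off mentioned in this response can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a cells-by-genes count matrix, mouse mitochondrial gene symbols prefixed with "mt-", and that cells at or below 5% are retained. The function names `mito_fraction` and `keep_cells` and the toy counts are hypothetical.

```python
import numpy as np

def mito_fraction(counts, gene_names, prefix="mt-"):
    """Fraction of each cell's total counts that comes from mitochondrial genes."""
    counts = np.asarray(counts, dtype=float)
    # Assumption: mitochondrial genes are identified by the "mt-" symbol prefix.
    is_mito = np.array([g.lower().startswith(prefix) for g in gene_names])
    total = counts.sum(axis=1)
    mito = counts[:, is_mito].sum(axis=1)
    # Cells with zero counts get a fraction of 1.0 so that they are filtered out.
    return np.divide(mito, total, out=np.ones_like(total), where=total > 0)

def keep_cells(counts, gene_names, max_mito=0.05):
    """Boolean mask of cells at or below the mitochondrial cut-off (assumed inclusive)."""
    return mito_fraction(counts, gene_names) <= max_mito

# Toy data (hypothetical): rows are cells, columns are genes.
gene_names = ["Omp", "Gnal", "mt-Co1", "mt-Nd1"]
counts = np.array([
    [90, 8, 1, 1],    # 2% mitochondrial, kept
    [50, 30, 15, 5],  # 20% mitochondrial, removed
    [0, 0, 0, 0],     # empty cell, removed
])
mask = keep_cells(counts, gene_names)
```

      Ambient-RNA correction and doublet removal, which the response attributes to SoupX and Scrublet, are separate steps and are not reproduced here.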

      The authors wrote ‘...the transcriptomic landscape that specifies the lineages is not known...’. This statement is not completely true, or at least misleading. There are still many undiscovered aspects of the transcriptomics landscape and lineage determination in VSNs. However, authors cannot ignore previously reported data showing the landscape of neuronal lineages in VSNs (Refs. 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). Expression of most of the transcription factors reported by this study (Ascl1, Sox2, Neurog1, Neurod1...) was already reported, and for some of them, their role was investigated during early developmental stages of VSNs (Refs. 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). In summary, the authors should fully include the findings from previous works (Refs. 88, 89, 90, 91 and doi.org/10.7554/eLife.77259), and clearly state what has already been reported, what is contradictory, and what is new when compared with the results from this work.

      This is a difference in opinion about the terminology. Transcriptomic landscape in our paper refers to the genome-wide expression by individual cells, not just individual genes. The reviewer is correct that many of the genetic specifiers have been identified, which we cited and discussed. We consider these studies as providing a “genetic” underpinning, rather than the “transcriptomic landscape” in lineage progression. To avoid confusion, we have revised the statement to “… the transcriptional program that specifies the lineages is not known.” 

      …the co-expression of specific V2Rs with specific transcription factors does not imply a direct implication in receptor selection. Directed experiments to evaluate the VR expression dependent on a specific transcription factor must be performed.

      The reviewer is correct, and we did not claim that the co-expression of specific transcription factors indicates a direct relationship with receptor selection. We agree that further directed experiments are required to investigate this question.

      This study reports that transcription factors, such as Pou2f1, Atf5, Egr1, or c-Fos could be associated with receptor choice in VSNs. However, no further evidence is shown to support this interaction. Based on these purely correlative data, it is rather bold to propose cascade model(s) of lineage consolidation.

      The reviewer is correct. As any transcriptomic study will only be correlative, additional studies will be needed to unequivocally determine the mechanistic link between the transcription factors and receptor choice. Our model provides a basis for these studies.

      The authors use spatial molecular imaging to evaluate the co-expression of many chemosensory receptors in single VNO cells. […] However, it is difficult to evaluate and interpret the results due to the lack of cell borders in spatial molecular imaging. The inclusion of cell border delimitation in the reported images (membrane-stained or computer-based) could be tremendously beneficial for the interpretation of the results.

      The most common practice for cell segmentation of spatial transcriptomics data is to determine cell borders based on nuclear staining with expansion. We have tested multiple algorithms based on recent studies, but each has its own caveat.

      It is surprising that the authors reported a new cell type expressing OR, however, they did not report the expression of ORs in Molecular Cartography technique. Did the authors evaluate the expression of OR using the cartography technique?

      We were limited to a 100-gene probe panel and only included one OR. The expression was not high enough for us to substantiate any claims.

      Reviewer #3:

      (1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Figure 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell-type in the data. We have performed differential gene expression analysis on the putative mOSN cluster to compare with V1R and V2R VSNs. GO analysis returned the top significantly enriched GO terms, including many related to “cilium”, further supporting that these are OSNs. Because we were limited to a list of 100 genes in Molecular Cartography probe panels, we prioritized the detection of canonical VNO cell-types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs. With regard to Gnai2 and Go expression, we have examined our data from the OSNs dissociated from the olfactory epithelium and detected substantial expression of both. This new analysis provides additional support for our claim. We now present differentially expressed genes and GO term analysis of the mOSN class in the updated Figure S6.

      (2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.

      We wholeheartedly agree with this assessment. Our initial thought was that these were senescent VSNs, but the trajectory analysis did not support this scenario, leading us to propose that these are putative secretory cells. Our analysis also shows that overall, 46% of the putative sVSNs were from the P14 sample and 54% from P56. These cells comprise roughly 6.4% of all P14 cells and 8.5% of P56 cells. In comparison, 28.4% of all cells are mature V1R VSNs at P14, but the percentage rises to 46.7% at P56. The significant presence of sVSNs at P14, and the disproportionate increase when compared with mature VSNs, indicate that these are unlikely to be late developmental stage or senescent cells, although we cannot exclude these possibilities.

      We have included the sVSNs in a trajectory inference analysis and found that the pseudotime values of the sVSNs are within the range of those cells within the V1R and V2R lineages, indicating a similar maturity (Fig. S6).
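      The percentages quoted in this response can be compared as P56/P14 fold changes. The short calculation below is only an illustration that uses the four values stated above; the dictionary layout and variable names are hypothetical, and no statistical test is implied.

```python
# Share of all cells (%) at P14 and P56, as quoted in the response.
shares = {
    "sVSN": (6.4, 8.5),
    "mature V1R VSN": (28.4, 46.7),
}

# Fold change in share from P14 to P56 for each population.
fold_change = {name: p56 / p14 for name, (p14, p56) in shares.items()}

for name, fc in fold_change.items():
    print(f"{name}: {fc:.2f}-fold change in share from P14 to P56")
```

      On these numbers the sVSN share grows by about 1.33-fold, less than the roughly 1.64-fold growth of the mature V1R VSN share.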

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs. 

      The reviewer raised a good point. We did not observe differences between male and female mice, or between P14 and P56 mice, in the distribution of clusters and cells in UMAP space. However, our differential expression analysis revealed significantly differentially expressed genes in both comparisons. Results from these analyses are presented in the new Figures S1 and S2.

      (4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, i.e. sustentacular, HBC, or other cells?

      Sparsely expressed low counts of VR and OR genes were observed in non-neuronal cell-types. When their expression as a percentage of cell-level gene counts is considered, however, the expression is negligible when compared to the neurons. The observed expression may be explained by stochastic base-level expression, or it may be the result of remnant ambient RNA that passed filtering.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review): 

      Summary: 

      The fungal cell wall is a very important structure for the physiology of a fungus but also for the interaction of pathogenic fungi with the host. Although a lot of knowledge on the fungal cell wall has been gained, there is a lack of understanding of the meaning of β-1,6-glucan in the cell wall. In the current manuscript, the authors studied in particular this carbohydrate in the important human-pathogenic fungus Candida albicans. The authors provide a comprehensive characterization of cell wall constituents under different environmental and physiological conditions, in particular of β-1,6-glucan. Also, β-1,6-glucan biosynthesis was found to be likely a compensatory reaction when mannan elongation was defective. The absence of β-1,6-glucan resulted in a significantly sick growth phenotype and complete cell wall reorganization. The manuscript contains a detailed analysis of the genetic and biochemical basis of β-1,6-glucan biosynthesis which is apparently in many aspects similar to yeast. Finally, the authors provide some initial studies on the immune modulatory effects of β-1,6-glucan.

      Strengths: 

      The findings are very well documented, and the data are clear and obtained by sophisticated biochemical methods. It is impressive that the authors successfully optimized methods for the analyses and quantification of β-1,6-glucan under different environmental conditions and in different mutant strains.

      Weaknesses: 

      However, although already very interesting, at this stage there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors provide the first (to my knowledge) detailed characterization of cell wall b-1,6 glucan in the pathogen Candida albicans. The approaches range from biochemistry to genetics to immunology. The study provides fundamental information and will be a resource of exceptional value to the field going forward. Highlights include the construction of a mutant that lacks all b-1,6 glucan and the characterization of its cell wall composition and structure. Figure 5a is a feast for the eyes, showing that b-1,6 glucan is vital for the outer fibrillar layer of the cell wall. Also much appreciated was the summary figure, Figure 7, which presents the main findings in digestible form.

      Strengths: 

      The work is highly significant for the fungal pathogen field especially, and more broadly for anyone studying fungi, antifungal drugs, or antifungal immune responses.

      The manuscript is very readable, which is important because most readers will be cell wall nonspecialists.

      The authors construct a key quadruple mutant, which is not trivial even with CRISPR methods, and validate it with a complemented strain. This aspect of the study sets the bar high. The authors develop new and transferable methods for b-1,6 glucan analysis. 

      Weaknesses: 

      The one "famous" cell type that would have been interesting to include is the opaque cell. This could be included in a future paper.

      Reviewer #3 (Public Review): 

      Summary: 

      The cell wall of human fungal pathogens, such as Candida albicans, is crucial for structural support and modulating the host immune response. Although extensively studied in yeasts and molds, the structural composition has largely focused on the structural glucan β-1,3-glucan and the surface-exposed mannans, while the fibrillar component β-1,6-glucan, a significant component of the cell wall, has been largely overlooked. This comprehensive biochemical and immunological study by a highly experienced cell wall group provides a strong case for the importance of β-1,6-glucan contributing critically to cell wall integrity, filamentous growth, and cell wall stability resulting from defects in mannan elongation. Additionally, β-1,6-glucan responds to environmental stimuli and stresses, playing a key role in wall remodeling and immune response modulation, making it a potential critical factor for host-pathogen interactions.

      Strengths: 

      Overall, this study is well-designed and executed. It provides the first comprehensive assessment of β-1,6-glucan as a dynamic, albeit underappreciated, molecule. The role of β-1,6-glucan genetics and biochemistry has been explored in molds like Aspergillus fumigatus, but this work shines an important light on its role in Candida albicans. This is important work that is of value to Medical Mycology, since β-1,6-glucan plays more than just a structural role in the wall. It may serve as a PAMP and a potential modulator of host-pathogen interactions. In keeping with this important role, the rigor of the manuscript would benefit from a more physiological evaluation, ex vivo and preferably in vivo, of how β-1,6-glucan stimulates the immune system within the cell wall and not just as a purified component. This is a critical outcome measure for this study and gets squarely at its importance for host-pathogen interactions, especially in response to environmental stimuli and drug exposure.

      Response to reviewers (Public reviews):

      We thank all three reviewers for their opinion on our work on Candida albicans β-1,6-glucan, which highlights the importance of this cell wall component in the biology of fungi. Here are our responses to their comments for public reviews:

      (1) Indeed, the data presented for immunological studies are preliminary. It has been acknowledged by the reviewers that our analysis, which provides insights into the biosynthetic pathways involved, is comprehensive in dealing with the organization and dynamics of the β-1,6-glucan polymer in relation to other cell wall components and environmental conditions (temperature, stress, nutrient availability, etc.). However, we anticipated that there would be immediate curiosity as to what the immunological contribution of β-1,6-glucan is, and we therefore felt we needed to initiate these studies and include them. We therefore performed immunological studies to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP), and if so, what its immunostimulatory potential is. Our data clearly suggest that β-1,6-glucan is a PAMP, and consequently lead to several questions: (a) what are the host immune receptors involved in the recognition of this polysaccharide, and thereby the downstream signaling pathways, (b) how is β-1,6-glucan differentially recognized by the host when C. albicans switches from a commensal to an opportunistic pathogen, and (c) how does the host environment impact the exposure of this polysaccharide on the fungal surface. We believe addressing these questions is beyond the scope of the present manuscript and aim to present new data in a future manuscript. Nonetheless, in the revised manuscript, we suggest approaches that we can take to identify the receptor that could be involved in the recognition of β-1,6-glucan. Moreover, we have modified the discussion, presenting it based on the data rather than being descriptive.

      (2) It will be interesting to assess the organization of β-1,6-glucan and other cell wall components in the opaque cells. It is documented that the opaque cells are induced at acidic pH and in the presence of N-acetylglucosamine and CO2. Our data shows that pH has an impact on β-1,6-glucan, which suggests that there will be differential organization of this polysaccharide in the cell wall of opaque cells. As suggested by the reviewer, we will include analysis of opaque cells (and other C. albicans cell types) in future studies. 

      With the exception of these major new avenues for this research, our revision can address each of the comments provided by the reviewers.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Although the study is very interesting, there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

      Specifically: 

      (1) As you showed, defects in chitin content led to a decrease in the cross-linking of β-glucans in the inner wall that corresponded to the effect of nikkomycin-treated C. albicans phenotype; conversely, an increase in chitin content led to more cross-linking of β-glucans as observed in the FKS1 mutant or in the presence of caspofungin. What is the mechanistic reason for these observations? 

      On one hand, yeast cell wall chitin occurs in three forms: free, and covalently linked to β-1,3-glucan or β-1,6-glucan; crosslinked β-glucan-chitin forms a core fibrillar structure resistant to alkali. A decrease in the chitin content, therefore, affects β-glucan-chitin crosslinking, thereby making β-glucan alkali-soluble. On the other hand, a decrease in the β-glucan content, as in the FKS1 mutant or upon caspofungin treatment, results in increased cell wall chitin and β-glucan-chitin contents. A decrease in β-1,3-glucan biosynthesis is associated with upregulation of CRH1, which is involved in β-glucan-chitin crosslinking; this explains the increased β-glucan-chitin content in the FKS1 mutant or upon caspofungin treatment. We have included this discussion in the revised manuscript (p14, lines 2-10).

      (2) The β-1,6-glucan biosynthesis is stimulated via a compensatory pathway when there is a defect in O- and N-linked cell wall mannan biosynthesis. Why? causality? Hypothesis?  

      Two phenomena were observed related to β-1,6-glucan and mannan biosynthesis: 1) a defect in the elongation of N-mannan led to an increase in the β-1,6-glucan content; 2) a defect in O-mannan elongation resulted in a reduced size of β-1,6-glucan chains but increased their branching. These observations suggest a global rescue program for cell wall damage that could occur due to a defect in one of the cell wall components. We have discussed this in the revised manuscript (p14, last paragraph, p15 first paragraph). Moreover, β-1,3-glucan and chitin are synthesized by their respective membrane-bound synthases, and a defect in the synthesis of one is compensated by the other. In line with this, although it needs to be validated for β-1,6-glucan, the biosynthesis of mannan and β-1,6-glucan seems to initiate intracellularly. Therefore, one possibility is that defective mannan biosynthesis could be compensated by β-1,6-glucan biosynthesis, but this needs to be further validated experimentally.

      (3) You showed that the removal of β-1,6-glucan by periodate oxidation (AI-OxP) led to a significant decrease in the IL-8, IL-6, IL-1β, TNF-α, C5a, and IL-10 released, suggesting that their stimulation was in part β-1,6-glucan dependent. What is the consequence of the stimulation, e.g. better phagocytosis, etc.? This needs some more experiments, otherwise the data is purely descriptive, as the conclusion. Also, what do you want to show with the activation of the complement system? Is β-1,6-glucan detected by complement receptors? I think this is really a loose end. I think it is necessary to provide more data on this observation, which I think lacks control with serum lacking complement, this should then be moved to the main manuscript.

      In this study, our aim was to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP) of C. albicans, and if yes, what is its immunostimulatory capacity/potential. Our data confirm that, indeed, β-1,6-glucan acts as a PAMP, and its removal significantly reduces the immunostimulatory capacity of the fibrillar core structure of the C. albicans cell wall. On the other hand, data provided in the revised manuscript (see updated Figure S14, discussion p13 lines 16-21) indicate that human serum factors significantly enhance the immunostimulatory capacity of β-1,6-glucan and that β-1,6-glucan interacts with the complement component C3b. However, addressing the role of β-1,6-glucan in phagocytosis using the β-1,6-glucan deletion mutant will not be possible, as the cell wall of this mutant is modified and β-1,6-glucan is not the only cell wall component interacting with C3b. An alternative is to coat β-1,6-glucan on beads and use them to study phagocytosis and identify immune receptors; however, these experiments are beyond the scope of our present study/focus.

      (4) Also, you suggested that β-1,6-glucan and β-1,3-glucan stimulate innate immune cells in distinct ways. Please provide more data on this interesting suggestion. You can block the dectin-1 receptor for example or use dectin-1 deficient macrophages from mice. The part on the immune stimulation needs to be optimized. 

      Stimulation of immune cells by pustulan (insoluble linear β-1,6-glucan) via a dectin-1-independent pathway has been described previously (PMIDs: 18005717, 16371356), as discussed in the manuscript. Our preliminary data indicate that dectin-1 blocking on immune cells (using anti-dectin-1 antibodies) has no effect on the immunostimulatory potential of β-1,6-glucan, unlike AI and AI-OxP, which showed significantly reduced cytokine secretion by the immune cells upon dectin-1 blocking. Deciphering β-1,6-glucan recognition and its immunomodulatory pathways is underway, and will be the subject of our future study/manuscript.

      (5) β-1,6-glucan and mannan productions are coupled. What is the hypothesis? Is it due to the necessity of mannan residues in β-1,6-glucan biosynthesis enzymes from the ER? Can that be experimentally proven?

      β-1,6-glucan and mannan synthesis should be coupled in two ways. First, as mentioned above (Response 2), defects in mannan elongation led to an alteration of β-1,6-glucan production. Second, early steps of N-glycosylation led to a strong reduction of β-1,6-glucan size and its cell wall content. However, we do not believe that the synthesis of N-glycan is required for the synthesis of an acceptor essential to β-1,6-glucan synthesis. A defect in N-mannan elongation led to a global cell wall remodeling as described above. Kre5, Rot2 and Cwh41 are part of the calnexin cycle involved in the control of N-glycoprotein folding in the ER, suggesting that some protein directly involved in β-1,6-glucan synthesis requires folding quality control to be active. We modified our discussion accordingly, highlighting these points (p14, last paragraph, p15 second paragraph).

      (6) 'As PHR1 and PHR2 genes are strongly regulated by external pH, the compensatory differences described may be explained by pH-dependent regulation of β-1,6-glucan synthesis.' Please check. Also, could the pH regulation form the basis of e.g. differences you found for β-1,6-glucan under different environmental conditions, i.e., growth on different carbon sources leads to different external pH values, as shown for many fungi?

      We agree that environmental pH is dependent on carbon source and that pH varies during the growth curve. To test the effect of pH, we buffered the medium with 100 mM MOPS or MES. Clearly, Fig. 2 and S1 show that the pH has an effect on the cell wall composition and polymer exposure, as previously described (PMID: 28542528). Here, we show that pH has an impact on the β-1,6-glucan size as well as its branching. However, in buffered medium, addition of an organic acid (such as acetate, propionate, butyrate or lactate) had an impact on cell wall composition, showing that pH is not the only factor affecting cell wall composition. Regarding the phr1Δ/Δ and phr2Δ/Δ mutants, we believe that the difference in the cell wall composition observed between the mutants is mainly due to the pH-dependent regulation, which we indicated in the discussion (p14, end of first paragraph).

      Minor: 

      (1) In Figure 7B: "dynamism" should be replaced by "dynamic", and "in term" should rather be "in terms".

      Modified as suggested.

      (2) Replace molecular size with molecular mass when you give daltons. 

      Molecular size has been replaced by molecular weight, when presented as daltons.

      (3) Page 7: for explanation, please add that nikkomycin is a chitin biosynthesis inhibitor.   

      As suggested, we have explained that nikkomycin is a chitin biosynthesis inhibitor.

      Reviewer #2 (Recommendations For The Authors):

      (1) I wondered if the increased chitin content of hyphae might reflect growth on the precursor GlcNAc. Have you tested hyphae that are induced in other ways? (2) Related to point 1, did you look at the relative abundance of yeast vs hyphae in the preparation? I wonder if yeast contamination might have reduced the extent of the composition changes observed. 

      We used GlcNAc as the hyphal inducer because: 1) in the presence of GlcNAc, hyphae are produced without any yeast contamination, and in this condition we observed an increase in the chitin content in hyphae, as described (PMID: 16423067); 2) we excluded the use of serum, another condition inducing hyphal formation, as we could not control serum factors that may impact cell wall composition. We now indicate in the methods section that hyphae induced by GlcNAc were not contaminated by yeast (p17, line 3).

      (3) I recommend rephrasing the first sentence of the Figure 2 legend: "Cells were grown in liquid SD medium at 37°C at exponential phase under different growth conditions." The conditions varied extensively - stationary is not exponential; biofilm is probably not exponential. Also, the "D" in "SD" stands for dextrose, and the carbon source varied a good deal. Perhaps you could say: "Cells were grown in liquid synthetic medium at 37°C under different growth conditions, as specified in Methods."

      Sentences have been rephrased.  

      (4) Figure 7b has a typo: "dependant" for "dependent".

      The typo has been corrected.

      Reviewer #3 (Recommendations For The Authors):

      To explore the biochemical composition of the cell wall, the authors fractionated the wall component into three categories based on polymer properties and reticulations: sodium-dodecyl-sulphate-β-mercaptoethanol (SDS-β-ME) extract, alkali-insoluble (AI), and alkali-soluble (AS) fractions, and they developed several independent methods to distinguish between β-1,3-glucans and β-1,6-glucans. The composition and surface exposure of fungal cell wall polymers is known to depend on environmental growth conditions. It was shown that the cell wall of C. albicans hyphae increased chitin content (10% vs. 3%) and decreased β-1,6-glucan (18% vs. 23%) and mannan (13% vs. 20%) compared to the yeast form, and the reduced β-1,6-glucan content was associated with a smaller β-1,6-glucan size (43 vs. 58 kDa), suggesting that both the content and structure of β-1,6-glucan are regulated during growth and cellular morphogenesis. Similar behavior was observed when exposing cells to acid and neutral medium pH. The most significant cell wall alteration occurred in a lactate-containing medium, which led to a sharp reduction in structural core polysaccharides: chitin (-43%), β-1,3-glucan (-48%), and β-1,6-glucan (-72%). This reduction aligns with the previously observed decreases in inner cell wall layer thickness. As expected, the authors found that modulating chitin content genetically (chs3Δ/Δ knockout mutant) led to an increase of both β-1,3-glucan and β-1,6-glucan. An increase in chitin content following genetic alteration of FKS genes impacting glucan synthase or after exposure to the echinocandin caspofungin led to enhanced cross-linking of β-glucans. A slight increase in the β-1,3-glucan branching was also observed in the mnt1/mnt2Δ/Δ double mutant, suggesting that β-1,6-glucan and mannan synthesis may be coupled.

      - This effect is not that pronounced, and the relationship appears somewhat overstated and may reflect an indirect interaction. The authors should address accordingly. 

      We agree that this sentence was overstated. To make it clearer and less pronounced, we divided this sentence into two with less pronounced statements (p8, line 34).

      The genetics of β-1,6-glucan biosynthesis appear complex and a figure describing putative roles for specific genes would be beneficial. For example, KRE6 is a glucosyl hydrolase required for β-1,6-glucan biosynthesis.

      - It would be valuable to better understand the overall biosynthetic process. Please elaborate more in a figure. 

      Although proteins/enzymatic activities directly involved in the β-1,6-glucan biosynthesis have not yet been identified, as suggested by this reviewer, we included a schematic representation of this process based on our hypothesis (Figure S15, and p15 lines 17-22 in revised manuscript), indicating the possible involvement of Kre6p.  

      The deletion of KRE6 homologs, essential for β-1,6-glucan biosynthesis, resulted in the absence of β-1,6-glucan production, and significant structural alterations of the cell wall. This result nicely confirms the important role of β-1,6-glucan in regulating cell wall homeostasis. The absence of β-1,6-glucan was associated with increased (mutant vs. WT) chitin content (9.5% vs. 2.5%) and highly branched β-1,3-glucan (48% vs. 20%). TEM ultrastructure studies nicely showed the change in cell wall overall architecture. From a drug discovery perspective, since the blockade of β-1,6-glucan did not block growth, it may have more value as a potential virulence target. This would be valuable but needs to be assessed in animal model challenge competition experiments.

      - The authors may want to elaborate more. 

      We agree and changed “antifungal target” to “potential virulence target”.

      It is well known that β-1,3-glucan, mannan, and chitin serve as PAMPs, which induce immune responses. The role of β-1,6-glucan as a PAMP is not well understood, and the authors provide evidence that different cell wall extracted fractions with enriched constituents induce immune responses invoking cytokines, chemokines, and acute phase proteins, as well as the complement system. While this data clearly shows that β-1,6-glucan is immunologically active and potentially important for host-pathogen interactions, the analysis is preliminary and falls short of making this case.

      - This is a critical point in getting at the potential host signaling of β-1,6-glucan contained in the cell wall or shed by the cell (is this known?)

      - This analysis would be bolstered significantly by examining stimulation relative to other cell wall components, and most importantly, whole cell modulation of β-1,6-glucan exposure for immune presentation, and not just unnatural concentrated extracts. This can be readily accomplished with the various mutants in hand, as well as after exposure to various antifungal agents (echinocandins and nikkomycins) (see Hohl et al. 2008 JID). Additional validation would benefit from animal model studies to examine in vivo immune modulation.

      We agree with the reviewer. However, the main focus of our present work was to study the organization and dynamics of C. albicans cell wall β-1,6-glucan, and to explore its possible role as a pathogen-associated molecular pattern (PAMP). Our study indicates that, indeed, β-1,6-glucan acts as a PAMP with immunostimulatory potential. As pointed out by this reviewer, and similar to β-1,3-glucans, the exposure of β-1,6-glucan is probably a key point in the immune response. However, this investigation is beyond the scope of this study; it is underway and will be presented in our future work.

      - The Discussion would also benefit from an analysis of how β-1,6-glucan in Aspergillus fumigatus, which was largely elucidated by the same primary authors, compares.

      To our knowledge, β-1,6-glucan has never been identified, either by chemical analysis (PMID: 10869365; PMID: 36836270) or solid-state NMR (PMID: 34732740), in the cell wall of A. fumigatus, although a homolog of KRE6 is present in A. fumigatus but with unknown function.

    2. eLife Assessment

      The paper will be of broad interest to fungal biologists and fungal immunologists seeking to understand the biosynthesis of the fungal cell wall, in particular of β-1,6-glucan synthesis and the importance of this so far understudied constituent of the cell wall for cell wall integrity and immune response. The study is of fundamental significance and adds structural clarity to the genetic and biochemical basis of this difficult-to-analyze carbohydrate. It opens the potential for understanding its role in immune recognition and potentially as a drug target. Overall, the data is compelling, properly controlled and analyzed.

    3. Reviewer #1 (Public review):

      Summary:

      The fungal cell wall is a very important structure for the physiology of a fungus but also for the interaction of pathogenic fungi with the host. Although a lot of knowledge on the fungal cell wall has been gained, there is a lack of understanding of the meaning of β-1,6-glucan in the cell wall. In the current manuscript, the authors studied in particular this carbohydrate in the important human-pathogenic fungus Candida albicans. The authors provide a comprehensive characterization of cell wall constituents under different environmental and physiological conditions, in particular of β-1,6-glucan. Also, β-1,6-glucan biosynthesis was found to be likely a compensatory reaction when mannan elongation was defective. The absence of β-1,6-glucan resulted in a significantly sick growth phenotype and complete cell wall reorganization. The manuscript contains a detailed analysis of the genetic and biochemical basis of β-1,6-glucan biosynthesis, which is apparently in many aspects similar to yeast. Finally, the authors provide some initial studies on immune modulatory effects of β-1,6-glucan.

    4. Reviewer #2 (Public review):

      Summary:

      The authors provide the first (to my knowledge) detailed characterization of cell wall b-1,6 glucan in the pathogen Candida albicans. The approaches range from biochemistry to genetics to immunology. The study provides fundamental information and will be a resource of exceptional value to the field going forward. Highlights include the construction of a mutant that lacks all b-1,6 glucan and the characterization of its cell wall composition and structure. Figure 5a is a feast for the eyes, showing that b-1,6 glucan is vital for the outer fibrillar layer of the cell wall. Also much appreciated was the summary figure, Figure 7, that presents the main findings in digestible form.

      Strengths:

      The work is highly significant for the fungal pathogen field especially, and more broadly for anyone studying fungi, antifungal drugs, or antifungal immune responses.<br /> The manuscript is very readable, which is important because most readers will be cell wall nonspecialists.<br /> The authors construct a key quadruple mutant, which is not trivial even with CRISPR methods, and validate it with a complemented strain. This aspect of the study sets the bar high.<br /> The authors develop new and transferable methods for b-1,6 glucan analysis.

      Weaknesses:

      The one "famous" cell type that would have been interesting to include is the opaque cell. Please include it in the next paper!

    5. Reviewer #3 (Public review):

      Summary:

      The cell wall of human fungal pathogens, such as Candida albicans, is crucial for structural support and modulating the host immune response. Although extensively studied in yeasts and molds, the structural composition has largely focused on the structural glucan β-1,3-glucan and the surface exposed mannans, while the fibrillar component β-1,6-glucan, a significant component of the cell wall, has been largely overlooked. This comprehensive biochemical and immunological study by a highly experienced cell wall group provides a strong case for the importance of β-1,6-glucan contributing critically to cell wall integrity, filamentous growth, and cell wall stability resulting from defects in mannan elongation. Additionally, β-1,6-glucan responds to environmental stimuli and stresses, playing a key role in wall remodeling and immune response modulation, making it a potential critical factor for host-pathogen interactions.

      Strengths:

      Overall, this study is well designed and executed. It provides the first comprehensive assessment of β-1,6-glucan as a dynamic, albeit underappreciated, molecule. The role of β-1,6-glucan genetics and biochemistry has been explored in molds like Aspergillus fumigatus, but this work shines important light on its role in Candida albicans. This is important work that is of value to Medical Mycology, since β-1,6-glucan plays more than just a structural role in the wall. It may serve as a PAMP and a potential modulator of host-pathogen interactions.

      Weaknesses:

      In keeping with an important role in immune recognition, it was suggested that the manuscript rigor would benefit from a more physiological evaluation, ex vivo and preferably in vivo, of how β-1,6-glucan stimulates the immune system within the cell wall and not just as a purified component. This is a critical outcome measure for this study and gets squarely at its importance for host-pathogen interactions, especially in response to environmental stimuli and drug exposure. The authors addressed this issue contextually and indicate that it will require a more detailed immunologic evaluation, which is not in keeping with the intent of this foundational study.

    1. eLife Assessment

      This valuable study uses fluorescence lifetime imaging and steady-state and time-resolved transition metal ion FRET to characterize conformational transitions in the isolated cyclic nucleotide binding domain of a bacterial CNG channel. The data are compelling and support the authors' conclusions. The results advance the understanding of allosteric mechanisms in CNBD channels and have theoretical and practical implications for other studies of protein allostery. A limitation is that only the cytosolic fragments of the channel were studied.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1: 

      Limitations are that only the cytosolic fragments of the channel were studied, and the current manuscript does not do a good job of placing the results in the context of what is already known about CNBDs from other methods that yield similar information.

      In the revision, we have now added a paragraph in the discussion that addresses why the cytosolic fragment was used and a paragraph putting our results into the context of previous work on CNBD channels where possible. 

      (1) Why do the authors not apply their approach to the full-length channel? A discussion of any limitations that make this difficult would be worthwhile.

      Full-length ion channel protein expression is more challenging, and it was important to start with a simpler system. This is now stated in the discussion.

      (2) …nonetheless a comparison of the conformational heterogeneity and energetics obtained from these different approaches would help to place this work in a larger context.

      We have now added a paragraph in the discussion putting our work in a larger context and addressing the challenges of comparing our results to previous studies. 

      (3) Page 5 - 3:1 unlabeled:labeled subunits in mix => 42% of molecules have 3:1 stoichiometry as desired and 21% of molecules have 2:2 stoichiometry!!! (binomial distribution p=0.25, n=4). So 1/3 of molecules with labels have two labeled subunits. This does not seem like it is at all avoiding the problem of intersubunit FRET…

      From the experimental perspective, the 3:1 molar ratio stated is certainly a low estimate of the actual subunit ratios given our FSEC data in Figure 2D and the higher expression of the WT protein compared to labeled protein. Furthermore, even without the addition of any WT protein, the calculated contribution of intersubunit FRET is negligible given that the FRET efficiency is heavily dominated by the closest donor-acceptor distances (Figure 4). 
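      The stoichiometry figures quoted in comment (3) can be checked with a short calculation. The sketch below assumes that labeled and unlabeled subunits assemble independently and at random into tetramers (n = 4) with a labeled fraction p = 0.25; the function name is illustrative, and, as the response notes, the effective labeled fraction in the experiment is probably lower than this nominal value.

```python
from math import comb


def labeled_subunit_distribution(n_subunits: int = 4, p_labeled: float = 0.25) -> dict[int, float]:
    """Probability of finding k labeled subunits in an n-mer.

    Assumes independent, random incorporation of labeled subunits
    (a binomial model); this is an idealization, not a measured result.
    """
    q = 1.0 - p_labeled
    return {
        k: comb(n_subunits, k) * p_labeled**k * q ** (n_subunits - k)
        for k in range(n_subunits + 1)
    }


dist = labeled_subunit_distribution()

# Fraction of tetramers carrying at least one label.
p_any_label = 1.0 - dist[0]

# Among tetramers with at least one label, the fraction with exactly two.
frac_two_among_labeled = dist[2] / p_any_label

for k, p in dist.items():
    print(f"{k} labeled subunits: {p:.3f}")
print(f"two labels, given at least one label: {frac_two_among_labeled:.3f}")
```

      Under these assumptions the one-label and two-label classes come out at about 42% and 21%, matching the numbers in the comment, and roughly 31% of labeled tetramers carry exactly two labels.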

      (4) Figure 2E - Some monomers appear to still be present in the collected fraction. The authors should discuss any effect this might have on their results.

      We now describe in the text that, at the low concentrations (~10nM) used for mass photometry, a second small peak was observed of ~30kDa, which is below the analytical range for this method. This would not affect our results since all tmFRET experiments used higher protein concentrations to ensure tetramerization.

      (5) page 4 - "Time-resolved tmFRET, therefore, resolves the structure and relative abundance of multiple conformational states in a protein sample." - structure is not resolved, only a single distance.

      We have reworded this sentence.  

      Reviewer #2:

      Regarding cyclic nucleotide-binding domain (CNBD)-containing ion channels, I disagree with the authors when they state that "the precise allosteric mechanism governing channel activation upon ligand binding, particularly the energetic changes within domains, remains poorly understood". On the contrary, I would say that the literature on this subject is rather vast and based on a significantly large variety of methodologies…

      Despite this vast literature on the energetics of CNBD channels, there is no consensus about the energetics and coupling of domains that underlies the allosteric mechanism in any CNBD channel. We have added a separate paragraph in the discussion to clarify our meaning.

      In light of the above, I suggest the authors better clarify the contribution/novelty that the present work provides to the state-of-the-art methodology employed (steady-state and time-resolved tmFRET) and of CNBD-containing ion channels…

      …In light of the above, what is the contribution/novelty that the present work provides to the SthK biophysics?

      This work is the first use of the time-resolved tmFRET method to obtain intrinsic ΔG (of an apo conformation) and ΔΔG values for different ligands. It is also the first application of this approach to SthK or, indeed, to any protein other than MBP. This is mentioned in the introduction.
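      To illustrate how free energies relate to the conformational populations discussed here, the following is a minimal sketch assuming a simple two-state (resting vs. active) equilibrium. The function names, the sign convention (negative values favor the active state), the temperature, and the example fractions are all assumptions for illustration and are not taken from the manuscript.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(fraction_active: float, temperature_k: float = 298.15) -> float:
    """Free energy (kcal/mol) of the resting -> active transition.

    Assumes a two-state equilibrium, K = p_active / p_resting,
    and the convention deltaG = -RT ln K.
    """
    if not 0.0 < fraction_active < 1.0:
        raise ValueError("fraction_active must lie strictly between 0 and 1")
    k_eq = fraction_active / (1.0 - fraction_active)
    return -R_KCAL * temperature_k * math.log(k_eq)


def delta_delta_g(fraction_active_ligand: float, fraction_active_apo: float,
                  temperature_k: float = 298.15) -> float:
    """Change in deltaG caused by ligand binding, relative to the apo state."""
    return (delta_g(fraction_active_ligand, temperature_k)
            - delta_g(fraction_active_apo, temperature_k))


# Hypothetical populations, chosen only to exercise the functions.
apo = 0.1
bound = 0.9
print(f"apo deltaG:        {delta_g(apo):+.2f} kcal/mol")
print(f"ligand deltaG:     {delta_g(bound):+.2f} kcal/mol")
print(f"ligand deltadeltaG: {delta_delta_g(bound, apo):+.2f} kcal/mol")
```

      The point of the sketch is only that a measured state population fixes an equilibrium constant, and the difference between the apo and ligand-bound values isolates the energetic contribution of the ligand.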

      …On the basis of the above-cited work (Evans et al., PNAS, 2020) the authors should clarify why they have decided to work on the isolated Clinker/CNBD fragment and not on the full-length protein…

      We chose to start on the C-terminal fragment to provide a technically more tractable system for validating our approach using time-resolved tmFRET before moving to the more challenging full-length membrane protein. This is now addressed in a new paragraph in the discussion. 

      What is the advantage of using the Clinker/CNBD fragment of a bacterial protein and not one of HCN channels, as already successfully employed by the authors (see above citations)?

      We have chosen to perform these studies in SthK rather than a mammalian CNBD channel as SthK presents a useful model system that allows us to later express full-length channels in bacteria. In addition, the efficiency of noncanonical amino acid incorporation is much higher in bacteria than in mammalian cells.

      Reviewer #3: 

      While the use of a truncated construct of SthK is justified, it also comes with certain limitations…

      We agree that the truncated channel comes with limitations, but we still think that there is relevant energetic information from studies of the isolated CNBD. This is now addressed in the discussion. 

      I recommend the authors carefully assess their statements on allostery. …The authors also should consider discussing the discrepancies between their truncated construct and full-length channels in more detail.

      We added a paragraph in the introduction that now puts the conformational change of the CNBD in the context of the allosteric mechanism of the full-length channel. We also added a paragraph discussing in more detail the relationship between the energetics of the C-terminal fragment and the full-length channel.  

      Regarding the in silico predictions, it is unclear to me why the authors chose the closed state of SthK Y26F and the 'open' state of the isolated C-linker CNBD construct…

      The active cAMP-bound structure (4d7t) is a high-resolution X-ray crystallography structure and was chosen as the only model with a fully resolved C-helix. The resting state structure (7rsh) was selected as the only resting state structure that resolves the acceptor residue studied here (V417).

      Previously it has been shown that SthK (and CNG) goes through multiple states during gating. This may be discussed in more detail, especially when it comes to the simplified four-state model…

      As stated above, we added paragraphs to the introduction and discussion placing the conformational change of the CNBD in the context of the full-length channel.  

      It would be interesting to see how the conformational distribution of the C-helix position integrates with available structural data on SthK. In general, putting the results more into the context of what is known for SthK and CNG channels, could increase the impact.

      We now discuss the relationship between existing structures and energetics in the introduction.  

      This may be semantics, but when working with a truncated construct that is missing the transmembrane domains using 'open' and 'closed' state is questionable. I recommend the authors consider a different nomenclature.

      We refer to the conformational states of the CNBD as ‘resting’ and ‘active’ and used ‘closed’ and ‘open’ only for the conformational states of the pore.

    3. Reviewer #1 (Public review):

      Summary:

      The authors use fluorescence lifetime imaging (FLIM) and tmFRET to resolve resting vs. active conformational heterogeneity and free energy differences driven by cGMP and cAMP in a tetrameric arrangement of CNBDs from a prokaryotic CNG channel.

      Strengths:

      The data are excellent and provide detailed measures of the probability to adopt resting vs. activated conformations with and without bound ligands.

      Weaknesses:

      A limitation is that only the cytosolic fragments of the channel were studied.

    1. eLife Assessment

      This valuable work presents the latest version of CTFFIND, which is the most popular software for determination of the contrast transfer function (CTF) in cryo-electron microscopy. CTFFIND5 estimates and considers acquisition geometry and sample thickness, which leads to improved CTF determination. The paper describes compelling evidence that CTFFIND5 finds better CTF parameters than previous methods, in particular for tilted samples (e.g. for cryo-electron tomography) or where thickness is an issue (e.g. cellular samples, or electron microscopy at low voltages).

    2. Reviewer #1 (Public review):

      This work presents CTFFIND5, a new version of the software for determination of the Contrast Transfer Function (CTF) that models the distortions introduced by the microscope in cryoEM images. CTFFIND5 can take acquisition geometry and sample thickness into consideration to improve CTF estimation.

      To estimate tilt (tilt angle and tilt axis), the input image is split into tiles and correlation coefficients are computed between their power spectra and a local CTF model that includes the defocus variation according to a tilted plane. As a final step, by applying a rescaling factor to the power spectra of the tiles, an average tilt-corrected power spectrum is obtained and used for diagnostic purposes and to estimate the goodness of fit. This global procedure and the rescaling factor resemble those used in Bsoft, Warp, etc, with determination of the tilt parameters being a feature specific to CTFFIND5 (and formerly CTFTILT). The performance of the algorithm is evaluated with tilted 2D crystals and tilt-series, demonstrating accurate tilt estimation in general.
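      The "defocus variation according to a tilted plane" can be sketched as follows. This is a minimal illustration, not CTFFIND5's implementation: the function name, the choice of a tilt axis passing through the image center, the axis angle measured from the x axis, and the sign of the defocus change are all assumed conventions.

```python
import math


def local_defocus(x_px: float, y_px: float, mean_defocus_a: float,
                  tilt_axis_deg: float, tilt_angle_deg: float,
                  pixel_size_a: float) -> float:
    """Defocus (Å) at a tile center for a specimen lying on a tilted plane.

    Coordinates are in pixels relative to the image center. The tilt axis is
    assumed to pass through the center at `tilt_axis_deg` from the x axis.
    """
    phi = math.radians(tilt_axis_deg)
    theta = math.radians(tilt_angle_deg)
    # Signed perpendicular distance of the tile from the tilt axis, in pixels.
    distance_px = -x_px * math.sin(phi) + y_px * math.cos(phi)
    # Height change along the beam direction for a plane tilted by theta.
    return mean_defocus_a + distance_px * pixel_size_a * math.tan(theta)


# Hypothetical tile grid: defocus ramps perpendicular to the tilt axis.
for y in (-1000, 0, 1000):
    dz = local_defocus(0, y, mean_defocus_a=20000.0, tilt_axis_deg=0.0,
                       tilt_angle_deg=30.0, pixel_size_a=1.5)
    print(f"tile at y = {y:+5d} px -> defocus {dz:.0f} Å")
```

      A tile-based search would evaluate a CTF model with this local defocus for each tile and score candidate axis and angle values against the tile power spectra.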

      CTFFIND5 represents the first CTF determination tool that considers the thickness-related modulation envelope of the CTF first described by McMullan et al. (2015) and experimentally confirmed by Tichelaar et al. (2020). To this end, CTFFIND5 uses a new CTF model that takes the sample thickness into account. CTFFIND5 thus provides more accurate CTF estimation and, furthermore, gives an estimation of the sample thickness, which may be a valuable resource to judge the potential for high resolution. To evaluate the accuracy of thickness estimation in CTFFIND5, the authors use the Lambert-Beer law on energy-filtered data and also tomographic data, thus demonstrating that the estimates are reasonable for images with exposure around 30 e/Å². While consideration of sample thickness in CTF determination sounds ideally suited for cryoET, practical application under the standard acquisition protocols in cryoET (exposure of 3-5 e/Å² per image) is still limited. In this regard, the authors are precise in the conclusions and clearly identify the areas where thickness-aware CTF determination will be valuable at present: in situ single particle analysis and in vitro single particle cryoEM of large specimens (e.g. viral particles).

      In conclusion, the manuscript introduces novel methods inside CTFFIND5 that improve CTF estimation, namely acquisition geometry and sample thickness. The evaluation demonstrates the performance of the new tool, with fairly accurate estimates of tilt axis, tilt angle and sample thickness and improved CTF estimation. The manuscript critically defines the current range of application of the new methods in cryoEM.

    3. Reviewer #2 (Public review):

      This paper describes the latest version of the most popular program for CTF estimation for cryo-EM images: CTFFIND5. New features in CTFFIND5 are the estimation of tilt geometry, including for samples, like FIB-milled lamellae, that are pre-tilted along a different axis than the tilt axis of the tomographic experiment, plus the estimation of sample thickness from the expanded CTF model described by McMullan et al (2015). The results convincingly show the added value of the program for thicker and tilted images, such as are common in modern cryo-ET experiments. The program will therefore have a considerable impact on the field.

      Comments on revised version:

      My comments have been addressed adequately.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their detailed comments. Several comments revolved around potential improvements in the 3D reconstructions that are obtained in later steps of the image processing pipelines for single-particle cryoEM and cryo-electron tomography. We have not investigated how our improvements in CTFFIND5 affect these downstream results and can therefore not make specific and quantitative statements in this regard. However, CTFFIND5 provided additional information about the sample that users will find useful (thickness, tilt) for selecting the data they would like to include in later processing, and how to process them. Furthermore, when the sample tilt of a thin specimen is known, local defocus estimates (e.g., per-particle defocus estimates) will be more accurate compared to estimates that ignore tilt information. In the following, we provide point-by-point responses to the reviewers’ comments.

      Reviewer #1 (Public Review):

      This work presents CTFFIND5, a new version of the software for determination of the Contrast Transfer Function (CTF) that models the distortions introduced by the microscope in cryoEM images. CTFFIND5 can take acquisition geometry and sample thickness into consideration to improve CTF estimation.

      To estimate tilt (tilt angle and tilt axis), the input image is split into tiles and correlation coefficients are computed between their power spectra and a local CTF model that includes the defocus variation according to a tilted plane. As a final step, by applying a rescaling factor to the power spectra of the tiles, an average tilt-corrected power spectrum is obtained and used for diagnostic purposes and to estimate the goodness of fit. This global procedure and the rescaling factor resemble those used in Bsoft, Warp, etc, with determination of the tilt parameters being a feature specific to CTFFIND5 (and formerly CTFTILT). The performance of the algorithm is evaluated with tilted 2D crystals and tilt-series, demonstrating accurate tilt estimation in some cases and some limitations in others. Further analysis of CTF determination with tilt-series, particularly showing whether there is accurate or stable estimation at high tilts, might be helpful to show the robustness of CTFFIND5 in cryoET.

      CTFFIND5 represents the first CTF determination tool that considers the thickness-related modulation envelope of the CTF first described by McMullan et al. (2015) and experimentally confirmed by Tichelaar et al. (2020). To this end, CTFFIND5 uses a new CTF model that takes the sample thickness into account. CTFFIND5 thus provides more accurate CTF estimation and, furthermore, gives an estimation of the sample thickness, which may be a valuable resource to judge the potential for high resolution. To evaluate the accuracy of thickness estimation in CTFFIND5, the authors use the Lambert-Beer law on energy-filtered data and also tomographic data, thus demonstrating that the estimates are reasonable for images with exposure around 30 e/Å². While consideration of sample thickness in CTF determination sounds ideally suited for cryoET, practical application under the standard acquisition protocols in cryoET (exposure of 3-5 e/Å² per image) is still limited. In this regard, the authors are honest in the conclusions and clearly identify the areas where thickness-aware CTF determination will be valuable at present: e.g. in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages.

      In conclusion, the manuscript introduces novel methods inside CTFFIND5 that improve CTF estimation, namely acquisition geometry and sample thickness. The evaluation demonstrates the performance of the new tool, with fairly accurate estimates of tilt axis, tilt angle and sample thickness and improved CTF estimation. The manuscript critically defines the current range of application of the new methods in cryoEM.

      Reviewer #2 (Public Review):

      Summary:

      This paper describes the latest version of the most popular program for CTF estimation for cryo-EM images: CTFFIND5. New features in CTFFIND5 are the estimation of tilt geometry, including for samples, like FIB-milled lamellae, that are pre-tilted along a different axis than the tilt axis of the tomographic experiment, plus the estimation of sample thickness from the expanded CTF model described by McMullan et al (2015). The results convincingly show the added value of the program for thicker and tilted images, such as are common in modern cryo-ET experiments. The program will therefore have a considerable impact on the field.

      I have only minor suggestions for improvement below:

      Abstract: "[CTF estimation] has been one of the key aspects of the resolution revolution"-> This is a bit over the top. Not much changed in the actual algorithms for CTF estimation during the resolution revolution.

      We have removed this statement in the abstract.

      L34: "These parameters" -> Cs is typically given, only defocus (and if relevant phase shift) are estimated.

      We have modified the introduction to reflect this. Page 3, L30-35

      L110-116: The text is ambiguous: are rotations defined clockwise or counter-clockwise? It would be good to explicitly state what subsequent rotations, in which directions and around which axes this transformation matrix (and the input/output angles in CTFFIND5) correspond to.

      Thank you for pointing this out. We have revised the Methods section (Page 4, L57-61) to explicitly define the convention for the tilt axis and tilt angle. We have also modified Fig. 1b to illustrate our convention for the tilt axis.

      L129-130: As a suggestion: it would be relatively easy, and possibly beneficial to the user, to implement a high-resolution limit that varies with the accumulated dose on the sample. One example of this exists in the tomography pipeline of RELION-5.

      We appreciate the suggestion. However, since CTFFIND5 currently has no concept of a tilt-series and treats every micrograph independently, this would not be trivial to implement. As detailed below, CTFFIND5 in its current form is not targeted toward tomography processing, but its features might be useful for its use in pipelines for tomography processing, such as RELION-5. We made this more explicit in the conclusion section. Page 16 L390-399

      Substituting Eq (7) into Eq (6) yields ksi=pi, which cannot be true. If t is the sample thickness, then how can this be a function of the frequency g of the first node of the CTF function? The former is a feature of the sample, the latter is a parameter of the optical system. This needs correction.

      We have rewritten the text describing equations 7 and 6 to avoid this confusion (Page 7, L146-153). The reviewer is right that inserting Eq. 7 into Eq. 6 yields ksi=pi; in fact, Eq. 7 is derived from Eq. 6 by substituting ksi=pi, since this describes the condition for the first node. Also, in this context, nodes in the CTF function refer to the places where the term sinc(ksi) becomes zero and therefore the CTF is apparently "flat". The frequency at which this occurs is sample-thickness dependent. As explained below, the previous version of our manuscript did not point out the difference between the first zero and first node in the power spectrum. We have amended Fig. 3a to make this difference clearer.
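      The relationship between thickness and the first node can be made concrete with a short sketch. The equations themselves are not reproduced in this response, so the code assumes that Eq. 6 has the form ksi = pi * lambda * g^2 * t (with an unnormalized sinc, whose first zero is at ksi = pi), which gives t = 1 / (lambda * g^2) at the first node. The function names and the example thicknesses are illustrative only.

```python
import math


def electron_wavelength_a(voltage_v: float) -> float:
    """Relativistic electron wavelength (Å) for an acceleration voltage in volts."""
    return 12.2643 / math.sqrt(voltage_v + 0.97845e-6 * voltage_v**2)


def xi(g_inv_a: float, thickness_a: float, wavelength_a: float) -> float:
    """Assumed form of the thickness term: ksi = pi * lambda * g^2 * t."""
    return math.pi * wavelength_a * g_inv_a**2 * thickness_a


def first_node_frequency(thickness_a: float, wavelength_a: float) -> float:
    """Spatial frequency (1/Å) where ksi = pi, i.e. where sinc(ksi) first vanishes."""
    return 1.0 / math.sqrt(wavelength_a * thickness_a)


def thickness_from_first_node(g_inv_a: float, wavelength_a: float) -> float:
    """Invert the node condition: t = 1 / (lambda * g^2)."""
    return 1.0 / (wavelength_a * g_inv_a**2)


wavelength = electron_wavelength_a(300e3)
for t in (500.0, 1000.0, 2000.0):
    g = first_node_frequency(t, wavelength)
    print(f"t = {t:6.0f} Å -> first node at {1.0 / g:.1f} Å resolution")
```

      Read this way, g is not a free optical parameter: it is the frequency at which the node is observed in the power spectrum, and the thickness is inferred from it, so thicker samples push the first node to lower resolution.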

      Reviewer #3 (Public Review):

      In this manuscript, the authors detail improvements in the core CTFFIND (CTFFIND5 as implemented in cisTEM) algorithm that better estimates CTF parameters from tilted micrographs and those that exhibit signal attenuation due to ice thickness. These improvements typically yield more accurate CTF values that better represent the data. Although some of the improvements result in slower calculations per micrograph, these can be easily overcome through parallelization.

      There are some concerns outlined below that would benefit from further evaluation by the authors.

      For the examples shown in Figure 3b, given the small differences in estimated defocus1 and 2, what type of improvements would be expected in the reconstructed tomograms? Do such improvements in estimates manifest in better tilt-series reconstruction?

      As explained in our preface, we do not believe that these differences would manifest in any improvements during tilt-series reconstruction and would not create any meaningful differences, even when tomograms are reconstructed with CTF correction. They might become meaningful during subtomogram averaging, but subtomograms are usually corrected using per-particle CTF estimation, similar to single-particle processing. We have included a new paragraph in the discussion to describe potential benefits of CTFFIND5 for cryo-tomography, Page 16 L390-399.

      Similarly, the data shown in Figure 3C shows minimal improvements in the CTF resolution estimate (e.g., 4.3 versus 4.2 Å), but exhibited several hundred Å difference in defocus values. How do such differences impact downstream processing? Is such a difference overcome by per-particle (local) CTF refinements (like the authors mention in the discussion, see below)?

      The difference in the defocus estimate (~600 Å) is substantially smaller than the thickness of the sample (2000 Å). Hence both estimates may be valid, depending on which particles inside the sample are considered. Particles with larger defocus errors could certainly be corrected by per-particle CTF refinement as long as the search range is chosen to be large enough. The main benefit of using CTFFIND5 is information for the user regarding the sample thickness to set the defocus search range appropriately.

      At which point does the thickness of the specimen preclude the ice thickness modulation to be included for "accurate" estimate? 500Å? 1000Å? 2000Å? Based on the data shown in Figure 3B, as high as 969 Å thick specimens benefit moderately (4.6 versus 3.4 Å fit estimate), but perhaps not significantly, from the ice thickness estimation. Considering the increased computational time for ice thickness estimation, such an estimate of when to incorporate for single-particle workflows would be beneficial.

      As explained in our preface, the main benefit for single-particle workflows will be sample tilt estimation. This will provide more accurate per-particle defocus estimates, compared to estimates that do not take the tilt into account. For single-particle samples, the ice thickness in holes is probably more efficiently monitored using the Beer-Lambert law.
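
      As a rough illustration of the Beer-Lambert approach mentioned above, here is a minimal sketch. It assumes the common log-ratio form t = L * ln(I_ref / I_ice), where I_ref is a reference intensity (for example, measured over an empty hole), I_ice is the intensity through the ice, and L is an apparent mean free path. The function name and all numerical values are hypothetical; L in particular must be calibrated for each microscope and is not taken from the manuscript.

```python
import math


def ice_thickness_beer_lambert(intensity_ice, intensity_reference, mean_free_path_nm):
    """Estimate ice thickness (nm) from t = L * ln(I_ref / I_ice).

    All arguments are assumptions for illustration; the mean free path
    must be calibrated per instrument and imaging condition.
    """
    if intensity_ice <= 0 or intensity_reference <= 0:
        raise ValueError("intensities must be positive")
    return mean_free_path_nm * math.log(intensity_reference / intensity_ice)


# Hypothetical readings: 20% of the beam intensity is lost in the ice.
thickness = ice_thickness_beer_lambert(
    intensity_ice=80.0,
    intensity_reference=100.0,
    mean_free_path_nm=300.0,  # placeholder value, not a calibrated constant
)
print(f"estimated ice thickness: {thickness:.1f} nm")
```

      Because this needs only two mean intensities per hole, it is far cheaper than fitting a thickness-aware CTF model, which is consistent with the response above.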

      It would seem that this statement could be evaluated herein: "the analysis of images of purified samples recorded at lower acceleration voltages, e.g., 100 keV (McMullan et al., 2023), may also benefit since thickness-dependent CTF modulations will appear at lower resolution with longer electron wavelengths". There are numerous examples of 300kV, 200kV, and 100kV EMPIAR datasets to be compared and recommendations would be welcomed.

      Publicly available datasets recorded at 100kV and 200kV were collected in very thin ice, making it difficult to demonstrate the stated benefits. We have removed this statement.

      Although logical, this statement is not supported by the data presented in this manuscript: "The improvements of CTFFIND5 will provide better starting values for this refinement, yielding better overall CTF estimation and recovery of high-resolution information during 3D reconstruction."

      We have revised this statement and now explain that the sample tilt information will provide more accurate per-particle defocus estimates, compared to estimates that do not take the tilt into account, Page 17, L400-409. We did not investigate how this will affect downstream processing results.

      Moreso, the lack of single-particle data evaluation does present a concern. Naively, these improvements would benefit all cryoEM data, regardless of modality.

      We agree with the reviewer that all cryoEM modalities should benefit from more accurate defocus value estimates and have amended our concluding statement. However, how improved defocus values will benefit downstream processing results will depend on the processing pipeline, which includes various points of user input and data-dependent choices. We have therefore limited our analysis to the outputs of CTFFIND5.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) CTFFIND5 in cryo-ET

      (1.1) CTFFIND4 is prone to unreliable CTF estimates at high tilts in cryoET, a situation that can be identified by high variability or 'unstable' estimates as a function of the tilt angle. Prof. Mastronarde recently illustrated this situation in his article JSB 216:108057, 2024 (Fig. 7). Therefore, the authors could add results to show whether the improvements to tilt estimation introduced in CTFFIND5 overcome this problem. So, in addition to the estimation of tilt angle and tilt axis in Figure 2, the estimated defocus could also be shown.

      We have worked with Prof. Mastronarde to help him use CTFFIND as a tool in his cryoET processing pipeline. Mastronarde chose CTFFIND because it contains algorithms and architecture that he could optimize for his purposes. CTFFIND5 is currently lacking the concept of a tilt series and can therefore not take advantage of the additional information that comes with tilt series. Our own applications for CTFFIND5 currently do not include tomography, and our results presented in Fig. 2 were obtained for validation of the tilt estimation feature. We did not attempt to duplicate Mastronarde’s optimization for reliable tilt series processing.

      Figure 2b of this manuscript already suggests that CTFFIND5 may exhibit some variability of defocus estimates at high tilts (in view of the variability of tilt axis angle). A strategy used in IMOD and TOMOCTF is to consider the tiles of a group of consecutive images (typically 35; especially at high tilts) to add more signal to the average spectrum, thus providing more reliable estimates (illustrated in Mastronarde's article JSB 216:108057, 2024, Fig. 8). Do the authors think that CTFFIND5 might include a strategy like this for cryoET tilt-series?

      We currently do not have plans to develop CTFFIND5 as a tool for tomography as there are already other excellent tools available, some of them based on CTFFIND’s basic algorithms (see previous comment).

      (1.2) In cryoET, the CTF is often determined on the aligned tilt-series, with the tilt axis typically running along the Y axis. Does CTFFIND5 have the option to skip estimation of the tilt geometry (tilt angle and/or axis) and, instead, take the tilt geometry directly from the alignment and/or from the microscope? This would significantly speed up determination of the CTF (in 1-2 seconds per image, according to Table 2) while still taking advantage of all power spectra in tilted images (as described in their tilt estimation algorithm) for improved CTF estimation. This strategy would be similar to what is done in Bsoft and IMOD.

      This is an excellent idea and we may implement this in an updated version. The current version is primarily meant for lamellae and single-particle samples where we usually have a single tilt in an unknown direction. For these cases, the suggested feature will have less benefit. 

      Thus, I suggest that the authors should also include results comparing CTF estimation in aligned tilt-series with CTFFIND4 and with CTFFIND5 (with no tilt estimation but indeed taking the tilt information from the alignment or the microscope into account). The results would show that CTFFIND5 is more robust than CTFFIND4, especially at high tilts.

      Thank you for this suggestion. We are now showing a comparison of defocus estimates from CTFFIND4 and CTFFIND5 in Fig. 2. Indeed, in one case CTFFIND5 seems to report more robust defocus values at high tilt.

      (1.3) The newer improvements in CTFFIND5 seem to be especially tailored to cryoET. The cryoET community will be highly attracted by these improvements. However, the current standard acquisition protocols (exposure of 3-5 e/A2 per image, tilts up to 60 degrees, etc) limit their full exploitation, particularly the thickness-aware CTF determination. I believe that adding a paragraph exclusively focused on cryoET and describing the potential benefits from CTFFIND5 and their limitations could enrich the Conclusion section. In this paragraph, the authors could highlight the great benefits from the tilt-aware CTF estimation. They could also discuss the current standard acquisition protocols (e.g. exposure 3-5 e/A2 per image, nominal defocus 3-5 microns, cellular thickness from 150 nm up to 200-300 nm that, at a tilt of 60 degrees, become 300 nm up to 400-600 nm) and their implications for the potential benefit from the improvements available in CTFFIND5.

      This reviewer is clearly excited about the potential application of CTFFIND5 in cryoET. We are sorry that we are currently not developing CTFFIND5 in this direction.
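
      The thickness figures quoted in point (1.3) follow from simple slab geometry: the beam path through a flat sample of thickness t tilted by an angle theta is t / cos(theta). The sketch below reproduces the reviewer's numbers under that assumption; the function name is hypothetical and the flat-slab model ignores any curvature or wedge shape of a real specimen.

```python
import math


def projected_thickness_nm(thickness_nm, tilt_deg):
    """Path length of the beam through a flat slab tilted by tilt_deg."""
    return thickness_nm / math.cos(math.radians(tilt_deg))


# Untilted thicknesses mentioned by the reviewer, evaluated at 60 degrees tilt.
for t in (150, 200, 300):
    print(f"{t} nm untilted -> {projected_thickness_nm(t, 60.0):.0f} nm at 60 degrees")
```

      At 60 degrees the path length doubles, which is why the thickness-dependent modulation becomes most relevant for the high-tilt images of a series.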

      (1.4) Apologies for insisting on cryoET in the previous points. I am just trying to suggest ideas to make CTFFIND5 even more helpful in cryoET. You can consider them now, or for a future version of the software, or just ignore them.

      Thanks for your suggestions. Since there is clearly demand for tools to process tomographic tilt series, we will keep these suggestions in mind for the future development of CTFFIND.

      (2) Tilt estimation

      (2.1) Page 4. Tiles for the initial steps in tilt estimation are of size 128x128. At which point are tiles of larger size (e.g. 512x512) used? Please define.

      Thank you for pointing out this lack of clarity. For the tilt estimation, we used a tile size of 128 x 128, which has been hard-coded in our program, as mentioned in line 68 on page 4. For generating the final power spectrum, we usually use size 512 x 512. This tile size can be defined by the user when running the program. We have now clarified this on Page 4, L74-76.

      (2.2) Page 6 and/or page 11: evaluation of tilt estimation with tilt-series.

      Please indicate the acquisition details of the tilt-series used for the evaluation, especially the exposure per image. This information is neither available in this manuscript nor in Elferich et al., 2022.

      Please, add these acquisition details similarly to page 9 in this manuscript (evaluation of sample thickness estimation using tomography): pixel size, exposure per image and total exposure, number of images, tilt range and interval

      The same tilt-series were used to verify tilt-estimation and sample thickness. We have revised the Methods section to make this clear on Page 5, L98-105 and Page 10, L202.

      (2.3) Page 10. Section Results. Subsection Tilt estimation.

      The authors use "defocus correction" to refer to their method for scaling the power spectra. "Defocus correction" might perhaps be a misleading term. In contrast, in page 4 the authors use the term "tilt correction". Please, revise and make it consistent throughout the manuscript.

      We agree and now use “tilt correction” throughout the manuscript.

      (2.4) Legend of Figure 2.

      Please add what the red dashed curve represents. Also, please note there might be an error in the estimated stage tilt axis angle: the legend states "171.8" where in the main text it is "178.2" (apparently, the latter is the correct one).

      Thank you for pointing this out. We have modified the legend and changed the number in the legend to 178.2°.

      (3) Thickness estimation

      (3.1) Line 141, page 7. The sentence reads: "The modulation of the CTF due to sample thickness t is described by the function E (current Equation 6), "  I believe that the modulation envelope of the CTF due to sample thickness is not really E (current Equation 6), but the function sinc(E). Please, revise.

      We have revised the manuscript as advised, Page 7, L148.

      (3.2) Line 148, page 7. The sentence reads "an estimate of the frequency g of the first node of the CTF_t function "

      The concept of 'node' was introduced by Tichelaar et al. (2020). The authors should not assume that this concept is familiar to the readership. So, it is suggested that the authors should introduce this concept in this section. For instance, just after Equation 6 they could add a sentence like this: "This sinc modulation envelope increasingly attenuates the amplitude of the Thon rings with increasing spatial frequencies in an oscillatory fashion, with locations where the amplitude is zero known as nodes (Tichelaar et al., 2020)."

      Thank you for this suggestion. We have revised the manuscript accordingly (Page 7, L151-156) and also marked the position of the first node in Fig. 3a.
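
      To make the notion of a node concrete, the following sketch estimates where the first node falls for a given sample thickness. It assumes the sinc argument has the form xi = pi * lambda * g^2 * t, as we understand it from Tichelaar et al. (2020), so that the first zero of sinc(xi) occurs at xi = pi, i.e. g = 1 / sqrt(lambda * t). The manuscript's Equation 6 is not reproduced in this response, so this expression and the function names are assumptions rather than the authors' code.

```python
import math

# Physical constants (SI units).
PLANCK = 6.62607015e-34          # J s
ELECTRON_MASS = 9.1093837015e-31  # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 299792458.0        # m / s


def electron_wavelength_angstrom(voltage_volts):
    """Relativistic electron wavelength in Angstrom."""
    energy = ELEMENTARY_CHARGE * voltage_volts
    rest_energy = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * energy * (1.0 + energy / (2.0 * rest_energy)))
    return PLANCK / momentum * 1e10


def first_node_resolution_angstrom(thickness_angstrom, voltage_volts):
    """Resolution 1/g of the first node, assuming xi = pi * lambda * g^2 * t = pi."""
    wavelength = electron_wavelength_angstrom(voltage_volts)
    return math.sqrt(wavelength * thickness_angstrom)


for kv in (300, 100):
    res = first_node_resolution_angstrom(2000.0, kv * 1e3)
    print(f"{kv} kV, 2000 A thick sample: first node near {res:.1f} A")
```

      Under this assumption the node moves to lower resolution as either the thickness or the wavelength increases, which matches the qualitative argument made about lower acceleration voltages.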

      (3.3) Line 154, page 8: A citation is lacking: "(corrected for astigmatism, as described in )". Perhaps the authors refer to the EPA (EquiPhase Averaging) method introduced by Zhang, JSB 193:1-12, 2016, 10.1016/j.jsb.2015.11.003.

      Thanks for spotting this omission. We have added the appropriate reference.

      (3.4) Figure 3.

      (3.4.1) Perhaps, the EPA (EquiPhase Averaging) method is used to reduce the 2D CTF to 1D curves, as represented in Figure 3b and 3c. Please, mention this in the legend of the figure or in the main text referring to Figure 3. The same might apply to Figure 1c.

      Thanks for spotting this omission. We have clarified that this is indeed an EPA in the figure legends.

      (3.4.2) Please indicate what the colored curves represent in 3b and 3c: the fitted CTF model (dashed red) and the EPA or astigmatism-corrected radial average of the power spectrum (solid black)?

      Thanks for spotting this omission. We have added descriptions of the colored lines in these plots (red = modeled CTF, blue = goodness of fit).

      (3.4.3) Please note that the power spectrum (solid black curves in Figure 3b and 3c) does not look the same in the top and bottom panels: Without thickness estimation (top panels), the power spectrum is in the range [0,1] in Y, as expected. However, with thickness estimation (bottom panels), the power spectrum seems to have undergone a frequency-dependent transformation (a rescaling or something that makes the power spectrum oscillate around 0.5 in Y). This transformation of the power spectrum resembles the thickness-induced sinc modulation of the CTF and seems to be appropriate to better fit the new thickness-aware CTF_t model in CTFFIND5 to the (transformed) power spectrum. However, this transformation of the power spectrum is not mentioned in the manuscript at all. Instead, according to the main text (page 8), the fitting method is based on the cross-correlation between the new CTF model and the power spectrum, so I was expecting to see the same power spectrum black curve in the top and bottom panels. Please clarify.

      Indeed, CTFFIND5 displays the power spectrum differently after thickness estimation. We have revised the methods to explain this (page 8, L178-181). The reviewer is also correct that the 1D line plots of the Thon ring patterns in Fig. 3b and 3c are not identical. These 1D plots are generated from the 2D plots according to the fitted CTF, which is needed to follow the astigmatic rings and avoid blurring of the oscillations in the radial average. This means that different CTF fits will also result in somewhat different 1D plots. However, these differences only affect the 1D EPA plots shown to the user. The actual fitting is performed against the same 2D spectra.

      (3.4.4) Line 319, Page 14. "A linear fit revealed .." It would be good to add a line with the linear fit in Figure 5.

      Agreed. The revised Fig. 5 now shows a line for the linear fit.

      (3.5) New CTF Model

      It is not clear from the text if the new CTF_t model is used at all times in CTFFIND5 or only when the user requests thickness estimation. Related to this, if the user requests both tilt estimation and thickness estimation, how is the CTF estimation process carried out in CTFFIND5? Are tilt and thickness estimated at the same time, or one after the other (i.e., first the tilt is estimated, followed by thickness estimation)? Please clarify.

      The new CTF_t model is only used when the user requests thickness estimation. When both tilt-estimation and thickness estimation are requested, the tilt is estimated first and the corrected power spectrum is then fitted using the CTF_t model. We have revised the Methods section to explain this better, Page 8, L158-159.

      (4) Pages 14-15. Section "CTF estimation and correction assists "

      This section just shows that correction of a highly underfocused image for the CTF with phase flipping or a Wiener filter reduces the CTF-induced fringes. I do not really understand the inclusion of this section to the manuscript. There is no contribution related to CTFFIND5.  

      The ability to apply a CTF correction to the input image according to Tegunov & Cramer is a new feature of apply_ctf, a program included with cisTEM. We think that this section fits into the theme of CTFFIND5 because the correction adds valuable information about the samples, such as FIB-milled lamellae.

      If the authors prefer to keep this section, then please take the following points into account:

      (4.1) Figure 6b: This is the only time that the term "EPA" (EquiPhase Averaging, I guess) is used in the manuscript. Please, spell it out somewhere in the manuscript, define what it means and add a proper citation, if convenient. This point is related to point 3.3 above.

      We have added the appropriate reference and defined EPA in the methods section as indicated in the reply to point 3.3.

      (4.2) Figure 6d. The contrast of this image is poor. Please, increase the contrast (to be similar to Figure 6c) so that the details can be better discerned. The image also shows a grainy texture, likely artefacts from the Wiener filter due to excessive amplification. Maybe the 'strength parameter' S of the deconvolution Wiener filter (Tegunov & Cramer, 2019) should be tuned down or the 'fall-off parameter' F tuned up to try to attenuate these artefacts.

      Agreed. The revised figure shows panel d with increased contrast with the custom fall-off parameter set to 1.3 and the custom strength parameter set to 0.7.

      (5) CTFFIND5 runtimes

      Table 2 shows that estimation of tilt increases the runtime up to 39 s in an image of 4070x2892 and to 208 s in one of 2880x2046. There is a significant difference between these two cases (39 s vs. 208 s) and the first image is much larger than the second. Why does CTFFIND5 on the smaller image take so long compared to the larger image?

      During tilt estimation, the images are binned to a pixel size of 5 Å. This causes micrograph 1 to be substantially smaller (in pixels) than micrographs 2 and 3, resulting in the faster runtime.
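
      The effect of binning on runtime can be illustrated with a short sketch. The image dimensions below are those quoted in the reviewer's question; the pixel sizes are NOT given in this exchange and are purely hypothetical placeholders chosen to show how a larger image can become the smaller one after binning to 5 Å per pixel. The function name is likewise an assumption.

```python
def binned_shape(width_px, height_px, pixel_size_angstrom, target_pixel_size_angstrom=5.0):
    """Image dimensions after resampling to the target pixel size."""
    scale = pixel_size_angstrom / target_pixel_size_angstrom
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# Dimensions from Table 2 as quoted above; pixel sizes are hypothetical.
examples = [
    ("4070x2892", 4070, 2892, 0.8),  # hypothetical fine pixel size
    ("2880x2046", 2880, 2046, 3.0),  # hypothetical coarse pixel size
]
for label, w, h, apix in examples:
    bw, bh = binned_shape(w, h, apix)
    print(f"{label} at {apix} A/px -> {bw}x{bh} after binning ({bw * bh} pixels)")
```

      With these placeholder values the first image shrinks far more than the second, so the number of tiles searched during tilt estimation, and hence the runtime, is lower despite its larger original size.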

      (6) Conclusions

      (6.1) In the Conclusion section, the authors could elaborate a bit the insights about the sample quality provided by CTFFIND5. This is stated in the title of the manuscript, but it was hardly mentioned in the manuscript.

      We have revised the conclusion to make this clearer (Page 16, L389-396). CTFFIND5 helps in estimating sample quality since (1) the sample thickness is an important determinant in the amount of high-resolution signal in a micrograph and (2) the estimated fit-resolution reflects more accurately the amount of signal present in a micrograph after tilt and sample thickness have been taken into account.

      (6.2) The authors nicely identify and describe the applications where thickness-aware CTF determination will be valuable: in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages. Perhaps, CTFFIND5 will also be of great interest for single particle cryoEM of thick specimens (e.g. capsid of large viruses with diameter in the range 120-200 nm such as PBCV-1 or HSV-1).

      Agreed. We have added this case to our Conclusions. (Fig. 3d)

      (7) Typographical errors:

      line 161, page 8. "1.5 time" should be "1.5 times"

      lines 185-191. All exposures are given in 'electrons/Angstrom', not in 'electrons/square Angstrom'

      line 206, page 10. With "slides" the authors seem to mean "slices"

      line 338, page 14: "describeD by Tegunov"

      line 349, page 15. "power spectra"

      lines 366 and 368, page 15: Note that Square Angstrom is written as "A2". Put "2" with superscript.

      Thank you for pointing out these errors. They have been corrected.

      (8) References:

      Reference: Lucas et al., eLife 10 e68946. Year is lacking. Add year: 2021.

      Reference: Yan et al. 2015 cited in line 169, page 8, does not appear in Bibliography. The authors may mean: Yan et al. 2015 JSB 192:287-296, 2015  

      It would be good to cite Bsoft, as it has a procedure similar to tilt-corrected CTF estimation: Heymann, Protein Science, 2021,  

      Thank you for carefully checking the cited references. We have revised the manuscript as suggested.

      Reviewer #2 (Recommendations For The Authors):

      I have only minor suggestions for improvement below:

      L218: "these option"

      Corrected

      L243: "chevron-shape" -> V-shape would be more accessible language for non-native speakers.

      Changed

      L281: "Based on these results we conclude that CTFFIND5 will provide more accurate CTF parameters" -> Given that the maximum resolutions of the fits by the old model and the new model are nearly the same, how big would the actual advantage of the new model be for subsequent sub-tomogram averaging?

      Please see our response above, Reviewer #3 (Public Review), 

      L376: The correct reference for RELION per-particle CTF estimation is Zivanov et al, (2018) [https://elifesciences.org/articles/42166]. Also, the cryoSPARC paper referenced does not describe per-particle CTF estimation and should thus be removed from this context.

      Thanks for pointing out these mistakes, which we have now corrected. We have chosen to keep the citation for CryoSPARC to reference the general software, but have added Zivanov et al. 2020 as suggested by the CryoSPARC website.

      Reviewer #3 (Recommendations For The Authors):

      Minor:

      Figure 1A legend - authors mention boxes but only 1 box is shown.

      Thank you for pointing this out. For visual clarity we decided to only show one box. We have corrected the legend.

      Figure 1B - it would be nice if the boxes that contributed to the power spectra were mapped on Figure 1A

      The shown power spectra are not actual data. Instead, we show power spectra with exaggerated defocus differences for visual clarity. We have revised the figure legends to make this clear. 

      The Y-axis legends in Figure 2 are not aligned vertically

      Corrected

      Figure 3A - CTFFIND4 is missing an "I"

      Corrected

      Figure 3 - Y-axis legends are not aligned vertically

      Corrected

      Page 16, line 376, Relion should be RELION

      We have revised the manuscript as advised.

      Typo in equation 5, sinc versus sin?

      “sinc” is correct here, since this is a thickness-dependent modulation of the CTF.

      Lambert-Beer's, Lambert-Beer are used variably but curious if Beer-Lambert should be used.

      We have revised the manuscript as advised.

    1. eLife Assessment

      This computational study integrates detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The findings of this important work are that abnormal ECGs that are associated with higher risk of sudden cardiac death are predicted to have almost no relationship with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia. The conclusions are based on compelling evidence for the need of incorporating additional risk factors for assessing post-myocardial infarction patients.

    2. Reviewer #1 (Public review):

      Summary:

      In this study from Zhou, Wang, and colleagues, the authors utilize biventricular electromechanical simulations to illustrate how different degrees of ionic remodeling can contribute to different ECG morphologies that are observed in either acute or chronic post-myocardial infarction (MI) patients. Interestingly, the simulations show that abnormal ECG phenotypes - associated with higher risk of sudden cardiac death - are predicted to have almost no correspondence with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia.

      Strengths:

      The numerical simulations are state-of-the-art, integrating detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The population of ventricular simulations provide mechanistic interpretation, down to the level of single cell ionic current remodeling, for different types of ECG morphologies observed in post-MI patients. Collectively, these results demonstrate compelling and significant evidence for the need of incorporating additional risk factors for assessing post-MI patients.

      The authors have addressed all of my previous concerns in this updated version.

    3. Reviewer #2 (Public review):

      Summary:

      The authors constructed multi-scale modeling and simulation methods to investigate the electrical and mechanical properties under acute and chronic myocardial infarction (MI). They simulated three acute MI conditions and two chronic MI conditions. They showed that these conditions gave rise to distinct ECG characteristics that have been seen in clinical settings. They showed that the post-MI remodeling reduced ejection fraction up to 10% due to weaker calcium current or SR calcium uptake, but the reduction of ejection fraction is not sensitive to remodeling of the repolarization heterogeneities.

      Strengths:

      The major strength of this study is the construction of the computer modeling that simulates both electrical behavior and mechanical behavior for post-MI remodeling. The links of different heterogeneities due to MI remodeling to different ECG characteristics provide some useful information for understanding the complex clinical problems.

      Weaknesses:

      The rationale (e.g., physiological or medical bases) for choosing the 3 acute MI and 2 chronic MI settings is not clear. Although the authors presented a huge number of simulation data, in particular in the supplemental materials, it is not clearly stated what novel findings or mechanistic insights this study gained beyond the current understanding of the problem.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study by Zhou, Wang, and colleagues, the authors utilize biventricular electromechanical simulations to illustrate how different degrees of ionic remodeling can contribute to different ECG morphologies that are observed in either acute or chronic post-myocardial infarction (MI) patients. Interestingly, the simulations show that abnormal ECG phenotypes - associated with a higher risk of sudden cardiac death - are predicted to have almost no correspondence with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia.

      Strengths:

      The numerical simulations are state-of-the-art, integrating detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The simulation provides mechanistic interpretation, down to the level of single-cell ionic current remodeling, for different types of ECG morphologies observed in post-MI patients. Collectively, these results demonstrate compelling and significant evidence for the need to incorporate additional risk factors for assessing post-MI patients.

      Weaknesses:

      The study is rigorous and well-performed. However, some aspects of the methodology could be clearer, and the authors could also address some aspects of the robustness of the results. Specifically, does variability in ionic currents inherent in different patients, or the location/size of the infarct and surrounding remodeled tissue impact the presentation of these ECG morphologies?

      We thank the reviewer for their considered evaluation. In response to the reviewer’s comments regarding variability in ionic currents, we have added simulations using a population of n=17 models with variability in ionic conductances in the baseline ToR-ORd model to the paper, to show the effect of such variation on the post-MI ECG presentation in acute and chronic conditions. This is now described in the Methods [lines 140, 158-161, 242-244, 245-246, 261-263], and shown in the methods Figure 1A, 1B. The ECG results using this population of models are shown in Figure 2C and described in [lines 333-335], and the pressure-volume results using the population of models are shown in Figure 5A and 5B and described in [lines 417-418, 442-444, 448-450]. The population of models showed patterns in both the ECG and LVEF consistent with the baseline model; this is discussed in [lines 563-564, 688-690].

      Regarding the effect of scar location and size on the ECG, we refer the reader and reviewer to a related paper where this is explored in depth using a formal sensitivity analysis and deep learning inference (https://pubmed.ncbi.nlm.nih.gov/38373128/). This is better able to do justice to this question rather than overloading this paper with additional investigations. We include a reference to this paper in the discussion section [lines 694-695].

      Reviewer #2 (Public Review):

      Summary:

      The authors constructed multi-scale modeling and simulation methods to investigate the electrical and mechanical properties of acute and chronic myocardial infarction (MI). They simulated three acute MI conditions and two chronic MI conditions. They showed that these conditions gave rise to distinct ECG characteristics that have been seen in clinical settings. They showed that the post-MI remodeling reduced ejection fraction up to 10% due to weaker calcium current or SR calcium uptake, but the reduction of ejection fraction is not sensitive to remodeling of the repolarization heterogeneities.

      Strengths:

      The major strength of this study is the construction of computer modeling that simulates both electrical behavior and mechanical behavior for post-MI remodeling. The links of different heterogeneities due to MI remodeling to different ECG characteristics provide some useful information for understanding complex clinical problems.

      Weaknesses:

      The rationale (e.g., physiological or medical bases) for choosing the 3 acute MI and 2 chronic MI settings is not clear. Although the authors presented a huge number of simulation data, in particular in the supplemental materials, it is not clearly stated what novel findings or mechanistic insights this study gained beyond the current understanding of the problem.

      We thank the reviewer for their careful evaluations of our work. The justification for selecting the 3 acute MI and 2 chronic MI states is based on clinical and experimental reports, as summarised in the Methods section [lines 245-247, 252-256, 264-266].  We have also highlighted the key novelty and significance of the study in the Discussion [lines 579-582].

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) This was clarified very late in the Discussion, but for most of the paper, I was unclear if heart geometry was the same for all simulations. Presumably, this includes the size and location of the infarct, BZ, and RZ. It would be helpful to clarify this in the Methods.

      This has been clarified in the first paragraph of the Methods section [lines 142-145].

      (2) On lines 224-226, the Methods refers to implementing several population members from the ToR-ORd model (in addition to the baseline) into the biventricular EM simulations. Is this in reference to the simulations shown in Figures 6 and 7, or different simulations? Please clarify.

      We now randomly select 17 of the 245 cell models in the population to be embedded in ventricular simulations, to produce a ventricular population of models. This allows us to explore the effect that physiological variability in the baseline ionic conductances has on the phenotypic representation of ionic remodellings in the ECG and LVEF. An explanation of this can be found in the Methods section [lines 241-244].
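
      The selection step described above can be sketched as follows. This is only an illustration of drawing 17 of 245 models without replacement; the function name, the use of integer indices to identify cell models, and the fixed seed are all assumptions and do not reflect how the authors actually performed the selection.

```python
import random


def select_population_subset(n_models=245, n_selected=17, seed=0):
    """Draw n_selected distinct model indices from a population of n_models."""
    rng = random.Random(seed)  # fixed seed (an assumption) makes the draw reproducible
    return sorted(rng.sample(range(n_models), n_selected))


subset = select_population_subset()
print(f"{len(subset)} cell models selected for ventricular simulations: {subset}")
```

      Recording the seed, or the selected indices themselves, lets readers rebuild the same ventricular population of models.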

      For Figures 6 and 7, we selected two arrhythmic cell models from the n=245 population of cell models to be embedded into two ventricular simulations to demonstrate the arrhythmic potential of the cellular model at ventricular scale. This has been clarified in Methods [lines 269-271].

      Additionally, for the cases where a population member is used, are all regions of the ventricles "scaled" in the same manner, or were only the properties of the particular region drawn from the population modified relative to baseline (e.g., mid-myocardial cells in Figure 6)?

      The cells were embedded according to transmural heterogeneity in the remote zone for Figures 6 and 7. This has been clarified in the Methods [lines 271-273].

      (3) Interestingly, the study finds that the ionic remodeling in different peri-infarct regions to be most critical in the ECG phenotype, which at least strongly suggests that inherent intra-patient variability in ion channel expression could also be critical.

      This is related to the comment on the use of population members. If the authors utilized one of the ventricular myocyte population members as the 'reference' (instead of the baseline ToR-ORd parameters) and applied the same types of remodeling as in Figures 3 and 4, would they expect the same ECG morphologies?

      We have now performed this test and selected 17 cell models from the population to create a ventricular population of models. On top of this ventricular population, we have applied the remodellings, and showed that the simulated ECG morphologies were mostly consistent across these 20 members (Figure 2C).

      (4) Related, do the authors expect that the location and/or size of the infarct and peri-infarct regions would impact the different ECG morphologies?

      Regarding the effect of scar location and size on the ECG, we refer the reader and reviewer to a related paper where this is explored in depth using a formal sensitivity analysis and deep learning inference (https://pubmed.ncbi.nlm.nih.gov/38373128/). We feel this is better able to do justice to this question rather than overloading this paper with additional investigations. We include a reference to this paper in the discussion section [lines 694-695].

      Reviewer #2 (Recommendations For The Authors):

      (1) Although the authors listed the parameters and cited the papers for the origins of the parameter changes in SM4 and table S4, it should be summarized in the methods section what are the major changes or differences for the 5 conditions. Furthermore, it should be stated what is the rationale for choosing these conditions. Are these choices based on clinical classifications or experimental conditions?

      The major differences between the 5 conditions have now been summarised in the Methods [lines 252-256, 264-266]. These remodellings have been collated from a range of experimental measurements in both human and animal data, which are summarised in Table S4. This has been clarified in Methods [lines 245-247].

      (2) Figure 3C and Figure 4C do not add any additional information beyond the conductance changes listed in Table 4, and I'd suggest removing them from the figures. On the other hand, it took me some time to look at Table 4 to figure out the corresponding changes. As commented above, the remodeling changes should be summarized in the main text to help reading.

      Figure 3C and 4C provide a visual explanation of the ionic remodellings in these conditions to echo the added descriptions in the text [lines 252-256, 264-266]. For this reason, we have elected to keep those figures in the manuscript.

      (3) The authors presented a large amount of data in Supplemental Materials; some may be unnecessary and some are difficult to follow. For example: 1) There is a lot of data in Table S6, but only a brief mention in the main text and the Table S6 legend. A summary of the data is needed for the readers to understand the properties of the different conditions, instead of letting the readers figure them out from the table. The same should be done for other tables and figures. There are some format issues for the tables, which mess up some of the numbers and text. 2) The data shown in Figures S25-29 provide almost no new information beyond the well-known effects of ionic currents on EAD genesis, i.e., EADs are promoted by inward currents and suppressed by outward currents. The data for alternans (Figures S18-22) are a little more complex than the cases for EADs, but I think that they can be simplified.

      Thanks for the suggestions. We have now extracted the key information from Tables S6-S9 and summarized it in the captions. We have also fixed the layout of the tables in this revision. The supplementary sections on alternans and EADs are simplified with the key parameters related to these proarrhythmic phenomena summarized in tables instead of showing all boxplots of parameter distributions (Tables S10 and S11).

      (4) The authors showed two mechanisms of alternans: EAD-driven and Ca-driven alternans in chronic MI. There are several distinct mechanisms of alternans including EAD-induced alternans (see the recent review by Qu and Weiss, Circ Res 132, 127(2023)). Theoretically, calcium alternans can also induce EAD alternans under proper conditions, can you rule out that the EAD alternans are not due to Ca alternans? The results in Fig.7D may say the opposite. There are some chicken-or-egg issues here.

      In Figure 7D, we showed that the epicardial cell type (blue trace) had stable EADs at fast pacing with no calcium alternans, while both the endocardial (red trace) and mid-myocardial (green trace) cell types failed to fully repolarise in every other beat. To explore whether the EAD alternans are driven by calcium alternans, we tested the effects of switching off the alternans-related remodelling, and the APs turned out to be normal. On the other hand, when we turned off the EAD-related remodelling, neither EADs nor alternans occurred. Therefore, the results show that the two types of ionic current remodelling are both necessary for the generation of EAD alternans (lines 656-659 in the discussion and SM9).

      (5) As for the formation of ectopic beats, it can be caused by EADs but it can also be caused by a repolarization gradient; they are not the same and differ in different AP models (Liu et al, CircAE 12, e007571 (2019), Zhang et al, Biophy J 120, 352(2021)). It is not clear here whether the primary cause is the repolarization gradient or EADs. At the tissue level, EADs tend to be suppressed by the repolarization gradient, so there is a Goldilocks zone between the EAD amplitude and the repolarization gradient for an ectopic beat to form.

      When isolated cells that showed EAD were embedded in ventricular tissue, we saw ectopic wave propagation. This was because the EADs in the RZ generated conduction block, which enabled a large repolarisation gradient to form between the BZ and RZ, thereby leading to ectopy. This has been clarified in the Results [lines 507-510].

      Additionally, we have clarified the presence of the EADs in the ventricular simulations by labelling where this occurs in the green, purple, and yellow traces in Figure 7C. This was easily missed before due to the stretched proportions of the traces in the x-axis, which is necessary to show clearly the repolarisation gradients that drive ectopy.

      (6) The authors showed many population simulations. I guess that they are all in single cells. If the population simulations were done in the whole heart, it should be stated how many models were simulated. If only one of the population models was selected for the whole heart for each case, it should clarify the rationale for choosing one of the many models. If populations of cells were modeled in the whole heart, clarify how the models were distributed in the heart.

      We now randomly select 17 of the 245 cell models in the population to be embedded in ventricular simulations, to produce a ventricular population of models. This allows us to explore the effect that physiological variability in the baseline ionic conductances has on the phenotypic representation of ionic remodellings in the ECG and LVEF. An explanation of this can be found in the Methods section [lines 241-244]. Whenever the cell models are embedded in the relevant zones, they are uniformly distributed according to the transmural heterogeneity [lines 271-273].  

      (7) QRS intervals in the simulations are much wider than the real recordings from patients (Figure 2 and Table S8). At least, a QRS of 120 ms for normal control is too wide and probably not normal.

      We have manually measured QRS duration and updated the delineation method to calculate the other biomarkers. The new values now lie within normal ranges and have been updated in SM Tables S7 and S8 and in Figure 2, and the new delineation method has been included in SM2.

    1. eLife Assessment

      This valuable study provides solid support for the participation of the BMP-binding domain of MuSK, a tyrosine kinase mostly known for its role at the neuromuscular junction, in the maintenance and activation of muscle stem cells (SCs). These mononucleated cells, located between the muscle fiber basal lamina and its plasma membrane, are normally quiescent, but following muscle damage, become activated, proliferate, and mediate muscle regeneration. These cells are known to respond to a variety of signaling pathways, but this study makes the case for BMP acting via binding to MuSK in maintaining the quiescent state.

    2. Reviewer #1 (Public review):

      Summary:

      Madigan et al. assembled an interesting study investigating the role of the MuSK-BMP signaling pathway in maintaining adult mouse muscle stem cell (MuSC) quiescence and muscle function before and after trauma. Using a full body and MuSC-specific genetic knockout system, they demonstrate that MuSK is expressed on MuSCs and that eliminating the BMP binding domain from the MuSK gene (i.e., MuSK-IgG KO) in mice at homeostasis leads to reduced PAX7+ cells, increased myonuclear number, and increased myofiber size, which may be due to a deficit in maintaining quiescence. Additionally, after BaCl2 injury, MuSK-IgG KO mice display accelerated repair after 7 days post-injury (dpi) in males only. Finally, RNA profiling using nCounter technology showed that MuSK-IgG KO MuSCs express genes that may be associated with the activated state.

      Strengths:

      Overall, the biology regulating MuSC quiescence is still relatively unexplored, and thus, this work provides a new mechanism controlling this process. The experiments discussed in the paper are technically sound with great complementary mouse models (full body versus tissue-specific mouse KO) used to validate their hypothesis. Additionally, the paper is well written with all the necessary information in the legends, methods, and figures being reported.

      Weaknesses:

      While the data largely support the authors' conclusions, I do have a few points to consider when reading this paper.

      (1) For Figure 1, while I appreciate the authors confirming MuSK RNA and protein in MuSCs, I do think they should (a) quantify the RNA using qPCR and (b) determine the percentage of MuSCs expressing MuSK protein in their single fiber system in multiple biological replicates. This information will help us understand if MuSK is expressed in 1/10 or 10/10 PAX7-expressing MuSCs. Also, it will help place their phenotypes into the right context, especially when considering how much of the PAX7-pool is expressing MuSK from the beginning.

      (2) Throughout the paper the argument is made that MuSK-IgG KO MuSCs (full body and MuSC-specific KOs) are more activated and/or break quiescence more readily, but there is no attempt to test this directly. Therefore, the authors should consider measuring the activation dynamics (i.e., break from quiescence) of MuSCs directly (EdU assays or live-cell imaging) in culture and/or in muscle in vivo (EdU assays) using their various genetic mouse models.

      (3) For Figure 2, given that mice are considered adults by 3 months, it is really surprising how just two months later they are starting to see a phenotype (i.e., reduced PAX7-cells, increased number of myonuclei, and increased myofiber size)-which correlates with getting older. Given that aged MuSCs have activation defects (i.e., stuck somewhere in the quiescence cycle), a pending question is whether their phenotype gets stronger in aged mice, like 18-24 months. If yes, the argument that this pathway should be used in a therapeutic sense would be strengthened.

      (4) For Figure 4, the same question as in point (2), the increase in fiber sizes by 7dpi in MuSK-IgG KO males is minimal (going from ~23 to 27 by eye) and no difference at a later time point when compared to WT mice. However, if older mice are used (18-24 months old) - which are known to have repair deficits-will the regenerative phenotype in MuSK-IgG KO mice be more substantial and longer lasting?

      (5) For Figure 6, this gene set is not glaringly obvious as being markers of MuSC activation (i.e., no MyoD), so it's hard for the readers to know if this gene set is truly an activation signature. Also, the Shcherbina et al. data presented as a column with * being up or down (i.e. differentially expressed) is not helpful, since you don't know whether those mRNAs in that dataset are going up with the activation process. Addressing this point as well as my point (1) will further strengthen the authors' conclusions about the MuSK-IgG KO MuSCs not being able to maintain quiescence as effectively.

    3. Reviewer #2 (Public review):

      Summary:

      The work by Madigan et al. provides evidence that the signaling of BMPs via the Ig3 domain of MuSK plays a role during muscle postnatal development and regeneration, ultimately resulting in enhanced contractile force generation in the absence of the MuSK Ig3 domain. They demonstrate that MuSK is expressed in satellite cells initially post-isolation of muscle single fibers both in WT and whole-body deletion of the BMP binding domain of MuSK (ΔIg3-MuSK). In mice, ΔIg3-MuSK results in increased muscle fiber size, a reduction in Pax7+ cells, and increased muscle contractile force in 5-month-old, but not 3-month-old, mice. These data are complemented by a model in which the kinetics of regeneration appear to be accelerated at early time points. Of note, the authors demonstrate muscle tibialis anterior (TA) weights and fiber feret are increased in a Pax7CreERT2;MuSK-Ig3loxp/loxp model in which satellite cells specifically lack the MuSK BMP binding domain. Finally, using Nanostring transcriptional profiling, the authors identified a short list of genes that differ between the WT and ΔIg3-MuSK SCs. These data provide the field with new evidence of signaling pathways that regulate satellite cell activation/quiescence in the context of skeletal muscle development and regeneration.

      On the whole, the findings in this paper are well supported, however additional validation of key satellite cell markers and data analysis need to be conducted given the current claims.

      (1) The Pax7CreERT2;MuSK-Ig3loxp/loxp model is the appropriate model to conduct studies to assess satellite cell involvement in MuSK/BMP regulation. Validation of changes to muscle force production is currently absent using this model, as is quantification of Pax7+ tdT+ cells in 5-month muscle. Given that MuSK is also expressed on mature myofibers at NMJs, these data would further inform the conclusions proposed in the paper.

      (2) All Pax7 quantification in the paper would benefit from high magnification images including staining for laminin demonstrating the cells are under the basal lamina.

      (3) The nanostring dataset could be further analyzed and clarified. In Figure 6b, it is not initially apparent what genes are upregulated or downregulated in young and aged SCs and how this compares with your data. Pathway analysis geared toward genes involved in the TGFb superfamily would be informative.

      (4) Characterizing MuSK expression on perfusion-fixed EDL fibers would be more conclusive to determine if MuSK is expressed in quiescent SCs. Additional characterization using MyoD, MyoG, and Fos staining of SCs on EDL fibers would help inform on their state of activation/quiescence.

      (5) Finally, the treatment of fibers in the presence or absence of recombinant BMP proteins would inform the claims of the paper.

    4. Reviewer #3 (Public review):

      Summary:

      Understanding the molecular regulation of muscle stem cell quiescence. The authors evaluated the role of the MuSK-BMP pathway in regulating adult SC quiescence by the deletion of the BMP-binding MuSK Ig3 domain ('ΔIg3-MuSK').

      Strengths:

      A novel mouse model to interrogate muscle stem cell molecular regulators. The authors have developed a nice mouse model to interrogate the role of MuSK signaling in muscle stem cells and myofibers and have unique tools to do this.

      Weaknesses:

      Only minor technical questions remain and there is a need for additional data to support the conclusions.

      (1) The authors claim that dIg3-MuSK satellite cells break quiescence and start fusing, based on the reduction of Pax7+ and increase of nuclei/fiber (Fig 2-3), and maybe the gene expression (Fig6). However, direct evidence is needed to support these findings such as quantifying quiescent (Pax7+Ki67-) or activated (Pax7+Ki67+) satellite cells (and maybe proliferating progenitors Pax7-Ki67+) in the dIg3-MuSK muscle.

      (2) It is not clear if the MuSK-BMP pathway is required to maintain satellite cell quiescence, by the end of the regeneration (29dpi), how Pax7+ numbers are comparable to the WT (Fig4d). I would expect to have less Pax7+, as in uninjured muscle. Can the authors evaluate this in more detail?

      (2) Figure 4 claims that regeneration is accelerated, but to claim this at a minimum they need to look at MYH3+ fibers, in addition to fiber size.

      (3) The Pax7 specific dIg3-MuSK (Fig5) is very exciting. However, it will be important to quantify the Pax7+ number. Could the authors check the reduction of Pax7+ in this model since it would confirm the importance of MuSK in quiescence?

      (3) Rescue of the BMP pathway in the model would be further supportive of the authors' findings.

      (4) Is the stem cell pool maintained long term in the deleted dIg3-MuSK SCs? Or would they be lost with extended treatment since they are reduced at the 5-month experiments? This is an important point and should be considered/discussed relevant to thinking about these data therapeutically.

      (5) Without the Pax7-specific targeting, when you target dIg3-MuSK in the entire muscle, what happens to the neuromuscular nuclei?

      (6) Why were differences seen in males and not females? Is XIST downregulation occurring in both sexes? Could the authors explain these findings in more detail?

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      Madigan et al. assembled an interesting study investigating the role of the MuSK-BMP signaling pathway in maintaining adult mouse muscle stem cell (MuSC) quiescence and muscle function before and after trauma. Using a full body and MuSC-specific genetic knockout system, they demonstrate that MuSK is expressed on MuSCs and that eliminating the BMP binding domain from the MuSK gene (i.e., MuSK-IgG KO) in mice at homeostasis leads to reduced PAX7+ cells, increased myonuclear number, and increased myofiber size, which may be due to a deficit in maintaining quiescence. Additionally, after BaCl2 injury, MuSK-IgG KO mice display accelerated repair after 7 days post-injury (dpi) in males only. Finally, RNA profiling using nCounter technology showed that MuSK-IgG KO MuSCs express genes that may be associated with the activated state.

      Strengths:

      Overall, the biology regulating MuSC quiescence is still relatively unexplored, and thus, this work provides a new mechanism controlling this process. The experiments discussed in the paper are technically sound with great complementary mouse models (full body versus tissue-specific mouse KO) used to validate their hypothesis. Additionally, the paper is well written with all the necessary information in the legends, methods, and figures being reported.

      Weaknesses:

      While the data largely support the authors' conclusions, I do have a few points to consider when reading this paper.

      (1) For Figure 1, while I appreciate the authors confirming MuSK RNA and protein in MuSCs, I do think they should (a) quantify the RNA using qPCR and (b) determine the percentage of MuSCs expressing MuSK protein in their single fiber system in multiple biological replicates. This information will help us understand if MuSK is expressed in 1/10 or 10/10 PAX7-expressing MuSCs. Also, it will help place their phenotypes into the right context, especially when considering how much of the PAX7-pool is expressing MuSK from the beginning.

      The quantification is a reasonable point; however, we don’t believe that this information is necessary for supporting the interpretation of the findings.

      We agree that determining the proportion of SCs that express MuSK is useful information and we will address this question in the Revision.

      (2) Throughout the paper the argument is made that MuSK-IgG KO MuSCs (full body and MuSC-specific KOs) are more activated and/or break quiescence more readily, but there is no attempt to test this directly. Therefore, the authors should consider measuring the activation dynamics (i.e., break from quiescence) of MuSCs directly (EdU assays or live-cell imaging) in culture and/or in muscle in vivo (EdU assays) using their various genetic mouse models.

      We agree that this point is of interest and we plan to address it in future studies.

      (3) For Figure 2, given that mice are considered adults by 3 months, it is really surprising how just two months later they are starting to see a phenotype (i.e., reduced PAX7-cells, increased number of myonuclei, and increased myofiber size)-which correlates with getting older. Given that aged MuSCs have activation defects (i.e., stuck somewhere in the quiescence cycle), a pending question is whether their phenotype gets stronger in aged mice, like 18-24 months. If yes, the argument that this pathway should be used in a therapeutic sense would be strengthened.

      We agree that the potential role of the MuSK-BMP pathway in aged SCs is of import and could shed new light on SC dynamics in this context. However, we note that the activation observed between 3-5 months results in improved muscle quality (increased myofiber size and grip strength), which is the opposite of what is observed with aging. We agree that activating the MuSK-BMP pathway in aged animals has the potential to activate SCs, promote muscle growth and counter sarcopenia. Pharmacological and genetic approaches to test that question are underway, but given the time frame they are beyond the scope of the current manuscript.

      (4) For Figure 4, the same question as in point (2), the increase in fiber sizes by 7dpi in MuSK-IgG KO males is minimal (going from ~23 to 27 by eye) and no difference at a later time point when compared to WT mice. However, if older mice are used (18-24 months old) - which are known to have repair deficits-will the regenerative phenotype in MuSK-IgG KO mice be more substantial and longer lasting?

      Again, an interesting point that will be addressed in future studies. 

      (5) For Figure 6, this gene set is not glaringly obvious as being markers of MuSC activation (i.e., no MyoD), so it's hard for the readers to know if this gene set is truly an activation signature. Also, the Shcherbina et al. data presented as a column with * being up or down (i.e. differentially expressed) is not helpful, since you don't know whether those mRNAs in that dataset are going up with the activation process. Addressing this point as well as my point (1) will further strengthen the authors' conclusions about the MuSK-IgG KO MuSCs not being able to maintain quiescence as effectively.

      We agree that this Figure should include more information and be formatted in a way that more readily conveys the point. We will provide these changes in the Revision.

      Reviewer #2 (Public review):

      Summary:

      The work by Madigan et al. provides evidence that the signaling of BMPs via the Ig3 domain of MuSK plays a role during muscle postnatal development and regeneration, ultimately resulting in enhanced contractile force generation in the absence of the MuSK Ig3 domain. They demonstrate that MuSK is expressed in satellite cells initially post-isolation of muscle single fibers both in WT and whole-body deletion of the BMP binding domain of MuSK (ΔIg3-MuSK). In developing mice, ΔIg3-MuSK results in increased muscle fiber size, a reduction in Pax7+ cells, and increased muscle contractile force in 5-month-old, but not 3-month-old, mice. These data are complemented by a model in which the kinetics of regeneration appear to be accelerated at early time points. Of note, the authors demonstrate muscle tibialis anterior (TA) weights and fiber feret are increased during development in a Pax7CreERT2;MuSK-Ig3loxp/loxp model in which satellite cells specifically lack the MuSK BMP binding domain. Finally, using Nanostring transcriptional profiling, the authors identified a short list of genes that differ between the WT and ΔIg3-MuSK SCs. These data provide the field with new evidence of signaling pathways that regulate satellite cell activation/quiescence in the context of skeletal muscle development and regeneration.

      On the whole, the findings in this paper are well supported, however additional validation of key satellite cell markers and data analysis need to be conducted given the current claims.

      (1) The Pax7CreERT2;MuSK-Ig3loxp/loxp model is the appropriate model to conduct studies to assess satellite cell involvement in MuSK/BMP regulation. Validation of changes to muscle force production is currently absent using this model, as is quantification of Pax7+ tdT+ cells in 5-month muscle. Given that MuSK is also expressed on mature myofibers at NMJs, these data would further inform the conclusions proposed in the paper.

      As reported in the manuscript, we observed increased myofiber size, length and TA weight in the conditional mutants at five months of age. We did not assess grip strength in those experiments. 

      We demonstrated highly efficient MuSK Ig3-domain recombination by PCR analysis of FACS-sorted SCs from these conditional mutants (Supplemental Fig. S3). However, while we checked for Pax7+ tdT+ cells in 5-month SCs, we did not quantify this finding.

      (2) All Pax7 quantification in the paper would benefit from high magnification images including staining for laminin demonstrating the cells are under the basal lamina.

      The point is reasonable. We observed that these Pax7+ cells were under the basal lamina, but we did not acquire images at higher magnification.

      (3) The nanostring dataset could be further analyzed and clarified. In Figure 6b, it is not initially apparent what genes are upregulated or downregulated in young and aged SCs and how this compares with your data. Pathway analysis geared toward genes involved in the TGFb superfamily would be informative.

      We agree that further analysis and information regarding the data in this Figure is warranted and we will include it in the Revision.

      (4) Characterizing MuSK expression on perfusion-fixed EDL fibers would be more conclusive to determine if MuSK is expressed in quiescent SCs. Additional characterization using MyoD, MyoG, and Fos staining of SCs on EDL fibers would help inform on their state of activation/quiescence.

      These are all valid points that we intend to address in future experiments.

      (5) Finally, the treatment of fibers in the presence or absence of recombinant BMP proteins would inform the claims of the paper.

      As reported in Jaime et al. (2024), we have extensively characterized the differences in BMP response in both cultured WT and ΔIg3-MuSK myofibers and myoblasts at the level of signaling (pSMAD 1/5/8 nuclear localization and phosphorylation) and gene expression (qRT-PCR).

      Reviewer #3 (Public review):

      Summary:

      Understanding the molecular regulation of muscle stem cell quiescence. The authors evaluated the role of the MuSK-BMP pathway in regulating adult SC quiescence by the deletion of the BMP-binding MuSK Ig3 domain ('ΔIg3-MuSK').

      Strengths:

      A novel mouse model to interrogate muscle stem cell molecular regulators. The authors have developed a nice mouse model to interrogate the role of MuSK signaling in muscle stem cells and myofibers and have unique tools to do this.

      Weaknesses:

      Only minor technical questions remain and there is a need for additional data to support the conclusions.

      (1) The authors claim that dIg3-MuSK satellite cells break quiescence and start fusing, based on the reduction of Pax7+ and increase of nuclei/fiber (Fig 2-3), and maybe the gene expression (Fig6). However, direct evidence is needed to support these findings such as quantifying quiescent (Pax7+Ki67-) or activated (Pax7+Ki67+) satellite cells (and maybe proliferating progenitors Pax7-Ki67+) in the dIg3-MuSK muscle.

      We believe that the data presented strongly support the conclusion that the SCs break quiescence, activate, and fuse into myofibers in uninjured muscle. As noted above, the mechanistic studies suggested are of interest and we will address them in future work.

      (2) It is not clear if the MuSK-BMP pathway is required to maintain satellite cell quiescence, by the end of the regeneration (29dpi), how Pax7+ numbers are comparable to the WT (Fig4d). I would expect to have less Pax7+, as in uninjured muscle. Can the authors evaluate this in more detail?

      The reviewer makes an important point. Our current interpretation of the findings is that quiescence is broken in SCs in uninjured muscle, but that ‘stemness’ is preserved, allowing for efficient muscle regeneration and restoration of the SC pool. Whether such properties reflect SC heterogeneity (as suggested in the comments of the other reviewers) and/or different states along a continuum is of particular interest and will be the focus of future studies. 

      (2) Figure 4 claims that regeneration is accelerated, but to claim this at a minimum they need to look at MYH3+ fibers, in addition to fiber size.

      We did not examine MYH3+ fibers in this study. However, we did observe an increase in Pax7+ cells at 5dpi (male and female) as well as larger myofiber size (Feret diameter) at 7dpi in the male animals. In addition, the panels in Figure 4 b,c (H&E and laminin, respectively) showing accelerated differentiation were selected to be representative of the experimental group.

      (3) The Pax7 specific dIg3-MuSK (Fig5) is very exciting. However, it will be important to quantify the Pax7+ number. Could the authors check the reduction of Pax7+ in this model since it would confirm the importance of MuSK in quiescence?

      In Figure 5c, we assessed the number of Pax7+ cells in the conditional mutant during the course of regeneration (at 3, 5, 7, 14, 22 and 29 dpi). As discussed above, these results confirmed the findings of the constitutive mutant (reduction of Pax7+ cells in uninjured 5-month-old muscle) as well as showing the increased number at 5dpi and return to WT levels at 29 dpi.

      (3) Rescue of the BMP pathway in the model would be further supportive of the authors' findings.

      This point is valid. In a parallel study examining the role of the MuSK-BMP pathway at the NMJ, we have observed that BMP+/- (hypomorphs) recapitulate key phenotypes observed in ΔIg3-MuSK NMJs (Fish et al., bioRxiv, 2023). This point will be included in the Revision.

      (4) Is the stem cell pool maintained long term in the deleted dIg3-MuSK SCs? Or would they be lost with extended treatment since they are reduced at the 5-month experiments? This is an important point and should be considered/discussed relevant to thinking about these data therapeutically.

      We agree that this is an important point for future studies. 

      (5) Without the Pax7-specific targeting, when you target dIg3-MuSK in the entire muscle, what happens to the neuromuscular nuclei?

      A manuscript describing the phenotype of the NMJ in ΔIg3-MuSK constitutive mice is available on bioRxiv (Fish et al., 2024) and is in Revision at another journal. We anticipate discussing the findings in the Revised version of the current manuscript.

      (6) Why were differences seen in males and not females? Is XIST downregulation occurring in both sexes? Could the authors explain these findings in more detail?

      The male and female difference in myofiber size is of interest.  The nanostring experiments,  which showed the XIST reduction, were only performed in male mice.

    1. eLife Assessment

      This valuable study reveals extensive binding of eukaryotic translation initiation factor 3 (eIF3) to the 3' untranslated regions (UTRs) of efficiently translated mRNAs in human pluripotent stem cell-derived neuronal progenitor cells. The authors provide solid evidence to support their conclusions, although this study may be enhanced by addressing potential biases of techniques employed to study eIF3:mRNA binding and providing additional mechanistic detail. This work will be of significant interest to researchers exploring post-transcriptional regulation of gene expression, including cellular, molecular, and developmental biologists, as well as biochemists.

    2. Reviewer #1 (Public review):

      Summary:

      The authors perform irCLIP of neuronal progenitor cells to profile eIF3-RNA interactions upon short-term neuronal differentiation. The data shows that eIF3 mostly interacts with 3'-UTRs - specifically, the poly-A signal. There appears to be a general correlation between eIF3 binding to 3'-UTRs and ribosome occupancy, which might suggest that eIF3 binding promotes protein synthesis, possibly through inducing mRNA closed-loop formation.

      Strengths:

      The study provides a wealth of new data on eIF3-mRNA interactions and points to the potential new concept that eIF3-mRNA interactions are polyadenylation-dependent and correlate with ribosome occupancy.

      Weaknesses:

      (1) A main limitation is the correlative nature of the study. Whereas the evidence that eIF3 interacts with 3'-UTRs is solid, the biological role of the interactions remains entirely unknown. Similarly, the claim that eIF3 interactions with 3'-UTR termini require polyadenylation but are independent of poly(A) binding proteins lacks support as it solely relies on the absence of observable eIF3 binding to poly-A (-) histone mRNAs and a seeming failure to detect PABP binding to eIF3 by co-immunoprecipitation and Western blotting. In contrast, LC-MS data in Supplementary File 1 show ready co-purification of eIF3 with PABP.

      (2) Another question concerns the relevance of the cellular model studied. irCLIP is performed on neuronal progenitor cells subjected to neuronal induction for 2 hours. This short-term induction leads to a very modest - perhaps 10% - and very transient 1-hour-long increase in translation, although this is not carefully quantified. The cellular phenotype also does not appear to change and calling the cells treated with differentiation media for 2 hours "differentiated NPCs" seems a bit misleading. Perhaps unsurprisingly, the minor "burst" of translation coincides with minor effects on eIF3-mRNA interactions most of which seem to be driven by mRNA levels. Based on the ~15-fold increase in ID2 mRNA coinciding with a ~5-fold increase in ribosome occupancy (RPF), ID2 TE actually goes down upon neuronal induction.

      (3) The overlap in eIF3-mRNA interactions identified here and in the authors' previous reports is minimal. Some of the discrepancies may be related to the not well-justified approach for filtering data prior to assessing overlap. Still, the fundamentally different binding patterns - eIF3 mostly interacting with 5'-UTRs in the authors' previous report and other studies versus the strong preference for 3'-UTRs shown here - are striking. In the Discussion, it is speculated that the different methods used - PAR-CLIP versus irCLIP - lead to these fundamental differences. Unfortunately, this is not supported by any data, even though it would be very important for the translation field to learn whether different CLIP methodologies assess very different aspects of eIF3-mRNA interactions.
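
      The arithmetic behind point (2) can be sketched as follows. This is a minimal illustration, assuming the common definition of translation efficiency (TE) as ribosome-protected fragment (RPF) abundance divided by mRNA abundance, and using the approximate fold changes quoted in the review; the function name `te_fold_change` is hypothetical and not taken from the manuscript.

```python
# Minimal sketch: TE is assumed to be RPF abundance / mRNA abundance,
# so the fold change in TE is the ratio of the two fold changes.

def te_fold_change(rpf_fold: float, mrna_fold: float) -> float:
    """Return the fold change in TE given fold changes in RPF and mRNA."""
    if mrna_fold <= 0:
        raise ValueError("mrna_fold must be positive")
    return rpf_fold / mrna_fold


# Approximate values quoted in the review for ID2 (not re-derived from data).
id2_te_change = te_fold_change(rpf_fold=5.0, mrna_fold=15.0)
print(f"ID2 TE fold change: {id2_te_change:.2f}")
```

      Under these assumptions, a value below 1 means that ribosome occupancy rose less than mRNA abundance, which is the sense in which TE "goes down" even though RPF increases.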

    3. Reviewer #2 (Public review):

      Summary:

      The paper documents the role of eIF3 in translational control during neural progenitor cell (NPC) differentiation. eIF3 predominantly binds to the 3' UTR termini of mRNAs during NPC differentiation, adjacent to the poly(A) tails, and is associated with efficiently translated mRNAs, indicating a role for eIF3 in promoting translation.

      Strengths:

      The manuscript is strong in addressing molecular mechanisms by using a combination of next-generation sequencing and crosslinking techniques, thus providing a comprehensive dataset that supports the authors' claims. The manuscript is methodologically sound, with clear experimental designs.

      Weaknesses:

      (1) The study could benefit from further exploration into the molecular mechanisms by which eIF3 interacts with 3' UTR termini. While the correlation between eIF3 binding and high translation levels is established, the functionality of these interactions needs validation. The authors should consider including experiments that test whether eIF3 binding sites are necessary for increased translation efficiency using reporter constructs.

      (2) The authors mention that the eIF3 3' UTR termini crosslinking pattern observed in their study was not reported in previous PAR-CLIP studies performed in HEK293T cells (Lee et al., 2015) and Jurkat cells (De Silva et al., 2021). They attribute this difference to the different UV wavelengths used in Quick-irCLIP (254 nm) and PAR-CLIP (365 nm with 4-thiouridine). While the explanation is plausible, it remains a caveat that different UV crosslinking methods may capture different eIF3 modules or binding sites, depending on the chemical propensities of the amino acid-nucleotide crosslinks at each wavelength. Without addressing this caveat in more detail, the authors cannot generalize their findings, and thus, the title of the paper, which suggests a broad role for eIF3, may be misleading. Previous studies have pointed to an enrichment of eIF3 binding at the 5' UTRs, and the divergence in results between studies needs to be more explicitly acknowledged.

      (3) While the manuscript concludes that eIF3's interaction with 3' UTR termini is independent of poly(A)-binding proteins, transient or indirect interactions should be tested using assays such as PLA (Proximity Ligation Assay), which could provide more insights.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript by Mestre-Fos and colleagues, authors have analyzed the involvement of eIF3 binding to mRNA during differentiation of neural progenitor cells (NPC). The authors bring a lot of interesting observations leading to a novel function for eIF3 at the 3'UTR.

      During the translational burst that occurs during NPC differentiation, analysis of eIF3-associated mRNA by Quick-irCLIP reveals the unexpected binding of this initiation factor at the 3'UTR of most mRNA. Further analysis of alternative polyadenylation by APAseq highlights the close proximity of the eIF3-crosslinking position and the poly(A) tail. Furthermore, this interaction is not detected in Poly(A)-less transcripts. Using Riboseq, the authors then attempted to correlate eIF3 binding with the translation efficacy of mRNA, which would suggest a common mechanism of translational control in these cells. These observations indicate that eIF3-binding at the 3'UTR of mRNA, near the poly(A) tail, may participate in the closed-loop model of mRNA translation, bridging 5' and 3', and allowing ribosome recycling. However, authors failed to detect interactions of eIF3, with either PABP or Paip1 or 40S subunit proteins, which is quite unexpected.

      Strength:

      The well-written manuscript presents an attractive concept regarding the mechanism of eIF3 function at the 3'UTR. Most mRNA in NPC seems to have eIF3 binding at the 3'UTR and only a few at the 5'end where it's commonly thought to bind. In a previous study from the Cate lab, eIF3 was reported to bind to a small region of the 3'UTR of the TCRA and TCRB mRNA, which was responsible for their specific translational stimulation, during T cell activation. Surprisingly in this study, the eIF3 association with mRNA occurs near polyadenylation signals in NPC, independently of cell differentiation status. This compelling evidence suggests a general mechanism of translation control by eIF3 in NPC. This observation brings back the old concept of mRNA circularization with new arguments, independent of PABP and eIF4G interaction. Finally, the discussion adequately describes the potential technical limitations of the present study compared to previous ones by the same group, due to the use of Quick-irCLIP as opposed to the PAR-CLIP/thiouridine.

      Weaknesses:

      (1) These data were obtained from an unusual cell type, limiting the generalizability of the model.

      (2) This study lacks a clear explanation for the increased translation associated with NPC differentiation, as eIF3 binding is observed in both differentiated and undifferentiated NPC. For example, I find a kind of inconsistency between changes in Riboseq density (Figure 3B) and changes in protein synthesis (Figure 1D). Thus, the title overstates a modest correlation between eIF3 binding and important changes in protein synthesis.

      (3) This is illustrated by the candidate selection that supports this demonstration. Looking at Figure 3B, ID2, and SNAT2 mRNA are not part of the High TE transcripts (in red). In contrast, the increase in mRNA abundance could explain a proportionally increased association with eIF3 as well as with ribosomes. The example of increased protein abundance of these best candidates is overall weak and uncertain.

      (4) Despite several attempts (chemical and UV cross-linking) to identify eIF3 partners in NPC such as PABP, PAIP1, or proteins from the 40S, the authors could not provide any evidence for such a mechanism consistent with the closed-loop model. Overall, this rather descriptive study lacks mechanistic insight (eIF3 binding partners).

      (5) Finally, the authors suspect a potential impact of technical improvement provided by Quick-irCLIP, that could have been addressed rather than discussed.

    5. Author response:

      eLife Assessment

      This valuable study reveals extensive binding of eukaryotic translation initiation factor 3 (eIF3) to the 3' untranslated regions (UTRs) of efficiently translated mRNAs in human pluripotent stem cell-derived neuronal progenitor cells. The authors provide solid evidence to support their conclusions, although this study may be enhanced by addressing potential biases of techniques employed to study eIF3:mRNA binding and providing additional mechanistic detail. This work will be of significant interest to researchers exploring post-transcriptional regulation of gene expression, including cellular, molecular, and developmental biologists, as well as biochemists.

      We thank the reviewers for their positive views of the results we present, along with the constructive feedback regarding the strengths and weaknesses of our manuscript, with which we generally agree. We acknowledge our results will require a deeper exploration of the molecular mechanisms behind eIF3 interactions with 3'-UTR termini and experiments to identify the molecular partners involved. Additionally, given that NPC differentiation toward mature neurons is a process that takes around 3 weeks, we recognize the importance of examining eIF3-mRNA interactions in NPCs that have undergone differentiation over longer periods than the 2-hr time point selected in this study. Finally, considering the molecular complexity of the 13-subunit human eIF3, we agree that a direct comparison between Quick-irCLIP and PAR-CLIP will be highly beneficial and will determine whether different UV crosslinking wavelengths report on different eIF3 molecular interactions. Additional comments are given below to the identified weaknesses.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors perform irCLIP of neuronal progenitor cells to profile eIF3-RNA interactions upon short-term neuronal differentiation. The data shows that eIF3 mostly interacts with 3'-UTRs - specifically, the poly-A signal. There appears to be a general correlation between eIF3 binding to 3'-UTRs and ribosome occupancy, which might suggest that eIF3 binding promotes protein synthesis, possibly through inducing mRNA closed-loop formation.

      Strengths:

      The study provides a wealth of new data on eIF3-mRNA interactions and points to the potential new concept that eIF3-mRNA interactions are polyadenylation-dependent and correlate with ribosome occupancy.

      Weaknesses:

      (1) A main limitation is the correlative nature of the study. Whereas the evidence that eIF3 interacts with 3'-UTRs is solid, the biological role of the interactions remains entirely unknown. Similarly, the claim that eIF3 interactions with 3'-UTR termini require polyadenylation but are independent of poly(A) binding proteins lacks support as it solely relies on the absence of observable eIF3 binding to poly-A (-) histone mRNAs and a seeming failure to detect PABP binding to eIF3 by co-immunoprecipitation and Western blotting. In contrast, LC-MS data in Supplementary File 1 show ready co-purification of eIF3 with PABP.

      We agree the molecular mechanisms underlying the crosslinking between eIF3 and the end of mRNA 3’-UTRs remain to be determined. We also agree that the lack of interaction seen between eIF3 and PABP in Westerns, even from HEK293T cells, is a puzzle. The low sequence coverage in the LC-MS data gave us pause about making a strong statement that these represent direct eIF3 interactions, given the similar background levels of some ribosomal proteins.

      (2) Another question concerns the relevance of the cellular model studied. irCLIP is performed on neuronal progenitor cells subjected to neuronal induction for 2 hours. This short-term induction leads to a very modest - perhaps 10% - and very transient 1-hour-long increase in translation, although this is not carefully quantified. The cellular phenotype also does not appear to change and calling the cells treated with differentiation media for 2 hours "differentiated NPCs" seems a bit misleading. Perhaps unsurprisingly, the minor "burst" of translation coincides with minor effects on eIF3-mRNA interactions most of which seem to be driven by mRNA levels. Based on the ~15-fold increase in ID2 mRNA coinciding with a ~5-fold increase in ribosome occupancy (RPF), ID2 TE actually goes down upon neuronal induction.

      We agree that it will be interesting to look at eIF3-mRNA interactions at longer time points after induction of NPC differentiation. However, the pattern of eIF3 crosslinking to the end of 3’-UTRs occurs in both time points reported here, which is likely to be the more general finding in what we present.

      (3) The overlap in eIF3-mRNA interactions identified here and in the authors' previous reports is minimal. Some of the discrepancies may be related to the not well-justified approach for filtering data prior to assessing overlap. Still, the fundamentally different binding patterns - eIF3 mostly interacting with 5'-UTRs in the authors' previous report and other studies versus the strong preference for 3'-UTRs shown here - are striking. In the Discussion, it is speculated that the different methods used - PAR-CLIP versus irCLIP - lead to these fundamental differences. Unfortunately, this is not supported by any data, even though it would be very important for the translation field to learn whether different CLIP methodologies assess very different aspects of eIF3-mRNA interactions.

      We agree the more interesting aspect of what we observe is the difference in location of eIF3 crosslinking, i.e. the end of 3’-UTRs rather than 5’-UTRs or the pan-mRNA pattern we observed in T cells. The reviewer is right that it will be important in the future to compare PAR-CLIP and Quick-irCLIP side-by-side to begin to unravel the differences we observe with the two approaches.

      Reviewer #2 (Public review):

      Summary:

      The paper documents the role of eIF3 in translational control during neural progenitor cell (NPC) differentiation. eIF3 predominantly binds to the 3' UTR termini of mRNAs during NPC differentiation, adjacent to the poly(A) tails, and is associated with efficiently translated mRNAs, indicating a role for eIF3 in promoting translation.

      Strengths:

      The manuscript is strong in addressing molecular mechanisms by using a combination of next-generation sequencing and crosslinking techniques, thus providing a comprehensive dataset that supports the authors' claims. The manuscript is methodologically sound, with clear experimental designs.

      Weaknesses:

      (1) The study could benefit from further exploration into the molecular mechanisms by which eIF3 interacts with 3' UTR termini. While the correlation between eIF3 binding and high translation levels is established, the functionality of these interactions needs validation. The authors should consider including experiments that test whether eIF3 binding sites are necessary for increased translation efficiency using reporter constructs.

      We agree with the reviewer that the molecular mechanism by which eIF3 interacts with the 3’-UTR termini remains unclear, along with its biological significance, i.e. how it contributes to translation levels. We think it could be useful to try reporters in, perhaps, HEK293T cells in the future to probe the mechanism in more detail.

      (2) The authors mention that the eIF3 3' UTR termini crosslinking pattern observed in their study was not reported in previous PAR-CLIP studies performed in HEK293T cells (Lee et al., 2015) and Jurkat cells (De Silva et al., 2021). They attribute this difference to the different UV wavelengths used in Quick-irCLIP (254 nm) and PAR-CLIP (365 nm with 4-thiouridine). While the explanation is plausible, it remains a caveat that different UV crosslinking methods may capture different eIF3 modules or binding sites, depending on the chemical propensities of the amino acid-nucleotide crosslinks at each wavelength. Without addressing this caveat in more detail, the authors cannot generalize their findings, and thus, the title of the paper, which suggests a broad role for eIF3, may be misleading. Previous studies have pointed to an enrichment of eIF3 binding at the 5' UTRs, and the divergence in results between studies needs to be more explicitly acknowledged.

      We agree with the reviewer that the two methods of crosslinking will require a more detailed head-to-head comparison in the future. However, we do think the title is justified by the fact that we see crosslinking to the termini of 3’-UTRs across thousands of transcripts in each condition. Furthermore, the 3’-UTR crosslinking is enriched on mRNAs with higher ribosome protected fragment counts (RPF) in differentiated cells, Figure 3F.

      (3) While the manuscript concludes that eIF3's interaction with 3' UTR termini is independent of poly(A)-binding proteins, transient or indirect interactions should be tested using assays such as PLA (Proximity Ligation Assay), which could provide more insights.

      This is a good idea, but would require a substantial effort better suited to a future publication. We think our observations are interesting enough to the field to stimulate future experimentation that we may or may not be most capable of doing in our lab.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript by Mestre-Fos and colleagues, authors have analyzed the involvement of eIF3 binding to mRNA during differentiation of neural progenitor cells (NPC). The authors bring a lot of interesting observations leading to a novel function for eIF3 at the 3'UTR.

      During the translational burst that occurs during NPC differentiation, analysis of eIF3-associated mRNA by Quick-irCLIP reveals the unexpected binding of this initiation factor at the 3'UTR of most mRNA. Further analysis of alternative polyadenylation by APAseq highlights the close proximity of the eIF3-crosslinking position and the poly(A) tail. Furthermore, this interaction is not detected in Poly(A)-less transcripts. Using Riboseq, the authors then attempted to correlate eIF3 binding with the translation efficacy of mRNA, which would suggest a common mechanism of translational control in these cells. These observations indicate that eIF3-binding at the 3'UTR of mRNA, near the poly(A) tail, may participate in the closed-loop model of mRNA translation, bridging 5' and 3', and allowing ribosome recycling. However, authors failed to detect interactions of eIF3, with either PABP or Paip1 or 40S subunit proteins, which is quite unexpected.

      Strength:

      The well-written manuscript presents an attractive concept regarding the mechanism of eIF3 function at the 3'UTR. Most mRNA in NPC seems to have eIF3 binding at the 3'UTR and only a few at the 5'end where it's commonly thought to bind. In a previous study from the Cate lab, eIF3 was reported to bind to a small region of the 3'UTR of the TCRA and TCRB mRNA, which was responsible for their specific translational stimulation, during T cell activation. Surprisingly in this study, the eIF3 association with mRNA occurs near polyadenylation signals in NPC, independently of cell differentiation status. This compelling evidence suggests a general mechanism of translation control by eIF3 in NPC. This observation brings back the old concept of mRNA circularization with new arguments, independent of PABP and eIF4G interaction. Finally, the discussion adequately describes the potential technical limitations of the present study compared to previous ones by the same group, due to the use of Quick-irCLIP as opposed to the PAR-CLIP/thiouridine.

      Weaknesses:

      (1) These data were obtained from an unusual cell type, limiting the generalizability of the model.

      We agree that unraveling the mechanism employed by eIF3 at the mRNA 3’-UTR termini might be better studied in a stable cell line rather than in primary cells.

      (2) This study lacks a clear explanation for the increased translation associated with NPC differentiation, as eIF3 binding is observed in both differentiated and undifferentiated NPC. For example, I find a kind of inconsistency between changes in Riboseq density (Figure 3B) and changes in protein synthesis (Figure 1D). Thus, the title overstates a modest correlation between eIF3 binding and important changes in protein synthesis.

      We thank the reviewer for this question. Riboseq data and RNASeq data are not on absolute scales when comparing across cell conditions. They are normalized internally, so increases in, for example, RPF in Figure 3B are relative to the bulk RPF in a given condition. By contrast, the changes in protein synthesis measured in Figure 1D are closer to an absolute measure of protein synthesis.

      (3) This is illustrated by the candidate selection that supports this demonstration. Looking at Figure 3B, ID2, and SNAT2 mRNA are not part of the High TE transcripts (in red). In contrast, the increase in mRNA abundance could explain a proportionally increased association with eIF3 as well as with ribosomes. The example of increased protein abundance of these best candidates is overall weak and uncertain.

      We agree that using TE as the criterion for defining increased eIF3 association would not be correct. By “highly translated” we only mean to convey the extent of protein synthesis, i.e. increases in ribosome protected fragments (RPF), rather than the translational efficiency.

      (4) Despite several attempts (chemical and UV cross-linking) to identify eIF3 partners in NPC such as PABP, PAIP1, or proteins from the 40S, the authors could not provide any evidence for such a mechanism consistent with the closed-loop model. Overall, this rather descriptive study lacks mechanistic insight (eIF3 binding partners).

      We agree that it will be important to identify the molecular mechanism used by eIF3 to engage the termini of mRNA 3’-UTRs. Nevertheless, the identification of eIF3 crosslinking to that location in mRNAs is new, and we think will stimulate new experiments in the field.

      (5) Finally, the authors suspect a potential impact of technical improvement provided by Quick-irCLIP, that could have been addressed rather than discussed.

      We agree a side-by-side comparison of eIF3 crosslinks captured by PAR-CLIP versus Quick-irCLIP will be an important experiment to do. However, NPCs or other primary cells may not be the best system for the comparison. We think using an established cell line might be more informative, to control for effects such as 4-thiouridine toxicity.
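
      The normalization argument made in the response to point (2) above can be illustrated with a small sketch. It assumes a simple fraction-of-total scaling as a stand-in for internal normalization; the authors' actual normalization procedure is not specified here, and the gene names and counts are hypothetical.

```python
# Minimal sketch: internally normalized values report each gene's share of
# the total signal in a condition, so a uniform global increase is invisible.

def relative_abundance(counts: dict[str, float]) -> dict[str, float]:
    """Scale raw counts to fractions of the total within one condition."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total counts must be positive")
    return {gene: n / total for gene, n in counts.items()}


# Hypothetical RPF counts before induction.
before = {"geneA": 100.0, "geneB": 300.0, "geneC": 600.0}
# Hypothetical global doubling of translation after induction.
after = {gene: 2.0 * n for gene, n in before.items()}

rel_before = relative_abundance(before)
rel_after = relative_abundance(after)

for gene in before:
    print(gene, rel_before[gene], rel_after[gene])
```

      In this toy case the absolute output doubles while every normalized value is unchanged, which is why a bulk measure of protein synthesis and per-gene normalized RPF values need not move together.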

    1. eLife Assessment

      The authors use single molecule imaging and in vivo loop-capture genomic approaches to investigate estrogen mediated enhancer-target gene activation in human cancer cells. These potentially important results suggest that ER-alpha can, in a temporal delay, activate a non-target gene TFF3, which is in proximity to the main target gene TFF1, even though the estrogen responsive enhancer does not loop with the TFF3 promoter. To explain these results, the authors invoke a transcriptional condensate model. The reviewers were split on the strength and interpretation of the evidence presented, which is considered incomplete at this stage. We encourage a revision which buttresses the findings with additional control experiments and careful consideration of alternative explanations and mathematical models. Further, the depth of the discussion on existing literature could be improved. This work will be of interest to those studying transcriptional gene regulation and hormone-aggravated cancers.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Bohra et al. describes the indirect effects of ligand-dependent gene activation on neighboring non-target genes. The authors utilized single-molecule RNA-FISH (targeting both mature and intronic regions), 4C-seq, and enhancer deletions to demonstrate that the non-enhancer-targeted gene TFF3, located in the same TAD as the target gene TFF1, alters its expression when TFF1 expression declines at the end of the estrogen signaling peak. Since the enhancer does not loop with TFF3, the authors conclude that mechanisms other than estrogen receptor or enhancer-driven induction are responsible for TFF3 expression. Moreover, ERα intensity correlations show that both high and low levels of ERα are unfavorable for TFF1 expression. The ERα level correlations are further supported by overexpression of GFP-ERα. The authors conclude that the transcriptional machinery used by TFF1 for its acute activation can negatively impact TFF3 at the peak of signaling, but once the condensate dissolves, TFF3 benefits from it for its low-level expression.

      Strengths:

      The findings are indeed intriguing. The authors have maintained appropriate experimental controls, and their conclusions are well-supported by the data.

      Weaknesses:

      There are some major and minor concerns related to the approach, data presentation, and discussion, but I think they can be addressed with more effort.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript by Bohra et al., the authors use the well-established estrogen response in MCF7 cells to interrogate the role of genome architecture, enhancers, and estrogen receptor concentration in transcriptional regulation. They propose there is competition between the genes TFF1 and TFF3 which is mediated by transcriptional condensates. This reviewer does not find these claims persuasive as presented. Moreover, the results are not placed in the context of current knowledge.

      Strengths:

      High levels of ERalpha expression seem to diminish the transcriptional response. Thus, the results in Fig. 4 offer potential insight into ER-mediated transcription. Yet this observation is not pursued in great depth, for example with mutagenesis of ERalpha. Moreover, this phenomenon - which falls under the general description of non-monotonic dose response - is treated at great depth in the literature (e.g. PMID: 22419778). For example, the result the authors describe in Fig. 4 has been reported and in fact mathematically modeled in PMID 23134774. One possible avenue for improving this paper would be to dig into this result at the single-cell level using deletion mutants of ERalpha or by perturbing co-activators.

      Weaknesses:

      There are concerns with the smRNA FISH experiments. It is highly unusual to see so much intronic signal away from the site of transcription (Fig. 2) (PMID: 27932455, 30554876) which suggests to me the authors are carrying out incorrect thresholding or have a substantial amount of labeling background. The Cote paper cited in the manuscript is likewise inconsistent with their findings and is cited in a misleading manner: they see splicing within a very small region away from the site of transcription.

      One substantial way to improve the manuscript is to take a careful look at previous single cell analysis of the estrogen response, which in some cases has been done on the exact same genes (PMID: 29476006, 35081348, 30554876, 31930333). In some of these cases, the authors reach different conclusions than those presented in the present manuscript. Likewise, there have been more than a few studies which characterized these enhancers (the first one I know of is: PMID 18728018). Also, Oh et al. 2021 (cited in the manuscript) did show an interaction between TFF1e and TFF3, which seems to contradict the conclusion from Fig. 3. In summary, the results of this paper are not in dialog with the field, which is a major shortcoming.

      In the opinion of this reviewer, there are few - if any - experiments to interrogate the existence of LLPS for diffraction-limited spots such as those associated with transcription. This difficulty is a general problem with the field and not specific to the present manuscript. For example, transient binding will also appear as a dynamic 'spot' in the nucleus, independently of any higher-order interactions. As for Fig. 5, I don't think treating cells with 1,6-hexanediol is any longer considered a credible experiment. For example, there are profound effects on chromatin independent of changes in LLPS (PMID: 33536240).

      Summary:

      In conclusion, I suggest that the authors look at alternative explanations and analyses -- many of which are experimentally and mathematically rigorous and pre-date the condensate model -- to explain their data.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      Very interesting, conceptually speaking, approach that is by all means worth pursuing. An overall proper approach to the posited aim.

      We want to thank the reviewer for taking the time to review our manuscript and for providing positive feedback regarding our research question.  

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. This precludes it in its current state from drawing unequivocal conclusions.

      Thank you for this essential and valuable comment. We fully accept that the small sample size of the Tercko/ko mice is a major limitation of our study and transparently discuss this in our manuscript.  However, due to Animal Welfare regulations, only a reduced number of mice were approved because of the strong burden of disease. Consequently, only three non-infected and five infected mice were available to us. This reduced number of mice presents a clear limitation to our study. However, due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to increase the dataset.

      The animal studies are an important aspect of our study; however, our hypothesis was also investigated at multiple levels, including in an in vitro co-culture model (Figure 5), to ensure comprehensive analysis. Thus, we clearly demonstrated that S. aureus pneumonia in Tercko/ko mice leads to a more severe phenotype, orchestrated by the dysregulation of both innate and adaptive immune response.

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T-cell dysfunction play a major role in these phenomena.

      Strengths:

      The strengths of the work include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      We would like to thank the reviewer for the positive feedback regarding our aim to investigate the impact of Terc deletion on the pulmonary immune response to S. aureus.  

      Weaknesses:

      The weaknesses outweigh the strengths, dominantly because conclusions are plagued by flaws in experimental design, by lack of rigorous controls, and by incomplete and inadequate approaches to testing immune function. These weaknesses are as follows:

      (1) Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers or use irradiation chimera or crosses that would be informative.

      We thank the reviewer for bringing up this important point. The aim of our study, however, was to investigate the impact of Terc deletion in the lung and on the response to bacterial pneumonia, rather than to provide a comprehensive characterization of the Tercko/ko model itself. This characterization of different tissues and cell types has already been conducted by previous studies. For instance, studies that characterize the general phenotype of the model (Herrera et al., 1999; Lee et al., 1998; Rudolph et al., 1999) but also investigations that shed light on the impact of Terc deletion on specific cell types such as microglia (Khan et al., 2015) or T cells (Matthe et al., 2022). The impact of Terc deletion on T cells is also discussed in our manuscript in lines 89 to 105. Furthermore, a section about the general phenotype of the Terc deletion model is included in the introduction in lines 126 to 138. Thus, we discussed the relevant literature regarding Tercko/ko mice in our manuscript and attempted to provide a more in-depth characterization of the lung by investigating the inflammatory response to infection as well as changes in the gene expression (Figure 2-4).

      (2) Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them, their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      Thank you for mentioning this relevant point. We want to apologize for the confusion regarding this matter. While Tercko/ko mice are a well-established model for premature aging, these effects become more apparent with increasing generations (G) and thus, G5 and 6 mice are the most affected by Terc deletion (Lee et al., 1998; Wong et al., 2008).

      Thus, while Tercko/ko mice are a common model for premature aging, this accelerated aging phenotype is predominantly apparent in later-generation Tercko/ko (G5 and 6) or aged Tercko/ko mice (Lee et al., 1998; Wong et al., 2008). Since the aim of this study was to analyze the impact of Terc deletion on the lung and its immune response to bacterial infections instead of the impact of telomere shortening and telomerase dysfunction, young G3 Tercko/ko mice (8 weeks) were used in this study. This is also mentioned in lines 131-134. In this study, Tercko/ko mice were used not as a model of aging, but rather as a model specifically for Terc deletion. The old WT mice function as a control cohort to observe possible common but also deviating effects between aging and Terc deletion. In our sequencing data, we observe that uninfected young WT mice are very similar to uninfected Tercko/ko mice. Other studies have also reported this lack of major differences between uninfected WT and Tercko/ko mice in the G3 knockout mice (Kang et al., 2018). Conversely, uninfected young WT and Tercko/ko mice exhibited great differences, for instance, regarding the numbers of differentially expressed genes (Supplemental Figure 1H). Thus, differences between naturally aged mice and young G3 Tercko/ko mice are not surprising. To clarify this aspect, we reconstructed the paragraph discussing the Tercko/ko mice (lines 126-134). Additionally, we added a paragraph explaining the purpose of the naturally aged mice to lines 134 to 138:

      “As control cohort age-matched young WT mice were utilized. To investigate whether Terc deletion, beyond critical telomere shortening, impacts the pulmonary immune response, we used young Tercko/ko mice. Additionally, naturally aged mice (2 years old) were infected to explore the potential link to a fully developed aging phenotype.”

      (3) Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Figures 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Figures 1C, D).

      We thank the reviewer for this essential comment. As mentioned above the Tercko/ko mice in this study are not selected to model natural aging. To model telomerase dysfunction and accelerated aging selection of later generation or aged Tercko/ko mice would have been more suitable. 

      The lack of statistical significance in some figures is likely due to the heterogeneity of disease phenotype of S. aureus infection in mice, which is a limitation of our study that we discuss in our discussion section in lines 576-582. The phenotype of S. aureus infection can vary greatly within a mouse population, highlighting the limitations of mice as a model for S. aureus infections. To account for this heterogeneity we divided the infected Tercko/ko mice cohort into different degrees of severity based on the clinical score and the presence of bacteria in organs other than the lung (mice with systemic infection). 

      Despite the heterogeneity especially within the Tercko/ko mice cohort the differences between the knockout and young as well as old WT mice were striking. Including the fatal infections, 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course (Figure 1A, B and Supplemental Figure 1A, B). This hints towards a clear role of Terc in the response to S. aureus infection in mice. Thus while in some figures the differences are not significant, strong trends towards a more severe phenotype of S. aureus infection in the Tercko/ko mice regarding bacterial load, score and inflammatory response could be observed in our study. 

      Another example of inadequate group design is when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Supplementary Figures 1G; Figure 2; lines 374-376 and 389-391). This gives them significant differences in several figures, but because they did not clearly indicate where they applied this stratification in the figure legends, the data are somewhat confusing. Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild-type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      We sincerely appreciate the significant time and effort you have invested in reviewing our manuscript. However, with all due respect, we must point out that the definition of sepsis you have referenced is considered outdated. According to the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3), sepsis is defined as "a life-threatening organ dysfunction caused by a dysregulated host response to infection" (Mervyn Singer, 2016, JAMA). Given this fundamental misunderstanding of our findings, we find the comment regarding the inadequacy of our groups to be both dismissive and lacking in scientific merit. We would like to emphasize that the group size used in our study is consistent with accepted standards in infection research. We strongly reject any insinuations of inadequacy that have been repeatedly mentioned throughout the review.

      In order to provide a nuanced investigation of disease severity in Tercko/ko mice, we added the term “systemic infection” to the figures whenever the mice were divided into groups of mice with and without systemic infection. This is the case for Figure 2A and Supplemental Figure 1C-E. The division into mice with and without systemic infection is also mentioned in the figure legend of Figure 2A in lines 932 to 935 and for Supplemental Figure 1 in lines 1052-1053. We agree that Supplemental Figure 1G is somewhat confusing as the mice with systemic infection are highlighted in this graph but not included as a separate group within our sequencing analysis. We added a sentence to the figure legend clarifying this (lines 1042-1044):

      “Nevertheless, the infected Tercko/ko mice were considered one group for the expression analysis and not split into separate groups for the subsequent analysis.”

      Additionally, we revised the section regarding this grouping in different degrees of severity in our Material and Methods section to clarify that this division was only performed for specific analysis (line 191):

      “…for the indicated analysis.”

      Furthermore, the mice which were classified as systemically infected mice were not septic mice, as mentioned above. Those mice were classified by us as systemically infected based on their clinical score and the presence of bacteria in other organs than the lung as stated in the lines 188-191 and 377-381. Bacteremia is a symptom of very severe cases of hospital-acquired pneumonia with a very high mortality (De la Calle et al., 2016).

      Therefore, the systemically infected mice or rather mice with bacteremia display an especially severe pneumonia phenotype, which is distinct from sepsis. The presence of this symptom in our Tercko/ko mice further highlights the clinical relevance of our study. This aspect was added to the manuscript in the lines 568-570.

      “The detection of bacteria in extra pulmonary organs is of particular interest, as bacteremia is a symptom of severe pneumonia and is associated with high mortality (De la Calle et al., 2016).”

      (4) The authors conclude that dysregulated inflammation and T-cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear.

      Two points are important here. First, there is no natural counterpart to a Terc-KO, which is a complete loss of a key non-enzymatic component of the telomerase complex starting in utero. 

      Second, the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune responses. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR, or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Figure 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity in many T-cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      We thank the reviewer for highlighting these important aspects. Regarding the first point, indeed there is no naturally occurring deletion of Terc in humans. However, studies reported reduced expression of Terc and Tert in the tissues of aged mice and rats (Tarry-Adkins et al., 2021; Zhang et al., 2018). Terc itself has been found to have several important immunomodulatory functions such as the activation of the NFκB or PI3-kinase pathway (Liu et al., 2019; Wu et al., 2022). As those aforementioned pathways are relevant for the immune response to S. aureus infections, the authors were interested in exploring the impact of Terc deletion on the pulmonary immune response. The potential immunomodulatory functions of Terc are discussed in lines 106-121. To further clarify our rationale we added a sentence to the introduction in lines 121-125.

      “Interestingly, downregulation of Terc and Tert expression in tissues of aged mice and rats has been found (Tarry-Adkins, Aiken, Dearden, Fernandez-Twinn, & Ozanne, 2021; Zhang et al., 2018). Therefore, as a potential immunomodulatory factor reduced Terc expression could be connected to age-related pathologies.”

      Regarding the second point, as we focused on the effect of Terc deletion in the lung and its role in S. aureus infection, we investigated inflammatory and immune response parameters relevant to this setting. For instance, inflammation parameters in the lungs of all three mouse cohorts were measured to investigate differences in the inflammatory response in the non-infected and infected mice (Figure 2A). Those measurements showed no baseline difference in key inflammatory parameters between young WT and Tercko/ko mice, which is consistent with previous findings (Kang et al., 2018). The inflammatory response to infection with S. aureus in the Tercko/ko mouse cohort differed significantly from the other cohorts (Figure 2A), hinting towards a dysregulated inflammatory response due to Terc deletion. Furthermore, we investigated general immune cell frequencies such as dendritic cells, macrophages, and B cells in the spleen of all three mouse cohorts to gather a baseline understanding of the general immune cell populations. In our manuscript, only total T cell frequencies were included due to their relevance for our data regarding T cells (Figure 4B). These data showed no difference in total T cell frequencies in the spleen across the three mouse cohorts. For a more detailed insight into our analysis, we added the frequencies of the other immune cell populations analyzed in the spleen as Supplemental Figure 3B-F. Additionally, a figure legend for the graphs was added to lines 1075-1094.

      Therefore, while we did not analyze baseline frequencies of specific populations of T cells, we analyzed and characterized the inflammatory and immune response of our model in a way relevant to our research question. 

      The differences observed in T cell marker and TCR gene expression were also partly present between the uninfected and infected Tercko/ko mice, such as the complete absence of CD247 expression in infected Tercko/ko mice, although it is expressed in uninfected mice of this cohort (Figure 4A, C and D). Thus, this effect cannot be solely attributed to an inadequate mobilization of T cells to the lung after infectious challenge. However, we agree that a more detailed insight into recruited immune cells to the lung or frequencies of different T cell populations could contribute to a better understanding of the proposed mechanism and would be an interesting experiment to conduct in further studies. We accept this as a limitation of our study and included it in our discussion section in lines 719-723:

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      (5) Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences, not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of phenomena the authors are trying to conclude about. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest

      We thank the reviewer for highlighting this important and relevant point. In our study, we aimed to investigate the role of Terc expression in modulating inflammation and the immune response to S. aureus infection in the lung. To address this, we examined the overall impact of age, genotype, and infection on lung inflammation and gene expression. Therefore, sequencing of total lung tissue was essential for addressing the research question posed. Our findings demonstrate that Tercko/ko mice exhibit a more severe phenotype following S. aureus infection, characterized by an increased bacterial load and heightened lung inflammation (Figures 1 and 2). Furthermore, our data suggest that Terc plays a role in regulating inflammation through activation of the NLRP3 inflammasome, along with the dysregulation of several T cell marker genes (Figures 2, 4, and 5). However, this study lacks a detailed analysis of distinct T cell populations, including antigen-specific T cells, as noted earlier. Investigating these aspects in future studies would be valuable to validate and expand upon our findings. We have incorporated these suggestions into the discussion section (lines 719-723):

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      Nevertheless, our study provides first evidence of a potential connection between T cell functionality and Terc expression.  

      Third, the authors co-incubate AM and T cells with S. aureus. There is no information here about the phenotype of T cells used. Were they naïve, and how many S. aureus-specific T cells did they contain? Or were they a mix of different cell types, which we know will change with aging (fewer naïve and many more memory cells of different flavors), and maybe even with a Terc-KO? Naïve T cells do not interact with AM; only effector and memory cells would be able to do so, once they have been primed by contact with dendritic cells bringing antigen into the lymphoid tissues, so it is unclear what the authors are modeling here. Mature primed effector T cells would go to the lung and would interact with AM, but it is almost certain that the authors did not generate these cells for their experiment (or at least nothing like that was described in the methods or the text).

      Thank you for bringing up this important question. For the co-cultivation experiment of T cells and alveolar macrophages, total CD4+ T cells of both young WT and Tercko/ko were used. We did not select for a specific population of T cells. Our sequencing data indicated the complete downregulation of CD247 expression, which is an important part of the T cell receptor, in the lungs of infected Tercko/ko mice (Figure 4A, C and D). Given that this factor is downregulated under chronic inflammatory conditions, we investigated the impact of the inflammatory response in alveolar macrophages on the expression of various T cell-derived cytokines, as well as CD247 expression (Figure 5D, E) (Dexiu et al., 2022). This aspect is also highlighted in the discussion in lines 622-636. Therefore, a co-cultivation model of T cells and alveolar macrophages was established and confronted with heat-killed S. aureus to elicit an inflammatory response of the macrophages. To emphasize this purpose, we have revised our statement about the model setup in lines 516-518 of the manuscript: 

      “An overactive inflammatory response could be a potential explanation for the dysregulated TCR signaling.”

      The authors hope this will clarify the intent behind the model setup.

      (6) Overall, the authors began to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity, or aging remains unclear at present.

      We thank the reviewer for the helpful and relevant comments. The authors accept the limitations of the presented study such as the reduced number of Tercko/ko mice and the limitations of murine models for S. aureus infection itself and discuss those in the discussion section in the lines 558-560; 576-582; 688-690 and 719-725. However, we hope that our responses have provided sufficient evidence to convince the reviewer that our data supports a clear role for Terc expression in regulating the immune response to bacterial infections, particularly with respect to inflammation and its potential connection to T cell functionality.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The good element first:

      I read this paper with genuine interest and applaud the authors for investigating the posited question. I consider it by all means scientifically relevant in the context of physiological/pathophysiological aging and reaction to a disease (here pneumonia). The Terc deletion model looks very appropriate for the question and the methodology is very advanced/in-depth. The data flow/selection of endpoints and assays is very logical to me. Moreover, I like the breakdown of pneumonia into varying levels of severity.

      We thank the reviewer for their time and effort taken to revise our manuscript. Additionally, we are grateful to receive your positive feedback regarding our study design and research question.

      The weaknesses:

      (1) I cannot help but notice that the study is heavily underpowered. As such, it is inadmissible. The key reason is that it is the first of its kind and seminal findings must be strongly propped by the evidence. It is apparent to me that the data scatter presented in the figures tends to be abnormally distributed (e.g. obvious bimodal distribution in some groups). Therefore, the presented comparisons (even if stat. sign) can be heavily misleading in terms of: i) the true magnitude of the observed effects and ii) possible type 2 error in some cases of p value >0.05. Solution: repeat the study to ensure reasonable power/reliability. This will also make it stronger as it will immediately demonstrate its reproducibility (or lack of it).

      Thank you for bringing up this extremely relevant point. We acknowledge the issue of the small sample size of Tercko/ko mice as a major limitation of our study. This limitation is also included in our discussion section in the lines 558-560. Thus we fully agree with this limitation and transparently discuss this in our manuscript. However, due to the strict German animal welfare regulations it is not possible to obtain more Tercko/ko mice, as mentioned above. Furthermore, since fatal infections occurred in the Tercko/ko mice cohort we had a reduced number of mice available. 

      However, the differences between the Tercko/ko and WT mice were striking. Including the fatal infections 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course. This hints towards a clear role of Terc in the response to S. aureus infection in mice.  

      (2) In the stat analysis section of M&Ms, the authors feature only 1 sentence. This cannot be. A detailed stats workup needs to be included there. This is very much related to the above weakness; e.g. it is impossible to test for normality (to choose an appropriate post-hoc test) with n=3. Back to square one: study underpowered.

      We thank the reviewer for highlighting this important aspect. We carefully revised the method section in lines 357-360 to include all relevant information: 

      “Data are presented as mean ± SD, or as median with interquartile range for violin and box plots, with up to four levels of statistical significance indicated. P-values were calculated using Kruskal-Wallis test. Individual replicates are represented as single data points.”
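
      The sample-size concern in point (2) can be made concrete with a small calculation. The sketch below is an editorial illustration, not the authors' analysis: it assumes an exact two-group rank test without ties (Mann-Whitney type) rather than the Kruskal-Wallis test named above, and the helper name `min_two_sided_p` is hypothetical. It uses the group sizes reported in this response (three non-infected and five infected Tercko/ko mice; five per WT group) and computes the smallest two-sided p-value such a test could possibly return.

```python
from math import comb


def min_two_sided_p(n_a: int, n_b: int) -> float:
    """Smallest two-sided p-value reachable by an exact two-group rank test.

    Assumes no tied observations. Under the null hypothesis every way of
    assigning ranks to group A is equally likely, so the single most extreme
    arrangement plus its mirror image has probability 2 / C(n_a + n_b, n_a).
    """
    return min(1.0, 2 / comb(n_a + n_b, n_a))


# Group sizes taken from the response text: 3 vs 5 (Tercko/ko) and 5 vs 5 (WT).
for n_a, n_b in [(3, 5), (5, 5)]:
    p_min = min_two_sided_p(n_a, n_b)
    print(f"n = {n_a} vs {n_b}: minimum two-sided p = {p_min:.4f}")
```

      Under these assumptions, a comparison of three against five animals cannot fall below roughly p = 0.036 even when the two groups separate perfectly, which helps explain why only the coarsest significance level is attainable at this sample size.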

      (3) Pneumonia severity. While I noted that as a strength, I also note it as weakness here. It looks to me like the authors stopped halfway with this. I totally support testing a biological effect(s) such as the one investigated here across a spectrum of a given disease severity. The authors mention that they had various severity phenotypes produced in their model but this is not visible in the data figs. I strongly suggest including that as well; i.e., to study the posited question in the severe and mild pneumonia phenotype. This is a very smart path and previous preclinical research clearly demonstrated that this severe/mild distinction is very relevant in the context of the observed responses (their presence/absence, longevity, dynamics, etc). I realize this is challenging, thus, I would probably use this approach in the Terc k/o model as sort of a calibrator to see whether the exacerbation observed in the current setup (severe?) will be also present in a mild pneumonia phenotype. S. aureus can be effectively titrated to produce pneumonia of varying severity.

      We thank the reviewer for bringing up this relevant point. 

      In our study, we could observe heterogeneity within the infected Tercko/ko cohort. Therefore, as pointed out by the reviewer, we assigned different degrees of severity to those groups based on clinical scores, the fatal outcome of the disease (fatal subgroup), and the presence of bacteria in organs other than the lungs (systemic infection subgroup), as stated in our materials and methods part in lines 188-191 (Supplemental Figure 1A and B). Moreover, we highlighted this difference in a number of our figures. For example, when categorizing the mice into groups with and without systemic infection, we noticed that the mice with systemic infection demonstrated a higher bacterial load, significant body weight loss, and increased lung weight (see Supplemental Figure 1C-E). Interestingly, the two mice with systemic infection clustered separately from the other mice, indicating that the mice with systemic infection are transcriptomically distinct from the other mouse cohorts (Supplemental Figure 1G). Additionally, the inflammatory response was exclusively elevated in the lungs of mice with systemic infection (Figure 2C). Thus, we included this distinction in several figures and attempted to study the differences between those subgroups but also their similarities. For instance, we could observe that some changes in the transcriptome were present in all three infected Tercko/ko mice, such as the complete absence of CD247 expression at 24 hpi (Figure 4D). This distinction therefore provided a more detailed insight into the underlying mechanisms of disease severity in Tercko/ko mice and is lacking in other studies. We agree with the reviewer that a study investigating mild and severe pneumonia phenotypes would be clinically relevant. However, as noted above, due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to carry out the proposed experiment.

      (4) Please read ARRIVE guidelines and note the relevant info in M&Ms as ARRIVE guidelines point out.

      Thank you for emphasizing this crucial aspect. We revised our materials and methods section according to the ARRIVE guidelines (lines 179-206).

      “Tercko/ko mice, aged 8 weeks, were used for infection studies (n = 8; non-infected = 3; infected = 5). Female young WT (age 8 weeks) and old WT (age 24 months) C57Bl/6 mice (both n = 10; non-infected = 5; infected = 5) were purchased from Janvier Labs (Le Genest-Saint-Isle, France). All infected mouse cohorts were compared to their respective non-infected controls, as well as to the infected groups from other cohorts. Additionally, comparisons were made between the non-infected cohorts across all groups.

      All mice were anesthetized with 2% isoflurane before intranasal infection with S. aureus USA300 (1x10^8 CFU/20µl) per mouse. After 24 hours, the mice were weighed and scored as previously described (Hornung et al., 2023). Infected Tercko/ko mice were grouped into different degrees of severity based on their clinical score, fatal outcome of the disease (fatal) and the presence of bacteria in organs other than the lung (systemic infection) for the indicated analysis. Mice with fatal infections were excluded from subsequent analyses, with only their final scores being reported. The mice were sacrificed via injection of an overdose of xylazine/ketamine and bleeding of axillary artery after 24 hpi. BAL was collected by instillation and subsequent retrieval of PBS into the lungs. Serum and organs were collected. Bacterial load in the BAL, kidney and liver was determined by plating of serially diluted sample as described above. For this, organs were previously homogenized in the appropriate volume of PBS. Gene expression was analyzed in the right superior lung lobe. Lobes were therefore homogenized in the appropriate amount of TRIzol LS reagent (Thermo Fisher Scientific, Waltham, MA, US) prior to RNA extraction. The left lung lobe was embedded into Tissue Tek O.C.T. (science services, Munich, Germany) and stored at -80°C until further processing for histological analysis. Cytokine measurements were performed using the right inferior lung lobe. Lobes were previously homogenized in the appropriate volume of PBS. Remaining organs were stored at -80°C until further usage. Mouse studies were conducted without the use of randomization or blinding.”

      (5) There are also some other descriptive deficits but they are of a much smaller caliber so I do not list them.

      We thank the reviewer for their valuable and insightful suggestions for improving our manuscript. We hope that our responses and the corresponding revisions address these suggestions satisfactorily.

      Concluding: the investigative idea is great/interesting and the methodological flow is adequate but the low power makes this study of low reliability in its current form. I strongly urge the authors to walk the extra mile with this work to make it comprehensive and reliable. Best of luck!

      Reviewer #2 (Recommendations For The Authors):

      (1) Many legends are uninformative and do not contain critical information about the experiments. For example, Figure 2A with cytokine measurements (in lung homogenates?) is likely showing data from an ELISA or Luminex test, but there is no mention of that in the legend. It stands next to Figure 2B, which is a gene expression map, again, likely from the lung (prepared how, normalized how, etc?) lacking even the most basic information. Further, Figure 2D has no information on the meaning/effect size of gene ratios on the x-axis. Figures 3 and 4 are presumably the subsets of their transcriptome data set (whole lung, harvested on d ?? post-infection), but that is just a guess on my part. Even in the main text, the timing and the controls for the transcriptomic study are not stated (ln. 398 and onwards). The authors really need to revise the figure legends and provide all the details that an average reader would need to be able to interpret the data.

      We thank the reviewer for bringing up this important point. The figure legends of all figures including supplemental figures were revised to ensure they include all relevant data necessary for accurate interpretation of the graphs. Additionally, we clarified the sequenced samples in lines 427-429:

      “We performed mRNA sequencing of the murine lung tissue of infected and non-infected mice at 24 hpi to elucidate potential differentially expressed genes that contribute to the more severe illness of Tercko/ko mice.”

      (2) Telomere shortening affects differentially different cells and its role in aging is nuanced - different in mesenchymal cells with no telomerase induction, in non-replicating cells, and in hematopoietic cells that can readily induce telomerase. The authors should be mindful of that in setting up their introduction and discussion.

      Thank you for mentioning this essential aspect. We revised our introduction and discussion to reflect the nuanced role of telomere shortening in different tissues (lines 83-92 and 690-695):

      “Telomerase activity is restricted to specific tissues and cell types, largely dependent on the expression of Tert. While Tert is highly expressed in stem cells, progenitor cells, and germline cells, its expression is minimal in most differentiated cells (Chakravarti, LaBella, & DePinho, 2021). Consequently, the impact of telomerase dysfunction on tissues varies according to their self-renewal rate. (Chakravarti et al., 2021). One important aspect of telomere dysfunction is the impact of telomere shortening on the immune system as well as the hematopoietic system. Tissues or organ systems that are highly replicative, such as the skin or the hematopoietic system, are affected first by telomere shortening (Chakravarti et al., 2021).”

      “It is important to note that telomere shortening has a significant impact on the immune system. Although young Tercko/ko mice were used in this study, telomere shortening is still likely to be a contributing factor. Therefore, further experiments investigating the role of T cell senescence in this model should be conducted.”

      (3) Syntax and formulations need to be improved and made more scientifically precise in several spots. Specifically, in 62-63, the authors say that the aged immune system "is also discussed to be more irritable", please change to reflect the common notion that the reaction to infection is dysregulated; in many cases inflammation itself is initially blunted, misdirected, and of different type (e.g. for viruses, the key IFN-I responses are not increased but decreased). In lines 114-117, presumably, the two sentences were supposed to be connected by a comma, although some editing for clarity is probably needed regardless. Line 252, please change "unspecific" to "non-specific". Line 264, please capitalize German.

      We thank the reviewer for bringing these important points to our attention. We revised our introduction regarding the aged immune response in lines 61-69:

      “Age-related dysregulation of the immune response is also characterized by inflammaging, defined as the presence of elevated levels of pro-inflammatory cytokines in the absence of an obvious inflammatory trigger (Franceschi et al., 2000; Mogilenko, Shchukina, & Artyomov, 2022). Additionally, immune cells, such as macrophages, exhibit an activated state that alters their response to infection (Canan et al., 2014). In contrast, the immune response of macrophages to infectious challenges has been shown to be initially impaired in aged mice (Boe, Boule, & Kovacs, 2017). Thus aging is a relevant factor impacting the pulmonary immune response.”

      Sentences were edited to provide more clarity in lines 131-134:

      “Although G3 Tercko/ko mice with shortened telomeres were used in this study, they were infected at a young age (8 weeks). This approach allowed for the investigation of Terc deletion effects rather than telomere dysfunction.”

      “Unspecific” was changed to “non-specific” in line 282, and “German” was capitalized in lines 293 and 558.

      We appreciate and thank you for your time spent processing this manuscript and look forward to your response.

      References

      De la Calle, C., Morata, L., Cobos-Trigueros, N., Martinez, J. A., Cardozo, C., Mensa, J., & Soriano, A. (2016). Staphylococcus aureus bacteremic pneumonia. European Journal of Clinical Microbiology & Infectious Diseases, 35(3), 497-502. https://doi.org/10.1007/s10096-015-2566-8  

      Dexiu, C., Xianying, L., Yingchun, H., & Jiafu, L. (2022). Advances in CD247. Scand J Immunol, 96(1), e13170. https://doi.org/10.1111/sji.13170  

      Herrera, E., Samper, E., Martín-Caballero, J., Flores, J. M., Lee, H. W., & Blasco, M. A. (1999). Disease states associated with telomerase deficiency appear earlier in mice with short telomeres. Embo j, 18(11), 2950-2960. https://doi.org/10.1093/emboj/18.11.2950

      Hornung, F., Schulz, L., Köse-Vogel, N., Häder, A., Grießhammer, J., Wittschieber, D., Autsch, A., Ehrhardt, C., Mall, G., Löffler, B., & Deinhardt-Emmer, S. (2023). Thoracic adipose tissue contributes to severe virus infection of the lung. International Journal of Obesity, 47(11), 1088-1099. https://doi.org/10.1038/s41366-023-01362-w

      Kang, Y., Zhang, H., Zhao, Y., Wang, Y., Wang, W., He, Y., Zhang, W., Zhang, W., Zhu, X., Zhou, Y., Zhang, L., Ju, Z., & Shi, L. (2018). Telomere Dysfunction Disturbs Macrophage Mitochondrial Metabolism and the NLRP3 Inflammasome through the PGC-1α/TNFAIP3 Axis. Cell Reports, 22(13), 3493-3506. https://doi.org/10.1016/j.celrep.2018.02.071

      Khan, A. M., Babcock, A. A., Saeed, H., Myhre, C. L., Kassem, M., & Finsen, B. (2015). Telomere dysfunction reduces microglial numbers without fully inducing an aging phenotype. Neurobiology of Aging, 36(6), 2164-2175. https://doi.org/10.1016/j.neurobiolaging.2015.03.008

      Lee, H.-W., Blasco, M. A., Gottlieb, G. J., Horner, J. W., Greider, C. W., & DePinho, R. A. (1998). Essential role of mouse telomerase in highly proliferative organs. Nature, 392(6676), 569-574. https://doi.org/10.1038/33345  

      Liu, H., Yang, Y., Ge, Y., Liu, J., & Zhao, Y. (2019). TERC promotes cellular inflammatory response independent of telomerase. Nucleic Acids Research, 47(15), 8084-8095. https://doi.org/10.1093/nar/gkz584  

      Matthe, D. M., Thoma, O. M., Sperka, T., Neurath, M. F., & Waldner, M. J. (2022). Telomerase deficiency reflects age-associated changes in CD4+ T cells. Immun Ageing, 19(1), 16. https://doi.org/10.1186/s12979-022-00273-0  

      Rudolph, K. L., Chang, S., Lee, H. W., Blasco, M., Gottlieb, G. J., Greider, C., & DePinho, R. A. (1999). Longevity, stress response, and cancer in aging telomerase-deficient mice. Cell, 96(5), 701-712. https://doi.org/10.1016/s0092-8674(00)80580-2  

      Tarry-Adkins, J. L., Aiken, C. E., Dearden, L., Fernandez-Twinn, D. S., & Ozanne, S. (2021). Exploring Telomere Dynamics in Aging Male Rat Tissues: Can Tissue-Specific Differences Contribute to Age-Associated Pathologies? Gerontology, 67(2), 233-242. https://doi.org/10.1159/000511608  

      Wong, L. S. M., Oeseburg, H., de Boer, R. A., van Gilst, W. H., van Veldhuisen, D. J., & van der Harst, P. (2008). Telomere biology in cardiovascular disease: the TERC−/− mouse as a model for heart failure and ageing. Cardiovascular Research, 81(2), 244-252. https://doi.org/10.1093/cvr/cvn337  

      Wu, S., Ge, Y., Lin, K., Liu, Q., Zhou, H., Hu, Q., Zhao, Y., He, W., & Ju, Z. (2022). Telomerase RNA TERC and the PI3K-AKT pathway form a positive feedback loop to regulate cell proliferation independent of telomerase activity. Nucleic Acids Res, 50(7), 3764-3776. https://doi.org/10.1093/nar/gkac179  

      Zhang, M. W., Zhao, P., Yung, W. H., Sheng, Y., Ke, Y., & Qian, Z. M. (2018). Tissue iron is negatively correlated with TERC or TERT mRNA expression: A heterochronic parabiosis study in mice. Aging (Albany NY), 10(12), 3834-3850. https://doi.org/10.18632/aging.101676

    2. eLife Assessment

      In this manuscript, the authors sought to elucidate mechanistic intricacies of inflammatory responses, with emphasis on T cell dysfunction, to S. aureus-induced pneumonia in the context of the aging process using Terc-deficient mice. Conceptually, the study is very interesting with a set of useful findings. Although some experimental approaches are appropriate, the work as shown in the revised manuscript remains significantly underpowered, and the absence of rigorous controls makes this study incomplete in support of its claims.

    3. Reviewer #1 (Public review):

      Summary:

      This work sets out to elucidate mechanistic intricacies of inflammatory responses in pneumonia in the context of the aging process (Terc deficiency, i.e. telomerase functionality).

      Strengths:

      A conceptually very interesting approach that is by all means worth pursuing. Overall, the approach is appropriate to the posited aim.

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. In its current state, this precludes drawing unequivocal conclusions.

      I remain at my initial position regarding the weaknesses.

    4. Reviewer #2 (Public review):

      Summary

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T cell dysfunction play a major role in these phenomena.

      The strengths of the work did not change, and include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      The weaknesses of this revised version still outweigh the strengths, because the authors did not substantially or experimentally answer the main criticism points, and have rather tried to argue away that which cannot be argued away. In summary, the most germane conclusions of this study remain plagued by flaws in experimental design, by lack of rigorous controls and by incomplete and inadequate approaches to testing of immune function.

      I will devote the rest of the comments to the revised manuscript and its success or lack thereof in responding to prior criticisms. Prior criticisms are again listed below in italics, to provide context for the attempts of the investigators to respond.

      (1) Reviewer 1 has justifiably criticized the exceptionally low power of the study, with 5 control and 3 experimental animals. The responding author has replied that the animal welfare laws preclude them from doing more experiments. That is unfortunate, and I sympathize with the authors. Nonetheless, in the absence of robust corroboration the rigor of the study remains severely compromised and the work is reduced to what I have pointed out above: a preliminary and inconclusive study that is in need of deeper and more serious mechanistic investigation.

      (2) Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers, use irradiation chimera or crosses that would be informative.

      In response to this criticism, the authors have quoted a whole bunch of papers characterizing different aspects of biology of these same mice. The most important paper in that regard would be the one by Matthe et al. on CD4 cells from these same mice. That study was limited and simply diagnosed in situ the changes in T cell pool, but did not decipher whether and to what extent such defects are cell-intrinsic or a byproduct of similarly altered microenvironments. Most importantly, none of that answers the original critique question of which cell types are truly the culprits in the Terc deletion phenotype presented here. As I indicated, one has to perform cell transfers, bone marrow irradiation chimera, additional genetic crosses and combinations thereof to substantiate whether the defects are ascribable to the lung tissue itself, the infiltrating myeloid cells, including macrophages, the T cells or a combination thereof. The authors provided none of this.

      (3) Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      (4) Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Fig. 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Fig. 1C, D). I have also raised the issue of non-physiological nature of a germline Terc-KO, that does not mimic any known physiological or pathological state.

      The authors provided a non-response to this criticism. They argue in their response under (2) of their rebuttal that they included old mice as controls not for aging, because their experimental Terc-deletion mice were G3 and do not exhibit as much of a progeroid phenotype as G5 or G6 mice. But they still say in the revised formulation that these mice were infected "to explore the potential link to a fully developed aging phenotype". They just never conclude that no such link is substantiated by the vast majority of their data. Moreover, they come back to state in their response (4) that because the literature reported ".... reduction of Terc and Tert in tissues of old mice and rats. Therefore, as a potential immunomodulatory factor reduced Terc expression could be connected to age-related pathologies." So either they have used old mice here to compare aging phenotypes, and found that Terc-KO mice diverge massively from aging phenotypes, in which case they have to state so, or they are not using them as age comparators (in which case I am not sure what their purpose is).

      (5) (originally part of criticism #4) I have criticized the inadequate group design in which the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Suppl Fig. 1G; Fig. 2; lines 374-376 and 389-391). .... Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      The authors responded by making me aware of the 2016 JAMA definition of sepsis that invokes "a life-threatening organ dysfunction caused by a dysregulated host response to infection". I appreciate the correction, and note that in a human setting and globally, such a definition may make sense. The authors stated that bacteremia and not sepsis should be used as a criterion. I agree, and per my original criticism, believe it will be appropriate to compare bacteremic wt and KO mice.

      (6) I am shortening my prior critique to make it more to the point that was not addressed: The authors conclude that dysregulated inflammation and T cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear. ....., the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune response. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Fig. 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge, and are they even comparable, or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity of many T cell-associated genes in the transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      The authors did not respond to this criticism other than to provide more frequencies of different subsets. The key here is the NUMBERS of cells present at the peak of challenge, or better yet the kinetics of cell accumulation (again numbers), as well as transfer experiments to establish where the defect actually lies (mobilization, activation, proliferation, etc.).

      (7) Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of phenomena the authors are trying to conclude about. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest.

      The authors agreed that this would be of interest but did nothing to provide it. They provided a sentence in the discussion stating that this (as well as many other experiments needed to interpret the results) would be of interest.

      (8) Overall, the authors have begun to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity or aging remains unclear at present.

      My conclusion from the prior review remains unchanged in the face of the revision that did not answer most of the previous criticism. The study as it stands is inconclusive and highly preliminary, with lack of clearly defined mechanistic underpinnings.

    1. eLife Assessment

      In this useful study, the authors present convincing evidence linking the enzyme D-alanine-D-alanine ligase (Ddl), crucial for cell wall fortification, to organic acid exposure in Staphylococcus aureus. While it's established that organic acids impede bacterial growth, the researchers reveal a novel coping mechanism where S. aureus maintains elevated levels of D-alanine, the substrate for Ddl, to counteract this inhibition. This discovery illuminates a bacterial strategy for organic acid tolerance, offering new insights for microbiologists and potentially informing future antimicrobial approaches.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript entitled "Staphylococcus aureus counters organic acid anion-mediated inhibition of peptidoglycan cross-linking through robust alanine racemase activity" by Panda, S et al. reports an extensive biochemical analysis of the result from a Tn screen that identified alr1 as being required for acetic acid tolerance. In the end, they demonstrate that reduced D-Ala pools in the ∆alr1 mutant lead to a drastic reduction in D-Ala-D-Ala dipeptide. They show that this is due to the ability of organic acid anions to limit the D-Ala-D-Ala ligase enzyme Ddl. They demonstrate that:

      (1) Acetate exposure in the ∆alr1 results in reduced D-Ala-D-Ala dipeptide, but not the monomers.

      (2) Acetate can bind to purified Ddl in vitro.

      (3) This binding results in reduced enzyme activity.

      (4) Other organic acid anions such as lactate, propionate, and itaconate can also inhibit Ddl.

      The experiments are clearly described and logically laid out.

      Comments on revised version:

      Given that multiple reviewers noted that determining intracellular acetate levels would strengthen the impact of this manuscript, I still think the comment listed below should be dealt with. Radioactivity is not necessary for this. There are enzymatic kits that will allow for the accurate determination of acetate from a lysate of a known number of cells. This can be used to determine intracellular acetate levels.

      (1) It is kind of tricky, but it is possible to measure intracellular acetate. That might be of interest to know where in the Ddl inhibition curve the cells actually are.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, using Staphylococcus aureus as a model organism, Panda et al. aim to understand how organic acids inhibit bacterial growth. Through careful characterization and interdisciplinary collaboration, the authors present valuable evidence that acetic acid specifically inhibits the activity of the Ddl enzyme that converts 2 D-alanine amino acids into D-ala-D-ala dipeptide, which is then used to generate the stem pentapeptide of peptidoglycan (PG) precursors in the cytoplasm. Thus, a high concentration of acetic acid weakens the cell wall by limiting PG-crosslinking (which requires the D-ala portion). However, S. aureus maintains a high intracellular D-ala concentration to circumvent acetate-mediated growth inhibition.

      Strengths:

      The authors utilized a well-established transposon mutant library to screen for mutants that struggle to grow in the presence of acetic acid. This screen allowed the authors to identify that a strain lacking intact alr1, which encodes alanine racemase (converts L-ala to D-ala), is unable to grow well in the presence of acetic acid. This phenotype is rescued by the addition of external D-ala. Next, the authors rule out the contribution of other pathways that could lead to the production of D-ala in the cell. Finally, by analyzing D-ala and D-ala-D-ala concentrations, as well as muropeptide intermediates accumulation in different mutants, the authors pinpoint Ddl as the specific target of acetic acid. In fact, synthetic overexpression of ddl alone overcomes the toxic effects of acetic acid. Using genetics, biochemistry, and structural biology, the authors show that Ddl activity is specifically inhibited by acetic acid and likely by other biologically relevant organic acids. Interestingly, this mechanism is different from what has been reported for other organisms such as Escherichia coli (where methionine synthesis is affected). It remains to be seen if this mechanism is conserved in other organisms that are more closely related to S. aureus, such as Clostridioides difficile and Enterococcus faecalis.

      Weaknesses:

      None noted. With new data the authors have satisfactorily addressed all the concerns of the previous version.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1:

      (1) Which allele is alr1, the one upstream of mazEF or the one in the lysine biosynthetic operon?

      Alr1 is encoded by SAUSA300_2027, the gene upstream of mazEF. We have now incorporated this information in the manuscript (Line# 127).

      (2) Figure 3B. Where does the C3N2 species come from in the WT and why is it absent in the mutants? It is about 25% of the total dipeptide pool.

      In Figure 3B, C3N2 species results from the combination of C3N1 (from Alr1) and C0N1 (from Dat). The reason this species is completely absent in either of the two mutants is because it requires one D-Ala from both Alr1 and Dat proteins to generate C3N2 D-Ala-D-Ala.
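
      The label bookkeeping in this response can be sketched in a few lines of Python. This is only an illustration, assuming (as the response states) that Alr1-derived D-Ala carries the C3N1 label and Dat-derived D-Ala carries C0N1, and that a dipeptide's label is the sum of its two monomers; the function and dictionary names are hypothetical and not from the manuscript.

```python
from itertools import combinations_with_replacement

# Label carried by D-Ala from each source, as (labelled C count, labelled N count).
# Values follow the authors' response; the names are illustrative assumptions.
D_ALA_LABEL = {"Alr1": (3, 1), "Dat": (0, 1)}


def dipeptide_species(active_sources):
    """Return the set of D-Ala-D-Ala label species that can form
    when only the given D-Ala sources are active."""
    species = set()
    for a, b in combinations_with_replacement(sorted(active_sources), 2):
        carbons = D_ALA_LABEL[a][0] + D_ALA_LABEL[b][0]
        nitrogens = D_ALA_LABEL[a][1] + D_ALA_LABEL[b][1]
        species.add(f"C{carbons}N{nitrogens}")
    return species


wild_type = dipeptide_species({"Alr1", "Dat"})  # both sources present
alr1_mutant = dipeptide_species({"Dat"})        # only Dat-derived D-Ala
dat_mutant = dipeptide_species({"Alr1"})        # only Alr1-derived D-Ala
```

      Under these assumptions the mixed C3N2 species appears only when both sources are active, which is consistent with its absence from either single mutant.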

      (3) Figure 3D could perhaps be omitted. I understand that the authors attained statistical significance in the fitness defect, but biologically this difference is very minor. One would have to look at the isotopomer distribution in the Dat overexpressing strain to make sure that increased flux actually occurred since there are other means of affecting activity (e.g. allosteric modulators).

      Thank you for the suggestion. We agree with the reviewer that the fitness defect observed after increased dat expression is relatively minor and have moved this figure to the supplementary section as Figure 3-figure supplement 1.

      Although we attempted to amplify the fitness defect of dat expression by cloning dat onto a multicopy vector, we couldn't maintain its stable expression in S. aureus. This instability may be due to the depletion of D-Ala when dat is overexpressed. As a result, we switched to expressing dat from a single additional copy integrated into the SaPI locus, which was sufficient to cause the expected fitness defect, albeit a minor one.

      (4) In Figure 4A, why is the complete subunit UDP-NAM-AEKAA increasing in each strain upon acetate challenge if there was such a stark reduction in D-Ala-D-Ala, particularly in the ∆alr1 mutant? For that matter, why are the levels of UDP-NAM-AEKAA in the ∆alr1 mutant identical to that of WT with/out acetate?

      Thank you for raising this important point. We have addressed this in line# 299-302 and 451-455 of the revised manuscript. In short, we believe that the inhibition of Ddl by acetate significantly increases the intracellular pool of the tripeptide UDP-NAM-AEK, which then outcompetes the substrate (pentapeptide; UDP-NAM-AEKAA) of MraY. As a result, the intracellular concentration of the pentapeptide increases since it is no longer efficiently consumed by MraY. This explanation is also supported by a kinetic study conducted in Ref (1), where the competition between UDP-NAM-AEKAA and UDP-NAM-AEK as substrates for MraY is demonstrated.

      (5) Figure 4B. Is there no significant difference between ddl and murF transcripts between WT and ∆alr1 under acetate stress? This comparison was not labeled if the tests were done.

      Thank you for suggesting this comparison. The ddl and murF transcripts between WT and alr1 under acetate stress were significantly different. We have added this comparison to Figure 4B.

      (6) Although tricky, it is possible to measure intracellular acetate. It might be of interest to know where in the Ddl inhibition curve the cells actually are.

      Thank you for the suggestion. We agree this would have been an excellent addition to the manuscript. However, accurately measuring intracellular acetate would require the use of radiolabeled acetate (2), and we currently lack the expertise to do this experiment. Nevertheless, since our study clearly shows that acetate-mediated growth impairment is due to Ddl inhibition, and the IC50 of acetate for Ddl is around 400 mM, we predict that the intracellular concentration must be close to or above this IC50 to observe the growth phenotypes we report.

      Reviewer #2:

      Although the authors have conclusively shown that Ddl is the target of acetic acid, it appears that the acetic acid concentration used in the experiments may not truly reflect the concentration range S. aureus would experience in its environment. Moreover, Ddl is only significantly inhibited at a very high acetate concentration (>400 mM). Thus, additional experiments showing growth phenotypes at lower organic acid concentrations may be beneficial.

      Thank you for the suggestion. In response to the reviewer, we have measured growth at various acetate concentrations and demonstrate a concentration-dependent effect (Figure 1C).

      We use 20 mM acetic acid in our study. In the gut, where S. aureus colonizes, acetate levels can reach up to 100 mM, so we believe our concentrations are physiologically relevant. When S. aureus encounters 20 mM acetate, the intracellular concentration can rise to 600 mM if the transmembrane pH gradient is 1.5 units, which is well above the ~400 mM IC50 we report for Ddl.

      Another aspect not adequately discussed is the presence of D-ala in the gut environment, which may be protective against acetate toxicity based on the model provided.

      Thank you for pointing this out. We agree that D-Ala from the gut microbiota could protect against acetate toxicity, and we’ve included this in the discussion. However, our study clearly indicates that S. aureus itself maintains high intracellular D-Ala levels through Alr1 activity which is sufficient to counter acetate anion intoxication.

      Recommendation for the authors:

      Reviewer #2:

      Major Comments:

      (1) In Line 85, authors indicate S. aureus may encounter a high concentration of ~100 mM acetic acid (extracellular?). Could the authors cite more (and recent) references indicating S. aureus encounters >100 mM acetic acid in the environment?

      To the best of our knowledge, no studies have specifically examined whether S. aureus encounters high millimolar concentrations of acetate in the gut. Line 85 was surmised from multiple studies: recent findings that S. aureus colonizes the gut (3, 4) and that the gut environment has high acetate levels (~100 mM) (5). In response to the reviewer's request, more recent references supporting high acetate concentrations in the gut (6, 7) have been added in Line# 86.

      (2) In Line 117, it is mentioned that S. aureus when grown in vitro at 20 mM acetic acid can accumulate ~600 mM acetic acid in the cytoplasm.

      a. Does the intracellular concentration go up proportionally if grown in 100 mM acetic acid? Given the IC50 of acetic acid-mediated inhibition of Ddl is ~400 mM, I wonder how physiologically relevant this finding presented here is.

      Thank you for the opportunity to explain this further. If S. aureus encounters a concentration of 100 mM acetate and its transmembrane pH gradient (pHin-pHout) is held at 1.5, the intracellular concentration of acetate could theoretically increase up to 3 M based on Ref (8). However, previous studies have shown that bacteria can lower the magnitude of the transmembrane pH gradient by decreasing their intracellular pH to limit accumulation of anions within cells (9, 10).

      Although our study shows that the IC50 of Ddl inhibition by acetate is relatively high (~400 mM), we believe it’s still relevant because just 20 mM of environmental acetate at a pH of 6.0 can raise the intracellular concentration of acetate to over 600 mM, which is well above the IC50 we report for Ddl. Moreover, since S. aureus may encounter high concentrations of acetate during gut colonization, we believe our findings are physiologically relevant.
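
      The arithmetic behind the ~600 mM and ~3 M estimates can be reproduced with a short sketch. It assumes the standard weak-acid equilibrium model, in which only undissociated acetic acid crosses the membrane and equilibrates on both sides, with a pKa of 4.76 for acetic acid, an external pH of 6.0, and an internal pH of 7.5 (that is, a gradient of 1.5 units). The function name and parameters are hypothetical, and the model ignores any compensatory drop in intracellular pH, so it gives a theoretical upper bound rather than a measured value.

```python
def intracellular_acetate_mM(total_out_mM, ph_out, ph_in, pka=4.76):
    """Estimate total intracellular acetate (mM) at equilibrium.

    Assumption: only undissociated acid (HA) crosses the membrane and
    reaches the same concentration inside and outside the cell.
    """
    # Undissociated acid outside the cell (Henderson-Hasselbalch).
    ha_out = total_out_mM / (1.0 + 10.0 ** (ph_out - pka))
    # Assumed equilibrium of the undissociated species across the membrane.
    ha_in = ha_out
    # Anion that accumulates at the higher intracellular pH.
    anion_in = ha_in * 10.0 ** (ph_in - pka)
    return ha_in + anion_in


if __name__ == "__main__":
    for external in (20.0, 100.0):
        internal = intracellular_acetate_mM(external, ph_out=6.0, ph_in=7.5)
        print(f"{external:.0f} mM outside -> about {internal:.0f} mM inside")
```

      Under these assumptions, 20 mM external acetate gives roughly 600 mM inside the cell and 100 mM gives roughly 3 M. Lowering ph_in in the call shows how a reduced pH gradient limits anion accumulation.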

      b. Could the authors show concentration-dependent growth inhibition in alr::tn by titrating a range of acetic acid concentrations (for example 0, 0.5, 1, 5, 10, 20 mM)? Measuring intracellular acetate concentration may be beneficial as well.

      Thank you for this question. We now provide data to support that acetate-mediated inhibition of the alr1 mutant is concentration-dependent (see Figure 1C).

      c. It appears that there may be excess D-ala in the gut environment (PMIDs: 30559391; 35816159), which could counter the high acetate based on the model presented here. Could the authors clarify and/or include this information in the manuscript?

      This is an excellent point, and we have now included it in the discussion (Line# 470-475). It is indeed possible that D-Ala produced by the gut microbiome may further enhance S. aureus resistance to organic acid anions, in addition to the inherent contribution of Alr1 activity.

      (3) The following is not needed; however, it would be interesting if the authors could show that S. aureus cells grown in the presence of acetate are highly sensitive to cycloserine (which targets Alr and Ddl) compared to cells grown in the absence of acetate.

      Thank you for the suggestion. We are currently studying D-cycloserine (DCS) resistance in S. aureus. Although we provide the data below for clarification, it is not included in the current manuscript as it is part of a separate study.

      As the reviewer speculated, S. aureus is more susceptible to DCS when grown in the presence of acetate (see figure below). Normally, complete growth inhibition occurs at 32 µg/ml of DCS. However, with 20 mM acetic acid present, complete inhibition is achieved at just 8 µg/ml of DCS. Furthermore, the growth inhibition is completely rescued when externally supplemented with 5 mM D-Ala. We believe that DCS works synergistically with acetate to inhibit Ddl activity, and we are conducting additional studies to explore this further.

      Minor Comments:

      (1) Many commas are missing.

      The missing commas have now been added.

      (2) Line 77: disassociate --> dissociate

      Corrected.

      (3) Line 103: that --> which

      Corrected.

      (4) Lines 199-203: authors could have used gfp/luciferase reporter to test their hypotheses.

      Thank you for the suggestion. Initially, we created GFP translational fusions for all the mutants mentioned in Line# 199-203. However, the fluorescence intensity was too low to test the hypothesis, as these were single-copy fusions inserted at the SaPI site of the S. aureus genome. Because of this limitation, we took advantage of the essentiality of D-Ala-D-Ala in S. aureus to report on the various mutants instead of using a fluorescent reporter. In hindsight, a LacZ reporter assay might have been equally effective.

      (5) Line 339: It would be beneficial to introduce that Ddl has two independent ATP and D-ala binding sites.

      We have now added that information (Line# 338-339).

      (6) Is ddl an essential gene? If so, explicitly mention that.

      Yes, ddl is an essential gene, and we have now incorporated this information in Line 103.

      (7) Line 354: shows a difference in density?

      “Difference density” is a technical crystallographic term commonly used to denote density observed for ligands in X-ray crystal structures. In this case, it simply refers to the observed density that corresponds to the two acetate ions bound within the Ddl active site.

      (8) Line 498: "Thus." Typo, change period to comma.

      We have corrected this as suggested (Line 496).

      (9) Figure 1 legend says "was screen" instead of screened.

      This is now corrected.

      (10) Figure 1- Figure Supplement 1B: including data for alr2::tn dat::tn may ensure no redundancy (Lines 171-172). It is currently missing.

      Thank you for the suggestion. We now include both the alr2dat double mutant and the alr1alr2dat triple mutant in Figure 1 - Figure Supplement 1B. In addition, we show that the alr1alr2dat mutant is rescued by the addition of D-Ala in Figure 1 - Figure Supplement 1C. The mutant information is also added to Table S5.

      (11) Figure 7: pentaglycine coming off of NAM is misleading. Remove untethered pentaglycine bridges.

      We thank you for pointing this out. We have modified the figure in the manuscript as suggested by the reviewer.

      (12) Are alr1/ddl cells (with limited 4-3 PG crosslink) less sensitive to vancomycin?

      On the contrary, the alr1 mutant is slightly more sensitive to vancomycin than the wild-type strain (see Figure below). We believe this happens because the alr1 mutant incorporates less D-Ala-D-Ala into the peptidoglycan, reducing the number of targets for vancomycin. As a result, vancomycin may be able to saturate the available D-Ala-D-Ala targets on the cell wall at a lower concentration in the alr1 mutant than in the wild-type strain, leading to increased sensitivity. We haven’t included these data in the manuscript as they are part of a separate study.

      (13) Based on the structural studies, could the authors mutate the residues of Ddl involved in acetic acid binding, thereby making it resistant to acetic acid stress?

      The residues that the acetate anion interacts with are located within the ATP-binding and D-Ala-binding sites of Ddl. Since these residues are essential for Ddl function, we are unable to mutate them.

      (14) Microscopy to show the cell morphologies of wild-type and mutants exposed to acetic acid (and with D-ala supplementation) could be potentially interesting.

      Thank you for the suggestion. We did perform microscopy, expecting changes in cell shape or size, but the results were unremarkable and are not included in the manuscript.

      References:

      (1) Hammes WP & Neuhaus FC (1974) On the specificity of phospho-N-acetylmuramyl-pentapeptide translocase. The peptide subunit of uridine diphosphate-N-actylmuramyl-pentapeptide. J Biol Chem 249(10):3140-3150.

      (2) Roe AJ, McLaggan D, Davidson I, O'Byrne C, & Booth IR (1998) Perturbation of anion balance during inhibition of growth of Escherichia coli by weak acids. J Bacteriol 180(4):767-772.

      (3) Acton DS, Plat-Sinnige MJ, van Wamel W, de Groot N, & van Belkum A (2009) Intestinal carriage of Staphylococcus aureus: how does its frequency compare with that of nasal carriage and what is its clinical impact? Eur J Clin Microbiol Infect Dis 28(2):115-127.

      (4) Piewngam P, et al. (2023) Probiotic for pathogen-specific Staphylococcus aureus decolonisation in Thailand: a phase 2, double-blind, randomised, placebo-controlled trial. Lancet Microbe 4(2):e75-e83.

      (5) Cummings JH, Pomare EW, Branch WJ, Naylor CP, & Macfarlane GT (1987) Short chain fatty acids in human large intestine, portal, hepatic and venous blood. Gut 28(10):1221-1227.

      (6) Correa-Oliveira R, Fachi JL, Vieira A, Sato FT, & Vinolo MA (2016) Regulation of immune cell function by short-chain fatty acids. Clin Transl Immunology 5(4):e73.

      (7) Hosmer J, McEwan AG, & Kappler U (2024) Bacterial acetate metabolism and its influence on human epithelia. Emerg Top Life Sci 8(1):1-13.

      (8) Carpenter CE & Broadbent JR (2009) External concentration of organic acid anions and pH: key independent variables for studying how organic acids inhibit growth of bacteria in mildly acidic foods. J Food Sci 74(1):R12-15.

      (9) Russell JB (1992) Another explanation for the toxicity of fermentation acids at low pH: anion accumulation versus uncoupling. J Appl Bacteriol 73(5):363-370.

      (10) Russell JB & Diez-Gonzalez F (1998) The effects of fermentation acids on bacterial growth. Adv Microb Physiol 39:205-234.